Tuesday, May 20, 2008

Heisenberg's Uncertainty Principle and Google's Algorithms

We recently discovered that Google was no longer finding the home page of one of our partner publishers, because the description of the magazine Quest Bulgaria on their home page was pretty much identical to the description on our system. Our derivative entry knocked them out, rather than the other way round, because our site is busier than theirs and so is given more weight by Google.
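Google does not disclose how it filters duplicates, but the flavour of the problem can be seen in the standard shingling technique from information retrieval: break each description into overlapping word windows and compare the two sets. The sketch below is a toy illustration of that general idea, not Google's method; the blurbs and the 0.9 threshold are invented for the example.

    # A minimal sketch of near-duplicate detection via word shingles and
    # Jaccard similarity. A standard textbook technique, not Google's
    # actual (proprietary) algorithm; blurbs and threshold are invented.

    def shingles(text, k=3):
        """Return the set of k-word shingles (overlapping word windows)."""
        words = text.lower().split()
        return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

    def jaccard(a, b):
        """Jaccard similarity of two shingle sets: |A & B| / |A | B|."""
        if not a and not b:
            return 1.0
        return len(a & b) / len(a | b)

    # Hypothetical placeholder blurbs, identical as in our case.
    publisher_blurb = "The magazine for anyone buying property or living in Bulgaria."
    aggregator_blurb = "The magazine for anyone buying property or living in Bulgaria."

    similarity = jaccard(shingles(publisher_blurb), shingles(aggregator_blurb))
    if similarity > 0.9:  # near-identical: a filter might keep only one page
        print(f"Near-duplicate (Jaccard = {similarity:.2f}): one copy may be dropped")

On this view, whichever copy the filter keeps is decided by page weight, which is why the busier site wins.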

This was a puzzling and unwanted result, so the publisher quickly changed the description on our system (our publishers can do this in real time through a form in which they edit the blurb), and Google is now finding Quest Bulgaria again; at the moment we come in a respectable third on the Google search.

It was not difficult to make the changes and to invite the Google spider to return, but as one of my colleagues observed, Google is finding that it cannot be an accurate measure of the web, because the way in which it maps and cadastrates the web is itself changing and deforming the natural shape of the web. My colleague sees Heisenberg's uncertainty principle at work here, but I am not so sure; it may simply be a lack of competition that is allowing the Google algorithms to become over-bossy and over-fussy. Would web spam be just as bad if there were three broadly competitive search engines at work? And if web spam were reduced, would Google get subtler at discriminating between content that differs in function even when there is little linguistic difference on the page?
