Tuesday, September 9, 2008

“Google Racing”/Googlewhacking

How fast can new web pages containing unique phrases appear on Google? I’ve tested this twice now, with the same result of about three days. That’s how long a unique phrase takes to appear on Google. I haven’t found a term or jargon to describe the act of measuring this. “SEO” - search engine optimization - discusses related issues, like how long it takes for new terms on a web page to pull in related hits and how long a new web page takes to be indexed by Google. (You can measure the same thing on any search engine; Google is so ubiquitous and easy to make into jargon that it’s what most people use when describing any web-search phenomenon.)

What I’m curious about is how quickly Google indexes changes to a given page. I tested this earlier on my blog when I found a phrase that wasn’t found through Google when searched AS A PHRASE (i.e., with quotation marks around the two search terms) and which thus, for all intents and purposes, did not exist as a phrase on the web.

It took three days for my first test - the phrase “tarpaper kumquat” - to appear. And it also took three days for my unintentional second test - “Hurricane Evacuation Bingo” - to appear in a Google phrase search.

But then I got to thinking, this blog is new and isn’t linked anywhere except my home page and, of course, blogspot/blogger is owned by Google. So a better test would be to post a unique phrase somewhere else and see how fast it appears on Google from that page. My home page is linked in a few places, including CALI because of the legal research lessons I’ve written for them, so I should post a test phrase on my home page and see how long that takes. I should also create a test blog somewhere else and see when that is picked up. OR I could post a unique phrase in all three places and see which one comes up first, hence the jargon “Google Racing”, which is what I’m calling this until I find out that, of course, someone has done this before and calls it something else.
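If I ever run the race properly, the bookkeeping is simple enough to sketch in Python. This is only a rough sketch: the function names, the site labels, and the dates are all made up for illustration, and it assumes I check the phrase search by hand each day and write down the first date each copy turns up (`None` meaning it hasn't shown up yet).

```python
from datetime import date


def days_to_appear(posted, first_seen):
    """Whole days between posting a phrase and first finding it in a phrase search."""
    return (first_seen - posted).days


def race_winner(posted, first_seen_by_site):
    """Return (site, days) for the site whose copy appeared first.

    Sites whose value is None have not appeared yet and are skipped.
    Returns None if no site has appeared at all.
    """
    finished = {
        site: days_to_appear(posted, seen)
        for site, seen in first_seen_by_site.items()
        if seen is not None
    }
    if not finished:
        return None
    site = min(finished, key=finished.get)
    return site, finished[site]


# Placeholder dates, not real measurements: only the three-day gap
# for the blog matches what I actually observed.
posted = date(2008, 9, 1)
sightings = {
    "blog": date(2008, 9, 4),
    "home page": None,   # not seen yet
    "test blog": None,   # not seen yet
}
print(race_winner(posted, sightings))
```

A tie just goes to whichever site happens to come first in the dictionary, which is good enough for something done at the reference desk.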

Another point is related to finding a unique test phrase. And this one DOES have jargon, but I forgot what it’s called. This may be it - Googlewhacking. Yes, I think that’s it. I read a newspaper article a few years ago about this. Basically, you search two words - no quotes - and try to get the fewest results on Google. Any number below 100 is good, and surprisingly hard to do, and the closer to one the better. You can’t use proper names, and no fair looking through a dictionary for obscure words you’ve never heard of.

My first test: tarpaper kumquat

gets 731 results, none of which, of course, had the words together as a phrase until three days after I posted them AS a phrase in a sentence in my first test.

This shows why it is usually so pointless for reporters to blithely mention something like “Just Google [term1] and [term2] and you’ll get 30,000,000 results” to try to illustrate how popular or prevalent something is. The vast majority of those hits - for two terms NOT in quotation marks - will bring up pages where those two terms do NOT appear in relation to each other. There are a lot of forums, blogs, discussion boards, etc., that contain dozens or hundreds of pages (if printed out) of text on one web page and, through sheer probability, two words that are even somewhat common are very likely to turn up together on many different pages. It’s hard to find two words with few hits. I mean, jeez:

arsenic imagination

gets 195,000 results.
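The gap between the two kinds of search is easy to show with a toy example in Python. This is my own simplified model, not how Google actually matches pages: it just splits text into lowercase words, and the function names and the sample page are invented for illustration.

```python
import re


def words(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains_all_terms(text, terms):
    """Unquoted search: every term appears somewhere on the page."""
    page = set(words(text))
    return all(term.lower() in page for term in terms)


def contains_phrase(text, phrase):
    """Quoted search: the words appear next to each other, in order."""
    page = words(text)
    target = words(phrase)
    if not target:
        return False
    n = len(target)
    return any(page[i:i + n] == target for i in range(len(page) - n + 1))


# An invented "page" where both words occur but never side by side.
page = (
    "The shed roof was nothing but tarpaper. "
    "Years later someone planted a kumquat tree beside it."
)
print(contains_all_terms(page, ["tarpaper", "kumquat"]))
print(contains_phrase(page, "tarpaper kumquat"))
```

The page counts as a hit for the two loose terms but not for the phrase, which is exactly what happens with those millions of results reporters like to quote.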

Anyway, it’s something to do when things are slow at the reference desk.

Oh, and it turns out that there actually is an educational tool that IS Hurricane Bingo (though not hurricane evacuation bingo). It’s recommended for grades six and up as a way for students to learn hurricane terms in a “fun, fast, atmosphere”. (Do kids have any concept of games that aren’t electronic?)
