Saturday 21 July 2007

Google's timelines

Google’s amazing new timeline facility is as yet a rather well-kept secret. But it is extremely easy to access. All you have to do, when the results of any Google search come up, is add a space after your search terms, followed by view:timeline

A click brings a miraculous transformation. Selected events relating to your search, all with a date now attached, are suddenly displayed in timeline format. The two lines of text in each result are exactly as usual, but they look much cleaner because of two elegant changes. The often confusing jumble of words in the top line, underlined in blue, is replaced by a year (or by day, month and year – whatever features in the two lines of text). And the lengthy URL, in green, in the last line, is replaced by the name of the article and of the website in which it appears.

The mind boggles at the complexity of the algorithm that has achieved this change. The normal Google results for “Adolf Hitler” number about 2.8 million. From these the Google programme has selected just 60 events for the timeline.

So what has happened? Clearly the first task is to identify those results that include a year. But these must surely still number in tens of thousands.
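As a rough illustration of that first task, here is a minimal sketch in Python. It is only my guess at the idea, not Google's actual method: the pattern, the function name snippets_with_year and the sample snippets are all made up for the example, and "a year" is assumed to mean any four-digit number from 1000 to 2099.

```python
import re

# Four-digit years from 1000 to 2099: an assumption about what counts as "a year".
YEAR_PATTERN = re.compile(r"\b(1\d{3}|20\d{2})\b")

def snippets_with_year(snippets):
    """Return (year, snippet) pairs for the snippets that mention a year."""
    dated = []
    for text in snippets:
        match = YEAR_PATTERN.search(text)
        if match:
            # Keep the first year found in the snippet alongside its text.
            dated.append((int(match.group(1)), text))
    return dated

# Invented sample snippets, standing in for search results.
sample = [
    "Adolf Hitler was born in Braunau am Inn in 1889.",
    "A biography of Adolf Hitler.",
    "Germany invaded Poland in September 1939.",
]
dated_sample = snippets_with_year(sample)
```

Only the first and third snippets survive the filter. Even this crude test would still leave an enormous pile of dated results, which is the point of the paragraph above.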

After that there is one obvious and easy step. The first event needs to include the word ‘born’. (Easy? Results for “Adolf Hitler 1889 born”: 87,000.) In the interests of accuracy the programme no doubt accepts only those results containing the month and day agreed by the majority. Thereafter a much-used source is likely to win the struggle for inclusion (Wikipedia features frequently, as also does Spartacus Educational).
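The majority rule I am imagining could be sketched like this. Again it is speculation on my part rather than anything Google has described: the function majority_date is hypothetical, and the list of reported dates is invented, with one deliberately wrong entry to stand in for a careless source.

```python
from collections import Counter

def majority_date(dates):
    """Return the most frequently reported date and its share of all reports."""
    counts = Counter(dates)
    date, votes = counts.most_common(1)[0]
    return date, votes / len(dates)

# Invented reports: three sources agree, one (hypothetical) source is a day out.
reported = ["20 April 1889", "20 April 1889", "21 April 1889", "20 April 1889"]
birth_date, share = majority_date(reported)
```

Here the date given by three of the four sources wins, and the share gives a crude measure of how strongly the sources agree.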

That first process doesn’t sound too difficult for a massive computer. But after that? How to select the remaining 59 events?

This is where things get problematical, because it appears – as one would expect – to be impossible to do so without editorial input. An example is the two crucial events of 1939 – the invasions of Czechoslovakia and Poland. There is no mention of the former (though a search for ‘Hitler invaded Czechoslovakia 1939’ does bring up 116,000 results, so there was no shortage of opportunities). But the invasion of Poland appears twice, perhaps because one source, BBC History, dates it simply as September 1939 and the other, American Experience, gives September 1, 1939, thus making it seem like two events.
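If the duplication really does come from the two different levels of date precision, it could in principle be caught by treating a vaguer date as compatible with a more exact one. The sketch below shows the idea under heavy assumptions: dates are (year, month, day) tuples with None for a missing part, and each event already carries a matching label. That label is hypothetical and hides the genuinely hard part, which is recognising that two differently worded snippets describe the same event.

```python
def precision(date):
    """Count how many parts of a (year, month, day) tuple are known."""
    return sum(part is not None for part in date)

def same_event_date(a, b):
    """True if the two dates agree on every part that both of them specify."""
    for x, y in zip(a, b):
        if x is None or y is None:
            return True  # one date runs out of detail; no conflict found
        if x != y:
            return False
    return True

def merge_duplicates(events):
    """Collapse events with the same label and compatible dates.

    The more precise date is kept. Labels are assumed to match exactly.
    """
    merged = []
    for label, date in events:
        for i, (m_label, m_date) in enumerate(merged):
            if m_label == label and same_event_date(date, m_date):
                if precision(date) > precision(m_date):
                    merged[i] = (label, date)
                break
        else:
            # No compatible entry found, so this is a new event.
            merged.append((label, date))
    return merged

events = [
    ("invasion of Poland", (1939, 9, None)),  # dated only September 1939
    ("invasion of Poland", (1939, 9, 1)),     # dated September 1, 1939
    ("birth", (1889, 4, 20)),
]
timeline = merge_duplicates(events)
```

With these inputs the two Poland entries collapse into one, dated September 1, 1939. Deciding that two snippets deserve the same label in the first place is exactly where I suspect an editor is still needed.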

You can judge the problem for yourself if you bring up the final page of Hitler’s timeline, from 1940 to 1945.

Whenever I have gone afresh to this URL it brings up a slightly different group of events – suggesting that the selection process is somehow done, even more miraculously, in real time. But whatever selection comes up, it has always been a laughable representation of Hitler’s activities during those five years.

Admittedly Hitler is an exceptionally difficult challenge to the system, and it works far better with a subject such as an author – where a majority of the dated events will be directly relevant items such as the publication of each book. Google themselves recommend their system mainly for people, companies and places. And it is still only a beta version, which is why it is not much profiled (it is not yet among the thirty or so special Google products that they feature on a link from their home page). So maybe the Google magic will overcome the obvious problems and somehow transform the system?

I doubt it. My own view is that even they can’t achieve something as focused as a good timeline by purely mechanical means. Maybe they will prove me wrong. But I do have a passionate belief in the need for editorial input to counter the chaos, albeit highly stimulating chaos, that constitutes the internet.

Bring on the humans!
