New York Times: A Stock-Killer Fueled by Algorithm After Algorithm
September 10 2008
One of the big news stories in the tech world this week is how United Airlines stock plunged in value after an old (2002) story on the web site of Florida's Sun Sentinel resurfaced and was mistakenly believed to be fresh news. The Times, among many others, covers it:
Both human error and far-from-foolproof technology seem to have played a role in the episode, which involved a 2002 Chicago Tribune report; the web site of the Sun Sentinel, a Florida newspaper owned by the same company; the Bloomberg News financial wire service; and Google, all apparently unwittingly.
The Tribune acknowledges that the article, plucked from its archives, had no publication date. Google acknowledges that, as a result, it assigned the date that the page was crawled, thus making it appear "new".
But what doesn't seem to be coming out in the NYT coverage, or anywhere else, is that Google Alerts then kicked in. I have seen this of late as well -- I've been getting Google alerts for articles about Lotus Notes that are 5+ years old...in some cases going all the way back to 1998. The Tribune article suffers from this problem, and they say (as the Times quotes):
The December 10, 2002, story contains information that would clearly lead a reader to the conclusion that it was related to events in 2002. In addition, the comments posted along with the story are dated 2002. It appears that no one who passed this story along actually bothered to read the story itself.
Yep, when I started seeing articles referencing Notes 6 in my Google alerts, it was pretty clear to me that they were old. The reader should have been able to determine that in the Tribune article. But why is it OK that Google is date-stamping these as "new" web pages, when clearly they are not?
Link: New York Times: A Stock-Killer Fueled by Algorithm After Algorithm >
- 2
Timothy Briley | 9/10/2008 8:10:49 PM
usatoday.com pulls this stunt. For example, pull up Sagarin's college football ratings:
{ Link }
At the top of the page you'll find this time/date stamp:
Updated: 9/10/2008 8:57:38 PM
For those unaware, these ratings were published on Monday, 9/08/2008 after the conclusion of the weekend's games. It's safe to say there wasn't a need to update them 5 minutes ago.
So it really ought to say Updated: 9/08/2008 3:00:00 AM
Hopefully no stock crashed as a result. ;)
- 3
NeilT | 9/11/2008 4:13:35 AM
If there is one thing I have found that really doesn't work in Google it is trying to refine the dates of articles down to a specific timeframe. It really sucks at that.
- 4
Ben Poole http://benpoole.com | 9/11/2008 5:15:18 AM
If a page has no publication meta-data, and that page is new to the Google index, I don't understand how Google could stamp it with anything other than the "spider date".
- 5
Ed Brill http://www.edbrill.com | 9/11/2008 7:41:24 AM
@If Field "HasNoDateMetadata"=true
then do not use spider date for google alerts.
Seems pretty simple to me, at the risk of not alerting about "new" web content.
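Spelled out a little further, the rule above might look something like the Python sketch below. This is only an illustration of the idea, not how Google's crawler or alert service actually works: the `CrawledPage` type, the `effective_date` and `should_alert` functions, and the two-day freshness window are all hypothetical names and values made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass
class CrawledPage:
    """Hypothetical record for a page the spider has just fetched."""
    url: str
    crawled_at: datetime                      # the "spider date"
    published_at: Optional[datetime] = None   # None = page has no date metadata


def effective_date(page: CrawledPage) -> Tuple[datetime, bool]:
    """Return (date, trusted).

    With no publication metadata, fall back to the spider date,
    but flag it as untrusted so callers can treat it differently.
    """
    if page.published_at is not None:
        return page.published_at, True
    return page.crawled_at, False


def should_alert(page: CrawledPage, now: datetime,
                 max_age: timedelta = timedelta(days=2)) -> bool:
    """Alert only on pages whose date is trusted and recent.

    The two-day window is an arbitrary value chosen for illustration.
    """
    date, trusted = effective_date(page)
    if not trusted:
        # Assumed policy: a spider date alone is never evidence of "new".
        return False
    return timedelta(0) <= now - date <= max_age


now = datetime(2008, 9, 10)
undated = CrawledPage("http://example.com/archive-story", crawled_at=now)
print(should_alert(undated, now))  # False: no date metadata, so no alert
```

The trade-off is the one already noted: genuinely new pages that lack date metadata would never trigger an alert under this policy.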
- 6
Andre H | 9/11/2008 9:13:41 AM
congrats Ed. You made it to the front page of techmeme: { Link }
- 7
Ed Brill http://www.edbrill.com | 9/11/2008 9:35:11 AM
@6 cool, thanks for letting me know!
- 8
Vitor Pereira http://www.vitor-pereira.com | 9/11/2008 4:53:30 PM
Please don't tell me that people are not only playing with redirections and are now playing with dates too. :-)


I guess that's what happens (sometimes) when you let engineers make all the decisions :-)
I'd be willing to bet that this behavior was an off-the-cuff decision by some engineer programming the alert features on some given day. "Oh, some of these pages don't have a datestamp. What to do? Well, I'll just give it the current date/time..."
Perhaps this decision was even the result of a carefully set-up process! Maybe Google QE noticed one day that pages were being examined and (maybe) ignored because they had no date stamp. Maybe an SPR was written up, triaged, and approved for fixing.
Maybe even the "right" fix was discussed in a meeting somewhere. If so, it's pretty obvious that the implications were not carefully examined. Oh well, maybe that's why they call it SOFTware...