Tuesday, February 06, 2007

Jim Gray, Pattern Recognition, and the Limits of Web 2.0

Microsoft employee Jim Gray went missing in his boat off the coast of San Francisco on Sunday. Building on an idea that Matt Haughey proposed when James Kim went missing, friends of Gray have used Amazon's Mechanical Turk to have volunteers search satellite images for signs of the missing boat. In theory it sounds like a good plan; so good, in fact, that the search process has become mainstream news in its own right. Even Nature has reported that this effort "will make future responses to natural disasters even smoother and better coordinated." In my opinion the search process is fundamentally flawed, and a complete waste of time and emotional energy.
Searching a satellite image for signs of the boat is a similar process to searching a chest x-ray for signs of tuberculosis (TB). If you, dear reader, as a civilian had to scan chest x-rays for TB, it would take you minutes to review each film, and your overall sensitivity and specificity would be exceedingly low. It wouldn't matter how many different civilians tried to do it. Adding more untrained reviewers would only increase the time spent, while increasing the sensitivity of the review process marginally, if at all, and substantially decreasing its specificity. As an actual doctor, but one who rarely looks at chest x-rays, I can review a film in around 30 seconds, with reasonable sensitivity and specificity for TB. But give films to an actual radiologist, and he or she can review them at a rate of 10-20 per minute, with far higher sensitivity and specificity. The radiologist, because of years of training, can entirely abandon any search strategy or conscious cognitive processing, in favour of pure "pattern recognition".
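The claim about adding untrained reviewers can be put into rough numbers. The sketch below is only an illustration under assumptions chosen for the purpose, not a model of the actual Mechanical Turk search: an image counts as flagged if any one reviewer flags it, reviewers raise false alarms independently of each other, and only a fixed fraction of real targets is visible to an untrained eye at all. The function name `pooled_rates` and every number in it are hypothetical.

```python
def pooled_rates(detectable_fraction, hit_rate, specificity, n_reviewers):
    """Pooled sensitivity and specificity when ANY single flag marks an image.

    Assumptions (hypothetical, for illustration only):
    - only `detectable_fraction` of true targets can be seen by an
      untrained reviewer at all; the rest are missed by everyone
    - on a detectable target, each reviewer spots it with `hit_rate`
    - reviewers raise false alarms independently of each other
    """
    # Chance that at least one reviewer spots a detectable target.
    at_least_one_hit = 1 - (1 - hit_rate) ** n_reviewers
    pooled_sensitivity = detectable_fraction * at_least_one_hit
    # An empty image stays unflagged only if every reviewer passes it.
    pooled_specificity = specificity ** n_reviewers
    return pooled_sensitivity, pooled_specificity


for n in (1, 3, 10):
    sens, spec = pooled_rates(0.35, 0.8, 0.95, n)
    print(f"{n:2d} reviewers: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Under these assumptions, sensitivity can never rise above the detectable fraction however many reviewers are added, while specificity keeps falling with each extra pair of eyes.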


If you look at this actual satellite image of an area where Jim Gray might or might not be adrift, you'll see a lot of white flecks that might or might not be bits of boat. To an untrained reviewer the signal-to-noise ratio is low. In the high-resolution image it takes a long time to carefully check each fleck for possible "boatness", and even then you can't be sure. An expert reviewer, though, could use pattern recognition to review this image in a matter of seconds, and say accurately whether there is any boat there, or whether these flecks of white are all wave crests and seagulls.
I don't know how many images have been uploaded, but well-meaning volunteers have now scanned them 100,000 times, and suggested 2,000 possible images for further investigation. At a conservative estimate, that represents more than 500 man-hours of searching. The 2,000 "flagged" images now need the attention of an expert reviewer, in order to direct coastguard helicopters to real possibilities. However, because the Mechanical Turk effort relies on different people reviewing the same images multiple times, it's likely that a single expert reviewer could have viewed all the individual images within an 8-hour shift, sometime on Sunday, shortly after Gray was reported missing. I'm sure that no-one at NASA or the coastguard is actually bothering to review the results of the Mechanical Turk search.
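As a rough check on the figures above, the following sketch works out what 500 man-hours spread over 100,000 reviews implies per review, and how many images one expert could cover in a shift. The expert rate of 10-20 images per minute is borrowed from the radiologist comparison and is only an assumption for satellite imagery; the function names are illustrative.

```python
def seconds_per_review(man_hours, reviews):
    """Average time per review implied by a total effort in man-hours."""
    return man_hours * 3600 / reviews


def images_per_shift(images_per_minute, shift_hours=8):
    """Images one reviewer covers in a shift at a steady rate."""
    return images_per_minute * 60 * shift_hours


# 500 man-hours over 100,000 volunteer reviews.
print(seconds_per_review(500, 100_000))            # 18.0

# Assumed expert rate of 10-20 images per minute over an 8-hour shift.
print(images_per_shift(10), images_per_shift(20))  # 4800 9600
```

So the 500-hour estimate corresponds to about 18 seconds per volunteer review, and the single-shift claim holds only if the number of distinct images is somewhere below roughly 5,000 to 10,000.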
While Web 2.0 has accomplished a lot of wonderful things through online collaboration, having non-experts search in this way is utterly unconstructive. It's a modern-day version of praying for Gray's safe return.

Update 7/2/07: The Mechanical Turk search finished yesterday, and involved 530,000 individual images, many more than I had anticipated. Despite this, I don't believe for a second that the process will help find Gray's missing boat. Here's a collection of 5 of the most promising images. None of them, to my non-expert eye, seems to have any obvious boats in it, despite the fact that there must have been hundreds, if not thousands, of boats offshore from San Francisco during the time these pictures were taken.
