TR10: Real-Time Search
Social networking is changing the way we find information.
The real-time man: Google’s Amit Singhal is mining social networks to generate up-to-the-second search results.
This article is part of an annual list of what we believe are the 10 most important emerging technologies.
How do you parse a tweet? Five years ago, that question would have been gibberish. Today, it’s perfectly sensible, and it’s at the front of Amit Singhal’s mind. Singhal is leading Google’s quest to incorporate new data into search results in real time by tracking and ranking updates to online content, particularly the thousands of messages that course through social networks every second.
Real-time search is a response to a fundamental shift in the way people use the Web. People used to visit a page, click a link, and visit another page. Now they spend a lot of time monitoring streams of data (tweets, status updates, headlines) from services like Facebook and Twitter, as well as from blogs and news outlets.
Ephemeral info-nuggets are the Web’s new currency, and sifting through them for useful information is a challenge for search engines. Its most daunting aspect, according to Singhal, is not collecting the data. Facebook and Twitter are happy to sell access to their data feeds (or “fire hoses,” as they call them) directly to search providers; the information pours straight into Google’s computers.
What’s really hard about real-time search is figuring out the meaning and value of those fleeting bits of information. The challenge goes beyond filtering out spam, though that’s an important part of it. People who search real-time data want the same quality, authority, and relevance that they expect when they perform traditional Web searches. Nobody wants to drink straight from a fire hose.
Google dominates traditional search by meticulously tracking links to a page and other signals of its value as they accumulate over time. But for real-time search, this doesn’t work. Social-networking messages can lose their value within minutes of being written. Google has to gauge their worth in seconds, or even microseconds.
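To make the point about fleeting value concrete, here is a minimal sketch of one way a message's weight might shrink with age. It is not Google's method, which the company has not disclosed. The function name `freshness` and the five-minute half-life are assumptions chosen purely for illustration.

```python
def freshness(age_seconds: float, half_life_seconds: float = 300.0) -> float:
    """Return a weight in (0, 1] that halves every `half_life_seconds`.

    Both the exponential shape and the 300-second default are illustrative
    assumptions, not values taken from any real search engine.
    """
    if age_seconds < 0:
        raise ValueError("age_seconds must be non-negative")
    if half_life_seconds <= 0:
        raise ValueError("half_life_seconds must be positive")
    # Exponential decay: weight 1.0 at age 0, 0.5 after one half-life, and so on.
    return 0.5 ** (age_seconds / half_life_seconds)
```

Under these assumptions a brand-new message carries full weight, a five-minute-old message carries half, and a message an hour old carries almost none, which is roughly the behavior the paragraph above describes.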
Google is notoriously tight-lipped about its search algorithms, but Singhal explains a few of the variables the company uses to analyze what he calls “chatter.” Some are straightforward. A Twitter user who attracts many followers, and whose tweets are often “retweeted” by other users, can generally be assumed to have more authority. Similarly, Facebook users gain authority as their friends multiply, particularly if those friends also have many friends.
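The signals Singhal describes can be sketched in a few lines of code. What follows is a hypothetical toy model, not Google's algorithm: the `Account` fields, the `retweet_weight` parameter, the logarithmic scaling, and both function names are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Account:
    followers: int  # number of followers
    tweets: int     # number of tweets posted
    retweets: int   # total times those tweets were retweeted by others


def authority(account: Account, retweet_weight: float = 2.0) -> float:
    """Toy Twitter-style score: more followers and a higher retweet rate
    both raise the result. Logs damp the effect of very large counts."""
    reach = math.log1p(account.followers)
    retweet_rate = account.retweets / account.tweets if account.tweets else 0.0
    return reach + retweet_weight * math.log1p(retweet_rate)


def friend_authority(user: str, friends: Dict[str, Set[str]]) -> float:
    """Toy Facebook-style score: each friend counts for 1, plus a bonus
    that grows with how many friends that friend has."""
    direct = friends.get(user, set())
    return sum(1.0 + math.log1p(len(friends.get(f, set()))) for f in direct)


if __name__ == "__main__":
    popular = Account(followers=100_000, tweets=100, retweets=5_000)
    obscure = Account(followers=10, tweets=100, retweets=0)
    print(authority(popular) > authority(obscure))
```

In this sketch, an account with many followers and frequently retweeted posts outranks an obscure one, and a user whose friends are themselves well connected outranks a user with the same number of poorly connected friends. A real system would need many more signals, including the spam filtering mentioned earlier.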