In recent days Real Time Search has been a hot topic, ever since Twitter launched its search engine with the ability to return results in real time. Traditional search companies, especially Google, are discussing how to bring Real Time search to their services.
Google is working on a partnership with Twitter, as Google’s CIO has stated, and Larry Page admitted that Google had so far “done a relatively poor job of creating things that work on a per second basis”.
What is going on? Let’s analyze it a little, starting from the search engines that came before.
We will use the trend “American Idol” to compare and judge the results.
1. Google
We have always been able to restrict a search to a single site, e.g. site:twitter.com “american idol”
If we analyze the first 10 results we get:
- 1st result is a Twitter user’s mobile page: American Idol (idolatry), containing the keywords in the URL and title
- Top 3 results are mobile user pages, not statuses
- 1st status result is from the user Mashable, in 4th position, posted 1 day and 8 hours earlier
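As a rough illustration of the restricted search above, the query can be turned into a search URL. This is only a sketch: the function name site_search_url is made up for the example, and it assumes the usual public https://www.google.com/search?q=... URL form rather than any official API.

```python
from urllib.parse import urlencode


def site_search_url(site: str, phrase: str) -> str:
    """Build a Google search URL restricted to one site, with an exact phrase.

    Assumption: the public search page accepts the query in the `q` parameter.
    """
    query = f'site:{site} "{phrase}"'
    # urlencode escapes the colon and the quotes and turns spaces into "+"
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_search_url("twitter.com", "american idol"))
```

The site: operator only narrows where the results come from. The ordering is still decided by the normal ranking, which is why user pages come before statuses.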
2. Google Custom Search Engine
On May 10, 2006, Steve Rubel created a custom search engine that scans Twitter, based on Google’s crawler, PageRank and results service.
The results are worse than a direct search on Google:
- Top 7 results match on user names or titles
- 1st status result is the same as before, Mashable, in 8th position
3. TweetMeme and other Twitter search engines
TweetMeme is a search engine for Twitter. This kind of search engine was the first to return actual statuses, in an order not determined by PageRank or a similar ranking algorithm.
The results are quite good in terms of content, but the 1st result is from 59 minutes ago… that shouldn’t be considered real time…
4. Google + Greasemonkey
The best real time results before Twitter Search.
- First result is from about 5 minutes ago
Nowadays, with Twitter Search, the results really are Real Time.
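The comparison above comes down to one number per engine: the age of the first status result. A minimal sketch of that check follows, using the ages reported above. The 10 minute cut-off (REAL_TIME_LIMIT) is a hypothetical value chosen only for illustration, not a measured or agreed threshold.

```python
from datetime import timedelta

# Age of the first status result, as reported in the comparison above.
first_status_age = {
    "Google site: search": timedelta(days=1, hours=8),
    "TweetMeme": timedelta(minutes=59),
    "Google + Greasemonkey": timedelta(minutes=5),
}

# Hypothetical cut-off, for illustration only.
REAL_TIME_LIMIT = timedelta(minutes=10)


def is_real_time(age: timedelta, limit: timedelta = REAL_TIME_LIMIT) -> bool:
    """Return True when the first result is fresh enough under the chosen limit."""
    return age <= limit


# List the engines from freshest to oldest first status result.
for engine, age in sorted(first_status_age.items(), key=lambda item: item[1]):
    label = "real time" if is_real_time(age) else "too old"
    print(f"{engine}: {age} -> {label}")
```

With a different limit the labels change, but the order of the engines stays the same.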
Kinds of information
Now it is time to consider whether Real Time Search makes sense, especially for searching Social Network statuses. So we are going to separate the Social Information from the Real Time Required Information.
1. Real Time Required Information
Let’s think about which relevant information would be required in Real Time:
- Disaster coverage
- Currency exchange / Trade market
- Transport information: Traffic, flights, trains…
- Live Events Coverage
So anybody might ask… where is the News?
News is not real time… when an event occurs, someone has to get there, write or record the story, send it to the Media and publish it… so the delay makes it not Real Time. In spite of this, news should be available for searching within 10 seconds of publication.
One thing all these kinds of information have in common is that their status changes continuously, and we need to be able to find the Real Time status whenever we need it.
Nowadays none of this information can be searched in Real Time on search engines, and we need to go to the original sources to get the closest thing to Real Time information.
2. Real Time Social Information
With the Twitter explosion, people are demanding search over social information, especially information covering some of the Real Time required information described above.
But as usual there is a lot of noise in the information: irrelevant information and spam are the common noise generators, as we can see on Twitter Search. Crawling all this information really doesn’t make sense!
So the valuable Social Information has to be filtered in order to extract its real value, and this, in my opinion, is the problem Google may have.
A good way of filtering the information is to crawl only the social trends, that is, the hot information, and use strong recognition algorithms to find spam in it (the same as is done for blog comments).
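A very small sketch of this two-step filter could look like the following. Everything in it is an assumption made for illustration: the trend list, the spam phrases and the "three or more links" rule are placeholders, and a real spam recognizer would be much stronger than these heuristics.

```python
import re

# Hypothetical inputs: the current trends and a few spam markers.
TRENDS = {"american idol"}
SPAM_PHRASES = ("free followers", "click here")
LINK_PATTERN = re.compile(r"https?://\S+")
MAX_LINKS = 2  # statuses with more links than this are treated as spam


def is_on_trend(text: str) -> bool:
    """Keep only statuses that mention one of the current trends."""
    lowered = text.lower()
    return any(trend in lowered for trend in TRENDS)


def looks_like_spam(text: str) -> bool:
    """Naive spam check: known phrases or too many links."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in SPAM_PHRASES):
        return True
    return len(LINK_PATTERN.findall(text)) > MAX_LINKS


def filter_statuses(statuses: list[str]) -> list[str]:
    """Return the on-trend statuses that do not look like spam."""
    return [s for s in statuses if is_on_trend(s) and not looks_like_spam(s)]
```

Checking the trend first keeps the spam check cheap, since it only runs on the small share of statuses that are about a hot topic.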
Even with filtering algorithms, some spam would not be recognized as such, and some moderation is required, but aren’t we talking about Social?
Why not let the Social Users mark Real Time Search results as spam and delete them?
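One possible shape for this kind of user moderation is sketched below. The class name FlaggedResults and the rule "hide a result after 3 different users flag it" are assumptions made for the example. The idea above does not say how many flags should be needed.

```python
from collections import defaultdict


class FlaggedResults:
    """Track spam flags from users and hide results that collect enough of them."""

    def __init__(self, hide_after: int = 3):
        # Hypothetical threshold: number of distinct users needed to hide a result.
        self.hide_after = hide_after
        self._flags = defaultdict(set)  # status id -> ids of users who flagged it

    def flag(self, status_id: str, user_id: str) -> None:
        """Record a spam flag; repeated flags from the same user count once."""
        self._flags[status_id].add(user_id)

    def is_hidden(self, status_id: str) -> bool:
        return len(self._flags.get(status_id, ())) >= self.hide_after

    def visible(self, status_ids: list[str]) -> list[str]:
        """Return the results that have not been flagged often enough to hide."""
        return [s for s in status_ids if not self.is_hidden(s)]
```

Counting distinct users rather than raw flags keeps one person from removing a result alone, which matters if the same feature is not to become a tool for abuse.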