William M. Davidson, University of Vermont, Burlington, VT 05405, wdavidso@uvm.edu
One of the disadvantages of writing this column only once a year is that it is difficult to predict what is going to happen in search engines during the next month, let alone a year from now. As Yogi Berra is reputed to have said, "It's tough to make predictions, especially about the future." For the past several years this column has predicted a battle royal as Microsoft and Google fought it out to become the search engine of the future. Each time, the predicted battle never happened; Microsoft's share of the search market stayed below ten percent, and Google's market share slowly but steadily edged higher as each new Microsoft initiative proved to be a flash in the pan.
This year, finally, there seems to be a real competition heating up. Microsoft has not only come out with a new search engine, called Bing, but also committed millions of dollars to a supporting publicity campaign. Furthermore, after a number of false steps that are probably not of interest to the readers of this column, Microsoft and Yahoo! announced that Microsoft's Bing search engine will replace Yahoo!'s own search engine. Whether or not this is a good deal for Yahoo! is questionable (and mainly of interest to those who own Yahoo! stock), but it should give Microsoft the opportunity to combine its share of the market with that of Yahoo!, giving a potential overall share of approximately 30%. If nothing else, this should bring greater competition into the world of search, which is always to be desired.
Bing, Microsoft's latest entry, carries a number of features over from the previous LiveSearch engine, including hotspots, deep links, and Instant Answers. It also includes new features, such as Web Groups, which groups results around popular search terms; Bing Health, which includes content from sources such as the Mayo Clinic and MedlinePlus; and Quick Previews, which allows users to mouse over a result and see a preview. Bing also includes a very useful left sidebar, called the Explorer Pane, which suggests related searches. Bing is clearly a full-featured search engine that deserves careful evaluation.
Google has responded to the competition from Bing by creating a suite of new features, including nine new search options intended to counter the competition from social sites like Twitter. Finally, and potentially most significant, comes Google Wave, which is claimed to be a tsunami of a new application. I have not negotiated an invitation to try Wave, so that evaluation will have to wait until next year.
Bing is clearly an improvement over the previous LiveSearch, but has it improved enough to cause users to change the search engine they use? When first released, Bing seemed to be expanding its market share, perhaps partially because of an advertising campaign reputed to cost $100 million. A more recent survey showed that Bing's growth had slowed, Google had regained some of the share it previously lost, and Yahoo! had lost share (see Table 1). For the time being, it appears that any market share that Bing gains will be at the expense of Yahoo! search.
Table 1: Share of the Search Market by the Three Main Engines (http://blog.searchenginewatch.com/091015-020040).
The important question for chemists is, "How does Bing compare with Google at finding useful chemical sites?" As noted in a previous column, there are three main criteria that should be used to evaluate search engines: comprehensiveness, currency, and relevance. Comprehensiveness is a measure of what fraction of the accessible web sites the search engine actually includes in its index. This is particularly important for chemists, who are more likely to be looking for specialized information that may not be included in the index of a less comprehensive engine. Currency measures how often the search engine revisits sites to determine whether or not there have been any changes. Not only are new web sites constantly being created, but many sites are also vanishing. Failure to keep up can produce dead links. The final concern is relevance. Are the most useful sites not just included but listed early in the search results? This factor is probably the most difficult to evaluate quantitatively.
Since Google has been in operation for some time, and Bing is relatively new, it is difficult to think of a way to compare these two engines in terms of the currency of their references. Bing would have too much of an advantage, since its index has probably been created more recently. Relevance is very much in the eye of the beholder (or the searcher, if you will). The first pages of hits for all three engines seemed very similar, but readers can judge for themselves by going to a web site called Blind Search, which allows one to search all three engines simultaneously and compare the results. The search is blind, so the choice of the engine that delivers the most relevant sites is not skewed by prior reputation.
Past reviews have focused mainly on the index size. Since the number of scientific web sites is only a small fraction of the accessible web, it is reasonable to expect that engines which produce fewer hits on chemical terms will have less utility for chemists. As noted in previous articles, multiple word searches tend to give unreliable counts, since the engine algorithm may rapidly begin to include related terms, even if the compound term is narrowly focused with Boolean operators or quotes. Another problem with counting hits is that engines may stop a search when the number of hits appears adequate. The difference between 20,000 hits and 40,000 hits may have more to do with the search load at the time than with the size of the index. The way to avoid this problem is to use single word search terms that deliver as few hits as possible, since these numbers are more likely to be meaningful.
Table 2 presents the results for a number of different chemical search terms, including several that are relatively uncommon. It is worth noting that in no case did either Bing or Yahoo! deliver more hits than Google, and for specialized chemical terms, like polyphosphazine, perfluorophenyl, or discodermolide, Google was clearly better. Based on these results, Bing seems to have a smaller index since it gives significantly fewer hits. Thus, Google continues to be a better choice for chemists. With passing time, Bing may expand its index and become more competitive in this area, but that is fodder for another column.
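For readers who want to repeat this kind of comparison, the hit-count approach described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are my own, the search term "termA" is a stand-in, and the counts are hypothetical placeholders rather than the values from Table 2. It keeps only single word terms and expresses each engine's hit count as a fraction of a reference engine's count.

```python
def single_word_terms(counts):
    """Keep only single word search terms, which give more meaningful counts."""
    return {term: c for term, c in counts.items() if len(term.split()) == 1}


def relative_coverage(counts, reference="Google"):
    """Express each engine's hit count as a fraction of the reference engine's."""
    ratios = {}
    for term, by_engine in counts.items():
        ref_hits = by_engine.get(reference)
        if not ref_hits:
            # Skip terms where the reference engine has no (or zero) hits.
            continue
        ratios[term] = {
            engine: hits / ref_hits
            for engine, hits in by_engine.items()
            if engine != reference
        }
    return ratios


# Hypothetical placeholder counts, NOT measurements reported in this column.
hit_counts = {
    "termA": {"Google": 4000, "Bing": 1000, "Yahoo!": 2000},
    "two words": {"Google": 50000, "Bing": 40000, "Yahoo!": 45000},
}

ratios = relative_coverage(single_word_terms(hit_counts))
for term, by_engine in ratios.items():
    for engine, ratio in by_engine.items():
        print(f"{term}: {engine} returns {ratio:.0%} of the reference count")
```

A ratio well below 1 for rare single word terms would be consistent with a smaller index, although, as noted above, reported hit counts are only rough estimates.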
What is Coming Next for Search
There are several important questions about search that are being discussed. One topic mentioned on several blogs is "The Death of Search." Some bloggers point out that web advertisements pay to keep the engines going, and ask what would happen if universally accessible web content were no longer available. This could possibly diminish the revenue (which pays to support the service) enough to impair the performance of the engine. Some content providers are trying to put their content behind firewalls, where the search engine netbots can't reach it. In addition, the increasing use of widgets and other net gadgets may make search harder. For example, Rob Griffin says that these developments are ". . . a harbinger of doom for search as we know it."
I don't find these arguments to be convincing. First, an estimated 90% of the Web is already inaccessible to search engines. The 10% of the web that is available still represents billions of pages. Although a few content producers may shield their material from web search, most companies are working very hard to make their content more, not less, accessible. With a few notable exceptions, such as The Wall Street Journal, content providers have not been very successful at getting users to pay for access. A few newspapers may be successful in locking up their material (and a few more newspapers may vanish due to bankruptcy), but it seems unlikely that the lack of web content will become a serious problem. In the long term, the increasing use of widgets and other web gadgets may make life more difficult for search engines, but advertisers seem to prefer web ads, with quantifiable and targeted results, over the more traditional media. For the time being, these dire predictions do not seem to be reasonable, particularly for chemists.
Within the past month, press reports about web search have not been limited to the announcement that Microsoft has a new search engine. To understand these developments it is necessary to recognize that two new kinds of search are being defined. The first is a direct result of the popularity of Twitter, the microblogging platform. Some argue that Google searches tell what was happening in the past, but Twitter searches tell what is happening right now. This type of search, often called Real-Time Search, is useful for seeing what people are saying about a movie or TV show while it is being shown. Tweets, Facebook "status updates", and FriendFeed posts normally achieve a level of immediacy that is different from normal blogging or web pages. Google and Bing have both made agreements with Twitter to search the Tweet stream, and there are a few specialized engines, like Scoopler, that are trying to do real-time search.
The second kind is social search, which has been around for some time in the form of social tagging or shared tags. It determines the relevance of results based on the evaluation of other users, usually members of a self-defined online group. The major change here is that Bing has contracted to use Facebook content that is publicly available. Facebook has many more members than Twitter, but many Facebook users have elected to keep their account information closed to the general public. In the short term, searching Twitter is potentially more valuable because it provides access to more information, even though Facebook has many more subscribers. However, Facebook's agreement with Microsoft seems to suggest that Facebook is hoping to increase the amount of open content. For more information, read Marshall Kirkpatrick's recent post on the ReadWriteWeb Blog, especially the comments.
The implicit message in all these moves is that social media, like Twitter or Facebook, are becoming a new kind of search resource. A group with similar interests can share information on one of these social programs. Google search helps find what you know that you need to know; social groups can help you find what you didn't yet know would be useful. Thus far, there seem to be relatively few chemists using Twitter or Facebook in this way, but if the number of chemistry users reaches a critical level, this could open up an interesting new search avenue.