“Relevance Ranking on Google: Are Top Ranked Results Really Considered More Relevant By the Users?”

Nadjla Hariri’s article examines the differences between relevance as determined by users and relevance as determined by search algorithms. The discussion begins with a brief description of Google’s ranking methods. To sort by relevance, Google uses a patented algorithm called PageRank, which considers more than 500 million variables and 2 billion terms in sorting pages. It also factors in ‘votes,’ which a page receives whenever another page links to it; the more votes a page collects, the more likely it is to be ranked as relevant. Hariri’s research asks whether the relevance ranking produced by this complex algorithm matches what users actually find relevant. Additionally, noting that users rarely go past the first or second page when skimming results, Hariri asks whether the results on later pages really are less relevant than those displayed at the beginning. The article then defines ‘relevance’ as it is used in the research: “the closeness of retrieved documents to queries on the basis of the subject matter or other specified attributes of the record” (599), and specifies which ranked hits fall on which results page.

Next comes the literature review. Some of the studies cited compared various search engines to one another, producing contradictory conclusions about which engine returned the most relevant results. Another batch tested whether search engine rankings matched user choices or expectations (they matched only rarely). Notably, most of these studies relied on human judgment to assess relevance.

The article then explains the methodology: thirty-four students from various academic backgrounds each ran a different search on Google. Each student was given the first four pages of hits and asked to mark each hit as most relevant, relevant, or irrelevant. From these judgments Hariri calculated precision ratios.

The results showed that the document users judged most relevant was most often the one Google ranked fifth, with Google’s top-ranked hit the second most common choice; almost all of the students picked either the first or the fifth item as the most relevant. Precision was generally highest on page one, followed by pages two, four, and three. Hariri notes that even on the fourth page at least three hits were judged relevant by 40% of the users, some of them just as good in quality as hits on earlier pages.

The article closes with its conclusions. The study upholds earlier work showing that even the best search engines are not as good as they might first appear. Searchers should therefore look at later pages of their results even when the first couple of pages fail to turn up good material, since relevant results may sit a little further back.
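To make the ‘voting’ idea concrete, here is a minimal sketch of the power-iteration scheme from the original PageRank paper, run on a made-up three-page web. The toy graph, function name, and damping factor of 0.85 are illustrative assumptions, not details from Hariri’s article, and Google’s production ranking layers many more signals on top of this.

```python
# Minimal sketch of PageRank's "voting" idea: a link from page A to
# page B counts as a vote for B, weighted by A's own current score.
# The graph and names below are illustrative, not from Hariri's article.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its score evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / len(pages)
            else:  # each outgoing link casts a weighted "vote"
                for target in outlinks:
                    new_rank[target] += damping * rank[page] / len(outlinks)
        rank = new_rank
    return rank

toy_web = {
    "a": ["b", "c"],  # page "a" votes for "b" and "c"
    "b": ["c"],
    "c": ["a"],
}
print(pagerank(toy_web))  # "c" scores highest: it receives the most votes
```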
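The precision ratio itself is simply the share of retrieved hits judged relevant. A small sketch of that calculation, assuming each hit on a result page carries one of the three labels the students used; the sample judgments below are hypothetical placeholders, not Hariri’s data.

```python
# Precision ratio: relevant retrieved documents / total retrieved documents.
# The judgments below are made-up placeholders, not Hariri's actual data.

def precision(judgments):
    """judgments: list of labels, one per hit on a result page."""
    relevant = sum(1 for j in judgments if j in ("most relevant", "relevant"))
    return relevant / len(judgments)

# One hypothetical result page with ten judged hits:
page_one = ["most relevant", "relevant", "irrelevant", "relevant",
            "most relevant", "irrelevant", "relevant", "irrelevant",
            "relevant", "irrelevant"]
print(f"page 1 precision: {precision(page_one):.2f}")  # 0.60
```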

This article is easy to understand and describes an experiment with a sound methodology and believable results. Overall, it seems like a useful and reliable source.
