High-speed Googling

June 19, 2003
Computer science researchers at Stanford University have developed several new techniques for calculating Web-page rankings (as used in the Google search engine) five times faster.

Google currently ranks and searches 3 billion Web pages using a ranking algorithm called Computing PageRank. The program, however, can take several days to search and rank a billion Web pages. To speed the process, Stanford researchers developed a trio of techniques in numerical linear algebra. One uses extrapolation techniques, which make some assumptions about the Web's link structure that aren't true, but still permit quick and easy computation of PageRank. Because the assumptions are false, PageRank isn't completely accurate. However it is close enough to be refined using the original PageRank algorithm. Researchers have shown that their extrapolation techniques can speed up PageRank by 50%.

Another method, BlockRank, relies on a feature of the Web's link structure. The Stanford team found that approximately 80% of the pages on any given Web site point to other pages on the same site. That means they can compute many single-site PageRanks, glue them together, and use that as a starting point for the original PageRank algorithm. This could make PageRank computation three times faster.

Other Stanford findings show the rankings for some pages are calculated early in the PageRank process, while others of highly rated pages take much longer to compute. A method called Adaptive PageRank eliminates redundant computations associated with those pages whose PageRanks finish early. This speeds computation by as much as 50%. "Further speed-ups are possible when we use all these methods," says one Stanford researcher. Still, researchers say other issues must be solved. "We are closer to a topic-based PageRank than to a personalized ranking," he says.

Complex personalized rankings need even greater speed-ups to the PageRank calculations. And while a faster algorithm shortens computation time, storage problems remain. The results from a single PageRank computation on a few billion Web pages need several gigabytes of storage. Saving a limited number of topic-specific PageRank calculations is more practical, say researchers. For more information, go to the Stanford Database Group's Publication Server at http://dbpubs.stanford.edu/.

Sponsored Recommendations

Food Production: How SEW-EURODRIVE Drives Excellence

Feb. 18, 2025
Optimize food production with SEW-EURODRIVE’s hygienic, energy-efficient automation and drive solutions for precision, reliability, and sustainability.

Optimizing Agricultural Operations with SEW-EURODRIVE

Feb. 18, 2025
Boost efficiency with SEW-EURODRIVE's durable drive solutions for agriculture. Reliable, efficient, and tailored for you!

Ensure Safety with Explosion-Proof Pumps for Critical Applications

Feb. 10, 2025
For high-risk environments, reliability is paramount. Learn how KNF's explosion-proof pumps provide enhanced safety and performance in demanding OEM and process applications, ...

Revolutionizing Pump Efficiency with Advanced Drive Technology

Feb. 10, 2025
Discover how KNF’s innovative MI Motors are transforming pump intelligence and system integration. With enhanced efficiency and smarter control, this breakthrough technology optimizes...

Voice your opinion!

To join the conversation, and become an exclusive member of Machine Design, create an account today!