Google • 13 Jul 2011

The story behind Google’s search algorithms

By Rowan Puttergill: Columnist

The SEO industry has been thriving for a number of years, thanks to the fact that search ranking algorithms are constantly evolving and Google’s results ranking system is probably one of the internet’s most closely guarded secrets.

Any insight into what Google considers to be an innovative approach to search ranking, and the direction that Google is taking with regard to how it indexes pages, is a gem in the rubble of obscure guesswork that most of us are forced to undertake. That’s why a recent publication titled ‘Indexing the World Wide Web: The Journey So Far’ makes for some of the most interesting reading in the SEO world to date.

In the article, Google engineers explain some of the techniques that can be used to improve search result relevancy and the trade-offs that must be accounted for in terms of the machine resources required to implement them. Right up front, in the introduction, Google presents a fantastic picture of all of the major innovations that it considers to have changed the way that search engines have functioned since their inception in 1994.

Google gives a nod in the direction of Cuil and even to Bing, one of its major competitors. The latest innovation that the paper identifies, however, is “Realtime and Social Search,” which it credits partly to Facebook and to Twitter, but also includes Bing and Google as major players in this new arena.

While most of these innovations are public knowledge, it is useful to be able to pin down the developments that Google considers to have shifted the way that search engines function. The search monolith is more than likely attempting to incorporate as many of these innovations as possible into its own approach.

The article quickly jumps into a more technical analysis of how machine resources are used to handle indexing and the resolving of user queries. While much of this information seems overly complex, it is clear that the authors consider something that they call an ‘inverted index’ to be the most efficient method of storing an index structure.

This inverted index is effectively a dictionary that maps each word or term to a list of all of the documents that contain it, along with the number of times the term is used within each document. Searched words and terms are then looked up against this dictionary.
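To make the idea concrete, here is a minimal sketch of an inverted index in Python. It is an illustration only, not the structure described in the paper: the function names `build_inverted_index` and `search`, the sample documents and the simple whitespace tokenisation are all assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a dictionary of {document id: term count}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive tokenisation (an assumption): lowercase and split on whitespace.
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    postings = [set(index.get(term, {})) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "google ranks web pages",
    2: "search engines index web pages and rank pages",
}
index = build_inverted_index(docs)
```

Here `index["pages"]` holds one count per document, and `search(index, "google pages")` only returns documents that contain both terms.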

The authors go on to mention some of the shortcomings of this approach, such as the fact that the Internet is multilingual and that words have multiple forms or variations. The document then goes on to describe some of the techniques put in place to handle these problems.

Another point of interest is how Google’s research describes user intent. To return more relevant results, the authors stress the importance of term proximity: a search engine will favour pages where the query terms appear closer together within the document.
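One simple way to picture proximity is to measure the smallest window of words in a page that contains every query term; a smaller window suggests a closer match. The sketch below assumes that measure and whitespace tokenisation. The function name `min_window` is hypothetical, and the paper may well score proximity differently.

```python
def min_window(text, query):
    """Return the smallest number of consecutive words in text that
    contains every query term, or None if a term is missing."""
    terms = set(query.lower().split())
    words = text.lower().split()
    last_seen = {}  # most recent position of each query term
    best = None
    for pos, word in enumerate(words):
        if word in terms:
            last_seen[word] = pos
            # Once every term has been seen, the tightest window ending
            # here starts at the oldest of the most recent positions.
            if len(last_seen) == len(terms):
                span = pos - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best


# Hypothetical pages: the first keeps the query terms adjacent.
close_match = min_window("cheap flights to paris", "cheap flights")
loose_match = min_window("cheap hotels and long flights", "cheap flights")
```

A ranking function could then prefer the page with the smaller window, all else being equal.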

In order to avoid the huge processing and storage costs involved in indexing pages in this way, much research has gone into actually indexing whole phrases and their relationships. The paper states that Google has been experimenting with this in its TeraGoogle project, and lists some of the advantages and disadvantages of the approach.

It is clear that Google loves this indexing technique, with the only listed disadvantages being that it is difficult to implement and manage. Google is not known for shying away from anything difficult, so we can be pretty certain that phrase-based indexing will be the way forward.

The most interesting part of the publication is toward the end, where the authors start exploring how social media can be mined to help improve the relevance of search rankings. By building graphs of user followings and user influence, they suggest adding UserRank and UserTopicRank as additional features to start understanding how important links and information presented in social media actually are.
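The article does not spell out how UserRank would be calculated, so the following is only a rough sketch of the general idea, assuming a PageRank-style iteration over a graph of who follows whom. The function name `user_rank`, the `damping` and `iterations` parameters and the sample users are all assumptions made for illustration.

```python
def user_rank(follows, damping=0.85, iterations=50):
    """Score users with a PageRank-style iteration over a follower graph.

    follows maps each user to the list of users they follow.
    """
    users = set(follows) | {u for targets in follows.values() for u in targets}
    if not users:
        return {}
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in users}
        for user in users:
            # A user who follows nobody spreads their score evenly to everyone.
            targets = follows.get(user) or list(users)
            share = damping * rank[user] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: alice and bob follow carol, carol follows alice.
ranks = user_rank({"alice": ["carol"], "bob": ["carol"], "carol": ["alice"]})
```

In this toy graph carol ends up with the highest score because two users follow her, while bob, whom nobody follows, ends up with the lowest.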

Furthermore, by using real time data, the search engine can perform something called topic clustering, so that search result relevance can be improved through awareness of the topics that seem to have a lot of social media ‘buzz’.

Finally, by using natural language processing, the paper suggests that it may be possible to work out user sentiment from social network postings. This means that your result relevance may be skewed by a majority sentiment, or by the sentiment of people within your own social networks.

It is rare to stumble across a document that presents such a complete overview of the search industry and the general direction it is taking, especially when that document comes from one of the biggest players in the market.

While I have tried to cover some of the major points in the document that really stood out to me, the document is packed with information. If you’re interested in SEO or simply in the technology that Google employs, I highly recommend that you download the article and take a look yourself.

Pic: Robert Scoble
