Google’s core business is all about finding you what you need to know. The company has become synonymous with searching the internet, so it is no surprise that it will always try to innovate in that area. The next step in search? Well, that would be Knowledge Graph.
When Google launched its latest version of search, it said it was trying to build a search of the future that taps into the “collective intelligence of the web and understands the world a bit more like people do”. Knowledge Graph is all about intelligence: it’s an algorithm that understands real-world entities and how they relate to one another, represented by a panel to the right of search results that gives you a snapshot view of the topic at hand. But how did it all get started?
On a trip to the Googleplex in California, we chatted to Shashi Thakur, Google’s principal software engineer for search and the technical lead on the Knowledge Graph project. Thakur tells us that the future of search will always be wherever the user is, arguing that the smarter technology gets, the more integrated search will become.
According to Thakur, Knowledge Graph is a representation of truth in the world, making it all but ungameable by black hat SEO practitioners. The aim is that only accurate content is shown through Knowledge Graph, though Thakur admits that there is room for error. When mistakes are reported, Google is happy to correct them and continually improve.
With Google Now acting as an intelligent personal assistant on Android phones around the world, the next obvious place for Google’s smart services is those futuristic augmented reality glasses it is developing. In the future, Thakur reckons that Graph will integrate well with Google Glass, providing a seamless interface when it comes to searching, allowing users to use their pupils to move the page and scroll.
Memeburn: Can you tell us how search is evolving, especially with reference to Knowledge Graph?
Shashi Thakur: I’ve worked on a spectrum of search technologies ranging from the core of web ranking to spam detection to search features, and now the Knowledge Graph project for the last couple of years. In terms of how this is affecting search and transforming search, one thing we say often is that the Knowledge Graph launch last year was one of our biggest of the last several years. The last big one was when we went from regular website links to images and products and whatnot. This was at the same scale.
It was a fundamental shift in how we approached search, which previously was that you throw a query at the search engine — and that’s a bunch of words and phrases, and we essentially search for those words and phrases in documents, videos and images and give you back results. But it was still at the level of characters which form together a word which form a phrase — there was no intrinsic understanding of how the United States of America is this thing, and a football team is another thing, and this cricket team is another thing. But it worked fabulously well — I mean, Google is the best out there in terms of relevant web results.
The reason it worked well is we have these intelligent algorithms which mine the host of data that we have from our users or from the web to understand that certain things mean the same thing, certain words are spelling errors to be corrected, what are popular documents based on the interlinking of web structures, the whole algorithm of PageRank — they all come together to create a pretty powerful ranking experience. Still, the underlying problem is that we don’t understand about real things in the real world.
That’s where Knowledge Graph comes in. The way it started was there was Freebase. It was not Google, it was [developed by] Metaweb. They started with Wikipedia entities and various sources of entities, and they had a database of several million entities and relationships.
Since we acquired them, the ambition for the Knowledge Graph has just [increased], now it’s like half a billion entities and a few billion links between entities. Both of which are important — you need to know about things and relationships between things. Together, that gives you the power. Similar to websites — you need to know about pages and linkages between pages and that gives you power. So the ambition of Knowledge Graph just grew exponentially since the acquisition of Metaweb.
In terms of applications, the Knowledge panel that you see on searches was our biggest application of the data in Knowledge Graph. It was transformative on search. Number one, it understood the name or the phrase that you typed in. You typed in ‘New York City’ and it knows what it is. It grounds the query from that phrase into a thing in the real world.
Now what can you do with it? We try to predict the next things that you may want to know. It gives you a summary without having to ask further questions. It’ll tell you the population or we’ll give you the map or the weather or whatever it is. That’s number one, a snapshot or summary on the search page based on the vast amounts of queries that we see from users.
Number two is this notion of a scaffolding. The deepest information still lives on the web. We’ll give you a summary of New York City, but you still want hotels and flights, places nearby, who knows what. So the way the Knowledge Graph panel creates a scaffolding is if you type in something ambiguous. A lot of people’s names are ambiguous. Knowledge Graph gives you the scaffolding.
MB: What qualifies as a Knowledge Graph search?
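To make the idea of entities, links and grounding a little more concrete, here is a minimal sketch in Python. It is an illustration only, not Google’s or Metaweb’s implementation: the class `TinyKnowledgeGraph`, its methods, the entity IDs and the relation names are all hypothetical, and the handful of entities stand in for the hundreds of millions Thakur mentions. It shows a typed phrase being grounded to a thing, and an ambiguous name returning more than one candidate.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Entity:
    entity_id: str   # hypothetical ID, not a real Freebase or Google identifier
    name: str
    category: str    # e.g. "place", "person", "book", "movie"


class TinyKnowledgeGraph:
    """Toy store of things and the relationships between things."""

    def __init__(self):
        self.entities = {}                    # entity_id -> Entity
        self.links = defaultdict(list)        # entity_id -> [(relation, target_id)]
        self.name_index = defaultdict(list)   # lowercased name -> [entity_id]

    def add_entity(self, entity):
        self.entities[entity.entity_id] = entity
        self.name_index[entity.name.lower()].append(entity.entity_id)

    def add_link(self, source_id, relation, target_id):
        self.links[source_id].append((relation, target_id))

    def ground(self, phrase):
        """Map a typed phrase to candidate entities; several if the name is ambiguous."""
        ids = self.name_index.get(phrase.strip().lower(), [])
        return [self.entities[i] for i in ids]

    def summary(self, entity_id):
        """Build a small panel-style snapshot: the entity plus its direct links."""
        entity = self.entities[entity_id]
        related = [(rel, self.entities[t].name) for rel, t in self.links[entity_id]]
        return {"name": entity.name, "category": entity.category, "related": related}


def build_example_graph():
    graph = TinyKnowledgeGraph()
    graph.add_entity(Entity("nyc", "New York City", "place"))
    graph.add_entity(Entity("usa", "United States", "place"))
    graph.add_entity(Entity("dickens", "Charles Dickens", "person"))
    graph.add_entity(Entity("twist_book", "Oliver Twist", "book"))
    graph.add_entity(Entity("twist_film", "Oliver Twist", "movie"))
    graph.add_link("nyc", "located_in", "usa")
    graph.add_link("dickens", "wrote", "twist_book")
    graph.add_link("twist_film", "based_on", "twist_book")
    return graph


if __name__ == "__main__":
    graph = build_example_graph()
    # An unambiguous phrase grounds to exactly one entity.
    for candidate in graph.ground("New York City"):
        print(graph.summary(candidate.entity_id))
    # An ambiguous name yields several candidates the user can choose between.
    for candidate in graph.ground("Oliver Twist"):
        print(candidate.category, candidate.name)
```

The name index is what turns a string of characters into a thing, and the list of links is what lets a summary point onwards to related things. When `ground` returns more than one candidate, the caller can offer each as a separate path, which is roughly the scaffolding role described above.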
ST: Pretty much a search for an entity in various different categories — everything from a place to a person to a movie to a book and hundreds of such categories. We don’t distinguish between Knowledge Graph search and regular web search — you’re still on Google. There is no special Knowledge Graph search, except that your web search also goes and searches the Knowledge Graph in tandem. As far as you are concerned, you only did one search. We predicted that images are important or that you may have the following questions. It’s still the web and the summary, but we lead you to the right query, which gets you to the best of the web.
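As a rough illustration of one search consulting two sources in tandem, the sketch below runs a toy keyword match and a toy entity lookup side by side and returns a single combined response. Everything here is assumed for the example: the `DOCUMENTS` and `ENTITIES` data, the function names, and the use of a thread pool are stand-ins, and nothing is implied about how Google actually serves queries.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-ins for a web index and an entity store; all data is illustrative.
DOCUMENTS = {
    "doc1": "Hotels and flights for New York City",
    "doc2": "Charles Dickens wrote Oliver Twist",
}
ENTITIES = {
    "new york city": {"category": "place", "located_in": "United States"},
    "charles dickens": {"category": "person", "wrote": ["Oliver Twist"]},
}


def web_search(query):
    """Return IDs of documents that contain every word of the query."""
    terms = query.lower().split()
    return sorted(
        doc_id
        for doc_id, text in DOCUMENTS.items()
        if all(term in text.lower().split() for term in terms)
    )


def entity_lookup(query):
    """Return the entity record whose name matches the whole query, if any."""
    return ENTITIES.get(query.strip().lower())


def search(query):
    """Run both lookups side by side; the caller sees a single search."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        web_future = pool.submit(web_search, query)
        panel_future = pool.submit(entity_lookup, query)
        return {"results": web_future.result(), "panel": panel_future.result()}


if __name__ == "__main__":
    print(search("New York City"))   # web results plus a panel
    print(search("flights"))         # web results only, no entity matched
```

When the query does not name a known entity, the panel is simply absent and the response is an ordinary list of results, which matches the point that there is no separate Knowledge Graph search from the user’s side.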
MB: What will search look like in the future?
ST: If you ask our senior VP, his vision is that search goes where users go. You don’t have a billion people using Glass now, but you can conceive of a billion people using Glass or some such device or watches or a mobile device in your headphones. Your TVs are becoming smarter. Like it or not, users are moving there and users’ information needs are moving there.
How does search evolve to meet that? Number one, search has to be voice-activated. Secondly, it has to be natural language driven. Users are going to talk to search like an intelligent agent and ask natural questions like you would ask a friend. And the computer is responding like a friend would respond. It’s not going to talk for 10 minutes, it’ll talk for 10 seconds. ‘Did you mean this other thing?’ ‘Yes.’ ‘Ok, let me tell you more.’ That’s the end game. Conversational, voice-activated, natural language interfacing. That’s how users would consume information.
MB: Is Knowledge Graph the planet’s best bet for artificial intelligence?
ST: Artificial intelligence is a pretty broad question, because it has to be answered in the context of specific applications. If you went back ten years and I said you could interact with a machine through natural language and the machine would respond back with natural language and know a large amount of knowledge in the world, that would have seemed like science fiction. That would have seemed like the world of artificial intelligence, but I don’t think you’d have disbelieved me if I said in two, three, five years, we’ll get there.
There are other things that make for intelligent interaction with a machine. There’s your personal information, like ‘when’s my next flight?’. That together makes for an intelligent agent. I would say it is a pretty important and strong underpinning for any intelligent interaction. Do these applications, like me, understand your voice and your natural language questions, and know when to stop and respond to your next question? That is definitely intelligent.
MB: How does Knowledge Graph work for SEO practitioners trying to take advantage of it?
ST: Right now, what we say is Knowledge Graph is a representation of truth in the world. That’s number one. What we show in the panel is answering the popular questions people have about that topic. We don’t allow anybody to influence us. They can correct us if we’re wrong, and we’ll be happy to correct ourselves. But if somebody says that so-and-so doesn’t want their age revealed, because that’s not good for their profession, really the question is ‘Is that age wrong?’. If it is not wrong and people are asking for that age, then we’ll show it.
This often happens with celebrities, but that’s what people are asking for. In Japan they’re interested in blood groups. If you show somebody’s blood group in the US, that’s extremely private information. But the question is, is it true or is it not true? Do users care about it and are we serving our users or not? Those are the questions we answer. Under those concerns we are happy to take knowledge from anywhere.
MB: Google is obviously gathering a lot of data from many sources. What are some of the strangest things you’ve seen?
ST: I don’t know if it’s strange, but for instance, sumo wrestling. If we just had a Western-centric product, we wouldn’t put so much interest into sumo wrestlers. But we do, because we are in Japan. There’s no cricket in America, but in South Africa, and growing up in India, you know about cricket. So we cannot serve even the English-speaking world without things like that.
Then there is the non-English-speaking world, which you have to not just localise, but also internationalise. You have right-to-left languages such as Arabic, and correct translations of people’s names. In the US, ‘spouse’ has a specific meaning: husband or wife. But in some languages, you have to explicitly call out ‘husband’ and ‘wife’.
In terms of availability in the world, [Knowledge Graph] is available wherever Google serves, not necessarily all languages. We have launched in English, French, Italian, Spanish, Portuguese, German and Russian, and we keep adding more.
MB: Google products are all incredibly integrated. How is Graph integrating with Google Now?
ST: It’s still early days, and nothing I’m going to say will be a complete exposition of the space. But if you go back to that personal assistant business, a personal assistant knows about everything in the world, but it also knows about your next flight, that memo from your boss. All of that becomes a useful application for you when you are in the line and you want to see the barcode for the boarding pass. That’s really the kind of thing a personal assistant makes really useful for you. That’s part of Google Now.
MB: How would Glass and Knowledge Graph work together?
ST: Now is a query-less interface — it brings it up without you asking. It knows your calendar, it knows your destination, it knows the map, it knows the routes. That fits in totally with Glass.
Same thing with questions. In a traditional search interface, you want to know the books by Charles Dickens. You might have read the Wikipedia or the Amazon page. But certainly that’s not the right thing in a Glass interface. What works there is showing a list of the popular books, having Glass use your pupils to move the screen, and telling you more about Oliver Twist. The same conversation, with a different interface.
MB: What is the next phase for Knowledge Graph?
ST: A continuously improving repository of world knowledge, more topics, more types, more connections, more languages, more regions. That’s one core thing. That’s the underpinning for all of this. Everything that we build keeps surfacing it — [in addition to] answers like Charles Dickens books, now you’ll have books from some extremely rare author. A regionally famous author in Vietnam — we want to show you the books for that. The depth and the reach of the Knowledge Graph across regions, across topics, across people. That’s one big thing.
The Charles Dickens books [example] illustrates another point — exploration. It’s not only about showing an answer, and now you’re done. Now you go away. Exploration is ‘Here’s a summary, now tell me more about Oliver Twist’. It’s not a one-shot conversation. Me giving you a list of ten books is not the end of our conversation, you want to know more about some of these things. Enabling these deeper conversations — that’s the other big thing. It ties into this bigger world of conversational and voice-directed interfaces.