Machine learning is the future of Internet search.
With RankBrain, we finally have an artificial intelligence (AI) system that aims to answer all of our questions. In October 2015, Google announced that this new system would become an inseparable part of its search engine. Its main function is to better understand queries and provide end users with more relevant results.
So, how does RankBrain do this?
Because RankBrain is a program, it first has to convert language into something a machine can process. It maps words and phrases onto mathematical entities called vectors, which it can then compare and interpret. After that, the phrases are grouped into clusters based on their meaning. If two clusters are related, they are placed near each other. In that way, RankBrain is able to understand synonyms and related phrases and to make connections between them.
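To make the idea of "nearby" vectors more concrete, here is a minimal sketch in Python. It is not Google's implementation: the three-dimensional vectors are invented for illustration (real systems learn vectors with hundreds of dimensions from text), and the function names are our own. It only shows how closeness between vectors can stand in for closeness in meaning.

```python
import math

# Hypothetical toy vectors, invented for this example.
# Real word vectors are learned from large amounts of text.
TOY_VECTORS = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "dog": [0.8, 0.3, 0.2],
    "laptop": [0.1, 0.2, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_word(word, vectors=TOY_VECTORS):
    """Return the other word whose vector points in the most similar direction."""
    target = vectors[word]
    candidates = (w for w in vectors if w != word)
    return max(candidates, key=lambda w: cosine_similarity(target, vectors[w]))
```

With these made-up numbers, nearest_word("cat") returns "kitten", while "laptop" ends up far away from both. A search engine could use the same kind of comparison to treat related phrases as neighbors.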
First introduced in 2012, Knowledge Graph is a sophisticated system that treats keywords as real things, the same way humans see them, instead of as plain strings. It draws on numerous databases to extract information. As a result, Google started understanding relations between objects, marking the transition toward semantic search.
With Hummingbird, Google managed to take its product a step further. In 2013, the company implemented this algorithm in its search engine. Like Knowledge Graph before it, Hummingbird included semantic search and was able to understand synonyms and related keywords. But there was one big difference: with this algorithm, Google was also able to understand user intent. In a sense, if Knowledge Graph was the first step toward the semantic Web, Hummingbird was the initial phase toward machine learning.
About 15 percent of the queries Google receives are completely new. On a daily basis, that translates to about 450 million searches (out of roughly 3 billion in total). Unfortunately, Google can't find a valid answer to many of these questions. It is no wonder that the company is constantly trying to find ways to reduce this percentage. Thus, Google invented RankBrain.
The thing you have to understand about RankBrain is that it is actually a part of the Hummingbird algorithm: it builds on the progress made in the past and polishes it even further. Back in the day, on-page optimization was based on keyword density (the percentage of a text taken up by repetitions of a keyword). Given that this led to several SEO malpractices, such as unreadable content, Google had to invent something that would improve user experience (UX).
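For readers who never had to calculate it, keyword density can be sketched in a few lines of Python. The formula below (occurrences of the keyword divided by the total number of words, times 100) is one common convention and an assumption on our part; SEO tools have used slightly different variants, and the function name is our own.

```python
import re


def keyword_density(text, keyword):
    """Return keyword occurrences as a percentage of all words in the text."""
    # Lowercase and split into simple word tokens (letters, digits, apostrophes).
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase = re.findall(r"[a-z0-9']+", keyword.lower())
    if not words or not phrase:
        return 0.0
    n = len(phrase)
    # Count every position where the full keyword phrase appears.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase)
    return 100.0 * hits / len(words)
```

A text like "cheap shoes cheap shoes buy cheap shoes now" scores 37.5 percent for the keyword "cheap shoes", which is exactly the kind of unreadable, stuffed content the old approach encouraged.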
Of course, semantic search and understanding user intent are great. However, the true value of this AI system lies in its learning capabilities. Even with all the synonyms and related keywords, Google still had issues understanding certain queries. In some cases, it gave improper weight to certain words, prioritizing parts of a phrase instead of the phrase as a whole. With RankBrain, Google is able to understand slang and questions that were previously ambiguous. This system is excellent for long-tail keywords. Furthermore, when processing new queries, it is able to make sound predictions based on previously accumulated information.
There is another thing that needs to be mentioned. Back in the Hummingbird era, when you entered something in the search bar, you would get results that were the same as or similar to the phrase you entered. With RankBrain, when you type something in, you get a direct answer to your question, even if it has no direct connection to the words you typed in the bar.
Example:
- Hummingbird - Search: "How old is Will Ferrell?", Result: "Will Ferrell - Wikipedia, the free encyclopedia"
- RankBrain - Search: "How old is Will Ferrell?", Result: "48"
One thing is certain: as the initial description implies, RankBrain is a system that is able to learn and improve as time goes by. With this in mind, it can only get better at assisting Google users, even without further polishing.
Given its name, you could presume that this system is directly connected to ranking signals. In a recent Google Q&A, we learned that RankBrain is now one of the three main Google ranking factors, together with content and links. But there is a lot of debate in the SEO world about this. Many experts see RankBrain not as a direct ranking factor but as something that affects queries and, through them, search as a whole.
In a way, we can observe it from the same perspective as Hummingbird. As long as you are creating normal, natural content, you will be visible to Google. According to our predictions, RankBrain probably won't have a drastic impact on on-page optimization.
Why is that? Simply put, this update helps people who are searching for certain results, but it shouldn't affect the articles themselves, given that semantic search has been in place for some time now.
For example, if your article is optimized around the phrase "pet food", visitors could previously have reached your page by typing things such as "pet products", "animal food" or "things for my cat". With RankBrain, Google will most likely be able to lead visitors to your page even if they type something obscure, such as "the stuff that cat puts in its mouth". In that regard, your mission as an author doesn't change.
We can safely presume that Google will become smarter. At this point in time, this is pretty much a fact. Nevertheless, the concept of LSI (latent semantic indexing) keywords will most likely remain in place as the best way to convey an idea to the search engine. Given that RankBrain will be used to understand language patterns and slang, as well as to connect the dots within the content, its main purpose will be to help Google users. Copywriters will still have to concentrate on semantics and the general meaning of the text if they wish to be visible.
This is the million-dollar question. Google is usually very secretive when it comes to its technology (which is to be expected). But we can speculate that click-through rate, time spent on a website and bounce rate will become the most important indicators that help Google determine whether a certain article is a good fit for a certain query.
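Since these three metrics come up repeatedly, here is a rough Python sketch of how they are usually calculated in web analytics. Nothing here reflects how Google actually measures or weighs them, which remains undisclosed; the Visit record, its fields and the "one page viewed counts as a bounce" rule are simplifying assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """Hypothetical record of a single visit that started from a search result."""
    seconds_on_site: float
    pages_viewed: int


def click_through_rate(impressions, clicks):
    """Share of search impressions that resulted in a click."""
    return clicks / impressions if impressions else 0.0


def bounce_rate(visits):
    """Share of visits that ended after a single page (assumed bounce definition)."""
    if not visits:
        return 0.0
    bounces = sum(1 for v in visits if v.pages_viewed <= 1)
    return bounces / len(visits)


def average_time_on_site(visits):
    """Mean number of seconds spent per visit."""
    if not visits:
        return 0.0
    return sum(v.seconds_on_site for v in visits) / len(visits)
```

For example, a page shown 200 times and clicked 10 times has a click-through rate of 0.05, or 5 percent. A low bounce rate combined with a long average visit would suggest that visitors found what they were looking for.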
Another thing to note: RankBrain doesn't only help with slang and long phrases. It also helps with keywords that have numerous meanings. In that regard, as it starts learning from click-through rate, time spent on a website and bounce rate, it will most likely favor the meaning that has more monthly searches (Apple the computer company rather than apple the fruit). This can pose quite a problem for those who are working with less-searched keywords. Nevertheless, as always, people will keep looking until they find the result they want.
Now, we could suggest some SEO tricks, such as great titles or intriguing meta descriptions, that will attract some of these floating visitors. However, given RankBrain's learning potential, this will most likely be temporary, as it eventually starts leading people to the websites with the most relevant information for their query. In other words, a user would have to ask something really unusual and obscure, given that the system already has answers to the most common questions. Besides, do you really wish to have visitors who are not interested in your topic?
At the beginning of the article, we mentioned that RankBrain understands connections between terms through a system of vectors. Here is an example that was provided by Google.
As you can see, it shows the connections between countries and their capital cities. However, this is only part of the story. The relationships between the countries are much more interesting. On this graph, you can see Mediterranean countries such as Greece, Spain, Italy and France grouped together. Similarly, France, Germany and Poland are close to each other (given that they are in the center of Europe), with Germany and France especially close (most likely due to their shared border as well as the close diplomatic and economic relationship between the two countries).
As RankBrain processes all the articles and news feeds on the Internet, the AI system is able to notice patterns and recognize similarities between different countries.
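The country-to-capital relationship can be imitated with a small Python sketch. The two-dimensional coordinates below are invented so that every capital sits at roughly the same offset from its country; they are not taken from Google's graph, and the function name is our own. The point is only to show how a consistent offset between vectors lets a program "complete" a relationship it was never told about explicitly.

```python
import math

# Hypothetical 2-D coordinates, invented for this example.
TOY_POINTS = {
    "France": (1.0, 1.0), "Paris": (1.2, 3.0),
    "Germany": (2.0, 1.2), "Berlin": (2.1, 3.3),
    "Spain": (0.5, 0.4), "Madrid": (0.8, 2.5),
}


def complete_analogy(a, b, c, points=TOY_POINTS):
    """Answer 'a is to b as c is to ?' by applying the a->b offset to c."""
    target = tuple(pb - pa + pc for pa, pb, pc in zip(points[a], points[b], points[c]))
    # Pick the closest remaining word to the computed target point.
    candidates = (w for w in points if w not in (a, b, c))
    return min(candidates, key=lambda w: math.dist(target, points[w]))
```

With these toy numbers, complete_analogy("France", "Paris", "Germany") returns "Berlin", and swapping in "Spain" returns "Madrid".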
So, how does this apply in practice?
Let's take the keyword "apple" as an example. We all know that this word can refer to both a fruit and a technology company. As such, it can pose a real problem for users. We tried two different queries:
- What is apple?
- What is an apple?
In both cases, the first thing that popped up was the dictionary definition of "apple", which describes the fruit. However, for the "what is apple?" query, the search engine then gave us information about the company, whereas "what is an apple?" gave us numerous results about the fruit. Given that most people who search for "apple" are looking for information about the tech company, it is only logical that the plain, basic form gives results about Apple Inc. On the other hand, Google recognized that "an" and "apple" usually go together when we talk about the apple tree, so it showed us results about the fruit.
There is an additional interesting innovation introduced with this machine learning system. According to Google, RankBrain will distinguish queries based on location. Granted, this is not a new concept, given that results already varied slightly depending on where you live. However, this difference will become even greater when some results get removed because RankBrain deems them irrelevant for a certain country or region.