Most people use Google to find their way around the Internet. In fact, Google is the first thing that pops into your mind when we talk about internet search. So much so that we prefer the verb “googling” to “searching the web”.
But how does Google’s search engine actually work? Have you ever thought about it?
As you guessed, there is a complex mathematical algorithm behind it. The approach it relies on is referred to as semantic search.
Although words such as algorithm and semantics may sound intimidating, they actually aren’t. Semantic search is a rather easy concept to comprehend.
I’ve decided to create this article so that even newbies can understand the complex mechanisms behind Google’s search engine. Check it out!
Semantic search as an algorithm
Before we start with anything else, it is necessary to explain what an algorithm is.
In general, all information technology is based on algorithms. An algorithm is a set of step-by-step instructions that a program follows.
Still not clear enough?
OK, let’s use a simple calculator as an example.
When you type in 1 + 1, you get 2 as the final result. However, if you change one of the two inputs and enter 3 instead of 1, the result will be different: 1 + 3 = 4.
Whenever one of the inputs changes, you get a different result. However, different inputs can also lead to the same result (1 + 1 = 2 and 4 / 2 = 2).
Basically, based on your input you will get a specific result.
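If you like to see things in code, here is a minimal sketch of the calculator idea in Python. The function name `calculate` and the two supported operators are my own assumptions for illustration; a real calculator does a lot more.

```python
def calculate(a, operator, b):
    """Follow a fixed set of steps: pick the operation, apply it, return the result."""
    if operator == "+":
        return a + b
    if operator == "/":
        return a / b
    # Anything else is outside this toy example.
    raise ValueError(f"unsupported operator: {operator}")


print(calculate(1, "+", 1))  # 2
print(calculate(1, "+", 3))  # 4
print(calculate(4, "/", 2))  # 2.0, the same value as 1 + 1
```

The steps never change. Only the input does, and the input decides the result.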
Still got some questions? We will talk more about the importance of the algorithm, so keep on reading.
So, what does semantics mean?
Semantics is a branch of linguistics which is primarily dedicated to the study of word and phrase meaning.
Some words and phrases can have a set of different meanings. Because of this, it may be hard for a computer to understand the true intent behind a query.
Google has worked long and hard to create an algorithm that interprets queries the same way people do. This is how Google’s semantic search came about.
Nowadays, when a person enters a keyword in the search bar, Google won’t simply return articles containing the same string of words. Instead, its sophisticated algorithm will try to recognize the true meaning behind the search and provide a valuable answer.
With semantic search in place, the algorithm is able to comprehend the words and return the relevant data.
Short history of semantic search
Google’s semantic algorithm didn’t develop overnight. It’s a product of continuous work:
- Knowledge Graph (2012)
- Hummingbird (2013)
- RankBrain (2015)
- Related Questions and Rich Answers (ongoing)
Back in the day, Google used some rudimentary algorithms. A specific keyword was the main focus of the search. So, by filling your article with that same phrase, you could easily reach the top results within this search engine.
In other words, you could easily cheat the algorithm. The result? End users were provided with low-quality, irrelevant garbage.
The first steps towards the semantic web were made with the introduction of the Knowledge Graph in 2012. This is a knowledge base that lets Google provide more data on a topic. Besides links to other resources, Google would also show some additional info in the sidebar.
In 2013, Google’s algorithm was further refined with Hummingbird.
Unlike previous algorithms, Hummingbird tried to answer the actual question instead of simply providing the articles with the highest keyword density. It would dissect the entire query, find connections between the words and try to provide the most suitable article to satisfy the user’s intent.
RankBrain was the next step. Introduced in 2015, it was hailed as a revolutionary system that might even end the SEO profession.
RankBrain took everything a step further. According to Google, this system is able to build on the Hummingbird basis and to continuously improve itself without human interference. Through a system of trial and error, it would be able to learn on the go, continuously refining the results.
Given that RankBrain is currently Google’s go-to algorithm, you probably wish to learn more about it. It’s also considered one of the top 3 ranking signals. Make sure to read my article for additional info.
Why is semantic search important for regular users?
As an Internet user, you require valid answers to your questions.
However, getting those answers can sometimes be harder than it seems.
Google’s search engine is said to use approximately 200 different ranking factors to determine where a specific article appears in the SERPs.
Casual users may not know this, but you actually have to overcome two different obstacles in your search for valid answers:
- Semantic obstacle – The way you define your query and the way Google interprets it
- Relevancy obstacle – Quality of the returned results
Let me expand on that.
Even with all the sophisticated algorithms, Google may misinterpret your query.
Semantic obstacles are possible during a conversation between two people, let alone between a person and a computer. There are a lot of ambiguous queries which may point in the wrong direction.
Also, a query can point to various things and, in some cases, there will be no clear answer.
For example, you might type in “Unique swords” while looking to buy an authentic katana. Google may return results that are littered with European swords, swords from video games or even props from movies.
The relevancy obstacle is the second thing we need to consider.
A common issue with Google is that it tends to give a clear advantage to authoritative websites. Here, the search engine presumes that reputable websites will provide the best information.
In reality, this is often not the case.
A smaller website may prove to be more diligent and create an amazing piece on the topic. Unfortunately, given that Google gives a clear advantage to authoritative sites, you may never see these articles. Especially if you don’t know SEO.
Based on that, it’s obvious that semantics plays a crucial role in returning better search results.
How can bloggers benefit from semantic search?
Most bloggers know how important SEO is.
The semantic web is the next logical phase in the development of the Internet. Nowadays, it is no longer enough to write nice articles. Your piece should also have the required term frequency to rank better.
Wait! What is term frequency? I will explain in a bit.
First, let’s discuss two kinds of phrasing that an article needs in order to rank:
- User-oriented phrasing
- Algorithm-oriented phrasing
User-oriented phrasing refers to phrasing that is compelling to the majority of readers.
It is common knowledge that people react to words such as “amazing”, “awesome”, “incredible”, “cool”, “must watch”, “can’t miss” etc. There are certain psychological triggers that will make an article more readable, more interesting and more likely to be shared.
Nowadays, user experience or UX is one of the main things to be considered when writing copy. But that is not all.
Bloggers should also include algorithm-oriented phrasing within their content.
Here’s an example to illustrate this.
When a user types “Aladdin’s magic lamp”, articles with this exact phrase will most likely pop up first. After that, the results (articles) will become less relevant.
As you keep scrolling down, results will deteriorate in terms of relevancy. Articles will start having the words “Aladdin”, “magic” and “lamp” scattered throughout the text, not as a whole phrase.
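To make this concrete, here is a toy sketch in Python. It is only an illustration of the idea and certainly not how Google really ranks pages: the scoring rule (one point for the exact phrase, plus the share of query words found anywhere in the text), the function names and the three sample texts are all made up for this example.

```python
import re


def tokenize(text):
    # Lowercase, strip possessive endings ('s), and keep alphabetic words only.
    text = re.sub(r"['’]s\b", "", text.lower())
    return re.findall(r"[a-z]+", text)


def relevance(query, document):
    q = tokenize(query)
    d = tokenize(document)
    if not q:
        return 0.0
    n = len(q)
    # Exact phrase: the query words appear next to each other, in order.
    has_phrase = any(d[i:i + n] == q for i in range(len(d) - n + 1))
    # Scattered words: share of query words found anywhere in the text.
    coverage = sum(word in d for word in q) / n
    return (1.0 if has_phrase else 0.0) + coverage


# Hypothetical sample texts.
documents = {
    "exact": "The story of Aladdin's magic lamp is retold here.",
    "scattered": "Aladdin rubbed an old lamp and something magic happened.",
    "partial": "An antique oil lamp for sale.",
}

query = "Aladdin's magic lamp"
ranked = sorted(documents, key=lambda name: relevance(query, documents[name]), reverse=True)
print(ranked)  # ['exact', 'scattered', 'partial']
```

The text with the whole phrase comes first, the one with all three words scattered around comes second, and the one that only mentions a lamp ends up last.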
This sounds simple and makes sense, right?
But what happens when there are numerous articles with the same phrase? I am talking about millions of results. Then, relevancy will NOT be determined by keyword density anymore. Instead, Google will turn to TF-IDF.
Remember when I told you that it is no longer beneficial to spam one and the same keyword throughout the article? Here is the reason why.
When determining the relevancy of an article, Google no longer refers to a single word or phrase. As mentioned, this method can easily be compromised. Instead, Google looks at the article as a whole.
We used “Aladdin’s magic lamp” as an example.
Now, let’s presume there are hundreds of websites that have the exact keyword within their article. How do we determine which one is the best?
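Simple. We use TF-IDF, or term frequency – inverse document frequency.
TF-IDF measures how important a word is within a text: how often it appears there (term frequency), weighted by how rare it is across all other documents (inverse document frequency). For our purposes, it also tells us how frequently certain words are expected to appear.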
OK, but what’s expected frequency?
Back to our example again.
When we talk about “Aladdin’s magic lamp”, it is expected that words such as Jasmine, magic carpet, sabre, Arabic, Middle East will appear within the text. Also, words such as Aladdin and Jasmine will be much more frequent than the word Arabic.
So, what if we only mention Aladdin’s magic lamp but our text lacks the other important words? Well, then Google can safely assume that this article will not be that relevant for the user.
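For the curious, here is a minimal sketch of the textbook TF-IDF calculation in Python. It uses the common formula (term frequency multiplied by the logarithm of the total number of documents divided by the number of documents containing the term). Whatever variant Google uses is not public, so treat this purely as an illustration. The three tiny “documents” and the function name `tf_idf` are invented for the example.

```python
import math
from collections import Counter


def tf_idf(term, doc_tokens, corpus):
    # Term frequency: how often the term appears in this document.
    tf = Counter(doc_tokens)[term] / len(doc_tokens)
    # Document frequency: in how many documents the term appears at all.
    df = sum(1 for tokens in corpus if term in tokens)
    if df == 0:
        return 0.0
    # Inverse document frequency: rarer terms get a higher weight.
    idf = math.log(len(corpus) / df)
    return tf * idf


# Hypothetical mini corpus, already split into words.
corpus = [
    "aladdin rubbed the magic lamp and the genie appeared".split(),
    "the lamp on the desk is broken".split(),
    "jasmine and aladdin flew on the magic carpet".split(),
]
first = corpus[0]

for word in ("the", "lamp", "genie"):
    print(word, round(tf_idf(word, first, corpus), 3))
```

A word like “the” appears in every document, so its score is 0. “Lamp” appears in two of the three texts and gets a small score, while “genie” appears in only one of them and scores highest. In other words, rare topic-specific words say more about what a text is about than words that show up everywhere.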
Now, you are probably wondering how your website can profit from this knowledge.
Simple – by finding the expected frequency of core words, which you can do by copying the frequency used by top competitors. After all, if they reached the first page of Google, they are probably doing something right.
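Just to give you a rough idea, here is a small Python sketch of that comparison. Everything in it is an assumption made for illustration: the function names, the list of core words and the sample texts. In practice you would paste in the real text of the top-ranking articles and of your own draft.

```python
import re
from collections import Counter


def term_frequencies(text):
    # Relative frequency of every word in a text.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return {word: count / len(words) for word, count in counts.items()}


def expected_frequency(competitor_texts, core_words):
    # Average relative frequency of each core word across competitors.
    freqs = [term_frequencies(text) for text in competitor_texts]
    return {
        word: sum(f.get(word, 0.0) for f in freqs) / len(freqs)
        for word in core_words
    }


def underused_words(draft, competitor_texts, core_words):
    # Core words that the draft uses less often than competitors do on average.
    expected = expected_frequency(competitor_texts, core_words)
    mine = term_frequencies(draft)
    return [word for word in core_words if mine.get(word, 0.0) < expected[word]]


# Hypothetical texts standing in for top-ranking articles and your own draft.
competitors = [
    "Aladdin and Jasmine found the magic lamp",
    "Jasmine saw Aladdin rub the lamp",
]
draft = "Aladdin found a lamp"

print(underused_words(draft, competitors, ["aladdin", "jasmine", "lamp"]))  # ['jasmine']
```

Here the draft never mentions Jasmine, even though both competitors do, so that is the word to work into the text.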
I would have to write an additional piece in order to give you a step-by-step process. So, if you wish to learn more, download our upgraded content.
If you wish to create well-optimized content, you shouldn’t focus on text in the traditional sense. Instead, you should focus on the words and word combinations which Google expects to see.
In this day and age, user feedback also plays a crucial role in determining the importance of content. You will have to cater to both sides: the readers and the algorithm. Create content with lots of synonyms and semantically related words incorporated in it. Try to be provocative and readable at the same time.
How do you approach content creation? Do you use a different strategy? Share it in the comment section below.