Search engines like Google have moved away from concepts like keyword density. These things used to work well in the past, but as Google has evolved it’s become better at identifying what a piece of content is all about by taking into account other elements such as related entities.
Artificial Intelligence (AI) in the form of Natural Language Processing (NLP) makes Google much more capable of processing information and understanding the context of the content that exists in its index. Unfortunately, many SEO professionals and content creators have yet to understand these changes and thus are still using tactics like keyword stuffing in the hope of getting results.
In this guide, we’ll talk about entities for SEO, explain what they are, and break down the process of using them to optimize your content. If you want to learn more about entities and entity optimization, keep reading.
What Are Entities for SEO?
We’ll begin with the most obvious question. According to a 2016 patent by Google, “an entity is a thing or a concept that is singular, unique, well-defined, and distinguishable”. Google continues by stating that an entity can be a person, place, item, idea, abstract concept, or concrete element.
Image Source: Google Patents
Now, you may be wondering why that matters and how it’s connected to SEO. In search engine optimization, everything starts with a search query typed into Google. Let’s assume that the search query in our example is “video conferencing”.
If you’re like most people, when you hear the term “video conferencing”, you think of terms such as:
- Video calls
- Screen sharing
- Video conferencing solutions
- Zoom
- Google Meet
- Webinars
- Video meetings
- Skype
For the most part, those are the terms that Google will be “thinking” about as well. It does so through entities, among other methods. Remember what we said earlier? An entity is:
- Singular
- Unique
- Well-defined
- Distinguishable
These are the four characteristics that Google has assigned to entities. Let’s use a couple of examples to make this simpler. Let’s say that we want to extract entities (elements that are singular, unique, well-defined, and distinguishable) from the sentence “how to run a video conference with zoom”. We’re going to insert that piece of text into Google’s NLP API and click “Analyze”.
Image Source: Google’s NLP API
Here’s what we’re going to see next:
Image Source: Google’s NLP API
These two terms – Zoom and video conference – are the two entities Google’s NLP has extracted from the piece of text that we’ve inserted. As we can see, the first term is labeled as “Other” while the second is labeled as “Event”. (More on that later.)
Simply put, entity extraction is one of Google’s ways of understanding what a piece of text is all about. This way, Google’s making associations between search queries and the pieces of content that exist in its index so it can serve up the most relevant results to the searcher.
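If you’d rather script this step than use the demo page, here’s a minimal sketch of what a request to the Natural Language API’s entity analysis endpoint could look like. It only builds the JSON body and reads a response; sending the request (and authenticating) is left out. The helper names `build_entity_request` and `summarize_entities` are our own, and the sample response is hand-written to mirror the screenshot above, not real API output.

```python
import json

# Public REST endpoint for entity analysis in the Cloud Natural Language API.
ANALYZE_ENTITIES_URL = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_entity_request(text: str) -> str:
    """Return the JSON body for an analyzeEntities call on plain text."""
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }
    return json.dumps(body)


def summarize_entities(response: dict) -> list:
    """Pull (name, type) pairs out of an analyzeEntities-style response."""
    return [(e["name"], e["type"]) for e in response.get("entities", [])]


# Hand-written sample shaped like an API response (illustrative only).
sample_response = {
    "entities": [
        {"name": "Zoom", "type": "OTHER"},
        {"name": "video conference", "type": "EVENT"},
    ],
    "language": "en",
}

print(build_entity_request("how to run a video conference with zoom"))
print(summarize_entities(sample_response))
```

Keeping the payload builder separate from the parser means you can try the parsing logic on saved responses without calling the API each time.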
Entities can be of several types, some of which are:
- Person
- Location
- Organization
- Event
- Consumer good
- Address
- Date
- Number
Author’s Note: You can visit this page to learn more about different types of entities.
For example, “New York City” is identified by Google as a Location.
Image Source: Google’s NLP API
According to Google, for most entity types, “the associated metadata is a Wikipedia URL (wikipedia_url) and Knowledge Graph MID (mid)”. This is confirmed by the screenshot we shared above, where the associated metadata is a Wikipedia URL about “New York City”.
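As a rough illustration of where that metadata sits, the sketch below reads the `wikipedia_url` and `mid` fields from a single entity. The function name `entity_links` is hypothetical, the entity dictionary is hand-written, and the `mid` value is a placeholder rather than the real Knowledge Graph identifier.

```python
def entity_links(entity: dict) -> dict:
    """Return the name plus any Wikipedia URL and Knowledge Graph MID."""
    metadata = entity.get("metadata", {})
    return {
        "name": entity.get("name"),
        "wikipedia_url": metadata.get("wikipedia_url"),  # None if absent
        "mid": metadata.get("mid"),                      # None if absent
    }


# Hand-written example entity; the "mid" is a placeholder, not a real ID.
sample_entity = {
    "name": "New York City",
    "type": "LOCATION",
    "metadata": {
        "wikipedia_url": "https://en.wikipedia.org/wiki/New_York_City",
        "mid": "/m/PLACEHOLDER",
    },
}

print(entity_links(sample_entity))
```

Entities without metadata (most of the ones labeled “Other”) simply come back with `None` for both fields.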
When Google is determining what results to pull from its index and show to the searcher, it ranks the entity references based on four ranking signals:
1) Relatedness
This is determined by how often two entities appear together on the pages in Google’s index. It reflects how relevant the two entities are to each other and how they combine to convey meaning.
For example, the entities “best football player” and “Messi” are often mentioned together across many pages that exist in Google’s index. Thus, when you insert a search query such as “best football player of all time”, you get something like:
Image Source: Google
In other words, Google has made an association between the two entities and thus serves the result on the SERPs when a searcher is conducting a search. Of course, there are other factors that determine what’s going to be shown on the SERPs, e.g. PageRank, backlinks, overall authority of the website, etc.
However, in general, entities and entity references heavily influence the search results that we all see when using Google search.
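To make the idea of co-occurrence more concrete, here’s a toy sketch that counts how many pages mention two entities together. This is our own simplification and not how Google computes relatedness; the page data is invented, and the names `cooccurrence_counts` and `pair_count` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(pages) -> Counter:
    """Count, for every pair of entities, the pages that mention both."""
    counts = Counter()
    for entities in pages:
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    return counts


def pair_count(counts: Counter, a: str, b: str) -> int:
    """Look up a pair regardless of the order the caller gives."""
    return counts[tuple(sorted((a, b)))]


# Invented toy data: each set is the entities found on one page.
pages = [
    {"Messi", "best football player", "Barcelona"},
    {"Messi", "best football player"},
    {"Messi", "Argentina"},
]

counts = cooccurrence_counts(pages)
print(pair_count(counts, "best football player", "Messi"))
```

The more pages on which a pair shows up together, the stronger the association in this simplified model.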
2) Notability
Notability has to do with how notable an entity is in terms of links, reviews, online mentions, and relevancy. In general, the more notable an entity, the higher the likelihood of dominating for a particular topic or search query.
For example, if we’re searching for “the best video conferencing software” and Zoom is one of the tools with the most mentions, links, reviews, and the highest relevancy, then chances are it’s going to be included in the top results on Google for our search query.
Image Source: Google
In this example, two things aren’t random:
- The fact that Zoom is included first in G2’s product category,
- The fact that Google has selected G2’s result to be displayed in the featured snippet for that search query.
Since “Zoom” is one of the most notable entities when it comes to the associated entity of “video conferencing software”, it’s only logical that a result that includes Zoom among other products will rank at the top of the search results. (We’ll come back to this.)
3) Contribution
This signal has to do with the contribution of the entity to the topic. It’s mostly affected by external factors such as links and reviews. According to an article by Dave Davies, CEO at Beanstalk Internet Marketing, on Search Engine Journal, “a review from a well-established and respected food critic would add more to this metric than Dave’s rant on Yelp about the price because their entity contribution in the space is higher.”
Even though we don’t have sufficient data to back up this claim, we believe that contribution is one of the reasons why links from semantically relevant websites matter most – and can have a higher impact overall – when it comes to link acquisition. Of course, it’s also why link building — the way it’s done most of the time — is wrong.
4) Prizes
This signal has to do with the prizes and awards associated with a specific entity. For example, LearnWorlds, one of our clients and one of the fastest-growing SaaS companies in the eLearning industry, has various prizes and awards from various competitions and review sites.
Not only is it important that LearnWorlds, as an entity, is associated with those awards in the algorithmic eyes of Google, but it can help it improve its rankings for relevant topics.
This is why we recommend that all our clients that receive awards and prizes ask the website giving them the prize to link back to their website using a branded anchor text. This can make the association between the entities even more tightly bonded.
Even though entities have existed for some years now, they still haven’t got the attention they deserve from the wider SEO community. Let’s move on to the next section, where we’ll discuss why entities are important.
Why Are Entities Important?
So far, we’ve seen what entities are and have examined them on a macro level. To understand why they’re important — besides the obvious reasons — and how they could help you shape your content strategy, we have to examine them on a micro level. To do that, we’re going to use two examples.
By now, you’ve understood that since entities are taken into account when forming the results on the SERPs, their importance can’t be overlooked. Let’s see how that happens using Clearscope — one of the tools we use for content creation for our clients.
Clearscope is a tool that uses Artificial Intelligence (AI) to provide recommendations for words and phrases that you need to include in your content for any given topic. Here’s what you get for the term “COVID-19”,
Some of these terms are:
- Public health
- Pandemic
- CDC
- Social distancing
- Stay home
- Handwashing
Author’s Note: In the example we’re using, results are heavily dominated by the latest news, because the intent behind the query indicates that the searcher, in most cases, is looking for the latest news regarding the pandemic.
Do any of these terms come to mind when thinking about “COVID-19”? If you’re like most people, then the answer is “yes”. The question is: How can content and SEO software like Clearscope come up with those terms and why is it important?
Clearscope is able to identify, or extract, those terms by analyzing the top results on Google. Google is ranking its search results based on entities. Here are some of the results analyzed and taken into account to extract the terms, i.e. entities, that have to be used to be topically relevant based on our target topic.
Thus, these terms are basically what Google wants to see to determine whether a piece of content is relevant to the target query or topic, which in our case is “COVID-19”.
Author’s Note: In fact, Clearscope is doing entity extraction using Google’s NLP API and IBM Watson.
To determine the quality of a piece of content based on the usage of suggested terms, or entities, Clearscope is using a metric called “Content Grade”. According to a recent study of 11.8 million search results by Backlinko,
“Comprehensive content with a high “Content Grade” (via Clearscope), significantly outperformed content that didn’t cover a topic in-depth.”
Image Source: Backlinko
In plain English, this means that the right use of entities, based on the entities identified on top ranking results for a given topic, can have an impact on rankings. Of course, here we have to be aware of two things:
- The study we’re referring to is a correlation study and correlation doesn’t necessarily prove causation,
- There are many other factors that have an impact on a page’s rankings, such as the number of relevant links.
In general, what we can take from this example is the fact that entities matter when it comes to search engine optimization. We don’t know how much they matter, but it seems that they’re an increasingly important factor in the way Google is forming its top results on search results pages.
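To get a feel for what a term-based score measures, here’s a deliberately simple sketch that checks how many recommended terms appear in a draft. It is not Clearscope’s “Content Grade” formula, which we don’t know; the function name `term_coverage` and the draft sentence are our own, and plain substring matching is an assumption that will miss plurals and other variations.

```python
def term_coverage(draft: str, terms: list) -> tuple:
    """Return (share of terms found in the draft, list of missing terms)."""
    if not terms:
        return 0.0, []
    text = draft.lower()
    missing = [t for t in terms if t.lower() not in text]
    score = (len(terms) - len(missing)) / len(terms)
    return score, missing


# Terms taken from the "COVID-19" example earlier in this guide.
recommended = [
    "Public health",
    "Pandemic",
    "CDC",
    "Social distancing",
    "Stay home",
    "Handwashing",
]

# Invented one-sentence draft, used only to exercise the function.
draft = "The CDC recommends social distancing and handwashing during the pandemic."

score, missing = term_coverage(draft, recommended)
print(round(score, 2), missing)
```

The list of missing terms is usually the more useful output, since it tells you what to consider covering next.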
The second example has to do with one of the pieces we published recently on our blog. Our Clearscope guide is a walkthrough of Clearscope, covering the main aspects of using the tool for content creation and updates. When we decided to write this piece of content, we didn’t conduct keyword research, and our decision wasn’t affected by the search volume of the term — we just wanted to create a thorough guide on how to use the tool.
According to Ahrefs, this piece now ranks in position seven for the target term “Clearscope”, even though we weren’t interested in ranking for the term!
Two things to consider here are that a) the difficulty of the keyword isn’t high and b) our domain rating, or authority, and overall organic visibility aren’t high. Thus, the most likely reason we rank for the target term is that we created something that fits Google’s criteria in terms of entity associations.
Most of the entities extracted by Google’s NLP API from our piece of content include terms like “Shared Reports” that are really relevant when it comes to Clearscope. Thus, it’s highly possible that Google considers this piece of content to be something relevant to the query “Clearscope”.
Image Source: Google’s NLP API
From an entity standpoint, the piece is relevant and thus ranks on the first page of the SERPs. Of course, a question arises here: How’s Google forming the associations between the different entities that belong in its database? This is done with the use of Google’s Knowledge Graph.
According to this knowledge panel by Wikipedia,
The knowledge graph, also known as the entity graph, comes in the form of a graph and helps Google to understand the association between different entities. Here’s what the knowledge graph would look like for the example we saw earlier, “COVID-19”,
As you can imagine, in an increasingly complex world with tons of information flowing everywhere, maintaining a database of interconnected information is both practical and necessary.
This is how Google is capable of making connections between queries and entities and serving up the most relevant results. This type of “semantic search” forms the majority of search results and is expected to be more and more important as time passes.
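To picture what a database of interconnected entities looks like, here’s a toy graph built from some of the “COVID-19” terms we saw earlier. The connections are invented for illustration and say nothing about what Google’s Knowledge Graph actually stores; `build_graph` and `related_entities` are hypothetical helper names.

```python
from collections import defaultdict


def build_graph(edges) -> dict:
    """Build an undirected graph: entity -> set of connected entities."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def related_entities(graph, start: str, max_hops: int) -> set:
    """Return every entity reachable from start within max_hops links."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        # Expand one hop, ignoring entities we have already visited.
        frontier = {n for node in frontier for n in graph.get(node, ())} - seen
        seen |= frontier
    return seen - {start}


# Invented connections between entities from the COVID-19 example.
edges = [
    ("COVID-19", "Pandemic"),
    ("COVID-19", "CDC"),
    ("COVID-19", "Social distancing"),
    ("CDC", "Public health"),
    ("Social distancing", "Public health"),
]

graph = build_graph(edges)
print(sorted(related_entities(graph, "COVID-19", 1)))
print(sorted(related_entities(graph, "COVID-19", 2)))
```

Entities one hop away are directly associated with the topic, while entities two hops away are related through something else, which is roughly how a graph lets you reason about indirect associations.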
To answer the question “why entities matter”, we have to realize that entities play a key role in the way search results are served nowadays. That applies both to desktop and mobile searches, but also to relatively new types of search such as voice search.
Now that you know why entities matter, let’s move on to the final section, where we’ll explain how you can use them as part of your content strategy.
How to Use Entities to Improve Your Content
We believe that since 2015-2016, when Google started introducing entities to the SEO world, many things have changed in search. What remains the same is the importance of entities when it comes to classifying content online and serving the most relevant results based on a search query. Having said that, entities should be a part of your content strategy. The question, of course, is how you can do that.
1) Use a Pre-built NLP Editor
The first way to do this is by using an NLP editor similar to Clearscope. This will make the process of entity extraction and identification — from the top-ranking search results — easier and much faster. After all, as we saw in Backlinko’s study earlier, when it comes to getting higher rankings with your content, Clearscope works. This is how the editor looks on Google Docs for this piece of content:
In this case, the target term, that is the term we’re targeting and want to rank for, is “entities for seo”. Some of Clearscope’s recommendations in terms of importance are:
- SEO
- Search engines
- SERP
- Natural language
- Knowledge graph
As you can see below, at the top of the results for that term is this piece from Search Engine Journal:
Author’s Note: Results for that term may vary; you may see rich snippets and other SERP features based on your location, browser preferences, and overall online activity.
Not surprisingly, this is one of the content pieces that Clearscope is using in its analysis to give us the recommendations that we just saw.
Thus, it’s evident that the recommendations that we get in terms of what entities we need to use are heavily influenced by the top ranking content on the SERPs, as well as Clearscope’s criteria of determining what’s considered to be relevant and high quality for our target topic.
Following that process when you’re creating content to drive organic traffic will help you be more relevant based on what Google considers to be relevant for a given term. As we mentioned before, there are other factors that affect performance when it comes to SEO, but entities are one of the most important ones and Clearscope allows you to integrate them seamlessly into your content strategy.
2) Use an NLP Processor
As we just saw, the first way to use entities as part of your strategy is to use software like Clearscope that does the entity extraction and identification for you and allows you to integrate entities into a piece of content through its built-in editor. Of course, you may argue that this isn’t cost-effective; this is why we’re going to share a second way to use entities without using a paid-for tool.
Let’s get back to the example that we used earlier and let’s assume that we’re interested in identifying important entities for the term “entities for seo”. As we noted, one of the top results on that SERP came from Search Engine Journal. What we’re going to do is copy the main text – avoiding copying secondary content, comments, etc. – of the page and paste it into Google’s NLP API.
Here’s what we see next:
Image Source: Google’s NLP API
As you’ll notice, some of those terms are similar to the ones that Clearscope already suggested to us for the same target term. That’s expected since a) the tool is using the NLP API to extract entities, and b) the piece we’re using as a case study is, in fact, one of the pieces Clearscope includes in its analysis.
Of course, Clearscope’s analysis doesn’t end here; the tool is also using IBM’s Watson for natural language processing and entity extraction. It’s also analyzing more top results from the SERPs and not just one result as we’re doing here.
As you can imagine, analyzing more results would be truly labor-intensive and time-consuming. But, let’s get a bit deeper into the process of identifying entities manually, without the use of a tool.
As you can see, right below each identified entity, there is a number called “Salience”.
Image Source: Google’s NLP API
According to Google, “Salience shows importance or centrality of an entity to the entire document text, ranges from 0 (less salient) to 1 (highly salient)”. Thus, we can understand that an entity with a salience of 0.14, like the entity “SEO” we’ve highlighted above, is pretty important for our target topic.
Having said that, when you’re doing manual entity extraction using a tool like Google’s NLP API, you definitely need to pay attention to salience, as a reference of relevancy and the importance of an entity related to the whole text you’re analyzing.
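If you copy the extracted entities into a script, sorting and filtering by salience takes only a few lines. Here’s a minimal sketch; the function name `top_entities_by_salience` and the 0.01 cut-off are our own assumptions, and apart from the 0.14 for “SEO” the salience values below are invented.

```python
def top_entities_by_salience(entities, min_salience=0.01, limit=10) -> list:
    """Keep entities at or above min_salience, most salient first."""
    kept = [e for e in entities if e.get("salience", 0.0) >= min_salience]
    kept.sort(key=lambda e: e.get("salience", 0.0), reverse=True)
    return kept[:limit]


# "SEO" at 0.14 comes from the example above; the other values are invented.
extracted = [
    {"name": "RankBrain", "salience": 0.005},
    {"name": "SEO", "salience": 0.14},
    {"name": "Knowledge graph", "salience": 0.03},
]

for entity in top_entities_by_salience(extracted):
    print(entity["name"], entity["salience"])
```

A hard cut-off is a blunt instrument, so it’s worth skimming the entities that fall below it before discarding them.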
The second element you need to pay attention to is the label that exists at the right of each identified entity.
This is what we called entity “type” earlier. In most cases, the type will be “Other”. On a purely practical level, the type shouldn’t affect your decisions as to which entities matter or whether you should include them in your piece of content. However, in some cases, you may find references on Wikipedia pages that you can use as a source of information.
For example, the identified entity “RankBrain” has a Wikipedia link right below it.
Even though the salience, in this case, is fairly low, this may be something we need to investigate further and even include in our piece of content. As you can see below, several words and phrases on that Wikipedia page are relevant to our target topic of entities for SEO, and therefore could be used as part of our piece of content.
Of course, as mentioned above, this process is really time-consuming and includes a lot of manual work. Up to a certain extent — and with questionable results — it could help you create a piece of content that’s relevant to your target topic and has a chance of performing well organically.
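If you do repeat the manual process for several top results, tallying which entities keep showing up is the part that’s easiest to script. The sketch below assumes you’ve already collected one list of entity names per page; the lists here are invented from terms mentioned in this guide, and `common_entities` with its 50% threshold is a hypothetical helper, not a recommendation from any tool.

```python
from collections import Counter


def entity_document_frequency(pages) -> Counter:
    """Count how many pages mention each entity (case-insensitive)."""
    counts = Counter()
    for entities in pages:
        # A set ensures each page counts an entity only once.
        counts.update({e.lower() for e in entities})
    return counts


def common_entities(pages, min_share=0.5) -> list:
    """Return entities that appear on at least min_share of the pages."""
    frequency = entity_document_frequency(pages)
    threshold = min_share * len(pages)
    return sorted(e for e, count in frequency.items() if count >= threshold)


# Invented entity lists, one per analyzed page.
pages = [
    ["SEO", "Knowledge graph", "RankBrain"],
    ["SEO", "Search engines", "Knowledge graph"],
    ["SEO", "SERP", "Natural language"],
]

print(common_entities(pages))
```

Entities that recur across most of the top results are reasonable candidates to cover, while one-off entities deserve a closer look before you include them.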
One thing we need to mention is that not all content on a website should be aiming to get organic traffic. If you’re creating content that’s intended only to satisfy search engines, then you’re clearly missing something. You have to also create content that’s intended for your users or customers, and even team members.
For example, take a look at the following announcement on LearnWorlds’ blog:
Image Source: LearnWorlds
It’s an announcement for LearnWorlds’ integration with Zoom. In other words, it basically introduces a new feature with new capabilities for the product to LearnWorlds’ users. Looking for entity extraction and identification here, before creating the piece, would be meaningless.
In the same vein, not all content you produce and publish should aim to acquire organic traffic. If that’s your sole goal, then you definitely should reconsider the way you’re doing things. Let’s wrap this up and close with some final thoughts.
Wrapping Up
Throughout this guide, we’ve explained what entities are, why they’re important for SEO, and how you can use them to effectively optimize your web pages. Keep in mind that entities aren’t another “SEO trend”; they’ve existed for years. They’re here to stay and therefore you have to get better at using them as part of your SEO strategy.
As Google becomes better at processing content and understanding entities, you have to get better at using those entities to your benefit. The biggest takeaway from this guide is that Google’s changing and if you don’t change as well, then you may get left behind. Getting a deep understanding of entities will help you understand why Google serves its results the way it does and what you need to do in order to be included at the top of those results.
Machine learning and NLP are transforming the way Search Engine Optimization is performed for good. Understanding the searcher and giving them what they need is essential to be successful in terms of organic search. In that context, user experience is already one of the most important ranking factors. Try to become better at satisfying the user and Google will definitely reward you.