How to Conduct a Content Audit: Step-by-Step Guide for 2020

Georgios Chasiotis

Conducting a content audit is an essential part of any SEO and content strategy. Even though most content marketers and SEO professionals know this, they tend to start by creating new content, instead of auditing their existing content inventory. Needless to say, creating new content pieces without reviewing your existing ones first is the wrong way to go about things.

Thus, every SEO and content marketing strategy should start with a content audit. In this step-by-step guide, we’ll teach you how to conduct an audit for your website content. The process described here is the exact process we use to conduct content audits for our own clients.

What is a Content Audit?

Doing a content audit means carrying out a review of the content pages of your website, judging them against specific performance metrics and making decisions that we call “Actions” on what to do with each of these pages. 

We judge only pieces of content, and not other types of pages (e.g. feature pages, product pages, landing pages), because content pages are usually created specifically for SEO purposes. Thus, judging them based on their performance is both a necessary and logical step.

For us here at MINUTTIA, the content audit is the first thing that we do right after we have our clients fill in the Onboarding Questions—an extended set of questions that help us get all the information we need from a client in order to fully understand their needs. 

We do this because time and time again, we’ve seen that the Actions of a content audit can—and in most cases will—affect the performance of our clients’ websites (e.g. organic rankings, organic traffic). 

You can use many different ways to judge your content’s performance. Some of these ways (and metrics) include: 

  • Word count
  • Bounce rate 
  • Conversion rate
  • Number of backlinks
  • Overall content quality
  • Number of social shares
  • Number of organic keywords
  • User behavior (% of the page scrolled)
  • Conversion data (performance metrics)
  • Average time on page (or session duration)
  • Traffic metrics such as number of page views

We’re pretty selective with the metrics we use to judge a piece of content against. Thus, you might say that there isn’t one single standard—and absolute right way—to conduct a content audit of your website. Each methodology can be correct as long as it’s based on data and does indeed help to improve your content performance. 

Let’s see why and when you should be conducting a content audit of your website. 

Why & When Do You Need to Perform a Content Audit?

Recently, we had to conduct an audit of a website with a little over 7,000 content pages. As you can imagine, conducting an audit of so many pages can be a time-consuming, not to mention tedious, task. There are two facts regarding this audit:

  1. The website owners (our clients) had never conducted a content analysis since the launch of the website back in 2013.
  2. This had started to affect the website’s performance, especially the amount of search traffic the website was receiving. 

Since organic traffic had started to decline, it’s only natural that our client had to do something about it. Since they couldn’t manually review more than 7,000 content pieces, they sought our help. As you’ll see later in this guide, with our semi-automated audit spreadsheet, the content auditing process becomes much easier. 

In a sense, having content pieces that underperform on your website is keeping your website down. This is why a content audit is absolutely necessary, particularly for companies that publish a lot of content.

Most of the time following a content audit, our clients see an uplift in their organic search traffic and page rankings within the first month of implementing the first Action of the audit. Of course, not all cases are the same. In some cases, we’ve noticed traffic improvement immediately, while in other instances, improvement took a bit longer to see.

Something worth mentioning is that most business owners and marketers start by creating new content. This isn’t good practice. Content auditing is the first thing you should be doing before creating new pieces of content. After auditing your current content, you can begin with the content creation phase.

The question that arises here is: when should you conduct a content audit of your website’s content? The answer will depend on the following factors:

  1. The amount of content you’re publishing on your blog or resources section
  2. The frequency at which you’re publishing on your blog or resources section
  3. Your in-house capabilities (e.g. content team, SEO team), content and overall budget 
  4. Your overall content audit goals and objectives

For a company that publishes 3-4 new content pieces per week, a good time interval would be once every quarter. For a company that publishes 1-2 new blog posts every month, a good time interval would be once every year. 

Moreover, for a company that has based its growth on content marketing and SEO, auditing content more frequently is more important than it is for a company that has based its growth on outbound sales. 

Based on the frequency you’re publishing new content, you should also give time to your newly published pieces of content to perform. If you don’t do this, you’ll end up making poor decisions. 

Thus, as you can understand, how often you should be conducting content audits of your website varies based on how you treat content and what your overall goals are from it. However, as a general rule of thumb, you should be conducting a content audit every six months, or at least once every year.

How Long Does a Content Audit Take?

How long a content audit takes is, once again, affected by several factors. Not all cases are the same and they shouldn’t be treated that way. Going back to the example we mentioned earlier, if your website has hundreds or even thousands of content pages that have never been audited, a content audit may last up to 2-3 months.

However, if auditing your website is part of your content and SEO strategy—meaning you do it often—then each content audit will take less time than the previous one. The truth is that even though the assessment can be automated—as long as you’ve chosen the right criteria—the research process and data collection can be a bit more time-consuming. 

For example, our process requires us to collect content and SEO metrics from data sources like Google Analytics (for traffic and conversion data), Google Search Console (for internal linking data), and a keyword analysis tool like SEMrush or Ahrefs (for keyword ranking data). 

The number of data sources you need to collect data from and their ability to connect (e.g. through APIs) to your audit spreadsheet (or content audit tool) will also affect the time needed to conduct the audit.

Moreover, the way you crawl data and the flexibility you have regarding the data you can retrieve through your crawler both affect the time needed to perform the audit. There are many great options for crawling out there (e.g. Screaming Frog, URL Profiler), but we use Sitebulb to crawl our clients’ websites.

Last but not least, the CMS you’re using (e.g. WordPress) can affect not only the audit, but also how easily you can carry out the Actions that come out of the audit. We’ve seen cases where editing or updating a blog post on a client’s website was an extremely time-consuming task due to customizations and interference between code and text.

As you might have guessed, there isn’t a standard answer as to how long a content audit will last. It depends on many factors that you need to be aware of before conducting the audit (or having someone else do it for you). 

How to Perform a Content Audit in 3 Steps

Before getting into the specifics of conducting a content audit, we need to be clear about one thing: different people use different methodologies. This means that when it comes to conducting a content audit, there isn’t a single correct solution. All methodologies can be right and effective as long as they’re backed by data.

Our process when conducting an audit consists of three main steps:

  1. Set up the content audit template
  2. Gather data from various data sources
  3. Assign an Action for each page included in the audit

Let’s examine these steps one by one.

1) Set up the content audit template

The first thing we do when auditing a site’s content is to set up (in other words, prepare) the template we’ve created for this purpose. We first got this template from The Blueprint SEO Training, but completely modified it to fit our specific needs. (Credits for the initial template go to creators Ryan Stewart and David Krevitt.) Let’s take a look at our template.

Our template consists of the following tabs:

  • Start Here
  • Sitemaps
  • Actions
  • Revenue Generated
  • Backlinks & Keywords
  • Internal Links
  • Organic Traffic

Even though this template comes in Google Sheets format, you can also use Microsoft Excel. Google Sheets happens to power almost all of our processes, so everything we do involves Sheets one way or another. Of the tabs we just mentioned, some are directly connected to data (e.g. Revenue Generated), while others are connected to the Actions we need to take after completing the audit.

Start Here

The Start Here tab is the first tab we need to work on to set up our Content Audit Template. The first thing we need to do is insert the homepage URL of the website that we want to audit. This should be the URL of your root domain, and not a subfolder (example.com/subfolder) or subdomain (subdomain.example.com).

Once we insert the root domain, the red box next to the blue one will turn green. 

Next, we need to insert the URLs of the Sitemaps that we want to audit. To locate the XML Sitemap for your website, you just have to visit example.com/sitemap.xml. Most of the time, this will lead you to your XML Sitemap or Index Sitemap. In our example, the Sitemap that we want to examine is included in the Index Sitemap. 

An Index Sitemap contains different Sitemaps, such as Post Sitemaps, Category Sitemaps, Author Sitemaps or Attachment Sitemaps. This audit is all about content, and thus we include only the Post Sitemaps. Once we finalize which Sitemaps we’re going to include in the audit, we paste them one by one to the “Sitemap URL” section. (Our Content Audit Template works for up to 10 Sitemaps.)

Note: If you don’t have Sitemap data, you need to use a Sitemap generator such as Yoast. 

Once we paste the Sitemaps that we want to audit, the red box once again turns green and we also see the total number of pages that we’re going to audit, as well as the number of Sitemaps included in the audit. This is particularly useful when auditing large websites with hundreds or even thousands of content pages. 
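To make the Sitemap selection more concrete, here is a minimal Python sketch of the same idea: read an Index Sitemap, keep only the Post Sitemaps, and count the pages to audit. It isn’t how our template works internally. The sample XML, the file names and the "post-sitemap" marker are all assumptions for illustration; only the XML namespace comes from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical Index Sitemap; the file names are placeholders.
INDEX_XML = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/post-sitemap.xml</loc></sitemap>
  <sitemap><loc>https://example.com/category-sitemap.xml</loc></sitemap>
  <sitemap><loc>https://example.com/author-sitemap.xml</loc></sitemap>
</sitemapindex>"""

# Hypothetical Post Sitemap with two content pages.
POST_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/blog/post-a/</loc></url>
  <url><loc>https://example.com/blog/post-b/</loc></url>
</urlset>"""


def locs(xml_text):
    """Return every <loc> value found in a Sitemap or Index Sitemap."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iterfind(".//sm:loc", NS)]


def post_sitemaps(index_xml, marker="post-sitemap"):
    """Keep only the Sitemaps whose URL contains the (assumed) marker."""
    return [url for url in locs(index_xml) if marker in url]


selected = post_sitemaps(INDEX_XML)
pages = locs(POST_XML)
print(len(selected), "Sitemap(s) selected,", len(pages), "page(s) to audit")
```

In practice you would download each selected Sitemap instead of using inline strings, and the marker would depend on how your CMS or plugin names its Sitemaps.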

Something very important to mention here is that the number of pages may change as we add new pages to the Sitemap included in the audit. This means that the Google Sheet we’re using is in fact dynamic and not static. However, including new pages in the audit will give you a wrong impression of the performance of those pages. (More on that later.)

Next, we set the thresholds for a page to qualify as “Leave as is (200)” for the four metrics we use to judge our pages against. Those metrics are:

  1. Revenue Generated (Google Analytics)
  2. Backlinks (Ahrefs)
  3. Keywords (Ahrefs)
  4. Sessions (Google Analytics)

In our example, a page will qualify as “Leave as is (200)” if it has generated more than $2 in the last 12 months, had more than 1,000 sessions in the last 12 months, has more than 1 backlink and ranks for more than 1 keyword. Changing these parameters will affect the end results and the Action for each page included in the audit. 

Our template then has a section that shows us how many pages belong to each Action category. This information is particularly useful for giving our clients a high-level understanding of their site’s content. Here’s how the results looked for the audit we mentioned a while back:

We are interested in the exact number of pages that we need to update as well as the percentage of the pages based on the total number of pages included in the audit. This way, we get a high-level overview of the status of our clients’ content. 
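The overview is simple arithmetic: a count per Action, plus that count as a share of all audited pages. A minimal Python sketch, assuming the Actions are available as a plain list of labels (the sample list below is made up and isn’t data from the audit we mentioned):

```python
from collections import Counter


def action_summary(actions):
    """Map each Action to (number of pages, percentage of all pages)."""
    counts = Counter(actions)
    total = len(actions)
    return {
        action: (count, round(100 * count / total, 1))
        for action, count in counts.items()
    }


# Hypothetical Actions for four audited pages.
sample = [
    "Leave as is (200)",
    "Delete (404)",
    "Delete (404)",
    "Manual Review",
]

summary = action_summary(sample)
for action, (count, share) in summary.items():
    print(f"{action}: {count} pages ({share}%)")
```

An empty list simply returns an empty summary, so there is no division by zero to guard against.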

In the last section in the Start Here tab, we have the four following buttons:

  1. Download Leave as is (200) URLs
  2. Download Manual Review URLs 
  3. Download Delete (404) URLs
  4. Download Redirect or Merge (301) URLs

Our clients can therefore download the list of URLs for each of the four Actions in CSV format. This is particularly useful for those who prefer to work with Microsoft Excel instead of Google Sheets. We don’t include any other details on a page level for each of the provided URLs (e.g. Title Tags, Page Titles, Target Keywords or Content Type)—we just include the page URLs.  

Sitemaps

The “Sitemaps” tab is where the web pages of the Sitemaps are auto-inserted after inserting a new Sitemap URL in the Start Here tab. This isn’t a tab that we need to work on or adjust any parameters for, with the exception that it can contain up to 10 Sitemaps. 

Once we have our Content Audit Template set up, we can move forward with data gathering from the various data sources. 

2) Gather data from various data sources

The data we use to judge the pages included in the audit against are:

  • Revenue Generated—Revenue generated from each page
  • Organic Traffic & Visibility—Number of sessions and number of keywords
  • Link Equity—Number of backlinks

We collect this data and insert it into the four remaining tabs, using the following analytics providers or tools:

  • Google Analytics—For revenue and number of sessions
  • Ahrefs—For number of backlinks and number of keywords
  • Google Search Console—For number of internal links
  • Supermetrics—To insert data automatically in our template

The data limitations we have when dealing with the analytics providers and tools mentioned above are:

  1. Even though Ahrefs has a huge index of live backlinks and keywords, it can sometimes miss reporting some keywords, or even worse, backlinks—this is something you need to keep in your mind.
  2. Google Search Console gives us the number of internal links for up to 1,000 pages, meaning that you won’t get complete data if more than 1,000 pages on your website have internal links.

Now that we’ve seen the most prominent data limitations we have, let’s describe the process of gathering data from those data sources. 

Even though Supermetrics is a really useful tool and absolutely essential in the content auditing process, the 14-day free trial isn’t enough to cover your needs if you’re auditing more than 500 pages. For that reason, we’ve decided to describe the process without the use of Supermetrics. 

Let’s start with Google Analytics and the number of sessions for each of the pages included in the audit. Once we get to Google Analytics, the first thing we need to do is create a Custom Report that shows us the number of Sessions per Page. To do that, we first click on Customization > Custom Reports > New Custom Report.

As you can see in the screenshot below, we choose “Sessions” as our Metric and “Page” as our Dimension. Other than that, you just have to choose the right View for your audit and click Save.

This is how the data will look once you create and enter your Custom Report:

By default, what we get is all sessions, which means that Google Analytics reports sessions from all channels, including organic, paid, referral and direct. To change that and display only organic traffic, we click on “Add Segment”, then “Organic Traffic”, and remove any other channels.

Once we do that, we get only the number of sessions that come from organic search. We then adjust the Date Range we want to get data for—we go as far back as 12 months—by clicking on the date displayed in the top right corner in the Custom Report. 

From there, it’s only a matter of clicking the “Export” button… 

… And inserting the data—in the right columns—in the Organic Traffic tab, which is included in our Content Audit Template. 

Note: The template will then match the pages included in the tab with those included in the Actions tab and, using a VLOOKUP formula, will gather the number of sessions for each page as reported in the Organic Traffic tab.
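If you’d like to see the matching step outside a spreadsheet, the lookup can be approximated in Python with a dictionary keyed by page. This is a sketch of the idea only, not the template’s formula. The column names ("Page", "Sessions"), the sample rows and the choice to default missing pages to 0 are assumptions.

```python
import csv
import io

# Hypothetical Google Analytics export; column names are assumptions.
EXPORT_CSV = """Page,Sessions
/blog/post-a/,"1,200"
/blog/post-b/,35
"""

# Pages included in the audit (hypothetical).
audit_pages = ["/blog/post-a/", "/blog/post-b/", "/blog/post-c/"]


def sessions_by_page(csv_text):
    """Build a page -> sessions lookup from the exported CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Strip thousands separators such as "1,200" before converting.
    return {row["Page"]: int(row["Sessions"].replace(",", "")) for row in reader}


lookup = sessions_by_page(EXPORT_CSV)

# Pages missing from the export get 0 sessions (an assumption; a
# spreadsheet lookup would show an error value instead).
sessions = {page: lookup.get(page, 0) for page in audit_pages}
print(sessions)
```

The same pattern applies to the other tabs: build one lookup per export, then read each audited page from it.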

We then repeat this process for Revenue Generated. Very often, the websites that we audit don’t track generated revenue for their pages, or simply haven’t assigned any values based on actions that visitors take. In those cases, we modify our template based on different criteria. 

Next, we need to insert backlink and keyword data from Ahrefs. There are two ways to do that. We use Ahrefs’ Batch Analysis feature to do so at scale. Batch Analysis helps you to “generate multiple backlink reports at once by entering a list of domains or URLs”. Here’s what Batch Analysis looks like once you get there:

As you can see, you can analyze up to 200 page URLs every time. This is not a problem if you have to audit a few hundred pages, but it can be a problem—timewise—if you have to audit a website with thousands of pages. What we do now is copy the pages included in the Sitemaps tab and paste them in batches of 200 in the Batch Analysis tool.
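Splitting a long list of URLs into batches of 200 is easy to script if you’d rather not count rows by hand. A minimal Python sketch, with made-up URLs and a hypothetical helper name:

```python
def batches(urls, size=200):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# Hypothetical list of 450 page URLs taken from the Sitemaps tab.
urls = [f"https://example.com/blog/post-{i}/" for i in range(450)]

chunks = list(batches(urls))
for number, chunk in enumerate(chunks, start=1):
    print(f"Batch {number}: {len(chunk)} URLs")
```

Each batch can then be pasted into the Batch Analysis tool in turn. For the 7,000-page audit we mentioned, that works out to 35 batches.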

Once you click “Analyse”, you get the list of pages you analyzed along with a series of very important metrics for each of those pages. From the metrics that Ahrefs gives us, we are only interested in “Backlinks (Total)” and “Keywords”. 

Then, we click on “Export” and without changing anything in the window that opens, we click on “Start Export”. 

By default, Ahrefs will include every column included in the Batch Analysis Overview. We want to keep only the Target (page URL), Keywords and Total Backlinks. Once we delete all other columns that aren’t necessary for the purposes of the audit, we copy and paste data to the Backlinks & Keywords tab in the Content Audit Template. 

Note: The template will then match the pages included in the tab with those included in the Actions tab and, using a VLOOKUP formula, will gather the number of backlinks and keywords for each page as reported in the Backlinks & Keywords tab.

In the last part of the process, we need to insert the number of internal links that each page has. That may not directly affect the outcome of the audit but is very important nonetheless. To do that, we use Google Search Console and more specifically, the “Internal Links” report. 

The number of internal links is displayed in the Actions tab only when the page needs to be deleted or redirected. The reason for this is simple: for pages that we have to delete or redirect, we need to change or remove the internal URLs as well, so that we don’t create links pointing to broken pages or redirect chains inside our clients’ websites.

To find internal links for the pages of our clients’ websites, we click on Links > Internal links. 

Once we open the report, we need to scroll to the bottom of the page and choose to display the maximum (500) number of rows per page. 

Then, we scroll back to the top and click on the download symbol in the top right corner.

From there, it’s only a matter of copying/pasting data to the Internal Links tab of our Content Audit Template. 

Note: The template will then match the pages included in the tab with those included in the Actions tab and, using a VLOOKUP formula, will gather the number of internal links (if any) for each page as reported in the Internal Links tab.

As we stressed in the beginning, the process of gathering data from Google Analytics and Google Search Console can easily be handled through software like Supermetrics. We wanted to show you how the process works without such software, using only free analytics tools like Google Analytics and Google Search Console. (However, you still need to have access to Ahrefs to perform the audit the way we do.)

Now, let’s take a look at how we assign an Action to each page included in the audit. 

3) Assign an Action to each page included in the audit

In the final step of the process, we assign an Action to each of the pages included in the audit. These Actions will affect not only what we do with the pages we included in the audit, but also what we do with future content pieces that we need to create. Here’s what the Actions tab looks like without data.

And here’s how the same tab looks when we’ve inserted data for each of the pages included in the audit. 

Note: Our template automatically removes duplicate content (duplicate URLs) and doesn’t include any secondary page elements. 

As you’ll notice, next to the Actions column, we have a message that reads either “This page has {x} internal links” or “No need to display internal links”. Even though we don’t take internal links into account in our audit, we still value them and thus feel like we should include them in our template. 

The reason for this is simple. If, for example, a page has to be deleted—even though we have to manually review everything before deleting anything—we may need to remove or redirect to similar content any internal links pointing back to it. Thus, knowing the number of internal links for each page is crucial. 
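The rule behind that message can be sketched in a few lines of Python. This is our reading of the behavior described above (the count is shown only for pages to be deleted or redirected), not the template’s actual formula, and the function name is hypothetical.

```python
# Actions for which internal links must be changed or removed.
ACTIONS_NEEDING_LINKS = ("Delete (404)", "Redirect or Merge (301)")


def internal_links_note(action, internal_links):
    """Return the message shown next to the Actions column."""
    if action in ACTIONS_NEEDING_LINKS:
        return f"This page has {internal_links} internal links"
    return "No need to display internal links"


print(internal_links_note("Delete (404)", 7))
print(internal_links_note("Leave as is (200)", 7))
```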

The assignment of the Actions is made automatically through the criteria and the thresholds we’ve set in the Start Here tab. In the next section, we’ll explain how we assign an Action to each of the pages included in the audit.

Content Audit Actions

The Actions that we assign to each of the web pages we include in our audits are:

  • Leave as is (200) URLs
  • Manual Review URLs
  • Delete (404) URLs
  • Redirect or Merge (301) URLs

Let’s examine each of these Actions separately and see how we come up with them.

1) Delete (404)

Usually referred to as content pruning, deleting pages can be truly beneficial for your website’s performance. The criteria we use to mark a page as “Delete (404)” are the following:

  • The page hasn’t generated any revenue
  • The page had no sessions for the last 12 months
  • The page isn’t ranking for any organic keywords
  • The page has no backlinks pointing back to it

As you can imagine, a page that gets no organic traffic, has no organic visibility whatsoever, has no link equity to pass on (from backlinks) and doesn’t bring in any revenue is a page that we have to consider removing from our site. 

This is one of the biggest weaknesses that we witness when auditing websites. The truth is that, SEO purposes aside, such pages add no value to your visitors and may even harm the overall experience, since they potentially keep your most important content assets from being more visible to users.

There are times when such pages can’t be removed (for example, when they include mission statements or when they get traffic from other sources, such as referral traffic), but in general, a page that ticks all the above boxes has to be removed. The paradox is that most content creators and marketers start by creating new content before getting rid of the content that devalues their website.

What does it mean when we delete a page? First of all, it means that we delete it from our clients’ CMS (or, in some cases, change its status to “Draft”). Second, we remove the page from the Sitemap that includes it. Lastly, we add a 301 redirect from the removed page to the blog section or to another similar page, in case there are any backlinks that we’ve missed identifying.

2) Leave as is (200)

The second Action that we assign to pages included in a content audit is “Leave as is (200)”. These pages (based on our criteria, of course) have some kind of value and thus need to exist. The fact that we assign the Action “Leave as is (200)” doesn’t mean that we can’t improve them. We just have to be strategic about it and make optimal use of our time and resources.

In order for a page to be characterized as “Leave as is (200)”, it has to qualify based on the criteria that we’ve set in the Start Here tab. Thus, the number of pages that we’ll leave as they are will be determined by the decisions that we make when choosing our thresholds.

Now let’s look at when a page is characterized as “Redirect or Merge (301)”. 

3) Redirect or Merge (301)

In general, pages that we’ve assigned with the Action “Redirect or Merge (301)” are those that have some link equity they can pass on to other pages, but get no organic traffic whatsoever. These pages could be redirected to similar pages or merged with other, similar content pieces.

We recommend that the types of content you’re merging are the same. For example, it isn’t good practice to merge an infographic with a checklist, especially if this is something that’s visible in the page URL. Try to merge pages with similar types of content. The criteria we’re using here are the following:

  • The number of backlinks is higher than 0
  • The page has no organic traffic

Note: Revenue generated and number of keywords are not taken into account, since we cover those instances with “Leave as is (200)” or “Manual Review”.

4) Manual Review

The final Action that we assign to the pages of the websites we audit is “Manual Review”. When we say that a page has to be manually reviewed, we mean that someone has to visit the page and look for things like:

  • Bounce rate
  • Content quality
  • Social media shares
  • Average time on page

We generally don’t take into account other metrics (e.g. word count), as they don’t indicate anything about the actual quality of the page. The question is, when do we know that a page has to be manually reviewed? Even though there isn’t a correct answer to that question, for us here at MINUTTIA, pages have to be manually reviewed when they:

  1. Aren’t assigned with the “Leave as is (200)” Action
  2. Aren’t assigned with the “Delete (404)” Action
  3. Aren’t assigned with the “Redirect or Merge (301)” Action
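Putting the four Actions together, the assignment logic can be sketched as one small Python function. This is our reading of the criteria described in this guide, not the template’s actual formulas. The class and field names are hypothetical, the default thresholds are the example values from the Start Here tab, and requiring all four thresholds at once for “Leave as is (200)” is an assumption based on how that example is worded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    url: str
    revenue: float   # revenue generated in the last 12 months
    sessions: int    # organic sessions in the last 12 months
    backlinks: int
    keywords: int


@dataclass(frozen=True)
class Thresholds:
    # Example values from the Start Here tab; adjust per audit.
    revenue: float = 2.0
    sessions: int = 1000
    backlinks: int = 1
    keywords: int = 1


def assign_action(page, thresholds=Thresholds()):
    """Return the Action for a page, based on the criteria above."""
    t = thresholds
    # Assumption: a page must exceed every threshold to be left as is.
    if (page.revenue > t.revenue and page.sessions > t.sessions
            and page.backlinks > t.backlinks and page.keywords > t.keywords):
        return "Leave as is (200)"
    # No revenue, no sessions, no keywords, no backlinks.
    if (page.revenue == 0 and page.sessions == 0
            and page.keywords == 0 and page.backlinks == 0):
        return "Delete (404)"
    # Some link equity to pass on, but no organic traffic.
    if page.backlinks > 0 and page.sessions == 0:
        return "Redirect or Merge (301)"
    # Everything that didn't qualify for another Action.
    return "Manual Review"


# Hypothetical pages, one per Action.
examples = [
    Page("/blog/a/", revenue=10.0, sessions=5000, backlinks=3, keywords=12),
    Page("/blog/b/", revenue=0.0, sessions=0, backlinks=0, keywords=0),
    Page("/blog/c/", revenue=0.0, sessions=0, backlinks=4, keywords=0),
    Page("/blog/d/", revenue=0.0, sessions=300, backlinks=0, keywords=2),
]
results = [assign_action(page) for page in examples]
for page, action in zip(examples, results):
    print(page.url, "->", action)
```

With non-negative thresholds, the first three rules can’t overlap, so the order of the checks doesn’t change the result. “Manual Review” is simply what remains.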

Manually reviewing pages is one of the most time-consuming parts of the auditing process, but at the same time one of the most significant ones. We simply can’t complete the audit without manually auditing the pages that were assigned with the “Manual Review” Action. Now, let’s discuss three things you need to pay attention to when conducting a content audit for your own website. 

Things to pay attention to

When conducting a content audit, there are certain things that you need to pay attention to. The first thing you need to pay attention to is that you shouldn’t delete anything without doing a quick manual review first. Also, add redirects to deleted pages just in case there are any lost backlinks, and remove internal links as we mentioned earlier.

Next, you want to add annotations to Google Analytics every time you make a major change to the website (e.g. deleting a number of pages based on the content audit). Last but not least, you have to be very careful when merging or redirecting pages. As we mentioned earlier, the pages you merge or redirect to have to be relevant to each other.

Let’s close this article with some final thoughts.

Final Thoughts

By now, you know that thin content that underperforms can harm your search performance and overall site quality. The content audit process may be a bit time-consuming—especially if you have to manually audit hundreds or even thousands of content pages—but it’s necessary for your keyword rankings, overall content and SEO performance.

Through the work we’ve done for our clients, we’ve seen the difference that a content audit can make, not only in terms of search engine optimization but also on a user experience level. Keeping only high-quality pages on your website and taking action on poor-quality content that adds no value whatsoever is important not only for search engines, but also for your users and visitors.

We hope that this step-by-step process has brought you one step closer to understanding what a content audit is, why it’s important and why you need to add it to your marketing strategy.