Combing through the content on your site, removing it, and updating it can feel like a daunting task. But auditing your content, if done properly, can have a positive long-term impact on visibility and traffic.
Here at Siege, the importance of high-quality content is always top of mind, and it should be for any website looking to succeed in the competitive digital marketing sphere. Trust us: the proof is in the pudding. Check out our case studies to see these strategies (and their results) in action.
So, we’ve established high-quality pages enhance user experience and demonstrate authority with search engines. But since “high-quality content” is a subjective term, how can you be sure that the content on your site is high-quality? That’s where content audits come in.
What Is a Content Audit?
A content audit can reveal low-performing content that is bringing down the quality of the rest of your site. It can also provide insights to help improve your content marketing strategy in the long run.
This guide will show you how to audit a site, with an emphasis on content pruning, using different tools based on the level of access to the site’s analytics.
SEO Content Audit Steps
Depending on what kind of analytics access and tools you have, there are different methods you can use to conduct a content audit. We’ll cover three audit methods in this tutorial. The first uses Google Analytics and Ahrefs, the second uses only Ahrefs, and the third uses Screaming Frog together with Google Analytics, Google Search Console, and Ahrefs.
Each method mentioned will use the SEO tool, Ahrefs. It is possible to use alternatives to Ahrefs, Screaming Frog, and Google Analytics for your content audit, but these are the most common and powerful tools available. We have no affiliation with any tool mentioned.
1. Gather and Set Up Your Data
Method: No Screaming Frog, Google Analytics access, Ahrefs
Tools Needed: Google Analytics, Ahrefs, spreadsheet software
The first data collection method is for users with access to Google Analytics and Ahrefs.
- Open Google Analytics for the domain you plan to audit.
- Go to “BEHAVIOR” > “Site Content” > “All Pages.”
- Set the time frame to 12 months if the business is highly seasonal to get a full scope of traffic. This could be reduced to as short as three months if there is not much seasonality for the content.
- Set a filter to include only the specified content section you would like to audit, for example “/blog/.”
- Change the number of rows shown on the bottom to show all of the selected pages, because Google Analytics exports the data as it is currently displayed.
- Select “Export” and choose your preferred file format.
- Open the exported file in your spreadsheet software (e.g. Microsoft Excel or Google Sheets).
- Delete the extra information in the top six rows and the “Day Index” rows on the bottom.
- Add a column next to the “Page” column, and use the concatenate function to join the domain to the page’s path. The formula should look like this: =CONCATENATE("https://www.website.com",A2)
- Enter the formula in one cell, then drag the fill handle down or double-click it to copy the formula to the rest of the dataset.
- Log into Ahrefs, and navigate to “More” > “Batch analysis.”
- Paste your list of URLs into Ahrefs’ batch analysis tool, 200 at a time. Hit “Start Analysis” to run the URLs through the tool.
- Export the list and add the data to the spreadsheet you imported the Google Analytics data into, making sure that the data lines up correctly with each URL.
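For larger exports, the URL-building and batching steps above can also be scripted instead of done by hand. The following is a minimal Python sketch, not part of the original workflow. The domain, the sample paths, and the helper names build_urls and batch are hypothetical, and the batch size of 200 simply mirrors the limit mentioned above.

```python
from typing import Iterator, List

DOMAIN = "https://www.website.com"  # placeholder domain, as in the spreadsheet formula
BATCH_SIZE = 200  # number of URLs pasted into the batch tool at a time


def build_urls(paths: List[str], domain: str = DOMAIN) -> List[str]:
    """Prefix each page path from the analytics export with the domain."""
    urls = []
    for path in paths:
        path = path.strip()
        if not path:
            continue  # skip blank rows left over from the export
        if not path.startswith("/"):
            path = "/" + path
        urls.append(domain.rstrip("/") + path)
    return urls


def batch(items: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    # Hypothetical paths standing in for the "Page" column of the export.
    sample_paths = ["/blog/post-%d/" % i for i in range(450)]
    for number, chunk in enumerate(batch(build_urls(sample_paths)), start=1):
        print("Batch", number, "has", len(chunk), "URLs")
```

Each chunk can then be pasted into the batch analysis tool in turn.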
Method: No Google Analytics Access, Ahrefs
Tools Needed: Ahrefs, spreadsheet software
Sometimes having access to a website’s Google Analytics isn’t an option. The following method will show you how to use an SEO tool like Ahrefs to pull data for a content audit.
Keep in mind that tools like Ahrefs generate their data points from search traffic estimates, so traffic numbers will not be exact. That is why it is best to try to gain Google Analytics access.
- Log in to Ahrefs and input the domain. Make sure to include the directory of the specified content you want to audit, if applicable.
- Select “Top pages.”
- Sort by the lowest traffic, then export all the rows.
- Download the exported file from the top right file folder in Ahrefs.
- Open your spreadsheet software (e.g. Microsoft Excel or Google Sheets) and import the data.
- Go back into Ahrefs and into “More” > “Batch analysis.”
- Take the URLs on your spreadsheet and run them through the tool, 200 at a time.
- Export the data from the batch analysis and paste it accordingly into your original spreadsheet.
Note: Using this method could provide inaccurate data for newer industries since keywords and jargon can be new to Ahrefs or any other SEO tool.
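If pasting the batch analysis data by hand feels error-prone, the two exports can be joined on the URL instead. This is a minimal Python sketch under stated assumptions: the column names "URL", "Traffic", and "Referring domains" and the sample rows are hypothetical, so check the headers in your own export files before adapting it.

```python
import csv
import io
from typing import Dict, List


def normalize(url: str) -> str:
    """Trim whitespace and a trailing slash so both exports share one key."""
    return url.strip().rstrip("/")


def read_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text into a list of row dictionaries keyed by header."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def merge_on_url(pages, batch_rows, url_column="URL"):
    """Attach batch-analysis columns to each page row with the same URL."""
    lookup = {normalize(row[url_column]): row for row in batch_rows}
    merged = []
    for page in pages:
        combined = dict(page)
        extra = lookup.get(normalize(page[url_column]), {})
        for column, value in extra.items():
            if column != url_column:
                combined[column] = value
        merged.append(combined)
    return merged


# Hypothetical sample data standing in for the two exported files.
TOP_PAGES = """URL,Traffic
https://www.website.com/blog/a/,12
https://www.website.com/blog/b/,0
"""
BATCH = """URL,Referring domains
https://www.website.com/blog/a,3
"""

if __name__ == "__main__":
    for row in merge_on_url(read_rows(TOP_PAGES), read_rows(BATCH)):
        print(row)
```

Rows without a match keep only their original columns, which makes misaligned or missing URLs easy to spot.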
Method: Screaming Frog, Google Analytics access, Google Search Console access, Ahrefs
Tools Needed: Screaming Frog, Google Analytics, Google Search Console, Ahrefs, spreadsheet software
Having access to each of these tools will allow you to look at different layers of data from your site. With this information, you will be able to make better decisions in regard to your content.
One thing to note is that the API data may not be 100% accurate, so always compare your data between platforms.
- Open Screaming Frog and start by clicking on the API tab found on the top right or in the “Configuration” menu, under “API Access.”
- Connect Google Analytics to the Gmail account that has access to the domain’s data.
- Select what segment you want to look at, either all users, organic traffic, or a custom segment you have created.
- Set the date range for 12 months and select the metrics to pull from Google Analytics. Some metrics to consider are unique pageviews, sessions, time on page, landing page views, and goals.
- Select the next API, Google Search Console. Connect the API by logging into the Gmail account with access to the domain’s data.
- Specify the dates for Google Search Console (The API has access to 16 months of data).
- Select the final API, Ahrefs, and plug in the API access token for your account.
- Set your metrics. A few to include are backlinks, referring domains, and social shares.
- Set up date extraction for your site. Go into the menu and select “Configuration” > “Custom” > “Extraction.” This is where you are going to create a custom extraction to pull dates. Label the extractor as “Date,” select the appropriate scraping method, and input your syntax.
Note: There are a number of different ways to pull dates from the site, which we have listed in the next section.
- Configure the spider. Go into “Configuration” > “Spider.” Afterwards, uncheck images, CSS, JavaScript, and SWF, then hit “OK.”
- Enter your domain and hit “Start.”
Note: If you plan on auditing a specific section of your site, like the blog, go into “Configuration” > “Include,” then enter the directory you plan to focus on (e.g. website.com/blog/.*).
- Wait. The time it will take to crawl through all the pages will depend on the size of your site or audit.
- Pull the data from the crawl: go to the “Internal” tab, filter by “HTML,” and select “Export.”
- Import the data into your spreadsheet software of choice and double-check that the analytics data matches the data in the Google Analytics online interface.
- Paste the URLs from the spreadsheet you created, up to 200 at a time, into Ahrefs’ batch analysis tool (“More” > “Batch analysis”) to pull the keywords metric.
- Export that list and paste it into your current set of data, keeping only the “Keywords” column (Make sure that the URLs and data line up accordingly).
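Since the API data may not match the source platform exactly, the double-check described above can be automated once both sets of numbers are in hand. This is a minimal Python sketch: the function name find_mismatches, the 5% tolerance, and the sample pageview numbers are all assumptions for illustration, not values from any of the tools.

```python
from typing import Dict, List, Tuple


def find_mismatches(
    api_values: Dict[str, int],
    interface_values: Dict[str, int],
    tolerance: float = 0.05,
) -> List[Tuple[str, int, int]]:
    """Return (url, api_value, interface_value) where the two sources disagree.

    Two values disagree when their difference, relative to the larger of the
    two, exceeds `tolerance`. URLs missing from the interface export are
    skipped because there is nothing to compare against.
    """
    mismatches = []
    for url, api_value in api_values.items():
        if url not in interface_values:
            continue
        ui_value = interface_values[url]
        baseline = max(api_value, ui_value)
        if baseline == 0:
            continue  # both are zero, so they agree
        if abs(api_value - ui_value) / baseline > tolerance:
            mismatches.append((url, api_value, ui_value))
    return mismatches


if __name__ == "__main__":
    # Hypothetical pageview counts from the crawl export and the web interface.
    crawl = {"/blog/a/": 100, "/blog/b/": 100, "/blog/c/": 0}
    interface = {"/blog/a/": 98, "/blog/b/": 80, "/blog/c/": 0}
    print(find_mismatches(crawl, interface))
```

Any URL this flags is worth checking by hand before you base a decision on its numbers.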
2. Set Custom Extraction for Dates
Finding dates for your content audit is important. By extracting dates from the pages, you can avoid cutting any new content from the site.
Date Is on the Blog Post
If the date is on the blog post, use the “CSSPath” or “XPath” extractor in Screaming Frog, under “Configuration” > “Custom” > “Extraction.”
Right click on the date within the blog post with Google Chrome and select “Inspect”.
Then right click on the HTML line of the date and select either “Copy” > “Copy selector” or “Copy” > “Copy XPath.” Paste this syntax into the custom extraction section of Screaming Frog.
Date Is Not on the Page
If there is no date on the page, take a look at the source code. Sometimes the date can be found with certain attributes, such as “datetime=.”
To find out if any of the pages have a datetime attribute or anything else, click on a blog post and select “View Page Source.” Next, use the “Find” option (command/control + f) to search for “datetime.”
If you spot the “datetime=” attribute in the page source, go into Screaming Frog, “Configuration” > “Custom” > “Extraction,” and use the “Regex” method and input the following syntax:
datetime="..........
Export the date info from the “Extraction” filter inside of the “Custom” tab, and put that data into your spreadsheet.
Clean up the data in your spreadsheet by creating a new column using the formula =RIGHT(A2,10), where A2 is the cell with your date output and 10 is the number of characters (digits and hyphens) in the date.
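To see what the regex step and the spreadsheet cleanup do together, here is a minimal Python sketch. It is an illustration rather than Screaming Frog's own behavior: the pattern is a stricter variant that assumes the attribute holds a date starting with YYYY-MM-DD, and the function names and sample HTML are hypothetical.

```python
import re
from typing import Optional

# The attribute name, an opening quote, then the ten characters of a
# YYYY-MM-DD date, captured as a group.
DATETIME_PATTERN = re.compile(r'datetime="(\d{4}-\d{2}-\d{2})')


def extract_date(html: str) -> Optional[str]:
    """Return the first YYYY-MM-DD date found in a datetime attribute."""
    match = DATETIME_PATTERN.search(html)
    return match.group(1) if match else None


def last_ten(extracted: str) -> str:
    """Equivalent of the spreadsheet formula =RIGHT(A2,10)."""
    return extracted[-10:]


if __name__ == "__main__":
    sample = '<time datetime="2019-03-14T08:00:00+00:00">March 14, 2019</time>'
    print(extract_date(sample))
    print(last_ten('datetime="2019-03-14'))
```

Because the date is captured as a group here, no trimming is needed afterwards. The last_ten helper only shows what the spreadsheet formula does to the raw extraction output.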
Neither Method Above Works
If neither of the above methods works for the domain you plan to audit, then you will have to go through the blog manually.
Sort the posts by newest on the site and note those URLs for your audit. Alternatively, if you have a content inventory, calendar, or something similar, you can reference that as well.
3. Filter and Label Your Data
After you’ve gathered your data from Google Analytics, Ahrefs, or Screaming Frog, you’ll need to organize it. Start by creating two columns next to each URL: one to label the action for each page and one to list notes.
Pageviews
Filter your data based on the most valuable metric to you, such as traffic. This could mean only focusing on pages under 50 page views (or any other number that makes sense for your site). This is a good base to start the filtering process, before looking through other sets of data.
Status Codes
Make sure that all the URLs you are looking through have a status code of 200 (OK), if crawled with Screaming Frog.
Fresh Content
Exclude any newer pages that may need some more time to gain momentum.
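With the extracted dates in your sheet, this exclusion can be expressed as a simple age check. This is a minimal Python sketch: the 180-day cutoff, the "date" key, and the sample rows are assumptions for illustration, so pick a window that fits how long your content typically takes to gain traction.

```python
from datetime import date, timedelta
from typing import Dict, List


def is_fresh(published: str, today: date, min_age_days: int = 180) -> bool:
    """True if a YYYY-MM-DD publish date is newer than the cutoff."""
    return today - date.fromisoformat(published) < timedelta(days=min_age_days)


def drop_fresh(
    rows: List[Dict[str, str]], today: date, min_age_days: int = 180
) -> List[Dict[str, str]]:
    """Keep only the rows old enough to be judged in the audit."""
    return [row for row in rows if not is_fresh(row["date"], today, min_age_days)]


if __name__ == "__main__":
    # Hypothetical rows; a fixed "today" keeps the example reproducible.
    rows = [
        {"url": "/blog/new-post/", "date": "2024-05-01"},
        {"url": "/blog/old-post/", "date": "2023-01-15"},
    ]
    print(drop_fresh(rows, today=date(2024, 6, 30)))
```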
Social Shares
Consider keeping URLs with a large number of social shares. Exclude these from your list as you spot them.
Goal Completions
Take a look at the number of goals completed by each page. Omit any pages from your list that are converting.
Impressions
If you see a high number of impressions from GSC, maybe the metadata (title tag and meta description) and content can be improved. Label these as “improve,” if necessary.
Number of Keywords
For URLs with a large number of keywords from the batch analysis, label as “improve.”
Backlinks and Referring Domains
If a URL has any backlinks or referring domains, make sure to redirect it to a similar page. To find a similar page on Google, enter “site:website.com topic” into the search bar. Label the action as “redirect,” and note the target URL under the notes column on your sheet.
Thin Content
If you notice that there are thin content pages that could be merged together to create a bigger resource, label those as “consolidate.” Note the pages that should be consolidated on the sheet. Other pages that could be consolidated are ones with overlapping topics/information or duplicate content.
Anything else, label as “remove.”
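The filters in this section can be read as a decision sequence applied to each URL. The sketch below shows one possible ordering in Python. Every threshold in it is a hypothetical placeholder, the field names are invented for illustration, and the "keep" and "skip" values are our own shorthand for pages that drop off the list rather than labels from the audit sheet.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    pageviews: int
    status_code: int = 200
    is_fresh: bool = False
    social_shares: int = 0
    goal_completions: int = 0
    impressions: int = 0
    keywords: int = 0
    referring_domains: int = 0
    thin_or_overlapping: bool = False


def label_page(
    page: Page,
    max_pageviews: int = 50,      # assumed cutoff, as in the pageviews example
    min_shares: int = 100,        # assumed "large number" of social shares
    min_impressions: int = 1000,  # assumed "high number" of impressions
    min_keywords: int = 10,       # assumed "large number" of keywords
) -> str:
    """Apply the filters in the order they appear in this section."""
    if page.status_code != 200:
        return "skip"  # only pages returning 200 (OK) are audited
    if page.pageviews >= max_pageviews:
        return "keep"  # enough traffic to stay off the pruning list
    if page.is_fresh or page.social_shares >= min_shares or page.goal_completions > 0:
        return "keep"
    if page.impressions >= min_impressions or page.keywords >= min_keywords:
        return "improve"
    if page.referring_domains > 0:
        return "redirect"
    if page.thin_or_overlapping:
        return "consolidate"
    return "remove"


if __name__ == "__main__":
    print(label_page(Page("/blog/example/", pageviews=10, referring_domains=2)))
```

Treat the output as a first pass. The ordering matters when a page meets several conditions, so review the labels by hand before acting on them.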
4. Perform Content Audit Actions
Once all your data is filtered and labeled, the next step is to take action based on the labels set for each.
Improve
There may be a chance that some pages could be improved, rather than wiping them from your site. Use your judgement to decide what aspects of the page can be improved.
If the page looks outdated, improve the page by refreshing the content to represent current data and information. If you spot any thin pages or notice that something could use more explanation, expand the content.
Redirect
Set up 301 redirects from these URLs to pages that are relevant and topically similar.
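How you add redirects depends on your server or CMS. As one illustration, the Python sketch below turns the “redirect” rows and their target URLs into Apache-style Redirect 301 lines. It assumes the site runs on Apache with mod_alias enabled, and the function name and sample URLs are hypothetical. Other servers and most CMS redirect plugins use a different syntax.

```python
from typing import Dict, List
from urllib.parse import urlsplit


def redirect_rules(redirects: Dict[str, str]) -> List[str]:
    """Build one 'Redirect 301 <old path> <target URL>' line per mapping."""
    rules = []
    for old_url, target_url in redirects.items():
        if old_url == target_url:
            continue  # a page should never redirect to itself
        old_path = urlsplit(old_url).path or "/"
        rules.append("Redirect 301 %s %s" % (old_path, target_url))
    return rules


if __name__ == "__main__":
    # Hypothetical mapping taken from the action and notes columns of the sheet.
    mapping = {
        "https://www.website.com/blog/old-post/": "https://www.website.com/blog/new-guide/",
    }
    for rule in redirect_rules(mapping):
        print(rule)
```

Note that Apache's Redirect directive matches by path prefix, so test the generated rules on a staging copy before deploying them.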
Consolidate
Merge the content of the pages that you find overlapping or have thin content. Optimize the content to focus on the overarching topic of the pages that you plan to combine.
Remove
These are the pages that are dead weight compared to everything else on the site. Delete these pages and serve a 410 (Gone) status code for their URLs.
Also, make sure to unlink any internal links to them and remove the deleted URLs from the sitemap. To find internal links to each URL, you can use Screaming Frog.
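If your sitemap is a static XML file rather than one generated by your CMS, removing the deleted URLs can be scripted. This is a minimal Python sketch using the standard sitemap namespace. The function name and sample URLs are hypothetical, and if a plugin generates your sitemap automatically it will usually update on its own.

```python
import xml.etree.ElementTree as ET
from typing import Set

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def prune_sitemap(sitemap_xml: str, removed_urls: Set[str]) -> str:
    """Return the sitemap XML without the <url> entries for removed pages."""
    # Keep the default namespace unprefixed when serializing.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    for url_element in list(root.findall("{%s}url" % SITEMAP_NS)):
        loc = url_element.find("{%s}loc" % SITEMAP_NS)
        if loc is not None and loc.text and loc.text.strip() in removed_urls:
            root.remove(url_element)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical two-page sitemap; one page was labeled "remove".
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://www.website.com/blog/a/</loc></url>"
        "<url><loc>https://www.website.com/blog/b/</loc></url>"
        "</urlset>"
    )
    print(prune_sitemap(sample, {"https://www.website.com/blog/a/"}))
```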
Complete Your Content Audit
When going through your list of content, go through each page diligently. If you see any pages that you believe to be important and provide adequate information but are in the “remove” column, then maybe you need to surface the page higher up in the site structure or add a noindex tag.
Be creative and logical when going through your audit to think of solutions to improve your overall content strategy. However, don’t worry if you need a little help — that’s where we come in. Check out our content marketing and SEO services to learn more.
We totally get that conducting a content audit sounds overwhelming. But by following the steps outlined above, you can get one step closer to having a higher-quality, more trustworthy website.