Combing through the content on your site and removing some of it may seem drastic. But pruning content can have a positive impact on visibility and traffic if done properly.
The importance of quality content for any website is echoed throughout digital marketing. Having quality pages enhances user experience and establishes authority with search engines. But since “quality content” is a subjective term, how can you be sure that the content on your site is high-quality? The easiest way to do this is by conducting a content audit.
A content audit can reveal low-performing content that is bringing down the quality of the rest of your site. It can also provide insights to help improve your content marketing strategy in the long run. This guide will show you how to audit a site, with an emphasis on content pruning, using different tools based on the level of access to the site’s analytics.
SEO Content Audit Steps
Depending on what kind of analytics access and tools you have, there are different methods you can use to conduct a content audit. We’ll cover three audit methods in this tutorial. The first uses Google Analytics without Screaming Frog, the second uses neither, and the third uses both along with Google Search Console.
Each method uses the SEO tool Ahrefs. It is possible to use alternatives to Ahrefs, Screaming Frog, and Google Analytics for your content audit, but these are the most common and powerful tools available. We have no affiliation with any tool mentioned.
1. Gather and Set Up Your Data
No Screaming Frog Method (GA access, Ahrefs)
Tools Needed: Google Analytics, Ahrefs, Spreadsheet software
The first data collection method is for users with access to Google Analytics and Ahrefs.
- Open Google Analytics for the domain you plan to audit.
- Go to “BEHAVIOR” > “Site Content” > “All Pages.”
- Set the time frame to 12 months if the business is highly seasonal to get a full scope of traffic. This could be reduced to as short as three months if there is not much seasonality for the content.
- Set a filter to include only the specified content section you would like to audit, for example “/blog/.”
- Change the number of rows shown on the bottom to show all of the selected pages, because Google Analytics exports the data as it is currently displayed.
- Select “Export” and choose your preferred file format.
- Open up the exported file in your spreadsheet software (e.g. Microsoft Excel or Google Sheets).
- Delete the extra information in the top six rows and the “Day Index” rows on the bottom.
- Add a column next to the “Page” column, and use the concatenate function to link the domain to the page’s path. The formula should look like this: =CONCATENATE("https://www.website.com",A2)
- Input the formula on one cell and then drag the fill handle down or double click it to copy the formulas for the rest of the dataset.
- Log into Ahrefs, and navigate to “More” > “Batch analysis.”
- Paste your list of URLs into Ahrefs’ batch analysis tool, 200 at a time. Hit “Start Analysis” to run the URLs through the tool.
- Export the list and add the data to the spreadsheet you imported the Google Analytics data into, making sure that the data lines up correctly with each URL.
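If you would rather script the URL preparation than do it in a spreadsheet, the concatenate and 200-at-a-time steps above can be sketched in Python. This is only an illustration: the function names and the sample paths are made up, and the domain is the same placeholder used in the formula.

```python
def build_urls(domain, paths):
    """Join the domain and each page path, like =CONCATENATE(domain, A2)."""
    base = domain.rstrip("/")
    return [base + (p if p.startswith("/") else "/" + p) for p in paths]


def batches(items, size=200):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Made-up sample paths standing in for the exported "Page" column.
paths = ["/blog/post-%d/" % n for n in range(450)]
urls = build_urls("https://www.website.com", paths)
chunks = batches(urls)

# Each chunk can be pasted into the batch analysis tool in turn.
print(len(chunks), [len(c) for c in chunks])
```

With 450 sample paths this yields three chunks of 200, 200, and 50 URLs.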
No Google Analytics Access Method (Ahrefs)
Tools Needed: Ahrefs, Spreadsheet software
Sometimes having access to a website’s Google Analytics isn’t an option. The following method will show you how to use an SEO tool like Ahrefs to pull data for a content audit. Keep in mind that tools like Ahrefs base their data points and numbers on search traffic estimates, so traffic numbers will not be exact. That is why it is best to try to gain analytics access.
- Log in to Ahrefs and input the domain. Make sure to include the directory of the specified content you want to audit, if applicable.
- Select “Top pages.”
- Sort by the lowest traffic, then export all the rows.
- Download the exported file from the top right file folder in Ahrefs.
- Open up your spreadsheet software (e.g. Microsoft Excel or Google Sheets) and import the data.
- Go back into Ahrefs and into “More” > “Batch analysis.”
- Take the URLs on your spreadsheet and run them through the tool, 200 at a time.
- Export the data from the batch analysis and paste it accordingly into your original spreadsheet.
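Pasting one export into another by hand makes it easy to misalign rows. As a rough sketch, the two exports can instead be joined on the URL in Python. The column names used here (“URL,” “Traffic,” “Referring domains”) and the sample rows are assumptions for illustration; check the headers in your own exports.

```python
import csv
import io


def merge_by_url(base_rows, extra_rows, key="URL"):
    """Attach columns from extra_rows to base_rows where the URL matches."""
    extra_by_url = {row[key].strip(): row for row in extra_rows}
    merged, unmatched = [], []
    for row in base_rows:
        url = row[key].strip()
        match = extra_by_url.get(url)
        if match is None:
            unmatched.append(url)      # no batch analysis row for this URL
            merged.append(dict(row))
        else:
            merged.append({**row, **match})
    return merged, unmatched


# Tiny in-memory stand-ins for the two exported files (hypothetical data).
top_pages_csv = (
    "URL,Traffic\n"
    "https://www.website.com/blog/a/,12\n"
    "https://www.website.com/blog/b/,0\n"
)
batch_csv = (
    "URL,Referring domains\n"
    "https://www.website.com/blog/a/,3\n"
)

top_pages = list(csv.DictReader(io.StringIO(top_pages_csv)))
batch = list(csv.DictReader(io.StringIO(batch_csv)))
merged, unmatched = merge_by_url(top_pages, batch)
print(merged)
print("Missing from batch export:", unmatched)
```

Any URL that ends up in the unmatched list is worth checking by hand, since it usually means the URL was skipped in a batch or formatted differently between the two tools.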
Screaming Frog Method (GA access, GSC access, Ahrefs)
Tools Needed: Screaming Frog, Google Analytics, Google Search Console, Ahrefs, Spreadsheet software
Having access to each of these tools will allow you to look at different layers of data from your site. With this information, you will be able to make better decisions in regard to your content. One thing to note is that the API data may not be 100% accurate, so always compare your data between platforms.
- Open Screaming Frog and start by clicking on the API tab found on the top right or in the “Configuration” menu, under “API Access.”
- Connect Google Analytics to the Gmail account that has access to the domain’s data.
- Select what segment you want to look at, either all users, organic traffic, or a custom segment you have created.
- Set the date range for 12 months and select the metrics to pull from Google Analytics. Some metrics to consider are unique pageviews, sessions, time on page, landing page views, and goals.
- Select the next API, Google Search Console. Connect the API by logging into the Gmail account with access to the domain’s data.
- Specify the dates for Google Search Console (The API has access to 16 months of data).
- Select the final API, Ahrefs, and plug in the API access token for your account.
- Set your metrics. A few to include are backlinks, referring domains, and social shares.
- Set up date extraction for your site. Go into the menu and select “Configuration” > “Custom” > “Extraction.” This is where you are going to create a custom extraction to pull dates. Label the extractor as “Date,” select the appropriate scraping method, and input your syntax.
- Enter your domain and hit “Start.”
- Wait. The time it will take to crawl through all the pages will depend on the size of your site or audit.
- Pull out the data from the crawl, go to the “Internal” tab, filter by “HTML,” and select “Export.”
- Import the data into your spreadsheet software of choice and double check that the analytics data matches the data in the Google Analytics online interface.
- Paste the URLs from the spreadsheet you created, up to 200 at a time, into Ahrefs’ batch analysis tool (“More” > “Batch analysis”) to get the keywords metric.
- Export that list and paste it into your current set of data, keeping only the “Keywords” column (make sure that the URLs and data line up accordingly).
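Since API data may not match the source platform exactly, a quick comparison can highlight the rows worth a second look. The sketch below is one possible way to do that in Python. The 5% tolerance, the function name, and the sample numbers are all assumptions, not values from any of the tools.

```python
def find_mismatches(crawl, reference, tolerance=0.05):
    """Return (url, crawl_value, reference_value) for values that disagree.

    A pair disagrees when the relative difference exceeds `tolerance`,
    or when the URL is missing from the reference data entirely.
    """
    flagged = []
    for url, crawl_value in crawl.items():
        ref_value = reference.get(url)
        if ref_value is None:
            flagged.append((url, crawl_value, None))
            continue
        largest = max(crawl_value, ref_value)
        if largest and abs(crawl_value - ref_value) / largest > tolerance:
            flagged.append((url, crawl_value, ref_value))
    return flagged


# Hypothetical pageview counts: crawl export vs. the analytics interface.
crawl_views = {"/blog/a/": 100, "/blog/b/": 40, "/blog/c/": 0}
interface_views = {"/blog/a/": 102, "/blog/b/": 80, "/blog/c/": 0}

print(find_mismatches(crawl_views, interface_views))
```

Here only “/blog/b/” is flagged, because 40 versus 80 is far outside the tolerance while 100 versus 102 is within it.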
2. Set Custom Extraction for Dates
Finding dates for your content audit is important. By extracting dates from the pages, you can avoid cutting any new content from the site.
Date is on the blog post
If the date is on the blog post, use the “CSSPath” or “XPath” extractor in Screaming Frog, under “Configuration” > “Custom” > “Extraction.” Right click on the date within the blog post with Google Chrome and select “Inspect”. Then right click on the HTML line of the date and select either “Copy” > “Copy selector” or “Copy” > “Copy XPath.” Paste this syntax into the custom extraction section of Screaming Frog.
Date is not on the page
If there is no date on the page, take a look at the source code. Sometimes the date can be found with certain attributes, such as “datetime=.” To find out if any of the pages have a datetime attribute or anything else, click on a blog post and select “View Page Source.” Next, use the “Find” option (command/control + f) to search for “datetime.”
If you spot the “datetime=” attribute in the page source, go into Screaming Frog, “Configuration” > “Custom” > “Extraction,” and use the “Regex” method and input the following syntax:
Export the date info from the “Extraction” filter inside of the “Custom” tab, and put that data into your spreadsheet. Clean up the data in your spreadsheet by creating a new column using the formula =RIGHT(A2,10), where A2 is the cell with your date output and 10 is the number of characters (digits and hyphens) in the date.
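To see what a “datetime=” lookup and the cleanup step do, here is a small Python sketch. The regular expressions are assumptions written for this example, not the syntax to paste into Screaming Frog, and the sample HTML is made up. Instead of taking the last ten characters as the spreadsheet formula does, it searches for a YYYY-MM-DD pattern, so it also copes with a time after the date.

```python
import re

# Assumed pattern: captures whatever sits inside datetime="..." or datetime='...'.
DATETIME_ATTR = re.compile(r"""datetime=["']([^"']+)["']""")
# ISO-style calendar date, e.g. 2019-03-14.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def extract_date(html):
    """Return the first YYYY-MM-DD found in a datetime attribute, or None."""
    for value in DATETIME_ATTR.findall(html):
        found = ISO_DATE.search(value)
        if found:
            return found.group(0)
    return None


# Made-up markup of the kind you might find in a blog post's page source.
sample = (
    '<time class="entry-date" datetime="2019-03-14T09:30:00+00:00">'
    "March 14, 2019</time>"
)
print(extract_date(sample))  # 2019-03-14
```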
Neither method above works
If neither of the above methods works for the domain you plan to audit, then you will have to go through the blog manually. Sort the posts by newest on the site and note those URLs for your audit. Alternatively, if you have a content inventory, calendar, or something similar, you can reference that as well.
3. Filter and Label Your Data
After you’ve gathered your data from Google Analytics, Ahrefs, or Screaming Frog, you’ll need to organize it. Start by creating two columns next to each URL to label actions for each page and to list notes.
Filter your data based on the most valuable metric to you, such as traffic. This could mean only focusing on pages under 50 page views (or any other number that makes sense for your site). This is a good base to start the filtering process, before looking through other sets of data.
Make sure that all the URLs you are looking through have a status code of 200 (OK), if crawled with Screaming Frog.
Exclude any newer pages that may need some more time to gain momentum.
Consider keeping URLs with a large number of social shares; exclude these from your list as you spot them.
Take a look at the number of goals completed by each page. Omit any pages from your list that are converting.
If you see a high number of impressions from GSC, maybe the metadata (title tag and meta description) and content can be improved. Label these as “improve,” if necessary.
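The first filtering pass can be expressed as a simple check per page. The Python sketch below is one way to do it under stated assumptions: only the 50 page view cutoff comes from the example above, while the age and share thresholds, the field names, and the sample pages are hypothetical and should be tuned to your site.

```python
from datetime import date


def is_candidate(page, today, max_views=50, min_age_days=180, max_shares=100):
    """True if a page stays on the pruning shortlist after the first pass."""
    if page["status"] != 200:
        return False  # only review pages that return 200 (OK)
    if page["pageviews"] >= max_views:
        return False  # enough traffic to leave alone for now
    if (today - page["published"]).days < min_age_days:
        return False  # too new; give it time to gain momentum
    if page["shares"] >= max_shares:
        return False  # heavily shared pages are worth keeping
    if page["goals"] > 0:
        return False  # converting pages stay
    return True


today = date(2024, 1, 1)  # fixed date so the example is repeatable
pages = [
    {"url": "/blog/old-quiet/", "status": 200, "pageviews": 12,
     "published": date(2021, 5, 1), "shares": 3, "goals": 0},
    {"url": "/blog/new-post/", "status": 200, "pageviews": 5,
     "published": date(2023, 12, 1), "shares": 0, "goals": 0},
    {"url": "/blog/converter/", "status": 200, "pageviews": 20,
     "published": date(2020, 2, 1), "shares": 0, "goals": 4},
    {"url": "/blog/moved/", "status": 301, "pageviews": 0,
     "published": date(2019, 7, 1), "shares": 0, "goals": 0},
]

shortlist = [p["url"] for p in pages if is_candidate(p, today)]
print(shortlist)  # ['/blog/old-quiet/']
```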
Number of Keywords
For URLs with a large number of keywords from the batch analysis, label as “improve.”
Backlinks and Referring Domains
If there are any backlinks or referring domains, make sure to redirect those to a similar page. To find a similar page on Google, enter into the search bar “site:website.com topic.” Label the action as “redirect,” and note the target URL under the notes column on your sheet.
If you notice that there are thin content pages that could be merged together to create a bigger resource, label those as “consolidate.” Note the pages that should be consolidated on the sheet. Other pages that could be consolidated are ones with overlapping topics/information or duplicate content.
Anything else, label as “remove.”
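Once a page is on the shortlist, the labeling rules above can be read as a small decision function. This Python sketch is only one interpretation: the thresholds and field names are hypothetical, the “thin_or_overlapping” flag stands in for a judgment you make by reading the page, and the order in which the rules are checked (improve, then redirect, then consolidate) is an assumption for pages that match more than one rule.

```python
def label_page(page, min_impressions=1000, min_keywords=10):
    """Return the audit action for one shortlisted page."""
    # High impressions or many ranking keywords suggest the page has potential.
    if page["impressions"] >= min_impressions or page["keywords"] >= min_keywords:
        return "improve"
    # Link equity should be passed on to a similar page.
    if page["backlinks"] > 0 or page["referring_domains"] > 0:
        return "redirect"
    # Thin or overlapping pages can be merged into a bigger resource.
    if page["thin_or_overlapping"]:
        return "consolidate"
    return "remove"


# Hypothetical shortlisted pages.
samples = {
    "/blog/seen-not-clicked/": {"impressions": 5000, "keywords": 2, "backlinks": 0,
                                "referring_domains": 0, "thin_or_overlapping": False},
    "/blog/linked-to/": {"impressions": 10, "keywords": 0, "backlinks": 4,
                         "referring_domains": 2, "thin_or_overlapping": False},
    "/blog/short-tip/": {"impressions": 5, "keywords": 1, "backlinks": 0,
                         "referring_domains": 0, "thin_or_overlapping": True},
    "/blog/dead-weight/": {"impressions": 0, "keywords": 0, "backlinks": 0,
                           "referring_domains": 0, "thin_or_overlapping": False},
}

labels = {url: label_page(data) for url, data in samples.items()}
for url, action in labels.items():
    print(url, "->", action)
```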
4. Perform Content Audit Actions
Once all your data is filtered and labeled, the next step is to take action based on the labels set for each.
Pages labeled “improve” may be worth improving rather than wiping from your site. Use your judgement to decide what aspects of the page can be improved. If the page looks outdated, improve the page by refreshing the content to represent current data and information. If you spot any thin pages or notice that something could use more explanation, expand the content.
For URLs labeled “redirect,” set up 301 redirects to pages that are relevant and topically similar.
For pages labeled “consolidate,” merge the content of the pages that overlap or have thin content. Optimize the content to focus on the overarching topic of the pages that you plan to combine.
Pages labeled “remove” are dead weight compared to everything else on the site. Delete these pages by serving a 410 header. Also, make sure to unlink any internal links to them and remove the deleted URLs from the sitemap. To find internal links to each URL, you can use Screaming Frog.
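How you serve 301 and 410 responses depends on your server or CMS, so treat the following Python sketch as an illustration of the logic rather than a production setup. The paths are made up, and the handler uses Python’s built-in http.server module only to show which status code and header each kind of URL should return.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical results of the audit.
REDIRECTS = {"/blog/old-guide/": "/blog/new-guide/"}  # labeled "redirect"
REMOVED = {"/blog/dead-weight/"}                      # labeled "remove"


def resolve(path):
    """Return (status, location) for a requested path."""
    if path in REDIRECTS:
        return 301, REDIRECTS[path]  # moved permanently to a similar page
    if path in REMOVED:
        return 410, None             # gone on purpose
    return 200, None                 # everything else is served as usual


class AuditHandler(BaseHTTPRequestHandler):
    """Minimal handler that sends only the status line and headers."""

    def do_GET(self):
        status, location = resolve(self.path)
        self.send_response(status)
        if location:
            self.send_header("Location", location)
        self.end_headers()
        # A real site would write the page body here for 200 responses.


print(resolve("/blog/old-guide/"))
print(resolve("/blog/dead-weight/"))
```

The handler class can be passed to http.server.HTTPServer for local testing if you want to watch the responses in a browser or with a command-line client.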
Wrapping Things Up
When going through your list of content, go through each page diligently. If you see any pages that you believe to be important and provide adequate information but are in the “remove” column, then maybe you need to surface the page higher up in the site structure or add a noindex tag. Be creative and logical when going through your audit to think of solutions to improve your overall content strategy.
Conducting a content audit sounds like a daunting task. But by following the steps outlined above, you can get one step closer to having a higher-quality, more trustworthy website.