Catalog all the URLs on your site that you may want to optimize for users and search engines.
Remove unnecessary items from your crawl.
Enter your homepage URL and start the crawl.
Crawling may take anywhere from a few seconds to a couple of hours, depending on the size of your site. If your site has more than 20,000 pages, Screaming Frog may crash. Either save your progress periodically or consider a different tool.
Monitor the crawl to identify any “crawl traps.”
If the crawler gets stuck in a subdirectory (for example, /wp-content/) with thousands of irrelevant pages, pause and clear the crawl, add the subdirectory to the Exclude filter, and start the crawl again. You may need to repeat this step.
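Before restarting a long crawl, it can help to check an exclude pattern against a few known URLs. The sketch below assumes the Exclude filter takes a regular expression matched against the URL. The pattern, the helper name `is_excluded`, and the sample URLs are illustrative, not taken from the tool.

```python
import re

# Hypothetical exclude pattern for the /wp-content/ subdirectory.
EXCLUDE_PATTERN = r".*/wp-content/.*"


def is_excluded(url: str, pattern: str = EXCLUDE_PATTERN) -> bool:
    """Return True if the whole URL matches the exclude pattern."""
    return re.fullmatch(pattern, url) is not None


# Illustrative URLs: one inside the crawl trap, two that should stay.
sample_urls = [
    "https://example.com/",
    "https://example.com/wp-content/uploads/2020/01/photo.jpg",
    "https://example.com/blog/wp-content-guide/",
]

excluded = [u for u in sample_urls if is_excluded(u)]
print(excluded)
```

Including a near-miss URL such as /blog/wp-content-guide/ in the sample is a quick way to confirm the pattern does not exclude pages you want to keep.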
Once the crawl has completed, export it as a CSV file and open the file in Google Sheets or Excel.
In the Content column, delete all rows that are not HTML.
In the Status Code column, delete all rows whose status code is not 200.
Delete all columns except for Address, Title 1, and Meta Description 1.
Optionally, also keep H1 and Word Count if you’re interested in reviewing those elements.
Sort the spreadsheet by the Address column.
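For large exports, the same spreadsheet cleanup can be scripted. This is a minimal sketch using only the Python standard library. It assumes the export uses the column names mentioned in the steps above (Address, Content, Status Code, Title 1, Meta Description 1). The function name `build_inventory` and the sample rows are hypothetical.

```python
import csv
import io

# Columns to keep, as named in the steps above.
KEEP_COLUMNS = ["Address", "Title 1", "Meta Description 1"]


def build_inventory(csv_text: str) -> list[dict]:
    """Keep HTML pages with a 200 status, trim columns, sort by Address."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        # Drop anything that is not HTML (images, CSS, PDFs, ...).
        if "html" not in row["Content"].lower():
            continue
        # Drop redirects, errors, and other non-200 responses.
        if row["Status Code"].strip() != "200":
            continue
        rows.append({col: row[col] for col in KEEP_COLUMNS})
    return sorted(rows, key=lambda r: r["Address"])


# Hypothetical export data for illustration only.
sample = """Address,Content,Status Code,Title 1,Meta Description 1
https://example.com/b,text/html; charset=UTF-8,200,Page B,About B
https://example.com/logo.png,image/png,200,,
https://example.com/old,text/html; charset=UTF-8,404,Not Found,
https://example.com/a,text/html; charset=UTF-8,200,Page A,About A
"""

inventory = build_inventory(sample)
for page in inventory:
    print(page["Address"], "|", page["Title 1"])
```

To run this on a real export, read the file's text and pass it to `build_inventory`. If your export labels the columns differently, adjust the names to match.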
Remove any pages that you don’t expect to optimize, such as:
- Pages with UTM parameters
- Tag pages
- Author pages
- Pagination URLs
Your content inventory should include only pages that marketers will actively try to improve.
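The removal of UTM, tag, author, and pagination URLs can also be sketched in code. The example below assumes WordPress-style paths (/tag/, /author/, /page/2/), which is an assumption about the site's URL structure rather than something the steps above specify. The helper name `should_skip` and the sample URLs are hypothetical.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumed WordPress-style path patterns; adjust to your site's URL structure.
SKIP_PATH = re.compile(r"/(tag|author)/|/page/\d+/?$")


def should_skip(url: str) -> bool:
    """Return True for UTM-tagged, tag, author, and pagination URLs."""
    parts = urlsplit(url)
    params = parse_qs(parts.query, keep_blank_values=True)
    has_utm = any(key.lower().startswith("utm_") for key in params)
    return has_utm or SKIP_PATH.search(parts.path) is not None


# Illustrative URLs only.
urls = [
    "https://example.com/services/",
    "https://example.com/blog/?utm_source=newsletter",
    "https://example.com/tag/seo/",
    "https://example.com/author/jane/",
    "https://example.com/blog/page/2/",
    "https://example.com/blog/landing-page/",
]

kept = [u for u in urls if not should_skip(u)]
print(kept)
```

Pattern matching like this only narrows the list. Deciding which of the remaining pages are worth optimizing still takes a manual review.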