Keep a search engine from crawling a page

onajite_omare · January 22, 2021, 7:31pm

Contributors

@brandon-leuangpaseuth @andreea-macoveiciuc-content-expert

Business Benefits

Stop search engine bots from discovering and reviewing a page’s content.

Type “site:yourdomain.com/page-url” into Google Search and other search engines to check whether a page has been crawled.

Replace “yourdomain.com/page-url” with the address of the page you want to prevent search engine bots from crawling.
You can also use the page title instead of the page URL to cross-check. For example, site:yourdomain.com “page title”.
Proceed to step 3 if results show up. Otherwise, move on to step 2.

Type the page URL into the URL Inspection Tool in Google Search Console to determine whether Google search bots can crawl it.

The result should read “URL is not on Google” if Google search bots are blocked from crawling the page. Move on to step 3 if you get a different result.

Decide whether you want to block search engine bots using your robots.txt file, password protection, or the noindex tag.

Search bots can’t crawl password-protected pages. Reach out to your web developer to password-protect the page, then continue to step 5.
Move on to the next step to block the page in your robots.txt file if you don’t want to password-protect the page.
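Block search indexing with noindex. You can prevent a page from appearing in Google Search by including a noindex meta tag in the page’s HTML code, or by returning a noindex header (X-Robots-Tag) in the HTTP response. For the noindex rule to take effect, the page must not also be blocked in robots.txt, because the crawler has to fetch the page to see the rule.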
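To confirm that a noindex rule is actually in place, a short script can look for it in a page’s HTML and response headers. This is a minimal sketch, not an official tool: the helper name has_noindex and the sample markup are assumptions for illustration, and it only handles the simple comma-separated form of the directives (it ignores bot-specific prefixes such as “googlebot: noindex” in the X-Robots-Tag header).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html, headers=None):
    """Hypothetical helper: True if the HTML or the headers carry a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)

    # Header names are case-insensitive, so compare them in lowercase.
    header_value = ""
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            header_value = value
    header_directives = [d.strip().lower() for d in header_value.split(",")]

    return "noindex" in parser.directives or "noindex" in header_directives


# Sample inputs (assumed for illustration, not taken from a real site).
page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))                                        # True
print(has_noindex("<html></html>", {"X-Robots-Tag": "noindex"}))  # True
print(has_noindex("<html></html>"))                             # False
```

In practice you would pass in the HTML and headers fetched from your own page; the sketch leaves the fetching step out so it runs without network access.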

Log into your web server and use a text editor to add rules blocking search engine bots to your robots.txt file below any existing rules.

For example:

```
User-agent: [user-agent-name]
Disallow: [URL string]
```

- Replace [user-agent-name] with the name of the search engine bot you want to block. Use * (that is, User-agent: *) if you want to block all search engine bots from crawling your page.
- Replace [URL string] with the URL string you want to prevent search engine bots from crawling. For example, if you want to block https://domain.com/your-page/, then the URL string would be /your-page/.
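Before uploading the file, you may want to check that the rules block what you expect. The sketch below uses Python’s standard urllib.robotparser module on an in-memory copy of the rules; the rule text mirrors the /your-page/ example above, and “Googlebot” and /other-page/ are assumptions chosen for illustration. Python’s parser follows the basic robots.txt rules and may not match every search engine’s interpretation exactly.

```python
from urllib.robotparser import RobotFileParser

# Rules as they would appear in robots.txt (example values).
rules = """\
User-agent: *
Disallow: /your-page/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# can_fetch() returns False when the given bot is disallowed from the URL.
blocked = not parser.can_fetch("Googlebot", "https://domain.com/your-page/")
allowed = parser.can_fetch("Googlebot", "https://domain.com/other-page/")

print(blocked, allowed)  # True True
```

If blocked comes back False, the Disallow path probably does not match the page’s URL path, for instance because of a missing leading or trailing slash.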

Last edited by @hesh_fekry 2023-11-14T12:30:31Z
