The robots.txt file is a small text file located in the root folder of your site. It tells search engine bots which parts of the site to crawl and which parts to skip.
Even a slight mistake while editing or customizing it can cause search engine bots to stop crawling and indexing your site, and your site may then disappear from the search results.
In this article, I will tell you what is
Why Do You Need Robots.txt?
When search engine bots come to a website, they follow the robots file while crawling its content. But if your site does not have a robots.txt file, search engine bots and web crawlers will crawl and index all the content of your website, including content you do not want indexed.
Web crawlers look for the robots file before crawling any website. When they do not get any instructions from
That is why a robots.txt file is required. If we do not give instructions to search engine bots through this file, they will index our entire site, including data that you did not want indexed.
Benefits of Robots.txt File
- Tells search engines which parts of the site to crawl and which parts to skip.
- Keeps crawlers away from any particular file, folder, image, PDF, etc., so that it does not get indexed through crawling.
- Sometimes search engine crawlers crawl your site like a hungry lion, which can noticeably hurt your site's performance. You can reduce this problem by adding a crawl-delay directive to your robots file. Googlebot does not follow this directive, but you can set the crawl rate in Google Search Console instead. Either way, the goal is to keep your server from being overloaded.
- You can keep crawlers out of an entire section of a website.
- You can stop internal search results pages from showing in SERPs.
- You can improve your website SEO by blocking low
Where is the Robots.txt Found on a Site?
If you are a WordPress user, the file lives in the root folder of your site. If the file is not available at this location, search engine bots start indexing your entire website, because they do not search the rest of your website for a robots.txt file.
If you do not know whether your site has a robots.txt file, just type this in your browser's address bar: example.com/robots.txt
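The same check can be scripted. Here is a minimal sketch in Python, using only the standard library; the function names `robots_url` and `fetch_robots` are made up for this example, and example.com is a placeholder for your own domain.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # robots.txt sits at the root of the host, whatever the page path is.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots(page_url: str, timeout: float = 10.0):
    """Return the robots.txt text, or None if the request fails (e.g. a 404)."""
    try:
        with urlopen(robots_url(page_url), timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError and HTTPError are both subclasses of OSError.
        return None
```

Calling `fetch_robots("https://example.com/some/page")` returns the file's text if the site serves one, and `None` otherwise. Note that `None` can also mean a network problem, so it is only a hint that the file is missing.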
A plain-text page will open, as you can see in the screenshot.
This is the robots.txt file of InHindiHelp. If you do not see such a text page, you need to create a robots.txt file for your site.
In addition, you can check it with the tools in Google Search Console.
Here is a guide on how to create a Perfect Robots.txt file for SEO…
Robots.txt File’s Basic Format
The basic format of the robots.txt file is very simple, and it looks like this:
User-agent: [user-agent name]
Disallow: [URL or page that you do not want to crawl]
These two lines are enough to make a complete robots.txt file. However, a robots file can contain many groups of user agents and directives (Disallow, Allow, Crawl-delay, etc.).
- User-agent: The search engine crawler/bot that the rules apply to. If you want to give the same instruction to all search engine bots, use the * symbol after User-agent:, like this: User-agent: *
- Disallow: Tells bots not to crawl the listed files and directories, which keeps them from being indexed through crawling.
- Allow: Allows search engine bots to crawl and index the listed content.
- Crawl-delay: The number of seconds a bot should wait before loading and crawling page content.
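To see how these four directives work together, here is a small sketch that feeds a robots.txt to Python's standard `urllib.robotparser` module and asks what a bot may fetch. The paths and the bot name `ExampleBot` are hypothetical, made up only for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are made up for illustration.
ROBOTS_TXT = """\
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a made-up crawler name; it falls under the * group.
blocked = parser.can_fetch("ExampleBot", "https://example.com/private/report.html")
allowed = parser.can_fetch("ExampleBot", "https://example.com/private/public-page.html")
elsewhere = parser.can_fetch("ExampleBot", "https://example.com/blog/")
delay = parser.crawl_delay("ExampleBot")

print(blocked, allowed, elsewhere, delay)  # False True True 10
```

The Allow line is placed before the Disallow line on purpose: Python's parser applies the first rule that matches, while some search engines pick the most specific rule instead. Real crawlers may therefore read the same file a little differently.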
Preventing All Web Crawlers from Indexing Website
Using these rules in the robots.txt file, you can prevent all web crawlers/bots from crawling the website:
User-agent: *
Disallow: /
Allow All Web Crawlers to Index All Content
In the robots.txt file, these rules allow all search engine bots to crawl all the pages of your site:
User-agent: *
Disallow:
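The block-everything and allow-everything cases differ only in the value after Disallow. The sketch below checks both with Python's standard `urllib.robotparser`. It assumes the conventional forms of the two rule sets (`Disallow: /` versus an empty `Disallow:`), and the helper name `can_crawl` is made up for this example.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots_txt and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Assumed conventional forms of the two rule sets.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

URL = "https://example.com/any/page.html"  # placeholder URL
block_result = can_crawl(BLOCK_ALL, "Googlebot", URL)
allow_result = can_crawl(ALLOW_ALL, "Googlebot", URL)

print(block_result, allow_result)  # False True
```

A single slash is all that separates a fully open site from a fully blocked one, so it is worth running a check like this before uploading an edited file.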
Blocking a Specific Folder for Specific Web Crawlers
User-agent: Googlebot
Disallow: /example-subfolder/
These rules prevent only Google's crawler from crawling example-subfolder. If you want to block all crawlers, your robots file should look like this:
User-agent: *
Disallow: /example-subfolder/
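Here is a quick way to confirm which crawlers a folder rule actually affects, again sketched with Python's standard `urllib.robotparser`. The rule sets assume the usual `User-agent: Googlebot` and `User-agent: *` forms, and the helper `blocked_agents` is a hypothetical name.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule sets: one aimed at Googlebot only, one aimed at every crawler.
GOOGLE_ONLY = """\
User-agent: Googlebot
Disallow: /example-subfolder/
"""
EVERYONE = """\
User-agent: *
Disallow: /example-subfolder/
"""


def blocked_agents(robots_txt, agents, url):
    """Return the agents from `agents` that may NOT fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


AGENTS = ["Googlebot", "Bingbot"]
URL = "https://example.com/example-subfolder/page.html"  # placeholder URL

google_only_blocked = blocked_agents(GOOGLE_ONLY, AGENTS, URL)
everyone_blocked = blocked_agents(EVERYONE, AGENTS, URL)

print(google_only_blocked)  # ['Googlebot']
print(everyone_blocked)     # ['Googlebot', 'Bingbot']
```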
Preventing a Specific Page (Thank You Page) from Being Indexed
User-agent: *
Disallow: /page URL (Thank You Page)
This will prevent all crawlers from crawling your page URL. If you want to block only a specific crawler, write it like this:
User-agent: Bingbot
Disallow: /page URL
These rules will prevent only Bingbot from crawling your page URL.
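A page-level rule should leave the rest of the site untouched, and a crawler-specific rule should leave other crawlers untouched. The sketch below checks both with Python's standard `urllib.robotparser`. The path `/thank-you/` is a hypothetical stand-in for your real page URL.

```python
from urllib.robotparser import RobotFileParser

# "/thank-you/" is a hypothetical path standing in for the real page URL.
ROBOTS_TXT = """\
User-agent: Bingbot
Disallow: /thank-you/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

SITE = "https://example.com"  # placeholder domain
bing_thank_you = parser.can_fetch("Bingbot", SITE + "/thank-you/")
bing_home = parser.can_fetch("Bingbot", SITE + "/")
google_thank_you = parser.can_fetch("Googlebot", SITE + "/thank-you/")

print(bing_thank_you, bing_home, google_thank_you)  # False True True
```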
Adding a Sitemap to a Robots.txt File
You can add your sitemap anywhere in robots.txt, at the top or at the bottom. Here is a guide: How to add Sitemap to Robots.txt File and Why is it important?
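To illustrate that the position of the sitemap line does not matter, this sketch reads the sitemap back with `site_maps()` from Python's standard `urllib.robotparser` (available from Python 3.8). The sitemap URL and the `/private/` rule are placeholders, and `sitemaps_in` is a made-up helper name.

```python
from urllib.robotparser import RobotFileParser

SITEMAP_URL = "https://example.com/sitemap.xml"  # placeholder URL
RULES = "User-agent: *\nDisallow: /private/\n"   # hypothetical rule group


def sitemaps_in(robots_txt):
    """Return the sitemap URLs listed in robots_txt, or None if there are none."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps()


# Same rules, with the Sitemap line placed first and then last.
at_top = sitemaps_in(f"Sitemap: {SITEMAP_URL}\n{RULES}")
at_bottom = sitemaps_in(f"{RULES}Sitemap: {SITEMAP_URL}\n")

print(at_top == at_bottom)  # True
```

This only shows how one parser treats the line. It matches the point above that the sitemap entry is independent of the user-agent groups around it.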