Free Robots.txt Generator - Zero Server Tools

What is a robots.txt File and Why Does It Matter?

A robots.txt file is a plain text file placed at the root directory of your website (e.g., https://yourdomain.com/robots.txt). It follows the Robots Exclusion Protocol (REP), a long-standing web convention (formalized as RFC 9309 in 2022) that lets website owners tell crawlers which parts of a site they may crawl and which they should stay out of. It governs crawling only; it does not by itself control indexing or restrict access.

Before crawling a site, well-behaved search engine bots (like Googlebot, Bingbot, or Yandexbot) fetch its robots.txt file, or reuse a recently cached copy, and follow the rules that apply to them. This makes robots.txt one of the most useful, yet often overlooked, tools in technical SEO.

Key Directives Explained

User-agent: Specifies the crawler the rule applies to. A * wildcard means the rule applies to all crawlers.
Disallow: Tells the crawler not to access URLs whose path starts with the given value. An empty value (Disallow: with nothing after it) blocks nothing, so everything on the site is allowed.
Allow: Explicitly permits access to a URL path, even if a broader Disallow rule blocks the parent directory. Under the standard, when both rules match a URL, the more specific (longer) path wins.
Sitemap: Points crawlers to the absolute URL of your XML sitemap, helping them discover all your pages more efficiently.
Crawl-delay: Asks bots to wait a specified number of seconds between requests, which limits the load from aggressive crawlers. It is a non-standard directive that only some crawlers honor; Googlebot ignores it.
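
How Allow and Disallow interact can be easier to follow in code. The Python sketch below assumes the longest-match rule from RFC 9309 (the most specific matching path wins, and Allow wins a tie). It is a simplification: it uses plain prefix matching, ignores the * and $ wildcards, and the function name is_allowed and the sample paths are hypothetical.

```python
def is_allowed(path, rules):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, path_prefix) tuples for one
    user-agent group, e.g. ("disallow", "/private/").
    Simplified sketch: prefix matching only, no wildcard support.
    """
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, prefix in rules:
        if not prefix:
            continue  # an empty value matches nothing
        if path.startswith(prefix):
            is_allow = directive.lower() == "allow"
            # The longest matching prefix wins; on a tie, Allow wins.
            if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
                best_len = len(prefix)
                allowed = is_allow
    return allowed


rules = [("disallow", "/private/"), ("allow", "/private/press/")]
print(is_allowed("/private/notes.html", rules))     # False
print(is_allowed("/private/press/kit.pdf", rules))  # True
print(is_allowed("/blog/", rules))                  # True
```

Real crawlers may differ in the details, so treat this as an illustration of the precedence rule rather than a replacement for testing against a specific crawler.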

Practical Example

A typical robots.txt file for a standard website looks like this:

User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /

Sitemap: https://www.yourdomain.com/sitemap.xml
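
One way to sanity-check a file like this before uploading it is Python's standard-library urllib.robotparser. The sketch below parses the example above from a string and asks which paths a crawler may fetch. The user-agent name ExampleBot and the test paths are made up for illustration. Note that Python's parser applies the first matching rule in file order, which can differ from the longest-match behavior of some search engines, so it is only an approximation.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /

Sitemap: https://www.yourdomain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from memory, no network request

base = "https://www.yourdomain.com"
for path in ("/", "/blog/post-1", "/admin/login", "/private/report.pdf"):
    # ExampleBot is a hypothetical crawler name matched by the * group
    print(path, parser.can_fetch("ExampleBot", base + path))

# Sitemap URLs declared in the file (requires Python 3.8+)
print(parser.site_maps())
```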

Which Bots Should You Block?

Most legitimate search engine bots (Googlebot, Bingbot, Yandexbot) should be allowed to crawl your website. However, you may want to control access for:

AI Training Bots: GPTBot (OpenAI), ClaudeBot (Anthropic), and CCBot (Common Crawl) collect web content that can be used to train large language models. You can block them if you prefer that your content not be used for AI training.
Scrapers: Aggressive bots that copy content without adding SEO value. They rarely respect robots.txt, so a Disallow rule alone is unlikely to stop them; reducing the load they cause usually takes server-level blocking such as firewall rules or rate limiting.
Redundant Bots: If you don't target certain markets, you can disallow bots like Baiduspider (China) or YandexBot (Russia) so that server resources are not spent on crawls that bring you no relevant traffic.
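
The choices above translate into one group per blocked bot plus a default group for everyone else. The Python sketch below shows one possible way to assemble such a file. The function build_robots_txt and its parameters are hypothetical and are not taken from any particular generator.

```python
def build_robots_txt(blocked_agents, disallowed_paths=(), sitemap_url=None):
    """Assemble robots.txt text (hypothetical helper, for illustration).

    blocked_agents:   crawler names to shut out completely
    disallowed_paths: paths hidden from all other crawlers
    sitemap_url:      optional absolute URL of the XML sitemap
    """
    lines = []
    for agent in blocked_agents:
        # One group per blocked crawler: disallow the whole site
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append("User-agent: *")
    if disallowed_paths:
        lines += [f"Disallow: {path}" for path in disallowed_paths]
    else:
        lines.append("Disallow:")  # empty value: nothing is blocked
    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


print(build_robots_txt(
    ["GPTBot", "CCBot"],
    disallowed_paths=["/admin/"],
    sitemap_url="https://www.yourdomain.com/sitemap.xml",
))
```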

Frequently Asked Questions (FAQs)

Does disallowing a page in robots.txt remove it from Google?

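No. Disallowing a URL in robots.txt prevents Google from crawling it, but if other pages link to it, Google may still index the URL without visiting it. To keep a page out of search results, add a <meta name="robots" content="noindex"> tag and leave the page crawlable, because Google can only obey the tag if it is allowed to fetch the page. For a quick, temporary removal, use the Removals tool in Google Search Console.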
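
To confirm that a page actually carries a noindex signal, you can inspect its HTML for a robots meta tag of the form <meta name="robots" content="noindex">. The Python sketch below does this with the standard-library html.parser. The class and function names are hypothetical, and the check is deliberately narrow: it does not look at the X-Robots-Tag HTTP header or at crawler-specific meta names.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Records whether a page has a robots meta tag containing noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)  # attribute names arrive lowercased
        name = (attributes.get("name") or "").lower()
        content = (attributes.get("content") or "").lower()
        if name == "robots" and "noindex" in content:
            self.noindex = True


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))                                                   # True
print(has_noindex("<html><head><title>Public page</title></head></html>"))  # False
```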

Where should I place the robots.txt file?

The file must be placed in the root directory of your domain — accessible at https://www.yourdomain.com/robots.txt. It cannot be placed in a subdirectory and still function correctly. After uploading, verify it's accessible at that URL.
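
Because the location is fixed, the robots.txt URL for any page can be derived from the page's own URL. The Python sketch below does this with urllib.parse. The helper name robots_txt_url is hypothetical, and the example also reflects that each host (for instance a shop subdomain) has its own robots.txt.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL that governs `page_url` (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.yourdomain.com/blog/2024/post.html?ref=home"))
# → https://www.yourdomain.com/robots.txt
print(robots_txt_url("https://shop.yourdomain.com/cart"))
# → https://shop.yourdomain.com/robots.txt
```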

Does Googlebot respect the Crawl-Delay directive?

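Googlebot does not support the Crawl-delay directive. Google retired the Crawl Rate setting in Search Console in early 2024 and now adjusts its crawl rate automatically based on how your server responds; to slow it down in an emergency, Google's documentation suggests temporarily returning 500, 503, or 429 status codes. Support among other bots varies: Bingbot and Slurp typically honor Crawl-delay, while Yandex has stated that it no longer uses the directive, so check each crawler's documentation.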
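
If you write your own crawler, or want to check what delay a file requests from a given bot, Python's urllib.robotparser exposes the value directly. The sketch below is illustrative only: the file contents, the 10-second value, and the ExampleBot name are assumptions.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Bingbot
Crawl-delay: 10

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay() returns the delay in seconds for the matching group,
# or None when that group does not set one (requires Python 3.6+).
print(parser.crawl_delay("Bingbot"))     # 10
print(parser.crawl_delay("ExampleBot"))  # None
```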

Is robots.txt a security measure?

No. Robots.txt is a public file that anyone can read. Listing private directories in it can actually reveal them to malicious users who check the file for exactly that reason. For real security, use HTTP authentication, server-level access rules, or application-level authentication.

Should I block AI training bots like GPTBot?

This is a personal or business decision. If you don't want your content used to train OpenAI's models, such as those behind ChatGPT, add a group for GPTBot: a User-agent: GPTBot line followed by Disallow: /. Note that compliance is voluntary: reputable crawlers honor the rule, but others may ignore it.
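
The effect of such a rule can be checked with Python's urllib.robotparser. The sketch below assumes a file that blocks only GPTBot and leaves the site open to everyone else; the page URL is a placeholder. It shows what a compliant crawler would conclude, not what any particular crawler will actually do.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://www.yourdomain.com/articles/guide.html"  # placeholder URL
print(parser.can_fetch("GPTBot", url))     # False: blocked site-wide
print(parser.can_fetch("Googlebot", url))  # True: falls through to the * group
```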
