The robots.txt file must be placed in the root directory of your site (e.g., https://example.com/robots.txt); crawlers will not look for it in a subdirectory.
1. Basic Syntax
User-agent: [crawler-name]
Disallow: [path-to-block]
Allow: [path-to-allow]
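Filling in the template, a minimal robots.txt might look like this (the paths are just placeholders):
User-agent: *
Disallow: /drafts/
Allow: /drafts/preview.html
The Disallow and Allow rules apply to the User-agent group directly above them.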
2. Common Examples
Block All Crawlers from a Folder
User-agent: *
Disallow: /private-folder/
Block a Specific Bot (e.g., Bingbot)
User-agent: Bingbot
Disallow: /no-bing/
Allow One Bot While Blocking Others
User-agent: Googlebot
Allow: /
User-agent: *
Disallow: /
Block Images but Allow HTML Pages
User-agent: Googlebot-Image
Disallow: /
User-agent: *
Allow: /
robots.txt Best Practices
1. Don’t Use robots.txt to Hide Sensitive Data
The file is publicly readable, and blocked URLs can still end up indexed if other sites link to them. Use authentication or a noindex directive for truly private content.
2. Avoid Blocking CSS & JavaScript
Google needs these files to render pages properly. Blocking them can hurt indexing.
3. Use Allow to Override Disallow
This lets you block a whole folder while keeping specific pages inside it crawlable (see the example after this list).
4. Submit robots.txt to Google Search Console
Google’s Robots.txt Tester (in Search Console)
– Checks for errors and crawlability.
Third-Party Tools
– Screaming Frog
– SEOrobot
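To illustrate best practice 3: you can disallow an entire folder and still allow a single page inside it, since Google follows the most specific matching rule. The folder and file names below are hypothetical:
User-agent: *
Disallow: /downloads/
Allow: /downloads/catalog.html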
You can also generate robots.txt dynamically, for example with a short PHP script that outputs plain text:
<?php
// Serve the response as plain text so crawlers parse it as robots.txt
header('Content-Type: text/plain');
echo "User-agent: *\n";
echo "Disallow: /search/\n";
?>
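A common reason to generate the file dynamically is to serve different rules per environment, e.g. blocking all crawling on a staging host. A minimal sketch, assuming a hypothetical staging hostname:
<?php
// Sketch only: the hostname value is an assumption; adapt it to your setup.
header('Content-Type: text/plain');
$host = $_SERVER['HTTP_HOST'] ?? '';
if ($host === 'staging.example.com') {
    // Keep the entire staging site out of search engines.
    echo "User-agent: *\n";
    echo "Disallow: /\n";
} else {
    // Production rules: only block internal search results.
    echo "User-agent: *\n";
    echo "Disallow: /search/\n";
}
?>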
Common robots.txt Mistakes
1. Blocking SEO-Critical Pages
🚫 Bad:
Disallow: /blog/
✅ Fix:
Only block non-essential pages (e.g., /temp/, /test/).
2. Typos & Incorrect Syntax
🚫 Bad:
Useragent: *
Disallow: /private
✅ Fix:
User-agent: *
Disallow: /private/
3. Using Wildcards Incorrectly
🚫 Bad:
Disallow: *
✅ Correct:
Disallow: /
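If you genuinely need pattern matching, Google and Bing do support the * and $ wildcards inside rule paths. A small sketch with hypothetical patterns:
User-agent: *
# Block URLs containing a session parameter
Disallow: /*?sessionid=
# Block all PDFs ($ anchors the rule to the end of the URL)
Disallow: /*.pdf$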
A well-optimized robots.txt file helps guide search engines to your most important content while blocking irrelevant pages.
Key Takeaways:
– Keep robots.txt in your site’s root directory and double-check the syntax.
– Block only non-essential sections; never block CSS, JavaScript, or SEO-critical pages.
– Don’t rely on robots.txt to hide sensitive data.
– Test the file in Google Search Console or with a crawler such as Screaming Frog.
Need help? Run a crawl audit with Screaming Frog to check for errors!
If your website isn’t ranking as it should or your content isn’t converting, it’s time for a comprehensive on-page and content optimization strategy. Let’s enhance your website’s performance and visibility.