WordPress Robots.txt Security Guide: Best Practices & Mistakes


The robots.txt file is primarily known as an SEO tool—it tells search engine crawlers like Googlebot which parts of your site they should or shouldn’t access.

However, from a security perspective, robots.txt is a double-edged sword. It can reduce server load from well-behaved crawlers, but it is frequently misused to “hide” sensitive areas of a website.

Here is the golden rule of robots.txt security: Since robots.txt is a public file that anyone can read, listing a sensitive directory there doesn’t protect it—it advertises it to attackers.

In this guide, we’ll cover how to configure your WordPress robots.txt for maximum security without hurting your SEO, and how to avoid common information disclosure pitfalls.


What is robots.txt and Where Does it Live?

The robots.txt file sits in the root directory of your website.

https://yoursite.com/robots.txt

It uses a simple syntax to instruct “User-agents” (bots) on where they are allowed to go.

Basic Syntax:

Plaintext

User-agent: *
Disallow: /private-folder/

This tells all bots (*) not to crawl the /private-folder/.
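You can see how a rule like this is interpreted using Python’s built-in robots.txt parser, `urllib.robotparser` (the URL and paths below are just placeholders):

```python
from urllib import robotparser

# Parse the two-line example rule set with Python's built-in parser
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private-folder/",
])

# The disallowed folder is off-limits to every compliant crawler...
print(rp.can_fetch("Googlebot", "https://yoursite.com/private-folder/data.csv"))
# ...but the rest of the site remains crawlable
print(rp.can_fetch("Googlebot", "https://yoursite.com/blog/"))
```

The first check prints `False`, the second `True`. Remember: this only describes what *polite* bots will do; nothing enforces it.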


The “Security Through Obscurity” Trap

The biggest security mistake website owners make is adding private paths to robots.txt thinking it keeps them safe.

The Risk:

Attackers and vulnerability scanners routinely check robots.txt first. If you have a line like Disallow: /backup-2024/ or Disallow: /admin-staging/, you have just handed the attacker a map to your most sensitive files.

FunSentry’s Perspective:

When you run a security scan with FunSentry, our scanner analyzes your robots.txt file, looking specifically for information disclosure. We check whether you are accidentally revealing paths to:

  • Backup directories
  • Config files
  • Staging environments
  • Old versions of the site
  • Private admin portals
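FunSentry’s actual detection logic isn’t public, but a toy version of this kind of check is easy to sketch: pull every Disallow path and flag the ones containing suspicious keywords (the keyword list below is purely illustrative):

```python
import re

# Keywords that often mark sensitive paths (illustrative list only,
# not FunSentry's actual detection rules)
SENSITIVE = ("backup", "staging", "config", "old", "admin", "private")

def flag_disclosures(robots_txt: str) -> list[str]:
    """Return Disallow paths that look like sensitive-path disclosure."""
    flagged = []
    for line in robots_txt.splitlines():
        match = re.match(r"\s*disallow:\s*(\S+)", line, re.IGNORECASE)
        if match and any(word in match.group(1).lower() for word in SENSITIVE):
            flagged.append(match.group(1))
    return flagged

sample = """User-agent: *
Disallow: /wp-content/uploads/
Disallow: /backup-2024/
Disallow: /admin-staging/
"""
print(flag_disclosures(sample))  # ['/backup-2024/', '/admin-staging/']
```

Attackers run exactly this kind of pattern matching against robots.txt, which is why listing sensitive paths there is so dangerous.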

Rule #1: Never put a path in robots.txt that you wouldn’t want a hacker to know exists. If a folder is private, password-protect it or restrict access via .htaccess—do not just ask bots nicely to ignore it.
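For example, on an Apache server you can shut off all web access to a private folder by dropping an .htaccess file inside it (the folder name is illustrative; this is one common approach, not the only one):

```apache
# .htaccess placed inside the private folder (e.g. /backups/)
# Apache 2.4+: refuse every web request to this directory
Require all denied
```

If you need to reach the folder yourself, use HTTP Basic Auth (`AuthType Basic` with an `.htpasswd` file) instead of `Require all denied`. Either way, the protection happens at the server, where it cannot be ignored.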


Part 1: Essential WordPress Robots.txt Settings

For a standard WordPress site, your goal is to block access to backend files while allowing access to frontend assets (CSS/JS/Images) so Google can render your site correctly.

1. The Standard WordPress Setup

At a minimum, you should block the core WordPress directories that contain no public content.

Plaintext

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Why the Allow line?

WordPress uses admin-ajax.php for many frontend dynamic features (like “Load More” buttons or contact forms). If you block all of /wp-admin/, you might break these features for crawlers, hurting your SEO.
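You can sanity-check this Allow/Disallow pair locally with `urllib.robotparser`. One caveat: Python’s parser applies rules in file order (first match wins), so the Allow line is placed first in this sketch; Googlebot instead uses longest-path matching, so for Google the order doesn’t matter:

```python
from urllib import robotparser

# Allow is listed first because Python's parser uses first-match-wins;
# Google's longest-match rule would accept either order.
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Allow: /wp-admin/admin-ajax.php",
    "Disallow: /wp-admin/",
])

print(rp.can_fetch("*", "https://yoursite.com/wp-admin/"))                # False: blocked
print(rp.can_fetch("*", "https://yoursite.com/wp-admin/admin-ajax.php"))  # True: allowed
```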

2. Blocking Core Files

You can add rules to prevent crawlers from wasting resources on core files that shouldn’t be indexed.

Plaintext

Disallow: /wp-includes/
Disallow: /xmlrpc.php
Disallow: /readme.html
Disallow: /license.txt

Note: Be careful blocking /wp-includes/. If your theme loads jQuery or other scripts from there, blocking it might prevent Google from rendering your site correctly. Always test using the Google Search Console “URL Inspection” tool.


Part 2: Preventing Bot Abuse

While robots.txt cannot stop a malicious bot (hackers ignore these rules), it can stop “polite” scrapers and aggressive SEO tools that waste your server resources.

3. Blocking Aggressive Bots

If you notice specific bots spiking your CPU usage, you can block them by name.

Plaintext

User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: MJ12bot
Disallow: /

Note: This only stops them from crawling; it doesn’t block them from the server. For true blocking, use a firewall (WAF) or .htaccess.
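As an example of true blocking, an Apache .htaccess rule set can reject these crawlers with a 403 whether or not they honor robots.txt (the user-agent list is the same as above; adjust it to the bots you actually see in your logs):

```apache
# Refuse aggressive crawlers at the server level (Apache with mod_rewrite)
<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteCond %{HTTP_USER_AGENT} (AhrefsBot|SemrushBot|MJ12bot) [NC]
  RewriteRule .* - [F,L]
</IfModule>
```

Unlike a robots.txt rule, this match happens on every request, so even a bot that never reads robots.txt gets a 403 Forbidden.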

4. Protecting Internal Search Results

Attackers sometimes use your internal site search to generate thousands of spam URLs (search result spam). You can tell Google not to crawl these pages. (Disallow prevents crawling; if spam URLs are already indexed, you’ll also need a noindex directive.)

Plaintext

Disallow: /?s=
Disallow: /search/

Part 3: What NOT to Do (Common Mistakes)

1. Blocking CSS and JS

In the past, it was common to block /wp-content/. Do not do this. Google needs to access your CSS and JavaScript files to understand if your site is mobile-friendly.

Bad Configuration:

Plaintext

# ❌ DANGEROUS FOR SEO
Disallow: /wp-content/
Disallow: /wp-includes/

2. Leaking Plugin Vulnerabilities

Avoid listing specific plugin paths unless necessary.

Bad Configuration:

Plaintext

# ❌ TELLS HACKERS WHAT PLUGINS YOU USE
Disallow: /wp-content/plugins/vulnerable-slider-plugin/

If an attacker sees this, they know exactly which exploit to try against your site.


The “Perfect” Secure WordPress Robots.txt

Here is a balanced configuration that protects backend paths, prevents information leakage, and maintains SEO visibility.

Plaintext

User-agent: *
# Block backend admin access
Disallow: /wp-admin/
# Allow the AJAX handler for frontend functionality
Allow: /wp-admin/admin-ajax.php

# Block specific file types often targeted by bots
Disallow: /xmlrpc.php
Disallow: /readme.html
Disallow: /license.txt

# Prevent indexing of internal search results
Disallow: /?s=
Disallow: /search/

# Prevent indexing of trackbacks
Disallow: /trackback/

# ─── SECURITY NOTICE ───
# Do NOT list private backup folders here (e.g., /backups/).
# Protect them via server config, not robots.txt.

# Location of Sitemap (Helps good bots)
Sitemap: https://yoursite.com/sitemap_index.xml

How to Verify Your Robots.txt

After updating your file, you must verify it to ensure you haven’t accidentally blocked crawling of your whole site or leaked sensitive paths.

1. Use FunSentry for a Security Audit

FunSentry performs a specialized Robots.txt Analysis as part of its security scan. It checks for:

  • Sensitive Path Disclosure: Does the file reveal hidden directories?
  • Installation Files: Are readme.html or license.txt accessible?
  • Logic Errors: Are you accidentally blocking legitimate traffic?

2. Google Search Console

Use the robots.txt report inside Google Search Console (the successor to the old Robots.txt Tester) to see exactly when Google last fetched your file and how it interprets your rules.

3. Manual Inspection

Simply visit yoursite.com/robots.txt in Incognito mode. If you see paths like /secret/ or /backup/, remove them immediately and protect those folders on the server level instead.
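Beyond eyeballing the file, you can sanity-check key rules offline with Python’s built-in parser. The rules below are a subset of the sample configuration from earlier; swap in your own:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
# For a live check against your own site:
#   rp.set_url("https://yoursite.com/robots.txt"); rp.read()
rp.parse("""\
User-agent: *
Disallow: /wp-admin/
Disallow: /xmlrpc.php
Disallow: /?s=
""".splitlines())

print(rp.can_fetch("Googlebot", "https://yoursite.com/"))            # True: site crawlable
print(rp.can_fetch("Googlebot", "https://yoursite.com/xmlrpc.php"))  # False: blocked
print(rp.can_fetch("Googlebot", "https://yoursite.com/?s=spam"))     # False: blocked
```

If the first check ever comes back False, you have blocked your entire site; fix that before anything else.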


Summary

Do This ✅ | Don’t Do This ❌
Block /wp-admin/ | Block /wp-content/ (blocks CSS/Images)
Allow /wp-admin/admin-ajax.php | List private folders like /backup_v2/
Block /xmlrpc.php | Use robots.txt to hide sensitive data
Include your Sitemap URL | Assume Disallow stops hackers

Your robots.txt is a public billboard. Use it to guide helpful search engines, not to hide secrets from bad actors.

Is your robots.txt revealing too much?

Don’t guess. Run a free scan at FunSentry today to check for Information Disclosure risks and ensure your WordPress site is properly hardened against 2025’s most common threats.