The Silent Problem
Unlike search engines, AI crawlers don't send you a notification when they can't access your site. There's no AI equivalent of Google Search Console showing crawl errors. Your site could be completely invisible to ChatGPT, Claude, and Perplexity — and you'd never know unless you explicitly test for it.
w2agent checks your site against all known AI crawler User-Agents and reports exactly what's blocked, partially accessible, or fully open. Here are the most common causes it finds.
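In spirit, this kind of check is simple: request the same URL once per crawler User-Agent and compare responses. Below is a minimal Python sketch of the idea (not w2agent's actual code; the User-Agent strings are shortened placeholders, since real crawler UAs are longer, vendor-documented strings, and the status-code mapping is a simplifying assumption):

```python
import urllib.request
import urllib.error

# Shortened placeholder User-Agent values; the real strings are
# published by each vendor and include version and info-URL parts.
AI_AGENTS = {
    "GPTBot": "GPTBot/1.0",
    "ClaudeBot": "ClaudeBot/1.0",
    "PerplexityBot": "PerplexityBot/1.0",
}

def classify(status: int) -> str:
    """Map an HTTP status code to a coarse accessibility verdict."""
    if status in (401, 403, 406, 429):
        return "blocked"
    if 200 <= status < 300:
        return "open"
    return "check manually"

def probe(url: str) -> dict:
    """Fetch `url` once per AI User-Agent and report each verdict."""
    results = {}
    for name, ua in AI_AGENTS.items():
        req = urllib.request.Request(url, headers={"User-Agent": ua})
        try:
            with urllib.request.urlopen(req, timeout=10) as resp:
                results[name] = classify(resp.status)
        except urllib.error.HTTPError as e:
            # A block usually surfaces as an HTTP error (403, 406, 429).
            results[name] = classify(e.code)
        except urllib.error.URLError:
            results[name] = "unreachable"
    return results
```

Calling `probe("https://example.com/")` returns one verdict per crawler, e.g. `{"GPTBot": "blocked", "ClaudeBot": "open", ...}`. A real audit would also compare against a browser User-Agent to separate bot-specific blocks from site-wide outages.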
Cause 1: Security Plugins
WordPress security plugins are the #1 cause of blocked AI crawlers. Wordfence, Sucuri, iThemes Security, and All In One WP Security all have features that block "suspicious" User-Agents — and AI bot names often trigger these rules.
Common Wordfence block rule that catches AI bots:
# Wordfence advanced blocking
# This pattern blocks any User-Agent containing "bot"
# — including GPTBot, ClaudeBot, PerplexityBot
Block User-Agents matching: /bot/i
Fix: Add specific exceptions for AI User-Agents in your security plugin's allowlist. In Wordfence, go to Firewall → Blocking → Advanced → and add exceptions for GPTBot, ClaudeBot, and PerplexityBot.
Cause 2: CDN/WAF Rules
Cloudflare, AWS WAF, and other CDN/WAF services have bot management features that can block AI crawlers. Cloudflare's "Bot Fight Mode" and "Super Bot Fight Mode" treat AI crawlers as automated traffic — which they are — and may challenge or block them.
Cloudflare: Security → Bots. Bot Fight Mode offers no per-bot exceptions, so either disable it or enable Super Bot Fight Mode's "Allow verified bots" option; most major AI crawlers are on Cloudflare's verified bots list.
AWS WAF: Check your rate-limiting rules. AI crawlers may exceed per-IP request limits during content ingestion.
Akamai: Review Bot Manager policies. AI crawlers are typically classified as "unknown bots."
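On Cloudflare, a custom WAF rule with the action set to Skip can also exempt these crawlers explicitly. A hedged sketch of the rule expression, written in Cloudflare's rules language (adjust the bot names to match the crawlers you want to allow):

```
(http.user_agent contains "GPTBot")
or (http.user_agent contains "ClaudeBot")
or (http.user_agent contains "PerplexityBot")
```

Note that User-Agent matching alone can be spoofed; for stricter policies, combine it with the published crawler IP ranges.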
Cause 3: robots.txt Misconfiguration
Many sites have overly restrictive robots.txt files that unintentionally block AI bots. Common patterns:
Blanket disallow
User-agent: *
Disallow: /
Blocks everything — including AI. If you need this, add explicit Allow rules for AI bots above it.
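For example, a robots.txt that keeps the blanket disallow but carves out the major AI crawlers (bot names as documented by each vendor):

```
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /
```

Each crawler obeys the most specific User-agent group that matches it, so the named groups take precedence over the `*` group.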
Explicit AI bot blocks
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
Sometimes added by SEO plugins or copied from template robots.txt files without understanding the impact.
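You can verify what your robots.txt actually permits with Python's standard-library parser. A small sketch, here fed an inline rules string for illustration (point `set_url` at your live robots.txt and call `read()` to test a real site):

```python
from urllib.robotparser import RobotFileParser

# Example rules: GPTBot explicitly blocked, everyone else allowed.
rules = """
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("GPTBot", "/"))         # False: explicit block applies
print(rp.can_fetch("PerplexityBot", "/"))  # True: falls through to the * group
```

Running this against your own robots.txt for each AI bot name catches copied-in blocks before a crawler does.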
Cause 4: Client-Side Rendering
Pages that rely entirely on JavaScript to render content (React SPAs, Angular, Vue without SSR) appear as empty shells to AI crawlers. Most AI crawlers do not execute JavaScript — they fetch the HTML and parse what they get.
Fix: Use server-side rendering (SSR), static site generation (SSG), or at minimum ensure critical content is in the initial HTML response. Next.js, Nuxt, and similar frameworks support this natively.
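A quick way to spot this problem is to check whether your key phrases appear in the raw HTML response, since that is all a non-JavaScript crawler sees. A minimal sketch (the SPA shell markup below is an illustrative example):

```python
def content_in_initial_html(html: str, phrases: list[str]) -> dict:
    """Report which key phrases appear in the raw HTML an AI crawler
    receives, i.e. before any JavaScript runs."""
    return {p: p.lower() in html.lower() for p in phrases}

# A typical SPA shell: the real content only exists after client-side
# rendering, so a crawler that doesn't execute JS sees none of it.
spa_shell = (
    '<html><body><div id="root"></div>'
    '<script src="/app.js"></script></body></html>'
)
print(content_in_initial_html(spa_shell, ["pricing", "product features"]))
# -> {'pricing': False, 'product features': False}
```

In practice you would fetch the page with an AI crawler's User-Agent and run this check on the response body.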
Cause 5: Authentication Walls
Login-required content, paywalls, and gated content are invisible to AI crawlers. Unlike Googlebot, which many publishers allow to crawl paywalled content for indexing (Google's old "First Click Free" policy was retired in 2017 in favor of flexible sampling and paywall structured data), AI crawlers have no equivalent program.
Fix: If you want AI to know about your gated content, provide metadata and previews in the public-facing HTML. Schema.org's isAccessibleForFree property can signal this.
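A hedged sketch of that markup as JSON-LD, following the Schema.org paywall pattern (the headline and CSS selector are illustrative values):

```json
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "Example gated article",
  "isAccessibleForFree": false,
  "hasPart": {
    "@type": "WebPageElement",
    "isAccessibleForFree": false,
    "cssSelector": ".paywalled-content"
  }
}
```

The `hasPart` block marks which element is gated, so crawlers can tell the public preview from the paywalled body.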
Cause 6: Rate Limiting
AI crawlers can be aggressive — fetching dozens of pages in quick succession. If your server or CDN rate-limits by IP or User-Agent, the crawler may get blocked partway through.
Fix: Set rate limits that still allow at least ~1 request per second, and consider allowlisting known AI crawler IP ranges. OpenAI and Anthropic publish their crawler IP ranges.
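For sites behind nginx, a sketch of a rate limit along those lines (the zone name, rate, and burst values are illustrative; tune them to your traffic):

```nginx
# Allow a sustained 2 requests/second per client IP, with short
# bursts of up to 10 queued requests before returning 429.
limit_req_zone $binary_remote_addr zone=crawlers:10m rate=2r/s;

server {
    location / {
        limit_req zone=crawlers burst=10 nodelay;
        limit_req_status 429;
    }
}
```

Returning 429 rather than 403 matters: it signals "slow down" instead of "go away," and well-behaved crawlers back off and retry.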
Diagnose Your Site
w2agent tests all of these causes automatically. It sends requests with each AI crawler's User-Agent and reports exactly what's blocked, what's slow, and what's accessible. Run an audit to see your site from an AI crawler's perspective.
Audit your site now
Get a free AI readiness score and generate the files your site needs.
Start Free Audit