Build a robots.txt that AI engines respect.
Most robots.txt files were written before ChatGPT existed. This builder puts fifteen AI crawlers in one panel: toggle, preview, copy, ship. It generates the same file I deploy on every AEO engagement before content work starts.
A sitemap URL helps every crawler discover your full URL set in one fetch. Pointing AI bots at your sitemap is one of the cheapest indexation wins.
Each disallowed path must start and end with /. Common entries: /admin/, /checkout/, /api/, /wp-admin/.
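A minimal sketch of that path rule, assuming directory-style entries only. `normalize_disallow_path` is a hypothetical helper written for illustration, not the builder's own code.

```python
def normalize_disallow_path(raw: str) -> str:
    """Return a directory-style path that starts and ends with '/'.

    Hypothetical helper for illustration; not the builder's own code.
    """
    path = raw.strip()
    if not path:
        raise ValueError("path is empty")
    if any(ch.isspace() for ch in path):
        raise ValueError(f"path contains whitespace: {raw!r}")
    # Add the leading and trailing slash only when they are missing.
    if not path.startswith("/"):
        path = "/" + path
    if not path.endswith("/"):
        path = path + "/"
    return path


paths = [normalize_disallow_path(p) for p in ["admin", "/checkout", "/api/", " wp-admin/ "]]
print(paths)
```

Each returned value can go straight after `Disallow: ` in the generated file.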
Most modern crawlers, Googlebot included, ignore Crawl-delay. It is useful mainly on shared hosting that needs to throttle aggressive bots, and only for crawlers that honor it.
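To see how the directive reads to a parser that does support it, here is a small sketch using Python's standard-library `urllib.robotparser`. `SlowBot` and the 10-second value are placeholders. Parsing the line shows only what the file says; whether a real crawler honors it is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Placeholder file: "SlowBot" is an invented agent name for this example.
ROBOTS_TXT = """\
User-agent: SlowBot
Crawl-delay: 10
Disallow: /admin/

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay() returns the delay from the group that matches the agent,
# or None when that group has no Crawl-delay line.
slow_delay = parser.crawl_delay("SlowBot")
other_delay = parser.crawl_delay("GPTBot")
print(slow_delay, other_delay)
```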
Drop this file at yoursite.com/robots.txt. Pair with the AI Bot Access Checker after deploy to confirm every crawler reads it correctly.
Blocking AI bots is a hidden tax.
Every B2B founder I work with starts in the same spot: a robots.txt that blocks AI crawlers because the developer who wrote it five years ago erred on the side of caution. The cost shows up later, when buyers ask ChatGPT for vendor recommendations and the brand never appears in the answer.
Allowing AI bots costs you next to nothing in compute. Blocking them costs you citations. AI engines route real traffic to the brands they cite. Citations are the new backlinks.
The brands that win the next decade of AI search are not building bigger content libraries. They are removing the silent blockers that keep their existing libraries invisible. A clean robots.txt is the first one.
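Block Googlebot and you are invisible on nearly half of the Google searches a buyer runs in your category. Google-Extended is a separate token: blocking it does not remove you from Search, but it opts your content out of Gemini training and grounding.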
Buyers cross-check ChatGPT, Perplexity, and Google AI Overviews. Missing from one engine is missing from the shortlist.
GPTBot, ClaudeBot, and PerplexityBot are completely separate from Googlebot. Allowing them affects nothing else.
Generate, copy, paste, deploy, verify. Most teams sit on broken AI bot access for months while running content marketing.
Every AI engine, one panel.
No guesswork.
Each crawler below maps to a specific AI surface. Block one and you lose that surface entirely. Block all and your content marketing budget compounds in a closed room.
Pick the preset that matches your stance.
All four generate valid files that follow RFC 9309, the Robots Exclusion Protocol. The difference is intent: how aggressively you want to expose your content to AI engines. Most B2B brands should pick Citation-ready and stop overthinking it.
Citation-ready
Allow every AI engine that drives real citation traffic. Block heavy training-only crawlers that consume bandwidth without sending visitors back.
Open to all AI
Every major and minor AI crawler allowed. Best for content sites that benefit from any indexation. Some bandwidth cost on high-traffic crawlers.
Ecommerce focused
Citation crawlers plus Amazonbot for Alexa shopping recommendations and product discovery. Excludes training-only scrapers.
Block training, allow citation
Allow only crawlers that cite your work back with attribution. Block training-only scrapers that consume content without sending traffic.
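As a rough sketch of how a preset could turn into a file, the function below writes one explicit group per bot, then a wildcard group, then the sitemap line. Everything here is an assumption made for illustration: `build_robots_txt` is a hypothetical name, the bot lists are examples rather than the actual contents of any preset, and `example.com` is a placeholder domain.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(allow_bots, block_bots, disallow_paths=(), sitemap_url=None):
    """Assemble a robots.txt with one explicit group per listed bot.

    Illustrative only: the bot lists passed in are the caller's choice,
    not the real contents of any preset.
    """
    lines = []
    for bot in allow_bots:
        lines.append(f"User-agent: {bot}")
        # Disallow lines go first. Python's urllib.robotparser applies the
        # first matching rule, while RFC 9309 uses the longest match;
        # this order gives the same result under both.
        lines += [f"Disallow: {p}" for p in disallow_paths]
        lines += ["Allow: /", ""]
    for bot in block_bots:
        lines += [f"User-agent: {bot}", "Disallow: /", ""]
    # Assumption: unlisted crawlers stay allowed except on private paths.
    lines.append("User-agent: *")
    lines += [f"Disallow: {p}" for p in disallow_paths] or ["Disallow:"]
    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


text = build_robots_txt(
    allow_bots=["OAI-SearchBot", "ChatGPT-User", "PerplexityBot"],
    block_bots=["GPTBot"],
    disallow_paths=["/admin/", "/checkout/"],
    sitemap_url="https://example.com/sitemap.xml",
)
print(text)

# Read the result back with the standard-library parser as a sanity check.
parser = RobotFileParser()
parser.parse(text.splitlines())
print(parser.can_fetch("OAI-SearchBot", "https://example.com/pricing"))
print(parser.can_fetch("GPTBot", "https://example.com/pricing"))
```

Swapping names between `allow_bots` and `block_bots` is all it takes to express a different stance.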
What I find when teams write robots.txt themselves.
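A `User-agent: *` block followed by `Disallow: /` blocks every crawler that does not have its own group elsewhere in the file. New AI bots launch every quarter, and the wildcard catches all of them by default. Give each AI agent its own group with explicit Allow rules. Compliant crawlers follow the most specific matching group, and listing those groups above the wildcard keeps the file easy to audit.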
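The wildcard trap is easy to reproduce. The sketch below uses Python's standard-library `urllib.robotparser`; `BrandNewBot` is an invented agent name and `example.com` is a placeholder. Other parsers may differ in edge cases, so treat it as a quick check rather than proof of how any given crawler behaves.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str = "https://example.com/") -> bool:
    """Parse the given robots.txt text and ask whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Only Googlebot has its own group; every other crawler hits the wildcard.
LOCKED_DOWN = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

# Same file with PerplexityBot added to the explicit group.
WITH_AI_GROUP = """\
User-agent: Googlebot
User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /
"""

print(allowed(LOCKED_DOWN, "PerplexityBot"))    # blocked by the wildcard
print(allowed(WITH_AI_GROUP, "PerplexityBot"))  # its own group applies
print(allowed(WITH_AI_GROUP, "BrandNewBot"))    # unlisted, still blocked
```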
OpenAI runs three crawlers: GPTBot for training, OAI-SearchBot for ChatGPT live web search, and ChatGPT-User for pages fetched on a user's behalf during a chat. Allowing only GPTBot keeps you out of live ChatGPT search results. The most-cited surface needs OAI-SearchBot allowed too.
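Here is a small sketch of that gap, assuming the file also carries a wildcard `Disallow: /`, which is the situation where a GPTBot-only rule leaves the other two agents locked out. It uses Python's standard-library `urllib.robotparser`; `openai_access` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

OPENAI_AGENTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User"]

# Assumed starting point: GPTBot allowed, everything else disallowed.
GPTBOT_ONLY = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /
"""

# All three OpenAI agents share one explicit group.
ALL_THREE = """\
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
Allow: /

User-agent: *
Disallow: /
"""


def openai_access(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Map each OpenAI agent name to whether the file lets it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in OPENAI_AGENTS}


print(openai_access(GPTBOT_ONLY))
print(openai_access(ALL_THREE))
```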
Microsoft Copilot runs entirely on the Bing index. A site missing from Bing is invisible in every Copilot answer across Microsoft 365, Edge, and Windows. After fixing robots.txt, verify indexation separately in Bing Webmaster Tools. Bing indexation is independent of Google.
A Sitemap directive at the bottom of robots.txt tells every crawler exactly which URLs to fetch. AI bots use it the same way Googlebot does. Most teams ship robots.txt without the sitemap line, then wonder why discovery is patchy. One line. Two minutes.
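A quick way to confirm the line is present and readable, sketched with Python's standard-library `urllib.robotparser` (its `site_maps()` method needs Python 3.8 or later). `sitemap_urls` is a hypothetical helper name and `example.com` is a placeholder.

```python
from urllib.robotparser import RobotFileParser


def sitemap_urls(robots_txt: str) -> list:
    """Return every Sitemap URL declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when the file has no Sitemap line.
    return parser.site_maps() or []


WITH_SITEMAP = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

WITHOUT_SITEMAP = """\
User-agent: *
Disallow: /admin/
"""

print(sitemap_urls(WITH_SITEMAP))
print(sitemap_urls(WITHOUT_SITEMAP))
```

An empty list is the signal that the file shipped without its sitemap line.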
After deploy, run the AI Bot Access Checker on the live URL to verify each crawler can actually read what you intended. If you want a full surface audit including entity schema and answer-first content structure, take the AI Visibility Score quiz.
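For a rough do-it-yourself spot check, the sketch below reads a robots.txt and reports what it says for a handful of agents. It is not the AI Bot Access Checker, only an illustration built on Python's standard-library `urllib.robotparser`. The function names are hypothetical and the agent list is an example. It tests what the file says, not whether a firewall or CDN rule blocks a bot's requests before they reach it.

```python
from urllib.robotparser import RobotFileParser

# Example agent list; extend it with whichever crawlers you care about.
AI_AGENTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot"]


def access_report(parser, base_url, agents=AI_AGENTS, path="/"):
    """Map each agent to whether the parsed rules let it fetch `path`."""
    url = base_url.rstrip("/") + path
    return {agent: parser.can_fetch(agent, url) for agent in agents}


def check_live_site(base_url, agents=AI_AGENTS):
    """Fetch <base_url>/robots.txt over HTTP and report access per agent."""
    parser = RobotFileParser(base_url.rstrip("/") + "/robots.txt")
    parser.read()  # performs one HTTP request
    return access_report(parser, base_url, agents)


# Offline demo on sample text, so the example runs without a network call.
sample = RobotFileParser()
sample.parse("""\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /private/
""".splitlines())

demo = access_report(sample, "https://example.com")
print(demo)
```

After deploy, `check_live_site("https://yoursite.com")` runs the same report against the live file.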
What people ask before they hit deploy.
Privacy, SEO impact, deployment paths, and how this fits the wider AEO system.
01 Will allowing AI bots hurt my Google SEO?
No. GPTBot, ClaudeBot, PerplexityBot, and other AI user-agents are entirely separate crawlers from Googlebot. Allowing them has zero effect on how Google ranks, crawls, or indexes your site for traditional search. The only effect is making your content eligible for citation in AI-generated answers.
02 Should I block AI bots to protect my content?
For most B2B brands the answer is no. Blocking is a content protection move that costs you AI search visibility. The brands cited in ChatGPT, Perplexity, and Google AI Overviews are the ones that allowed the crawlers. The brands not cited are the ones still treating AI bots like adversaries. Consider blocking only if your business model relies on gated content or original research you do not want training models to absorb.
03 Where do I put the generated robots.txt file?
In the root of your domain at yoursite.com/robots.txt. The exact upload path depends on the platform. Webflow has a robots.txt field in SEO settings. Shopify uses robots.txt.liquid. WordPress writes it via Yoast or Rank Math. Astro and Next.js drop it in /public/robots.txt. After upload, verify by visiting yoursite.com/robots.txt in a browser.
04 Is an explicit Allow per user-agent better than a wildcard?
Yes. Explicit User-agent blocks beat wildcards every time. Conservative platforms apply Disallow to unlisted crawlers by default, so a User-agent: GPTBot block with Allow: / removes any ambiguity. The robots.txt this tool generates lists each AI bot explicitly so no platform default can override your intent.
05 How long does the change take to be picked up?
Most crawlers refetch robots.txt every 24 hours. After uploading the new file, expect access status to update across AI engines within one to two days. If you use a CDN like Cloudflare or Fastly, purge the cache for /robots.txt or wait for natural expiry. Verify the change live with the AI Bot Access Checker on this site.
06 Does this tool store my settings or my domain?
No. Everything runs in the browser. Your bot selections, sitemap URL, and disallowed paths never leave your device. The tool generates the file client-side and copies or downloads it directly. No login, no analytics on the tool inputs, no logs of your domain.
A clean robots.txt is the start.
The full system goes deeper.
The generator covers access. A full engagement covers entity schema, answer-first content structure, and citation tracking across all six AI engines. Every inquiry gets a reply within 4 hours.