Website management has shifted. In previous years, we focused on “Crawl Budget” to ensure Googlebot indexed our pages correctly. Today, the challenge is different: Server Survival. AI bots from OpenAI, Anthropic, and dozens of others are hitting servers with unprecedented aggression.
If you’ve checked your server logs recently and seen 700+ hits from GPTBot but only 40 from Google, you aren’t alone. These bots are “scraping” your content to train Large Language Models (LLMs), often consuming massive amounts of bandwidth without sending a single visitor to your site. At HostWP.io, we’ve seen these spikes cripple sites that aren’t prepared.
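If you want to measure this on your own site, a short script can tally hits per bot. The sketch below assumes your access log uses the common "combined" format, where the user agent is the last quoted field on each line. The `BOT_TOKENS` list and the `count_bot_hits` helper are illustrative names, not part of any HostWP.io tool.

```python
import re
from collections import Counter

# Assumed list of user-agent substrings; extend it to match what your logs show.
BOT_TOKENS = ["GPTBot", "ClaudeBot", "CCBot", "PerplexityBot", "Googlebot"]

# In the combined log format, the user agent is the last double-quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_bot_hits(lines):
    """Return a Counter of hits per bot token for an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # Skip lines that do not end with a quoted field.
        user_agent = match.group(1).lower()
        for token in BOT_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break  # Count each request once, under the first matching token.
    return counts
```

To run it against a real log, pass an open file object, for example `count_bot_hits(open("access.log"))`, where `access.log` is a placeholder for your own log path.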
In this guide, we’ll dive deep into how you can block AI crawler bots on WordPress to protect your fast WordPress hosting environment, while weighing the financial and performance pros and cons.
The Hidden Cost: Why AI Bots Are Emptying Your Wallet
The biggest issue with the AI crawler surge isn’t just a slow website; it’s the financial “Bot Tax.” Many well-known hosting companies charge based on “Monthly Visits” or “Data Transfer.” Their systems often fail to distinguish between a potential customer and an AI bot. If GPTBot hits your site 20,000 times in a month, many hosts will count that as 20,000 visits and send you an overage bill or force you into a higher-tier plan.
While many well-known hosting companies charge high premiums and penalize you for bot-driven traffic spikes, HostWP.io is built on transparency. However, even on the most robust infrastructure, these bots use CPU cycles and RAM. By blocking them, you aren’t just saving on overage fees; you are freeing up server resources to serve real human visitors. This is just as critical as other performance tweaks, such as knowing how to minify JavaScript files to speed up WordPress.
The Strategic Choice: To Block or Not to Block?
Before we get to the code, you need to decide which bots are “friends” and which are “foes.”
1. The Training Bots (e.g., GPTBot, CCBot)
These bots gather data to make AI models smarter.
- Pros of Blocking: Saves massive bandwidth; protects your intellectual property; avoids overage charges.
- Cons of Blocking: Your brand may be “unknown” to future AI models, so it may not be visible to people who use those models for research.
| Feature | Blocking Bots | Throttling (The HostWP Way) |
| Bandwidth | Saves 90%+ | Saves 60-70% |
| AI Visibility | Zero. You don’t exist to AI. | Maintained. You stay in the index. |
| Server Load | Lowest | Stable & Controlled |
2. The Search/Answer Bots (e.g., OAI-SearchBot, PerplexityBot)
These power tools like “SearchGPT” and cite sources with links.
- Pros of Blocking: Maximizes the security of your WordPress hosting by safeguarding your content from AI crawlers.
- Cons of Blocking: You lose out on AI referral traffic. If you decide to block everyone, make sure you don’t accidentally block the “good” crawlers. We have a dedicated guide on how to whitelist Googlebot on WordPress to ensure your actual SEO remains unharmed.
SEO Pro Tip: The “Throttling” Compromise
If you don’t want to disappear from AI Search entirely, don’t use a “Disallow” in robots.txt. Instead, use a Crawl-delay directive (a non-standard robots.txt rule that not every bot honors) or LiteSpeed’s rate limiting. This tells the AI, “You can learn from me, but you have to walk, not run.”
Method #1: The robots.txt “Polite” Request
The first line of defence is your robots.txt file. Most reputable AI companies respect this file. Add these lines to block the most resource-heavy training bots:
# Block OpenAI Training Bot
User-agent: GPTBot
Disallow: /
# Block Common AI Scrapers
User-agent: CCBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
# Block Google’s AI Training (Separate from Search)
User-agent: Google-Extended
Disallow: /
Keep in mind that robots.txt is only a request. Some AI bots ignore it, so this method will not stop every crawler.
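Before relying on these rules, it can help to confirm that they parse the way you expect. The following is a minimal sketch using Python's standard `urllib.robotparser` module. The `blocked_agents` helper and the `example.com` URL are hypothetical, and the check only shows what a compliant crawler should do, not what a given bot actually does.

```python
from urllib.robotparser import RobotFileParser

# The same rules as above, held in a string for a local check.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def blocked_agents(robots_text, agents, url="https://example.com/sample-post/"):
    """Map each user agent to True if the rules disallow it from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {agent: not parser.can_fetch(agent, url) for agent in agents}


result = blocked_agents(
    ROBOTS_TXT,
    ["GPTBot", "CCBot", "ClaudeBot", "Google-Extended", "Googlebot"],
)
for agent, is_blocked in result.items():
    print(f"{agent}: {'blocked' if is_blocked else 'allowed'}")
```

With these rules, the four AI user agents should report as blocked while Googlebot stays allowed, because no rule group names it.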
Method #2: Server-Level Enforcement (.htaccess)
If a bot ignores your robots.txt and continues to hammer your server, then you will need “The Hammer.” This stops the bot before it ever touches your WordPress core, which is a key part of our WordPress security best practices.
HostWP.io utilizes LiteSpeed Enterprise servers, and you have access to write rules in the .htaccess file of your WordPress website. You can use Rewrite Rules to specifically target AI User-Agents.
We will use .htaccess rules to trigger LiteSpeed server features, such as the CAPTCHA challenge, or to return a 429 status, which tells the bot that the content exists but that it needs to try again later.
Add the following code to the top of your .htaccess file:
<IfModule LiteSpeed>
RewriteEngine On
# Target known AI crawlers ([NC] makes the match case-insensitive)
RewriteCond %{HTTP_USER_AGENT} (GPTBot|ChatGPT-User|ClaudeBot|CCBot|ImagesiftBot|PerplexityBot|Omgilibot|FacebookBot) [NC]
# Option A: Force a 429 "Too Many Requests" status
# The "-" means "no substitution"; it must be a plain hyphen, not a dash character
RewriteRule .* - [R=429,L]
# Option B: Force them to solve a CAPTCHA (if reCAPTCHA is enabled in LiteSpeed)
# To use it, comment out Option A and uncomment the line below
# RewriteRule .* - [E=verifycaptcha]
</IfModule>
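Before deploying a user-agent rule like this, you may want to check which user agents the pattern would catch. The sketch below mirrors the `RewriteCond` pattern in Python, using `re.IGNORECASE` as a stand-in for the `[NC]` flag. The `would_be_blocked` helper and the sample user-agent strings are illustrative assumptions, and Python's regex engine is only an approximation of the server's matching.

```python
import re

# Same alternation as the RewriteCond above; IGNORECASE approximates [NC].
AI_BOT_PATTERN = re.compile(
    r"(GPTBot|ChatGPT-User|ClaudeBot|CCBot|ImagesiftBot|PerplexityBot|Omgilibot|FacebookBot)",
    re.IGNORECASE,
)


def would_be_blocked(user_agent):
    """Return True if the user-agent string matches the AI crawler pattern."""
    return AI_BOT_PATTERN.search(user_agent) is not None


# Sample user agents (assumed examples; check your own logs for real values).
samples = [
    "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/130.0",
]
for ua in samples:
    print(would_be_blocked(ua), ua)
```

A substring match like this is deliberately broad, so it is worth running your own top user agents through it to make sure no legitimate visitor or search crawler is caught by accident.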
Why we use the 429 Status
As an SEO-focused host, we recommend the 429 (Too Many Requests) status over the traditional 403 (Forbidden).
A 403 error can sometimes be misinterpreted by search engine algorithms as a site-wide technical failure. A 429 error, however, is the industry-standard way of saying, “I’m here, but you’re asking for too much right now.” It protects your bandwidth while maintaining your professional “standing” with search crawlers.
Method #3: Automation and Stability
Sometimes, the high load from these bots can interfere with your site’s routine maintenance. For example, if your server is struggling under bot traffic during a background update, it could lead to a crashed site.
If you are worried about server stability during updates, you might want to learn how to disable or manage automatic updates in WordPress so you can run them when traffic is low. And if a bot-induced spike ever causes an update to fail, don’t panic: you can always follow our guide on how to downgrade WordPress or rollback plugins to restore functionality.
Method #4: Cloudflare “AI Crawl Control”
For our clients who use Cloudflare in front of their HostWP.io server, this is the “Easy Button.” Cloudflare now offers a dedicated AI Crawl Control feature under Security > Bots. Turning this on prevents these scrapers from even reaching our data centers, keeping your site perfectly clean.
Detailed Pros and Cons of Blocking AI Bots
| Feature | Blocking Bots | Allowing Bots |
| Bandwidth | Save 90% of non-human data transfer. | Risk of heavy overage fees from other hosts. |
| Server Speed | Stable. PHP workers stay free for customers. | Variable. Spikes can cause “503 Service Unavailable.” |
| Security | High. Prevents content scraping and IP theft. | Low. Your data is harvested for third-party use. |
| SEO/AEO | No impact on Google; loss of AI search. | Potentially high referral traffic from AI Search. |
Conclusion: Take Back Your Server
While the AI revolution is exciting, it shouldn’t come at the cost of your website’s performance or your business’s bottom line. By proactively choosing to block AI crawler bots on WordPress, you ensure that your server resources are spent on what matters: Your Customers.
If you’re tired of high-priced hosting companies that charge you for bot traffic, it’s time to switch to a provider that actually understands the modern web.
Don’t let scrapers eat your budget. Not sure which bots to block? Reach out to the HostWP.io support team for a Custom Bot Audit. We’ll analyze your logs and help you build a surgical .htaccess file that protects your speed without killing your future traffic.




