How AI Crawlers See Your Website (And What They Miss)

AI crawlers like GPTBot and ClaudeBot skip JavaScript rendering entirely. Learn what 569 million requests reveal about how AI search engines read your site differently from Google.

By Dynamic SEO Team | Published April 4, 2026 | 16 min read
[Image: Split view comparing what Googlebot sees with full rendering versus what AI crawlers see without JavaScript]

Your website might be perfectly optimized for Google and completely invisible to ChatGPT.

That statement sounds dramatic, but the data supports it. A 2024 analysis by Vercel and MERJ examined 569 million GPTBot requests and found a consistent pattern: GPTBot does not execute JavaScript. It fetches HTML, it fetches JavaScript files, but it never runs them. The pages it indexes are the raw HTML that your server returns, not the fully rendered DOM that users see in their browsers.

This distinction has always mattered for traditional search engines, but Googlebot solved it years ago by adding a rendering service that executes JavaScript. For AI crawlers — the bots that feed content into ChatGPT, Claude, Perplexity, and other AI-powered search tools — the JavaScript rendering gap is wide open. And as AI search grows from a curiosity to a meaningful traffic source, that gap is becoming a visibility problem that most websites are not even aware of.

The New Crawl Landscape

The crawl ecosystem is no longer just Googlebot and Bingbot. A growing roster of AI-specific crawlers now represents a significant share of automated traffic.

According to the Vercel/MERJ analysis, AI crawlers collectively generate approximately 28% of Googlebot's volume on Vercel's network, translating to roughly 1.3 billion AI crawler requests per month across the sites they measured. That is not a rounding error. That is a meaningful percentage of total crawl activity, and it is growing fast.

PerplexityBot has shown the most dramatic growth trajectory, with a reported 157,000% increase in request volume — albeit from a near-zero base. Meta-ExternalAgent, the crawler feeding Meta's AI systems, has also scaled rapidly. These bots are not experiments. They are the indexing infrastructure for products that millions of people use every day to find information.

The shift matters because these AI systems are increasingly the first place people go for answers. When someone asks ChatGPT or Perplexity a question about your product category, the answer is assembled from whatever those systems were able to crawl and index from the web. If they could not read your content because it was rendered by JavaScript, that content is absent from their training data and retrieval indexes. Competitors who serve static HTML are represented. You are not.

What AI Crawlers Actually Do

To understand the problem, you need to understand how crawling works at a mechanical level, and how AI crawlers differ from traditional search engine bots.

When Googlebot visits a page, it follows a two-phase process. First, it fetches the raw HTML. Second, it sends that HTML to a rendering service — essentially a headless Chrome browser — that executes all JavaScript, waits for the DOM to stabilize, and then indexes the fully rendered page. This means that a React application that renders its title tag and meta description via client-side JavaScript will still have those elements indexed by Google, because Googlebot sees the page the same way a user does.

AI crawlers skip the second phase entirely.

GPTBot fetches the HTML and reads whatever is in it. According to the Vercel/MERJ data, 11.5% of GPTBot's requests are for JavaScript files, which suggests it downloads them as part of its resource fetching behavior. But downloading JavaScript is not the same as executing it. The JavaScript files are fetched and then effectively ignored. The page content that GPTBot indexes is whatever was present in the initial HTML response.

ClaudeBot follows a similar pattern. It fetches JavaScript files at a higher rate — 23.84% of its requests — but there is no evidence that it executes them. The same applies to PerplexityBot and Meta-ExternalAgent.

The Vercel/MERJ analysis tested this systematically and found that approximately 69% of AI crawlers they examined lack JavaScript rendering capability entirely. They operate like search engines from 2010: they see the source HTML and nothing else.

The JavaScript Rendering Gap in Practice

What does this mean for a real website? Consider a modern e-commerce site built on a JavaScript framework.

The server returns an HTML shell — a minimal document with a root element and a bundle of JavaScript files. The JavaScript executes in the browser, fetches product data from an API, and renders the full page including the product title, description, price, images, reviews, and structured data. The user sees a complete product page. Googlebot sees a complete product page. GPTBot sees an empty shell.
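To make the empty shell concrete, here is a minimal Python sketch that extracts the body text a non-rendering fetcher could read from raw HTML. The two HTML snippets, the `/bundle.js` path, and the product details are invented for illustration. How each AI crawler actually parses pages is not public, so treat this as an approximation of "reads the HTML, never runs the scripts".

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text present in raw HTML, ignoring scripts, styles, and the title."""

    SKIPPED = {"script", "style", "title"}  # title is metadata, not body text

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text a fetcher sees when it never executes JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical client-rendered shell: content arrives only after bundle.js runs.
SHELL = """<!doctype html>
<html><head><title>Store</title></head>
<body><div id="root"></div><script src="/bundle.js"></script></body></html>"""

# Hypothetical server-rendered version of the same page.
SERVER_RENDERED = """<!doctype html>
<html><head><title>Store</title></head>
<body><div id="root"><h1>Trail Shoe</h1><p>Lightweight, $89</p></div>
<script src="/bundle.js"></script></body></html>"""

print(repr(visible_text(SHELL)))      # ''
print(visible_text(SERVER_RENDERED))  # Trail Shoe Lightweight, $89
```

The shell yields an empty string, which is all a non-rendering crawler has to work with for that URL.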

This is not a theoretical edge case. A significant percentage of the web now renders critical content via JavaScript. Single-page applications built with React, Vue, or Angular often fall into this category. Even traditional server-rendered sites frequently use JavaScript to inject or modify metadata: a tag management system that sets Open Graph tags after page load, a client-side A/B testing tool that modifies the title tag, or a dynamic structured data component that assembles JSON-LD from API responses.

Any content or metadata that depends on JavaScript execution is invisible to AI crawlers. This includes:

  • Title tags set or modified by client-side JavaScript
  • Meta descriptions injected by tag managers or SPA routers
  • Open Graph and Twitter Card tags added dynamically
  • JSON-LD structured data assembled client-side from API responses
  • Page body content rendered by JavaScript frameworks
  • Navigation and internal links generated by client-side routing
  • Canonical URLs set dynamically based on query parameters or user state

If any of these elements are critical to how your page appears in AI-powered search results, and they rely on JavaScript to be present in the DOM, they are missing from AI crawler indexes.
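The checklist above can be turned into a rough audit of a raw HTML response. This sketch uses only Python's standard library. The function name `audit_raw_html` and the sample markup are illustrative assumptions, and the checks are deliberately simple (presence of a non-empty value, not its quality).

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Records which metadata elements exist in HTML before any JavaScript runs."""

    def __init__(self):
        super().__init__()
        self.found = {
            "title": False,
            "description": False,
            "open_graph": False,
            "canonical": False,
            "json_ld": False,
        }
        self._context = None  # "title" or "json_ld" while inside that element

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "title":
            self._context = "title"
        elif tag == "script" and a.get("type", "").lower() == "application/ld+json":
            self._context = "json_ld"
        elif tag == "meta" and a.get("content", "").strip():
            if a.get("name", "").lower() == "description":
                self.found["description"] = True
            if a.get("property", "").lower().startswith("og:"):
                self.found["open_graph"] = True
        elif tag == "link" and a.get("href"):
            if "canonical" in a.get("rel", "").lower().split():
                self.found["canonical"] = True

    def handle_endtag(self, tag):
        if tag in ("title", "script"):
            self._context = None

    def handle_data(self, data):
        # Only count a title or JSON-LD block that actually has content.
        if self._context and data.strip():
            self.found[self._context] = True


def audit_raw_html(html: str) -> dict:
    parser = HeadAudit()
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical raw response: Open Graph tags and JSON-LD are added client-side.
RAW = """<html><head>
<title>Trail Shoe</title>
<meta name="description" content="Lightweight trail shoe.">
<link rel="canonical" href="https://example.com/trail-shoe">
</head><body><div id="root"></div></body></html>"""

for name, present in audit_raw_html(RAW).items():
    print(f"{name}: {'present' if present else 'MISSING'}")
```

Any item reported as missing here but visible in browser developer tools is being added by JavaScript.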

A Crawler Comparison

Not all crawlers are equal. Here is how the major bots compare on JavaScript rendering capability:

Googlebot: Full JavaScript rendering via a headless Chromium-based service. Sees essentially the same page a user sees. It rendered JavaScript before 2019 using an older, fixed Chrome version, and has used an evergreen Chromium renderer since 2019.

Bingbot: JavaScript rendering capability, though historically less consistent than Googlebot. Microsoft has invested in improving Bingbot's rendering, and it generally handles modern JavaScript frameworks.

GPTBot (OpenAI): No JavaScript rendering. Fetches HTML and JavaScript files but does not execute scripts. Indexes raw HTML only. OpenAI documents GPTBot as its training-data crawler and lists separate agents, OAI-SearchBot and ChatGPT-User, for search and user-initiated browsing.

ClaudeBot (Anthropic): No JavaScript rendering. Higher rate of JavaScript file fetching (23.84%) but no execution. Feeds into Claude's web knowledge.

PerplexityBot: No JavaScript rendering. Fastest-growing AI crawler by volume. Powers Perplexity's real-time search answers.

Meta-ExternalAgent: No JavaScript rendering. Crawls for Meta's AI training and features.

The pattern is clear: traditional search engines render JavaScript, AI crawlers do not. If your SEO strategy is built around ensuring Googlebot can read your pages, you have solved one problem while leaving another wide open.

Why This Matters More Than You Think

The natural response to this information is to say: "Google is still the dominant source of organic traffic, so as long as Googlebot can see my pages, the AI crawlers are a secondary concern."

That was a reasonable position in 2023. It is increasingly difficult to defend in 2026.

AI-powered search is growing across multiple vectors simultaneously. ChatGPT with web browsing, Perplexity as a search engine replacement, Claude's web-informed responses, and Meta's AI features across its platforms all draw from web crawl data. The combined reach of these platforms is substantial and growing.

More importantly, AI search tends to compress results. When Google returns ten blue links, there are ten chances for your site to appear. When ChatGPT answers a question, it typically synthesizes information from a handful of sources into a single response. If your site is not one of those sources because the crawler could not read your content, you have zero visibility in that answer — not reduced visibility, zero.

The compounding effect is that AI systems often cite their sources. A Perplexity answer that references your competitor's product creates a feedback loop: users click through to the competitor, the competitor gains engagement signals, and future AI answers become even more likely to reference the competitor. The site that is invisible to AI crawlers does not just miss one answer. It misses the entire chain of downstream visibility.

What AI Crawlers Cannot See: A Technical Inventory

Let us get specific about the categories of content and metadata that AI crawlers typically miss on JavaScript-dependent sites.

Client-side rendered page content: If your page body is rendered by a JavaScript framework and the server returns only a loading spinner or empty container, AI crawlers see the spinner. They do not see your product descriptions, article text, or any other content that matters for being cited in AI answers.

Dynamically assembled metadata: Many modern sites use JavaScript to construct their metadata. A React application might use a library that sets the document title and meta tags based on the current route and data. These tags exist in the rendered DOM but not in the initial HTML. AI crawlers miss them entirely.

JavaScript-injected structured data: Sites that build their JSON-LD structured data client-side — assembling it from API responses or state management stores — have structured data that exists only after JavaScript execution. Googlebot sees it. AI crawlers do not.

Tag manager modifications: If a tag management system modifies or adds meta tags after page load, those modifications are invisible to AI crawlers. This is particularly common for Open Graph tags, where the tag manager sets image, title, and description properties based on page-level variables.

Single-page application navigation: SPA routing creates pages that exist only in the browser's JavaScript runtime. A URL that renders a complete page in the browser might return only the application shell when fetched directly by a crawler. If your SPA does not implement server-side rendering or static generation, every route beyond the initial page load is empty to AI crawlers.

Lazy-loaded content: Content that loads as the user scrolls or interacts — common for performance optimization — is never triggered by AI crawlers. Below-the-fold content, expandable sections, tabbed interfaces, and infinite scroll patterns are all invisible.
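Link discovery is affected in the same way as content. As a rough illustration of the SPA navigation point, this sketch lists the internal links a fetcher can find in raw HTML. The helper name `internal_links`, the `example.com` URLs, and the sample markup are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects same-host <a href> targets that exist in the raw HTML."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.host = urlparse(base_url).netloc
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return  # click handlers on buttons or divs are not crawlable links
        href = dict(attrs).get("href")
        if not href or href.startswith(("#", "javascript:", "mailto:")):
            return
        absolute = urljoin(self.base_url, href).split("#")[0]
        if urlparse(absolute).netloc == self.host:
            self.links.add(absolute)


def internal_links(html: str, base_url: str) -> list:
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return sorted(parser.links)


# Hypothetical navigation: one real anchor, one external link, one in-page
# anchor, and one JavaScript-only control.
NAV = (
    '<nav><a href="/products">Products</a>'
    '<a href="https://other.example.org/x">Partner</a>'
    '<a href="#top">Top</a>'
    '<button onclick="go()">Cart</button></nav>'
)

print(internal_links(NAV, "https://example.com/"))
# ['https://example.com/products']
```

Routes that are reachable only through JavaScript handlers never show up in this list, so a non-rendering crawler has no path to them unless a sitemap or another page links to them.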

The Server-Side Rendering Advantage

Sites that use server-side rendering or static site generation have a natural advantage in AI crawler visibility. When the server returns fully rendered HTML — complete with content, metadata, and structured data — every crawler that fetches the page sees the same thing.

This is not a new insight. The SEO community has recommended server-side rendering for years, primarily for Googlebot's benefit. The difference now is that the stakes have expanded. Server-side rendering is no longer just a best practice for Google. It is a requirement for visibility in AI-powered search.

For sites already using server-side rendering frameworks like Next.js with SSR or SSG, the news is good: your pages are likely already visible to AI crawlers. The content is in the HTML that the server returns, and it will be indexed regardless of whether the crawler executes JavaScript.

For sites using client-side rendering exclusively, the remediation path is more complex. Moving to server-side rendering is a significant architectural change. But there are intermediate steps that can improve AI crawler visibility without a full rewrite.

Practical Steps to Improve AI Crawler Visibility

Audit your server-rendered HTML: Fetch your key pages with JavaScript disabled and examine what you get. If the title tag, meta description, main content, and structured data are all present in the raw HTML, your pages are AI-crawler-compatible. If any of these are missing, you have a visibility gap.

Implement server-side rendering for critical metadata: Even if your page content is client-rendered, ensure that title tags, meta descriptions, Open Graph tags, and canonical URLs are present in the initial HTML response. Most JavaScript frameworks support this through head management libraries that render metadata server-side.
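Framework head-management libraries differ, so the following is a framework-agnostic sketch in Python of the underlying idea: build the head tags on the server from page data and escape every value. The function name `render_head` and its parameters are hypothetical, not any library's API.

```python
from html import escape
from typing import Optional


def render_head(title: str, description: str, canonical_url: str,
                og_image: Optional[str] = None) -> str:
    """Return head tags as a string to place in the initial HTML response."""
    tags = [
        f"<title>{escape(title)}</title>",
        f'<meta name="description" content="{escape(description)}">',
        f'<link rel="canonical" href="{escape(canonical_url)}">',
        f'<meta property="og:title" content="{escape(title)}">',
        f'<meta property="og:description" content="{escape(description)}">',
    ]
    if og_image:
        tags.append(f'<meta property="og:image" content="{escape(og_image)}">')
    return "\n".join(tags)


print(render_head(
    title='Shoes & "Boots"',
    description="Lightweight trail footwear.",
    canonical_url="https://example.com/shoes",
))
```

Because the tags are part of the response body, they are readable whether or not the client runs any script. The page content itself can stay client-rendered while this part moves to the server.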

Move structured data to the server: JSON-LD should be present in the HTML that the server returns, not assembled client-side. If your structured data depends on API responses, fetch that data server-side and embed the JSON-LD in the initial HTML.
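A minimal sketch of server-side JSON-LD embedding, assuming the product data has already been fetched on the server. The helper name `json_ld_script` and the sample product are illustrative. The one non-obvious detail is escaping `</` so that a value containing `</script>` cannot end the tag early.

```python
import json


def json_ld_script(data: dict) -> str:
    """Serialize structured data into a script tag for the initial HTML."""
    payload = json.dumps(data, ensure_ascii=False, separators=(",", ":"))
    # "\/" is a valid JSON escape for "/", so parsers still read the same value.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'


# Hypothetical product record, as it might come back from an internal API.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Trail Shoe",
    "description": "Ends with </script> safely",
}

print(json_ld_script(product))
```

The returned string goes into the page template on the server, next to the other head tags, instead of being appended to the DOM after load.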

Test with AI crawler user agents: Set up monitoring that fetches your pages using AI crawler user agent strings and compares the result to what a browser sees. Automated monitoring can perform this comparison across your entire site, flagging pages where the server-rendered version is missing content or metadata that appears in the browser-rendered version.
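One way to sketch this check in Python is shown below. It is built on assumptions: the user agent values are short tokens with an audit suffix, not the full strings each operator publishes, and the rendered snapshot is expected to come from a headless browser or similar tool that is not shown here. The function names are hypothetical.

```python
from urllib.request import Request, urlopen

# Assumption: product tokens only. Check each operator's documentation for the
# full user agent string if your server matches on it exactly.
CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def crawler_request(url: str, token: str) -> Request:
    """Build a request that identifies itself with an AI crawler token."""
    return Request(url, headers={"User-Agent": f"{token} (visibility-audit)"})


def fetch_raw(url: str, token: str, timeout: float = 10.0) -> str:
    """Fetch the HTML exactly as served; nothing here executes JavaScript."""
    with urlopen(crawler_request(url, token), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def missing_from_raw(raw_html: str, rendered_html: str, phrases: list) -> list:
    """Return phrases that appear after rendering but not in the raw response."""
    return [p for p in phrases if p in rendered_html and p not in raw_html]


# Offline example with hypothetical snapshots of the same URL.
raw = '<title>Store</title><div id="root"></div>'
rendered = '<title>Store</title><div id="root"><h1>Trail Shoe</h1></div>'
print(missing_from_raw(raw, rendered, ["Store", "Trail Shoe"]))
# ['Trail Shoe']
```

In practice `fetch_raw` would supply `raw_html` for each crawler token, and the phrases would be the title, key headings, and a sentence or two of body copy per page.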

Review your robots.txt: Ensure you are not accidentally blocking AI crawlers from pages you want indexed. Some sites block GPTBot or ClaudeBot entirely out of concern about AI training data usage. If you want visibility in AI-powered search results, you need to allow these crawlers access to your content.
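Python's standard library includes a robots.txt parser, which makes this review easy to script. The sketch below checks a robots.txt body against several crawler names. The sample rules and URL are invented, and real crawlers may interpret edge cases differently from `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser


def crawler_access(robots_txt: str, url: str,
                   agents=("GPTBot", "ClaudeBot", "PerplexityBot", "Googlebot")) -> dict:
    """Report which user agents the given robots.txt allows to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt that blocks GPTBot and allows everyone else.
ROBOTS = """User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

for agent, allowed in crawler_access(ROBOTS, "https://example.com/products").items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Running this against your live robots.txt for a handful of important URLs shows quickly whether a block is intentional or left over from an earlier policy.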

Implement pre-rendering for critical pages: If a full SSR migration is not feasible, consider pre-rendering your most important pages. Pre-rendering generates static HTML for specific routes at build time, making them available to all crawlers regardless of JavaScript capability.
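Pre-rendering tools vary by framework, so this is only a sketch of the mechanism in Python: call a render function per route at build time and write the result to a static file. The `prerender` helper, the route map, and the output layout (`<route>/index.html`) are assumptions for illustration.

```python
from pathlib import Path
from typing import Callable, Dict, List


def prerender(routes: Dict[str, Callable[[], str]], out_dir: str) -> List[Path]:
    """Write one static index.html per route and return the paths written."""
    written = []
    for route, render in routes.items():
        target = Path(out_dir) / route.strip("/") / "index.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(render(), encoding="utf-8")
        written.append(target)
    return written


# Hypothetical routes; each callable stands in for a real page renderer.
ROUTES = {
    "/": lambda: "<h1>Home</h1>",
    "/products/trail-shoe": lambda: "<h1>Trail Shoe</h1>",
}

if __name__ == "__main__":
    for path in prerender(ROUTES, "dist"):
        print(path)
```

The web server then serves these files directly for the listed routes, so every crawler receives complete HTML for them regardless of JavaScript support.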

The Dual-Visibility Strategy

Going forward, technical SEO needs to account for two distinct audiences: traditional search engines that render JavaScript and AI crawlers that do not.

This does not mean maintaining two versions of your site. It means ensuring that the critical elements — content, metadata, and structured data — are present in the server-rendered HTML while still providing the enhanced experience that JavaScript enables for users and Googlebot.

The pages that perform best across both audiences are those that follow the principle of progressive enhancement: the server delivers a complete, functional, SEO-ready page, and JavaScript enhances it with interactivity, dynamic content updates, and richer user experiences. The base layer is accessible to every crawler. The enhanced layer is a bonus for crawlers that can process it.

This approach has the added benefit of improving performance. Pages that render meaningful content server-side load faster for users, score better on Core Web Vitals, and are more resilient to network and JavaScript failures. Optimizing for AI crawler visibility and optimizing for user experience are, in this case, the same thing.

Looking Ahead

The JavaScript rendering gap between traditional search engines and AI crawlers is unlikely to close quickly. Building and operating a rendering service at crawl scale is expensive and complex. Google invested years and significant infrastructure to make Googlebot's renderer reliable. AI companies are focused on model development, not crawler engineering, and the economics of rendering billions of pages to feed training pipelines are not favorable.

This means the gap will probably persist for the foreseeable future. Sites that depend on JavaScript rendering for their content and metadata are likely to remain invisible to AI-powered search for years, while competitors who serve static HTML capture that growing audience.

The time to address this is now, while AI search is still in its growth phase and the competitive landscape is forming. The sites that establish visibility in AI search results early will have an advantage that compounds over time — being cited in AI answers generates traffic, engagement, and authority signals that make future citations more likely.

This is why dynamic SEO — which ensures metadata and structured data are present in the HTML response before it reaches any crawler — provides visibility across all crawlers, including those that never execute JavaScript.

Your Google rankings tell you half the story. The other half is what happens when someone asks an AI about your industry and your site is not in the answer.

Frequently Asked Questions

Does GPTBot execute JavaScript when crawling websites?

No. According to analysis of 569 million GPTBot requests conducted by Vercel and MERJ, GPTBot does not execute JavaScript. It fetches HTML pages and downloads JavaScript files — approximately 11.5% of its requests are for JavaScript resources — but it does not run those scripts. The content GPTBot indexes is limited to whatever is present in the raw HTML that the server returns. Any content, metadata, or structured data that is rendered by JavaScript after page load is invisible to GPTBot.

Which AI crawlers can render JavaScript and which cannot?

Among the major crawlers, Googlebot and Bingbot can render JavaScript using headless browser technology. All major AI-specific crawlers — including GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot, and Meta-ExternalAgent — lack JavaScript rendering capability. Testing by Vercel and MERJ found that approximately 69% of AI crawlers examined cannot render JavaScript at all. These AI crawlers read only the raw HTML returned by the server, meaning any content or metadata that depends on JavaScript execution is missing from their indexes.

How much traffic do AI crawlers generate compared to Googlebot?

AI crawlers collectively generate approximately 28% of Googlebot's volume, translating to roughly 1.3 billion requests per month on Vercel's network based on the Vercel/MERJ analysis. PerplexityBot has shown the most dramatic growth, with a 157,000% increase in request volume, though from a very small starting base. These numbers are growing rapidly as AI-powered search products gain mainstream adoption. While Googlebot still dominates crawl volume, AI crawlers represent a substantial and fast-growing share of automated traffic to websites.

How do I check if AI crawlers can see my website content?

The most direct test is to fetch your pages with JavaScript disabled and examine the raw HTML. If your title tag, meta description, main content, structured data, and Open Graph tags are all present in the initial HTML response, AI crawlers can see them. If any of these elements are missing — rendered only by JavaScript — AI crawlers will miss them. You can also use curl or a tool that fetches pages without executing JavaScript and compare the output to what you see in a browser. For site-wide auditing, tools like Dynamic SEO can automate this comparison across all pages, identifying where server-rendered content differs from browser-rendered content.

Should I block or allow AI crawlers like GPTBot and ClaudeBot?

This depends on your business goals. If you want your content to appear in AI-powered search results from ChatGPT, Claude, Perplexity, and similar platforms, you should allow these crawlers access via your robots.txt file. Blocking them means your content will not be included in AI-generated answers, giving competitors who allow crawling an advantage in AI search visibility. If you have concerns about AI training data usage, some operators publish separate user agent strings for training and for search or retrieval. OpenAI, for example, documents GPTBot for training data and OAI-SearchBot for search, so you can make a different decision for each after reviewing its data usage policies. The strategic consideration is that AI search is a growing traffic source, and sites that are invisible to AI crawlers today are missing a channel that is likely to become more significant over time.
