How Can We Help?

Guides, explanations, and troubleshooting for your AI visibility audits.

Getting Started with Your First Audit

1. Enter a URL

Type or paste any URL you want to check. You can audit homepages, product pages, blog posts—any publicly accessible URL.

2. Wait for Results

Typically 15–30 seconds. Performance data from CrUX, crawlability checks (robots.txt, HTTP Status Check, noindex), and content analysis checks (visibility gap, Content Diff impact, render pattern, structure) all run in parallel.

3. Review Findings

Results appear across tabs: Overview (summary), Performance (TTFB/CLS/INP), Crawlability (33 AI crawlers + HTTP Status Check + noindex), and Content Visibility (JS vs HTML, Content Diff, Render Pattern, Semantic Heading, ARIA Labels).

4. Export Report

Click "Export PDF" to download a complete report you can share with developers, clients, or your team.

Understanding Performance Results

Google's CrUX dataset only includes URLs with sufficient real-user traffic. Low-traffic pages may not meet the aggregation threshold required to generate metrics. When URL-level data is unavailable, we check origin-level data (your whole domain) as a fallback. We clearly label when either is missing rather than guessing.

For performance insights on low-traffic pages, consider running Lighthouse for synthetic test data alongside CrUX monitoring.

URL-level: Metrics specific to the exact page you entered. Reflects performance for that specific URL based on real user visits.

Origin-level: Metrics aggregated across your entire domain. Useful for understanding overall site performance when page-specific data isn't available.

Both matter. A fast homepage with a slow product page means AI systems may reliably access some pages while timing out on others.

TTFB measures server response time—the raw time before any content renders. Your site may feel fast due to geographic proximity (you're near the server), browser caching, or client-side rendering that makes subsequent navigation feel instant.

CrUX shows real-world data from diverse users across different networks and locations. A 75th percentile TTFB of 2000ms means 25% of your visitors experience even slower responses—and AI crawlers querying from data center IPs may not benefit from your CDN caching.
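To make the percentile language concrete, here is a minimal nearest-rank p75 sketch (illustrative only; CrUX aggregates its field data with its own methodology):

```python
import math

def p75(samples_ms):
    # Nearest-rank 75th percentile: the value at or below which
    # 75% of the sampled TTFB measurements fall.
    s = sorted(samples_ms)
    return s[math.ceil(0.75 * len(s)) - 1]

p75([300, 400, 500, 2000])  # 500: a quarter of visits were slower than this
```

A p75 of 500ms in this toy sample means the remaining 25% of visits, including the 2000ms one, were slower.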

Metric   Good      Needs Improvement   Poor
TTFB     ≤ 800ms   800–1800ms          > 1800ms
CLS      ≤ 0.1     0.1–0.25            > 0.25
INP      ≤ 200ms   200–500ms           > 500ms
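The thresholds in the table can be applied mechanically. A small sketch (metric names and inclusive "Good" boundaries follow the table above; exact boundary handling in the audit may differ):

```python
# Thresholds from the table: ("Good" upper bound, "Poor" lower bound)
THRESHOLDS = {
    "TTFB": (800, 1800),   # milliseconds
    "CLS":  (0.1, 0.25),   # unitless score
    "INP":  (200, 500),    # milliseconds
}

def rate(metric, value):
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "Good"
    if value <= poor:
        return "Needs Improvement"
    return "Poor"

rate("TTFB", 2000)  # "Poor"
```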

Understanding Crawlability Results

If a crawler shows as blocked when you expected it to be allowed, check for these common causes:

  • A User-agent: * followed by Disallow: / at the top of your robots.txt blocks all crawlers
  • Specific user-agent rules that unintentionally match AI bot strings via partial matches
  • Conflicting rules where a specific directive overrides a general allow rule

We parse your robots.txt according to RFC 9309. If results look unexpected, review your file at yourdomain.com/robots.txt directly.

Blocking some AI crawlers is not a problem in itself. Selective blocking is completely valid. What matters is that your decisions are intentional, not accidental. You may want to allow search crawlers (PerplexityBot, OAI-SearchBot) for real-time visibility while blocking training crawlers (GPTBot, CCBot) if you're concerned about training data use. Both are legitimate choices.
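A robots.txt implementing that split might look like this (illustrative only; adjust the bot list to your own policy):

```
# Allow AI search crawlers (real-time visibility)
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Block AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```

Under RFC 9309 matching, a bot that has its own User-agent group ignores the `User-agent: *` group, so spell out each bot you want to treat specially.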

robots.txt is a request for crawlers to honor your rules—it's not a technical enforcement mechanism. Any compliant, well-behaved crawler will respect the rules, but robots.txt cannot technically prevent access the way authentication or IP blocking can.

Major AI providers (OpenAI, Anthropic, Google, etc.) document that they respect robots.txt. For real access control, use server-level authentication, not robots.txt.

If your goal is maximum AI visibility, prioritize allowing: GPTBot (ChatGPT training/search), OAI-SearchBot (ChatGPT Search), ClaudeBot (Claude), PerplexityBot (Perplexity), Google-Extended (Gemini), CCBot.

See our full list of 33 crawlers we check →

robots.txt is a crawl access request file (usually origin-wide). noindex is a page-level directive sent via meta robots or X-Robots-Tag to request exclusion from indexing.

A page can be crawlable and still excluded if noindex is present.
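For reference, the two common delivery mechanisms for noindex look like this (illustrative snippets):

```
HTML meta tag (in the page's <head>):
  <meta name="robots" content="noindex">

HTTP response header:
  X-Robots-Tag: noindex
```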

robots.txt may allow access, but runtime requests can still fail due to WAF rules, anti-bot filtering, or infrastructure controls. HTTP Status Check validates live response codes for Browser + GPTBot + ClaudeBot + PerplexityBot on the audited URL.

Interpretation: 2xx = reachable, 3xx = redirected (still good), 4xx = warning, 429/5xx = critical, timeout/no response = unavailable. If Browser returns 429 and the server sends Retry-After, that value is shown in the Browser row.
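That interpretation maps directly to a status-to-severity function. A minimal sketch (bucket names follow the list above; the real check also records Retry-After):

```python
def classify(status):
    # Map an HTTP status code (or None for timeout / no response)
    # to the severity buckets described above.
    if status is None:
        return "unavailable"
    if status == 429 or status >= 500:
        return "critical"       # rate-limited or server error
    if status >= 400:
        return "warning"
    if status >= 300:
        return "redirected"     # still good
    return "reachable"

classify(429)  # "critical"
```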

Understanding Content Visibility Results

We compare the word count in the JavaScript-rendered view (what a user sees) against the raw HTML response (what many crawlers see). The gap percentage is the proportion of content that requires JavaScript to appear.

Example: JS view has 1000 words, raw HTML has 700 words = 30% gap. This means 30% of your content only appears after JavaScript executes.
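The arithmetic behind that example, as a sketch (the real audit compares extracted text, not just word counts):

```python
def visibility_gap(js_words, html_words):
    # Percent of rendered content that never appears in raw HTML.
    if js_words == 0:
        return 0.0
    return max(0, js_words - html_words) / js_words * 100

visibility_gap(1000, 700)  # 30.0
```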

Use the Content Diff panel alongside this gap number to see which specific sections are likely invisible to crawlers and worth fixing first.

A nonzero gap is not necessarily a problem. Context matters:

  • <5%: Normal and low-risk. Minor JS-dependent elements (timestamps, interactive elements) don't significantly affect AI access.
  • 5–30%: May affect some crawlers. Investigate what's in the gap—navigation links missing is less critical than main article text.
  • >30%: Significant gap. Prioritize getting critical content (headings, main text, key data) into raw HTML.
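Those bands can be expressed as a simple lookup (boundary handling here is an assumption; the audit may band edge values differently):

```python
def gap_band(gap_pct):
    # Risk bands from the list above.
    if gap_pct < 5:
        return "low-risk"
    if gap_pct <= 30:
        return "investigate"
    return "significant"
```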

If the gap looks manageable but your impact summary is high, trust the Content Diff list and fix those highlighted sections first.

Both indicate a mismatch between raw HTML and the rendered page. "Added by JS" means content may be invisible to crawlers that do not execute JavaScript fully. "Removed by JS" means content present in raw HTML no longer appears in the final rendered view.

Either mismatch can reduce extraction reliability, so both are treated as risk signals in the Content Diff section.

Common ways to close a large visibility gap:

  1. Server-side rendering (SSR): Render pages on the server so HTML contains all content before sending to the client.
  2. Static site generation (SSG): Pre-render pages at build time for content that doesn't change per-request.
  3. Pre-rendering services: Serve static snapshots to bots while users get the full JS experience.
  4. Progressive enhancement: Ensure critical content exists in base HTML, use JS to enhance rather than deliver.
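Option 4 in practice: critical content ships in the base HTML, and scripts only enhance it. An illustrative markup sketch:

```
<!-- Base HTML already contains the content crawlers need -->
<article>
  <h1>Product comparison</h1>
  <p>The full article text is present before any script runs.</p>
  <table id="specs">
    <tr><td>Weight</td><td>1.2 kg</td></tr>
  </table>
</article>

<script>
  // Enhancement only: e.g., add client-side sorting to the table
  // that is already in the HTML, rather than fetching and injecting it.
</script>
```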

The Render Pattern card helps you decide which fix is most likely appropriate, but it remains heuristic rather than a hard framework detector. Use Content Diff to prioritize exactly which missing sections to move into raw HTML first.

Learn more about content visibility fixes →

The Render Pattern card is a heuristic summary based on raw HTML coverage, framework markers, and selected response headers. It helps you quickly tell whether the page looks mostly CSR, SSR, Static, Hydrated, or Hybrid.

It can also show secondary hints for SSG, ISR, ESR, or islands/partial hydration when enough evidence is exposed, but those are directional hints rather than confirmed states. DPR (Distributed Persistent Rendering) cannot be reliably identified from a single external fetch.

The Semantic Heading card checks heading hierarchy from the raw HTML response—the order of H1–H6 tags, whether levels are skipped, and overall structural consistency. Crawlers and agent-style workflows use heading structure as a cue for content extraction, not just visual layout.

This is a structural extraction signal for AI systems, not a full WCAG accessibility audit. Native HTML elements and a logical heading hierarchy are the foundation; ARIA complements them for custom interactions.
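A minimal sketch of that kind of hierarchy check using only the standard library (simplified; the real card also scores overall structural consistency):

```python
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    # Collects H1-H6 levels in document order from raw HTML.
    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.levels.append(int(tag[1]))

def skipped_levels(html):
    # Returns (previous, current) pairs where a level was skipped,
    # e.g. an H2 followed directly by an H4.
    p = HeadingCollector()
    p.feed(html)
    return [(a, b) for a, b in zip(p.levels, p.levels[1:]) if b > a + 1]

skipped_levels("<h1>A</h1><h2>B</h2><h4>C</h4>")  # [(2, 4)]
```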

The ARIA Labels card counts aria-label, aria-labelledby, aria-describedby attributes and explicit role values present in the raw HTML response. Pages built primarily with native HTML elements may have low counts—this is normal and expected.

The check is most useful for identifying custom interactive controls (tabs, modals, carousels) that lack explicit roles or labels, as these can reduce extraction reliability for systems that rely on structural cues. This is not a full WCAG accessibility audit.
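The counting itself is straightforward. A sketch with the standard-library parser (the real check layers interpretation on top of these raw counts):

```python
from html.parser import HTMLParser

class AriaCounter(HTMLParser):
    # Counts the attributes the ARIA Labels card looks at.
    TRACKED = ("aria-label", "aria-labelledby", "aria-describedby", "role")

    def __init__(self):
        super().__init__()
        self.counts = {name: 0 for name in self.TRACKED}

    def handle_starttag(self, tag, attrs):
        for name, _value in attrs:
            if name in self.TRACKED:
                self.counts[name] += 1

p = AriaCounter()
p.feed('<div role="tablist"><button aria-label="Open menu">Menu</button></div>')
p.counts["role"], p.counts["aria-label"]  # (1, 1)
```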

Understanding Lab Experiments Results

Lab Experiments are exploratory diagnostics available inside the audit tool. They check signals that have been widely discussed in the AI visibility space—but none of them (so far) have been shown to have any direct impact on LLM visibility.

Use them to investigate potential structure and ingestion signals, not as a ranking guarantee. They are kept separate from the core checks (Performance, Crawlability, Content Visibility) to avoid mixing confidence levels.

Checks whether a /llms.txt file is present at the root of your domain. llms.txt is a proposed convention for giving LLMs structured guidance about a site's content. Its presence or absence is an ecosystem signal—not proof of ingestion or indexing behavior by any specific AI system.
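The location is fixed by convention: whichever page you audit, the file lives at the origin root. A small sketch of deriving that URL:

```python
from urllib.parse import urlsplit, urlunsplit

def llms_txt_url(page_url):
    # llms.txt is proposed to live at the origin root,
    # regardless of which page was audited.
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))

llms_txt_url("https://example.com/blog/post?x=1")
# "https://example.com/llms.txt"
```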

Detects @type values from JSON-LD blocks found in the raw HTML. The chips shown represent all detected types from sampled items—both valid and invalid. Structured data can improve machine readability, but its presence is not direct proof of inclusion in AI answers.
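A simplified sketch of extracting @type values from JSON-LD blocks (regex-based for brevity; a robust parser should also handle @graph nesting, list-valued @type, and surface invalid blocks rather than skipping them):

```python
import json
import re

JSONLD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)

def jsonld_types(html):
    # Collect @type values from every JSON-LD block that parses.
    types = []
    for block in JSONLD_RE.findall(html):
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # simplification: the real check also surfaces invalid blocks
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and item.get("@type"):
                types.append(item["@type"])
    return types

jsonld_types('<script type="application/ld+json">{"@type": "Article"}</script>')
# ["Article"]
```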

Converts the page content to Markdown and compares token, word, and character counts between the HTML and Markdown representations. A negative difference means Markdown is more compact for this content. Token counts depend on representation and tokenizer assumptions—treat these figures as directional, not absolute.
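The word and character comparison is simple arithmetic; token counts need a tokenizer, so this sketch covers only the two simpler measures (negative values mean the Markdown side is smaller):

```python
def size_diff(html_text, markdown_text):
    # Negative values mean the Markdown representation is more compact.
    return {
        "words": len(markdown_text.split()) - len(html_text.split()),
        "chars": len(markdown_text) - len(html_text),
    }

size_diff("<p>Hello world</p>", "Hello world")  # {'words': 0, 'chars': -7}
```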

Troubleshooting

Audit timed out or returned errors

Common causes: Server is down or extremely slow (>30 seconds), invalid URL format, firewall blocking the audit request.

Try: Verify the URL loads in your browser, check if the site is reachable, try a different page from the same domain.

Chrome extension not working on certain pages

The extension cannot analyze browser internal pages (chrome://), extension settings pages, local files (file://), or pages with strict Content Security Policies.

Solution: Use the web app at beseenby.ai for these cases.

robots.txt shows old rules after an update

We fetch robots.txt fresh on each audit, but your CDN or reverse proxy may still be serving a cached copy of the old file.

Try: Wait 5–10 minutes, verify your changes are visible at yourdomain.com/robots.txt directly, then run the audit again.

Key Terms

TTFB (Time to First Byte)
The time from when a browser requests a page to when it receives the first byte of the response. Key indicator of server response speed and fetch reliability.
CLS (Cumulative Layout Shift)
A measure of visual stability—how much page content moves unexpectedly during loading. High CLS can disrupt content extraction by crawlers.
INP (Interaction to Next Paint)
A measure of page responsiveness to user interactions. Indicates JavaScript execution load, which can affect content availability timing.
CrUX (Chrome User Experience Report)
Google's public dataset of real-user performance metrics collected from Chrome users. The same data that powers Google Search Console's Core Web Vitals reports.
robots.txt
A file at the root of a website (yourdomain.com/robots.txt) that tells crawlers which pages they are allowed to access. Advisory only—not enforced technically.
Noindex
A directive that asks systems not to index a page. It can appear in HTML meta robots tags or in the HTTP X-Robots-Tag header.
Origin-level
Analysis at the domain level (e.g., example.com), aggregating across all pages. Used for robots.txt checks and site-wide performance data.
URL-level
Analysis of a specific page URL. Used for page-specific performance metrics, HTTP Status Check results, noindex checks, and content visibility checks.
Render Pattern
A heuristic summary of how a page appears to be rendered based on raw HTML coverage, framework markers, and selected response headers. Useful for diagnosing whether content depends heavily on client-side rendering.

Still Need Help?

Contact Support

Have a question that's not answered here? Get in touch directly.

Contact Us

Explore Features

Detailed documentation on each diagnostic check we run.

See Features

Ready to Run Your First Audit?

Free, unlimited audits. No signup required.

Run Free Audit
Also available as a Chrome Extension

Quick audits while you browse—all core features included, always free.

Add to Chrome