ToolHub

robots.txt Tester

Test if a URL is allowed or blocked

Example result
Result: Blocked
Path: /admin/page
Group used: *
Matched rule: Disallow: /admin/

This is a simplified test of the standard robots rules: it picks the matching user-agent group (falling back to the wildcard group) and applies longest-match precedence with Allow winning ties. Real crawlers may differ in edge cases. Everything runs in your browser.

Overview

Test whether a URL is allowed by robots.txt

A robots.txt file lives at the root of a site and tells web crawlers which paths they may or may not request. It is the first file search engines and bots look for before crawling. Getting it wrong can quietly block important pages from being indexed, or expose paths you meant to keep out of search results.

This tool parses a robots.txt file into user-agent groups and their Allow and Disallow rules, then checks a path you provide against the rules for a chosen user-agent. It tells you whether the path is Allowed or Blocked and which rule decided the outcome. Everything runs in your browser.
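The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the function name parse_robots and the dictionary layout are assumptions made for the example, and fields other than User-agent, Allow, and Disallow are simply skipped.

```python
def parse_robots(text):
    """Split robots.txt text into groups of user-agents and their rules.

    Hypothetical helper: returns a list of dicts shaped like
    {"agents": [...], "rules": [("allow" | "disallow", pattern), ...]}.
    """
    groups = []
    current = None
    reading_agents = False  # True while consecutive User-agent lines are read
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            # A User-agent line that follows a rule starts a new group.
            if current is None or not reading_agents:
                current = {"agents": [], "rules": []}
                groups.append(current)
                reading_agents = True
            current["agents"].append(value.lower())
        elif field in ("allow", "disallow"):
            reading_agents = False
            if current is not None:  # rules before any User-agent are ignored
                current["rules"].append((field, value))
    return groups
```

Agent names are lowercased at parse time so that later comparisons can be case-insensitive, while rule patterns are stored unchanged because path matching is case-sensitive.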

Step-by-step

How to use the robots.txt tester

  1. Paste your robots.txt

    Drop the contents of your robots.txt file into the editor, or start from the example and adjust it.
  2. Enter a path to test

    Type the URL or path you want to check, for example /admin/page. A full URL works too; only the path part is used.
  3. Choose a user-agent

    Set the user-agent token, such as Googlebot or the wildcard *. The result updates live and names the matched rule.
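Reducing a full URL to its path, as mentioned in the second step, might look like the sketch below. The helper name path_to_test is hypothetical, and dropping the query string is an assumption that follows the "only the path part is used" wording; some crawlers also match rules against the query string.

```python
from urllib.parse import urlsplit


def path_to_test(url_or_path):
    """Return just the path portion of a full URL or a bare path."""
    parts = urlsplit(url_or_path.strip())
    # A URL with no path (for example a bare origin) is treated as the root.
    return parts.path or "/"
```

With this sketch, both https://example.com/admin/page and /admin/page reduce to /admin/page. Input without a scheme, such as example.com/admin/page, is not handled specially here.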

Background

How robots.txt matching works

A robots.txt file is a list of groups. Each group starts with one or more User-agent lines, followed by Disallow and Allow rules that apply to those agents. A crawler picks the group that matches its own name, and falls back to the User-agent: * group if no specific group matches.
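Group selection with the wildcard fallback can be sketched as below. The function name select_group and the group layout (a dict with lowercased "agents" and a "rules" list) are assumptions for illustration, and exact token comparison is a simplification: real crawlers may match product tokens more loosely.

```python
def select_group(groups, user_agent):
    """Pick the group naming this agent, else the wildcard group, else None."""
    token = user_agent.strip().lower()
    wildcard = None
    for group in groups:
        if token in group["agents"]:
            return group  # a specific match always beats the wildcard
        if wildcard is None and "*" in group["agents"]:
            wildcard = group  # remember the first wildcard group as fallback
    return wildcard


# Illustrative data only; these rules are not taken from a real site.
example_groups = [
    {"agents": ["*"], "rules": [("disallow", "/admin/")]},
    {"agents": ["googlebot"], "rules": [("disallow", "/private/")]},
]
```

Here select_group(example_groups, "Googlebot") returns the second group even though the wildcard group is listed first, while an unlisted agent such as Bingbot gets the wildcard group.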

Longest match wins

When several rules match a path, the rule with the longest path pattern takes precedence. So Allow: /admin/public/ beats Disallow: /admin/ for a URL like /admin/public/page, because the Allow pattern is longer and more specific. If two matching rules are the same length, Allow wins the tie.
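The precedence rule can be sketched with plain prefix matching, leaving wildcards aside. The function name decide and the (kind, pattern) tuple shape are assumptions made for this example.

```python
def decide(rules, path):
    """Return (allowed, winning_rule) using longest-match precedence.

    rules is a list of ("allow" | "disallow", pattern) tuples.
    Wildcards are ignored here; a pattern matches if the path starts with it.
    """
    best = None
    for kind, pattern in rules:
        if not pattern or not path.startswith(pattern):
            continue  # empty patterns match nothing
        longer = best is None or len(pattern) > len(best[1])
        tie_for_allow = (
            best is not None
            and len(pattern) == len(best[1])
            and kind == "allow"
        )
        if longer or tie_for_allow:
            best = (kind, pattern)
    if best is None:
        return True, None  # no matching rule means the path is allowed
    return best[0] == "allow", best


rules = [("disallow", "/admin/"), ("allow", "/admin/public/")]
print(decide(rules, "/admin/public/page"))  # (True, ('allow', '/admin/public/'))
print(decide(rules, "/admin/page"))         # (False, ('disallow', '/admin/'))
```

Comparing pattern lengths rather than rule order means the outcome does not depend on where a rule appears in the file.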

Wildcards and end anchors

Most major crawlers support * as a wildcard for any run of characters and $ to anchor a rule to the end of the path. For example, Disallow: /*.pdf$ blocks every URL that ends in .pdf. This tester applies both conventions.
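One way to apply both conventions is to translate each rule pattern into a regular expression. The sketch below is one possible approach rather than the tester's confirmed implementation, and the names pattern_to_regex and matches are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern with * and a trailing $ to a regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces and join them with ".*" where a * appeared.
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def matches(pattern, path):
    """True if the pattern matches from the start of the path."""
    return pattern_to_regex(pattern).match(path) is not None


print(matches("/*.pdf$", "/docs/file.pdf"))      # True
print(matches("/*.pdf$", "/docs/file.pdf?x=1"))  # False
print(matches("/admin/", "/admin/page"))         # True
```

Without the trailing $, a pattern behaves as a prefix, so /*.pdf would also match /a.pdf.html.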

Use cases

When to use this tool

Before deploying robots.txt

Confirm your new rules block the paths you intend and leave important pages crawlable.

Debugging deindexed pages

Check whether a page that dropped out of search is being blocked by an overly broad Disallow.

Per-crawler rules

Test how Googlebot and the wildcard group differ when you target specific bots separately.

Staging and admin paths

Verify that admin, staging, or private directories are actually disallowed for crawlers.

SEO audits

Spot-check a competitor's or client's robots.txt to understand what they keep out of search.

Learning the syntax

Experiment with Allow, Disallow, wildcards, and groups to see how matching plays out.

Tips and best practices

  • robots.txt controls crawling, not indexing. To keep a page out of search results, use a noindex meta tag and leave the page crawlable so the tag can be seen.
  • An empty Disallow value means allow everything for that group.
  • Path rules are case-sensitive, but user-agent matching is case-insensitive.
  • Keep robots.txt at the site root. A file at /folder/robots.txt is ignored.
  • Do not rely on Disallow to hide secrets. Blocked URLs can still be discovered and the file is public.
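Two of the tips above, the empty Disallow value and the case rules, can be illustrated with a short sketch. Both helper names are hypothetical, and the path check uses plain prefix matching without wildcards.

```python
def same_agent(a, b):
    """User-agent tokens are compared without regard to case."""
    return a.strip().lower() == b.strip().lower()


def disallow_blocks(value, path):
    """A Disallow value blocks a path only if it is non-empty and a prefix."""
    if value == "":
        return False  # empty Disallow means nothing is blocked
    return path.startswith(value)  # comparison is case-sensitive


print(same_agent("Googlebot", "googlebot"))       # True
print(disallow_blocks("", "/anything"))           # False
print(disallow_blocks("/Admin/", "/admin/page"))  # False
```

The last line shows why a rule written as /Admin/ does not protect /admin/page: the path comparison preserves case.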

Common questions

What happens if no rule matches my path?

If no Allow or Disallow rule in the relevant group matches the path, the path is allowed by default. Crawlers assume access unless a rule explicitly disallows it.

Which group applies to my crawler?

A crawler uses the group whose user-agent matches its own token. If none match, it uses the User-agent: * group. This tester follows the same fallback, and shows which group it used.

Does Disallow remove a page from Google?

No. Disallow only stops crawling. A blocked URL can still appear in search results if other pages link to it. To truly remove a page, let it be crawled and add a noindex directive instead.

How accurate is this tester?

It implements the common rules: group selection, longest-match precedence, Allow winning ties, plus * and $ wildcards. Individual crawlers may differ in rare edge cases, so treat this as a simplified check rather than the final word.

100% private

Privacy and security

Your robots.txt content and the path you test are parsed entirely in your browser. Nothing is uploaded or sent over the network, so you can safely test private or unpublished rules.

Frequently asked questions

How does the test work?

It parses the User-agent groups and their Allow and Disallow rules, then applies the longest matching rule to decide if your path is allowed or blocked.

Which crawler does it test?

Whichever user-agent you enter. If there is no matching group, it falls back to the wildcard (*) group.

Is this exactly how Google reads robots.txt?

It follows the standard rules and is a close approximation, but it is a simplified test, not the official Google parser.