Overview
Test whether a URL is allowed by robots.txt
A robots.txt file lives at the root of a site and tells web crawlers which paths they may or may not request. Well-behaved search engines and bots fetch it before crawling anything else on the site. Getting it wrong can quietly block important pages from being indexed, or expose paths you meant to keep out of search results.
This tool parses a robots.txt file into user-agent groups and their Allow and Disallow rules, then checks a path you provide against the rules for a chosen user-agent. It tells you whether the path is Allowed or Blocked and which rule decided the outcome. Everything runs in your browser.
Step-by-step
How to use the robots.txt tester
- 1
Paste your robots.txt
Drop the contents of your robots.txt file into the editor, or start from the example and adjust it.
- 2
Enter a path to test
Type the URL or path you want to check, for example /admin/page. A full URL works too; only the path part is used.
- 3
Choose a user-agent
Set the user-agent token, such as Googlebot or the wildcard *. The result updates live and names the matched rule.
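For readers who want to reproduce step 2 outside the tool, reducing a full URL to its path might look like the sketch below. The helper name to_path is hypothetical, and dropping the query string and fragment is an assumption about what "only the path part" means, not a description of this tester's internals.

```python
from urllib.parse import urlsplit


def to_path(value: str) -> str:
    """Reduce a full URL or a bare path to the path used for rule matching."""
    # urlsplit separates scheme, host, path, query, and fragment.
    # Assumption: the query string and fragment are discarded.
    path = urlsplit(value.strip()).path
    # A URL with no path (https://example.com) and a path typed without
    # its leading slash are both normalised to start with "/".
    if not path.startswith("/"):
        path = "/" + path
    return path
```

With this sketch, to_path("https://example.com/admin/page") and to_path("/admin/page") both yield /admin/page, so either form can be tested against the same rules.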
Background
How robots.txt matching works
A robots.txt file is a list of groups. Each group starts with one or more User-agent lines, followed by Disallow and Allow rules that apply to those agents. A crawler picks the group that matches its own name, and falls back to the User-agent: * group if no specific group matches.
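The grouping and fallback described above can be sketched in Python. This is a minimal illustration rather than this tester's actual code: the names Group, parse_groups, and select_group are hypothetical, and the sketch assumes exact, case-insensitive token matching and that the first group naming the agent is the one used.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Group:
    agents: List[str] = field(default_factory=list)            # lower-cased tokens
    rules: List[Tuple[str, str]] = field(default_factory=list)  # ("allow" | "disallow", pattern)


def parse_groups(text: str) -> List[Group]:
    """Split robots.txt text into user-agent groups with their rules."""
    groups: List[Group] = []
    current: Optional[Group] = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            # Consecutive User-agent lines share one group; a User-agent
            # line that follows rules starts a new group.
            if current is None or current.rules:
                current = Group()
                groups.append(current)
            current.agents.append(value.lower())
        elif key in ("allow", "disallow") and current is not None:
            current.rules.append((key, value))
        # Other directives (for example Sitemap) are ignored in this sketch.
    return groups


def select_group(groups: List[Group], agent: str) -> Optional[Group]:
    """Return the group naming this agent, else the * group, else None."""
    agent = agent.lower()
    for group in groups:
        if agent in group.agents:
            return group
    for group in groups:
        if "*" in group.agents:
            return group
    return None
```

Real crawlers can be more lenient than this, for instance by merging several groups that name the same agent, so the sketch is best read as an outline of the fallback logic only.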
Longest match wins
When several rules match a path, the rule with the longest path pattern takes precedence. So Allow: /admin/public/ beats Disallow: /admin/ for a URL like /admin/public/page, because the Allow pattern is longer and more specific. If two matching rules are the same length, Allow wins the tie.
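The precedence rule can be written out as a short Python sketch. The function name decide and the (kind, pattern) tuple shape are hypothetical, and the sketch uses plain prefix matching only, leaving wildcards aside, to keep the focus on rule length and the Allow tie-break.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, str]  # ("allow" | "disallow", path pattern)


def decide(rules: List[Rule], path: str) -> Tuple[bool, Optional[Rule]]:
    """Return (allowed, deciding_rule) for a path against one group's rules."""
    best: Optional[Rule] = None
    for kind, pattern in rules:
        # An empty pattern matches nothing, so an empty Disallow blocks nothing.
        if not pattern or not path.startswith(pattern):
            continue
        if best is None:
            best = (kind, pattern)
            continue
        longer = len(pattern) > len(best[1])
        tie_goes_to_allow = len(pattern) == len(best[1]) and kind == "allow"
        if longer or tie_goes_to_allow:
            best = (kind, pattern)
    # No matching rule means the path is allowed by default.
    allowed = best is None or best[0] == "allow"
    return allowed, best
```

Given the rules Disallow: /admin/ and Allow: /admin/public/, this sketch allows /admin/public/page through the longer Allow rule and blocks /admin/settings through the Disallow rule, regardless of the order in which the rules are listed.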
Wildcards and end anchors
Most major crawlers support * as a wildcard for any run of characters, including none, and $ to anchor a rule to the end of the path. For example, Disallow: /*.pdf$ blocks every URL that ends in .pdf. This tester applies both conventions.
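One way to implement these two conventions is to translate each rule pattern into a regular expression, as in the Python sketch below. The helper names pattern_to_regex and matches are hypothetical, and the translation is an assumption about a reasonable approach, not necessarily how this tester or any particular crawler does it.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    # A trailing $ anchors the rule to the end of the path.
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape each literal piece, then join the pieces with ".*" so that
    # every * in the pattern matches any run of characters.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    # \Z matches only at the very end of the string.
    return re.compile(regex + (r"\Z" if anchored else ""))


def matches(pattern: str, path: str) -> bool:
    """True if the pattern matches the path, starting from its first character."""
    return pattern_to_regex(pattern).match(path) is not None
```

Because re.match only matches from the start of the string, an unanchored pattern behaves as a prefix rule, while a pattern ending in $ must account for the whole path.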
Use cases
When to use this tool
Before deploying robots.txt
Confirm your new rules block the paths you intend and leave important pages crawlable.
Debugging deindexed pages
Check whether a page that dropped out of search is being blocked by an overly broad Disallow.
Per-crawler rules
Test how Googlebot and the wildcard group differ when you target specific bots separately.
Staging and admin paths
Verify that admin, staging, or private directories are actually disallowed for crawlers.
SEO audits
Spot-check a competitor's or client's robots.txt to understand what they keep out of search.
Learning the syntax
Experiment with Allow, Disallow, wildcards, and groups to see how matching plays out.
Tips and best practices
- robots.txt controls crawling, not indexing. To keep a page out of search results, use a noindex meta tag and leave the page crawlable so the tag can be read.
- An empty Disallow value means allow everything for that group.
- Path matching is case-sensitive, but user-agent matching is case-insensitive.
- Keep robots.txt at the site root. A file at /folder/robots.txt is ignored.
- Do not rely on Disallow to hide secrets. Blocked URLs can still be discovered and the file is public.
Common questions
What happens if no rule matches my path?
If no Allow or Disallow rule in the relevant group matches the path, the path is allowed by default. Crawlers assume access unless a rule explicitly disallows it.
Which group applies to my crawler?
A crawler uses the group whose user-agent matches its own token. If none match, it uses the User-agent: * group. This tester follows the same fallback, and shows which group it used.
Does Disallow remove a page from Google?
No. Disallow only stops crawling. A blocked URL can still appear in search results if other pages link to it. To truly remove a page, let it be crawled and add a noindex directive instead.
How accurate is this tester?
It implements the common rules: group selection, longest-match precedence, Allow winning ties, plus * and $ wildcards. Individual crawlers may differ in rare edge cases, so treat this as a simplified check rather than the final word.
100% private