Supported Wildcards
1. * – Matches any number of characters
Disallow: /private/*
➡ Blocks all URLs starting with /private/ (e.g., /private/data, /private/files/image.png).
2. $ – Anchors the pattern to the end of the URL
Disallow: /*.pdf$
➡ Blocks all .pdf files anywhere on the site.
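To make the two wildcards concrete, here is a minimal sketch of how a wildcard-aware crawler might evaluate a rule. It assumes Google-style semantics (rules match from the start of the path, `*` matches any run of characters, a trailing `$` anchors the end). The function names `robots_pattern_to_regex` and `rule_matches` are illustrative, not taken from any crawler's code.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*', a trailing '$' anchors the end of the URL,
    and every other character is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal chunk, then rejoin the chunks with '.*'.
    body = ".*".join(re.escape(chunk) for chunk in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rule_matches(pattern: str, url_path: str) -> bool:
    # re.match anchors at the start of the string, giving prefix semantics.
    return robots_pattern_to_regex(pattern).match(url_path) is not None


if __name__ == "__main__":
    print(rule_matches("/private/*", "/private/files/image.png"))
    print(rule_matches("/*.pdf$", "/docs/report.pdf"))
    print(rule_matches("/*.pdf$", "/docs/report.pdf?download=1"))
```

Under these assumptions the `$` anchor means `/*.pdf$` stops matching as soon as a query string follows the extension.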
❌ Not Supported by All Bots
While Googlebot and Bingbot support * and $, other bots might not:
| Bot Name | Wildcard Support | Notes |
|---|---|---|
| DuckDuckBot | Partial/Unknown | No official wildcard documentation. Likely follows standard rules only. |
| YandexBot | ✅ Yes | Supports both * and $ (per Yandex docs). |
| Baiduspider | ✅ Yes | Baidu's webmaster documentation describes support for * and $. Verify against a live crawl before relying on it. |
| Sogou Spider | ❌ No wildcard support | Ignores advanced rules. Known for aggressive crawling. |
| AhrefsBot / SemrushBot | ❌ No clear support | Respect disallow directives but typically do not interpret wildcards. |
| MJ12bot (Majestic) | ❌ No wildcard support | Follows basic syntax only. |
| Applebot | ✅ Partial | Supports basic patterns, but $ may not be recognized. |
| archive.org_bot | ❌ No wildcard support | No support for *; only respects plain Disallow paths. |
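To see why this matters, the sketch below models a hypothetical crawler that does strict string matching and treats every rule as a literal prefix. The function `literal_prefix_match` is an assumption made for illustration; real bots may behave differently.

```python
def literal_prefix_match(rule: str, url_path: str) -> bool:
    """Model a crawler without wildcard support: the rule is a plain prefix."""
    return url_path.startswith(rule)


if __name__ == "__main__":
    checks = [
        ("/private/*", "/private/data"),   # '*' is read as a literal character
        ("/*.pdf$", "/docs/report.pdf"),   # neither '*' nor '$' is interpreted
        ("/temp/", "/temp/cache.html"),    # a plain prefix still works
    ]
    for rule, path in checks:
        print(f"{rule!r} vs {path!r}: blocked={literal_prefix_match(rule, path)}")
```

For such a bot, a plain prefix like `/private/` is the safer way to write the rule, since it reads the same with or without wildcard support.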
🧠 Common Use Cases
Case 1: Block tracking parameters
Disallow: /*?ref=
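➡ Blocks URLs like /page?ref=affiliate and /page?ref=123&other=456, because rules match as prefixes. It does not block /page?other=456&ref=123, where ref is not the first parameter.
✅ Better:
Disallow: /*?ref=
Disallow: /*&ref=
➡ Blocks the ref parameter wherever it appears in the query string. A trailing * (as in /*?ref=*) is harmless but redundant.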
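Case 2: Prevent crawling of file types
Disallow: /*.zip$
Disallow: /*.exe$
➡ Blocks crawling of .zip and .exe files. Note that robots.txt controls crawling, not indexing, so a blocked URL can still be indexed if other pages link to it.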
Case 3: Block specific folders
Disallow: /temp/
Disallow: /dev/*
➡ Blocks everything under /temp/ and everything under /dev/. The trailing * in /dev/* is redundant, because rules already match as prefixes.
Case 4: Allow certain paths while disallowing broader ones
Disallow: /images/
Allow: /images/public/
➡ Blocks /images/ but allows /images/public/logo.png.
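When an Allow and a Disallow rule both match, Google's documentation and RFC 9309 describe picking the most specific (longest) rule, with Allow winning a tie. The sketch below illustrates that precedence. The helper names `_matches` and `is_allowed` are hypothetical, and other crawlers may resolve conflicts differently.

```python
import re


def _matches(pattern: str, path: str) -> bool:
    """Wildcard-aware prefix match ('*' = any characters, trailing '$' = end)."""
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in core.split("*"))
    return re.match(regex + ("$" if anchored else ""), path) is not None


def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """rules is a list of ("allow" | "disallow", pattern) pairs."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if not _matches(pattern, path):
            continue
        is_allow = kind == "allow"
        # The longest pattern wins; on equal length, Allow takes precedence.
        if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
            best_len, allowed = len(pattern), is_allow
    return allowed


if __name__ == "__main__":
    rules = [("disallow", "/images/"), ("allow", "/images/public/")]
    for path in ("/images/public/logo.png", "/images/raw/photo.png", "/about"):
        print(path, "allowed" if is_allowed(rules, path) else "blocked")
```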
Case 5: Catch all dynamic URLs
Disallow: /*?*
➡ Blocks all URLs that include any query string.
Edge Cases to Watch
❌ Incorrect wildcard inside query:
Disallow: /?ref=*
This matches only the site root with a ref parameter (e.g., /?ref=affiliate), not /page?ref=affiliate. To match the parameter on any path, put a wildcard before the ?:
Disallow: /*?ref=*
🛠️ Testing & Tools
Always test your rules with a robots.txt tester before deploying them.
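Alongside an online tester, a small local check can catch regressions before a robots.txt change ships. The sketch below is one way to do it, assuming Google-style wildcard semantics. The `blocked` matcher, the `EXPECTATIONS` table, and the sample URLs are all made up for illustration.

```python
import re


def blocked(pattern: str, path: str) -> bool:
    """Return True if a Disallow pattern matches the path (Google-style rules)."""
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in core.split("*"))
    return re.match(regex + ("$" if anchored else ""), path) is not None


# (pattern, sample URL path, should it be blocked?)
EXPECTATIONS = [
    ("/*?ref=", "/page?ref=affiliate", True),
    ("/?ref=*", "/page?ref=affiliate", False),  # only matches the site root
    ("/*.zip$", "/downloads/app.zip", True),
    ("/*?*", "/search?q=robots", True),
    ("/*?*", "/search", False),
]


def run_checks(expectations):
    """Return the rows whose actual result differs from the expected one."""
    return [row for row in expectations if blocked(row[0], row[1]) != row[2]]


if __name__ == "__main__":
    failures = run_checks(EXPECTATIONS)
    print("all checks passed" if not failures else f"failures: {failures}")
```

A check like this only confirms the intended semantics; it does not show how a particular crawler interprets the file.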
✅ Recap Cheatsheet
| Pattern | Matches | Use Case |
|---|---|---|
| * | Any characters | Wildcards in path or query |
| $ | End of URL | Block file extensions |
| /*?* | All query strings | Block dynamic URLs |
| /*?utm_* | Query strings that start with a utm_ parameter | Block marketing parameters |
Learn more about when to use wildcards and when not to.