Exclude URLs is a crawl option in Dragon Metrics that lets you instruct our crawlers not to crawl certain URLs on your sites.
This feature can be useful for a few main reasons:
- Faster data retrieval - Because our crawler doesn't need to crawl as many URLs, with URL exclusions you'll get crawl data faster every time we crawl your site
- More efficient limit usage - Every account in Dragon Metrics has a limit on how many URLs it can crawl. By excluding URLs, you can save crawl limits for future use
- Skipping duplicate pages - If you have many duplicate pages on your site which are properly canonicalized, you may want to skip crawling them to use your crawl budget more effectively.
URL exclusions are set at the campaign level and can be found on the following pages:
Excluding URLs in Site Auditor
In the Site Auditor, click "Crawl Options" in the top right of the page.
Next to "Excluded URLs", click the "Manage" button.
You can set up multiple rules to exclude URLs. Rules work together (a URL is excluded if it matches any rule), so you can exclude larger portions of the site, and there's no limit to the number of rules you may add.
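To make the combining behavior concrete, here is a minimal, hypothetical sketch of how multiple exclusion rules act as a union: a URL is skipped if any single rule matches it. The rule logic below is illustrative only, not Dragon Metrics' actual implementation.

```python
# Hypothetical sketch: a URL is excluded when ANY rule matches.
# These example rules (subdirectory and individual-URL) are illustrative,
# not Dragon Metrics' internal matching logic.

def is_excluded(url: str, rules: list) -> bool:
    """Return True if any exclusion rule matches the URL."""
    return any(rule(url) for rule in rules)

rules = [
    lambda url: "/blog/" in url,                   # subdirectory rule
    lambda url: url == "https://example.com/old",  # individual-URL rule
]

print(is_excluded("https://example.com/blog/post-1", rules))  # True
print(is_excluded("https://example.com/about", rules))        # False
```

Because the rules combine as a union, adding a rule can only exclude more URLs, never fewer.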
In the first screen, you'll be presented with a table showing all the current rules in effect. To create a new rule, click on the + icon button:
After clicking the + button, another popup window will allow you to exclude URLs by subdirectory, subdomain, or individual URL. Click Show Advanced Options to exclude by URL parameters or Regular Expression:
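The two advanced rule types can be sketched as follows. This is a simplified, hypothetical illustration of what parameter-based and regex-based exclusion mean; Dragon Metrics' actual matching may differ.

```python
import re
from urllib.parse import urlparse, parse_qs

# Hypothetical sketches of the advanced rule types.

def excluded_by_parameter(url: str, param: str) -> bool:
    """Exclude any URL carrying the given query parameter (e.g. session IDs)."""
    return param in parse_qs(urlparse(url).query)

def excluded_by_regex(url: str, pattern: str) -> bool:
    """Exclude any URL matching a regular expression."""
    return re.search(pattern, url) is not None

print(excluded_by_parameter("https://example.com/p?sessionid=abc", "sessionid"))  # True
print(excluded_by_regex("https://example.com/print/page-2", r"/print/"))          # True
```

Parameter rules are useful for tracking or session parameters that create many URL variants of the same page, while regex rules cover patterns the simpler options can't express.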
Choose the option you prefer and click Save. You can then test URLs to see whether they'd be excluded, or add additional rules.
When you are finished adding rules and testing URLs, simply close the modal and click Save. Changes will take effect on the next crawl, or you can request an immediate re-crawl of the site.