What is a 4xx Error?
A "4xx Error" is a URL that returns an HTTP status code between 400 and 499. Statuses in this range are all client errors, which means the problem lies with the request rather than with the server. There are a number of different errors in this range, but the vast majority are 404, 403, or 401.
404 (Not Found): A 404 means that no page exists at the URL specified. These errors are typically the result of one of three problems:
- There’s a typo in the href value in the anchor tag of the link
- The page no longer exists or has moved to a different URL
- The page's actual URL or filename contains a typo, so it doesn’t match the href value in the anchor tag
403 (Forbidden): This error means that the client is not authorized to access the requested content. The most common reason is that the URL is a directory listing (not page content), and directory browsing is not permitted by the web server.
On occasion some servers will return a 403 if you do not have the proper authentication to access the content, though the correct status code for this is 401. A 403 error should be issued when the URL is forbidden whether or not the user provides valid authentication.
401 (Unauthorized): These are URLs that require authentication (logging in) in order to see the content. If you try to access content that is behind a login page, you may receive this message.
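As a rough illustration of the definitions above, the following sketch classifies a status code and, optionally, looks up the status of a live URL. The names is_4xx, describe, and fetch_status are made up for this example. fetch_status assumes the server answers HEAD requests; it also follows redirects, so it reports the status of the final URL.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Short descriptions of the three most common client errors.
COMMON_4XX = {
    401: "Unauthorized: authentication is required",
    403: "Forbidden: access is refused regardless of authentication",
    404: "Not Found: no page exists at this URL",
}


def is_4xx(status: int) -> bool:
    """Return True for any client-error status (400-499)."""
    return 400 <= status <= 499


def describe(status: int) -> str:
    """Describe a status code in the terms used in this article."""
    if not is_4xx(status):
        return "not a 4xx error"
    return COMMON_4XX.get(status, "other client error")


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url (this makes a network request)."""
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses;
        # the status code is available on the exception.
        return err.code
```

Calling describe(fetch_status(url)) on a suspect URL gives a quick first indication of which of the three cases applies.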
Why It's Important
4xx errors on your site are a big deal. They are perhaps the highest-priority issue on your site and should come first in your SEO efforts.
First of all, the page's content is not available to search engines, so the page will not be included on search engine results pages and has no chance of receiving organic traffic. Secondly, when search engines notice that a URL returns a 4xx error, they will typically remove the page from their index. It can sometimes be a challenge to get these URLs re-indexed once fixed. Lastly, search engines may regard sites with many 4xx errors poorly, and as a result may lower the site's rankings or the number of its pages they index.
How To Fix
Luckily, 4xx errors usually aren't too difficult to fix. The method will depend on what kind of problem caused the error.
All 4xx Errors (Especially 404 Errors)
You'll first want to look at each URL and determine whether the page should exist at this URL or not. This usually shouldn't be too hard - check for obvious typos in the URL or try to see if the page has moved to a new URL.
If you can find typos in the URL, you'll want to correct them in every place you linked to this URL. You can find the pages linking to this URL by clicking on the expand arrow on the left side of the row or by clicking on the number of internal links to the page for the row. Now you just need to find the place in the page's source code that has this link with the typo, and correct it.
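If the linking page's source is long, a small script can help locate the offending anchor tags. This is only a sketch using Python's standard html.parser module; LinkFinder and find_links are names invented for this example, and the href is compared exactly as written in the source, with no URL normalization.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Record the position of every <a> tag whose href equals target."""

    def __init__(self, target: str):
        super().__init__()
        self.target = target
        self.positions = []

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href") == self.target:
            # getpos() returns (line, column) of the tag being handled;
            # lines are counted from 1, columns from 0.
            self.positions.append(self.getpos())


def find_links(html: str, target: str):
    """Return the (line, column) positions of links pointing at target."""
    finder = LinkFinder(target)
    finder.feed(html)
    finder.close()
    return finder.positions
```

Passing the page source and the broken URL to find_links returns the positions in the source where the link needs correcting.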
If the page has moved, the best way to fix it is to create a 301 permanent redirect from the old URL to the new URL. If this can’t be done, you’ll need to find all of the links to the old URL and change them to point to the new URL.
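In practice a 301 redirect is usually set up in the web server or CMS configuration, and the exact syntax depends on the software in use. Purely to show what the redirect does at the HTTP level, here is a minimal sketch built on Python's standard http.server module. The paths in REDIRECTS and the names redirect_target, RedirectHandler, and serve are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping of moved pages: old path -> new path.
REDIRECTS = {
    "/old-page.html": "/new-page.html",
}


def redirect_target(path: str):
    """Return the new location for a moved path, or None if it has not moved."""
    return REDIRECTS.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Note: self.path includes any query string, which this sketch ignores.
        target = redirect_target(self.path)
        if target is not None:
            # 301 tells browsers and crawlers the move is permanent.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()
        else:
            self.send_error(404, "Not Found")

    # Answer HEAD requests with the same status and headers.
    do_HEAD = do_GET


def serve(port: int = 8000) -> None:
    """Run the demo server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

With serve() running, a request for /old-page.html receives a 301 response whose Location header points to /new-page.html, which is what tells crawlers to follow the move rather than record a 404.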
If there are no typos in the linked URL and the page has not moved, it's likely that the typo is in the page's actual URL on the server. Check the file name and directory structure on the server, or if you are using URL rewrites, verify that the rewrite rules contain no typos.
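When the site has many files, comparing the broken URL against the paths that actually exist can surface near-misses quickly. The sketch below uses difflib.get_close_matches from Python's standard library. The function name suggest_paths and the 0.8 similarity cutoff are choices made for this example, and the list of existing paths is assumed to come from your own server or sitemap.

```python
import difflib


def suggest_paths(broken_path: str, existing_paths: list[str], limit: int = 3) -> list[str]:
    """Return existing paths that closely resemble broken_path, best match first."""
    # cutoff=0.8 keeps only near-identical strings; lower it to see looser matches.
    return difflib.get_close_matches(broken_path, existing_paths, n=limit, cutoff=0.8)
```

For example, with hypothetical paths, a link to /products/widgits.html would be matched to an existing /products/widgets.html, which suggests a one-letter typo on one side or the other.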
Ensure that the URL refers to a page with content, not a directory listing. This is the most common cause of 403 errors. If the URL is a directory listing, you'll want to change any links pointing to this URL or create a 301 permanent redirect to a new URL.
In some cases, a web server might be blocking or not accepting requests from Dragonbot, and the server could return a 403. If this is the case, you will have to modify your server settings to allow requests from Dragonbot.
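One way to test for this is to request the same URL with two different User-Agent headers and compare the results. The sketch below is an assumption-laden illustration: the User-Agent strings are placeholders (the crawler's real string is not given here and must be substituted), and the function names are invented for this example. A server may also block by IP address rather than by User-Agent, which this check would not detect.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Placeholders only: replace with the crawler's real User-Agent string
# and a User-Agent copied from an ordinary browser.
CRAWLER_UA = "ExampleCrawler/1.0 (placeholder)"
BROWSER_UA = "Mozilla/5.0 (placeholder browser)"


def build_request(url: str, user_agent: str) -> Request:
    """Build a GET request that identifies itself with user_agent."""
    return Request(url, headers={"User-Agent": user_agent})


def status_for(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the status code the server gives this User-Agent (network request)."""
    try:
        with urlopen(build_request(url, user_agent), timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError by urllib.
        return err.code


def blocked_for_crawler(url: str) -> bool:
    """True if the crawler User-Agent gets a 403 while the browser one does not."""
    return status_for(url, CRAWLER_UA) == 403 and status_for(url, BROWSER_UA) != 403
```

If blocked_for_crawler returns True for a URL, the server settings (or a firewall or security plugin in front of the server) are the likely place to look.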
For 401 errors, check whether the page should be available without logging in. If it should be, you'll have to modify your web server's settings or application code to allow unauthenticated users to access this page. If the page is meant only for logged-in users, consider excluding it using the robots.txt file.
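Before relying on a robots.txt rule, it can be worth checking that it excludes what you intend. The sketch below uses Python's standard urllib.robotparser module. The /account/ directory, the example.com URLs, and the is_allowed helper are all hypothetical and stand in for your own login-only section.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that excludes a login-only section for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, a URL under /account/ is reported as disallowed while the rest of the site remains crawlable. Note that this only tests how Python's parser reads the file; individual crawlers may interpret edge cases differently.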