Dr. Link Check puts each link through several tests and returns one of the following results.
The link works as intended. No action is needed.
For https or http links (like https://www.example.com/), this means that the web server returned an HTTP status code in the 2xx range, indicating a successful request. The most common success status code is 200 (OK).
For data links (like data:text/plain;charset=utf-8;base64,SGVsbG8sIFdvcmxkIQ==), our crawler makes sure that the URL is syntactically valid and the data can be correctly decoded.
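If you want to reproduce this decoding step yourself, here is a minimal Python sketch (it uses the example URL from above and is only an illustration, not our crawler’s actual code):

import base64

data_url = "data:text/plain;charset=utf-8;base64,SGVsbG8sIFdvcmxkIQ=="
# Everything after the first comma is the base64-encoded payload
metadata, _, payload = data_url.partition(",")
print(base64.b64decode(payload).decode("utf-8"))  # prints: Hello, World!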
For mailto links (like mailto:mail@example.com), it is verified that the email address’s domain name actually exists and has an MX record. An MX record is a DNS entry that specifies which mail server is responsible for receiving emails for a domain. Emails sent to domains without MX records will usually be returned as undeliverable.
The link address is not properly formatted, possibly due to a typing or copy-and-pasting error.
For instance, https://www..example.com/ is flagged as an invalid URL due to the second dot. Here are a few other typical examples:
- https://www.example.com): The URL ends with a closing parenthesis instead of a slash (/).
- https://insert link here: A placeholder text was turned into an invalid http URL.
- mailto:jane@example.com&subject=Hello World: The mail subject is mistakenly delimited by & instead of ?.
Please note that Dr. Link Check is a bit more strict than browsers like Chrome or Firefox when it comes to validating the syntax of a URL. For instance, https:///www.example.com/ works in most browsers but is marked as an Invalid URL by our crawler due to the third slash (which is not allowed according to the official specification).
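If you want to run a rough syntax check of your own, Python’s standard urllib.parse module can help. The sketch below is much less thorough than our crawler’s validation, but it already flags the extra slash because the host portion of the URL comes back empty:

from urllib.parse import urlparse

parsed = urlparse("https:///www.example.com/")
# With the third slash, the network location (netloc) is empty
if parsed.scheme not in ("http", "https") or not parsed.netloc:
    print("Invalid URL")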
The link’s URL is well-formed but uses a scheme (the part at the beginning before the colon) that our crawler doesn’t support. Only http, https, data, and mailto links can be fully checked by the crawler.
Examples of unsupported schemes include:
- tel: Used to link to a phone number that is dialed when clicking the link (example: tel:+1-123-555-5555).
- javascript: Used to specify JavaScript code that is executed when clicking the link (example: javascript:alert('Hello, world!')).
- ftp: Used to link to a file on an FTP (File Transfer Protocol) server (example: ftp://speedtest.tele2.net/10MB.zip). Since FTP is deprecated in modern browsers, we recommend replacing FTP links with links to HTTP(S) servers where possible.
- file: Used to link to a file on the user’s local file system (example: file:///c:/path/to/the%20file.txt). If you see a file link on a website, it’s almost always a mistake from copy/pasting content from a local computer to the web server. In this case, update the link to point to the correct location of that content on the server.
Sometimes our crawler encounters URLs like htttps://www.example.com/ or httphttp://www.example.com/. These URLs are probably incorrect and contain a typo. Nevertheless, they are not counted as broken and instead are marked as Unsupported because in theory htttps and httphttp could be valid URL schemes supported by some app or browser extension. It’s up to you to decide whether URLs like these are intentional and correct or not.
In general, we recommend manually reviewing each link that is marked as Unsupported.
The domain name used in the URL (such as www.example.com) could not be resolved to an IP address (like 93.184.216.34 or 2606:2800:220:1:248:1893:25c8:1946).
This means that our crawler failed to find a name server responsible for the domain and query that server to get at least one A (IPv4 address) or AAAA (IPv6 address) record for the domain.
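If you want to run this kind of lookup yourself, Python’s standard socket module can do it without any external tools (www.example.com is just a placeholder):

import socket

try:
    # getaddrinfo queries DNS for both A (IPv4) and AAAA (IPv6) records
    for family, _, _, _, sockaddr in socket.getaddrinfo("www.example.com", None):
        print(family.name, sockaddr[0])
except socket.gaierror as error:
    print("Host not found:", error)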
Possible causes for this error include the following:
Please note that Host not found can be a transient error that resolves itself after some time. It’s possible that the problem has already been fixed, but the old DNS records are still cached somewhere because their Time-to-Live (TTL) has not expired yet. This error can also be caused by a temporary problem with the internet connection or an overloaded name server.
A useful website for diagnosing domain and DNS issues is GWhois.org. It simply performs a WHOIS lookup to check if a domain is registered and also displays the DNS records for the domain.
If you are not afraid of the command line, you can also use dig (macOS, Linux) or nslookup (Windows) to perform DNS lookups:
- dig www.example.com A to get the A records (IPv4 addresses), or dig www.example.com AAAA to get the AAAA records (IPv6 addresses).
- nslookup -type=a www.example.com to get the IPv4 addresses, or nslookup -type=aaaa www.example.com for the IPv6 addresses.
Our crawler could not establish a connection to the target server.
This means that there appears to be a server available at the target address, but the connection attempt fails for one of the following reasons:
Quite often, connect errors are only temporary and disappear when re-running the check.
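If you want to verify connectivity yourself, a quick TCP test in Python looks something like this (the host and port are placeholders, and this is only a rough approximation of what our crawler does):

import socket

try:
    # create_connection resolves the hostname and performs the TCP handshake
    with socket.create_connection(("www.example.com", 443), timeout=10):
        print("Connection established")
except OSError as error:
    print("Connect error:", error)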
The crawler failed to complete an SSL/TLS handshake with the target server.
An SSL/TLS handshake is the first step in establishing an HTTPS connection. During the SSL/TLS handshake, our crawler and the target server try to agree on which version of SSL/TLS and cipher they will use to encrypt and authenticate the communication. When a handshake fails, it’s typically for one of the following reasons:
If you see an SSL connect error for an outbound link, you can try contacting the owners of the website and asking them to update their servers to support a current and secure TLS version.
If your own website still only supports SSLv2 or SSLv3, you should strongly consider enabling TLS 1.2 and 1.3. This might be as simple as editing a configuration file, or it may require upgrading your web server software and SSL/TLS library.
A helpful online tool for debugging SSL configuration issues is Qualys SSL Server Test. It simulates several SSL/TLS handshakes and rates the security and quality of the server’s configuration in terms of common SSL vulnerabilities.
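To inspect an SSL/TLS handshake from your own machine, you can also use a short Python sketch like the following (the hostname is a placeholder; it simply reports the protocol version and cipher that were negotiated):

import socket
import ssl

context = ssl.create_default_context()
with socket.create_connection(("www.example.com", 443), timeout=10) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname="www.example.com") as tls_sock:
        # version() reports the negotiated protocol, e.g. "TLSv1.3"
        print(tls_sock.version(), tls_sock.cipher())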
The SSL certificate presented by the target server failed verification.
Common certificate issues include:
For a deeper analysis of a server’s SSL certificate, see Namecheap’s SSL Checker online tool.
An error occurred while transferring data between crawler and target server.
This issue typically results from a sudden interruption of the connection, possibly due to a server outage or a network hiccup.
Send/receive errors are often temporary and clear up on their own.
The target server didn’t respond in time.
A timeout error can be the result of any of the following:
The target server returned an HTTP status code outside the 2xx and 3xx range.
Every HTTP response begins with a status line, consisting of the protocol version, a numeric status code, and a text phrase that explains the status code (example: HTTP/1.1 200 OK). The status code is typically from one of the following ranges:
If the HTTP status code returned by the server is outside the range of 200 to 399, our crawler considers the link to be broken and reports an HTTP error code.
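For reference, this is roughly how such a check could be scripted with Python’s standard http.client module (the host is a placeholder):

import http.client

connection = http.client.HTTPSConnection("www.example.com", timeout=10)
connection.request("GET", "/")
response = connection.getresponse()
# status and reason mirror the status line, e.g. 200 and "OK"
if not (200 <= response.status <= 399):
    print("HTTP error code:", response.status, response.reason)
connection.close()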
Below you can find a list of the most common HTTP error status codes and their meanings.
The target server rejected the request as malformed or incorrect.
This issue is normally due to one of two causes:
- The URL contains characters that the server doesn’t accept, such as < or & (which should be encoded to %3C and %26). Sometimes servers also use this error to complain about missing required query parameters.
- There is a problem with the request’s Host header.
Even though the specification considers 400 Bad Request a client error, it doesn’t necessarily mean that the URL is incorrect or our crawler is doing something wrong. More often than not, this error is triggered by a configuration or programming issue on the server side.
The requested resource is restricted and requires authentication.
This isn’t necessarily an error that needs fixing. 401 Unauthorized is often used to indicate that a visitor is currently not logged in and therefore doesn’t have permission to access the content. This is perfectly fine as long as the website provides a way to log in.
However, it can also mean that something went wrong on the server. Microsoft’s IIS web server, for instance, reports an Unauthorized error if it can’t access a local file due to missing read permissions on the folder.
Some type of payment is required to access the resource.
Different websites use this status code in different contexts:
The server doesn’t allow access to the resource.
This error code is widely used for anything that’s “not allowed,” including the following:
The requested resource was not found.
This is by far the most common error code. It simply means that the server cannot find any content at the requested URL.
It’s possible that the resource was deleted, moved (without setting up a proper redirect), or that it was never available and the URL was incorrect in the first place.
If you want to replace a previously working link that now returns a 404 Not Found error and are struggling to find a suitable alternative, give the Wayback Machine a try. It might have an archived copy of the original page that you can link to instead.
Please note that some websites return a 404 status code in the HTTP header but still deliver normal-looking content. This typically indicates a server configuration issue.
The request method (GET, POST, etc.) is not allowed for the resource.
Our crawler usually sends a GET request to a server to ask for the desired document. If the server responds with 405 Method Not Allowed, it means that GET is not an appropriate HTTP method for this resource and that a POST, PUT, PATCH, or DELETE request was expected instead.
We sometimes see this error with websites that mistakenly use an <a href> element to link to a form URL that requires data to be POSTed and should have been used within a <form> element.
The resource is not available in the requested form.
When our crawler makes a request, it sends various Accept headers indicating the type, encoding, and language of the content it would prefer to receive from the server. The headers typically look like this:
Accept: text/html,application/xhtml+xml,*/*
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
If the server cannot produce a response matching the acceptable formats and is unwilling to supply a default representation, it returns a 406 Not Acceptable error.
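To see this behavior in action, you could send the same kind of headers yourself. Here is a small sketch using the third-party requests library (the URL is a placeholder):

import requests  # third-party HTTP library

headers = {
    "Accept": "text/html,application/xhtml+xml,*/*",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "en-US,en;q=0.9",
}
response = requests.get("https://www.example.com/", headers=headers, timeout=10)
print(response.status_code)  # 406 if the server cannot honor the Accept headers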
The server timed out waiting for the request to complete.
This means that the server didn’t receive a full HTTP request within the time it was prepared to wait. Consequently, the server gave up and killed the connection.
Possible reasons for this error include an unstable network connection, an overloaded system, or a server configuration problem. Quite often, this is a temporary issue that resolves on its own after some time.
The request could not be processed due to a conflict on the server.
This status code is supposed to be used in cases where the request conflicts with another request or with the server configuration. For example, you may see this error when multiple simultaneous updates cause version conflicts. Ideally, the response should include enough information for the user to resolve the issue.
In practice, we most often see this error with websites served by Cloudflare (a large content delivery network) when there’s a problem/conflict in the domain’s DNS settings.
The requested resource is no longer available and will not be available again.
This error is a more specific version of 404 Not Found. It indicates that a resource was available in the past but has intentionally been removed and will no longer be available at any URL.
If an outgoing link on your website generates this error, it’s best to remove the link or replace it with a link to a different site.
The request URL is too long for the server to process.
Some web servers impose a limit on the length of URLs they are willing to accept. If a request for a URL longer than that limit comes in, they refuse the request with a 414 URI Too Long error.
This error often occurs with dynamically generated pages that contain dynamically generated URLs leading to other pages. If each generated page takes the current URL and appends something new to it, the resulting URLs get longer and longer.
The request was sent to the wrong server.
This error indicates that the server is unable or unwilling to generate responses for the requested combination of scheme, host, and port.
For instance, some web servers use the 421 code to signal that access is no longer available via http and further requests should use https instead.
The requested resource is locked.
This status code is only intended to be used with WebDAV (a protocol for transferring files) and shouldn’t occur when browsing the web. Nevertheless, we sometimes see 423 Locked errors on websites that have been temporarily suspended by their web hosting companies.
Too many requests were sent in too short a period of time.
Many web servers have rate limits in place that affect how many requests a client is allowed to make in a given amount of time. If the rate limit is exceeded, a 429 status code is returned.
When our crawler comes across a 429 code, it doesn’t consider the link to be broken, but instead marks it as blocked.
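If you run into rate limits with your own scripts, a common approach is to honor the Retry-After response header. Here is a minimal sketch using the third-party requests library (the URL and the fallback delay are placeholders):

import time
import requests  # third-party HTTP library

response = requests.get("https://www.example.com/", timeout=10)
if response.status_code == 429:
    # Retry-After usually holds the number of seconds to wait (it can also be a date)
    wait_seconds = int(response.headers.get("Retry-After", "60"))
    time.sleep(wait_seconds)
    response = requests.get("https://www.example.com/", timeout=10)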
The server refused to fulfill the request for legal reasons.
The code 451 signals censorship and is a reference to the dystopian novel Fahrenheit 451, where books are banned and burnt. It is typically used in one of the following situations:
The server encountered an unexpected error.
This is a generic “catch-all” error that is used for all kinds of different server problems, including the following:
The server lacks the functionality required to fulfill the request.
This error is often used to indicate that a resource or function is not available yet but planned for later. In that case you can consider it a “coming soon” response.
Some web hosting companies return a 501 Not Implemented status if a customer’s domain or website has not yet been fully set up.
The server was acting as a proxy for a different server, and that server didn’t respond as expected.
It’s a common setup to have a proxy server accept incoming requests and forward them to one or more other servers. A specific example is an NGINX web server that proxies requests and passes them on to a PHP-FPM service running a PHP application. Another example is a load balancer that distributes incoming traffic across multiple web servers.
If the origin server returns an invalid response (like a malformed HTTP header) or is not available (due to being overloaded or down for maintenance), the proxy typically returns a 502 Bad Gateway error back to the requester.
The server is temporarily unable to process the request.
It’s possible that the server is overloaded, down for maintenance, or has some other problem that is expected to be resolved soon.
We sometimes also see this status code being used to indicate that a request has been blocked. This can be the case if the server flagged our crawler as an unwanted bot.
The server was acting as a proxy for a different server, and that server didn’t respond in time.
This error is similar to 502 Bad Gateway, except that 502 is typically returned when the proxy received an invalid response, while 504 should be used to indicate that the origin server didn’t respond at all (within the time the proxy was willing to wait).
Possible reasons are that the upstream server is currently overloaded or temporarily down for maintenance.
The website has been temporarily suspended for exceeding its allowed traffic limit.
Status code 509 is often used by web hosting providers that limit the amount of bandwidth customers can use for their sites. If this limit is reached, the website is automatically suspended for the remainder of the customer’s billing period.
The origin server returned an unknown error.
This error code is not officially specified anywhere, but is used by several CDNs (content delivery networks) to indicate an unspecific problem with the origin server that the request was forwarded to. It’s possible that the origin server unexpectedly reset the connection or that it returned an invalid HTTP response to the CDN servers.
The origin server refused the connection.
This status code is not specified by any standard, but it’s used by Cloudflare to indicate that the origin server (that the request was supposed to be forwarded to) is down or blocking requests from the Cloudflare network. Specifically, it means that Cloudflare tried to connect to the origin server but received a connection refused error.
The attempt to connect to the origin server timed out.
This is another unofficial status code used by Cloudflare. It signals that the Cloudflare server couldn’t establish a full TCP connection to the origin web server within the time it was prepared to wait. There might be a firewall in place that is blocking Cloudflare’s requests, or the server is currently overloaded and unresponsive.
The origin server could not be reached.
This status code is not defined in any official specification but is Cloudflare-specific. If Cloudflare servers respond with this code, it typically means that the DNS records for the origin server (which was supposed to provide the actual response) are incorrect.
The origin server didn’t provide an HTTP response in time.
This Cloudflare-specific status code indicates that Cloudflare successfully connected to the origin server, but it didn’t reply with an HTTP response within the time the Cloudflare server was willing to wait (typically 100 seconds).
Most likely the origin web server is overloaded and therefore too slow or unable to respond.
The SSL/TLS handshake with the origin server failed.
This response code is used by Cloudflare when one of their servers fails to negotiate an SSL/TLS handshake with an origin web server.
A missing SSL certificate is a common cause of the 525 SSL Handshake Failed error.
The origin’s SSL certificate could not be validated.
Cloudflare servers return this error when they are unable to successfully validate the SSL certificate presented by the origin web server.
This typically happens for one of the following reasons:
The connection between Cloudflare and the origin’s Railgun Listener was interrupted.
Railgun is a service that speeds up the delivery of dynamic content from an origin server to the Cloudflare network. Taking advantage of the fact that dynamic content is usually mostly static, Railgun compares the generated content to its previous version and sends only the changes. For this to work, a so-called “Listener” needs to be installed on the origin server. This Listener communicates with the Sender component that runs on the Cloudflare servers all over the world.
Error 527 indicates an interrupted connection between the Sender (on a Cloudflare server) and the Listener (on the origin server). Common causes include TLS/SSL-related errors, firewall blocks, and other network issues.
An error occurred with the proxied website.
530 is used by Cloudflare as a general error code for different types of issues. The specific error is included in the response and displayed in the browser when visiting the URL.
We see the 530 status code mostly being used to indicate an “Origin DNS error” (Cloudflare-specific error code 1016). This occurs when Cloudflare is unable to resolve the origin server’s IP address via DNS.
The server returned an unrecognized HTTP status code.
Although there is an official registry of HTTP status codes, which is maintained by the Internet Assigned Numbers Authority (IANA), nothing prevents programmers from inventing and using their own 3-digit codes.
If the code returned by the server is not in the 2xx or 3xx range, our crawler considers it to be an error code and reports the corresponding link as broken.
The link was redirected more than 20 times.
A redirect occurs when the web server responds with a redirect HTTP status code (301, 302, 303, 307, or 308) and gives our crawler a new URL at which it can find the requested resource. Our crawler then sends a new request to this URL, to which the server might again respond with a redirect instruction. If this goes on and on, our crawler gives up after following 20 redirects and reports a Too many redirects error.
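You can reproduce a comparable limit in your own tests, for example with the third-party requests library (the URL is a placeholder, and 20 mirrors the limit described above):

import requests  # third-party HTTP library

session = requests.Session()
session.max_redirects = 20  # stop following redirects after 20 hops
try:
    session.get("https://www.example.com/", timeout=10)
except requests.TooManyRedirects:
    print("Too many redirects")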
This issue is often caused by a redirect loop, where a page either redirects to itself or redirects to another page that then redirects back to the original page. If you are getting this error for your own website, you should check your web server’s configuration for errors. Here are a few pointers that may help your troubleshooting efforts:
- If your server runs Apache, check for a faulty RewriteRule, possibly the one for redirecting http:// URLs to https:// URLs.
- If your server runs NGINX, review the configuration’s rewrite and return directives.
The encoding of the HTTP response body could not be recognized.
A web server typically includes a Content-Encoding header in the response to indicate whether and how the data is compressed (example: Content-Encoding: gzip). If no Content-Encoding header is provided, it is assumed that the response body is uncompressed plain text.
The following Content-Encoding values are supported by our crawler:
- identity, none: No compression is used.
- deflate: The data is compressed using the deflate algorithm as implemented by zlib.
- gzip, x-gzip: The data is compressed using the gzip algorithm. This is the most common server compression method.
- br: The data is compressed using the Brotli algorithm.
Our crawler reports a Bad content encoding error in the following situations:
- The server returned a Content-Encoding value not included in the list above. We sometimes see servers returning Content-Encoding: UTF-8, which is apparently the result of a mix-up of Content-Encoding and Content-Type.
In any case, this issue is usually caused by a configuration or programming error on the server side.
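For illustration, this is roughly how a client can decompress a response based on its Content-Encoding header using only Python’s standard library (handling br would additionally require a third-party Brotli package; the URL is a placeholder):

import gzip
import urllib.request
import zlib

request = urllib.request.Request(
    "https://www.example.com/",
    headers={"Accept-Encoding": "gzip, deflate, identity"},
)
with urllib.request.urlopen(request, timeout=10) as response:
    body = response.read()
    encoding = response.headers.get("Content-Encoding", "identity")
    if encoding in ("gzip", "x-gzip"):
        body = gzip.decompress(body)
    elif encoding == "deflate":
        body = zlib.decompress(body)  # deflate as produced by zlib
    elif encoding not in ("identity", "none"):
        raise ValueError("Bad content encoding: " + encoding)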
The link appears to be part of a so-called crawler trap.
A crawler trap is an issue with a website that causes a crawler to keep discovering a never-ending number of new links as it navigates from page to page and follows each new link. Below are three common examples:
Our crawler attempts to detect traps like these by looking for characteristic patterns in a link’s URL structure and the overall structure of the website. The algorithm is designed to identify as many crawler traps as possible, while not classifying legitimate links as traps. If you think a link was mistakenly flagged, please get in touch to let us know.
In general, there are good reasons for avoiding crawler traps:
It’s always best to resolve crawler traps by making changes to the website’s code. If that’s not possible or too cumbersome, you can instead block the trap URLs in the website’s robots.txt file. For instance, if your online store uses query parameters named “category” and “color” for filtering (as in https://example.com/products?category=shoes&color=black), you can instruct crawlers to ignore filter links with the following robots.txt instructions:
User-agent: *
Disallow: /*?*category=
Disallow: /*?*color=
There is no mail server configured for the email address’s domain name.
When our crawler discovers a mailto link (like mailto:john.doe@example.com), it performs a DNS query to retrieve the MX records for the recipient’s domain name. An MX (short for mail exchange) record contains the address of the mail server responsible for handling emails for a domain. If there is no MX record present, it’s likely that emails sent to the specified address will bounce.
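You can run the same kind of MX lookup yourself, for example with the third-party dnspython package (example.com is a placeholder domain):

import dns.resolver  # third-party dnspython package

try:
    answers = dns.resolver.resolve("example.com", "MX")
    for record in answers:
        print(record.preference, record.exchange)
except dns.resolver.NoAnswer:
    print("No MX record for this domain")
except dns.resolver.NXDOMAIN:
    print("The domain does not exist")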
There was an error of unknown type when checking the link.
This is an error you should ideally never see, because it means that something unexpected happened. It’s possible that the target server sent an invalid response that our crawler didn’t know how to handle or that the crawler crashed while checking the link.
If you get an Unknown error and suspect the problem to be on our crawler’s side, please let us know and we will look into it.
The link is blacklisted for hosting phishing or malware content.
Phishing is a scheme where scammers clone the websites of well-known organizations (like banks or e-commerce sites) in order to lure visitors into entering their login credentials. Malware is any kind of software that is designed with malicious intent, such as viruses, ransomware, or spyware.
Websites hosting phishing or malware attacks are often hacked without the owners ever knowing that their servers have been compromised. If you want to inspect a blacklisted website, be careful, because it might try to exploit vulnerabilities in your browser.
In order to determine if a URL is safe, our crawler checks it against up to four different blacklists (depending on your subscription plan):
Please note: If a URL appears on a blacklist, it doesn’t necessarily mean that it’s actually dangerous. There is always a chance that a safe website is mistakenly identified as risky.
Even though the server responded with a success code (such as 200 OK), this link is considered broken based on the page’s content.
The link points to a domain or website for sale.
This often happens when a domain name’s registration expires and the domain is purchased by a professional domain investor. The domain is then used to host a “For sale” page that lets visitors know that the domain name is available to buy.
The link points to a parked domain filled with nothing but ads.
This is typically the result of a domain expiring and being purchased by someone else. The new owner then monetizes the existing traffic by serving ads to visitors.
Even if the original content on the domain is no longer available, parked domains still return 200 OK. The domain parkers clearly don’t want their domains to be identified as parked, because that would put them at risk of losing valuable backlinks and organic traffic.
The link points to a page with placeholder content.
This can be the default page that comes with the web server or a page with a generic “Coming Soon” or “Under Construction” message.
The link points to a website that is no longer in service.
It’s possible that the domain name registration expired, the hosting account was suspended, or maybe the website owner simply shut down the site and put up a “Closed” sign.
The link points to a page with no or little content.
A completely empty page together with a 200 OK HTTP status code is often the result of a configuration error on the server.
We also sometimes see pages with nothing more than a “Test” or “Hello World” message. Even if these pages are set up intentionally, they are not worth linking to.
The link points to a default page that lists the contents of the current directory on the server.
It’s possible to configure web servers (like Apache or NGINX) to automatically list the content of directories that don’t have an index file.
Although there can be reasons to provide directory listings, more often than not a directory listing page is an indicator of something missing on the server, which is why we consider it an error.
The linked page looks like an error page for a 4xx or 5xx HTTP status code (such as “404 Not Found”, “500 Internal Server Error” or “503 Service Unavailable”).
This means that the web server sent a 2xx success status code in the header, but the text in the response body indicates there has been some kind of server error.
For example, some sites incorrectly send a 200 OK status even though the message displayed in the browser clearly states that the requested file is not available. This is called a soft 404.
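A very naive version of such a content check, nowhere near our crawler’s actual heuristics, could look like this (the URL and keyword list are just placeholders):

import requests  # third-party HTTP library

response = requests.get("https://www.example.com/some-page", timeout=10)
body = response.text.lower()
markers = ("404", "not found", "page cannot be found")
# A 200 response whose body talks about a missing page is a likely soft 404
if response.status_code == 200 and any(marker in body for marker in markers):
    print("Possible soft 404")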
The link could not be checked because the target server blocked our crawler’s request.
Many websites have measures in place to identify and block unwanted bot traffic. Unfortunately, this means that sometimes our crawler gets blocked as well. In cases like these, the server typically returns an error message (like Request denied or Too many requests) and one of the following HTTP status codes:
Most of the time, the blocking happens at one of these levels:
If you notice that your website blocks our crawler, you can try the following: