Links are the very foundation of the web. They connect web resources with each other, let visitors navigate between pages, and allow pages to reference images and other content.
Unfortunately, unlike diamonds, links are not forever. They have a tendency to break over time. Companies go out of business, servers are shut down, blog posts get deleted, domains expire… the web is dynamic, and there are lots of reasons why a link that works today might stop working tomorrow.
At best, a broken link is merely annoying and results in a poor user experience. At worst, it can pose a security threat to anyone visiting the website.
Imagine what could happen if Google shut down its Analytics service and later let the google-analytics.com domain expire. Millions of websites would be left with obsolete script code that attempts to load and run code from https://www.google-analytics.com/analytics.js. A third party could snatch up the expired domain and serve malicious JavaScript code under this URL. This is one form of an attack called Broken Link Hijacking.
Broken Link Hijacking is an exploit in which an attacker gains control over the target of a broken link.
Typical candidates for link hijacking include:
Depending on how the hijacked link is embedded into the website’s code, there are different ways to exploit the vulnerability, with varying levels of risk.
If you have embedded an external script into your website (using code like this: <script src="https://example.com/script.js"></script>) and the link’s domain name gets taken over, an attacker can inject arbitrary code into the site.
You might ask what harm could come from some extra JavaScript code. The answer is plenty. Here are a few examples of how an attacker could exploit this vulnerability:
The possibility to execute attacker-supplied code basically makes this a Stored Cross-Site Scripting (XSS) vulnerability, which Bugcrowd classifies as a P2 (high risk) issue.
A hijacked link to an image (<img src="https://example.com/image.jpg">) or style sheet (<link href="https://example.com/styles.css" rel="stylesheet">) is not as bad as a hijacked script link, but can still have serious security implications:
background: url("https://example.net/hacked.gif")) and to inject text (body::before { content: "HACKED!" }).
Attacks like these are often referred to as defacement or content spoofing and typically fall into Bugcrowd’s P4 (low risk) category.
It’s also worth noting that each request made to an attacker-controlled external server leaks information about both the website and the visitor. The attacker is able to track who visits the site (IP address, browser user-agent, referring website) and how often.
When you link to an external page from your site (<a href="https://example.com/">Link</a>), this link can be seen as a recommendation. You are indicating that the content of the page is relevant and worth a visit; otherwise you wouldn’t have included the link as part of your own content.
Gaining access to the target of the link allows an attacker to exploit the trust that your visitors give you and your recommendation in order to:
This is basically an impersonation attack. The attacker pretends that the linked website is legitimate and from a trusted source. Bugcrowd rates Impersonation via Broken Link Hijacking as P4 (low risk).
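To get a feel for how exposed a page is, it can help to list every external script, image, style sheet, and anchor link it contains. The following is a minimal sketch using only Python's standard library; the class name ExternalLinkCollector, the page URL https://www.example.net/, and the sample markup are assumptions made for illustration, and every <link> element is treated as a potential style sheet for simplicity.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Attribute that carries the URL for each element type discussed above.
URL_ATTRS = {"script": "src", "img": "src", "link": "href", "a": "href"}


class ExternalLinkCollector(HTMLParser):
    """Collect links that point to hosts other than the page's own host."""

    def __init__(self, page_url: str):
        super().__init__()
        self.page_url = page_url
        self.own_host = urlsplit(page_url).hostname
        self.external = []  # list of (tag, absolute URL) tuples

    def handle_starttag(self, tag, attrs):
        attr_name = URL_ATTRS.get(tag)
        if attr_name is None:
            return
        value = dict(attrs).get(attr_name)
        if not value:
            return
        # Resolve relative URLs against the page URL before comparing hosts.
        absolute = urljoin(self.page_url, value)
        host = urlsplit(absolute).hostname
        if host and host != self.own_host:
            self.external.append((tag, absolute))


# Hypothetical page markup, reusing the snippets from this article.
sample_html = """
<script src="https://example.com/script.js"></script>
<img src="/logo.png">
<link href="https://example.com/styles.css" rel="stylesheet">
<a href="https://example.org/">Link</a>
"""

collector = ExternalLinkCollector("https://www.example.net/")
collector.feed(sample_html)
for tag, url in collector.external:
    print(tag, url)
```

Entries with the script tag deserve the most attention, since a hijacked script carries the highest risk of the link types described above.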
Subresource Integrity (SRI) allows you to ensure that linked scripts and style sheets are only loaded if they haven’t changed since the page was published. This is accomplished by computing a cryptographic hash of the content and adding it to the <script> or <link> element via the integrity attribute (as a base64-encoded string):
<script src="https://example.com/script.js" integrity="sha384-/u6/iE9tq+bsqNaONz1r5IjNql63ZOiVKQM2/+n/lpaG8qnTYumou93257LhRV8t" crossorigin="anonymous"></script>
Before executing a script or applying a style sheet, the browser compares the requested resource to the expected integrity hash value and discards it if the hashes don’t match.
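As a rough illustration of how such an integrity value can be produced and checked, here is a minimal Python sketch. The function names sri_hash and matches_integrity and the sample script content are assumptions made for this example; a real integrity attribute may also list several hashes, which this sketch does not handle.

```python
import base64
import hashlib


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity value of the form '<algorithm>-<base64 digest>'."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_integrity(content: bytes, integrity: str) -> bool:
    """Mimic the browser's check: recompute the hash and compare."""
    algorithm, _, _ = integrity.partition("-")
    return sri_hash(content, algorithm) == integrity


# Hypothetical script content; in practice, read the bytes of the real file.
script = b"console.log('hello');\n"
integrity = sri_hash(script)

print(
    f'<script src="https://example.com/script.js" '
    f'integrity="{integrity}" crossorigin="anonymous"></script>'
)
print(matches_integrity(script, integrity))                 # unchanged content
print(matches_integrity(script + b"evil();", integrity))    # tampered content
```

The hash has to be regenerated whenever the external file is legitimately updated, which is why SRI works best with versioned, immutable URLs.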
By adding a Content-Security-Policy HTTP header to your server’s responses, you can restrict which domains resources can be loaded from:
Content-Security-Policy: default-src 'self' example.net *.example.org
In this example, resources (such as scripts, style sheets, images, etc.) may only be requested from the site’s own origin ('self', excluding subdomains), example.net (excluding subdomains), and any subdomain of example.org (the *. wildcard does not cover example.org itself). Requests to other origins are blocked by the browser.
A Content Security Policy doesn’t help when one of your trusted domains gets hijacked, but it does make sure that you don’t accidentally embed resources from unexpected sources, whether that’s due to a simple typo or an obsolete link on an old and long-forgotten page.
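The host matching behind such a policy can be approximated in a few lines of Python. This is a deliberately simplified sketch, not a full CSP implementation: it only understands a single default-src directive, compares host names only (a real browser also takes scheme and port into account), and the function names and the site host www.example.com are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def host_allowed(host: str, source: str, own_host: str) -> bool:
    """Check one host against one simplified CSP source expression."""
    host = host.lower()
    if source == "'self'":
        return host == own_host.lower()
    if source.startswith("*."):
        # A leading "*." matches any subdomain of the remaining name.
        return host.endswith(source[1:].lower())
    return host == source.lower()


def url_allowed(url: str, policy: str, own_host: str) -> bool:
    """Return True if the URL's host matches any source in a default-src list."""
    directive, *sources = policy.split()
    if directive != "default-src":
        raise ValueError("this sketch only understands a single default-src directive")
    host = urlsplit(url).hostname or ""
    return any(host_allowed(host, source, own_host) for source in sources)


policy = "default-src 'self' example.net *.example.org"
own_host = "www.example.com"  # hypothetical host serving the page

for url in (
    "https://www.example.com/app.js",
    "https://example.net/lib.js",
    "https://cdn.example.net/lib.js",
    "https://cdn.example.org/styles.css",
    "https://evil.example/script.js",
):
    print(url, "allowed" if url_allowed(url, policy, own_host) else "blocked")
```

Running a check like this over the external URLs found on a site is one way to spot embeds that a planned policy would block before the header goes live.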
Broken links happen. And when they do, it’s always better to know sooner rather than later, before an attacker can exploit the issue. Our link checker, Dr. Link Check, allows you to schedule regular scans of your website and notifies you of new link problems by email. Our crawler not only looks for typical issues like 404s, timeouts, and server errors, but also checks if links lead to parked domains.
Quite often, redirects are an early indicator that a link might break soon. When a website is redesigned and restructured, redirects are used to map the old URL structure to the new one. This typically works fine for the first redesign, but with each new restructure, the redirect chains get longer and longer, with more potential breaking points. It’s therefore advisable to keep a close eye on redirected links and update them if necessary.
In order to identify redirects on your website, run a scan in Dr. Link Check and click on one of the items in the Redirects section of the Overview report to see the details.
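To illustrate why long redirect chains are fragile, the following Python sketch walks a chain hop by hop and reports its length. It is only an outline under stated assumptions: the function redirect_chain and its next_hop callback are hypothetical, and the redirects are simulated with a dictionary instead of real HTTP requests so the example runs without network access.

```python
from typing import Callable, List, Optional


def redirect_chain(
    url: str,
    next_hop: Callable[[str], Optional[str]],
    max_hops: int = 10,
) -> List[str]:
    """Follow redirects from url and return every URL visited, in order.

    next_hop returns the redirect target for a URL, or None if the URL
    does not redirect any further.
    """
    chain = [url]
    while True:
        target = next_hop(chain[-1])
        if target is None:
            return chain
        if target in chain:
            raise ValueError(f"redirect loop at {target}")
        if len(chain) > max_hops:
            raise ValueError(f"more than {max_hops} redirects starting at {url}")
        chain.append(target)


# Simulated redirects left behind by successive redesigns (made-up URLs).
redirects = {
    "http://example.com/old": "https://example.com/old",
    "https://example.com/old": "https://example.com/blog/old",
    "https://example.com/blog/old": "https://example.com/articles/old",
}

chain = redirect_chain("http://example.com/old", redirects.get)
print(f"{len(chain) - 1} redirects: " + " -> ".join(chain))
```

Every entry in the dictionary is a potential breaking point: if any single mapping is removed, the original link stops resolving to its final target. Linking directly to the last URL in the chain removes that dependency.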
A broken external link doesn’t just disrupt the visitor experience; it can also have serious security implications. An attacker might be able to hijack the broken link and gain control over the link’s target. In the worst case, this can lead to an account takeover and the theft of sensitive data.
Using modern browser security features like Subresource Integrity and Content Security Policy, you can mitigate these risks. Regular crawls with a broken link checker help you identify broken links early and reduce the attack surface.