How to Find All External Links on Your Website

Links are the backbone of the web and play an essential role in SEO and how well a website ranks in search engines like Google or Bing. Based on their destination, links can be classified into three categories, each serving its own purpose:

  • Internal links connect pages within the same website, guiding users and search engines through the content.
  • Outbound links, sometimes also called external links, take users to other websites, providing resources or information outside the site.
  • Inbound links, or backlinks, are links from other sites pointing to yours, helping drive referral traffic and attract potential customers.

Together, these links form what’s known as a website’s “link profile.” While each link type plays an important role in shaping this profile, this post will focus on outbound links.
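
To make the distinction between internal and outbound links concrete, here is a minimal Python sketch that classifies a link by comparing hostnames. The function name classify_link and the rule "same hostname means same website" are assumptions made for illustration; depending on your setup, you may want to treat subdomains as internal, too. Inbound links can't be detected this way because they live on other people's sites.

```python
from urllib.parse import urljoin, urlparse


def classify_link(page_url: str, href: str) -> str:
    """Return 'internal', 'outbound', or 'other' for a link found on page_url."""
    # Resolve relative hrefs (e.g. "/pricing") against the page they appear on.
    target = urlparse(urljoin(page_url, href))
    if target.scheme not in ("http", "https"):
        return "other"  # e.g. mailto: or tel: links
    page_host = urlparse(page_url).hostname
    # Assumption: only an identical hostname counts as the same website.
    return "internal" if target.hostname == page_host else "outbound"


print(classify_link("https://www.example.com/blog/", "/pricing"))
print(classify_link("https://www.example.com/blog/", "https://www.wikipedia.org/"))
```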

So why exactly do outbound links matter so much? They’re essential for building your website’s credibility. By linking to trustworthy, relevant sources, you show readers that your content is well-researched and backed by reliable references. This helps establish your site as an authority. Outbound links also improve the user experience by giving readers quick access to related information without making them search elsewhere.

From an SEO standpoint, links to authoritative sites signal quality and relevance, which can positively impact search rankings. But remember, not all outbound links are created equal. Links to low-quality or even malicious sites can harm your site’s reputation, and broken links can frustrate users and affect SEO. Since links may break over time or lead to unexpected content if a domain changes hands, it’s essential to regularly check and update outbound links to keep them valuable and trustworthy.

Crawl Your Website with Dr. Link Check

An easy way to keep track of your outbound links is by using Dr. Link Check, a powerful and user-friendly link-checking tool. While Dr. Link Check is primarily designed to find broken links, it’s also excellent for getting an overview of all links on your site, including outbound (external) links.

To get started with Dr. Link Check, visit https://www.drlinkcheck.com/, enter your website’s URL in the text box, and click the Start Check button.

Start link check

The service will begin crawling your website, starting from the homepage, and will continue until it either reaches the limit of your plan or exhausts all available links. The free “Lite” plan allows crawling of up to 1,500 links, which is sufficient for many smaller websites.

Once the crawl is complete, Dr. Link Check provides several reports accessible from the left-hand sidebar. Since we are only interested in outbound links, select Outbound from the sidebar.

Outbound links report

The report may include a variety of link types beyond standard hyperlinks, such as script links (<script src="...">), image links (<img src="...">), and other resource links that may not be relevant to you. To refine the report to display only the links you want, go to the Filter section at the top of the report, click Add…, and choose Link Type.

Filter outbound links

If you plan to check outbound links regularly (which is highly recommended), you can save this customized filter by clicking Save as Custom Report…. This will add a new item to your sidebar for easy access.

Alternatively, if you’d prefer to filter out resource links during the crawl itself, you can instruct the crawler to ignore all links except standard anchor (<a href="...">) links in your HTML code. Note that this option is only available with a paid subscription, starting with the “Standard” plan. To apply this setting, open the Project Settings dialog, expand Advanced Settings, and enter the following rule under Ignore links if…:

HtmlElement != "a"

Ignore non-hyperlinks

Once you rerun the crawl, the Outbound report will only display outbound <a> links.
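
If you want to see the effect of such a filter on a single page, the following Python sketch collects links from <a>, <img>, and <script> elements and then keeps only the <a> links. It uses only the standard library; the class name LinkCollector and the sample markup are made up for this example, and the code is not how Dr. Link Check itself works.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (element, url) pairs for a, img and script tags."""

    # Attribute that carries the URL for each element of interest.
    URL_ATTRS = {"a": "href", "img": "src", "script": "src"}

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.links.append((tag, value))


def anchor_links(html):
    """Return only the URLs that come from standard <a href> links."""
    collector = LinkCollector()
    collector.feed(html)
    # Dropping everything that is not an <a> element mirrors the ignore rule above.
    return [url for tag, url in collector.links if tag == "a"]


# Hypothetical sample markup with one hyperlink and two resource links.
SAMPLE = (
    '<a href="https://example.org/">Docs</a>'
    '<img src="https://cdn.example.net/logo.png">'
    '<script src="https://cdn.example.net/app.js"></script>'
)
print(anchor_links(SAMPLE))
```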

Conclusion

Managing outbound links is key to maintaining your website’s credibility, user experience, and SEO health. With Dr. Link Check, staying on top of these links is simple and efficient. Regular checks ensure that your site remains a reliable resource for both visitors and search engines alike.


How To Remove Broken Links From a List of URLs

In this article, we’ll explore how you can use Dr. Link Check to check the status of multiple URLs or domain names in bulk, without having to manually visit each site.

Step 1: Create a free account

Go to the sign-up page to create a new account (or log in if you already have one).

Step 2: Create a new project

Click the Add Project button and paste (or enter) your list of URLs into the URLs to check field (the textbox expands automatically):

New project

The free Lite subscription allows you to check up to 1,500 URLs, while a paid subscription enables you to enter 10,000 URLs at once. If you enter domain names without a protocol (such as http or https), Dr. Link Check automatically prepends “http://” to turn them into valid URLs.
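
If you'd like to clean up your list before pasting it in, the same normalization can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes that any entry not starting with http:// or https:// is a bare domain name.

```python
def normalize_entry(entry: str) -> str:
    """Prepend "http://" to entries that lack an http(s) protocol."""
    entry = entry.strip()
    if entry.lower().startswith(("http://", "https://")):
        return entry
    return "http://" + entry


def normalize_list(text: str) -> list:
    """Turn pasted text (one entry per line) into a list of URLs, skipping blank lines."""
    return [normalize_entry(line) for line in text.splitlines() if line.strip()]


print(normalize_list("example.com\nhttps://www.example.org/page\n\n"))
```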

As you only want to check the status of single URLs without having Dr. Link Check crawl any linked pages, make sure that URLs to crawl is set to None as shown in the screenshot above.

Step 3: Start the check

Now hit the Create Project button to start the check. You will be redirected to the Overview report that gives you a summary of the results.

Overview report

Step 4: See which links are broken

If you want to see which of the links in your list are broken, select the All Issues report from the sidebar on the left.

All issues

Getting a report of links that actually work and are not dead is slightly more challenging. Select the All Links tab from the sidebar and click on Add… in the Filter bar:

Add filter

Now select Broken check result from the filter list and change Host not found to OK.

Configure filter
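
For a rough idea of what such a bulk check involves, here is a simplified Python sketch that requests each URL once and sorts the list into working and broken entries. It is an illustration under assumptions, not Dr. Link Check's implementation: the function names, the User-Agent string, and the rule "any status from 200 to 399 counts as OK" are all choices made for this example.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, headers={"User-Agent": "link-check-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError, ValueError):
        return None  # host not found, timeout, refused connection, malformed URL


def split_by_result(urls, fetch=fetch_status):
    """Split urls into (ok, broken) lists based on the status that fetch returns."""
    ok, broken = [], []
    for url in urls:
        status = fetch(url)
        if status is not None and 200 <= status < 400:
            ok.append(url)
        else:
            broken.append(url)
    return ok, broken


# Offline demonstration with made-up results instead of real requests.
fake_results = {"http://example.com/": 200, "http://example.com/gone": 404}
ok, broken = split_by_result(fake_results, fetch=fake_results.get)
print("OK:", ok)
print("Broken:", broken)
```

Passing the fetch function as a parameter keeps the sorting logic testable without network access; calling split_by_result(urls) with no second argument performs real requests.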

Starting with the Professional plan, you can also export your report to CSV format for import into Excel or other spreadsheet software.

Export report

Conclusion

In addition to analyzing all links on a single website, Dr. Link Check also enables you to efficiently mass check a list of URLs for dead links. This saves you a significant amount of time and resources compared to manual checks.


How to Find Soft 404 Errors on Your Website

A soft 404 is a type of error where a web server returns a 200 OK status code (indicating that the request succeeded), even though the delivered page doesn’t contain the expected content, and a 404 Not Found status would have been the appropriate response.

Think of a page with no or very little content, a page with an error message, or a search results page without any results – that’s what a soft 404 error looks like to you in the browser, despite the server sending a status code 200 in the HTTP response headers as if there were no problem.

Why Are Soft 404 Errors Problematic?

Soft 404 errors create a bad user experience, just like regular 404 errors. Clicking on a link, waiting for the page to load, and then not finding the expected content is frustrating and gravely damages the website’s credibility.

Soft 404s can also hurt the site’s search rankings if users encounter one and quickly leave the page. Bounce rate and time on page are widely regarded as signals of how relevant and valuable the content is. In addition, soft 404s consume valuable crawl resources: search engines keep crawling unimportant pages instead of important ones, which can lead to less frequent crawls, decreased indexation, and ultimately reduced search visibility.

How to Detect Soft 404 Errors?

Detecting soft 404 errors is tricky. You can’t trust the HTTP status code returned by the server but have to examine the page content. Standard link checkers don’t do this and therefore fail to identify soft 404s.
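
To illustrate what "examining the page content" can look like, here is a deliberately crude Python heuristic. The phrases and the length threshold are hypothetical; a production-grade detector needs far more patterns and will still produce false positives.

```python
import re

# Hypothetical phrases; a real detector would need a much larger pattern set.
ERROR_PATTERNS = re.compile(
    r"page not found|no results found|no longer available|does not exist",
    re.IGNORECASE,
)
MIN_TEXT_LENGTH = 200  # assumed threshold for "very little content"


def visible_text(html):
    """Crudely strip script/style blocks and tags, then collapse whitespace."""
    text = re.sub(r"<(script|style)\b.*?</\1>", " ", html, flags=re.IGNORECASE | re.DOTALL)
    text = re.sub(r"<[^>]+>", " ", text)
    return " ".join(text.split())


def looks_like_soft_404(status, html):
    """Flag pages that return 200 OK but look empty or show an error message."""
    if status != 200:
        return False  # a real error status is not a *soft* error
    text = visible_text(html)
    return len(text) < MIN_TEXT_LENGTH or bool(ERROR_PATTERNS.search(text))


print(looks_like_soft_404(200, "<html><body><h1>Page not found</h1></body></html>"))
```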

Our link checking solution, on the other hand, can rely on a large database of content patterns to automatically identify different kinds of soft 404s on a website. Starting with the Professional plan, detected soft errors are reported under the “Soft errors” tab in the sidebar.

Soft Errors in Dr. Link Check

An alternative (and free) way to identify at least some of the soft 404 errors is to check out the site’s “Indexing → Pages” report in Google Search Console. This report lists crawl errors, including soft 404s, that Google encountered when indexing your pages.

Google Search Console: Soft 404

Another resource you should take a look at is your website’s analytics data. Try finding pages with particularly high bounce rates or low time-on-page values as these are indicators of soft 404 errors.

Last but not least, verify that your server actually sends a 404 status code if a non-existent resource is requested:

  • Open a new browser window or tab.
  • Enter a URL with your website’s domain that you are certain should result in a 404 Not Found error (such as https://www.example.com/this-page-does-not-exist).
  • Open the browser’s developer tools (Control + Shift + I on Windows or Linux, Command + Option + I on macOS).
  • Select the “Network” tab and press Control + R (or Command + R on macOS) to reload the page.
  • Check which code was returned for the page in the “Status” column.

Chrome DevTools: 404

If the request was not redirected to a different URL and the server responded with code 200, you have stumbled upon a soft 404 error.
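
The manual check above can also be scripted. The following Python sketch requests a URL that almost certainly doesn't exist and applies the same rule: status 200 without a redirect means soft 404. The function names are made up for this example, and the URL comparison is a plain string comparison, which is an assumption that may not hold if the server normalizes URLs.

```python
import urllib.error
import urllib.request
import uuid


def random_missing_url(base_url):
    """Build a URL under base_url that is very unlikely to exist."""
    return base_url.rstrip("/") + "/missing-" + uuid.uuid4().hex


def probe(url, timeout=10.0):
    """Request url and return (status, final_url) after any redirects."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.url
    except urllib.error.HTTPError as error:
        return error.code, error.url  # 4xx and 5xx responses end up here


def is_soft_404(requested_url, status, final_url):
    """True if a non-existent page was answered with 200 and without a redirect."""
    return status == 200 and final_url == requested_url


# Example usage (performs a real request, so it is left commented out):
# url = random_missing_url("https://www.example.com")
# status, final_url = probe(url)
# print(status, is_soft_404(url, status, final_url))
```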

What Causes Soft 404 Errors?

Soft 404 errors are frequently the result of an incorrect server configuration or a programming error. Here are two real-life examples:

A website hosted on an Apache web server had a line similar to this in its .htaccess file to configure a custom 404 error page:

ErrorDocument 404 https://www.example.com/404.html

Instead of serving the content of the 404.html file directly, the server redirected to the URL https://www.example.com/404.html and returned the 404.html file with a 200 OK status. Changing the line to

ErrorDocument 404 /404.html

fixed the issue.

In a different case, a website had a custom “404 Not Found” page with the following PHP code at the top:

<?php header("Status: 200 OK"); ?>

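This line resulted in 200 OK being sent instead of the correct 404 code. Removing it and sending a 404 status instead (for example, with PHP’s http_response_code(404)) resolves the problem.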

Sometimes soft 404s are also remnants of changed website structures or removed content. Products that are no longer available may result in empty search result pages, and moved blog posts may leave empty categories behind. In situations like these, it can be a good idea to just remove the empty pages or the links pointing to them.

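If that’s not possible or practical, you can keep search engines away from the pages by adding a disallow rule to your site’s robots.txt file, which stops compliant crawlers from fetching them, or by including a meta robots tag with the parameter “noindex” (<meta name="robots" content="noindex">) in your pages’ HTML code, which keeps them out of the index. Don’t combine both for the same page: a crawler that is blocked by robots.txt never gets to see the noindex tag.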
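
Before deploying a new disallow rule, you can test it locally with Python's built-in robots.txt parser. The rules and paths below are hypothetical examples; substitute the ones from your own site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules that keep crawlers away from empty search and category pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /category/empty/
"""


def is_crawl_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawl_allowed(ROBOTS_TXT, "https://www.example.com/search?q=widgets"))
print(is_crawl_allowed(ROBOTS_TXT, "https://www.example.com/blog/"))
```

Keep in mind that Python's parser uses simple prefix matching, so individual search engines may interpret more complex rules (such as wildcards) differently.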

Conclusion

Soft 404 errors can significantly impact a website’s user experience and search engine visibility. Website owners can identify these errors through the use of tools such as Dr. Link Check and Google Search Console and by carefully examining the website’s analytics. Resolving soft 404s may involve reviewing the server’s configuration files and delving into the website’s source code.

