Broken Links

©1996-2009 Roedy Green, Canadian Mind Products
View the latest version of this manual online at http://mindprod.com/application/brokenlinks.manual.html.
Introduction
Why use Xenu?
How to Use Xenu
Configuring Brokenlinks
Running Brokenlinks
Presumed Good File
Sample Text Report
Sample HTML Export
Repairing Broken Links
Futures
Links

Introduction

Brokenlinks is a tool to help you find and track broken links on your website, namely URLs that no longer point to anything useful. It is a back end to the Xenu broken link detector that compensates for Xenu’s weakness of overwhelming you with reports of links that are not really broken. Brokenlinks whittles Xenu’s giant list of broken links down to the ones you should look at first. This saves you immense amounts of time researching links that are not really broken.

Both Xenu and Brokenlinks share a common limitation. They can’t detect a broken link that has been redirected to a working place-holder site, e.g. one advertising that the domain is up for sale. Similarly, some sites just quietly redirect all broken links to the home page. Brokenlinks cannot detect that. Most embarrassingly, Brokenlinks can’t detect a domain bought out by a pornography company. You can still have people threaten to sue or kill you for “deliberately” trying to send them to a porn site.

Why use Xenu?

Finding the broken links is only 10% of the work. Fixing them is what is so labour intensive. If you let your website deteriorate with broken links, visitors become frustrated and stop visiting. Having clean links encourages Google to take your site more seriously.

How to Use Xenu

Download and install a free copy of Xenu Link Sleuth.

First you spider your local copy of your website with Xenu. Read the Xenu documentation on how to do that. You first have to be sure Xenu is working properly before Brokenlinks will work. Use Xenu directly to find orphans.

Once you are pretty sure you have Xenu configured correctly, run it on your local website, with external link checking turned on.

Be careful to verify that the check external links option is on at the very last moment before you start the spidering. Xenu mischievously likes to change the flag on you unexpectedly.
When it has finished spidering your website and checking all the links, click Export Page Map to TAB-separated File. (Don’t confuse this with Export to TAB-separated File.) You may optionally get Xenu to also produce an HTML report.

Configuring Brokenlinks

Download and install a free copy of Brokenlinks.

The first time you use Brokenlinks you must configure it by creating a text file with a text editor. It will look something like this:

Configure it according to the embedded comments. Then save the file, giving it a name of the form xxxx.properties.

The properties are all pretty straightforward except for brokenForgivenessDays=7.

  1. If you have only a handful of broken links, and you religiously run Xenu/Brokenlinks every day, you might set brokenForgivenessDays=2, though I still set it to 6. One advantage of running every day is that you stay on top of researching and repairing broken links. You are never faced with large numbers of them to fix all at once. I personally run Brokenlinks twice a day so that I test sites at different times of day, avoiding treating them as dead when they are just temporarily down for backup. Further, that way I rarely have more than a couple of links to research at any one time.
  2. If you have only a handful of broken links, and you religiously run Xenu/Brokenlinks twice a week, use brokenForgivenessDays=5
  3. If you don’t want to think about brokenForgivenessDays, leave this property out, and accept the default: brokenForgivenessDays=7
  4. If you have only a handful of broken links, and you religiously run Xenu/Brokenlinks every week, use brokenForgivenessDays=8
  5. If you have hundreds of broken links, and you run Xenu/Brokenlinks only every once in a while, use brokenForgivenessDays=14
  6. You can experiment with setting it to various values. The smaller the brokenForgivenessDays number, the sooner broken links will be revealed to you, and the more of them you will see. However, you will be pestered with more temporarily broken links. If you are feeling overwhelmed by broken links, increase the value to show you only the deadest links. The minimum value that makes much sense is 1. Xenu itself effectively uses 0.
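
To make the effect of brokenForgivenessDays concrete, here is a minimal Java sketch. It is not Brokenlinks’ actual code. It assumes the rule is “report a link only once it has been broken for at least brokenForgivenessDays days”, which is consistent with the default of 7 and with Xenu effectively using 0. The class and method names (ForgivenessSketch, forgivenessDays, shouldReport) are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Properties;

/** Hypothetical illustration only; not taken from the Brokenlinks source. */
class ForgivenessSketch {

    /** Default value stated in this manual. */
    static final int DEFAULT_FORGIVENESS_DAYS = 7;

    /** Reads brokenForgivenessDays from properties-file text, falling back to the default. */
    static int forgivenessDays(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty("brokenForgivenessDays",
                String.valueOf(DEFAULT_FORGIVENESS_DAYS));
        return Integer.parseInt(raw.trim());
    }

    /** Assumed rule: report a link once it has been broken for at least forgivenessDays days. */
    static boolean shouldReport(LocalDate firstSeenBroken, LocalDate today, int forgivenessDays) {
        long daysBroken = ChronoUnit.DAYS.between(firstSeenBroken, today);
        return daysBroken >= forgivenessDays;
    }

    public static void main(String[] args) {
        int days = forgivenessDays("# sample config\nbrokenForgivenessDays=6\n");
        LocalDate today = LocalDate.of(2009, 7, 8);
        // Broken for 7 days, at least 6, so it is reported: prints true
        System.out.println(shouldReport(LocalDate.of(2009, 7, 1), today, days));
        // Broken for only 3 days, still forgiven: prints false
        System.out.println(shouldReport(LocalDate.of(2009, 7, 5), today, days));
    }
}
```

Under this assumed rule, a value of 0 reports every broken link immediately, which is what Xenu does on its own, and larger values hide links that have only recently gone bad.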

Running Brokenlinks

Now run Brokenlinks like this:
java.exe -jar brokenlinks.jar xxxx.properties
If you have Jet, you can simplify that to:
brokenlinks.exe xxxx.properties

You will get a report of the critical broken links to research, both in text and HTML form. Embed the HTML in a web page somewhere. Here is my list of broken links for mindprod.com. The layout is designed to make it easy to research the problems. You can click to get the page where the broken link is, or click to where it was trying to go.

Then research the broken links and fix them. Then run Xenu again, click Export Page Map to TAB-separated File and run Brokenlinks. Run this cycle at different times of the day, since some websites shut down for part of the day for maintenance. You want to catch them when they are up. Run the cycle after repairing a batch of links to see how you did. After you get the list whittled down to none, run the cycle weekly, twice weekly or daily to stay on top of the broken links. I find running it daily works best since you never get overwhelmed with work, and thus are not tempted to postpone the work.

If you erase the history.bin file, it will automatically start over from scratch collecting history.
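
The manual does not describe what history.bin contains, so the following Java sketch is only a guess at the kind of bookkeeping involved. It assumes the history maps each broken URL to the date it was first seen broken, and that a link which works again is forgotten. HistorySketch and update are hypothetical names, and a.example and b.example are placeholder URLs.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for whatever history.bin really stores. */
class HistorySketch {

    /**
     * Returns the history after one run.
     * Assumption: a link that is no longer broken drops out of the history.
     */
    static Map<String, LocalDate> update(Map<String, LocalDate> history,
                                         Set<String> brokenToday,
                                         LocalDate today) {
        Map<String, LocalDate> next = new HashMap<>();
        for (String url : brokenToday) {
            // Keep the original date if the link was already broken, else start counting today.
            next.put(url, history.getOrDefault(url, today));
        }
        return next;
    }

    public static void main(String[] args) {
        // An empty map models a freshly erased history file.
        Map<String, LocalDate> history = new HashMap<>();
        history = update(history, Set.of("http://a.example/"), LocalDate.of(2009, 7, 1));
        history = update(history, Set.of("http://a.example/", "http://b.example/"),
                LocalDate.of(2009, 7, 2));
        System.out.println(history.get("http://a.example/")); // prints 2009-07-01
        System.out.println(history.get("http://b.example/")); // prints 2009-07-02
    }
}
```

In this model, erasing the history simply means starting again from an empty map, so every link’s broken-for count restarts on the next run.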

It is best to run Brokenlinks at various times of day so that you won’t mistake a site for dead when it is just offline for an hour each day for backup. I am a bit compulsive. I run it twice a day.

Presumed Good File

If you find a link that Xenu/Brokenlinks thinks is broken, but which is actually ok, or it doesn’t matter for some reason, add it to your list of presumed good links. The presumedgood csv file will look something like this:
Thereafter that presumed good link will be excluded from the broken links list.
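
As a rough illustration of the exclusion step, here is a Java sketch. The real layout of the presumed good CSV file is not shown in this manual, so the sketch assumes the URL is the first comma-separated field on each non-blank line. PresumedGoodSketch and its methods are hypothetical names, not part of Brokenlinks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; the real CSV layout may differ. */
class PresumedGoodSketch {

    /** Assumption: the first comma-separated field of each non-blank line is the URL. */
    static Set<String> parsePresumedGood(String csvText) {
        Set<String> urls = new HashSet<>();
        for (String line : csvText.split("\\R")) {
            String firstField = line.split(",", 2)[0].trim();
            if (!firstField.isEmpty()) {
                urls.add(firstField);
            }
        }
        return urls;
    }

    /** Returns the broken links that are not on the presumed good list, in the original order. */
    static List<String> excludePresumedGood(List<String> brokenLinks, Set<String> presumedGood) {
        List<String> remaining = new ArrayList<>();
        for (String url : brokenLinks) {
            if (!presumedGood.contains(url)) {
                remaining.add(url);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        Set<String> good = parsePresumedGood("http://localhost/\nhttp://www.akademika.no/\n");
        List<String> broken = Arrays.asList("http://localhost/", "http://www.greenhousenet.org/");
        // Only the link that is not presumed good remains.
        System.out.println(excludePresumedGood(broken, good)); // prints [http://www.greenhousenet.org/]
    }
}
```

The comparison here is an exact string match, so under this assumption a presumed good entry must be spelled exactly as Xenu reports the link.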

Sample Text Report

Here is roughly what the text report that Brokenlinks produces will look like:

Sample HTML Export

Here is roughly what the combined broken links and presumed good HTML report that Brokenlinks produces will look like:

Broken Links Sorted by Error Code

There are 4 links that have been broken for at least 6 days yet to be fixed. Last revised: 2009-07-08
Broken Links by Status Code

Status Code           Links To
                          Linked From
Forbidden             http://www.thefreedictionary.com/403.htm
                          http://www.thefreedictionary.com/
no connect            http://www.greenhousenet.org/
                          /environment/kyoto.html
Not Found             http://linkshareware.com/login.php
                          /jgloss/padsubmission.html
Page not available    http://www.microsoft.com/library/errorpages/smarterror.aspx?aspxerrorpath=/windows/windowsmedia/download/AllDownloads.aspx
                          http://www.microsoft.com/windows/windowsmedia/download/AllDownloads.aspx

Links Presumed Good

Xenu claims the following links are broken, but they have been manually found to be good. They should be manually rechecked from time to time. The problem may be an unknown SSL certificate authority that needs to be OKed manually (a missing, unknown or uninstalled root certificate authority), or the website may send the data but with a not-found status.

There are 10 links marked as presumed good despite what Xenu says. Last revised: 2009-07-08

Links Presumed Good
Link To
http://localhost/
http://www.akademika.no/
http://www.glish.com/css/7.asp
http://www.os2site.com/sw/internet/time/clock2.htm
http://www.telegraph.co.uk/news/yourview/1562772/David-Cameron-answers-your-questions.html
http://www.theserverside.com/tt/books/wiley/masteringEJB/
https://calnetpki.berkeley.edu/getrootcert.asp
https://tsa.aloaha.com/
https://www.eecs.harvard.edu/mailman/listinfo/jopt-users
https://www.foldershare.com/welcome.aspx


Repairing Broken Links

Here are some tips to help you find a replacement link for a broken one.

Futures

Here are various ways I hope eventually to improve Brokenlinks:
  1. Vastly improve the speed of rechecking links by checking 30 of them simultaneously, the way Xenu does.
  2. Convert to Java Web Start. This will make the program easier to use by novices since it will not require configuration. The Configuration properties file will be replaced by a GUI. The user will not have to manually allocate a directory for the history file.
  3. Remove the dependence on Xenu. Handle everything it does in Brokenlinks.
  4. Avoid checking links that recently checked OK to vastly speed up link checking. You could then afford to do it daily or even before every upload. Xenu rechecks everything from scratch every time you run it.
  5. Check Applet links. Xenu thinks all Applet links are broken.
  6. Check style sheet links. Xenu ignores them.
  7. Tools to insert warning styles on broken links so they will have an icon next to them, warning your visitors of the problem and letting them know you are aware of it.
  8. Tools to help automate repair of broken links.
  9. Remove the reliance on Xenu. This will, as a side effect, make Brokenlinks notice local links that are in the wrong case. Wrong case links work under Windows and Xenu, but fail after you upload to a Unix-based webserver.
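
The wrong-case check in the last item could look roughly like the Java sketch below. It describes a possible future feature, not anything Brokenlinks does today. It assumes you already have the list of paths that really exist on the local site. CaseCheckSketch and findCaseMismatch are hypothetical names, and the mixed-case file name is invented for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Optional;

/** Hypothetical illustration of detecting links that differ from a real file only by case. */
class CaseCheckSketch {

    /**
     * Returns the real path when linkedPath matches it only if case is ignored.
     * Returns empty when the link matches exactly or matches nothing at all.
     */
    static Optional<String> findCaseMismatch(Collection<String> actualPaths, String linkedPath) {
        if (actualPaths.contains(linkedPath)) {
            return Optional.empty(); // exact match, safe on a case-sensitive server
        }
        for (String actual : actualPaths) {
            if (actual.equalsIgnoreCase(linkedPath)) {
                return Optional.of(actual); // works on Windows, would fail on Unix
            }
        }
        return Optional.empty(); // plain broken link, not a case problem
    }

    public static void main(String[] args) {
        List<String> actual = Arrays.asList("/jgloss/PadSubmission.html", "/environment/kyoto.html");
        System.out.println(findCaseMismatch(actual, "/jgloss/padsubmission.html").isPresent()); // prints true
        System.out.println(findCaseMismatch(actual, "/environment/kyoto.html").isPresent());    // prints false
    }
}
```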

Links

Google sitemap
HTML Broken link fixer student project
Xenu
