A program that looks words up in a dictionary to check your spelling.
Commercial Spell Checkers
All prices are in
- The JSpell Java Spell Checker is an expensive Java spell checker.
See the separate entry for details.
- Keyoti RapidSpell: add to JSP (JavaServer Pages) to check
HTML (Hypertext Markup Language) form text. Costs to
- Source Code Spell Checker is specially
designed for spell checking programs. It spell checks string literals, code comments, variable names, class
names and method names.
unlimited license. Last revised/verified: 2012-01-30
- Spellex offers over a dozen different products. They are
so expensive they do not divulge the prices on their website.
- WinterTree’s Java-based spell checker gradually loads the entire dictionary into
RAM (Random Access Memory) as it is needed. 11 languages. Java interface.
to add spell checking to an Applet.
for source code. Last revised/verified: 2012-01-30
Free Spell Checkers
- GNU Aspell: multilingual, free, open source C++. The Windows
version lags behind. Library or command line, handles UTF-8, good at suggesting alternative words in English
only, replacement for Ispell. Last revised: 2011-07-01 Verified: 2012-01-30.
- Google Spelling API
works by sending a query to Google’s spelling API (Application Programming Interface) servers.
- International Ispell. Originally
written in 1971. Last revised: 2005-06-12 Verified: 2012-01-30
- Jazzy: open source, disk based,
Java spell checker, based on Aspell. Last revised: 2005-11-23 Verified: 2012-01-30.
- JOrtho Java Spell Checker. This
is for spell checking small amounts of text the user types into a JTextComponent.
Last revised/verified: 2012-01-30.
- Suggester Spell Check: Nine languages.
200,000 word English dictionary and an English medical dictionary.
Last revised: 2008-02-01 Verified: 2012-02-01
Need for Spell Checkers
Spelling, though now neglected by the education system, is more important than ever. If you compose a web
page, unless you spell the words correctly, including proper names, they will not be properly indexed by the
search engines. If you compose programs with variable names incorrectly spelled, others will not be able to
remember them. If you post on the Internet, there is no secretary to take dictation. Your spelling errors betray
you as an ignoramus to all your readers. They will dismiss your ideas before they even consider them.
A typo is a spelling mistake where it is clear you know how to spell a word but your
fingers fumbled and produced something weird when typing, e.g. that for than. Typos are not
quite as damaging to your reputation as spelling errors, but they have most of the same consequences.
Most word processors, email programs and newsreaders come with a built-in spell checker. You still have to use
it. They can’t catch errors such as using your for you’re. You have to train yourself
to catch those manually.
Under the Hood
Conceptually, a spell checker is very simple. It has a list of correctly spelled words. It goes through your
document word by word, looking up each one to see if it is present in the list of correctly spelled words. The trick
is to encode the list in such a way that it takes up little space on disk and in RAM
and the lookup is very fast. The spell checker can use some of the following techniques.
- The spell checker knows the frequency of use of each word. It can thus arrange them in layers with common
words kept in a special high speed cache.
- The words can be stored in alphabetical order. This means the first few characters of a word are usually the
same as those of the word before it in the list. Thus, there is no need to explicitly store them.
- The frequency of letters is known. Thus it is possible to use a Huffman encoding, using shorter bit
patterns for common letters and letter pairs.
- There is no need to store both a word and its plural if the plural follows one of the standard
patterns.
- The spell checker can cache words it has already checked in a document, or earlier that day. Those words
are more likely to recur.
- The main dictionary list is prepared and frozen well ahead of time. It does not need the ability to add new
words. You can use another separate smaller updatable exceptions dictionary for user-defined words. You know
everything there is to know about the master list. It will not change. It is completely practical to have a
computer spend hours and hours massaging and compressing the list, looking for perfect hashes etc.
- Use of hashing. See Hashtable.
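Here is a minimal sketch of three of these ideas, not how any of the products above actually work: a hashed lookup against a frozen master list, a separate updatable list of user-defined words, and front coding, where each word in the sorted list stores only the count of leading characters it shares with the previous word plus the remaining suffix. The class name SimpleSpellChecker, its method names and the "count:suffix" text format are all my own inventions for illustration. A real checker would pack the counts into bits rather than text, and would likely add Huffman coding and plural stripping on top.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch: frozen master word list plus a small updatable user list. */
class SimpleSpellChecker {
    private final Set<String> masterWords;                 // built once, never modified afterwards
    private final Set<String> userWords = new HashSet<>(); // user-defined exception words

    SimpleSpellChecker(List<String> master) {
        Set<String> words = new HashSet<>();
        for (String w : master) {
            words.add(w.toLowerCase(Locale.ROOT));
        }
        this.masterWords = words;
    }

    void addUserWord(String word) {
        userWords.add(word.toLowerCase(Locale.ROOT));
    }

    boolean isKnown(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        return masterWords.contains(w) || userWords.contains(w);
    }

    /** Returns the words of the text found in neither list, in order of appearance. */
    List<String> misspelled(String text) {
        List<String> unknown = new ArrayList<>();
        // Split on anything that is not a letter or an ASCII apostrophe.
        for (String token : text.split("[^\\p{L}']+")) {
            if (!token.isEmpty() && !isKnown(token)) {
                unknown.add(token);
            }
        }
        return unknown;
    }

    /** Front coding: each entry is "sharedPrefixLength:suffix" relative to the previous word. */
    static List<String> frontEncode(List<String> sortedWords) {
        List<String> encoded = new ArrayList<>();
        String prev = "";
        for (String w : sortedWords) {
            int shared = 0;
            int max = Math.min(prev.length(), w.length());
            while (shared < max && prev.charAt(shared) == w.charAt(shared)) {
                shared++;
            }
            encoded.add(shared + ":" + w.substring(shared));
            prev = w;
        }
        return encoded;
    }

    /** Reverses frontEncode by rebuilding each word from the one before it. */
    static List<String> frontDecode(List<String> encoded) {
        List<String> words = new ArrayList<>();
        String prev = "";
        for (String entry : encoded) {
            int colon = entry.indexOf(':');
            int shared = Integer.parseInt(entry.substring(0, colon));
            String w = prev.substring(0, shared) + entry.substring(colon + 1);
            words.add(w);
            prev = w;
        }
        return words;
    }

    public static void main(String[] args) {
        SimpleSpellChecker checker = new SimpleSpellChecker(List.of("the", "mail", "arrived"));
        checker.addUserWord("JSpell");
        System.out.println(checker.misspelled("The mail arived, JSpell"));
        System.out.println(frontEncode(List.of("spell", "spelled", "speller", "spelling")));
    }
}
```

With front coding, spell, spelled, speller, spelling shrinks to 0:spell, 5:ed, 6:r, 5:ing. The price is that you must decode from the start of a block to recover any given word, so a practical list would probably restart the coding every few dozen words.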
Rant
There are a number of problems with spell checkers.
- Every program uses a different spell checking engine. Not only do I have to learn the quirks of multiple
spell checkers, I must teach each one separately my personal list of exception words that are legitimate, but
are not in its dictionaries.
- They pay no attention to context. They can’t catch my two most common errors: confusing
it ⇔ in ⇔ is and that ⇔ than. All the words are in the list of legitimate words, so the spell checker does not
notice if I accidentally substitute them in creative ways. I make these errors commonly because the home row on
the DSK keyboard looks like this: AOEUI DHTNS, with T next to N and
N next to S. It needs to do a primitive grammar analysis to
see whether a correctly spelled word should ever occur surrounded by the other words in the context.
- I once had a very trying customer back in the days when Canadian Mind Products built and repaired custom
computers. I laughed and laughed when I noticed the spell checker had corrected the spelling of her name to
Enema. Had I not been quite so alert, the invoice could have created quite an uproar. She would never
have believed that I did not insult her intentionally.
- Every time I spell check a web page, the spell checker makes me mark as OK the same old exceptions time
after time after time. These are document specific words. I don’t want to add them to the general
dictionary.
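To make the context complaint concrete, here is a rough sketch of the crudest possible context check. It is far short of a grammar analysis: it only looks at the word immediately before a suspect word. It assumes you have collected word pairs from text known to be correct, and it flags a word from an easily confused group (it/in/is, that/than) when the pair it forms has never been seen but the pair formed by one of its look-alikes has. The class ContextChecker and its methods are hypothetical names of mine, not part of any existing spell checker.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: flag easily confused words using previously seen word pairs. */
class ContextChecker {
    // Groups of correctly spelled words that are easily typed in place of one another.
    private static final List<Set<String>> CONFUSABLE =
            List.of(Set.of("it", "in", "is"), Set.of("that", "than"));

    // Word pairs assumed to come from correct text, stored as "first second".
    private final Set<String> knownPairs = new HashSet<>();

    void learnPair(String first, String second) {
        knownPairs.add(first + " " + second);
    }

    /** Returns suggestions of the form "typed -> alternative" for suspicious words. */
    List<String> check(List<String> words) {
        List<String> suggestions = new ArrayList<>();
        for (int i = 1; i < words.size(); i++) {
            String prev = words.get(i - 1);
            String word = words.get(i);
            if (knownPairs.contains(prev + " " + word)) {
                continue; // this pair has been seen before, so leave it alone
            }
            for (Set<String> group : CONFUSABLE) {
                if (!group.contains(word)) {
                    continue;
                }
                for (String alt : group) {
                    // Suggest a look-alike only if it forms a pair we have seen.
                    if (!alt.equals(word) && knownPairs.contains(prev + " " + alt)) {
                        suggestions.add(word + " -> " + alt);
                    }
                }
            }
        }
        return suggestions;
    }

    public static void main(String[] args) {
        ContextChecker checker = new ContextChecker();
        checker.learnPair("better", "than");
        System.out.println(checker.check(List.of("better", "that", "nothing")));
    }
}
```

A usable version would need pair counts from a large body of text and would look at the following word as well, but even this much would catch "better that nothing".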
What do we need to rectify these problems? In descending order of importance:
- We need a universal interface for spell checker plugins, much like JCE or JavaMail. You can buy a high performance one,
plug it in, and it works identically with all apps. We should start with Java, and later try to extend it to
all the apps on an OS (Operating System).
- Spell checkers need to work anywhere and everywhere you edit text
… filling in forms, composing email, programming, browsing, chattering on Facebook… all in
exactly the same way.
- There need to be hierarchical exception lists of additional legitimate words. Some words are universally
ok, some ok just in the context of a certain document, others only in the context of a sentence or even word
instance.
- Hidden in the text there needs to be embedded information about which checks have already been done, on which
parts of the document, and by whom. That way you don’t have to keep rechecking the same stuff over and over
every time you make a tiny change to the document. It also can be used to ensure you never export anything
without first spell checking it.
- Spell checkers need to be transparently collaborative. You should be able to automatically submit your
document to several automated checkers and/or professional human proofreaders, then automatically compare the
results and deal only with the discrepancies yourself. The various proofreading services (who might just be friends
you swap with to get fresher eyes), can work simultaneously, and continuously as you edit your documents.
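As a sketch of what the first and third items might look like in Java, here is a tiny plug-in interface and a hierarchical exception list. None of this exists as a standard. SpellCheckEngine, ExceptionList and ScopedSpellChecker are names I made up to show the shape of the idea: apps talk only to the interface, any vendor's engine can sit behind it, and each exception list falls back to the wider scope above it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical interface that any spell checking engine could implement. */
interface SpellCheckEngine {
    boolean isCorrect(String word);
}

/** Hypothetical exception list that falls back to a wider parent scope. */
class ExceptionList {
    private final ExceptionList parent; // null for the universal scope
    private final Set<String> words = new HashSet<>();

    ExceptionList(ExceptionList parent) {
        this.parent = parent;
    }

    void add(String word) {
        words.add(word);
    }

    /** True if the word is accepted in this scope or in any wider scope. */
    boolean contains(String word) {
        return words.contains(word) || (parent != null && parent.contains(word));
    }
}

/** Combines whichever engine is plugged in with the exceptions for the current scope. */
class ScopedSpellChecker {
    private final SpellCheckEngine engine;

    ScopedSpellChecker(SpellCheckEngine engine) {
        this.engine = engine;
    }

    List<String> misspelled(List<String> words, ExceptionList scope) {
        List<String> bad = new ArrayList<>();
        for (String w : words) {
            if (!engine.isCorrect(w) && !scope.contains(w)) {
                bad.add(w);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // A stand-in engine; a real one would be supplied by a vendor.
        SpellCheckEngine engine = w -> w.equals("spell") || w.equals("checker");
        ExceptionList universal = new ExceptionList(null);
        universal.add("JavaMail");
        ExceptionList thisDocument = new ExceptionList(universal);
        thisDocument.add("Aspell");
        ScopedSpellChecker checker = new ScopedSpellChecker(engine);
        System.out.println(checker.misspelled(
                List.of("spell", "JavaMail", "Aspell", "chekcer"), thisDocument));
    }
}
```

Java already has java.util.ServiceLoader for discovering implementations of an interface at run time, so the plug-in half needs no new machinery, only agreement on the interface. Sentence-level and word-instance scopes would simply be further ExceptionList objects chained below the document one.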