classifying characters : Java Glossary

classifying characters
The Character class has a number of methods for classifying characters: getType, isWhitespace, isIdentifierIgnorable, isLetter, isDigit, isUpperCase, isLowerCase etc.
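As a rough illustration of how these methods might be combined, here is a minimal sketch. The class name CharacterClassDemo, the describe method and the order of the tests are assumptions made for the example; only the Character calls themselves come from the JDK.

```java
final class CharacterClassDemo {
    /** Describes a char using the Unicode-aware tests in java.lang.Character. */
    static String describe(char c) {
        if (Character.isWhitespace(c)) return "whitespace";
        if (Character.isDigit(c)) return "digit";
        if (Character.isUpperCase(c)) return "upper case letter";
        if (Character.isLowerCase(c)) return "lower case letter";
        if (Character.isLetter(c)) return "other letter"; // letters with no case, e.g. Hebrew
        return "other";
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute, \u05d0 is Hebrew alef
        char[] samples = {'a', 'Z', '7', ' ', '\u00e9', '\u05d0', '$'};
        for (char c : samples) {
            // getType returns one of the int category constants such as Character.UPPERCASE_LETTER
            System.out.println((int) c + ": " + describe(c) + ", getType=" + Character.getType(c));
        }
    }
}
```

getType is the most general of these: it returns the Unicode general category as an int constant, so a single call can feed a switch instead of a chain of isXxx tests.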

These methods are quite complex internally since they deal with the full Unicode character set. If you are dealing only with ASCII (American Standard Code for Information Interchange) characters you can use simpler logic such as:

if ( '0' <= c && c <= '9' )...
if ( 'a' <= c && c <= 'z' )...
if ( 'A' <= c && c <= 'Z' )...
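The range tests above could be wrapped in small helper methods, roughly as follows. The class and method names are hypothetical, chosen only for this sketch. Note that these helpers deliberately give different answers from the Character methods for non-ASCII input.

```java
final class AsciiClassifier {
    /** True only for the ten ASCII digits 0 through 9. */
    static boolean isAsciiDigit(char c) {
        return '0' <= c && c <= '9';
    }

    /** True only for the ASCII lower case letters a through z. */
    static boolean isAsciiLower(char c) {
        return 'a' <= c && c <= 'z';
    }

    /** True only for the ASCII upper case letters A through Z. */
    static boolean isAsciiUpper(char c) {
        return 'A' <= c && c <= 'Z';
    }

    /** True for any ASCII letter, either case. */
    static boolean isAsciiLetter(char c) {
        return isAsciiLower(c) || isAsciiUpper(c);
    }

    public static void main(String[] args) {
        char arabicIndicThree = '\u0663'; // a digit in Unicode, but not in ASCII
        System.out.println(isAsciiDigit('5'));
        System.out.println(isAsciiDigit(arabicIndicThree));
        System.out.println(Character.isDigit(arabicIndicThree));
    }
}
```

Whether that difference matters depends on the data: for parsing a fixed ASCII format the simple test is arguably the safer one, since Character.isDigit accepts digits from other scripts.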

An easy way to detect a vowel would be:

if ( "aeiou".indexOf ( c ) >= 0 )...
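A self-contained version of the vowel test might look like this. The class name VowelTest and the decision to accept upper case vowels as well are assumptions added for the sketch, not part of the original one-liner.

```java
final class VowelTest {
    // Every character that counts as a vowel; extend the string to add more.
    private static final String VOWELS = "aeiouAEIOU";

    /** True if c is one of the ASCII vowels, in either case. */
    static boolean isVowel(char c) {
        // indexOf returns -1 when the character is not in the string
        return VOWELS.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        String word = "Glossary";
        int count = 0;
        for (int i = 0; i < word.length(); i++) {
            if (isVowel(word.charAt(i))) {
                count++;
            }
        }
        System.out.println(word + " contains " + count + " vowels");
    }
}
```

The same idiom works for any small, fixed set of characters. indexOf does a linear scan, so it suits short strings best.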

A switch statement with a case for each character lets the compiler figure out how to categorise efficiently, but the categories must be fixed at compile time.
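A switch-based classifier could be sketched like this. The Kind enum, the class name and the particular categories are hypothetical, picked only to show the shape of the code.

```java
final class SwitchClassifier {
    /** Hypothetical categories for the example. */
    enum Kind { VOWEL, DIGIT, PUNCTUATION, OTHER }

    /** Classifies c using case labels that are all compile-time constants. */
    static Kind classify(char c) {
        switch (c) {
            case 'a': case 'e': case 'i': case 'o': case 'u':
                return Kind.VOWEL;
            case '0': case '1': case '2': case '3': case '4':
            case '5': case '6': case '7': case '8': case '9':
                return Kind.DIGIT;
            case '.': case ',': case ';': case ':': case '!': case '?':
                return Kind.PUNCTUATION;
            default:
                return Kind.OTHER;
        }
    }

    public static void main(String[] args) {
        for (char c : "e7?x".toCharArray()) {
            System.out.println(c + " -> " + classify(c));
        }
    }
}
```

Because every case label must be a constant, changing the categories means recompiling. A table-driven approach is the usual alternative when the categories have to be loaded at run time.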

The traditional classifying method, a translate table of byte classifications indexed by character, consumes 64K per table: one byte for each of the 65,536 possible char values. It is fast, but gobbles RAM (Random Access Memory). If you need only a yes/no answer per character, you could use a BitSet to shrink that to 8K. Consider using a HashMap indexed by Character to look up a sparse set of characters. Another technique is to binary search a sorted table of special characters. You might look inside the Sun character-classifying methods in Character and Collator to learn a few clever tricks.
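Three of these techniques (the byte table, the BitSet and the binary search table) might be sketched as follows. The class name, the category codes and the particular character sets are all assumptions made for the example. A HashMap version would follow the same pattern with a Map keyed by Character.

```java
import java.util.Arrays;
import java.util.BitSet;

final class TableClassifier {
    // Hypothetical category codes stored in the byte table.
    static final byte OTHER = 0;
    static final byte DIGIT = 1;
    static final byte LETTER = 2;

    // Translate table: one byte for each of the 65,536 char values, about 64K.
    static final byte[] TABLE = new byte[Character.MAX_VALUE + 1];
    static {
        for (char c = '0'; c <= '9'; c++) TABLE[c] = DIGIT;
        for (char c = 'a'; c <= 'z'; c++) TABLE[c] = LETTER;
        for (char c = 'A'; c <= 'Z'; c++) TABLE[c] = LETTER;
        // everything else stays OTHER, the default value 0
    }

    /** Classification by a single array index. */
    static byte category(char c) {
        return TABLE[c];
    }

    // BitSet: one bit per char, enough for a single yes/no category.
    static final BitSet PUNCTUATION = new BitSet(Character.MAX_VALUE + 1);
    static {
        for (char c : ".,;:!?".toCharArray()) PUNCTUATION.set(c);
    }

    static boolean isPunctuation(char c) {
        return PUNCTUATION.get(c);
    }

    // Binary search table: a small sorted array of special characters.
    static final char[] SPECIAL = "<>&\"'".toCharArray();
    static {
        Arrays.sort(SPECIAL); // binarySearch requires sorted input
    }

    static boolean isSpecial(char c) {
        return Arrays.binarySearch(SPECIAL, c) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(category('7') == DIGIT);
        System.out.println(isPunctuation('!'));
        System.out.println(isSpecial('&'));
    }
}
```

The byte table trades memory for the fastest lookup and can hold up to 256 categories. The BitSet holds only one category but is an eighth of the size. The sorted array costs almost nothing in memory but needs several comparisons per lookup.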

