accents : Java Glossary


accents
English, French, German, Italian and Swedish use modified letters such as é (e acute), ê (e circumflex), è (e grave) and ç (c cedilla). These appear in the range 0x00c0 to 0x00ff in the Latin-1 Supplement part of Unicode.

Eastern European languages have additional accents such as š (s caron) in the range 0x0100 to 0x017f in the Latin Extended-A section of Unicode.

Esperanto has 6 accented letters: ĉ (c circumflex), ĝ (g circumflex), ĥ (h circumflex), ĵ (j circumflex), ŝ (s circumflex) and ŭ (u breve). All six fall in the Latin Extended-A section of Unicode.
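
To see which Unicode section a given character belongs to, you can ask Java directly rather than comparing against the numeric ranges by hand. The following is a minimal sketch; the class name AccentBlocks and the method isInAccentBlocks are made up for illustration and are not part of any mindprod.com package. Note that the Latin-1 Supplement block as Java defines it spans 0x0080 to 0x00ff, so it also contains symbols such as © and ×, not just accented letters.

```java
import java.lang.Character.UnicodeBlock;

// Hypothetical helper for illustration only.
final class AccentBlocks {

    /** True if c lies in the Latin-1 Supplement or Latin Extended-A block. */
    static boolean isInAccentBlocks(char c) {
        UnicodeBlock block = UnicodeBlock.of(c);
        return block == UnicodeBlock.LATIN_1_SUPPLEMENT
                || block == UnicodeBlock.LATIN_EXTENDED_A;
    }

    public static void main(String[] args) {
        // plain e, e acute, s caron, c circumflex; escapes avoid source-encoding problems
        char[] samples = {'e', '\u00e9', '\u0161', '\u0109'};
        for (char c : samples) {
            System.out.printf("U+%04X %s%n", (int) c, UnicodeBlock.of(c));
        }
    }
}
```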

Detecting Accented Vowels

com.mindprod.common18.ST.isVowel will tell you if a given character is a vowel, including accented vowels. You can download the source as part of the COMMON18 distributable. This works in JDK (Java Development Kit) 1.+.
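
The source of ST.isVowel is not reproduced on this page. If you just want the general idea, one possible approach (an assumption, not necessarily how ST.isVowel works) is to decompose the character with java.text.Normalizer and test its base letter. The class name VowelSketch is hypothetical, and the sketch treats only a, e, i, o and u as vowels.

```java
import java.text.Normalizer;

// Hypothetical sketch; NOT the com.mindprod.common18.ST implementation.
final class VowelSketch {

    /** True if c is a, e, i, o or u in either case, with or without an accent. */
    static boolean isVowel(char c) {
        // NFD splits an accented letter into its base letter plus combining marks.
        String decomposed = Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD);
        char base = decomposed.charAt(0);
        return "aeiouAEIOU".indexOf(base) >= 0;
    }
}
```

Letters that Unicode does not decompose, such as ø and æ, are not recognised as vowels by this sketch.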

Removing Accents

Here is how you can convert accented chars to unaccented ones, in Java version 1.6 or later.
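
The original listing is missing from this copy of the page. What follows is a minimal sketch of the usual technique with java.text.Normalizer, which was added in Java 1.6: decompose the string to form NFD, then delete the combining marks. The class name AccentStripper and the method removeAccents are hypothetical names chosen for this example.

```java
import java.text.Normalizer;

// Hypothetical helper for illustration; the names are not from the original page.
final class AccentStripper {

    /** Returns s with combining accents removed, e.g. e acute becomes plain e. */
    static String removeAccents(String s) {
        // Step 1: canonical decomposition, so e acute becomes e + U+0301.
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // Step 2: strip everything in the Combining Diacritical Marks block.
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        System.out.println(removeAccents("na\u00efve caf\u00e9"));
    }
}
```

This only handles letters that Unicode decomposes into a base letter plus a combining mark. Letters such as ø, ł and ß have no such decomposition and pass through unchanged, so they need a separate lookup table if you want them mapped to plain ASCII.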

Complications

Learning More

Oracle’s Javadoc on Normalizer class
Oracle’s Javadoc on Normalizer.Form

This page is posted on the web at:

http://mindprod.com/jgloss/accents.html

Canadian Mind Products