UTF (Unicode Transformation Format) BOM (Byte Order Mark) Unicode-encoding endian indicators

| 0xfeff BOM as it appears encoded | Description |
|---|---|
| ef bb bf | UTF-8. Endianness, strictly speaking, does not apply, though multi-byte sequences are written most-significant part first, as in big-endian. |
| fe ff | UTF-16 for 16-bit internal UCS-2, big endian, Java network order |
| ff fe | UTF-16 for 16-bit internal UCS-2, little endian, Intel/Microsoft order. Note that you must examine the following two bytes to tell this apart from the little-endian UTF-32 BOM, since both start with ff fe. |
| 00 00 fe ff | UTF-32 for 32-bit internal UCS-4, big-endian, Java network order |
| ff fe 00 00 | UTF-32 for 32-bit internal UCS-4, little endian, Intel/Microsoft order. |
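The table above can be turned into a small lookup. The following is a minimal sketch, not a JDK facility: the class `BomSniffer` and its methods are hypothetical names made up for illustration. It tests the four-byte UTF-32 marks before the two-byte UTF-16 marks, because ff fe 00 00 also begins with ff fe.

```java
// Hypothetical helper, not part of the JDK: maps leading bytes to an encoding name.
final class BomSniffer {

    /** Returns the encoding implied by a leading BOM, or null if no BOM is present. */
    static String detect(byte[] b) {
        // Check the 4-byte UTF-32 marks first: ff fe 00 00 would otherwise
        // be mistaken for the 2-byte UTF-16 little-endian mark ff fe.
        if (startsWith(b, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(b, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(b, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(b, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(b, 0xFF, 0xFE)) return "UTF-16LE";
        return null;
    }

    /** True if data begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};
        System.out.println(detect(sample)); // prints UTF-16LE
    }
}
```

One caveat of this ordering: a UTF-16 little-endian file whose first character after the BOM is NUL (00 00) is reported as UTF-32LE. Such files are rare, but the sketch cannot distinguish the two cases.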
There are also variants of these encodings, such as UTF-16BE and UTF-16LE, where the byte order is implied by the encoding name rather than announced by a BOM.
Unfortunately, applications often choke on these byte order marks, even javac.exe. Java Readers don't automatically filter them out. There is not much you can do but remove them manually.
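Removing the mark manually can be done once the bytes have been decoded, since a BOM in any of the encodings above decodes to the single character U+FEFF. Here is a minimal sketch using the standard `java.io.PushbackReader`; the class `BomSkipper` and its methods `skipBom` and `readWithoutBom` are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of the JDK.
final class BomSkipper {

    /** Wraps a Reader so that a leading U+FEFF, if present, is consumed. */
    static Reader skipBom(Reader in) throws IOException {
        PushbackReader pr = new PushbackReader(in, 1);
        int first = pr.read();
        // Push back anything that is not a BOM; end of stream (-1) needs no pushback.
        if (first != -1 && first != 0xFEFF) {
            pr.unread(first);
        }
        return pr;
    }

    /** Convenience for demonstration: reads the whole text through skipBom. */
    static String readWithoutBom(String text) {
        try (Reader r = skipBom(new StringReader(text))) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWithoutBom("\uFEFFhello")); // prints hello
    }
}
```

In real use the argument to `skipBom` would be an `InputStreamReader` opened with the correct charset. The sketch assumes the charset is already known and only drops the decoded mark.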
This program tests how Java handles BOMs. It discovers that Java never inserts a BOM and never removes one on its own. You have to bypass, insert and delete them explicitly.
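A check of this kind can be sketched in a few lines. The following is not the author's original program; it is a minimal illustration covering UTF-8 only, and the class `BomProbe` and its method names are hypothetical. It says nothing about how other charsets behave.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical probe, limited to UTF-8.
final class BomProbe {

    /** True if a UTF-8 OutputStreamWriter emitted ef bb bf by itself. */
    static boolean writerInsertsBom() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write("A");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] b = out.toByteArray();
        return b.length >= 3
                && (b[0] & 0xFF) == 0xEF
                && (b[1] & 0xFF) == 0xBB
                && (b[2] & 0xFF) == 0xBF;
    }

    /** True if a UTF-8 InputStreamReader hands the BOM to the caller as U+FEFF. */
    static boolean readerKeepsBom() {
        byte[] in = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'A'};
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(in), StandardCharsets.UTF_8)) {
            return r.read() == 0xFEFF;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("writer inserts BOM: " + writerInsertsBom());
        System.out.println("reader keeps BOM:   " + readerKeepsBom());
    }
}
```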