binary formats : Java Glossary
by Roedy Green ©1996-2008 Canadian Mind Products
binary formats
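DataOutputStream writes Java data in big-endian binary format. In contrast, PrintStream.println and PrintStream.print write data as human-readable characters, encoded with the platform's default encoding unless you specify another. For ordinary numeric output these are plain ASCII characters.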
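The contrast can be seen by writing the same int both ways into an in-memory buffer. This is only a minimal sketch: the class name BinaryVsText and its helper methods are hypothetical, and the PrintStream encoding is pinned to US-ASCII so the result does not depend on the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class BinaryVsText {
    /** Bytes produced by DataOutputStream.writeInt: always 4, big-endian. */
    static byte[] asBinary(int value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory buffer
        }
        return buf.toByteArray();
    }

    /** Bytes produced by PrintStream.print: one character per decimal digit. */
    static byte[] asText(int value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(buf, true, "US-ASCII")) {
            out.print(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // US-ASCII is always supported
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(asBinary(1000))); // [0, 0, 3, -24]
        System.out.println(Arrays.toString(asText(1000)));   // [49, 48, 48, 48]
    }
}
```

The binary form is 0x00 0x00 0x03 0xE8 (Java prints 0xE8 as the signed byte -24); the text form is the four characters '1', '0', '0', '0'.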

Java DataOutputStream Binary File Formats

When you create a file via DataOutputStream, what does the binary file look like? It looks like the internal binary RAM format in a big-endian CPU. These are also the internal formats in the Java Virtual Machine.

Everything is stored big-endian, MSB (Most Significant Byte) first. (People who cut their teeth on Intel or the MOS 6502 are used to the little-endian, LSB (Least Significant Byte) first format.) Even on Intel hardware, Java uses big-endian file formats. This permits data interchange with other platforms.
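To see the byte order, you can dump what writeInt produces and compare it with a little-endian layout. The sketch below is illustrative only: EndianDemo and its helpers are hypothetical names, and java.nio.ByteBuffer is used merely as a convenient way to produce the little-endian bytes for comparison.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    /** The four bytes DataOutputStream.writeInt emits, MSB first. */
    static byte[] bigEndian(int value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory buffer
        }
        return buf.toByteArray();
    }

    /** The same int laid out LSB first, as a little-endian CPU would store it. */
    static byte[] littleEndian(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
    }

    /** Formats bytes as space-separated two-digit hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(bigEndian(0x01020304)));    // 01 02 03 04
        System.out.println(hex(littleEndian(0x01020304))); // 04 03 02 01
    }
}
```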

There are no separators between fields. The files are in binary, not readable ASCII.
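Because nothing marks where one field ends and the next begins, the reader must know the layout and read the fields back in the order they were written. A minimal sketch, assuming a made-up three-field record (the class RecordRoundTrip and the fields id, price and inStock are hypothetical):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class RecordRoundTrip {
    /** Writes int + double + boolean back to back: 4 + 8 + 1 = 13 bytes. */
    static byte[] write(int id, double price, boolean inStock) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(id);
            out.writeDouble(price);
            out.writeBoolean(inStock);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory buffer
        }
        return buf.toByteArray();
    }

    /** Reads the fields in exactly the order and with the types they were written. */
    static String read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            double price = in.readDouble();
            boolean inStock = in.readBoolean();
            return id + " " + price + " " + inStock;
        } catch (IOException e) {
            throw new UncheckedIOException(e); // e.g. EOFException if data is too short
        }
    }

    public static void main(String[] args) {
        byte[] data = write(7, 2.5, true);
        System.out.println(data.length); // 13
        System.out.println(read(data));  // 7 2.5 true
    }
}
```

Reading with the wrong types or in the wrong order does not normally raise an error; it silently yields garbage, since the bytes carry no type information.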

Method                   Type     Size              Description
writeBoolean(boolean v)  boolean  1 byte            8-bit, 0x00=false 0x01=true
writeByte(int v)         byte     1 byte            8-bit signed binary integer
                                                    or 8-bit ASCII char
writeBytes(String s)     bytes    1 byte per char   8-bit signed binary integers
                                                    or string of ASCII chars.
                                                    only the low-order 8 bits of each char are written.
                                                    not null terminated.
                                                    not in quotes.
                                                    not counted.
                                                    not delimited in any way.
writeChar(int v)         char     2 bytes           16-bit unsigned binary integer
                                                    or 16-bit Unicode char.
writeChars(String s)     chars    2 bytes per char  16-bit unsigned binary integers
                                                    or string of 16-bit Unicode chars.
                                                    not null terminated.
                                                    not in quotes.
                                                    not counted.
                                                    not delimited in any way.
writeDouble(double v)    double   8 bytes           64-bit IEEE binary
                                                    1-bit sign
                                                    11-bit base 2 exponent, biased +1023
                                                    52-bit fraction, lead 1 implied
                                                    e.g.
                                                    3.0 = 0x4008000000000000
                                                    -3.0 = 0xC008000000000000
writeFloat(float v)      float    4 bytes           32-bit IEEE binary
                                                    1-bit sign
                                                    8-bit base 2 exponent, biased +127
                                                    23-bit fraction, lead 1 implied
                                                    e.g.
                                                    3.0 = 0x40400000
                                                    -3.0 = 0xC0400000
writeInt(int v)          int      4 bytes           32-bit signed binary
                                                    e.g.
                                                    3 = 0x00 0x00 0x00 0x03
                                                    -3 = 0xff 0xff 0xff 0xfd
writeLong(long v)        long     8 bytes           64-bit signed binary
writeShort(int v)        short    2 bytes           16-bit signed binary
writeUTF(String s)       utf      2 bytes + string  16-bit length count (the number of encoded bytes that follow)
                                                    followed by the string, one byte per ASCII-7 char.
                                                    Not null terminated.
                                                    "ABC" == 0x0003414243
                                                    non-ASCII-7 chars use multibyte
                                                    encodings with every byte having
                                                    the high bit on.
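The IEEE bit patterns can be checked from Java itself, since writeFloat and writeDouble write the bits returned by Float.floatToIntBits and Double.doubleToLongBits. The following sketch (the class name IeeeBits and its helpers are hypothetical) prints the patterns for 3.0 and -3.0 and picks a double apart into its sign, exponent and fraction fields.

```java
public class IeeeBits {
    /** 32-bit pattern of a float as 8 hex digits. */
    static String floatHex(float f) {
        return String.format("0x%08X", Float.floatToIntBits(f));
    }

    /** 64-bit pattern of a double as 16 hex digits. */
    static String doubleHex(double d) {
        return String.format("0x%016X", Double.doubleToLongBits(d));
    }

    /** Top bit: 0 for positive, 1 for negative. */
    static long sign(double d) {
        return Double.doubleToLongBits(d) >>> 63;
    }

    /** Next 11 bits: the exponent with the bias of 1023 still added in. */
    static long biasedExponent(double d) {
        return (Double.doubleToLongBits(d) >>> 52) & 0x7FF;
    }

    /** Low 52 bits: the fraction, without the implied leading 1. */
    static long fraction(double d) {
        return Double.doubleToLongBits(d) & 0xFFFFFFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(floatHex(3.0f));    // 0x40400000
        System.out.println(floatHex(-3.0f));   // 0xC0400000
        System.out.println(doubleHex(3.0));    // 0x4008000000000000
        System.out.println(doubleHex(-3.0));   // 0xC008000000000000
        // 3.0 = 1.5 * 2^1, so the biased exponent is 1 + 1023 = 1024
        System.out.println(biasedExponent(3.0) - 1023); // 1
    }
}
```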

UTF

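UTF is a compact form of Unicode that uses a mixture of 8, 16 and 24-bit codes. Strings are stored as a 16-bit big-endian length count, giving the number of encoded bytes that follow (not the number of characters), followed by the encoded characters. ASCII-7 chars take one byte each. Not null terminated. "ABC" == 0x0003414243. Non-ASCII-7 chars use multibyte encodings in which every byte has the high bit on. The one oddity is the null char, which writeUTF encodes as the two bytes 0xC0 0x80 rather than as a single zero byte. UTF is an external format. UTF strings are interconverted to ordinary Strings during I/O by readUTF and writeUTF. Unicode-2 supports even 32-bit characters, and UTF has been extended to handle them as well. writeUTF itself writes such a character as a surrogate pair, three bytes for each half.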
Unicode              UTF                           Bytes required to represent the character
00000000 0xxxxxxx    0xxxxxxx                      1
00000yyy yyxxxxxx    110yyyyy 10xxxxxx             2
zzzzyyyy yyxxxxxx    1110zzzz 10yyyyyy 10xxxxxx    3
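The three string-writing methods produce quite different bytes for the same String. The sketch below dumps each in hex. The class StringEncodings, the DataWrite interface and the helper names are hypothetical, introduced only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class StringEncodings {
    /** Hypothetical helper type: one write operation against a DataOutputStream. */
    interface DataWrite {
        void write(DataOutputStream out) throws IOException;
    }

    /** Runs the write and returns the resulting bytes as space-separated hex. */
    static String hex(DataWrite op) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            op.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // e.g. writeUTF on a string too long to count
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : buf.toByteArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    static String viaWriteBytes(String s) { return hex(out -> out.writeBytes(s)); }
    static String viaWriteChars(String s) { return hex(out -> out.writeChars(s)); }
    static String viaWriteUTF(String s)   { return hex(out -> out.writeUTF(s)); }

    public static void main(String[] args) {
        System.out.println(viaWriteBytes("ABC"));  // 41 42 43
        System.out.println(viaWriteChars("ABC"));  // 00 41 00 42 00 43
        System.out.println(viaWriteUTF("ABC"));    // 00 03 41 42 43
        System.out.println(viaWriteUTF("\u00e9")); // 00 02 c3 a9    (2-byte form)
        System.out.println(viaWriteUTF("\u20ac")); // 00 03 e2 82 ac (3-byte form)
        System.out.println(viaWriteUTF("\0"));     // 00 02 c0 80    (null char)
    }
}
```

Note that the length count for "\u00e9" is 2, not 1: it counts encoded bytes, not characters.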

The information on this page is for non-military use only. Military use includes use by defence contractors.
You can get a fresh copy of this page from: http://mindprod.com/jgloss/binaryformats.html