Zip : Java Glossary

Zip
The word zip refers both to American postal codes and to PKWARE’s public domain file archiving and compression format. Sun has extended it in its JAR and WAR (Web Archive) files to have a formal table of contents.
Zip Postal Codes
Zip File Format
Writing Elements of a Zip File
Reading elements of a Zip File Sequentially
Reading elements of a Zip File Randomly
Verifying
Directories
Nesting
Gotchas
GZIP vs Zip
encryption
Java 7
Future Archive Format
Learning More
Links

Zip Postal Codes

ZIP (Zone Improvement Plan) is the American postal code, a 5+4 digit number. The codes are assigned so that you can determine the state from the first three digits of the zip code. The US Post Office has an online zip code lookup.

Zip File Format

Zip files and jars have a similar format. Each element is preceded by a local header, and a summary set of headers (the central directory) sits at the very end of the file. PKWARE documents the ZIP file header format.

PKZIP and WinZip use / as the directory separator character. It is up to you to convert the \ to / in element names for the ZipEntry write and back again on read. If you don’t bother, the \ will get in the zip file and you will have a platform-dependent zip.
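The separator conversion can be sketched as follows. The class and method names (ZipNames, toEntryName, toPlatformPath) are made up for illustration. The sketch assumes every backslash in the path is a separator, which holds for Windows paths but not necessarily for Unix filenames.

```java
import java.io.File;

/** Sketch: converting between platform paths and zip element names. */
final class ZipNames {

    /** Convert a platform path such as dir\sub\file.txt to a zip element name. */
    static String toEntryName(String platformPath) {
        // Assumption: every backslash is a directory separator.
        return platformPath.replace('\\', '/');
    }

    /** Convert a zip element name back to a path for the current platform. */
    static String toPlatformPath(String entryName) {
        return entryName.replace('/', File.separatorChar);
    }

    public static void main(String[] args) {
        // prints com/mindprod/zip/Demo.class
        System.out.println(toEntryName("com\\mindprod\\zip\\Demo.class"));
    }
}
```

Call toEntryName on the name you pass to the ZipEntry constructor when writing, and toPlatformPath on ZipEntry.getName() when extracting.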

Apache VFS gives you a common API (Application Programming Interface) for files that works both for regular files and zip file members. Normally you do your work with ZipFile, ZipEntry, ZipInputStream and ZipOutputStream, or for simpler tasks GZIPInputStream and GZIPOutputStream.

Writing Elements of a Zip File

The classes in package java.util.zip, such as ZipFile, ZipInputStream and ZipOutputStream, will let you read and create zip or jar files. Don’t worry about ZipEntry.setCrc, since the CRC and the compressed size get set automatically for the default DEFLATED method. Only STORED (uncompressed) entries need the size and CRC set before you write them.
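A minimal sketch of writing elements with ZipOutputStream. It builds the zip in memory to stay self-contained. The class name ZipWriteSketch and the map of names to contents are assumptions for illustration; in practice you would wrap a FileOutputStream instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Sketch: write a few elements into a zip held in memory. */
final class ZipWriteSketch {

    /** Builds a zip from element name -> contents, in map iteration order. */
    static byte[] zip(Map<String, byte[]> elements) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> element : elements.entrySet()) {
                // Element names use / as the separator. The method defaults to DEFLATED.
                zos.putNextEntry(new ZipEntry(element.getKey()));
                zos.write(element.getValue());
                // CRC and sizes are computed by the stream; no setCrc call is needed.
                zos.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> elements = new LinkedHashMap<>();
        elements.put("docs/hello.txt", "hello".getBytes(StandardCharsets.UTF_8));
        elements.put("docs/world.txt", "world".getBytes(StandardCharsets.UTF_8));
        System.out.println(zip(elements).length + " bytes of zip written");
    }
}
```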

Reading elements of a Zip File Sequentially

Sequential reading code that relies on the length fields in each element’s header won’t work if ZipOutputStream was used to create the zip file. This includes *.jar files created by jar.exe.

To read all the elements of a zip, you might think you would use ZipFile.entries() to enumerate all the entries. Unfortunately, this enumeration is not guaranteed to come back in file order; in practice it has been Hashtable order. So you need to use the random access method below. To move the disk arms efficiently over the file, you really should sort the entries first into the order they appear in the zip.
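For comparison, here is a sketch of a sequential pass with ZipInputStream that avoids the length fields altogether. It reads each element until the stream reports end of entry instead of trusting ZipEntry.getSize(), which can return -1 when the size is not known. The class name ZipSequentialRead and the choice to collect contents in a map are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Sketch: read every element of a zip in the order it appears in the stream. */
final class ZipSequentialRead {

    /** Returns element name -> contents, preserving stream order. */
    static Map<String, byte[]> readAll(InputStream in) throws IOException {
        Map<String, byte[]> elements = new LinkedHashMap<>();
        byte[] buffer = new byte[8192];
        try (ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directory entries carry no data
                }
                // Do not size a buffer from entry.getSize(): it may be -1 here.
                ByteArrayOutputStream contents = new ByteArrayOutputStream();
                int n;
                while ((n = zis.read(buffer)) != -1) {
                    contents.write(buffer, 0, n);
                }
                elements.put(entry.getName(), contents.toByteArray());
            }
        }
        return elements;
    }

    public static void main(String[] args) throws IOException {
        // Build a small zip with ZipOutputStream, then read it back sequentially.
        ByteArrayOutputStream zipped = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(zipped)) {
            zos.putNextEntry(new ZipEntry("a.txt"));
            zos.write("first".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        Map<String, byte[]> read = readAll(new ByteArrayInputStream(zipped.toByteArray()));
        for (Map.Entry<String, byte[]> e : read.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().length + " bytes");
        }
    }
}
```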

Reading elements of a Zip File Randomly

You can read elements randomly, given the element name, with ZipFile.getEntry and ZipFile.getInputStream. This works even if ZipOutputStream was used to create the zip file, which fails to build the length elements correctly.
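A minimal sketch of random access by element name using ZipFile. The class name ZipRandomRead and the convention of returning null for a missing element are assumptions for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

/** Sketch: fetch a single element of a zip by name. */
final class ZipRandomRead {

    /** Returns the contents of the named element, or null if there is no such element. */
    static byte[] read(Path zipPath, String elementName) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            ZipEntry entry = zip.getEntry(elementName);
            if (entry == null) {
                return null;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                ByteArrayOutputStream contents = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    contents.write(buffer, 0, n);
                }
                return contents.toByteArray();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway zip with ZipOutputStream, then look up one element.
        Path zipPath = Files.createTempFile("random-read", ".zip");
        try (OutputStream out = Files.newOutputStream(zipPath);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry("docs/readme.txt"));
            zos.write("read me".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        byte[] contents = read(zipPath, "docs/readme.txt");
        System.out.println(new String(contents, StandardCharsets.UTF_8));
        Files.delete(zipPath);
    }
}
```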

Verifying

To verify that a zip for distribution contains all the files in the corresponding jar, compare the element names in the two archives.
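One way to sketch that check is shown here. It compares element names only, not contents or timestamps, and it makes no special case for META-INF. The class name ZipVerify and the method missingFromZip are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Sketch: check that a distribution zip holds every file found in a jar. */
final class ZipVerify {

    /** Returns the sorted names of all non-directory elements in the archive. */
    static Set<String> fileNames(Path archive) throws IOException {
        Set<String> names = new TreeSet<>();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory()) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }

    /** Names present in the jar but absent from the zip; empty means the zip is complete. */
    static Set<String> missingFromZip(Path jar, Path zip) throws IOException {
        Set<String> missing = fileNames(jar);
        missing.removeAll(fileNames(zip));
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ZipVerify some.jar distribution.zip");
            return;
        }
        Set<String> missing = missingFromZip(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(missing.isEmpty() ? "zip is complete" : "missing: " + missing);
    }
}
```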

Directories

Normally directories are not explicitly created or even stored as separate entries in a zip file. When the file is extracted, any directories needed to contain the extracted files are automatically created as needed. However, you can store empty directories in a zip file. They appear as filenames ending in /.
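Storing an empty directory can be sketched as follows: write an element whose name ends in / and give it no data. The class name ZipDirectories is made up for illustration, and the zip is built in memory only to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Sketch: store an empty directory as a zip element. */
final class ZipDirectories {

    /** Builds a zip holding a single empty directory element. */
    static byte[] zipWithEmptyDirectory(String directoryName) throws IOException {
        // The trailing / is what marks the element as a directory.
        String name = directoryName.endsWith("/") ? directoryName : directoryName + "/";
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.closeEntry(); // no data is written for a directory
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zipped = zipWithEmptyDirectory("logs");
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipped))) {
            ZipEntry entry = zis.getNextEntry();
            System.out.println(entry.getName() + " isDirectory=" + entry.isDirectory());
        }
    }
}
```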

Nesting

The member files in a zip file can be accessed individually, just like the files in a jar file (a species of zip file). However, when one zip is contained within another zip, you can only access the contained zip file itself, not its individual members. You would need to expand it to disk somewhere before accessing its members.

There are three approaches to the problem:

  1. Put all members in the same jar/zip.
  2. Use several individual jar files and arrange to have them on the path.
  3. Use a JWS (Java Web Start) installer class to unpack a nested jar into individual jars.
Why would you nest?

Gotchas

GZIP vs Zip

GZIP is a more primitive file format than zip. GZIPInputStream and GZIPOutputStream let you read and create compressed files, but not using the zip directory structure. The file consists of just one compressed lump, without any directory of member filenames, timestamps etc. For sample GZIPOutputStream code, consult the File I/O Amanuensis.
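A minimal sketch of a GZIP round trip, to show that there is a single compressed lump with no member names involved. The class name GzipSketch is an assumption, and the data stays in memory only to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Sketch: compress and decompress one lump of bytes with GZIP. */
final class GzipSketch {

    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(data); // no putNextEntry: there are no members
        }
        return bytes.toByteArray();
    }

    static byte[] decompress(byte[] compressed) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream data = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                data.write(buffer, 0, n);
            }
            return data.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```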

encryption

WinZip supports:

Java’s java.util.zip classes do not support the WinZip encryption scheme. So you must manually encrypt and decrypt either the plaintext or the compressed form, perhaps using JCE (Java Cryptography Extension). Unfortunately you will need your code at both ends to encrypt/decrypt. You won’t be able to create encrypted files that WinZip can decrypt on its own.
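A minimal sketch of the manual approach with JCE, applied to an arbitrary byte array such as a finished zip. The choice of AES in GCM mode, the 12-byte nonce prefixed to the output, and the class name ZipCryptoSketch are assumptions for illustration. None of this is part of the zip format, and WinZip cannot read the result.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Sketch: encrypt and decrypt a lump of bytes (for example a whole zip) with AES-GCM. */
final class ZipCryptoSketch {
    private static final int NONCE_LENGTH = 12; // bytes, prefixed to the ciphertext
    private static final int TAG_BITS = 128;

    /** Returns nonce followed by ciphertext and authentication tag. */
    static byte[] encrypt(SecretKey key, byte[] data) throws GeneralSecurityException {
        byte[] nonce = new byte[NONCE_LENGTH];
        new SecureRandom().nextBytes(nonce); // fresh nonce for every message
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, nonce));
        byte[] ciphertext = cipher.doFinal(data);
        byte[] out = new byte[NONCE_LENGTH + ciphertext.length];
        System.arraycopy(nonce, 0, out, 0, NONCE_LENGTH);
        System.arraycopy(ciphertext, 0, out, NONCE_LENGTH, ciphertext.length);
        return out;
    }

    /** Reverses encrypt; throws if the data was tampered with or the key is wrong. */
    static byte[] decrypt(SecretKey key, byte[] nonceAndCiphertext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, nonceAndCiphertext, 0, NONCE_LENGTH));
        return cipher.doFinal(nonceAndCiphertext, NONCE_LENGTH,
                nonceAndCiphertext.length - NONCE_LENGTH);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey key = generator.generateKey(); // both ends must share this key
        byte[] secret = encrypt(key, "zip bytes go here".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, secret), StandardCharsets.UTF_8));
    }
}
```

How the two ends share the key is outside the scope of this sketch.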

Utilities

There are command line and GUI (Graphical User Interface) utilities to create, update and extract ZIP files.
bzip2
jZip: free, based on 7-zip
PKZIP
WinRar
WinZip

Java 7

Java 7 lets you treat a Zip file as if it were a directory tree. You use the FileSystem and Paths classes.
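A minimal sketch of the directory-tree view, using the zip file system provider that ships with the JDK and is reached through a jar: URI. The class name ZipFileSystemSketch and its two helper methods are made up for illustration. The "create" property asks the provider to create the zip if it does not exist yet.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Map;

/** Sketch: treat a zip file as a directory tree with java.nio.file. */
final class ZipFileSystemSketch {

    /** Writes text to the named element, creating the zip and parent directories as needed. */
    static void writeElement(Path zipPath, String elementName, String text) throws IOException {
        URI uri = URI.create("jar:" + zipPath.toUri());
        Map<String, String> env = Collections.singletonMap("create", "true");
        try (FileSystem zipFs = FileSystems.newFileSystem(uri, env)) {
            Path inside = zipFs.getPath(elementName);
            if (inside.getParent() != null) {
                Files.createDirectories(inside.getParent());
            }
            Files.write(inside, text.getBytes(StandardCharsets.UTF_8));
        } // closing the file system flushes the changes to the zip
    }

    /** Reads the named element back as text. */
    static String readElement(Path zipPath, String elementName) throws IOException {
        URI uri = URI.create("jar:" + zipPath.toUri());
        try (FileSystem zipFs = FileSystems.newFileSystem(uri, Collections.<String, String>emptyMap())) {
            return new String(Files.readAllBytes(zipFs.getPath(elementName)), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path zipPath = Files.createTempDirectory("zipfs").resolve("demo.zip");
        writeElement(zipPath, "docs/readme.txt", "hello from inside the zip");
        System.out.println(readElement(zipPath, "docs/readme.txt"));
    }
}
```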

Classes of interest include:

Future Archive Format

Phil Katz designed the PKZIP format back in the DOS days. It has survived so long partly because it was extensible: it permitted new compression algorithms. If we were to do it over again, I would make the following changes:

Learning More

Oracle’s Javadoc on ZipFile class
Oracle’s Javadoc on ZipEntry class
Oracle’s Javadoc on ZipInputStream class
Oracle’s Javadoc on ZipOutputStream class
Oracle’s Javadoc on GZIPInputStream class
Oracle’s Javadoc on GZIPOutputStream class
Oracle’s Javadoc on java.nio.file.Files class
Oracle’s Javadoc on java.nio.file.FileStore class
Oracle’s Javadoc on java.nio.file.FileSystem class
Oracle’s Javadoc on java.nio.file.Path class
Oracle’s Javadoc on java.nio.file.Paths class

This page is posted
on the web at:

http://mindprod.com/jgloss/zip.html

Canadian Mind Products
Contact Roedy. Please feel free to link to this page without explicit permission.