TrueZIP compresses as you add, but does not construct the final archive file until you call umount. Using the default settings, TrueZIP archives take up about 28% more space than WinZip archives built with its proprietary compression algorithms. It is possible to squeeze more compression out of TrueZIP if you are willing to take more time.
See the timestamp gotchas about PkZip format. They plague TrueZIP too. TrueZIP can use other formats that may avoid these problems.
Java 7 has built-in TrueZIP-like features. Mark Hall is working on giving TrueZIP an API (Application Programming Interface) that will be compatible. That means you can write code for Java 7 that will work on earlier JDKs (Java Development Kits).
If you are starting a new project, you should use TrueVFS instead of TrueZIP. It requires Java version 1.7. TrueVFS is more scalable in terms of memory and runtime (less impact on heap), provides new features (Apple Keychain, JMX (Java Management extensions)), truly employs convention over configuration (all features are locatable on the class path) and is better tested (runs new tests which are not available for TrueZIP 7).
TrueVFS logs through SLF4J, so you also need a jar on the class path to handle that: slf4j-nop.jar turns logging off and logback-classic.jar turns it on.
TrueZIP 7 is a rewrite of TrueZIP 6 with a slightly different API.
TrueZIP 6 works on Java version 1.4 or later. TrueZIP 7 works on Java version 1.6 or later.
TrueZIP supports the following flavours of archive:
Archive formats that TrueZIP supports:

Type | Canonical Suffixes | Description | Advantages | Disadvantages |
---|---|---|---|---|
ZIP | zip | ZIP file: archive file with central directory and compressible entries. | Widely supported. People can easily access the archive without TrueZIP. Uses the standard Java java.util.zip.Deflater to do the actual compression, which is also a disadvantage because it is not particularly quick or strong. You can trade off time for additional compression. | Incompetent date-time stamp format. The timestamps don’t understand time zones or daylight saving. They are only accurate to two seconds. |
JAR | ear, jar, war | Java Archive: ZIP with custom directory tree layout. | Fully multiplatform because of Java support. | Uses rather lame compression techniques. Same advantages and disadvantages as zip. Jar is just a flavour of zip. |
ODF | odb, odf, odg, odm, odp, ods, odt, otg, oth, otp, ots, ott | OpenDocument Format, like XML (Extensible Markup Language) compressed with PkZip. | Works with OpenOffice. | Not a general format. |
TZP | tzp, zip.rae, zip.raes | RAES encrypted ZIP file. | AES (Advanced Encryption Standard) is serious encryption. | Needs aux BouncyCastle bcprov.jar. Does not use JCE (Java Cryptography Extension) because JCE lacks the needed random access. |
SFX/EXE | exe | ZIP file with a code preamble for self extraction. | If you send the archive to someone, they need no additional software at all to open it. | This driver is pretty slow. Windows only. Read-only. |
TAR (Tape Archive) | tar | TAR: uncompressed tape archive file. | Universally supported under Unix. | Needs aux ant.jar. |
TAR.BZ | tbz, tb2, tar.bz2 | TAR file wrapped in BZIP2 compression format. | More aggressive compression than ZIP. | Needs aux ant.jar. |
TAR.GZ | tar.gz, tgz | TAR file wrapped in GZIP compression format. | Traditional Unix archive. | Not particularly aggressive compression. Needs aux ant.jar. |
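
The ZIP row above says that the actual compression is done by java.util.zip.Deflater and that you can trade time for extra compression. Here is a minimal sketch of that trade-off using the JDK class directly. It does not show how TrueZIP itself is configured; the class name DeflateLevelDemo and the sample data are hypothetical, chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

/** Compares java.util.zip.Deflater output sizes at different compression levels. */
final class DeflateLevelDemo {

    /** Deflates the input at the given level (0..9) and returns the compressed size in bytes. */
    static int compressedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish(); // no more input will follow
            byte[] buffer = new byte[8192];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buffer); // only the byte count matters here
            }
            return total;
        } finally {
            deflater.end(); // release the native zlib memory
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data: repetitive text, similar to a log file.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 20_000; i++) {
            sb.append("line ").append(i % 100).append(" of some log file\n");
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.UTF_8);
        System.out.println("original:         " + data.length);
        System.out.println("BEST_SPEED:       " + compressedSize(data, Deflater.BEST_SPEED));
        System.out.println("BEST_COMPRESSION: " + compressedSize(data, Deflater.BEST_COMPRESSION));
    }
}
```

Higher levels usually produce smaller output at the cost of more CPU time, but the gain depends heavily on the data, so measure with your own files.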
Here is a simple program to add a file to a zip and display a directory of its contents.
ZIP files use IBM437, an eight-bit character set, to encode the filenames. Any filename that is not representable in this charset gets rejected. You can change this in the File API and the ZIP API. For the File API, just do this:
Another way to do it is to create an empty zip file using ZipOutputStream with a specified encoding.
However, this will stop interoperability of the created ZIP files with older tools because support for UTF-8 has been added only fairly recently! So anybody else will probably not be able to extract these ZIP files. WinZip, however, can handle them.
If you need a better option, use the JAR file format, which supports UTF-8. The TAR file format is not an option because it supports only US-ASCII.
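To illustrate the idea of writing a ZIP with an explicitly chosen filename encoding, here is a minimal sketch that uses the JDK's own java.util.zip.ZipOutputStream and ZipFile, which have accepted a Charset since Java 7. This is not TrueZIP's ZIP API, whose signatures are not reproduced here. The class name Utf8ZipNames and the entry name are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

/** Writes and lists ZIP entries whose names are encoded as UTF-8 (JDK classes only). */
final class Utf8ZipNames {

    /** Creates a ZIP with a single entry, encoding the entry name as UTF-8. */
    static void write(Path zip, String entryName, byte[] content) throws IOException {
        try (OutputStream out = Files.newOutputStream(zip);
             ZipOutputStream zos = new ZipOutputStream(out, StandardCharsets.UTF_8)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(content);
            zos.closeEntry();
        }
    }

    /** Returns the entry names of the ZIP, decoding them as UTF-8. */
    static List<String> names(Path zip) throws IOException {
        List<String> result = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip.toFile(), StandardCharsets.UTF_8)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                result.add(entries.nextElement().getName());
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path zip = Files.createTempFile("utf8-names", ".zip");
        // Unicode escapes keep this source file pure ASCII; the name contains
        // German and Japanese characters that IBM437 cannot fully represent.
        write(zip, "gr\u00fc\u00dfe/\u65e5\u672c\u8a9e.txt", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(names(zip));
        Files.delete(zip);
    }
}
```

Reading the archive back with the same charset returns the original names. Whether some other tool shows them correctly depends on that tool's UTF-8 support, which is the interoperability caveat described above.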
Now that Java 1.7 supports a TrueZIP-like Zip filesystem, which should you use? I asked Christian Schlichtherle, the author of TrueZIP, the advantages and disadvantages of both. Here is his reply.
The most fundamental difference is that TrueZIP is designed as a true VFS (Virtual File System), while NIO.2 (New Input/Output version 2) is just an abstraction over particular file system implementations: NIO.2 lacks file systems federation, so an application cannot transparently traverse and access different file systems with a uniform addressing system.
For example, if an application wants to access a ZIP file, it has to address the ZIP file with a special URI (Uniform Resource Identifier) and do specific API calls for looking up the particular file system provider. Once it has obtained a Path object for that ZIP file, it can only resolve entries within that ZIP file, but not step out to the parent file system or step in to an inner archive file from that Path object. With TrueZIP however, an application doesn’t even need to know that it accesses an archive file: all access is fully transparent with the help of a uniform addressing scheme encapsulated in the TFile or TPath classes.
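For readers who have not used the NIO.2 ZIP file system, here is a minimal sketch of the "special URI" step described above, using only the JDK. It adds one file to a ZIP and lists the root directory. The class name Nio2ZipDemo and the file names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Adds a file to a ZIP and lists its root directory via the JDK's ZIP file system. */
final class Nio2ZipDemo {

    /** Opens the ZIP through the "jar:" URI scheme, creating the archive if it is missing. */
    static FileSystem open(Path zipFile) throws IOException {
        URI uri = URI.create("jar:" + zipFile.toUri());
        return FileSystems.newFileSystem(uri, Collections.singletonMap("create", "true"));
    }

    /** Writes one entry; the archive is updated when the file system is closed. */
    static void addEntry(Path zipFile, String entryName, byte[] content) throws IOException {
        try (FileSystem zipFs = open(zipFile)) {
            Files.write(zipFs.getPath(entryName), content);
        }
    }

    /** Returns the sorted names of the entries in the root directory of the ZIP. */
    static List<String> listRoot(Path zipFile) throws IOException {
        List<String> names = new ArrayList<>();
        try (FileSystem zipFs = open(zipFile);
             DirectoryStream<Path> dir = Files.newDirectoryStream(zipFs.getPath("/"))) {
            for (Path entry : dir) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // The archive must not exist yet as an empty file, so resolve it in a fresh directory.
        Path zip = Files.createTempDirectory("nio2zip").resolve("archive.zip");
        addEntry(zip, "hello.txt", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(listRoot(zip));
    }
}
```

Paths obtained from zipFs only resolve inside that one archive, which is the limitation the paragraph above contrasts with TFile and TPath.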
Once I discussed these constraints about file system federation with Alan Bateman, who is the specification lead for JSR (Java Specification Request) 203 (alias NIO.2 ). At the time, his answer was that they didn’t want to replicate the functionality of JNDI (Java Naming and Directory Interface). My point of view is that JNDI is a terrible API in general and a horrible API to do any I/O in particular.
How does this matter? For example, if you are writing a search engine to index the platform file system, then with TrueZIP your application could easily traverse the directory tree in a simple, uniform way and it would transparently step into archive files (ZIP, TAR, etc) too if you want it to do so. Or if you are writing a software build tool, then with TrueZIP your application could easily compile nested EAR (Enterprise Archive file) / WAR (Web Archive)/JAR from various sources without needing to consider the nesting level and the different compression strategies at each nesting level.
The way I look at it is that the TrueZIP Path module isn’t just another implementation of an NIO.2 FileSystemProvider — it’s also a nice façade to the NIO.2 API.
Also, I have always strived to make simple things easy while making complex things possible. A result of this is the provision of the copy/move/delete operations for directory trees in the TFile class (alias bulk I/O). Thanks to file system federation and this feature, converting a ZIP file to a TAR.GZ file can be done in one line of code as shown on the home page. In comparison, with NIO.2 you need to implement the FileVisitor interface, which is cumbersome and provides very little benefit over implementing the traversal yourself because it accounts for all corner cases.
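To give a feel for the FileVisitor approach mentioned above, here is a minimal sketch of copying a directory tree with plain NIO.2. It assumes both paths belong to the same file system provider and ignores symbolic links and file attributes. The class name TreeCopy is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;

/** Copies a directory tree with Files.walkFileTree and a SimpleFileVisitor. */
final class TreeCopy {

    /** Recursively copies source to target; both must use the same file system provider. */
    static void copyTree(final Path source, final Path target) throws IOException {
        Files.walkFileTree(source, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                // Create the matching directory before its children are visited.
                Files.createDirectories(mirror(dir));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.copy(file, mirror(file), StandardCopyOption.REPLACE_EXISTING);
                return FileVisitResult.CONTINUE;
            }

            /** Maps a path under source to the corresponding path under target. */
            private Path mirror(Path p) {
                return target.resolve(source.relativize(p));
            }
        });
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempDirectory("treecopy-src");
        Files.createDirectories(src.resolve("sub"));
        Files.write(src.resolve("sub").resolve("a.txt"), "hi".getBytes(StandardCharsets.UTF_8));
        Path dst = Files.createTempDirectory("treecopy-dst").resolve("copy");
        copyTree(src, dst);
        System.out.println(Files.exists(dst.resolve("sub").resolve("a.txt")));
    }
}
```

Even this stripped-down version needs a visitor class with two callbacks, which is the overhead the bulk I/O methods in TFile are meant to hide.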
There are so many differences here that I can hardly address them all. Just looking at the ZIP drivers I can easily spot that Java 7 has no BZIP2 compression, no WinZip AES encryption, no RAES encryption, no appending to ZIP files and doesn’t know about the subtle differences between ZIP and JAR files when it comes to encoding entry names or date/time stamps.
How does this matter? The first time you hit a ZIP file with mojibake (I love this word) in the entry names you’ll know it.
Because TrueZIP provides convenient methods for copying data, it can also do some important optimizations. As a standard feature, TrueZIP splits reading and writing the data into separate threads. So whenever the application copies data, a pooled thread is used to read the input. Your mileage may vary, but my personal experience is that this easily cuts 30% of the runtime in comparison to a naïve read-stop-write-stop loop.
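The general idea of splitting reading and writing across threads can be sketched as follows. This is only an illustration of the technique, not TrueZIP's actual implementation. The class name PipelinedCopy, the buffer size and the queue depth are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Copies a stream with a pooled reader thread feeding the writing (calling) thread. */
final class PipelinedCopy {

    private static final byte[] EOF = new byte[0]; // sentinel, compared by identity

    static long copy(final InputStream in, OutputStream out, ExecutorService pool)
            throws IOException, InterruptedException {
        final BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(4);
        Future<Void> reader = pool.submit(new Callable<Void>() {
            @Override
            public Void call() throws IOException, InterruptedException {
                try {
                    byte[] buffer = new byte[8192];
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        if (n > 0) queue.put(Arrays.copyOf(buffer, n)); // blocks when full
                    }
                } catch (IOException | RuntimeException e) {
                    queue.put(EOF); // unblock the writer before reporting the failure
                    throw e;
                }
                queue.put(EOF);
                return null;
            }
        });
        long total = 0;
        try {
            for (byte[] chunk = queue.take(); chunk != EOF; chunk = queue.take()) {
                out.write(chunk);
                total += chunk.length;
            }
            reader.get(); // surfaces any exception thrown while reading
        } catch (ExecutionException e) {
            throw new IOException("reading failed", e.getCause());
        } finally {
            reader.cancel(true); // no effect if done; otherwise interrupts a blocked reader
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            byte[] data = new byte[100_000];
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            System.out.println(copy(new ByteArrayInputStream(data), out, pool) + " bytes copied");
        } finally {
            pool.shutdown();
        }
    }
}
```

The bounded queue lets the reader fetch the next chunk while the writer is still busy with the previous one. Any speed-up depends on the devices involved, so the 30% figure above should be read as the author's own experience rather than something this sketch guarantees.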
Another optimization avoids redundant recompression: If the application copies a ZIP entry from one ZIP file to another, then the TrueZIP Driver ZIP recognizes that and avoids inflating the data from the input ZIP file just to deflate it again to the output ZIP file. This is called RDC (Raw Data Copy). This feature takes a significant burden off the CPU (Central Processing Unit) and may be more than welcome in a server app, e.g. a Continuous Integration system.
Providing file system federation comes with some unique challenges: What if one of the addressed archive files in a path name is a false positive archive file, e.g. a regular directory or file or is non-existent? The TrueZIP Kernel recognizes this and deals with it according to the true state of the file system entry. So if for example a prospective archive file is in fact a regular directory, the application can still proceed with the operation.
As a user, you might take this aspect for granted, but please allow me to stress its importance and how TrueZIP tries to make a difference: When you are storing data persistently, would you want to bet its fate on a file system which is unreliable? Of course not! And so this is the one aspect where I don’t want to make any compromises. In clear and bullish words, I want TrueZIP to be solid as a rock!
TrueZIP uses static code analysis, assertions, unit tests, functional tests and integration tests. Of course, such a complex system isn’t bug free, but with each version, I add more tests to cover even the strangest corner cases, e.g. parallel copying/moving/deleting of entries between different levels of nested archive files which get concurrently synced to their respective parent file system — yeah, I know it sounds weird.
In comparison, at the time Java 7 was released I asked Alan Bateman if there was a test suite for NIO.2 FileSystemProvider implementations. Of course, my intention was to run my implementation for TrueZIP 7.2 against this suite. Unfortunately, there was none, not even for ZipFileSystemProvider. To be fair, I don’t know if this situation has changed. I don’t care anymore because I have ported my integration tests from the TrueZIP File API to the TrueZIP Path API for the release of TrueZIP 7.2.
    // Use TPath to get a TrueZIP-style Path.
    import de.schlichtherle.truezip.nio.file.TPath;
    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    ...
    // Path path = Paths.get( "archive.zip/entry" ); // wouldn't work
    Path path = new TPath( "archive.zip/entry" ); // use this
    InputStream in = Files.newInputStream( path );

Try the TrueZIP Archetype Path. It provides some ready-to-run Java and Scala sample code for the TrueZIP Path API and discusses some options you have with it, e.g. using the fastest way for copying streams.
This page is posted at http://mindprod.com/jgloss/truezip.html