Several modern complex file formats are based on a ZIP container; OpenDocument and EPUB, at least, work this way. However, they are not just a bunch of files joined into an archive: they follow a few rules so that tools such as file can recognize them easily. As I had to unpack, modify and repack such a container, here is a recipe for doing so.
The zip command is flexible enough to let you add, remove and replace files in the archive in place (though I am not sure it really works in place). However, I find it more convenient to work on the unpacked tree and repack later.
Unpack and work
To unpack, just use unzip, keeping in mind that these containers have no root directory, so it is better to extract them into a dedicated directory.
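As a minimal sketch of this step, the function below extracts a container into its own directory and refuses to reuse a directory that already holds files. The name unpack_container is only illustrative; -q (quiet) and -d (target directory) are standard unzip options.

```shell
# Hypothetical helper: extract a ZIP-based container into a dedicated directory.
unpack_container() {
    archive=$1
    dest=$2
    # Refuse to mix the container's files with existing ones.
    if [ -n "$(ls -A "$dest" 2>/dev/null)" ]; then
        echo "unpack_container: $dest is not empty" >&2
        return 1
    fi
    mkdir -p "$dest" && unzip -q "$archive" -d "$dest"
}
```

For example, unpack_container book.epub book would leave mimetype, META-INF and OEBPS directly under book/.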
Modify as you wish; if you add, remove or replace files, remember that there is a file list to modify accordingly: META-INF/manifest.xml for OpenDocument, OEBPS/content.opf and possibly OEBPS/toc.ncx for EPUB.
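A quick way to catch a forgotten entry in the EPUB case is to compare the tree with the file list. The sketch below prints every file under OEBPS that content.opf never references. The function name is hypothetical, and it assumes that hrefs are written relative to OEBPS, in double quotes and without URL encoding, which may not hold for every book.

```shell
# Hypothetical helper: list files under OEBPS that content.opf never references.
# Assumes hrefs are relative to OEBPS, double-quoted and not URL-encoded.
unlisted_in_opf() {
    tree=$1
    opf="$tree/OEBPS/content.opf"
    (cd "$tree/OEBPS" && find . -type f ! -name content.opf) |
    sed 's|^\./||' | sort |
    while IFS= read -r f; do
        # Print the file name only when no href mentions it.
        grep -qF "href=\"$f\"" "$opf" || printf '%s\n' "$f"
    done
}
```

Running unlisted_in_opf book on an unpacked tree should print nothing once the file list is up to date.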
Repack
To repack, you must create a ZIP archive without extra file attributes, and put the mimetype file first, uncompressed. The point is for its content to be visible as plain text at a fixed position, where it serves as a magic number indicating the file type. The remaining files can be compressed and stored in any order. For an EPUB, for instance, this gives the following commands:
% zip --no-extra --compression-method store ../book.epub mimetype
% zip --recurse-paths ../book.epub META-INF OEBPS
This could be automated by a dedicated script, but I felt no need to write one yet. By the way, these options can be abbreviated as -X0 and -r.
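Such a script could look like the following sketch, which is an illustration and not a tested tool. Both function names are hypothetical. The check relies on the ZIP layout: a local file header is 30 bytes long, the name mimetype takes 8 more, and with no extra field the content starts at offset 38. The 20-byte string application/epub+zip is specific to EPUB; OpenDocument types have other lengths.

```shell
# Hypothetical helper: print the 20 bytes found at offset 38 of an archive.
# 38 = 30-byte local file header + 8-byte name "mimetype" + no extra field.
container_magic() {
    dd if="$1" bs=1 skip=38 count=20 2>/dev/null
}

# Hypothetical helper: repack an unpacked EPUB tree and check the result.
repack_epub() {
    tree=$1
    out=$2          # absolute path, since zip runs from inside the tree
    rm -f "$out"    # zip would otherwise update an existing archive
    (
        cd "$tree" || exit 1
        # mimetype first, stored uncompressed, without extra attributes
        zip -q -X0 "$out" mimetype &&
        # then everything else, compressed
        zip -q -r "$out" META-INF OEBPS
    ) || return 1
    [ "$(container_magic "$out")" = "application/epub+zip" ]
}
```

A call such as repack_epub "$PWD/book" "$PWD/book.epub" returns a non-zero status if zip fails or if the magic string is not where it should be.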