JavaSoft, a Sun Microsystems Business

Last modified 08 Aug 1996

JAR File Format Specification version 1.0

Abstract

This specification defines a general purpose, compact archive format for packaging the components of a Java application. The JAR format supports Unicode names for entries, as well as a CRC for detecting data corruption. The format is also designed to be stream based, so that a JAR file can be created on any Java output stream, and likewise read from any Java input stream. Additionally, an optional directory can be included for random access to JAR file entries.

1. Introduction

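This specification defines a general purpose data archive format that supports Unicode entry names, carries a CRC for detecting data corruption, can be written to and read from any Java stream, and can include an optional directory for random access to entries.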

2. Specification

2.1. Conventions

The types u1, u2, and u4 represent unsigned 8-, 16-, and 32-bit integer values, respectively. All 16-bit and 32-bit quantities are represented in network (big-endian) order, where the high byte comes first. The type utf represents a Java UTF format string as handled by the classes java.io.DataInputStream and java.io.DataOutputStream.

The JAR format is described here using a C-like structure notation, where successive fields appear in the structure sequentially without padding or alignment. Header fields marked (optional) are present only when the corresponding bit of the header flag byte is set.
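
As an illustration of the conventions above, the following sketch reads each primitive type with java.io.DataInputStream, which already uses big-endian order and the Java UTF string format. The class and method names (JarPrimitives, readU1, and so on) are hypothetical helpers, not part of this specification.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helpers for the u1, u2, u4 and utf types of section 2.1.
final class JarPrimitives {
    /** Reads a u1: one unsigned byte, returned as an int in 0..255. */
    static int readU1(DataInputStream in) throws IOException {
        return in.readUnsignedByte();
    }

    /** Reads a u2: two bytes, high byte first, as an int in 0..65535. */
    static int readU2(DataInputStream in) throws IOException {
        return in.readUnsignedShort();
    }

    /** Reads a u4: four bytes, high byte first, widened to a non-negative long. */
    static long readU4(DataInputStream in) throws IOException {
        return in.readInt() & 0xFFFFFFFFL;
    }

    /** Reads a utf: a Java UTF string with a two-byte length prefix. */
    static String readUtf(DataInputStream in) throws IOException {
        return in.readUTF();
    }
}
```

A u4 is widened to long because Java has no unsigned 32-bit integer type.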

2.2. Overview

A JAR file begins with a main header, followed by zero or more JAR entries, an optional directory, and an end header. A JAR entry consists of an entry header immediately followed by the entry data stored in the ZLIB compressed data format (see references [1] and [2] for more information on the ZLIB format).
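
A minimal sketch of producing and consuming entry data in the ZLIB format, assuming the java.util.zip classes DeflaterOutputStream and InflaterInputStream with their default settings (which emit and expect the ZLIB wrapper). The class ZlibEntryData is a hypothetical name used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper: round-trips entry data through the ZLIB format.
final class ZlibEntryData {
    /** Compresses plain bytes into a ZLIB stream. */
    static byte[] compress(byte[] plain) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buf)) {
            out.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Expands a ZLIB stream back into the original bytes. */
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

When entries are read from a live stream rather than a byte array, note that InflaterInputStream may buffer bytes beyond the end of the compressed data, so a real reader would need to account for that before parsing the next header.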

Each header also contains a flag byte that specifies the type of header as well as any optional fields present in the header. The least significant 2 bits together specify the header type, which can be one of the following:

    HEAD_MAIN  = 0
    HEAD_ENTRY = 1
    HEAD_DIR   = 2
    HEAD_END   = 3
The most significant 6 bits specify the optional fields that are contained in the header.
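
The split of the flag byte described above can be sketched as follows. The constant names come from the list above; the class JarFlags and its methods are hypothetical.

```java
// Hypothetical helper for decoding a header flag byte.
final class JarFlags {
    static final int HEAD_MAIN = 0;
    static final int HEAD_ENTRY = 1;
    static final int HEAD_DIR = 2;
    static final int HEAD_END = 3;

    /** Header type: the least significant 2 bits of the flag byte. */
    static int headerType(int flags) {
        return flags & 0x03;
    }

    /** Optional-field bits: the most significant 6 bits, shifted down by 2. */
    static int optionBits(int flags) {
        return (flags & 0xFF) >>> 2;
    }

    /** Tests one bit of the flag byte; bit 0 is the least significant. */
    static boolean isSet(int flags, int bit) {
        return ((flags >>> bit) & 1) != 0;
    }
}
```

For example, a flag byte of 0x05 has header type HEAD_ENTRY with bit 2 set.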

Each header can also contain optional extra field data which has the following format:

extra_data {
    u2 size;
    extra_data_entry entries[];
}
The field size specifies the total number of bytes of extra field data. It is followed by entries, which holds the extra data itself. Each entry has the following format:
extra_data_entry {
    u1 type;
    u2 size;
    u1 data[size];
}
The field type indicates the type of entry, the field size the size of the entry data in bytes, and data the entry data. Currently, the only recognized extra field data types are:
    EDATA_COMMENT   = 0
The type EDATA_COMMENT is used to specify an optional comment for the header.
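
The following sketch walks an extra_data block entry by entry. It assumes that the outer size field counts every byte of the entries that follow, including each entry's own type and size fields; the text above does not state this explicitly. The class ExtraData is a hypothetical name.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for one extra_data_entry.
final class ExtraData {
    static final int EDATA_COMMENT = 0;

    final int type;
    final byte[] data;

    ExtraData(int type, byte[] data) {
        this.type = type;
        this.data = data;
    }

    /** Reads an extra_data block: a u2 total size, then the entries. */
    static List<ExtraData> read(DataInputStream in) throws IOException {
        // Assumption: size covers the type and size fields of each entry too.
        int remaining = in.readUnsignedShort();
        List<ExtraData> entries = new ArrayList<>();
        while (remaining > 0) {
            if (remaining < 3) {
                throw new IOException("truncated extra_data_entry");
            }
            int type = in.readUnsignedByte();
            int size = in.readUnsignedShort();
            if (size > remaining - 3) {
                throw new IOException("entry overruns extra field data");
            }
            byte[] data = new byte[size];
            in.readFully(data);
            entries.add(new ExtraData(type, data));
            remaining -= 3 + size;
        }
        return entries;
    }
}
```

Entries of an unrecognized type are kept as raw bytes, so a reader can skip them without failing.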

2.2. Main header

A JAR file begins with a main header, whose structure is given below, followed by a description of each of the main header fields:
main_header {
    u4 magic;
    u1 flags;
    u2 major_version;
    u2 minor_version;
    extra_data edata;	(optional)
    u2 crc;
}
magic (magic number)
This has the fixed value of magic = 0xC0C0ADAC and identifies the file as being in the JAR file format.

flags (header flags)
The flag byte indicates the header type and any optional header fields, and is divided into individual bits as follows:
bit 0	0
bit 1	0
bit 2	FLAG_EXTRA
bits 3-7	reserved

major_version (major version)
This is the major version of the JAR format, and currently has the value of major_version = 1 to indicate version 1.0.

minor_version (minor version)
This is the minor version of the JAR format, and currently has the value of minor_version = 0 to indicate version 1.0.

edata (extra data)
If FLAG_EXTRA is set, then optional extra field data is present as described above.

crc (header crc)
The field crc specifies the 16-bit CRC of the main header contents. The CRC-16 consists of the two least significant bytes of the CRC-32 for all bytes of the main header up to but not including the CRC-16 field itself.
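
A sketch of the CRC-16 computation and of a minimal main header with no extra data. It assumes that the CRC-32 meant here is the one implemented by java.util.zip.CRC32, which this specification does not confirm; the class HeaderCrc and its methods are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical helper: header CRC-16 and a minimal main header.
final class HeaderCrc {
    static final int MAGIC = 0xC0C0ADAC;

    /** Returns the two least significant bytes of the CRC-32 of buf[off..off+len). */
    static int crc16(byte[] buf, int off, int len) {
        CRC32 crc = new CRC32(); // assumption: same CRC-32 as java.util.zip
        crc.update(buf, off, len);
        return (int) (crc.getValue() & 0xFFFF);
    }

    /** Builds an 11-byte main header for version 1.0 with FLAG_EXTRA clear. */
    static byte[] minimalMainHeader() {
        ByteBuffer b = ByteBuffer.allocate(11); // big-endian by default
        b.putInt(MAGIC);        // magic
        b.put((byte) 0);        // flags: HEAD_MAIN, no optional fields
        b.putShort((short) 1);  // major_version
        b.putShort((short) 0);  // minor_version
        int crc = crc16(b.array(), 0, b.position()); // covers the 9 bytes so far
        b.putShort((short) crc);
        return b.array();
    }
}
```

The same crc16 helper applies to the entry, directory, and end headers, since each defines its CRC the same way.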

2.3. Entry header and data

The main header is followed by zero or more JAR entries. Each JAR entry consists of an entry header, immediately followed by the entry data in the ZLIB compressed data format. The structure of an entry header is given below, followed by a description of each of the entry header fields:
entry_header {
    u1 flags;
    utf name;		(optional)
    u4 size;		(optional)
    u4 mtime;		(optional)
    extra_data edata;	(optional)
    u2 crc;
}
flags (flag byte)
The flag byte indicates the header type and optional field information, and is divided into individual bits as follows:
bit 0	1
bit 1	0
bit 2	FLAG_EXTRA
bit 3	FLAG_MTIME
bit 4	FLAG_SIZE
bit 5	FLAG_NAME
bits 6-7	reserved

name (entry name)
If FLAG_NAME is set, then the field name specifies the name of the entry, represented as a Java UTF string.

size (entry data size)
If FLAG_SIZE is set, then the size field specifies the total size of the uncompressed entry data in bytes.

mtime (modification time)
If FLAG_MTIME is set, then the field mtime specifies the modification time of the entry, expressed as the number of seconds since the epoch.

edata (extra field data)
If FLAG_EXTRA is set, then optional extra field data is present as described above.

crc (header crc)
The field crc specifies the 16-bit CRC of the entry header contents. The CRC-16 consists of the two least significant bytes of the CRC-32 for all bytes of the entry header up to but not including the CRC-16 field itself.
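
The entry header layout above can be read as sketched below. The sketch reads the optional fields in structure order, computes the checksum with java.util.zip.CheckedInputStream as the bytes are consumed, and compares it with the stored CRC-16. The class EntryHeader, its field defaults, and the use of java.util.zip.CRC32 are assumptions made for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

// Hypothetical reader for one entry_header.
final class EntryHeader {
    static final int FLAG_EXTRA = 1 << 2;
    static final int FLAG_MTIME = 1 << 3;
    static final int FLAG_SIZE = 1 << 4;
    static final int FLAG_NAME = 1 << 5;

    String name;      // null when FLAG_NAME is clear
    long size = -1;   // -1 when FLAG_SIZE is clear
    long mtime = -1;  // -1 when FLAG_MTIME is clear
    byte[] extra;     // raw extra field entries, null when FLAG_EXTRA is clear

    static EntryHeader read(InputStream raw) throws IOException {
        // Every byte read through 'checked' is folded into the CRC-32.
        CheckedInputStream checked = new CheckedInputStream(raw, new CRC32());
        DataInputStream in = new DataInputStream(checked);
        int flags = in.readUnsignedByte();
        if ((flags & 0x03) != 1) {
            throw new IOException("not an entry header");
        }
        EntryHeader h = new EntryHeader();
        if ((flags & FLAG_NAME) != 0) h.name = in.readUTF();
        if ((flags & FLAG_SIZE) != 0) h.size = in.readInt() & 0xFFFFFFFFL;
        if ((flags & FLAG_MTIME) != 0) h.mtime = in.readInt() & 0xFFFFFFFFL;
        if ((flags & FLAG_EXTRA) != 0) {
            h.extra = new byte[in.readUnsignedShort()];
            in.readFully(h.extra);
        }
        // Capture the checksum before the crc field itself is read.
        int expected = (int) (checked.getChecksum().getValue() & 0xFFFF);
        int stored = in.readUnsignedShort();
        if (stored != expected) {
            throw new IOException("entry header CRC mismatch");
        }
        return h;
    }
}
```

After read returns, the stream is positioned at the first byte of the entry's ZLIB data.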

2.4. Directory header and data

The last JAR entry can be followed by an optional directory section that can be used for random access to JAR entries. The optional directory consists of a directory header immediately followed by the directory contents stored in the ZLIB compressed data format. A JAR file can contain only one directory and it must immediately precede the end header. The structure of the directory header is given below, followed by a description of each of the directory header fields:
dir_header {
    u1 flags;
    u4 count;
    extra_data edata;	(optional)
    u2 crc;
}
flags (flag byte)
The flag byte indicates the header type and optional field information, and is divided into individual bits as follows:
bit 0	0
bit 1	1
bit 2	FLAG_EXTRA
bits 3-7	reserved

count (entry count)
The field count indicates the total number of entries in the directory, and must be the same as the total number of JAR file entries.

edata (extra data)
If FLAG_EXTRA is set, then extra field data is present as specified above.

crc (header crc)
The field crc specifies the 16-bit CRC of the directory header contents. The CRC-16 consists of the two least significant bytes of the CRC-32 for all bytes of the directory header up to but not including the CRC-16 field itself.
The directory data consists of count headers of the following format. The headers appear in the same order as the corresponding JAR entries:
dir_entry {
    utf name;
    u4 size;
    u4 mtime;
    u4 head_off;
    u4 data_off;
}
name (entry name)
The field name specifies the name of the entry, represented as a Java UTF string. An empty string indicates that the entry has no name.

size (entry data size)
The field size indicates the total number of bytes of uncompressed entry data.

mtime (modification time)
The field mtime indicates the modification time of the entry, or 0 if not specified.

head_off (entry header offset)
The field head_off is the offset in bytes of the entry header from the beginning of the JAR file.

data_off (entry data offset)
The field data_off is the offset in bytes of the entry data from the start of the JAR file.
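
A sketch of reading the directory data described above. It assumes the caller has already inflated the ZLIB data (for example through java.util.zip.InflaterInputStream) and knows count from the directory header; the class DirEntry is a hypothetical name.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for one dir_entry.
final class DirEntry {
    final String name;   // empty string when the entry has no name
    final long size;     // uncompressed size in bytes
    final long mtime;    // 0 when not specified
    final long headOff;  // offset of the entry header from the start of the file
    final long dataOff;  // offset of the entry data from the start of the file

    DirEntry(String name, long size, long mtime, long headOff, long dataOff) {
        this.name = name;
        this.size = size;
        this.mtime = mtime;
        this.headOff = headOff;
        this.dataOff = dataOff;
    }

    /** Reads 'count' directory entries from already-uncompressed data. */
    static List<DirEntry> readAll(DataInputStream in, long count) throws IOException {
        List<DirEntry> entries = new ArrayList<>();
        for (long i = 0; i < count; i++) {
            String name = in.readUTF();
            long size = in.readInt() & 0xFFFFFFFFL;
            long mtime = in.readInt() & 0xFFFFFFFFL;
            long headOff = in.readInt() & 0xFFFFFFFFL;
            long dataOff = in.readInt() & 0xFFFFFFFFL;
            entries.add(new DirEntry(name, size, mtime, headOff, dataOff));
        }
        return entries;
    }
}
```

With the entries in hand, a reader can seek directly to dataOff to inflate a single entry without scanning the preceding ones.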

2.5. End header

Every JAR file includes an end header which has the following structure and fields:
end_header {
    u1 flags;
    u4 dir_off;		(optional)
    u4 dir_size;	(optional)
    u4 mtime;		(optional)
    extra_data edata;	(optional)
    u4 end_off;
    u2 crc;
}
flags (flag byte)
The flag byte indicates the type of the header as well as optional field information, and has the following bits:
bit 0	1
bit 1	1
bit 2	FLAG_EXTRA
bit 3	FLAG_MTIME
bit 4	FLAG_DIR
bits 5-7	reserved

mtime (modification time)
If FLAG_MTIME is set, then the mtime field indicates the last modification time of the archive file, expressed as the number of seconds since the epoch.

dir_off (directory offset)
If FLAG_DIR is set, then an entry directory is present and the field dir_off indicates the offset in bytes of the directory header from the start of the JAR file.

dir_size (directory size)
If FLAG_DIR is set, then an entry directory is present and the field dir_size indicates the total size in bytes of the uncompressed directory data.

edata (extra field data)
If FLAG_EXTRA is set, then optional extra field data is present as described above.

end_off (end header offset)
The field end_off specifies the offset in bytes of the end header from the start of the JAR file, and is used to locate the optional directory from the end of the JAR file when random access to JAR file entries is required.

crc (header crc)
The field crc specifies the 16-bit CRC of the end header contents. The CRC-16 consists of the two least significant bytes of the CRC-32 for all bytes of the end header up to but not including the CRC-16 field itself.
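
One way to locate the directory from the end of the file can be sketched as follows. Because end_off and crc are the last two fields of the end header, the sketch assumes that the final 6 bytes of the file are end_off (4 bytes) followed by crc (2 bytes), with nothing after the end header. It works on an in-memory byte array for brevity and does not verify the CRC; the class EndLocator is a hypothetical name.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: finds the end header and directory from the file tail.
final class EndLocator {
    static final int FLAG_DIR = 1 << 4;

    /** Reads end_off from the last 6 bytes (end_off, then crc). */
    static long endHeaderOffset(byte[] jar) {
        if (jar.length < 7) {
            throw new IllegalArgumentException("too short to hold an end header");
        }
        ByteBuffer tail = ByteBuffer.wrap(jar, jar.length - 6, 6); // big-endian
        return tail.getInt() & 0xFFFFFFFFL;
    }

    /** Returns dir_off, or -1 when FLAG_DIR is clear (no directory present). */
    static long directoryOffset(byte[] jar) {
        long endOff = endHeaderOffset(jar);
        // The smallest end header is flags + end_off + crc = 7 bytes.
        if (endOff > jar.length - 7) {
            throw new IllegalArgumentException("end_off out of range");
        }
        ByteBuffer b = ByteBuffer.wrap(jar);
        b.position((int) endOff);
        int flags = b.get() & 0xFF;
        if ((flags & 0x03) != 3) {
            throw new IllegalArgumentException("not an end header");
        }
        if ((flags & FLAG_DIR) == 0) {
            return -1;
        }
        // dir_off is the first optional field after the flag byte; a truncated
        // header would surface here as a BufferUnderflowException.
        return b.getInt() & 0xFFFFFFFFL;
    }
}
```

A file-based reader could apply the same steps with java.io.RandomAccessFile, seeking to the file length minus 6 and then to end_off.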

2.6. Limits

The size of a JAR file, and hence any JAR file entry, is limited to 2^32 bytes. Additionally, the size of extra field data is limited to 64K bytes.

3. References

[1] Deutsch, L.P., "ZLIB Compressed Data Format Specification", available at http://quest.jpl.nasa.gov/zlib/rfc-zlib.html

[2] Deutsch, L.P., "DEFLATE Compressed Data Format Specification", available at http://quest.jpl.nasa.gov/zlib/rfc-deflate.html


Copyright © 1996 Sun Microsystems, Inc., 2550 Garcia Ave., Mtn. View, CA 94043-1100 USA. All rights reserved.