Mod files for CS4 (TES4 Construction Set) are essentially collections of records, which are further divided into fields. Records generally correspond to objects in the construction set (e.g., a creature, a GMST setting, a dialog entry), with the fine details of the object (e.g., health of a creature, a dialog entry test) being handled by the fields of the record. Records themselves are organized into groups by GRUP records. At the highest grouping level, the TES4 file is simply:
A single TES4 record
A collection of top groups.
While CS4 seems to have some flexibility in the ordering and structure of records and groups in the files it reads, it also clearly likes to write files in a more specific ordering, which is described below. If your application is writing a mod file, it is suggested that you follow this preferred format if possible.
GRUPs were introduced in TES4 to make files faster to scan: a reader can skip whole blocks of records it isn't interested in. In addition, the subgroups for WRLD and CELL provide useful structural information (e.g., the division of cell data into persistent and non-persistent references). The group header is comparable to a record header (see below), but with many of the fields repurposed.
Size of the entire group, including the group header (20 bytes).
This is in contrast to records and fields, whose sizes do not include their header sizes.
uint8[4]
Label. Format depends on group type (see next field).
In the CS4 Details view, you can mark a group as ignored, but CS4 disregards this setting and reads the group anyway. The ignore flag does alter what should ordinarily be in the label field (e.g., "HAIR" becomes "HQIR"), although this mislabeling has no effect on record loading. If you subsequently save, the group is written without the ignore markings. In short, the label field of a group is not reliable.
uint32
Group type
Type | Info                          | Label type | Label contents
0    | Top (Type)                    | char[4]    | Record type
1    | World Children                | formid     | Parent
2    | Interior Cell Block           | long       | Block number
3    | Interior Cell Sub-Block       | long       | Sub-block number
4    | Exterior Cell Block           | short[2]   | Grid Y, X (note the reversed order)
5    | Exterior Cell Sub-Block       | short[2]   | Grid Y, X (note the reversed order)
6    | Cell Children                 | formid     | Parent
7    | Topic Children                | formid     | Parent
8    | Cell Persistent Children      | formid     | Parent
9    | Cell Temporary Children       | formid     | Parent
10   | Cell Visible Distant Children | formid     | Parent
uint32
Version Control Info
The low word is a timestamp or 0. The low byte of that word is the day of the month and the high byte is nominally the number of months since December 2002. (1 = January 2003, 13 = January 2004, etc.) However, the algorithm only adds 12 months per year for years not ending in 1, 2, or 3. In other words, it only supports years from 2003-2010 and becomes unreliable for anything outside that range.
Dates are based on when the Construction Set was launched, not the date the file was saved. Typically, dates will be the same throughout an entire file, but if version control is in use, different groups may have different dates.
If version control is enabled, the high word marks ownership; otherwise, it is 0x0000. Within that word, the low byte is the user ID that last had the form checked out and the high byte is the user ID that currently has it checked out. A value of 0 means no one currently has the object checked out.
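The group header fields described above can be decoded along the following lines. This is a minimal sketch rather than a reference implementation. It assumes little-endian byte order and that the first 8 of the 20 header bytes hold a "GRUP" tag followed by the 32-bit group size; neither detail is spelled out in this section. The function names (`decode_label`, `decode_stamp`, `parse_group_header`) are hypothetical, and the date conversion uses only the nominal "months since December 2002" reading, which the text above notes is unreliable outside a limited range of years.

```python
import struct

# Assumed layout (little-endian): "GRUP" tag, group size, label, group type, stamp.
GROUP_HEADER = struct.Struct("<4sI4sII")  # 20 bytes in total


def decode_label(raw, group_type):
    """Interpret the 4 label bytes according to the group type."""
    if group_type == 0:                    # top group: record type
        return raw.decode("ascii", errors="replace")
    if group_type in (2, 3):               # interior cell block / sub-block number
        return struct.unpack("<i", raw)[0]
    if group_type in (4, 5):               # exterior grid, stored as Y then X
        y, x = struct.unpack("<hh", raw)
        return {"x": x, "y": y}
    # Types 1 and 6-10 (and anything unrecognized): treat as parent form ID.
    return struct.unpack("<I", raw)[0]


def decode_stamp(stamp):
    """Split the version control word into a nominal date and owner bytes."""
    low, high = stamp & 0xFFFF, stamp >> 16
    date = None                            # a low word of 0 means no timestamp
    if low:
        months = low >> 8                  # nominally months since December 2002
        date = (2003 + (months - 1) // 12, (months - 1) % 12 + 1, low & 0xFF)
    return {"date": date, "last_user": high & 0xFF, "current_user": high >> 8}


def parse_group_header(buf, offset=0):
    """Decode one 20-byte group header starting at the given offset."""
    tag, size, label, group_type, stamp = GROUP_HEADER.unpack_from(buf, offset)
    if tag != b"GRUP":
        raise ValueError("not a group header")
    info = {"size": size, "type": group_type,
            "label": decode_label(label, group_type)}
    info.update(decode_stamp(stamp))
    return info
```

Because the label of an ignored group may be altered, a reader would probably do better to treat the decoded top-group label as a hint than as a guarantee.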
The game expects certain groups to appear before others, and can crash if it encounters references to records that haven't been loaded yet.
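To inspect the order in which a file's top groups actually appear, a reader can hop from one group header to the next using the group size, which includes the 20-byte header. The sketch below assumes little-endian byte order, a "GRUP" tag followed by a 32-bit size at the start of each header, and that the caller already knows the offset of the first top group (that is, the position just past the TES4 record). `iter_top_groups` is a hypothetical helper name.

```python
import struct


def iter_top_groups(buf, offset):
    """Yield (offset, label, size) for each top-level group from offset onward.

    Assumes each group starts with a 20-byte header laid out as:
    "GRUP" tag, uint32 size (header included), 4-byte label,
    uint32 group type, uint32 version control info.
    """
    while offset + 20 <= len(buf):
        tag, size, label, _group_type, _stamp = struct.unpack_from(
            "<4sI4sII", buf, offset)
        if tag != b"GRUP" or size < 20:
            raise ValueError("unexpected data at offset %d" % offset)
        yield offset, label.decode("ascii", "replace"), size
        # The size covers the header too, so adding it lands on the next group.
        offset += size
```

Nothing inside a group is parsed here; the contents are skipped entirely, which is the kind of scanning that groups were introduced to support.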
All top groups contain records matching their label (e.g., the GMST top group contains GMST records). For most top groups, only the matching record types are present. However, in the CELL, WRLD and DIAL top groups, each main record can be followed by one or more child groups which contain additional records of a different type. Structure and ordering of those are as follows...
Form ID: Unique record identifier within the file.
The TES4 record always has a Form ID of 0.
Some GMST records have Form IDs of 0.
uint32
Version Control Info (see the same entry in Groups)
Only ESM files have version control information on a per-record basis; for ESPs, it's typically per-group, although there seem to be occasional exceptions. For most records, it will be set to 0.
uint8[dataSize]
Data
For uncompressed records, this is a sequence of fields.
Compressed data is the same, except that the fields are compressed using ZLIB level 6, and stored into the data field like so...
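As a rough illustration, compressed record data can be expanded with a standard zlib library. The exact layout is not reproduced in this section, so the sketch below assumes a 4-byte little-endian uncompressed size followed by the zlib stream; treat that layout, and the helper name `decompress_record_data`, as assumptions to verify against real files.

```python
import struct
import zlib


def decompress_record_data(data):
    """Return the uncompressed field bytes of a compressed record.

    Assumed layout: uint32 uncompressed size, then a zlib stream.
    """
    (expected_size,) = struct.unpack_from("<I", data, 0)
    raw = zlib.decompress(data[4:])
    if len(raw) != expected_size:
        raise ValueError("size mismatch: expected %d bytes, got %d"
                         % (expected_size, len(raw)))
    return raw
```

The result is the same sequence of fields that an uncompressed record would carry directly in its data.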
If the previous field has the type XXXX, then the dataSize of the current field will be 0 and the actual size of its data is the 32-bit quantity stored in the XXXX field. This happens once in Oblivion.esm.
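A field reader has to carry the XXXX size forward to the next field. The sketch below shows one way to do that. It assumes a 6-byte field header (4-byte type followed by a little-endian uint16 dataSize), which is not stated in this section, and `iter_fields` is a hypothetical name.

```python
import struct


def iter_fields(data):
    """Yield (type, payload) for each field in uncompressed record data.

    Assumed field header: 4-byte type, then uint16 dataSize (little-endian).
    An XXXX field is not yielded; its 32-bit value replaces the dataSize
    of the field that follows it.
    """
    offset = 0
    override = None
    while offset + 6 <= len(data):
        field_type, size = struct.unpack_from("<4sH", data, offset)
        offset += 6
        if field_type == b"XXXX":
            (override,) = struct.unpack_from("<I", data, offset)
            offset += size
            continue
        if override is not None:
            # The header's own dataSize is 0 here; use the XXXX value instead.
            size, override = override, None
        yield field_type.decode("ascii", "replace"), data[offset:offset + size]
        offset += size
```

Hiding the XXXX field from the caller keeps the rest of a parser unaware of the special case, though a tool that must rewrite files byte for byte may prefer to expose it.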