flow-record
Library for the creation of DFIR timelines, to be used by rdump
Record flow format
The record format is based on MsgPack. A record stream is a sequence of tuples, each consisting of a 4-byte size field and a msgpack encoded record pack (see below).
The very first of those tuples is a special case: it serves as a header.
┌────────[record size] bytes───────┐
/ \
┌──────────────────────────────┬────────────────────────────────────┐
│record size (32bit big endian)│ msgpack encoded content │
└──────────────────────────────┴────────────────────────────────────┘
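The framing above can be sketched with the standard library alone. This is a minimal sketch, not the crate's actual API: the function names `write_frame` and `read_frame` are hypothetical, and it assumes that the size field counts only the msgpack encoded content, not the 4 size bytes themselves.

```rust
use std::io::{self, Read, Write};

/// Writes one stream element: a 4-byte big-endian size, then the payload.
/// Assumption: the size covers the payload only.
fn write_frame<W: Write>(mut out: W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "payload does not fit into a 32 bit size field",
        ));
    }
    out.write_all(&(payload.len() as u32).to_be_bytes())?;
    out.write_all(payload)
}

/// Reads one stream element and returns its (still msgpack encoded) payload.
fn read_frame<R: Read>(mut input: R) -> io::Result<Vec<u8>> {
    let mut size = [0u8; 4];
    input.read_exact(&mut size)?;
    let mut payload = vec![0u8; u32::from_be_bytes(size) as usize];
    input.read_exact(&mut payload)?;
    Ok(payload)
}
```

Both functions work on any `Write` or `Read` implementation, so the same code can be used with a `Vec<u8>`, a byte slice or a file.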
Header
The header is formed by the serialized version of the string `RECORDSTREAM\n`, encoded as `bin8`:
┌───────────────msgpack type: bin8
│
│ ┌──────────length: 13 bytes
│ │
│ │ ┌─────13 bytes of data
▼ ▼ ▼
┌────┬────┬──────────────┐
│0xc4│0x0d│RECORDSTREAM\n│
└────┴────┴──────────────┘
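As an illustration, the 15 header bytes can be built and checked like this. The helper names `stream_header` and `is_stream_header` are made up for this sketch and are not taken from the crate.

```rust
/// Builds the msgpack bin8 encoding of the magic string.
fn stream_header() -> Vec<u8> {
    const MAGIC: &[u8] = b"RECORDSTREAM\n"; // 13 bytes
    let mut header = vec![0xc4, MAGIC.len() as u8]; // bin8 marker, then length
    header.extend_from_slice(MAGIC);
    header
}

/// Returns true if `bytes` is exactly the expected header content.
fn is_stream_header(bytes: &[u8]) -> bool {
    bytes == stream_header().as_slice()
}
```

A reader would typically perform this check on the content of the first tuple before it accepts any descriptor or record.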
In the following description I omit the fact that every distinct record and every descriptor must be preceded by its length.
Record packs
All data in the record format are specified as a record pack, which is simply a tuple (a fixarray of length 2) consisting of a record pack type and the record pack data.
┌──────────────────────────── msgpack type ext8/ext16/ext32
│ ┌─────────────────────── length of content
│ │ ┌────────────────── type id must be 0x0e
│ │ │ ┌──────────── array of length 2
│ │ │ │ ┌─────── record pack type
▼ ▼ ▼ │ │ ┌── record pack data
┌────┬────┬────┬───┼────┼────┼──────────────────────────
│ │ │ │ ▼ ▼ ▼
│ │ │ │┌────┬────┬──────────────────
│0xc7│ │0x0e││0x92│ │
│ │ │ │└────┴────┴──────────────────
│ │ │ │
└────┴────┴────┴────────────────────────────────────────
The following record pack types are known:
Object ID | Raw value | Description |
---|---|---|
RecordPackTypeRecord | 0x1 | |
RecordPackTypeDescriptor | 0x2 | a record descriptor |
RecordPackTypeFieldtype | 0x3 | |
RecordPackTypeDatetime | 0x10 | |
RecordPackTypeVarint | 0x11 | |
RecordPackTypeGroupedrecord | 0x12 | |
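The ext wrapping shown in the diagram can be sketched as follows. The function name `wrap_ext` and the constant name are hypothetical. The marker bytes `0xc7`, `0xc8` and `0xc9` are the msgpack markers for ext8, ext16 and ext32, and the sketch simply picks the smallest one that fits the content.

```rust
/// msgpack extension type id used for record packs.
const RECORD_PACK_EXT_TYPE: u8 = 0x0e;

/// Wraps already encoded record pack bytes into ext8, ext16 or ext32.
fn wrap_ext(content: &[u8]) -> Vec<u8> {
    let len = content.len();
    let mut out = Vec::with_capacity(len + 6);
    if len <= u8::MAX as usize {
        out.push(0xc7); // ext8: 1 length byte
        out.push(len as u8);
    } else if len <= u16::MAX as usize {
        out.push(0xc8); // ext16: 2 length bytes, big endian
        out.extend_from_slice(&(len as u16).to_be_bytes());
    } else {
        assert!(len <= u32::MAX as usize, "content too large for ext32");
        out.push(0xc9); // ext32: 4 length bytes, big endian
        out.extend_from_slice(&(len as u32).to_be_bytes());
    }
    out.push(RECORD_PACK_EXT_TYPE);
    out.extend_from_slice(content);
    out
}
```

The content passed in is expected to be the encoded fixarray of length 2, starting with `0x92`.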
Descriptor
Every record must have a type, which must be declared with a record descriptor before the record appears. A record descriptor is a record pack of type `RecordPackTypeDescriptor`, which is wrapped as a msgpack `ext8`, `ext16` or `ext32` (depending on its size). The msgpack type id is `0x0e`.
Consider the following type:
struct test_csv_test1 {
field11: String,
field12: String,
field13: String
}
which will have the following msgpack encoding:
raw value | explanation |
---|---|
0xc7 | This is an ext8 record |
0x43 | length of the contained data |
0x0e | marker for rdump that this contains an object |
The record pack itself will be the msgpack encoded equivalent of the following data:
[
2,
[
"test_csv_test1",
[
[
"string",
"field11"
],
[
"string",
"field12"
],
[
"string",
"field13"
]
]
]
]
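To make the byte layout concrete, the descriptor above can be encoded by hand. This is only a sketch under simplifying assumptions: it handles short strings (fixstr), small arrays (fixarray) and ext8 only, and the helper names are invented for the example. A real implementation would normally rely on a msgpack library.

```rust
/// Appends a msgpack fixstr (strings shorter than 32 bytes only).
fn push_fixstr(out: &mut Vec<u8>, s: &str) {
    assert!(s.len() < 32, "this sketch only handles fixstr");
    out.push(0xa0 | s.len() as u8);
    out.extend_from_slice(s.as_bytes());
}

/// Appends a msgpack fixarray header (fewer than 16 elements only).
fn push_fixarray(out: &mut Vec<u8>, len: usize) {
    assert!(len < 16, "this sketch only handles fixarray");
    out.push(0x90 | len as u8);
}

/// Encodes `[2, [name, [[datatype, field], ...]]]` and wraps it as ext8.
fn encode_descriptor(name: &str, fields: &[(&str, &str)]) -> Vec<u8> {
    let mut pack = Vec::new();
    push_fixarray(&mut pack, 2);
    pack.push(0x02); // RecordPackTypeDescriptor as positive fixint
    push_fixarray(&mut pack, 2);
    push_fixstr(&mut pack, name);
    push_fixarray(&mut pack, fields.len());
    for (datatype, field) in fields {
        push_fixarray(&mut pack, 2);
        push_fixstr(&mut pack, datatype);
        push_fixstr(&mut pack, field);
    }
    assert!(pack.len() <= 255, "this sketch only handles ext8");
    let mut out = vec![0xc7, pack.len() as u8, 0x0e];
    out.extend_from_slice(&pack);
    out
}
```

Called with `"test_csv_test1"` and the three `("string", "field1x")` pairs, this produces 67 bytes of content, which matches the length byte `0x43` in the table above.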
It is important to note that every field is encoded as a tuple where the first entry is the datatype, and the second is the field name. The following datatypes are supported:
Datatype | Mapped from Rust type | Explanation |
---|---|---|
boolean | `bool` | |
command | | |
dynamic | | |
datetime | `DateTime<TZ: TimeZone>` | UNIX timestamp, encoded as an integer in msgpack |
filesize | `flow-record-common::Filesize` | |
uint16 | `u8`, `u16` | |
uint32 | `u32`, `u64` | |
float | `f32`, `f64` | |
string | `String` | |
stringlist | | |
dictlist | | |
unix_file_mode | `String` | the chmod formatted string will be converted to `u16` internally |
varint | `i8`, `i16`, `i32`, `i64` | |
wstring | | |
net.ipv4.Address | | |
net.ipv4.Subnet | | |
net.tcp.Port | | |
net.udp.Port | | |
uri | | |
digest | | |
bytes | `Vec<u8>` | |
record | | |
net.ipaddress | | |
net.ipnetwork | | |
path | `flow_record_common::types::Path` | |
Identifier
A record descriptor is identified by
- the name of the record type
- a hash, which equals the first 32 bits of a SHA256 hash of the record type name and the names and types (in that order) of the record fields. For example, the above struct would have the following input for the hash function: `test_csv_test1field11stringfield12stringfield13string`, which would result in the hash `12a9d8d90aa34e5068dbf6692b82baf6fff0143eeaa84d7b2a9c92021f7747c2`. Here we take the first 4 bytes `12a9d8d9`, interpret them as a big-endian integer (`313120985`) and use this as the hash.
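The two steps around the SHA256 computation can be sketched with the standard library. The function names are hypothetical. Computing the digest itself is not shown, because it needs an external crate; the sketch starts from the hex digest quoted above.

```rust
/// Builds the hash input: the type name, then name and type of each field.
fn hash_input(name: &str, fields: &[(&str, &str)]) -> String {
    let mut input = String::from(name);
    for (field, datatype) in fields {
        input.push_str(field);
        input.push_str(datatype);
    }
    input
}

/// Interprets the first 4 bytes (8 hex digits) of a SHA256 hex digest
/// as a big-endian u32. Returns None if the digest is too short or
/// does not start with hex digits.
fn descriptor_hash_from_hex(digest_hex: &str) -> Option<u32> {
    let prefix = digest_hex.get(..8)?;
    u32::from_str_radix(prefix, 16).ok()
}
```

Parsing the first 8 hex digits as one base 16 number is equivalent to reading the first 4 digest bytes in big-endian order.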
Every subsequent record refers to its record descriptor by the descriptor's name and hash.
Record data
A record contains a reference to the record descriptor and a list of values, in the same order as the fields are specified in the descriptor.
┌───────────────────────────────────────────────────────────── record pack type 1
│ ┌─────────────────────────────────────── name of the descriptor
│ │ ┌─────────────────── hash of the descriptor
│ │ │ ┌─────────── array of values
▼ │ │ │ ┌─────── value of the first data field
┌────┬────┬───────────────────┼───────────────────┼───────┼───┼───────────────────────────
│ │ │ │ │ │ │
│ │ │┌────┬─────────────┼───────────────────┼───┬───┼───┼───────────────────────────
│ │ ││ │ ▼ ▼ │ ▼ ▼
│ │ ││ │┌────┬────┬─────────────┬────┬──────┐│┌────┬─────────┬─────────┬─────────
│0x92│0x01││0x92││0x92│0xa?│<struct name>│0xce│<hash>│││0x9?│<field 1>│<field 2>│...
│ │ ││ │└────┴────┴─────────────┴────┴──────┘│└────┴─────────┴─────────┴─────────
│ │ ││ │ │
│ │ │└────┴─────────────────────────────────────┴───────────────────────────────────
│ │ │
└────┴────┴───────────────────────────────────────────────────────────────────────────────
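The layout in the diagram can be reproduced by hand for a record whose fields are all short strings. This is a sketch under that assumption: the function name `encode_record` is invented, other field types are not covered, and the ext wrapping and the preceding size field are left out, as in the diagram.

```rust
/// Encodes `[1, [[name, hash], [values...]]]` for string-only records.
fn encode_record(name: &str, hash: u32, values: &[&str]) -> Vec<u8> {
    assert!(name.len() < 32, "this sketch only handles fixstr names");
    assert!(values.len() < 16, "this sketch only handles fixarray");
    let mut out = vec![0x92, 0x01]; // fixarray(2), RecordPackTypeRecord
    out.push(0x92); // [descriptor reference, values]
    out.push(0x92); // [name, hash]
    out.push(0xa0 | name.len() as u8);
    out.extend_from_slice(name.as_bytes());
    out.push(0xce); // msgpack uint32 marker
    out.extend_from_slice(&hash.to_be_bytes());
    out.push(0x90 | values.len() as u8);
    for value in values {
        assert!(value.len() < 32, "this sketch only handles fixstr values");
        out.push(0xa0 | value.len() as u8);
        out.extend_from_slice(value.as_bytes());
    }
    out
}
```

For the example descriptor, `encode_record("test_csv_test1", 313120985, &["a", "b", "c"])` starts with `0x92 0x01 0x92 0x92 0xae`, followed by the name, `0xce`, the four hash bytes and the value array.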
License: GPL-3.0