audis

Audis is an implementation of a multi-index audit log, built atop the Redis key-value store.

An audit log consists of zero or more Event objects, each indexed against one or more subjects. The correspondence between Redis databases and audit logs is 1:1 -- each audit log inhabits precisely one Redis instance, and an instance houses only a single audit log.
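
For reference, the Event type used in the examples below carries a globally unique ID, an opaque data payload, and the subjects the event should be indexed under. A rough sketch of its shape (field names and types inferred from the usage examples, not copied from the crate's source):

pub struct Event {
    pub id: String,            // globally unique event ID
    pub data: String,          // opaque payload, usually JSON
    pub subjects: Vec<String>, // subjects this event is indexed under
}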

Example: Logging Events

The simplest way to use audis is to point it at a Redis instance and start logging to it:

extern crate audis;

fn main() {
    let client = audis::Client::connect("redis://127.0.0.1:6379").unwrap();

    client.log(&audis::Event{
        id: "foo1".to_string(),
        data: "{\"some\":\"data\"}".to_string(),
        subjects: vec![
            "system".to_string(),
            "user:42".to_string(),
        ],
    }).unwrap();

    // ... etc ...
}

Retrieving The Audit Log

What good is an audit log if you can't ever review it?

extern crate audis;

fn main() {
    let client = audis::Client::connect("redis://127.0.0.1:6379").unwrap();

    for subject in &client.subjects().unwrap() {
        println!("## {} ######################", subject);
        for event in &client.retrieve(subject).unwrap() {
            println!("  {}", event.data);
        }
        println!("");
    }
}

Distributed Audit Logging via Threads

A common pattern with audis is to delegate a single thread to the task of shunting event logs into Redis, allowing other threads to focus on their work, without being slowed down by momentary hiccups in the auditing layer.

You can do this via the background() function, which returns a buffered channel you can send events to, and the join handle of the executing background thread:

extern crate audis;

fn main() {
    let client = audis::Client::connect("redis://127.0.0.1:6379").unwrap();

    // buffer 50 events
    let (tx, thread) = client.background(50).unwrap();

    tx.send(audis::Event{
        id: "foo1".to_string(),
        data: "{\"some\":\"data\"}".to_string(),
        subjects: vec![
            "system".to_string(),
            "user:42".to_string(),
        ],
    }).unwrap();

    // ... etc ...

    thread.join().unwrap();
}

Implementation Details

Audis uses four types of objects: A) the events themselves, B) reference counts for inserted events, C) per-subject lists of events, sorted in insertion order, and D) a master list of all known subjects.

Each audit event is stored as an opaque blob, usually JSON -- for the purposes of this library, the exact contents of an event are immaterial. Each event gets its own Redis key, derived from its globally unique ID, in the form audit:$id. This makes retrieving events an O(1) operation, using the Redis GET command.

Accompanying each event object is a reference count, kept under a parallel keying structure that appends :ref to the main event object key. These reference counts are integers that track how many different subjects still reference the given event. For example, audit:ae2:ref is the reference count key for audit:ae2.

Each subject in the audit log maintains its own list of event IDs that are relevant to it. These lists are stored under keys derived from the subject itself. Callers are strongly urged to ensure that subject names are as unique as they need to be for analysis.

Finally, a single Redis Set, called subjects, exists to track the complete set of known subject strings. This facilitates discovery of the different subsets of the audit log.
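
As an illustration (a hypothetical keyspace, not output from a real instance), logging a single event with ID foo1 against the subjects system and user:42 would leave the following keys behind:

    audit:foo1        the opaque event blob
    audit:foo1:ref    the reference count (2, one per subject)
    system            a list containing "foo1"
    user:42           a list containing "foo1"
    subjects          a set containing "system" and "user:42"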

Here is some pseudocode for the insertion logic of the LOG(e) operation, where e is an object:

LOG(e):
    var id = $e[id]
    SETEX "audit:$id" $e[data]
    for s in $e[subjects]:
        SADD "subjects" "$s"
        RPUSH "$s" "$id"
        INCR "audit:$id:ref"

Technically speaking, LOG(e) runs in O(n), linear in the number of subjects that the audit log event applies to. However, given that this n is usually very small (almost always < 100), LOG(e) performs well in practice.
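
For readers who prefer Rust to pseudocode, here is a minimal sketch of the same insertion logic written directly against the redis crate. This is an illustration of the scheme described above, not the crate's actual implementation; the function name and signature are made up for the example.

extern crate redis;

use redis::Commands;

fn log(con: &mut redis::Connection, id: &str, data: &str, subjects: &[String]) -> redis::RedisResult<()> {
    // store the opaque event blob under audit:$id
    let _: () = con.set(format!("audit:{}", id), data)?;

    for s in subjects {
        // register the subject in the master set of known subjects
        let _: () = con.sadd("subjects", s)?;
        // append the event ID to the subject's insertion-ordered list
        let _: () = con.rpush(s, id)?;
        // bump the event's reference count
        let _: () = con.incr(format!("audit:{}:ref", id), 1)?;
    }
    Ok(())
}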

RETR(s) is straightforward: iterate over the subject list in Redis via LRANGE and then GET the referenced event objects:

RETR(s):
    var log = []
    for id in LRANGE "$s" 0 -1:
        $log.append( GET "audit:$id" )
    return $log
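
The same caveat applies here: a minimal Rust sketch of RETR(s) against the redis crate, for illustration only.

use redis::Commands;

fn retrieve(con: &mut redis::Connection, subject: &str) -> redis::RedisResult<Vec<String>> {
    // all event IDs recorded for this subject, oldest first
    let ids: Vec<String> = con.lrange(subject, 0, -1)?;

    let mut log = Vec::with_capacity(ids.len());
    for id in ids {
        // fetch the opaque blob stored under audit:$id; an event pruned via
        // another subject may already be gone, so tolerate a missing key
        let data: Option<String> = con.get(format!("audit:{}", id))?;
        if let Some(d) = data {
            log.push(d);
        }
    }
    Ok(log)
}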

Since LOG(e) only ever adds to our audit log dataset, and RETR(s) is a read-only operation, our Redis footprint will forever grow, unless we define operations to clear out old log entries. Deleting parts of our audit log seems wrong and counter-productive -- the whole point is to know what happened! Storage (especially memory), however, isn't unlimited, and for debugging purposes (at least), audit log events become less relevant over time.

For these reasons, we define two pruning operations: TRUNC(s,n), for truncating a subject eventset to the most recent n audit events, and PURGE(s,last), for deleting events from a subject eventset until a given ID is found (that ID is removed as well).

These two operations allow us to define a cleanup routine, outside of the audis library, which can do things like render and persist audit events to a log file on a filesystem or in an external blobstore (e.g. S3), before ultimately deleting them from Redis.

Here is the pseudo-code for TRUNC(s,n):

TRUNC(s,n):
    var end = 0 - n - 1
    for id in LRANGE "$s" 0 $end:
        LPOP "$s"
        DECR "audit:$id:ref"
        if GET "audit:$id:ref" <= 0:
            DEL "audit:$id:ref"
            DEL "audit:$id"

As events are truncated from the subject's index, the associated reference counts are checked to determine if any larger cleanup (via DEL) needs to be performed.
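
A minimal Rust sketch of this truncation logic, again written against the redis crate for illustration (the function name and signature are invented here); it collapses the repeated LPOPs into a single LTRIM, which has the same effect on the subject list:

use redis::Commands;

fn truncate(con: &mut redis::Connection, subject: &str, keep: isize) -> redis::RedisResult<()> {
    // assumes keep > 0: everything except the most recent `keep` entries
    let doomed: Vec<String> = con.lrange(subject, 0, -keep - 1)?;

    for id in &doomed {
        // drop this subject's claim on the event
        let refs: i64 = con.decr(format!("audit:{}:ref", id), 1)?;
        if refs <= 0 {
            // no subject references the event any more; reclaim it
            let _: () = con.del(format!("audit:{}:ref", id))?;
            let _: () = con.del(format!("audit:{}", id))?;
        }
    }

    // trim the subject list down to its most recent `keep` entries
    let _: () = con.ltrim(subject, -keep, -1)?;
    Ok(())
}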

PURGE(s,last) is similar:

PURGE(s,last):
    for id in LRANGE "$s" 0 -1:
        LPOP "$s"
        DECR "audit:$id:ref"
        if GET "audit:$id:ref" <= 0:
            DEL "audit:$id:ref"
            DEL "audit:$id"
        if $id == $last:
            break

Both of these operations suffer from race conditions when run concurrently with each other, or with additional invocations of themselves. A future version of this library will correct this through the judicious use of LOCK()/UNLOCK() primitives implemented inside the same Redis database.
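
One common way to build such primitives is an expiring lock key acquired with Redis' SET ... NX PX. The sketch below shows that pattern against the redis crate purely for illustration; the key name, timeout, and function are assumptions, not part of audis:

fn try_lock(con: &mut redis::Connection, token: &str) -> redis::RedisResult<bool> {
    // SET audis:lock <token> NX PX 5000: succeeds only if the key is absent,
    // and expires automatically so a crashed holder cannot wedge the log
    let reply: redis::Value = redis::cmd("SET")
        .arg("audis:lock")
        .arg(token)
        .arg("NX")
        .arg("PX")
        .arg(5000)
        .query(con)?;
    // Redis replies +OK on success and nil when the lock is already held
    Ok(reply != redis::Value::Nil)
}

// Unlocking would delete audis:lock only if it still holds our token,
// typically via a small Lua script so the check-and-delete stays atomic.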
