
bin+lib hoardbase

Hoardbase is a single-file embedded database based on SQLite with an API similar to that of MongoDB

1 unstable release

0.1.0-alpha Dec 20, 2021


MIT/Apache

135KB
2.5K SLoC

Hoardbase

Hoardbase is SQLite disguised as a NoSQL database, with an API similar to that of MongoDB. I have often needed a single-file embedded NoSQL solution and couldn't find one. For my use cases, a good choice should meet the following requirements:

  1. It needs to be NoSQL. This is convenient when data is dirty, which is common in data ETL use cases. Another benefit of NoSQL is that backward compatibility of data takes less effort to implement. Even when a schema can eventually be defined and a SQL database is desired, prototyping with NoSQL is easier.
  2. The database has to be embeddable for easy deployment. In many use cases, for example a standalone desktop app, end users might not have the skills to set up and maintain a database server.
  3. The database must be contained in a single file. This helps preserve data integrity and makes data migration and backup easier for untrained users.
  4. There should be cross-language support (at least C/C++, Python, Rust and Node.js).

I feel that an embeddable NoSQL database is a very common building block that lacks good choices. The closest one, in my opinion, is ejdb2. However, that project is inactive and its code is hard to read. But what about this project? SQLite is a solid foundation and has been battle-tested. I try to keep my wrapper layer simple and its internals well documented so that it stays easy to fix.

Usage

Hoardbase tries to provide a programming interface similar to that of MongoDB. If you are already familiar with MongoDB, using Hoardbase should be very simple.

Internals

The key mechanism for storing and querying JSON data using SQLite is serializing JSON documents into the blob type. Currently BSON is used as the serialization format. Another interesting format is Amazon Ion. I may add support for Ion in the future when its Rust binding matures.
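The blob-storage idea above can be sketched with Python's standard sqlite3 module. This is only an illustration, not Hoardbase's actual code: the table name people, the column names _id and raw, and the encode/decode helpers are all hypothetical, and UTF-8 JSON bytes stand in for BSON because the Python standard library has no BSON encoder.

```python
import json
import sqlite3


def encode(doc):
    # Stand-in for BSON encoding (assumption): UTF-8 encoded JSON bytes.
    return json.dumps(doc, sort_keys=True).encode("utf-8")


def decode(blob):
    # Inverse of encode(): turn the stored bytes back into a Python dict.
    return json.loads(bytes(blob).decode("utf-8"))


conn = sqlite3.connect(":memory:")

# One table per collection; each document lives in a single BLOB column.
conn.execute("CREATE TABLE people (_id INTEGER PRIMARY KEY, raw BLOB NOT NULL)")

# Documents do not have to share a schema.
docs = [
    {"name": {"id": 3, "first": "Ada"}},
    {"name": {"id": 7}, "tags": ["x"]},
]
conn.executemany("INSERT INTO people (raw) VALUES (?)", [(encode(d),) for d in docs])

# Read the blobs back and deserialize them.
restored = [decode(row[0]) for row in conn.execute("SELECT raw FROM people ORDER BY _id")]
```

Because SQLite only sees opaque bytes here, documents with different shapes can live in the same table, which is what makes the NoSQL-style interface possible on top of a relational engine.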

Indexing and searching are implemented using SQLite's application-defined functions. Basically, we can define custom functions that operate on the blob type to extract a JSON field or patch a blob. As long as those custom functions are deterministic, they can be used for indexing and searching. For example, we can define a function bson_field(path, blob) that extracts a BSON field from the blob. If we invoke this function with WHERE bson_field('name.id', blob) = 3 against a collection, we will find all documents whose name.id equals 3. We can also create indices on BSON fields using this function. For further reference, these are some good links:

how to query json within a database

sqlite json support
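The application-defined function approach described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not Hoardbase's implementation: the function name bson_field comes from the text, but its body here is hypothetical and parses JSON bytes as a stand-in for BSON; the table people, the column raw, and the index name idx_name_id are also made up for the example. It needs Python 3.8+ (for the deterministic flag) and an SQLite build that supports indexes on expressions.

```python
import json
import sqlite3


def bson_field(path, blob):
    """Return the value at a dotted path inside the document, or None if absent."""
    node = json.loads(blob)  # assumption: JSON bytes stand in for BSON
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    # SQLite can only receive scalars; re-serialize nested values as text.
    if node is None or isinstance(node, (int, float, str)):
        return node
    return json.dumps(node)


conn = sqlite3.connect(":memory:")

# deterministic=True is what allows the function to be used in an index.
conn.create_function("bson_field", 2, bson_field, deterministic=True)

conn.execute("CREATE TABLE people (_id INTEGER PRIMARY KEY, raw BLOB NOT NULL)")
docs = [
    {"name": {"id": 3, "first": "Ada"}},
    {"name": {"id": 7, "first": "Bob"}},
    {"note": "no name field at all"},
]
conn.executemany(
    "INSERT INTO people (raw) VALUES (?)",
    [(json.dumps(d).encode("utf-8"),) for d in docs],
)

# Index on the extracted field, built by calling the custom function per row.
conn.execute("CREATE INDEX idx_name_id ON people (bson_field('name.id', raw))")

# Query by a field inside the blob, in the style of the WHERE clause above.
matches = [
    row[0]
    for row in conn.execute(
        "SELECT _id FROM people WHERE bson_field('name.id', raw) = 3"
    )
]
```

Returning None for a missing path maps to SQL NULL, so documents without the field simply never match an equality test. One consequence of this design is that the function must be registered on every connection before the database is written to, since SQLite calls it to keep the index up to date.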
