1 unstable release: 0.1.0-alpha (Dec 20, 2021)
Hoardbase
Hoardbase is SQLite disguised as a NoSQL database, with an API similar to MongoDB's. Many times I have needed a single-file embedded NoSQL solution and could not find one. For my use cases, a good choice should meet the following requirements:
- It needs to be NoSQL. This is convenient when data is dirty, which is common in data ETL use cases. Another benefit of NoSQL is that backward compatibility of data takes less effort to implement. Even when a schema can eventually be defined and a SQL database is desired, prototyping with NoSQL is easier.
- The database has to be embeddable for easy deployment. In many use cases, for example a standalone desktop app, end users might not have the skills to set up and maintain a database server.
- The database must be contained in a single file. This helps preserve data integrity and makes data migration and backup easier for untrained users.
- There should be cross-language support (at least C/C++, Python, Rust and Node.js).
I feel that an embeddable NoSQL database is a very common building block that lacks good choices. The closest one, in my opinion, is ejdb2. However, that project is inactive and its code is hard to read. So why this project? SQLite is a solid foundation and has been battle-tested. I try to keep my wrapper layer simple and its internals well documented, so that problems can be fixed easily.
Usage
Hoardbase tries to provide a programming interface similar to MongoDB's. If you are already familiar with MongoDB, using Hoardbase should be straightforward.
Internals
The key mechanism for storing and querying JSON data with SQLite is serializing JSON documents into the blob type. Currently, BSON is used as the serialization format. Another interesting format is Amazon Ion. I may add support for Ion in the future, when its Rust binding matures.
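The storage idea can be sketched with Python's standard sqlite3 module. This is only an illustration under stated assumptions, not Hoardbase's actual schema or code: UTF-8 encoded JSON stands in for BSON so that the example needs no third-party package, and the table name `people`, the column name `raw`, and the helpers `encode` and `decode` are hypothetical.

```python
import json
import sqlite3


def encode(doc: dict) -> bytes:
    # Stand-in for BSON serialization: JSON text encoded as UTF-8 bytes.
    return json.dumps(doc, sort_keys=True).encode("utf-8")


def decode(blob: bytes) -> dict:
    # Inverse of encode(): bytes back to a Python dict.
    return json.loads(blob.decode("utf-8"))


conn = sqlite3.connect(":memory:")
# Assumed layout: one table per collection, one serialized document per row.
conn.execute("CREATE TABLE people (_id INTEGER PRIMARY KEY, raw BLOB NOT NULL)")

doc = {"name": {"id": 3, "first": "Ada"}, "tags": ["a", "b"]}
conn.execute("INSERT INTO people (raw) VALUES (?)", (encode(doc),))

# The blob comes back as bytes and deserializes to the original document.
(raw,) = conn.execute("SELECT raw FROM people").fetchone()
restored = decode(raw)
```

Swapping the serialization format (BSON today, possibly Ion later) would only touch the two helper functions in a layout like this; the table itself just sees opaque bytes.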
Indexing and searching are implemented using SQLite's application-defined functions. Basically, we can define custom functions that operate on the blob type to extract a JSON field or to patch a blob. As long as those custom functions are deterministic, they can be used for indexing and searching. For example, we can define a function bson_field(path, blob) that extracts a BSON field from the blob. If we invoke this function with WHERE bson_field('name.id', blob) = 3 against a collection, we will find all documents whose name.id equals 3. We can also create indices on BSON fields using this function.
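A minimal sketch of this mechanism, again using Python's standard sqlite3 module rather than Hoardbase itself. The function `doc_field` is a hypothetical stand-in for bson_field that reads UTF-8 JSON blobs instead of BSON, and the table, column, and index names are assumptions made for the example. It needs a Python and SQLite build that supports deterministic functions and indexes on expressions.

```python
import json
import sqlite3


def doc_field(path, blob):
    """Return the value at a dotted path inside a serialized document, or None."""
    value = json.loads(blob)
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None  # a missing path becomes SQL NULL
        value = value[key]
    # SQLite functions return scalars, so nested values are returned as JSON text.
    if isinstance(value, (dict, list)):
        return json.dumps(value, sort_keys=True)
    return value


def encode(doc):
    # Stand-in for BSON serialization.
    return json.dumps(doc, sort_keys=True).encode("utf-8")


conn = sqlite3.connect(":memory:")
# deterministic=True is what allows the function to appear in an index expression.
conn.create_function("doc_field", 2, doc_field, deterministic=True)

conn.execute("CREATE TABLE people (_id INTEGER PRIMARY KEY, raw BLOB NOT NULL)")
docs = [
    {"name": {"id": 3, "first": "Ada"}},
    {"name": {"id": 7, "first": "Bob"}},
    {"name": {"id": 3, "first": "Cy"}},
    {"other": 1},
]
conn.executemany("INSERT INTO people (raw) VALUES (?)", [(encode(d),) for d in docs])

# Index on the extracted field; the query must use the same expression to benefit.
conn.execute("CREATE INDEX people_name_id ON people (doc_field('name.id', raw))")

rows = conn.execute(
    "SELECT _id FROM people WHERE doc_field('name.id', raw) = 3 ORDER BY _id"
).fetchall()
matches = [row[0] for row in rows]

# The query plan can be inspected to see whether the index is picked up.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT _id FROM people WHERE doc_field('name.id', raw) = 3"
).fetchall()
```

Documents that lack the path produce NULL, so they never match an equality test. Every connection that touches the indexed table has to register the function first, which is one reason to hide this behind a wrapper layer.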