#parquet #local-storage #datafusion #database-table #apache-arrow #tsdb

bin+lib tsdb_timon

Efficient local storage and Amazon S3-compatible data synchronization for time-series data, leveraging Parquet for storage and DataFusion for querying, all wrapped in a simple and intuitive API

3 stable releases

new 1.0.85 Jan 20, 2025
1.0.8 Jan 17, 2025
1.0.7 Oct 10, 2024
1.0.0 Oct 5, 2024

Apache-2.0

120KB
2.5K SLoC

Timon File & S3-Compatible Storage API

This API provides a set of functions for managing databases and tables in both local file storage and S3-compatible storage. It supports creating databases and tables, inserting data, querying using SQL, and more.

Table of Contents

Mobile-Based APIs

  1. File Storage Functions
  2. S3-Compatible Storage Functions
  3. Function Descriptions

Utility CLI

  1. Get The Latest Utility Build
  2. How To Run The Utility

File Storage Functions

These functions manage databases and tables stored locally on the file system. Data can be inserted, queried, and organized using SQL-like operations.

// Initialize Timon with a local storage path
external fun initTimon(storagePath: String, bucketInterval: Number): String

// Create a new database
external fun createDatabase(dbName: String): String

// Create a new table within a specific database
external fun createTable(dbName: String, tableName: String): String

// List all available databases
external fun listDatabases(): String

// List all tables within a specific database
external fun listTables(dbName: String): String

// Delete a specific database
external fun deleteDatabase(dbName: String): String

// Delete a specific table within a database
external fun deleteTable(dbName: String, tableName: String): String

// Insert data into a table in JSON format
external fun insert(dbName: String, tableName: String, jsonData: String): String

// Query a database with an SQL query
external fun query(dbName: String, sqlQuery: String): String

S3-Compatible Storage Functions

These functions manage data stored in an S3-compatible bucket, allowing for querying and saving daily data as Parquet files.

// Initialize S3-compatible storage with endpoint and credentials
external fun initBucket(bucket_endpoint: String, bucket_name: String, access_key_id: String, secret_access_key: String, bucket_region: String): String

// Query the bucket with a date range and SQL query
external fun queryBucket(userName: String, sqlQuery: String, dateRange: Map<String, String>): String

// Sync daily data to the bucket as Parquet files
external fun cloudSyncParquet(userName: String, dbName: String, tableName: String): String

Function Descriptions

  • initTimon(storagePath: String, bucketInterval: Number) Initializes the local file storage at the specified path.

  • createDatabase(dbName: String) Creates a new database with the specified name.

  • createTable(dbName: String, tableName: String) Creates a new table in the specified database.

  • listDatabases() Lists all databases in the local storage.

  • listTables(dbName: String) Lists all tables in the specified database.

  • deleteDatabase(dbName: String) Deletes the specified database.

  • deleteTable(dbName: String, tableName: String) Deletes the specified table from the given database.

  • insert(dbName: String, tableName: String, jsonData: String) Inserts JSON-formatted data into the specified table.

  • query(dbName: String, sqlQuery: String) Executes an SQL query on the specified database.

  • initBucket(bucket_endpoint: String, bucket_name: String, access_key_id: String, secret_access_key: String, bucket_region: String) Initializes an S3-compatible bucket for data storage.

  • queryBucket(userName: String, sqlQuery: String, dateRange: Map<String, String>) Queries data in the S3 bucket based on the given date range and SQL query.

  • cloudSyncParquet(userName: String, dbName: String, tableName: String) Uploads data from the specified database and table to the S3-compatible bucket as Parquet files, organized by day.

Get The Latest Utility Build

Build the Binary

Run the following command to build the utility with the necessary features:

cargo build --features dev_cli --release

Cross-Compile the Binary

Rust provides tools to cross-compile your code for different platforms. This involves building the binary for a platform different from your current one.

Example for Windows:

On Linux or macOS, you can compile for Windows (the GNU target needs a MinGW-w64 toolchain installed for linking):

rustup target add x86_64-pc-windows-gnu
cargo build --features dev_cli --release --target x86_64-pc-windows-gnu

Example for macOS:

On Linux, you can compile for macOS (linking also requires a macOS SDK and a compatible linker, for example via osxcross):

rustup target add x86_64-apple-darwin
cargo build --features dev_cli --release --target x86_64-apple-darwin

Build Natively on Each Platform

If cross-compilation is not feasible, you can build the binary on each target platform natively. This ensures compatibility.

On macOS:

cargo build --release

On Windows:

cargo build --release

Use Cross (Simplified Cross-Compiling)

The cross tool simplifies cross-compiling by providing pre-configured Docker containers for various targets. It automatically handles dependencies and toolchains.

Install cross, then build for each target:

cargo install cross
cross build --release --target x86_64-pc-windows-gnu
cross build --release --target x86_64-apple-darwin

Consider Using Rust's MUSL for Static Linking (Linux Only)

If you are targeting Linux systems where the required shared libraries may not be available, you can build a statically linked binary using MUSL:

rustup target add x86_64-unknown-linux-musl
cargo build --release --target x86_64-unknown-linux-musl

This produces a binary that works on most Linux distributions.

Summary

  • Use cross-compilation to build for other platforms without a native environment.
  • Use cross for easier cross-compilation.
  • If you have access to all platforms, build natively on each.
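
Whichever build route you take, it helps to know where the binary ends up. By default, Cargo writes a native release build to target/release/ and a build made with --target to target/<triple>/release/. The helper below is a small sketch of that rule. The function name artifact_path is hypothetical, and the binary name tsdb_timon is an assumption based on the usage examples in this document.

```shell
#!/bin/sh
# Print where Cargo places the release binary for a given target triple.
# Assumes the default target directory (CARGO_TARGET_DIR not overridden).
# BIN_NAME is an assumption taken from the ./tsdb_timon usage examples.
BIN_NAME="tsdb_timon"

# artifact_path is a hypothetical helper, not part of the utility.
artifact_path() {
    triple="${1:-}"
    if [ -z "$triple" ]; then
        # Native build: no target triple in the path.
        dir="target/release"
    else
        # Build with --target <triple>.
        dir="target/$triple/release"
    fi
    case "$triple" in
        # Windows targets get an .exe suffix (a native build on
        # Windows does too, which this sketch does not detect).
        *windows*) echo "$dir/$BIN_NAME.exe" ;;
        *)         echo "$dir/$BIN_NAME" ;;
    esac
}

artifact_path
artifact_path x86_64-pc-windows-gnu
artifact_path x86_64-unknown-linux-musl
```

Copy the file at the printed path to the machine where you want to run the utility.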

How To Run The Utility

Available Commands

1. Convert JSON to Parquet

To convert a JSON file to a Parquet file, use the following command:

./tsdb_timon convert <json_file_path> <parquet_file_path>

Example:

./tsdb_timon convert test_input.json test_output.parquet

2. Execute SQL Query on Parquet

Run an SQL query against the Parquet file:

./tsdb_timon query <parquet_file_path> "<sql_query>"

Example:

./tsdb_timon query test_output.parquet "SELECT * FROM timon"

Notes:

  • The table name is always set to timon. Ensure all SQL queries reference the timon table explicitly.
  • Replace <json_file_path>, <parquet_file_path>, and <sql_query> with your respective input file paths and query.
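
The two commands can be chained into one small script. The sketch below writes a sample JSON file, converts it, and queries the result. The file names, the record fields, and the JSON layout (an array of flat objects) are assumptions made for illustration, since the input format the utility expects is not specified here. The TIMON_BIN variable is also hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical sample data; the record layout is an assumption,
# not a documented input format.
cat > sample_input.json <<'EOF'
[
  {"timestamp": "2025-01-20T10:00:00Z", "temperature": 21.5},
  {"timestamp": "2025-01-20T10:05:00Z", "temperature": 21.7}
]
EOF

# Path to the built utility; override with TIMON_BIN=/path/to/binary.
TIMON_BIN="${TIMON_BIN:-./tsdb_timon}"

if [ -x "$TIMON_BIN" ]; then
    # Step 1: convert the JSON file to Parquet.
    "$TIMON_BIN" convert sample_input.json sample_output.parquet
    # Step 2: query the Parquet file; the table name is always timon.
    "$TIMON_BIN" query sample_output.parquet "SELECT * FROM timon LIMIT 5"
else
    echo "tsdb_timon binary not found at $TIMON_BIN; build it first" >&2
fi
```

If the conversion rejects the sample file, adjust the JSON layout to match the data you normally insert.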
