
datafusion-remote-table

A DataFusion table provider for executing SQL queries on remote databases

17 releases (11 breaking)

Uses new Rust 2024

new 0.12.0 Apr 11, 2025
0.10.0 Apr 1, 2025
0.9.0 Mar 31, 2025


MIT license



Features

  1. Execute SQL queries on remote databases and stream the results through a DataFusion table provider
  2. Push filters and limits down to the remote database
  3. Serialize the execution plan for distributed execution
  4. Transform record batches before they are passed to the next plan node
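
To make the second feature concrete, the following is a minimal sketch of what pushing filters and a limit down to the remote database amounts to: the conditions are folded into the SQL text that the remote side executes, so fewer rows cross the network. The `push_down` helper and the `remote_subquery` alias are hypothetical names chosen for illustration. This is not the crate's implementation, and the generated syntax assumes a database that accepts `LIMIT` and aliased subqueries (for example Postgres, MySQL, or SQLite).

```rust
/// Hypothetical helper (not part of datafusion-remote-table): wraps the
/// user's base query in a subquery and appends the pushed-down filters
/// and limit, so the remote database does the filtering itself.
fn push_down(base_sql: &str, filters: &[&str], limit: Option<usize>) -> String {
    // Wrapping in a subquery keeps the base query intact even if it
    // already has its own WHERE or ORDER BY clause.
    let mut sql = format!("SELECT * FROM ({base_sql}) AS remote_subquery");

    // All pushed-down filters must hold, so they are joined with AND.
    if !filters.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&filters.join(" AND "));
    }

    // A limit, if present, caps the number of rows sent back.
    if let Some(n) = limit {
        sql.push_str(&format!(" LIMIT {n}"));
    }

    sql
}
```

Calling `push_down("select * from supported_data_types", &["id > 10"], Some(5))` yields a single statement the remote database can evaluate on its own. A real provider would also have to decide which filters are safe to translate and keep the rest for local evaluation.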

Usage

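use std::sync::Arc;

use datafusion::prelude::SessionContext;
// Assumption: these types are re-exported at the crate root.
use datafusion_remote_table::{ConnectionOptions, PostgresConnectionOptions, RemoteTable};

#[tokio::main]
pub async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connection settings for the remote Postgres server.
    let options = ConnectionOptions::Postgres(PostgresConnectionOptions::new(
        "localhost",
        5432,
        "user",
        "password",
    ));
    // The SQL text is executed on the remote database; its result set backs the table.
    let remote_table = RemoteTable::try_new(options, "select * from supported_data_types").await?;

    // Register the remote table so it can be queried like any other DataFusion table.
    let ctx = SessionContext::new();
    ctx.register_table("remote_table", Arc::new(remote_table))?;

    ctx.sql("select * from remote_table").await?.show().await?;

    Ok(())
}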

Supported databases

  • Postgres
    • Int2 / Int4 / Int8
    • Float4 / Float8 / Numeric
    • Char / Varchar / Text / Bpchar / Bytea
    • Date / Time / Timestamp / Timestamptz / Interval
    • Bool / Oid / Name / Json / Jsonb / Geometry (PostGIS)
    • Int2[] / Int4[] / Int8[]
    • Float4[] / Float8[]
    • Char[] / Varchar[] / Bpchar[] / Text[] / Bytea[]
  • MySQL
    • TinyInt (Unsigned) / SmallInt (Unsigned) / MediumInt (Unsigned) / Int (Unsigned) / BigInt (Unsigned)
    • Float / Double / Decimal
    • Date / DateTime / Time / Timestamp / Year
    • Char / Varchar / Binary / Varbinary
    • TinyText / Text / MediumText / LongText
    • TinyBlob / Blob / MediumBlob / LongBlob
    • Json / Geometry
  • Oracle
    • Number / BinaryFloat / BinaryDouble / Float
    • Varchar2 / NVarchar2 / Char / NChar / Long / Clob / NClob
    • Raw / Long Raw / Blob
    • Date / Timestamp
    • Boolean
  • SQLite
    • Null / Integer / Real / Text / Blob

Thanks
