#hdfs #hadoop #api #binding #safe #libhdfs #safely

fs-hdfs3

libhdfs binding library and safe Rust APIs

4 releases

0.1.12 Sep 7, 2023
0.1.11 Apr 10, 2023
0.1.10 Oct 26, 2022
0.1.9 Oct 21, 2022

#568 in Filesystem


1,395 downloads per month
Used in 4 crates (2 directly)

Apache-2.0

240KB
6K SLoC

C 4.5K SLoC // 0.1% comments
Rust 1.5K SLoC // 0.1% comments

fs-hdfs3

fs-hdfs3 is based on version 0.0.4 of hdfs-rs (http://hyunsik.github.io/hdfs-rs). It provides the libhdfs binding library together with safe Rust APIs that wrap the raw libhdfs bindings.

Current Status

  • All libhdfs FFI APIs are ported.
  • Safe Rust wrapper APIs cover most of the libhdfs APIs, except those related to zero-copy reads.
  • Compared to hdfs-rs, the lifetime parameter on HdfsFs has been removed, which makes the crate easier for downstream crates to depend on.

Documentation

Requirements

  • The C-related files come from branch 3.1.4 of the Hadoop repository, with a few changes applied for Rust usage.
  • There is no need to compile the Hadoop native library yourself; however, the Hadoop jar dependencies are still required.

Usage

Add this to your Cargo.toml:

[dependencies]
fs-hdfs3 = "0.1.12"

Build

$JAVA_HOME must be set so that the Java shared library can be found during the build.
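For example, on a Linux machine the build might look like this (the JDK path is illustrative; substitute your own installation):

```shell
# Illustrative JDK location -- adjust to your system.
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
cargo build
```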

Run

Since the compiled libhdfs is a JNI-based implementation, it requires the Hadoop-related classes to be available on the CLASSPATH. For example:

export CLASSPATH=$CLASSPATH:`hadoop classpath --glob`

We also need to specify the JVM dynamic library path so that the application can load the JVM shared library at runtime.

For jdk8 and macOS, it's

export DYLD_LIBRARY_PATH=$JAVA_HOME/jre/lib/server

For jdk11 (or later jdks) and macOS, it's

export DYLD_LIBRARY_PATH=$JAVA_HOME/lib/server

For jdk8 and CentOS, it's

export LD_LIBRARY_PATH=$JAVA_HOME/jre/lib/amd64/server

For jdk11 (or later jdks) and CentOS, it's

export LD_LIBRARY_PATH=$JAVA_HOME/lib/server
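The four cases above can be collapsed into a small shell sketch that picks the right variable for the current platform (this assumes a JDK 11+ layout; for jdk8, use the jre/lib paths shown above instead):

```shell
# Pick the JVM shared-library path for the current platform (JDK 11+ layout).
if [ "$(uname)" = "Darwin" ]; then
  export DYLD_LIBRARY_PATH="$JAVA_HOME/lib/server"
else
  export LD_LIBRARY_PATH="$JAVA_HOME/lib/server"
fi
```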

Testing

The tests also require CLASSPATH and DYLD_LIBRARY_PATH (or LD_LIBRARY_PATH) to be set. If the Java class org.junit.Assert cannot be found, refine the $CLASSPATH as follows:

export CLASSPATH=$CLASSPATH:`hadoop classpath --glob`:$HADOOP_HOME/share/hadoop/tools/lib/*

Here, $HADOOP_HOME needs to be specified and exported.

Then you can run

cargo test

Example

use std::sync::Arc;
use hdfs::hdfs::{get_hdfs_by_full_path, HdfsFs};

let fs: Arc<HdfsFs> = get_hdfs_by_full_path("hdfs://localhost:8020/").unwrap();
match fs.mkdir("/data") {
    Ok(_) => println!("/data has been created"),
    Err(_) => panic!("/data creation has failed"),
}
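Putting the pieces together, a complete invocation on Linux with JDK 11 might look like this. The paths are illustrative, and the example assumes a NameNode listening on hdfs://localhost:8020:

```shell
# Illustrative end-to-end environment setup before running the example above.
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64   # adjust to your JDK
export CLASSPATH=$CLASSPATH:`hadoop classpath --glob`
export LD_LIBRARY_PATH=$JAVA_HOME/lib/server
cargo run
```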

Dependencies

~1.4–4MB
~102K SLoC