|Version|Released|
|0.3.1|Aug 9, 2022|
|0.3.0|Mar 15, 2022|
|0.2.1|Mar 7, 2022|
|0.1.1|Apr 21, 2020|
Common Index File Format (CIFF)
What is CIFF?
The Common Index File Format (CIFF) is an inverted index exchange format defined as part of the Open-Source IR Replicability Challenge (OSIRRC) initiative. The primary idea is to allow indexes to be dumped from Lucene via Anserini and then ingested by other search engines. This repository contains the code necessary to read CIFF data into a format that PISA can use for building (and then searching) indexes.
We currently provide a Rust binary for converting CIFF data to a PISA canonical index, and for converting a PISA canonical index back to CIFF. This means PISA can generate indexes that can then be consumed by other systems that support CIFF (and vice versa).
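To make the term "inverted index" concrete, the following is a minimal sketch of the kind of data such an exchange format carries: for each term, a list of postings (document ID and term frequency), plus per-document lengths. The `Posting` and `InvertedIndex` types are hypothetical and exist only for illustration; they are not the API of this crate, nor the actual CIFF or PISA on-disk layout.

```rust
use std::collections::BTreeMap;

/// One posting: a document ID and the term's frequency in that document.
/// (Hypothetical type for illustration; not part of the `ciff` crate.)
#[derive(Debug, Clone, PartialEq, Eq)]
struct Posting {
    docid: u32,
    tf: u32,
}

/// Hypothetical in-memory inverted index: term -> postings ordered by docid.
#[derive(Debug, Default)]
struct InvertedIndex {
    postings: BTreeMap<String, Vec<Posting>>,
    /// Number of tokens in each document, indexed by docid.
    doc_lengths: Vec<u32>,
}

impl InvertedIndex {
    /// Indexes whitespace-separated text and returns the assigned docid.
    fn add_document(&mut self, text: &str) -> u32 {
        let docid = self.doc_lengths.len() as u32;
        let mut counts: BTreeMap<&str, u32> = BTreeMap::new();
        let mut length = 0;
        for token in text.split_whitespace() {
            *counts.entry(token).or_insert(0) += 1;
            length += 1;
        }
        // Docids are assigned in increasing order, so each list stays sorted.
        for (term, tf) in counts {
            self.postings
                .entry(term.to_string())
                .or_default()
                .push(Posting { docid, tf });
        }
        self.doc_lengths.push(length);
        docid
    }

    /// Returns the posting list for a term, or an empty slice if it is absent.
    fn postings(&self, term: &str) -> &[Posting] {
        self.postings.get(term).map(Vec::as_slice).unwrap_or(&[])
    }
}

fn main() {
    let mut index = InvertedIndex::default();
    index.add_document("to be or not to be");
    index.add_document("to do is to be");
    for (term, list) in &index.postings {
        println!("{term}: {list:?}");
    }
}
```

An exchange format such as CIFF serializes this kind of structure so that one system can write it and another can read it without re-indexing the original collection.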
Install from AUR
The package is available in the Arch User Repository. If you are on an Arch-based system, you can install it by running the following:
# Replace yay with the helper of your choice.
yay -S ciff-pisa
Install from crates.io
Note that the installation methods described below are not system-wide. For example, on Linux the tools usually end up in the $HOME/.cargo/bin directory. To use the tools from the command line, either call them by their absolute path or update your PATH variable to include that directory.
The library and the tools are also available on crates.io, so you can install the binaries locally by running:
cargo install ciff
Install from source
Run cargo build --release to build the binaries.
To convert a CIFF blob to a PISA canonical:
To convert a PISA canonical to a CIFF blob:
You can also install the binaries locally from the source checkout:
cargo install --path .
or if you are installing the same version again:
cargo install --path . --force
Use as Cargo dependency
If you are interested in using the library components in your own Rust library, you can simply define it as a dependency in your Cargo manifest:
[dependencies]
ciff = "0.1"
Library API documentation
The API documentation is available on docs.rs.