12 unstable releases (3 breaking)

0.4.0 Aug 8, 2023
0.3.1 Jul 25, 2023
0.3.0 May 24, 2023
0.2.4 May 15, 2023
0.1.3 May 13, 2022

#234 in Compression

195 downloads per month
Used in 2 crates

Apache-2.0

1MB
2K SLoC

oscar-io

Types and IO (Reader/Writer) for OSCAR Corpus processing and generation.

The crate provides basic abstractions around Corpus items, along with generic readers/writers usable on OSCAR Corpus files. Eventually, it should replace the reader implementations in both Ungoliant and oscar-tools.

Features

oscar-io aims to provide readers/writers for numerous types of OSCAR Corpora.

OSCAR v2

OSCAR v1.1

  • Reader
  • Writer
  • SplitReader (should be unified with Reader via split_size: Option<u64>)
  • SplitWriter (Same)
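
The unification noted for the split variants could look roughly like the sketch below, shown for the writer side. It is only an illustration under stated assumptions: `UnifiedWriter`, `write_record` and `into_parts` are hypothetical names that are not part of oscar-io, and the parts are kept in memory rather than written to files. The only element taken from the list above is the `split_size: Option<u64>` parameter, read here as "`None` means a single output, `Some(n)` means start a new part once the current one would exceed `n` bytes".

```rust
/// Hypothetical writer (not oscar-io's API): one type covers both the
/// plain and the split case, selected by `split_size`.
pub struct UnifiedWriter {
    /// `None`: everything goes into one part.
    /// `Some(n)`: a new part is opened once the current one would exceed `n` bytes.
    split_size: Option<u64>,
    /// In-memory stand-ins for output files.
    parts: Vec<Vec<u8>>,
}

impl UnifiedWriter {
    pub fn new(split_size: Option<u64>) -> Self {
        Self {
            split_size,
            parts: vec![Vec::new()],
        }
    }

    /// Appends one record followed by a newline, rotating to a new part first if needed.
    pub fn write_record(&mut self, record: &str) {
        // Bytes this record will occupy, including the trailing newline.
        let needed = record.len() as u64 + 1;
        let current_len = self.parts.last().map_or(0, |p| p.len() as u64);

        if let Some(limit) = self.split_size {
            // Rotate only when the current part is non-empty, so that a record
            // larger than the limit is still written (alone in its own part).
            if current_len > 0 && current_len + needed > limit {
                self.parts.push(Vec::new());
            }
        }

        let part = self
            .parts
            .last_mut()
            .expect("at least one part always exists");
        part.extend_from_slice(record.as_bytes());
        part.push(b'\n');
    }

    /// Consumes the writer and returns the parts that were produced.
    pub fn into_parts(self) -> Vec<Vec<u8>> {
        self.parts
    }
}
```

With this shape, `UnifiedWriter::new(None)` behaves like a plain writer and `UnifiedWriter::new(Some(10))` behaves like a split writer, so callers would not need two separate types. A reader could follow the same pattern, although what `split_size` should mean on the reading side is not specified here.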

OSCAR v1

  • Reader
  • Writer
  • SplitReader
  • SplitWriter

Dependencies

~13MB
~273K SLoC