-
regex-syntax
A regular expression parser
-
regex-automata
Automata construction and matching using regular expressions
-
aho-corasick
Fast multiple substring searching
-
regex
regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
-
idna
IDNA (Internationalizing Domain Names in Applications) and Punycode
-
unicode-normalization
functions for normalization of Unicode strings, including Canonical and Compatible Decomposition and Recomposition, as described in Unicode Standard Annex #15
-
percent-encoding
Percent encoding and decoding
-
unicode-bidi
Unicode Bidirectional Algorithm
-
unicode-width
Determine displayed width of char and str types according to Unicode Standard Annex #11 rules
-
textwrap
word wrapping, indenting, and dedenting strings. Has optional support for Unicode and emojis as well as machine hyphenation.
-
unicode-segmentation
Grapheme Cluster, Word and Sentence boundaries according to Unicode Standard Annex #29 rules
-
convert_case
Convert strings into any case
-
unicode-xid
Determine whether characters have the XID_Start or XID_Continue properties according to Unicode Standard Annex #31
-
matchers
Regex matching on character and byte streams
-
ident_case
applying case rules to Rust identifiers
-
bstr
A string type that is not required to be valid UTF-8
-
unicase
A case-insensitive wrapper around strings
-
encoding_rs
A Gecko-oriented implementation of the Encoding Standard
-
unindent
Remove a column of leading whitespace from a string
-
indoc
Indented document literals
-
diff
An LCS based slice and string diffing implementation
-
ucd-trie
A trie for storing Unicode codepoint sets and maps
-
fancy-regex
regexes, supporting a relatively rich set of features, including backreferences and look-around
-
difflib
Port of Python's difflib library to Rust
-
unicode_categories
Query Unicode category membership for chars
-
similar
A diff library for Rust
-
finl_unicode
handling Unicode functionality for finl (categories and grapheme segmentation)
-
ascii
ASCII-only equivalents to char, str and String
-
indenter
A formatter wrapper that indents the text, designed for error display impls
-
const_format
Compile-time string formatting
-
widestring
wide string Rust library for converting to and from wide strings, such as those often used in Windows API or other FFI libraries. Both u16 and u32 string types are provided, including support for UTF-16 and UTF-32…
-
pulldown-cmark
A pull parser for CommonMark
-
cesu8
Convert to and from CESU-8 encoding (similar to UTF-8)
-
Inflector
Adds String based inflections for Rust. Snake, kebab, camel, sentence, class, title and table cases as well as ordinalize, deordinalize, demodulize, foreign key, and pluralize/singularize…
-
regex-lite
A lightweight regex engine that optimizes for binary size and compilation time
-
deunicode
Convert Unicode strings to pure ASCII by intelligently transliterating them. Supports Emoji and Chinese.
-
utf-8
Incremental, zero-copy UTF-8 decoding with error handling
-
const_format_proc_macros
Implementation detail of the const_format crate
-
arrow-row
Arrow row format
-
uncased
Case-preserving, ASCII case-insensitive, no_std string types
-
ascii-canvas
canvas for drawing lines and styled text and emitting to the terminal
-
unicode-id
Determine whether characters have the ID_Start or ID_Continue properties according to Unicode Standard Annex #31
-
gix-utils
gitoxide utilities that don’t need feature toggles
-
unic-char-property
UNIC — Unicode Character Tools — Character Property taxonomy, contracts and build macros
-
slug
Convert a unicode string to a slug
-
shell-escape
Escape characters that may have a special meaning in a shell
-
compact_str
A memory efficient string type that transparently stores strings on the stack, when possible
-
tendril
Compact buffer/string type for zero-copy parsing
-
onig_sys
The onig_sys crate contains raw Rust bindings to the Oniguruma library. This crate exposes a set of unsafe functions which can then be used by other crates to create safe wrappers around Oniguruma…
-
onig
Rust-Onig is a set of Rust bindings for the Oniguruma regular expression library. Oniguruma is a modern regex library with support for multiple character encodings and regex syntaxes.
-
pulldown-cmark-to-cmark
Convert pulldown-cmark Events back to the string they were parsed from
-
diffy
Tools for finding and manipulating differences between files
-
strip-ansi-escapes
Strip ANSI escape sequences from byte streams
-
const-str
compile-time string operations
-
unicode-script
exposes the Unicode Script and Script_Extension properties from UAX #24
-
levenshtein_automata
Creates Levenshtein Automata in an efficient manner
-
kstring
Key String: optimized for map keys
-
regress
A regular expression engine targeting EcmaScript syntax
-
text-size
Newtypes for text offsets
-
tabled
An easy to use library for pretty-printing tables of Rust structs and enums
-
lazy-regex
lazy static regular expressions checked at compile time
-
encoding-index-tradchinese
Index tables for traditional Chinese character encodings
-
inflections
High performance inflection transformation library for changing properties of words like the case
-
encoding-index-singlebyte
Index tables for various single-byte character encodings
-
encoding-index-japanese
Index tables for Japanese character encodings
-
encoding-index-korean
Index tables for Korean character encodings
-
encoding-index-simpchinese
Index tables for simplified Chinese character encodings
-
indent_write
Write adapters to add line indentation
-
ascii_utils
handle ASCII characters
-
unicode-normalization-alignments
functions for normalization of Unicode strings, including Canonical and Compatible Decomposition and Recomposition, as described in Unicode Standard Annex #15
-
fuzzy-matcher
Fuzzy Matching Library
-
difference
text diffing and assertion library
-
newline-converter
Newline byte converter library
-
htmlescape
HTML entity encoding and decoding
-
roff
ROFF (man page format) generation library
-
rustybuzz
A complete harfbuzz shaping algorithm port to Rust
-
ucd-util
A small utility library for working with the Unicode character database
-
encoding
Character encoding support for Rust
-
mdbook
Creates a book from markdown files
-
unescape
Unescapes strings with escape sequences written out as literal characters
-
unescaper
Unescape strings with escape sequences written out as literal characters
-
tokenizers
today's most used tokenizers, with a focus on performances and versatility
-
unicode-security
Detect possible security problems with Unicode usage according to Unicode Technical Standard #39 rules
-
unicode-ccc
Unicode Canonical Combining Class detection
-
utf16_lit
macro_rules to make utf-16 literals
-
pretty
Wadler-style pretty-printing combinators in Rust
-
unic-ucd-ident
UNIC — Unicode Character Database — Identifier Properties
-
unicode-bidi-mirroring
Unicode Bidi Mirroring property detection
-
substring
method for string types
-
prettydiff
Side-by-side diff for two files
-
pulldown-cmark-escape
An escape library for HTML created in the pulldown-cmark project
-
yeslogic-fontconfig-sys
Raw bindings to Fontconfig without a vendored C library
-
cruet
Adds String based inflections for Rust. Snake, kebab, camel, sentence, class, title and table cases as well as ordinalize, deordinalize, demodulize, foreign key, and pluralize/singularize…
-
ammonia
HTML Sanitization
-
any_ascii
Unicode to ASCII transliteration
-
unidecode
pure ASCII transliterations of Unicode strings
-
grep-searcher
Fast line oriented regex searching as a library
-
pad
padding strings at runtime
-
utf8_iter
Iterator by char over potentially-invalid UTF-8 in &[u8]
-
byteyarn
hyper-compact strings
-
charset
Thunderbird-compatible character encoding decoding for email
-
tabwriter
Elastic tabstops
-
case
A set of letter case string helpers
-
glyph_brush_layout
Text layout for ab_glyph
-
font-types
Scalar types used in fonts
-
lexical-sort
Sort Unicode strings lexically
-
dwrote
Lightweight binding to DirectWrite
-
unicode-case-mapping
Fast lowercase, uppercase, and titlecase mapping for characters
-
utf16_iter
Iterator by char over potentially-invalid UTF-16 in &[u16]
-
punycode
Functions to decode and encode Punycode
-
linkify
Finds URLs and email addresses in plain text. Takes care to get the boundaries right with surrounding punctuation like parentheses.
-
str_indices
Count and convert between indexing schemes on string slices
-
unicode-vo
Unicode vertical orientation detection
-
utf16string
String types to work directly with UTF-16 encoded strings
-
write16
A UTF-16 analog of the Write trait
-
entities
raw data needed to convert to and from HTML entities
-
comrak
A 100% CommonMark-compatible GitHub Flavored Markdown parser and formatter
-
text_lines
Information about lines of text in a string
-
codepage
Mapping between Windows code page numbers and encoding_rs character encodings
-
termimad
Markdown Renderer for the Terminal
-
text_io
really simple to use panicking input functions
-
stfu8
Sorta Text Format in UTF-8
-
wezterm-bidi
The Unicode Bidi Algorithm (UBA)
-
unicode-reverse
Unicode-aware in-place string reversal
-
lopdf
PDF document manipulation
-
os_display
Display strings in a safe platform-appropriate way
-
pcre2
High level wrapper library for PCRE2
-
markdown-gen
generating Markdown files
-
ngrams
Generate n-grams from sequences
-
unicode_names2
Map characters to and from their name given in the Unicode standard. This goes to great lengths to be as efficient as possible in both time and space, with the full bidirectional tables weighing barely 500 KB…
-
grep
Fast line oriented regex searching as a library
-
str_inflector
Adds String based inflections for Rust. Snake, kebab, camel, sentence, class, title and table cases as well as ordinalize, deordinalize, demodulize, foreign key, and pluralize/singularize…
-
ropey
A fast and robust text rope for Rust
-
line-index
Maps flat TextSize offsets to/from (line, column) representation
-
unicode-general-category
Fast lookup of the Unicode General Category property for char
-
emojis
✨ Lookup emoji in *O(1)* time, access metadata and GitHub shortcodes, iterate over all emoji, and more!
-
byte_string
Wrapper types for outputting byte strings (b"Hello") using the Debug ({:?}) format
-
shell2batch
Converts simple basic shell scripts to Windows batch scripts
-
snailquote
Escape and unescape strings with shell-inspired quoting
-
ansi-to-tui
convert ansi color coded text into ratatui::text::Text type from ratatui library
-
swrite
Infallible alternatives to write! and writeln! for Strings
-
titlecase
Capitalize text according to a style defined by John Gruber for Daring Fireball
-
chardetng
A character encoding detector for legacy Web content
-
filecheck
writing tests for utilities that read text files and produce text output
-
const-str-proc-macro
compile-time string operations
-
jieba-rs
The Jieba Chinese Word Segmentation Implemented in Rust
-
uwl
A management stream for bytes and characters
-
sublime_fuzzy
Fuzzy matching algorithm based on Sublime Text's string search
-
sliceslice
A fast implementation of single-pattern substring search using SIMD acceleration
-
lindera-decompress
A morphological analysis library
-
man
Generate structured man pages
-
lindera-ipadic-builder
A Japanese morphological dictionary builder for IPADIC
-
lindera-dictionary
A Japanese morphological dictionary
-
lindera-unidic-builder
A Japanese morphological dictionary builder for UniDic
-
select
extract useful data from HTML documents, suitable for web scraping
-
lindera-ko-dic-builder
A Korean morphological dictionary builder for ko-dic
-
lindera-cc-cedict-builder
A Chinese morphological dictionary builder for CC-CEDICT
-
html2text
Render HTML as plain text
-
chardet
rust version of chardet
-
lindera-ipadic-neologd-builder
A Japanese morphological dictionary builder for IPADIC NEologd
-
sanitizer
A collection of methods and macros to sanitize struct fields
-
lowcharts
draw low-resolution graphs in terminal
-
cow-utils
Copy-on-write string utilities for Rust
-
unicode-truncate
Unicode-aware algorithm to pad or truncate str in terms of displayed width
-
unified-diff
GNU unified diff format
-
ucd-parse
parsing data files in the Unicode character database
-
stringmatch
Allow the use of regular expressions or strings wherever you need string comparison
-
pcre2-sys
Low level bindings to PCRE2
-
ansi-width
Calculate the width of a string when printed to the terminal
-
charabia
detect the language, tokenize the text and normalize the tokens
-
harfbuzz-sys
Rust bindings to the HarfBuzz text shaping engine
-
xlsxwriter
Write xlsx file with number, formula, string, formatting, autofilter, merged cells, data validation and more
-
grok
popular java & ruby grok library which allows easy text and log file processing with composable patterns
-
uuhelp_parser
A collection of functions to parse the markdown code of help files
-
flexstr
A flexible, simple to use, immutable, clone-efficient String replacement for Rust
-
etch
Not just a text formatter, don't mark it down, etch it
-
jetscii
A tiny library to efficiently search strings and byte slices for sets of ASCII characters or bytes
-
garde
Validation library
-
detone
Decompose Vietnamese tone marks
-
ripgrep
line-oriented search tool that recursively searches the current directory for a regex pattern while respecting gitignore rules. ripgrep has first class support on Windows, macOS and Linux.
-
lindera-core
A morphological analysis library
-
cedarwood
efficiently-updatable double-array trie in Rust (ported from cedar)
-
escape-bytes
Escapes bytes that are not printable ASCII characters
-
doccy
brace based markup language
-
lindera-tokenizer
A morphological analysis library
-
lindera-compress
A morphological analysis library
-
lindera-ko-dic
A Korean morphological dictionary for ko-dic
-
bk-tree
A Rust BK-tree implementation
-
glob-match
An extremely fast glob matcher
-
array_tool
Helper methods for processing collections
-
harfbuzz-traits
Rust Traits for the HarfBuzz text shaping engine
-
lexicmp
comparing and sorting strings lexicographically and naturally
-
slugify
Macro for flexible slug generation
-
lipsum
lorem ipsum text generation library. It generates pseudo-random Latin text. Use this if you need filler or dummy text for your application. The text is generated using a simple Markov chain…
-
cuid
An implementation of the CUID protocol in Rust
-
pretty-xmlish
Pretty print XML-ish data with unicode art
-
minify-html-common
Common code and data for minify-html*
-
hyphenation
Knuth-Liang hyphenation for a variety of languages
-
hyperscan
bindings for Rust with Multiple Pattern and Streaming Scan
-
wchar
Procedural macros for compile time UTF-16 and UTF-32 wide strings
-
rutie
The tie between Ruby and Rust
-
unicode-blocks
contains a list of all unicode blocks and provides some functions to search across them
-
suffix
arrays
-
ferris-says
flavored replacement for the classic cowsay
-
ucd
Extends the char type to provide access to most fields of the UCD, Unicode Character Database, as of version 9.0.0. It aims to be compact, fast, and use minimal dependencies (only rust's core crate)…
-
printpdf
writing PDF files
-
hyperscan-sys
Hyperscan bindings for Rust with Multiple Pattern and Streaming Scan
-
sd
An intuitive find & replace CLI
-
caseless
Unicode caseless matching
-
print-positions
providing string segmentation on grapheme clusters and ANSI escape sequences for accurate length arithmetic based on visible print positions
-
unicode-casing
Titlecase helper function on characters
-
fm
Non-backtracking fuzzy text matcher
-
console_static_text
Logging for text that should stay in the same place in a console
-
commonregex
Rust port for CommonRegex. Find all times, dates, links, phone numbers, emails, ip addresses, prices, hex colors, and credit card numbers in a string. We did the hard work so you don't have to.
-
nucleo-matcher
plug and play high performance fuzzy matcher
-
svgbobdoc
Renders ASCII diagrams in doc comments as SVG images
-
lindera-unidic
A Japanese morphological dictionary for UniDic
-
harfbuzz
Rust bindings to the HarfBuzz text shaping engine
-
lindera-ipadic
A Japanese morphological dictionary for IPADIC
-
gh-emoji
Convert :emoji: to Unicode using GitHub’s emoji names
-
nu-utils
Nushell utility functions
-
text-splitter
Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from Rust and Python.
-
synoptic
low-level, syntax highlighting library with unicode support
-
pager
pipe your output through an external pager
-
text-diff
text diffing and assertion library
-
hyphenation_commons
Proemial code for the hyphenation library
-
bwrap
A fast, lightweight, embedded systems-friendly library for wrapping text
-
target_info
Get text strings of attributes concerning the build target
-
precis-tools
Tools and parsers to generate PRECIS tables from the Unicode Character Database (UCD)
-
precis-profiles
PRECIS Framework: Preparation, Enforcement, and Comparison of Internationalized Strings Representing Usernames and Passwords as defined in rfc8265; and Nicknames as defined in rfc8266
-
ascii_tree
generates ascii trees
-
terminal-supports-emoji
Check whether the current terminal supports emoji
-
unic-ucd-age
UNIC — Unicode Character Database — Age
-
svgbob
Transform your ascii diagrams into happy little SVG
-
imperative
Check for imperative mood in text
-
lexis
Generates human-readable sequences from numeric values using a predefined word list
-
simple-logging
logger for the log facade
-
genpdf
User-friendly PDF generator written in pure Rust
-
prop-check-rs
A Property-based testing Library in Rust
-
encoding_c_mem
C API for encoding_rs::mem
-
tracing-texray
Tracing layer to view a plaintext timeline of spans and events
-
pdf-extract
extract content from pdfs
-
ra_ap_test_utils
TBD
-
mdbook-mermaid
mdbook preprocessor to add mermaid support
-
typos-dict
Source Code Spelling Correction
-
utf8-cstr
Type wrappers promising null termination and UTF-8 validity. The intersection of std::ffi::CStr and str
-
pluralizer
Rust package to pluralize or singularize any word based on a count inspired on pluralize NPM package
-
mdbook-linkcheck
A backend for mdbook which will check your links for you
-
textdistance
Lots of algorithms to compare how similar two sequences are
-
textnonce
Text based random nonce generator
-
fax
Decoder and Encoder for CCITT Group 3 and 4 bi-level image encodings used by fax machines, TIFF and PDF
-
mdbook-svgbob
SvgBob mdbook preprocessor which swaps code-blocks with neat SVG
-
textwrap-macros
procedural macros to use textwrap utilities at compile time
-
mdbook-preprocessor-boilerplate
Boilerplate code for mdbook preprocessors
-
qp-trie
An idiomatic and fast QP-trie implementation in pure Rust, written with an emphasis on safety
-
mdbook-pandoc
A mdbook backend that outsources most of the rendering process to pandoc
-
ra_ap_ide_ssr
Structural search and replace of Rust code
-
unic-bidi
UNIC — Unicode Bidirectional Algorithm
-
evcxr
An Evaluation Context for Rust
-
regex-cursor
regex fork that can search discontiguous haystacks
-
lingua-english-language-model
The English language model for Lingua, an accurate natural language detection library
-
lindera-cc-cedict
A Chinese morphological dictionary for CC-CEDICT
-
scanlex
lexical scanner for parsing text into tokens
-
typos-cli
Source Code Spelling Correction
-
adobe-cmap-parser
parse Adobe CMap files
-
lindera
A morphological analysis library
-
hypher
separates words into syllables
-
regex_mutator
The Nautilus regex_mutator
-
easy_reader
easily navigating forward, backward or randomly through the lines of huge files
-
trigram
Trigram-based string similarity for fuzzy matching
-
rustyline-async
A minimal readline with multiline and async support
-
lingua-german-language-model
The German language model for Lingua, an accurate natural language detection library
-
quoted-string-parser
Quoted string parser for grammar defined in RFC3261
-
compact_bytes
A memory efficient bytes container that transparently stores bytes on the stack, when possible
-
varcon-core
Varcon-relevant data structures
-
reword
some utility functions for human-readable formatting of words
-
grep-pcre2
Use PCRE2 with the 'grep' crate
-
mdxjs
Compile MDX to JavaScript in Rust
-
mdbook-toc
mdbook preprocessor to add Table of Contents
-
uwuify
fastest text uwuifier in the west
-
hunspell-rs
Rust bindings to the Hunspell library
-
esl01-renderdag
Render a graph into ASCII or Unicode text
-
wana_kana
checking and converting between Japanese characters - Kanji, Hiragana, Katakana - and Romaji
-
sanitize-filename-reader-friendly
A filename sanitizer aiming to produce reader friendly filenames
-
vaporetto
pointwise prediction based tokenizer
-
file-size
a function formatting file sizes in 4 chars
-
typos
Source Code Spelling Correction
-
typos-vars
Source Code Spelling Correction
-
dictgen
Compile-time case-insensitive map
-
egui-dropdown
An actual dropdown list for egui
-
rapidfuzz
rapid fuzzy string matching library
-
hyper-old-types
HTTP types from hyper 0.11.x
-
sre-engine
A low-level implementation of Python's SRE regex engine
-
frida-build
Rust bindings for Frida
-
keyvalues-parser
A parser/renderer for vdf text
-
utfx
-
lingua-french-language-model
The French language model for Lingua, an accurate natural language detection library
-
re_space_view_text_document
space view that shows a single text box
-
line-span
Find line ranges and jump between next and previous lines
-
lingua-spanish-language-model
The Spanish language model for Lingua, an accurate natural language detection library
-
svgbob_cli
Transform your ascii diagrams into happy little SVG
-
sedregex
Sed-like regex library
-
harfbuzz_rs
A high-level interface to HarfBuzz, exposing its most important functionality in a safe manner using Rust
-
unicode-canonical-combining-class
Fast lookup of the Canonical Combining Class property
-
mini_paste
Fast-to-compile equivalent to ::paste
-
text_trees
textual output for tree-like structures
-
hunspell-sys
Bindings to the hunspell C API
-
mdbook-graphviz
mdbook preprocessor to add graphviz support
-
stringcase
Converts string cases between camelCase, COBOL-CASE, kebab-case, and so on
-
tiny-gradient
Make your string colored in gradient
-
fuzzt
Implementations of string similarity metrics. Includes Hamming, Levenshtein, OSA, Damerau-Levenshtein, Jaro, Jaro-Winkler, and Sørensen-Dice.
-
in_definite
Get the indefinite article ('a' or 'an') to match the given word. For example: an umbrella, a user.
-
glyph-names
Mapping of characters to glyph names according to the Adobe Glyph List Specification
-
line-numbers
Find line numbers in strings by byte offsets, quickly
-
unicode-joining-type
Fast lookup of the Unicode Joining Type and Joining Group properties
-
lindera-analyzer
A morphological analysis library
-
secular
No Diacr!
-
strings
String utilities, including an unbalanced Rope
-
pest_ascii_tree
Helper crates converting the parsing result of any pest grammar into an ascii tree
-
lingua-chinese-language-model
The Chinese language model for Lingua, an accurate natural language detection library
-
readability
Port of arc90's readability project to rust
-
lingua-japanese-language-model
The Japanese language model for Lingua, an accurate natural language detection library
-
regex-macro
A macro to generate a lazy regex expression
-
encoding8
various 8-bit encodings
-
pandoc
API that wraps calls to the pandoc 2.x executable
-
cargo-spellcheck
Checks all doc comments for spelling mistakes
-
mdbook-admonish
A preprocessor for mdbook to add Material Design admonishments
-
clippy_lints
A bunch of helpful lints to avoid common pitfalls in Rust
-
text_unit
Newtypes for text offsets
-
linkcheck
extracting and validating links
-
lingua-portuguese-language-model
The Portuguese language model for Lingua, an accurate natural language detection library
-
textcode
Text encoding/decoding library. Supports: UTF-8, ISO6937, ISO8859, GB2312
-
lingua-italian-language-model
The Italian language model for Lingua, an accurate natural language detection library
-
lingua-russian-language-model
The Russian language model for Lingua, an accurate natural language detection library
-
lingua-ukrainian-language-model
The Ukrainian language model for Lingua, an accurate natural language detection library
-
lingua-arabic-language-model
The Arabic language model for Lingua, an accurate natural language detection library
-
lingua-turkish-language-model
The Turkish language model for Lingua, an accurate natural language detection library
-
lingua-hindi-language-model
The Hindi language model for Lingua, an accurate natural language detection library
-
lingua-korean-language-model
The Korean language model for Lingua, an accurate natural language detection library
-
lingua-thai-language-model
The Thai language model for Lingua, an accurate natural language detection library
-
crop
A pretty fast text rope
-
lingua-vietnamese-language-model
The Vietnamese language model for Lingua, an accurate natural language detection library
-
lingua-latvian-language-model
The Latvian language model for Lingua, an accurate natural language detection library
-
soup
Inspired by the python library BeautifulSoup, this is a layer on top of html5ever that adds a different API for querying and manipulating HTML
-
lindera-filter
Character and token filters for Lindera
-
mdbook-katex
mdBook preprocessor rendering LaTeX equations to HTML
-
lingua-dutch-language-model
The Dutch language model for Lingua, an accurate natural language detection library
-
lingua-polish-language-model
The Polish language model for Lingua, an accurate natural language detection library
-
lingua-indonesian-language-model
The Indonesian language model for Lingua, an accurate natural language detection library
-
lingua-persian-language-model
The Persian language model for Lingua, an accurate natural language detection library
-
lingua-bokmal-language-model
The Bokmal language model for Lingua, an accurate natural language detection library
-
lingua-mongolian-language-model
The Mongolian language model for Lingua, an accurate natural language detection library
-
lingua-malay-language-model
The Malay language model for Lingua, an accurate natural language detection library
-
lingua-nynorsk-language-model
The Nynorsk language model for Lingua, an accurate natural language detection library
-
pulldown-cmark-mdcat
Render pulldown-cmark events to TTY
-
fast2s
A fast Traditional Chinese to Simplified Chinese conversion library. Built with FST, faster than most of other libraries.
-
srx
A mostly compliant Rust implementation of the Segmentation Rules eXchange (SRX) 2.0 standard for text segmentation
-
neo-mime
Strongly Typed Mimes
-
simple_excel_writer
Excel Writer
-
rasciigraph
function to plot ascii graphs
-
lingua-romanian-language-model
The Romanian language model for Lingua, an accurate natural language detection library
-
lingua-greek-language-model
The Modern Greek language model for Lingua, an accurate natural language detection library
-
lingua-hungarian-language-model
The Hungarian language model for Lingua, an accurate natural language detection library
-
lingua-danish-language-model
The Danish language model for Lingua, an accurate natural language detection library
-
lingua-finnish-language-model
The Finnish language model for Lingua, an accurate natural language detection library
-
lingua-swedish-language-model
The Swedish language model for Lingua, an accurate natural language detection library
-
lingua-slovak-language-model
The Slovak language model for Lingua, an accurate natural language detection library
-
lingua-armenian-language-model
The Armenian language model for Lingua, an accurate natural language detection library
-
lingua-estonian-language-model
The Estonian language model for Lingua, an accurate natural language detection library
-
lingua-lithuanian-language-model
The Lithuanian language model for Lingua, an accurate natural language detection library
-
lingua-catalan-language-model
The Catalan language model for Lingua, an accurate natural language detection library
-
lingua-slovene-language-model
The Slovene language model for Lingua, an accurate natural language detection library
-
lingua-czech-language-model
The Czech language model for Lingua, an accurate natural language detection library
-
lingua-bulgarian-language-model
The Bulgarian language model for Lingua, an accurate natural language detection library
-
lingua-tamil-language-model
The Tamil language model for Lingua, an accurate natural language detection library
-
lingua-serbian-language-model
The Serbian language model for Lingua, an accurate natural language detection library
-
lingua-icelandic-language-model
The Icelandic language model for Lingua, an accurate natural language detection library
-
lingua-azerbaijani-language-model
The Azerbaijani language model for Lingua, an accurate natural language detection library
-
lingua-esperanto-language-model
The Esperanto language model for Lingua, an accurate natural language detection library
-
lingua-shona-language-model
The Shona language model for Lingua, an accurate natural language detection library
-
lingua-hebrew-language-model
The Hebrew language model for Lingua, an accurate natural language detection library
-
lingua-irish-language-model
The Irish language model for Lingua, an accurate natural language detection library
-
lingua-georgian-language-model
The Georgian language model for Lingua, an accurate natural language detection library
-
lingua-xhosa-language-model
The Xhosa language model for Lingua, an accurate natural language detection library
-
lingua-macedonian-language-model
The Macedonian language model for Lingua, an accurate natural language detection library
-
lingua-kazakh-language-model
The Kazakh language model for Lingua, an accurate natural language detection library
-
lingua-zulu-language-model
The Zulu language model for Lingua, an accurate natural language detection library
-
lingua-urdu-language-model
The Urdu language model for Lingua, an accurate natural language detection library
-
lingua-sotho-language-model
The Sotho language model for Lingua, an accurate natural language detection library
-
lingua-welsh-language-model
The Welsh language model for Lingua, an accurate natural language detection library
-
lingua-belarusian-language-model
The Belarusian language model for Lingua, an accurate natural language detection library
-
lingua-tagalog-language-model
The Tagalog language model for Lingua, an accurate natural language detection library
-
lingua-marathi-language-model
The Marathi language model for Lingua, an accurate natural language detection library
-
lingua-afrikaans-language-model
The Afrikaans language model for Lingua, an accurate natural language detection library
-
lingua-maori-language-model
The Māori language model for Lingua, an accurate natural language detection library
-
lingua-somali-language-model
The Somali language model for Lingua, an accurate natural language detection library
-
lingua-albanian-language-model
The Albanian language model for Lingua, an accurate natural language detection library
-
lingua-yoruba-language-model
The Yoruba language model for Lingua, an accurate natural language detection library
-
lingua-telugu-language-model
The Telugu language model for Lingua, an accurate natural language detection library
-
lingua-tswana-language-model
The Tswana language model for Lingua, an accurate natural language detection library
-
lingua-croatian-language-model
The Croatian language model for Lingua, an accurate natural language detection library
-
lingua-latin-language-model
The Latin language model for Lingua, an accurate natural language detection library
-
lingua-basque-language-model
The Basque language model for Lingua, an accurate natural language detection library
-
lingua-ganda-language-model
The Ganda language model for Lingua, an accurate natural language detection library
-
lingua-swahili-language-model
The Swahili language model for Lingua, an accurate natural language detection library
-
lingua-bengali-language-model
The Bengali language model for Lingua, an accurate natural language detection library
-
lingua-gujarati-language-model
The Gujarati language model for Lingua, an accurate natural language detection library
-
lingua-punjabi-language-model
The Punjabi language model for Lingua, an accurate natural language detection library
-
lingua-bosnian-language-model
The Bosnian language model for Lingua, an accurate natural language detection library
-
lingua-tsonga-language-model
The Tsonga language model for Lingua, an accurate natural language detection library
-
text-colorizer
Transitionary package
-
pinot
Fast, high-fidelity OpenType parser
-
no-comment
Remove rust-style line and block comments from a char iterator
-
jayce
tokenizer 🌌
-
float-pretty-print
Format f64 for showing to user, not for serialisation
-
vi
An input method library for Vietnamese IME
-
xml2json-rs
converting to and from XML/JSON
-
diacritics
Remove diacritics from letters, for example when standardizing input for a search
-
like
A SQL like style pattern matching
-
lean-sys
Bindings to Lean 4's C API
-
marker
finding issues in CommonMark documents
-
xmldecl
Extracts an encoding from an ASCII-based bogo-XML declaration in text/html in a Web-compatible way
-
unicode_reader
Adaptors which wrap byte-oriented readers and yield the UTF-8 data as Unicode code points or grapheme clusters
-
wkhtmltopdf
High-level bindings to wkhtmltopdf
-
moto
motivated automation
-
xsv
A high performance CSV command line toolkit
-
terminal-clipboard
a minimal cross-platform clipboard
-
atelier_test
Test and example models used within the other Atelier crates
-
doc-chunks
Clusters of doc comments and dev comments as coherent view
-
regex_generate
Use regular expressions to generate text
-
mdbook-cmdrun
mdbook preprocessor to run arbitrary commands
-
string_wizard
manipulate string like wizards
-
symspell
Spelling correction & Fuzzy search
-
norad
Read and write Unified Font Object files
-
lindera-ipadic-neologd
A Japanese morphological dictionary for IPADIC NEologd
-
newdoc
Generate pre-populated module files formatted with AsciiDoc that are used in Red Hat and Fedora documentation
-
mdbook-codeblocks
A mdbook preprocessor to prepend customizable vignette to code blocks
-
mdbook-tailor
mdbook preprocessor for image-tailor
-
censor
text profanity filter
-
focaccia
no_std implementation of Unicode case folding comparisons
-
sourceannot
render snippets of source code with annotations
-
opml
OPML library for Rust
-
local-encoding
encoding/decoding strings with the local charset. Useful for working with ANSI strings on Windows.
-
cffdrs
Canadian Forest Fire Danger Rating System
-
words-count
Count the words and characters, with or without whitespaces
-
rustpython-sre_engine
A low-level implementation of Python's SRE regex engine
-
boreal
evaluate YARA rules, used to scan bytes for textual and binary patterns
-
basic-text
Basic Text strings and I/O streams
-
create_broken_files
Create broken files from other ones
-
confusables
around Unicode confusables/homoglyphs
-
chewing
(酷音) intelligent Zhuyin input method
-
fuzzywuzzy
A pure-Rust clone of the incredibly useful fuzzy string matching Python package, FuzzyWuzzy
-
savvy
R extension interface
-
vaporetto_rules
Rule-base filters for Vaporetto
-
metatensor-sys
Bindings to the metatensor C library
-
char_reader
Safely read wild streams as chars or lines
-
fontconfig
Safe, higher-level wrapper around the Fontconfig library
-
heckcheck
A heckin small test case generator
-
bashtestmd
Compiles shell commands in .md files into Bash scripts for testing
-
stringzilla
Faster SIMD-accelerated string search, sorting, fingerprints, and edit distances
-
str-utils
some traits to extend types which implement AsRef<[u8]> or AsRef<str>
-
stream-rate-limiter
A rate limiter for Tokio streams
-
yffi
Bindings for the Yrs native C foreign function interface
-
decancer
that removes common unicode confusables/homoglyphs from strings
-
tremor-kv
A Logstash-inspired key-value extractor
-
m_lexer
extensible regular expressions based lexer
-
epub-builder
generating EPUB files
-
vader_sentiment
Bindings for Rust from the original Python VaderSentiment analysis tool
-
ellipse
Truncate and ellipse strings in a human-friendly way
-
capitalize
Change first character to upper case and the rest to lower case, and other common alternatives
-
regex-split
split_inclusive for the regex crate
-
qpdf
Rust bindings to QPDF C++ library
-
crlify
A std::io::Write wrapper that replaces \n with \r\n on Windows
-
stop-words
Common stop words in many languages
-
tauri-plugin-clipboard
A clipboard plugin for Tauri that supports text, files and image, as well as clipboard update listening
-
basen
Convert binary data to ASCII with a variety of supported bases
-
rslint_errors
Pretty error reporting library based on codespan-reporting built for the RSLint project
-
slugify-rs
generate slugs from strings
-
sanitise-file-name
An unusually flexible and efficient file name sanitiser
-
fiberplane-markdown
convert Fiberplane Notebooks to and from Markdown
-
inline_colorization
format!("Lets the user {color_red}colorize{color_reset} and {style_underline}style the output{style_reset} text using inline variables");
-
unic-ucd-block
UNIC — Unicode Character Database — Unicode Blocks
-
html2runes
An HTML to Text converter
-
unicode-jp
convert Japanese Half-width-kana[半角カナ] and Wide-alphanumeric[全角英数] into normal ones
-
intuicio-data
Data module for Intuicio scripting platform
-
wkhtmltox-sys
FFI bindings to wkhtmltox
-
mandown
Markdown to groff (man page) converter
-
apidoc-attr
Apidoc attr
-
smartcat
Putting a brain behind cat. CLI interface to bring language models in the Unix ecosystem 🐈⬛
-
notan_glyph
glyph's support for Notan
-
tectonic_status_base
Basic types for reporting status messages to a user
-
codegenrs
Moving code-gen out of build.rs
-
kakasi
Romanize hiragana, katakana and kanji (Japanese text)
-
asciifolding
ascii folding library
-
null-terminated-str
FFI-friendly utf-8 string, enabling const null-terminated str and caching of the non-terminated string to avoid frequent allocation
-
indented
Format data with indentation
-
terminal-emoji
safely displaying emoji inside of terminals
-
controlled-option
Custom Option type with explicit control over niches and memory layout
-
unic-idna-mapping
UNIC — IDNA — IDNA Mapping Table
-
posix-space
Pure Rust implementation of isspace for the POSIX locale