3 unstable releases

Version | Released |
---|---|
0.3.0-alpha2 | Nov 8, 2021 |
0.3.0-alpha1 | Oct 11, 2021 |
0.1.1 | |
0.1.0 | Apr 1, 2020 |
String Interner
A data structure to cache strings efficiently, with minimal memory footprint and the ability to associate the interned strings with unique symbols. These symbols allow for constant-time comparisons and look-ups of the underlying interned string contents. Iterating through the interned strings is also cache efficient.
Internals
- Internally a hashmap `M` and a vector `V` are used. `V` stores the contents of interned strings while `M` has internal references into the strings of `V` to avoid duplicates. `V` stores the strings with an indirection to avoid iterator invalidation.
- Returned symbols usually have a low memory footprint and are efficiently comparable.
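
The interplay of `M` and `V` can be sketched in safe Rust as follows. This is a minimal illustration, not the crate's actual implementation: the names `SketchInterner` and `Symbol` are hypothetical, and `Rc<str>` is swapped in for the internal references so that the map and the vector can share one allocation per string without `unsafe` code.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical symbol type: an index into the vector `V`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Symbol(u32);

/// Illustrative interner (assumption: not the crate's real layout).
#[derive(Default)]
pub struct SketchInterner {
    /// `M`: maps string contents to their symbol, used to detect duplicates.
    map: HashMap<Rc<str>, Symbol>,
    /// `V`: holds every interned string once, in insertion order.
    values: Vec<Rc<str>>,
}

impl SketchInterner {
    /// Returns the existing symbol for `s`, or interns `s` and returns a new one.
    pub fn get_or_intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.map.get(s) {
            return sym;
        }
        let sym = Symbol(self.values.len() as u32);
        // One heap allocation, shared by `M` and `V` through the `Rc`.
        let shared: Rc<str> = Rc::from(s);
        self.values.push(Rc::clone(&shared));
        self.map.insert(shared, sym);
        sym
    }

    /// Looks up the string behind a symbol by indexing into `V`.
    pub fn resolve(&self, sym: Symbol) -> Option<&str> {
        self.values.get(sym.0 as usize).map(|s| &**s)
    }

    /// Iterates over the interned strings in insertion order.
    pub fn iter(&self) -> impl Iterator<Item = &str> + '_ {
        self.values.iter().map(|s| &**s)
    }
}
```

Interning the same contents twice yields the same `Symbol`, so comparing two symbols is a single integer comparison, and `resolve` is a plain vector index.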
Planned Features
- Safe abstraction wrapper that protects the user from the following misuses:
  - Using symbols of a different string interner instance to resolve strings in another.
  - Using symbols that are no longer valid (i.e. the associated string interner is no longer available).
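
One way the first misuse could be caught at runtime is to tag each symbol with the identity of the interner that created it. The sketch below is purely hypothetical and is not the planned design: `CheckedInterner`, `TaggedSymbol` and the global id counter are all assumptions made for illustration, and a linear duplicate scan is used to keep it short.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Source of unique ids, one per interner instance (hypothetical scheme).
static NEXT_INTERNER_ID: AtomicU32 = AtomicU32::new(0);

/// Hypothetical symbol that remembers which interner created it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TaggedSymbol {
    interner_id: u32,
    index: u32,
}

/// Illustrative interner that rejects foreign symbols.
pub struct CheckedInterner {
    id: u32,
    strings: Vec<String>,
}

impl CheckedInterner {
    pub fn new() -> Self {
        Self {
            id: NEXT_INTERNER_ID.fetch_add(1, Ordering::Relaxed),
            strings: Vec::new(),
        }
    }

    /// Interns `s`; duplicates are found with a linear scan for brevity.
    pub fn get_or_intern(&mut self, s: &str) -> TaggedSymbol {
        let index = match self.strings.iter().position(|x| x.as_str() == s) {
            Some(i) => i,
            None => {
                self.strings.push(s.to_owned());
                self.strings.len() - 1
            }
        };
        TaggedSymbol { interner_id: self.id, index: index as u32 }
    }

    /// Returns `None` when the symbol was created by a different interner.
    pub fn resolve(&self, sym: TaggedSymbol) -> Option<&str> {
        if sym.interner_id != self.id {
            return None;
        }
        self.strings.get(sym.index as usize).map(String::as_str)
    }
}
```

The check doubles the size of each symbol, which works against the low memory footprint described above. The sketch also does not address the second misuse, symbols that outlive their interner.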
License
Licensed under either of
- Apache license, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
Changelog
- 0.7.1
  - CRITICAL fix use after free bug in `StringInterner::clone()`
  - implement `std::iter::Extend` for `StringInterner`
  - `Sym::from_usize` now avoids using `unsafe` code
  - optimize `FromIterator` impl of `StringInterner`
  - move to Rust 2018 edition

  Thanks YOSHIOKA Takuma for implementing this release.
- 0.7.0
  - changed license from MIT to MIT/APACHE2.0
  - removed generic impl of `Symbol` for types that are `From<usize>` and `Into<usize>`
  - removed `StringInterner::clear` API since its usage breaks invariants
  - added `StringInterner::{capacity, reserve}` APIs
  - introduced a new default symbol type `Sym` that is a thin wrapper around `NonZeroU32` (idea by koute)
  - made `DefaultStringInterner` a type alias for the new `StringInterner<Sym>`
  - added convenient `FromIterator` impl to `StringInterner<S: Sym>`
  - dev
    - rewrote all unit tests (serde tests are still missing)
    - entirely refactored benchmark framework
  - added `html_root_url` to crate root

  Thanks matklad for suggestions and impulses
- 0.6.3
  - fixed a bug that `StringInterner`'s `Send` impl didn't respect its generic `HashBuilder` parameter. Fixes GitHub issue #4.
- 0.6.2
  - added `shrink_to_fit` public method to `StringInterner` (by artemshein)
- 0.6.1
  - fixed a bug that inserting non-owning string types (e.g. `str`) was broken due to dangling pointers (Thanks to artemshein for fixing it!)
- 0.6.0
  - added optional serde serialization and deserialization support
  - more efficient and generic `PartialEq` implementation for `StringInterner`
  - made `StringInterner` generic over `BuildHasher` to allow for custom hashers
- 0.5.0
  - added `IntoIterator` trait implementation for `StringInterner`
  - greatly simplified iterator code
- 0.4.0
  - removed restrictive constraint for `Unsigned` for `Symbol`
- 0.3.3
  - added `Send` and `Sync` to `InternalStrRef` to make `StringInterner` itself `Send` and `Sync`
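
The 0.7.0 entry describes `Sym` as a thin wrapper around `NonZeroU32`, and 0.7.1 notes that `Sym::from_usize` avoids `unsafe` code. A minimal sketch of such a type might look like the following. The name `SketchSym`, the `Option`-returning signature, the `to_usize` method and the `index + 1` encoding are assumptions for illustration, not the crate's actual definitions.

```rust
use std::convert::TryFrom;
use std::mem::size_of;
use std::num::NonZeroU32;

/// Illustrative symbol type modelled on the changelog's description of `Sym`.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct SketchSym(NonZeroU32);

impl SketchSym {
    /// Builds a symbol from a zero-based index by storing `index + 1`,
    /// so the stored value is never zero. Returns `None` if it does not fit.
    pub fn from_usize(index: usize) -> Option<Self> {
        let raw = u32::try_from(index).ok()?.checked_add(1)?;
        NonZeroU32::new(raw).map(SketchSym)
    }

    /// Recovers the zero-based index.
    pub fn to_usize(self) -> usize {
        (self.0.get() - 1) as usize
    }
}

/// Size in bytes of `Option<SketchSym>`; the unused zero value lets the
/// compiler represent `None` without extra space.
pub fn option_sym_size() -> usize {
    size_of::<Option<SketchSym>>()
}
```

Because zero is never a valid `NonZeroU32`, `Option<SketchSym>` occupies the same four bytes as a bare `u32`, which is the usual motivation for this kind of wrapper.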