5 releases (3 breaking)
Uses new Rust 2024
| Version | Released |
|---|---|
| 0.4.0 (new) | Mar 5, 2026 |
| 0.3.1 | Mar 1, 2026 |
| 0.3.0 | Feb 21, 2026 |
| 0.2.0 | Feb 8, 2026 |
| 0.1.0 | Feb 8, 2026 |
clockwords
Find and resolve natural-language time expressions in text.
clockwords scans free-form text for relative time expressions like "last Friday from 9 to eleven", "yesterday at 3pm", or "letzten Freitag von 9 bis 12 Uhr" and returns their byte-offset spans together with resolved DateTime<Utc> values. It supports English, German, French, and Spanish out of the box.
Built for real-time GUI applications (time-tracking, note-taking, calendars) where the user types naturally and the app highlights detected time references as they appear. Timezone-aware — times the user enters are interpreted in their local timezone (configurable, defaults to UTC).
Features
- Four languages: English, German, French, Spanish
- Timezone-aware: User input is interpreted in a configurable timezone (defaults to UTC for backward compatibility)
- Byte-offset spans: Directly usable for text highlighting in any GUI framework
- Resolved times: Every match resolves to a concrete `DateTime<Utc>` point or range
- Incremental typing support: Detects partial matches (e.g. "yester" while the user is still typing "yesterday")
- Accent-tolerant: Handles días/dias, à/a, mañana/manana, dernière/derniere
- Fast rejection: Aho-Corasick keyword prefilter skips text with no time-related words in sub-microsecond time
- Zero allocations on rejection: If no keywords are found, `scan()` returns immediately
- No unsafe code
- Defensive: All internal date arithmetic returns `Option`, so edge-case dates cannot cause panics
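The accent-tolerant matching listed above can be pictured with a small folding helper. This is only a sketch of one possible approach, not necessarily how clockwords implements it; `fold_accents` and `matches_keyword` are hypothetical names, and only a handful of precomposed Latin characters are covered.

```rust
/// Fold a small set of accented Latin letters to their ASCII base letter.
/// Assumption: input uses precomposed characters (e.g. a single 'ñ' code point).
fn fold_accents(input: &str) -> String {
    input
        .chars()
        .map(|c| match c {
            'á' | 'à' | 'â' => 'a',
            'é' | 'è' | 'ê' => 'e',
            'í' | 'ì' | 'î' => 'i',
            'ó' | 'ò' | 'ô' => 'o',
            'ú' | 'ù' | 'û' => 'u',
            'ñ' => 'n',
            other => other,
        })
        .collect()
}

/// Compare a typed word against a keyword, ignoring case and the accents above.
fn matches_keyword(typed: &str, keyword: &str) -> bool {
    fold_accents(&typed.to_lowercase()) == fold_accents(&keyword.to_lowercase())
}
```

With this helper, `matches_keyword("manana", "mañana")` is true. Note that folding changes the byte length of the text, so any span computed on folded text would have to be mapped back before it is used to slice the original input.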
Quick Start
Add to your Cargo.toml:
```toml
[dependencies]
clockwords = "0.4"
chrono = "0.4"
```
Basic Usage
```rust
use clockwords::{default_scanner, ResolvedTime};
use chrono::Utc;

fn main() {
    // Create a scanner with all four languages enabled
    let scanner = default_scanner();
    let now = Utc::now();
    let text = "The last hour I coded the initial code for the time library";
    let matches = scanner.scan(text, now);

    for m in &matches {
        println!(
            "Found '{}' at bytes {}..{} ({:?})",
            &text[m.span.as_range()],
            m.span.start,
            m.span.end,
            m.kind,
        );
        match &m.resolved {
            ResolvedTime::Point(dt) => println!(" Resolved to: {dt}"),
            ResolvedTime::Range { start, end } => {
                println!(" Resolved to: {start} .. {end}")
            }
        }
    }
}
```
Example output (the timestamps depend on the current time):
```text
Found 'The last hour' at bytes 0..13 (TimeRange)
 Resolved to: 2026-02-08T12:30:00Z .. 2026-02-08T13:30:00Z
```
Select Specific Languages
```rust
use clockwords::scanner_for_languages;

// Only English and German
let scanner = scanner_for_languages(&["en", "de"]);
```
Timezone Support
By default, all times are interpreted in UTC. To interpret user input in a specific timezone, configure ParserConfig::timezone or use scan_with_tz():
```rust
use clockwords::{ParserConfig, TimeExpressionScanner, Tz, default_scanner};
use chrono::Utc;

// Option 1: Set timezone in config
let config = ParserConfig {
    timezone: Tz::Europe__Berlin,
    ..Default::default()
};
// Pass config when constructing the scanner (e.g. via TimeExpressionScanner::new)

// Option 2: Override per scan call
let scanner = default_scanner();
let matches = scanner.scan_with_tz("yesterday at 3pm", Utc::now(), Tz::Europe__Berlin);
// "3pm" is interpreted as 15:00 Berlin time → resolves to 14:00 UTC (in winter)
```
When a timezone is set, all day boundaries (midnight), time-of-day values, and weekday calculations use the user's local timezone. The resolved output always remains in UTC. For example, with Europe/Berlin (CET, UTC+1 in winter):
- "today" at 23:30 UTC (= 00:30 CET next day) → the range covers the next calendar day in Berlin
- "at 3pm" → resolves to 14:00 UTC (not 15:00 UTC)
- "the last hour" → unchanged (duration-based, timezone-independent)
Supported Expressions
Relative Days
| Language | Examples |
|---|---|
| English | today, tomorrow, yesterday |
| German | heute, morgen, gestern |
| French | aujourd'hui, demain, hier |
| Spanish | hoy, mañana, ayer |
Resolves to a full-day Range (midnight to midnight in the configured timezone).
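To make "midnight to midnight in the configured timezone" concrete, here is a minimal arithmetic sketch on Unix timestamps. It assumes a fixed UTC offset (for example 3600 seconds for CET) and therefore ignores daylight-saving transitions, which the crate handles through chrono-tz. The function names are hypothetical and not part of the clockwords API.

```rust
const SECS_PER_DAY: i64 = 86_400;

/// UTC bounds (start inclusive, end exclusive) of the local calendar day that
/// contains `now_utc`, shifted by `day_offset` days (0 = today, -1 = yesterday).
/// `utc_offset_secs` is the zone's fixed offset, e.g. 3600 for CET (an assumption).
fn local_day_range_utc(now_utc: i64, utc_offset_secs: i64, day_offset: i64) -> (i64, i64) {
    let local_now = now_utc + utc_offset_secs;
    // div_euclid floors toward negative infinity, so pre-1970 timestamps also work.
    let local_day = local_now.div_euclid(SECS_PER_DAY) + day_offset;
    let start_utc = local_day * SECS_PER_DAY - utc_offset_secs;
    (start_utc, start_utc + SECS_PER_DAY)
}

/// UTC timestamp of a local wall-clock time on the local day that starts at `day_start_utc`.
fn local_time_to_utc(day_start_utc: i64, hour: i64, minute: i64) -> i64 {
    day_start_utc + hour * 3_600 + minute * 60
}
```

At 23:30 UTC with a +1 hour offset, the local clock already shows 00:30 on the next day, so the "today" range starts at 23:00 UTC of the current UTC day. A local 15:00 on a given local day lands at 14:00 UTC.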
Relative Weekdays
| Language | Examples |
|---|---|
| English | last Friday, next Monday, this Wednesday |
| German | letzten Freitag, nächsten Montag, diesen Mittwoch |
| French | vendredi dernier, lundi prochain, ce mercredi |
| Spanish | el viernes pasado, el próximo lunes, este miércoles |
Resolves to a full-day Range (midnight to midnight in the configured timezone). French and Spanish support both pre- and post-positive word order (e.g. lundi prochain and prochain lundi). Spanish also supports el viernes que viene.
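The day arithmetic behind "last Friday" or "next Monday" can be sketched with modular arithmetic. This is an illustration under stated assumptions rather than the crate's actual rule: weekdays are numbered 0 = Monday to 6 = Sunday, "last" means strictly before today, and "next" means strictly after today. How clockwords treats the same-weekday case is not stated in this README. Both function names are hypothetical.

```rust
/// Days to go back from `today` to the most recent `target` weekday strictly
/// before today. Both arguments must be in 0..=6 (0 = Monday, 6 = Sunday).
fn days_back_to_last(today: u32, target: u32) -> u32 {
    // (today - target) mod 7 is 0 when both are the same weekday;
    // shifting by one before and after the modulo maps that case to 7.
    (today + 7 - target + 6) % 7 + 1
}

/// Days to go forward from `today` to the first `target` weekday strictly after today.
fn days_forward_to_next(today: u32, target: u32) -> u32 {
    (target + 7 - today + 6) % 7 + 1
}
```

On a Monday, `days_back_to_last(0, 4)` gives 3 (the previous Friday), and on a Friday `days_forward_to_next(4, 0)` gives 3 (the coming Monday). The resulting day count can then be fed into a full-day range calculation.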
Day Offsets
| Language | Examples |
|---|---|
| English | in 4 days, two days ago, in three days |
| German | in 3 Tagen, vor zwei Tagen |
| French | dans 3 jours, il y a deux jours |
| Spanish | en 3 días, hace 2 dias |
Supports both digits and written-out number words (1–30).
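Accepting both digits and number words amounts to a small lookup. The sketch below covers English only and uses a hypothetical `parse_count` helper; it assumes the 1–30 limit applies to number words and lets any digit string through, which may differ from the crate's behavior.

```rust
/// English number words for 1..=30, indexed so that WORDS[n - 1] spells n.
const WORDS: [&str; 30] = [
    "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten",
    "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen",
    "eighteen", "nineteen", "twenty", "twenty-one", "twenty-two", "twenty-three",
    "twenty-four", "twenty-five", "twenty-six", "twenty-seven", "twenty-eight",
    "twenty-nine", "thirty",
];

/// Parse a count given either as digits ("4") or as a number word ("four").
/// Returns None when the token is neither a number nor a known word.
fn parse_count(token: &str) -> Option<u32> {
    let token = token.trim().to_lowercase();
    if let Ok(n) = token.parse::<u32>() {
        return Some(n);
    }
    WORDS
        .iter()
        .position(|w| *w == token.as_str())
        .map(|idx| idx as u32 + 1)
}
```

Returning `Option` keeps an unknown word such as "forty" from turning into a bogus offset; the caller simply gets no match.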
Time Specifications
| Language | Examples |
|---|---|
| English | at 3pm, at 3 am, 13 o'clock, at 3:30pm, 11:30am, at 15:30 |
| German | um 15 Uhr, um 15:30 Uhr, um 15:30 |
| French | à 13h, à 13h30, à 13:30 |
| Spanish | a las 3, a las 15:30 |
Colon-delimited minutes (H:MM) are supported in all languages. In English, am/pm is optional — bare H:MM with at is treated as 24-hour time. French supports both h and : as separators (13h30 and 13:30).
Resolves to a Point in time.
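The am/pm handling described above boils down to a 12-hour to 24-hour conversion. A minimal sketch, using a hypothetical `Meridiem` enum and `to_24h` function that are not part of the clockwords API:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Meridiem {
    Am,
    Pm,
}

/// Convert a parsed hour plus optional am/pm marker to a 24-hour value.
/// Returns None for impossible inputs instead of panicking.
fn to_24h(hour: u32, meridiem: Option<Meridiem>) -> Option<u32> {
    match meridiem {
        // Without am/pm the hour is taken as 24-hour time.
        None => (hour < 24).then_some(hour),
        // 12am is midnight; 1am..11am map to themselves.
        Some(Meridiem::Am) if (1..=12).contains(&hour) => Some(hour % 12),
        // 12pm is noon; 1pm..11pm map to 13..23.
        Some(Meridiem::Pm) if (1..=12).contains(&hour) => Some(hour % 12 + 12),
        Some(_) => None,
    }
}
```

Here "at 3pm" becomes hour 15, "at 15:30" keeps hour 15, and a nonsensical "13pm" yields `None`, so no match would be reported for it.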
Time Ranges
| Language | Examples |
|---|---|
| English | the last hour, last minute, between 9 and 12, from 9 to 12 |
| German | die letzte Stunde, von 9 bis 12 Uhr, zwischen 9 und 12 |
| French | la dernière heure, entre 9 et 12 heures |
| Spanish | la última hora, entre las 9 y las 12 |
English supports both between X and Y and from X to Y with number words (from nine to five).
Combined Expressions
Any day reference (relative day, weekday, or day offset) can be combined with a time specification or time range in a single expression. The entire phrase is detected as one match:
Relative day + time:
| Language | Examples |
|---|---|
| English | yesterday at 3pm, yesterday at 3:30pm, yesterday at 15:30, tomorrow between 9 and 12, yesterday from 9 to 11 |
| German | gestern um 15 Uhr, gestern um 15:30 Uhr, gestern um 15:30, gestern von 9 bis 12 Uhr |
| French | hier à 13h, hier à 13h30, hier à 13:30, hier entre 9 et 12 heures |
| Spanish | ayer a las 3, ayer a las 15:30, ayer entre las 9 y las 12 |
Weekday + time:
| Language | Examples |
|---|---|
| English | last Friday at 3pm, last Friday at 3:30pm, last Friday at 15:30, last Friday from 9 to eleven, next Monday between 9 and 12 |
| German | letzten Freitag um 15 Uhr, letzten Freitag um 15:30 Uhr, nächsten Montag um 9:15, diesen Mittwoch zwischen 9 und 11 |
| French | vendredi dernier à 13h, vendredi dernier à 13h30, vendredi dernier à 13:30, ce lundi à 14h30, ce mercredi entre 9 et 11 heures |
| Spanish | el viernes pasado a las 3, el viernes pasado a las 3:30, el próximo lunes a las 9:30, el pasado viernes entre las 9 y las 12 |
Combined expressions resolve to either a Point (day + time spec) or a Range (day + time range) on the specified day.
Architecture
How Scanning Works
```text
Input text
          │
          ▼
┌─────────────────────┐
│  Aho-Corasick       │  Fast keyword check (~ns)
│  Prefilter          │  Rejects text with no time words
└─────────┬───────────┘
          │ keywords found
          ▼
┌─────────────────────┐
│  Per-Language       │  Regex rules with resolver closures
│  Grammar Rules      │  Run for each enabled language
└─────────┬───────────┘
          │ raw matches
          ▼
┌─────────────────────┐
│  Deduplication      │  Prefer Complete > Partial, longer > shorter
│  & Sorting          │  Remove overlapping inferior matches
└─────────┬───────────┘
          │
          ▼
    Vec<TimeMatch>
```
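The deduplication stage can be approximated by a greedy pass. The sketch below is an assumption about how such a step might look, not the crate's code: `RawMatch` is a hypothetical stand-in for the internal match type, and ties are broken by start position.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct RawMatch {
    start: usize,
    end: usize, // exclusive byte offset, assumed >= start
    complete: bool,
}

/// Two half-open ranges overlap when each starts before the other ends.
fn overlaps(a: &RawMatch, b: &RawMatch) -> bool {
    a.start < b.end && b.start < a.end
}

/// Keep the best non-overlapping matches and return them in text order.
fn dedup(mut raw: Vec<RawMatch>) -> Vec<RawMatch> {
    // Best candidates first: complete before partial, then longer before shorter.
    raw.sort_by_key(|m| (!m.complete, std::cmp::Reverse(m.end - m.start), m.start));
    let mut kept: Vec<RawMatch> = Vec::new();
    for m in raw {
        // Accept a candidate only if it does not collide with a better one.
        if kept.iter().all(|k| !overlaps(k, &m)) {
            kept.push(m);
        }
    }
    kept.sort_by_key(|m| m.start);
    kept
}
```

For "yesterday at 3pm", candidate matches for "yesterday" (0..9), "at 3pm" (10..16), and the whole phrase (0..16) collapse to the single 0..16 match, which is what lets a combined expression surface as one result.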
Buffer-Rescan Strategy
Rather than maintaining an incremental parser state machine, clockwords re-scans the full text buffer on every call to scan(). This is the right trade-off for GUI text input:
- Input buffers are typically < 1 KB
- Full regex scan of a short buffer completes in microseconds
- Dramatically simpler than maintaining parser state across edits
- No edge cases around cursor position, insertions, or deletions
Type Overview
| Type | Description |
|---|---|
| `TimeExpressionScanner` | Main entry point; holds language parsers and prefilter |
| `TimeMatch` | A single match result: span + confidence + resolved time + kind |
| `Span` | Byte-offset range (`start..end`) for slicing the original text |
| `ResolvedTime` | `Point(DateTime<Utc>)` or `Range { start, end }` |
| `MatchConfidence` | `Partial` (user still typing) or `Complete` |
| `ExpressionKind` | `RelativeDay`, `RelativeDayOffset`, `TimeSpecification`, `TimeRange`, `Combined` |
| `ParserConfig` | Settings: `report_partial` (default `true`), `max_matches` (default `10`), `timezone` (default `Tz::UTC`) |
| `Tz` | Re-exported from chrono-tz: IANA timezone (e.g. `Tz::Europe__Berlin`, `Tz::US__Eastern`) |
GUI Integration
clockwords is designed for real-time text highlighting. Here's how to wire it up:
```rust
use clockwords::{default_scanner, MatchConfidence, TimeExpressionScanner};
use chrono::Utc;

struct App {
    scanner: TimeExpressionScanner,
}

impl App {
    fn new() -> Self {
        Self {
            scanner: default_scanner(),
        }
    }

    /// Call this on every keystroke
    fn on_text_changed(&self, text: &str) {
        let matches = self.scanner.scan(text, Utc::now());
        for m in &matches {
            let range = m.span.start..m.span.end;
            let style = match m.confidence {
                MatchConfidence::Complete => "solid_underline",
                MatchConfidence::Partial => "dotted_underline",
            };
            // Apply `style` to this byte range in your text widget
            println!("Highlight bytes {range:?} with {style}");
        }
    }
}
```
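Spans are byte offsets, while some text widgets address text by character index. If your toolkit works that way, a conversion along these lines may help. This is a sketch with a hypothetical helper name, and it assumes the widget counts Unicode scalar values (`char`s); widgets that count UTF-16 code units or grapheme clusters need a different count.

```rust
/// Convert a byte range of `text` into a range of `char` indices.
/// Returns None if a bound is out of range or not on a char boundary.
fn byte_span_to_char_span(text: &str, start: usize, end: usize) -> Option<(usize, usize)> {
    if start > end || !text.is_char_boundary(start) || !text.is_char_boundary(end) {
        return None;
    }
    // Count the chars before the span, then the chars inside it.
    let char_start = text[..start].chars().count();
    let char_len = text[start..end].chars().count();
    Some((char_start, char_start + char_len))
}
```

In "Café yesterday" the word "yesterday" occupies bytes 6..15 but characters 5..14, because "é" takes two bytes in UTF-8.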
Partial Match Highlighting
When the user types "I worked yester", the scanner returns a Partial match on "yester". Your GUI can show a dimmed or dotted underline to hint that a time expression is being formed. Once the user completes "yesterday", the match upgrades to Complete with a fully resolved time.
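The Partial versus Complete distinction can be illustrated with plain prefix matching on a single word. This is a simplified sketch, not the crate's implementation: `Confidence` and `classify` are hypothetical names, and the minimum prefix length of 3 mirrors the `keyword_prefixes()` note in the "Adding a New Language" section.

```rust
#[derive(Debug, PartialEq)]
enum Confidence {
    Partial,
    Complete,
}

/// Shortest typed prefix that is reported as a partial match (assumed to be 3).
const MIN_PREFIX_LEN: usize = 3;

/// Classify one typed word against a keyword list.
fn classify(typed: &str, keywords: &[&str]) -> Option<Confidence> {
    let typed = typed.to_lowercase();
    if keywords.iter().any(|k| *k == typed.as_str()) {
        return Some(Confidence::Complete);
    }
    let long_enough = typed.chars().count() >= MIN_PREFIX_LEN;
    if long_enough && keywords.iter().any(|k| k.starts_with(typed.as_str())) {
        return Some(Confidence::Partial);
    }
    None
}
```

With the keywords "yesterday" and "today", typing "yester" classifies as `Partial`, "yesterday" as `Complete`, and "ye" as no match, since very short prefixes would otherwise flicker on almost every word.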
To disable partial matching:
```rust
use clockwords::{ParserConfig, TimeExpressionScanner};

let config = ParserConfig {
    report_partial: false,
    ..Default::default()
};
```
Adding a New Language
- Create `src/lang/xx.rs` (copy an existing language file as a template)
- Implement the `LanguageParser` trait:
  - `lang_id()`: return the ISO 639-1 code (e.g. `"it"`)
  - `keywords()`: return Aho-Corasick trigger words
  - `keyword_prefixes()`: return typing prefixes (length >= 3)
  - `parse()`: call `apply_rules()` with your `GrammarRule` list
- Add number-word mappings to `src/lang/numbers.rs`
- Register the language in `src/lib.rs` → `scanner_for_languages()`
- Add tests in `tests/`
Each GrammarRule is a compiled regex paired with a resolver closure:
```rust
GrammarRule {
    pattern: Regex::new(r"(?i)\b(?P<day>oggi|domani|ieri)\b").unwrap(),
    kind: ExpressionKind::RelativeDay,
    resolver: |caps, now, tz| {
        let offset = match caps.name("day")?.as_str().to_lowercase().as_str() {
            "oggi" => 0,
            "domani" => 1,
            "ieri" => -1,
            _ => return None,
        };
        resolve::resolve_relative_day(offset, now, tz)
    },
}
```
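To experiment with the rule-plus-resolver shape without pulling in the crate, a std-only analogue can help. This is a deliberately simplified sketch: `SimpleRule` and `resolve_word` are hypothetical, a keyword list stands in for the regex, and the resolver returns a day offset rather than a resolved time.

```rust
/// Simplified stand-in for a grammar rule: keywords instead of a regex,
/// and a resolver that returns a day offset instead of a resolved time.
struct SimpleRule {
    keywords: &'static [&'static str],
    resolver: fn(&str) -> Option<i64>,
}

/// Italian relative days, mirroring the oggi/domani/ieri example.
fn italian_relative_day() -> SimpleRule {
    SimpleRule {
        keywords: &["oggi", "domani", "ieri"],
        // A non-capturing closure coerces to a plain function pointer.
        resolver: |word| match word {
            "oggi" => Some(0),
            "domani" => Some(1),
            "ieri" => Some(-1),
            _ => None,
        },
    }
}

/// Run the first rule whose keyword list contains the (lowercased) word.
fn resolve_word(rules: &[SimpleRule], word: &str) -> Option<i64> {
    let word = word.to_lowercase();
    rules
        .iter()
        .filter(|rule| rule.keywords.iter().any(|k| *k == word.as_str()))
        .find_map(|rule| (rule.resolver)(&word))
}
```

As in the real rule, the resolver returns `Option`, so an unexpected capture falls through as "no match" instead of panicking.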
Performance
| Scenario | Approximate Time |
|---|---|
| No keywords in text (fast rejection) | ~1 µs |
| Short sentence with 1 match | ~10 µs |
| Paragraph with multiple matches | ~10 µs |
Because of the Aho-Corasick prefilter, text without any time-related words is rejected in about a microsecond, and the regex engine is never invoked for it.
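The idea of a keyword gate in front of the regex stage can be shown without the aho-corasick crate. The sketch below is a naive substitute, not the crate's prefilter: it checks each keyword in turn (Aho-Corasick finds all keywords in a single pass), and it only ignores ASCII case. Both function names are hypothetical.

```rust
/// Allocation-free, ASCII-case-insensitive substring test.
fn contains_ignore_ascii_case(haystack: &str, needle: &str) -> bool {
    let (h, n) = (haystack.as_bytes(), needle.as_bytes());
    // `windows` panics on a zero size, so empty needles are rejected first.
    !n.is_empty() && h.windows(n.len()).any(|w| w.eq_ignore_ascii_case(n))
}

/// Cheap gate: run the expensive grammar rules only if this returns true.
fn may_contain_time_expression(text: &str, keywords: &[&str]) -> bool {
    keywords.iter().any(|k| contains_ignore_ascii_case(text, *k))
}
```

A scanner built this way would return an empty result as soon as the gate fails, which is why text without time words costs so little. The gate may let through text that later produces no match; it must never reject text that does contain a keyword.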
Running Tests
```sh
cargo test
```
The test suite includes 141 integration tests + 1 doctest covering:
- All four languages with various expression types
- Combined weekday + time expressions across all languages
- Timezone-aware resolution (Europe/Berlin, US/Eastern, UTC)
- Cross-midnight timezone boundary handling
- Accent-tolerant variants (with and without diacritics)
- Embedded expressions in longer sentences
- Colon-delimited time parsing (`3:30pm`, `15:30`, `13h30`, `13:30`)
- `from X to Y` with number words (`nine to five`)
- Incremental/partial matching
- Edge cases (empty input, no false positives)
- Cross-language default scanner
Running the TUI Demo
An interactive terminal demo is included:
```sh
cargo run --example tui_demo
```
Type time expressions and watch them get parsed in real time. Press ESC to quit.
Dependencies
| Crate | Purpose |
|---|---|
| `chrono` | Date/time types and arithmetic |
| `chrono-tz` | IANA timezone database for timezone-aware resolution |
| `regex` | Per-language grammar patterns |
| `aho-corasick` | Fast multi-keyword prefilter |
License
Licensed under the Apache License, Version 2.0 (LICENSE or http://www.apache.org/licenses/LICENSE-2.0).