# reasoning-parser

A Rust library for detecting and extracting reasoning content (chain-of-thought) from Large Language Model outputs. Handles models that emit explicit thinking blocks delimited by tokens such as `<think>` and `</think>`.

## Features
- **Unified Interface** - Single API for multiple model formats
- **Streaming Support** - Incremental parsing with state preservation across chunks
- **Parser Pooling** - Efficient reuse of parser instances for high concurrency
- **Partial Token Handling** - Correctly handles tokens split across chunk boundaries
- **Model Auto-Detection** - Pattern-based automatic parser selection
- **Extensible** - Easy to add support for new model formats
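The partial-token handling above rests on one idea: when a chunk ends with a proper prefix of a delimiter token, that suffix must be held back until the next chunk shows whether it completes the token. This is an illustrative sketch of that idea (assuming ASCII delimiters), not the crate's actual implementation:

```rust
/// Length of the longest suffix of `buf` that is a proper prefix of `token`.
/// Such a suffix must be buffered: the next chunk may complete the token.
/// (Sketch only; assumes ASCII delimiters so byte slicing is safe.)
fn held_back_len(buf: &str, token: &str) -> usize {
    for len in (1..token.len()).rev() {
        if buf.ends_with(&token[..len]) {
            return len;
        }
    }
    0
}

fn main() {
    // "</th" could be the start of "</think>", so 4 bytes are held back.
    assert_eq!(held_back_len("reasoning text</th", "</think>"), 4);
    // "answer>" does not end in a prefix of "</think>"; emit everything.
    assert_eq!(held_back_len("answer>", "</think>"), 0);
}
```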
## Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
reasoning-parser = "1.0"
```

## Quick Start
```rust
use reasoning_parser::{ParserFactory, ReasoningParser};

#[tokio::main]
async fn main() {
    let factory = ParserFactory::new();
    let parser = factory.get_pooled("deepseek-r1");
    let mut p = parser.lock().await;

    let result = p
        .detect_and_parse_reasoning("<think>Let me analyze this...</think>The answer is 42.")
        .unwrap();

    println!("Reasoning: {}", result.reasoning_text); // "Let me analyze this..."
    println!("Answer: {}", result.normal_text);       // "The answer is 42."
}
```
## Supported Models

| Model | Token Format | Notes |
|---|---|---|
| DeepSeek-R1 | `<think>` / `</think>` | Starts in reasoning mode |
| Qwen3 | `<think>` / `</think>` | Explicit reasoning blocks |
| Qwen3-Thinking | `<think>` / `</think>` | Starts in reasoning mode |
| GLM-4.5/4.6/4.7 | `<think>` / `</think>` | Explicit reasoning blocks |
| Kimi | `◁think▷` / `◁/think▷` | Unicode delimiters |
| Step3 | `<think>` / `</think>` | Starts in reasoning mode |
| MiniMax M2 | `<think>` / `</think>` | Auto-prepends start token |
| Cohere Command | `<\|START_THINKING\|>` / `<\|END_THINKING\|>` | CMD3/CMD4 format |
| Nemotron-Nano | `<think>` / `</think>` | Qwen3-compatible |
Unknown models fall back to a passthrough parser that returns all text as normal output.
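The split a parser performs, including this passthrough fallback, can be sketched as a standalone function. This is a simplified model using the common `<think>` delimiters; the crate's real parsers also handle streaming and models that start in reasoning mode:

```rust
/// Simplified one-shot split: returns (reasoning_text, normal_text).
/// Sketch of the behavior only, not the crate's implementation.
fn split_reasoning(text: &str) -> (String, String) {
    if let Some(start) = text.find("<think>") {
        let after = &text[start + "<think>".len()..];
        if let Some(end) = after.find("</think>") {
            let reasoning = after[..end].to_string();
            let normal = format!("{}{}", &text[..start], &after[end + "</think>".len()..]);
            return (reasoning, normal);
        }
    }
    // No complete reasoning block: pass everything through as normal text.
    (String::new(), text.to_string())
}

fn main() {
    let (r, n) = split_reasoning("<think>Let me analyze this...</think>The answer is 42.");
    assert_eq!(r, "Let me analyze this...");
    assert_eq!(n, "The answer is 42.");

    // Passthrough behavior when no delimiters are present.
    let (r, n) = split_reasoning("Plain output.");
    assert!(r.is_empty());
    assert_eq!(n, "Plain output.");
}
```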
## Core Types

### ParserResult

The result of parsing, separating reasoning from normal text:

```rust
pub struct ParserResult {
    pub normal_text: String,    // Text outside reasoning blocks
    pub reasoning_text: String, // Text inside reasoning blocks
}
```
### ReasoningParser Trait

The core interface that all parsers implement:

```rust
pub trait ReasoningParser: Send + Sync {
    /// One-shot parsing of complete text
    fn detect_and_parse_reasoning(&mut self, text: &str) -> Result<ParserResult, ParseError>;

    /// Streaming incremental parsing
    fn parse_reasoning_streaming_incremental(&mut self, text: &str) -> Result<ParserResult, ParseError>;

    /// Reset parser state for reuse
    fn reset(&mut self);

    /// Get the parser variant identifier
    fn model_type(&self) -> &str;

    /// Check whether the parser is currently inside reasoning content
    fn is_in_reasoning(&self) -> bool;
}
```
## Usage Patterns

### One-Shot Parsing

For complete text that doesn't need streaming:

```rust
let factory = ParserFactory::new();
let mut parser = factory.create("qwen3").unwrap();

let input = "<think>Step 1: Consider the problem...</think>The solution is X.";
let result = parser.detect_and_parse_reasoning(input).unwrap();

assert_eq!(result.reasoning_text, "Step 1: Consider the problem...");
assert_eq!(result.normal_text, "The solution is X.");
```
### Streaming Parsing

For processing chunks as they arrive from an LLM:

```rust
let factory = ParserFactory::new();
let parser = factory.get_pooled("deepseek-r1");

let chunks = vec![
    "<think>Let me ",
    "think about this",
    "</think>Here's ",
    "the answer.",
];

let mut p = parser.lock().await;
for chunk in chunks {
    let result = p.parse_reasoning_streaming_incremental(chunk).unwrap();
    if !result.reasoning_text.is_empty() {
        print!("[reasoning] {}", result.reasoning_text);
    }
    if !result.normal_text.is_empty() {
        print!("{}", result.normal_text);
    }
}
```
### Parser Reuse

Reset a parser to process a new request:

```rust
let parser = factory.get_pooled("qwen3");
let mut p = parser.lock().await;

// First request
let result1 = p.detect_and_parse_reasoning("<think>A</think>B").unwrap();

// Reset for the next request
p.reset();

// Second request
let result2 = p.detect_and_parse_reasoning("<think>C</think>D").unwrap();
```
### Pooled vs Fresh Parsers

```rust
// Pooled: shared instance, requires a lock; efficient for high concurrency
let pooled = factory.get_pooled("deepseek-r1"); // Arc<Mutex<Box<dyn ReasoningParser>>>

// Fresh: new instance each time, no lock needed
let fresh = factory.create("deepseek-r1").unwrap(); // Box<dyn ReasoningParser>
```
### Custom Parser Configuration

Create a parser with custom tokens:

```rust
use reasoning_parser::{BaseReasoningParser, ParserConfig, ReasoningParser};

let config = ParserConfig {
    think_start_token: "<reasoning>".to_string(),
    think_end_token: "</reasoning>".to_string(),
    stream_reasoning: true,
    max_buffer_size: 65536,
    initial_in_reasoning: false,
};

let mut parser = BaseReasoningParser::new(config);
let result = parser
    .detect_and_parse_reasoning("<reasoning>thinking</reasoning>answer")
    .unwrap();
```
### Registering Custom Parsers

Add support for new model patterns:

```rust
let factory = ParserFactory::new();

// Register a creator function
factory.registry().register_parser("myformat", || {
    Box::new(BaseReasoningParser::new(ParserConfig {
        think_start_token: "<<THINK>>".to_string(),
        think_end_token: "<</THINK>>".to_string(),
        stream_reasoning: true,
        max_buffer_size: 65536,
        initial_in_reasoning: false,
    }))
});

// Map model patterns to the parser
factory.registry().register_pattern("my-custom-model", "myformat");
factory.registry().register_pattern("my-model-v2", "myformat");

// Now these work
let parser = factory.get_pooled("my-custom-model-7b");
```
## Error Handling

```rust
use reasoning_parser::ParseError;

match parser.detect_and_parse_reasoning(text) {
    Ok(result) => {
        println!("Reasoning: {}", result.reasoning_text);
        println!("Normal: {}", result.normal_text);
    }
    Err(ParseError::BufferOverflow(size)) => {
        eprintln!("Content too large: {} bytes", size);
    }
    Err(ParseError::Utf8Error(e)) => {
        eprintln!("Invalid UTF-8: {}", e);
    }
    Err(ParseError::UnknownModel(model)) => {
        eprintln!("Unknown model: {}", model);
    }
    Err(ParseError::ConfigError(msg)) => {
        eprintln!("Configuration error: {}", msg);
    }
}
```
## Model Pattern Matching

The factory uses case-insensitive substring matching:

```rust
// All of these match the "deepseek-r1" pattern:
factory.get_pooled("deepseek-r1");
factory.get_pooled("DeepSeek-R1-Distill-Qwen-7B");
factory.get_pooled("my-deepseek-r1-finetune");
```

Pattern priority (first match wins):

1. `deepseek-r1` → DeepSeekR1Parser
2. `qwen3-thinking` / `qwen-thinking` → QwenThinkingParser
3. `qwen3` / `qwen` → Qwen3Parser
4. `glm45` / `glm46` / `glm47` → Glm45Parser
5. `kimi` → KimiParser
6. `step3` → Step3Parser
7. `minimax` / `mm-m2` → MiniMaxParser
8. `command-r` / `command-a` / `c4ai-command` / `cohere` → CohereCmdParser
9. `nemotron-nano` / `nano-v3` → Qwen3Parser
10. (fallback) → BaseReasoningParser (passthrough)
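First-match-wins lookup over an ordered pattern list can be sketched as follows. This is illustrative only (one representative pattern per entry; parser names taken from the priority list above), not the crate's registry code:

```rust
/// Sketch of case-insensitive, first-match-wins pattern resolution.
/// Pattern order matters: "qwen3-thinking" must be checked before "qwen".
fn resolve_parser(model: &str) -> &'static str {
    let m = model.to_lowercase();
    let patterns: &[(&str, &str)] = &[
        ("deepseek-r1", "DeepSeekR1Parser"),
        ("qwen3-thinking", "QwenThinkingParser"),
        ("qwen", "Qwen3Parser"),
        ("glm4", "Glm45Parser"),
        ("kimi", "KimiParser"),
        ("step3", "Step3Parser"),
        ("minimax", "MiniMaxParser"),
        ("command", "CohereCmdParser"),
        ("nemotron-nano", "Qwen3Parser"),
    ];
    for (pat, parser) in patterns {
        if m.contains(pat) {
            return parser;
        }
    }
    "BaseReasoningParser" // passthrough fallback
}

fn main() {
    assert_eq!(resolve_parser("DeepSeek-R1-Distill-Qwen-7B"), "DeepSeekR1Parser");
    assert_eq!(resolve_parser("Qwen3-Thinking-32B"), "QwenThinkingParser");
    assert_eq!(resolve_parser("totally-unknown-model"), "BaseReasoningParser");
}
```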
## Thread Safety

The crate is designed for high-concurrency scenarios:

- The `PooledParser` type is `Arc<Mutex<Box<dyn ReasoningParser>>>`
- Uses `tokio::Mutex` for async-friendly locking
- The registry uses `Arc<RwLock<...>>` for safe concurrent access
- Tested with 100 concurrent tasks at 1000+ requests/second
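The pooled-parser shape is the standard shared-state pattern in Rust. A generic sketch using `std::sync::Mutex` and OS threads (the crate itself uses `tokio::sync::Mutex` and async tasks; the shared `Vec` here stands in for a parser instance):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawn `n` workers that each lock one shared resource and record a request.
/// Generic sketch of the Arc<Mutex<_>> pooling pattern, not crate code.
fn run_pool(n: usize) -> usize {
    let pooled: Arc<Mutex<Vec<String>>> = Arc::new(Mutex::new(Vec::new()));
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let shared = Arc::clone(&pooled);
            thread::spawn(move || {
                // Exclusive access to the shared instance, as with get_pooled().
                shared.lock().unwrap().push(format!("request-{i}"));
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let count = pooled.lock().unwrap().len();
    count
}

fn main() {
    // All 100 concurrent requests are serialized through the one lock.
    assert_eq!(run_pool(100), 100);
}
```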
## License
Apache-2.0