#ai-agent #llm #ollama #search

app weavex

Weave together web search and AI reasoning - an autonomous research agent powered by local LLMs

11 stable releases

1.1.0 Sep 27, 2025
1.0.9 Sep 27, 2025
1.0.8 Sep 26, 2025

#1717 in Command line utilities

39 downloads per month

GPL-3.0 license

62KB
1.5K SLoC

Weavex


An autonomous AI research agent that combines Ollama's web search with your local LLMs. Watch as your model reasons through complex queries, autonomously searches the web, and synthesizes intelligent answers with citations.

Features

  • Autonomous Agent - Local LLM decides when to search, fetch, and synthesize
  • Clean Output - Shows final answers with a loading animation; optionally view full reasoning
  • Smart Web Research - Autonomous web search and page fetching with context
  • Fast & Efficient - Built with Rust for maximum performance
  • Production Ready - Comprehensive error handling and logging
  • Highly Configurable - Multiple models, output formats, and options

Installation

From Source

git clone https://github.com/guitaripod/weavex.git
cd weavex
cargo install --path .

From crates.io

cargo install weavex

Prerequisites

You need an Ollama API key to use this tool. Get one at ollama.com/settings/keys.

Quick Start

Set up your API key

export OLLAMA_API_KEY="your_api_key_here"

Or create a .env file:

echo "OLLAMA_API_KEY=your_api_key_here" > .env

Run autonomous research with your local Ollama models:

# Use default model (gpt-oss:20b) - opens result in browser by default
weavex agent "What are the top 3 Rust developments from 2025?"

# Disable browser preview, show in terminal
weavex agent --no-preview "query"

# Specify a different model
weavex agent --model qwen3:14b "research quantum computing trends"

# Show thinking steps and reasoning process for transparency
weavex agent --show-thinking "query"

# Disable model reasoning mode
weavex agent --disable-reasoning "query"

# Custom Ollama server
weavex agent --ollama-url http://192.168.1.100:11434 "query"

# Limit agent iterations
weavex agent --max-iterations 5 "query"

How it works:

  1. Agent uses your local Ollama model for reasoning
  2. Shows a loading animation (🧵 Weaving...) while working
  3. Autonomously decides when to search the web or fetch URLs
  4. Iterates until it has enough information
  5. Opens the final result in your browser with markdown rendering (use --no-preview for terminal output)
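
In code, this loop amounts to "ask the model for the next step, execute it, and repeat until it answers or hits the iteration cap". The sketch below illustrates only that pattern; the AgentAction enum, the Model trait, and the scripted toy model are invented for the example and are not Weavex's actual agent.rs.

use std::cell::Cell;

// What the model can ask the runtime to do next (illustrative names).
enum AgentAction {
    Search(String), // run a web search
    Fetch(String),  // fetch a specific URL
    Answer(String), // final synthesized answer
}

// Anything that can decide the next step from the transcript so far.
trait Model {
    fn next_action(&self, transcript: &str) -> AgentAction;
}

fn run_agent(model: &dyn Model, query: &str, max_iterations: usize) -> Option<String> {
    let mut transcript = format!("user: {query}\n");
    for _ in 0..max_iterations {
        match model.next_action(&transcript) {
            AgentAction::Search(q) => {
                // Real code would call the web search API and append the results.
                transcript.push_str(&format!("search({q}) -> ...results...\n"));
            }
            AgentAction::Fetch(url) => {
                // Real code would fetch and parse the page, then append its text.
                transcript.push_str(&format!("fetch({url}) -> ...page text...\n"));
            }
            AgentAction::Answer(text) => return Some(text),
        }
    }
    None // hit --max-iterations without a final answer
}

// Toy model: searches once, fetches once, then answers.
struct Scripted(Cell<u32>);

impl Model for Scripted {
    fn next_action(&self, _transcript: &str) -> AgentAction {
        let step = self.0.replace(self.0.get() + 1);
        match step {
            0 => AgentAction::Search("rust 2025 developments".into()),
            1 => AgentAction::Fetch("https://example.com".into()),
            _ => AgentAction::Answer("synthesized answer with citations".into()),
        }
    }
}

fn main() {
    let answer = run_agent(&Scripted(Cell::new(0)), "top Rust developments", 50);
    println!("{answer:?}");
}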

Agent Output:

  • 🧵 Weaving...: Loading animation while the agent works
  • 🌐 Browser Preview: Opens the result in your browser by default (use --no-preview for terminal output)

With --show-thinking flag:

  • 🧠 Reasoning: Shows the model's thinking process
  • 🔎 Searching: Web search operations
  • 🌐 Fetching: URL fetch operations
  • 💬 Response: Model's synthesized content

Requirements:

  • Local Ollama server running (ollama serve) - a preflight check is sketched after this list
  • Model downloaded locally (ollama pull gpt-oss:20b)
  • Ollama API key for web search access
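
A quick preflight for the first requirement is to hit the local server's model-list endpoint, /api/tags (a standard Ollama API route). The helper below is a sketch, not part of Weavex, and assumes reqwest with its blocking feature enabled:

// Returns true if an Ollama server answers at base_url.
fn ollama_is_up(base_url: &str) -> bool {
    reqwest::blocking::get(format!("{base_url}/api/tags"))
        .map(|resp| resp.status().is_success())
        .unwrap_or(false)
}

fn main() {
    if !ollama_is_up("http://localhost:11434") {
        eprintln!("Ollama server not reachable - start it with `ollama serve`");
    }
}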

Recommended Models:

  • gpt-oss:20b - Best balance of speed and reasoning (default)
  • qwen3:14b - Good tool-use capabilities
  • qwen3:4b - Fastest, runs on laptops

Direct API Access (Simple Mode)

For quick searches without the agent, you can use the direct API mode:

# Opens results in browser by default
weavex "what is rust programming"

# Show results in terminal
weavex --no-preview "what is rust programming"

Limit Results

weavex --max-results 5 "best practices for async rust"

JSON Output

weavex --json "machine learning trends 2025"

Fetch a Specific URL

weavex fetch https://example.com

Advanced Options

# Pass API key via flag
weavex --api-key YOUR_KEY "query here"

# Verbose logging
weavex --verbose "debugging query"

Options


Global Options

  -k, --api-key <API_KEY>          Ollama API key (can also use OLLAMA_API_KEY env var)
  -m, --max-results <NUM>          Maximum number of search results to return
  -j, --json                       Output results as JSON
      --no-preview                 Disable browser preview (preview is enabled by default)
  -v, --verbose                    Enable verbose logging
      --timeout <SECONDS>          Request timeout in seconds [default: 30]
  -h, --help                       Print help
  -V, --version                    Print version

Commands

  fetch  Fetch and parse a specific URL
  agent  Run an AI agent with web search capabilities
  help   Print this message or the help of the given subcommand(s)

Agent Options

  -m, --model <MODEL>              Local Ollama model to use [default: gpt-oss:20b]
      --ollama-url <URL>           Local Ollama server URL [default: http://localhost:11434]
      --max-iterations <NUM>       Maximum agent iterations [default: 50]
      --show-thinking              Show agent thinking steps and reasoning process
      --disable-reasoning          Disable model reasoning (thinking mode)
      --no-preview                 Disable browser preview (preview is enabled by default)

Environment Variables

  • OLLAMA_API_KEY - Your Ollama API key (required)
  • OLLAMA_BASE_URL - Base URL for the API (default: https://ollama.com/api)
  • OLLAMA_TIMEOUT - Request timeout in seconds (default: 30)
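
Conventionally an explicit flag wins over the environment, and a .env file only supplies variables that are not already set (that is how the dotenvy crate behaves). A minimal sketch of that precedence for the API key, assuming that order; resolve_api_key is an illustrative name, not Weavex's actual config.rs:

use std::env;

// Flag beats environment; dotenvy loads .env without overriding
// variables that are already exported.
fn resolve_api_key(flag: Option<String>) -> Option<String> {
    dotenvy::dotenv().ok(); // ignore a missing .env file
    flag.or_else(|| env::var("OLLAMA_API_KEY").ok())
}

fn main() {
    match resolve_api_key(None) {
        Some(_) => println!("API key found"),
        None => eprintln!("set OLLAMA_API_KEY or pass --api-key"),
    }
}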

Examples

AI Agent Research

# Opens result in browser by default
weavex agent "What are the latest benchmarks for Rust async runtimes?"

While a loading animation plays, the agent will autonomously:

  • Search for relevant benchmark articles
  • Fetch specific benchmark results
  • Compare data from multiple sources
  • Open the final result in your browser with markdown rendering

Thinking Mode

Show the agent's reasoning steps for full transparency:

weavex agent --show-thinking "What are the latest benchmarks for Rust async runtimes?"

Terminal Output Mode

Disable browser preview to see output in terminal:

weavex agent --no-preview "What are the latest benchmarks for Rust async runtimes?"

Traditional Mode (No Reasoning)

Disable reasoning mode for faster responses:

weavex agent --disable-reasoning "What are the latest benchmarks for Rust async runtimes?"

Simple Mode Examples

Research a Topic

weavex "latest rust async runtime benchmarks"

Compare Technologies

weavex --max-results 10 "tokio vs async-std performance"

Extract Page Content

weavex fetch https://blog.rust-lang.org/

Integrate with Other Tools

weavex --json "rust web frameworks" | jq '.results[0].url'
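
The jq filter above implies a top-level results array whose entries carry at least a url field; the full schema is not documented here, so the structs below are an assumption. A serde sketch for consuming weavex --json from a Rust program:

use serde::Deserialize;

// Shape inferred only from the jq filter above; the real output
// likely carries more fields (title, snippet, ...).
#[derive(Deserialize)]
struct SearchResult {
    url: String,
}

#[derive(Deserialize)]
struct Output {
    results: Vec<SearchResult>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // e.g.: weavex --json "rust web frameworks" | this-program
    let out: Output = serde_json::from_reader(std::io::stdin())?;
    if let Some(first) = out.results.first() {
        println!("{}", first.url);
    }
    Ok(())
}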

Development


Build

cargo build

Run Tests

cargo test

Release Build

cargo build --release

The release binary will be optimized with LTO and stripped of debug symbols.
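
A typical [profile.release] section that produces such a binary is shown below; the exact settings in Weavex's Cargo.toml may differ:

[profile.release]
lto = true     # cross-crate link-time optimization
strip = true   # strip debug symbols from the binary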

Project Structure

src/
├── main.rs         - Application entry point and orchestration
├── agent.rs        - AI agent loop with tool execution
├── cli.rs          - CLI argument parsing with clap
├── client.rs       - Ollama web search API client
├── config.rs       - Configuration management
├── error.rs        - Custom error types with thiserror
├── formatter.rs    - Output formatting (human & JSON)
└── ollama_local.rs - Local Ollama chat API client

Error Handling


The tool provides clear, actionable error messages:

  • Missing API key → Instructions to set OLLAMA_API_KEY
  • Network errors → Details about connection failures
  • API errors → Status codes and error messages from Ollama
  • Invalid responses → Clear parsing error descriptions
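
Since error.rs is described above as using thiserror, those categories map naturally onto an enum like the sketch below; variant names and payloads are illustrative, not Weavex's actual types:

use thiserror::Error;

#[derive(Debug, Error)]
enum WeavexError {
    #[error("missing API key: set OLLAMA_API_KEY or pass --api-key")]
    MissingApiKey,
    #[error("network error: {0}")]
    Network(#[from] reqwest::Error),
    #[error("API error {status}: {message}")]
    Api { status: u16, message: String },
    #[error("invalid response: {0}")]
    InvalidResponse(String),
}

fn main() {
    eprintln!("{}", WeavexError::MissingApiKey);
}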

Security

  • API keys are never logged or printed
  • .env files are gitignored by default
  • Uses rustls-tls for secure HTTPS connections (see the client sketch below)
  • No hardcoded credentials or secrets
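
For the rustls-tls point, the usual reqwest pattern is sketched below (it assumes the rustls-tls feature is enabled for reqwest in Cargo.toml); Weavex's exact client setup may differ:

// use_rustls_tls() forces rustls instead of the platform's native TLS.
fn build_client(timeout_secs: u64) -> reqwest::Result<reqwest::Client> {
    reqwest::Client::builder()
        .use_rustls_tls()
        .timeout(std::time::Duration::from_secs(timeout_secs))
        .build()
}

fn main() -> reqwest::Result<()> {
    let _client = build_client(30)?; // matches the documented 30s default
    Ok(())
}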

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Acknowledgments

Built with:

  • clap - CLI argument parsing
  • reqwest - HTTP client
  • tokio - Async runtime
  • serde - Serialization framework

Powered by Ollama's Web Search API.

Dependencies

~14–34MB
~457K SLoC