🌐 SNM BrightData Client
A Rust crate providing MCP-compatible integration with BrightData's web scraping and data extraction services. Built on Actix Web, it offers web search, website scraping, structured data extraction, and screenshot capture.
✨ Features
- 🔍 Web Search: Search across Google, Bing, Yandex, and DuckDuckGo
- 🌐 Website Scraping: Extract content in markdown, raw HTML, or structured formats
- 📊 Data Extraction: Intelligent data extraction from any webpage
- 📸 Screenshots: Capture website screenshots using BrightData Browser
- 🤖 MCP Compatible: Full Model Context Protocol support for AI integrations
- ⚡ Multiple Interfaces: Library, CLI, and HTTP server
- 🔒 Authentication: Secure token-based authentication
- 📈 Rate Limiting: Built-in rate limiting and error handling
- 🚀 High Performance: Built with Actix Web for production workloads
🚀 Quick Start
Installation
Add to your Cargo.toml:
```toml
[dependencies]
snm-brightdata-client = "0.4.0"
```
Environment Setup
```bash
# BrightData Configuration
export BRIGHTDATA_API_TOKEN="your_api_token"
export BRIGHTDATA_BASE_URL="https://api.brightdata.com"
export WEB_UNLOCKER_ZONE="your_zone_name"
export BROWSER_ZONE="your_browser_zone"

# Proxy Credentials (optional)
export BRIGHTDATA_PROXY_USERNAME="your_username"
export BRIGHTDATA_PROXY_PASSWORD="your_password"

# Server Configuration
export MCP_AUTH_TOKEN="your_secure_token"
export PORT="8080"
```
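Before starting the server or client, it can help to fail fast if required variables are missing. A minimal sketch (not part of the crate's API; which variables are strictly required is an assumption based on the list above):

```rust
/// Pre-flight check: error out early if a required BrightData variable is
/// not set. The set of required names is assumed for illustration.
fn check_required_env() -> Result<(), String> {
    const REQUIRED: [&str; 3] = ["BRIGHTDATA_API_TOKEN", "WEB_UNLOCKER_ZONE", "BROWSER_ZONE"];
    for name in REQUIRED {
        if std::env::var(name).is_err() {
            return Err(format!("missing required environment variable: {name}"));
        }
    }
    Ok(())
}
```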
📖 Usage
As a Library
```rust
use snm_brightdata_client::{BrightDataClient, BrightDataConfig};
use snm_brightdata_client::tool::{ToolResolver, Tool};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize client
    let config = BrightDataConfig::from_env()?;
    let client = BrightDataClient::new(config);

    // Use tools directly
    let resolver = ToolResolver::default();
    let search_tool = resolver.resolve("search_web").unwrap();
    let result = search_tool.execute(json!({
        "query": "Rust programming language",
        "engine": "google"
    })).await?;

    println!("Search results: {:#?}", result);
    Ok(())
}
```
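The same resolver pattern works for the other tools. For example, a sketch that scrapes a page as markdown (parameter names taken from the tool reference below; error handling is illustrative):

```rust
use snm_brightdata_client::tool::{ToolResolver, Tool};
use serde_json::json;

async fn scrape_markdown(url: &str) -> Result<(), Box<dyn std::error::Error>> {
    // Look up the scraping tool by name, mirroring the search example above.
    let resolver = ToolResolver::default();
    let scrape_tool = resolver.resolve("scrape_website").unwrap();
    let result = scrape_tool
        .execute(json!({ "url": url, "format": "markdown" }))
        .await?;
    println!("Scraped content: {:#?}", result);
    Ok(())
}
```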
CLI Usage
```bash
# Search the web
snm_cli search "Bitcoin price today" --engine google

# Scrape a website
snm_cli scrape https://example.com --format markdown

# Extract data
snm_cli extract https://example.com --format json

# Take a screenshot
snm_cli screenshot https://example.com --width 1920 --height 1080
```
HTTP Server
```bash
# Start the server
cargo run --bin snm_server

# Health check
curl http://localhost:8080/health

# List available tools
curl http://localhost:8080/tools

# Use tools via API
curl -X POST http://localhost:8080/invoke \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "tool": "search_web",
    "parameters": {
      "query": "Rust web scraping",
      "engine": "google"
    }
  }'
```
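The /invoke endpoint can also be called from Rust with any HTTP client. A minimal sketch using reqwest with its json feature (reqwest is not a dependency of this crate; the payload simply mirrors the curl call above):

```rust
use serde_json::json;

/// POST the same search request as the curl example above and return the
/// JSON body. `base_url` would be e.g. "http://localhost:8080".
async fn invoke_search(base_url: &str, token: &str) -> Result<serde_json::Value, reqwest::Error> {
    reqwest::Client::new()
        .post(format!("{base_url}/invoke"))
        .bearer_auth(token)
        .json(&json!({
            "tool": "search_web",
            "parameters": { "query": "Rust web scraping", "engine": "google" }
        }))
        .send()
        .await?
        .json::<serde_json::Value>()
        .await
}
```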
🛠️ Available Tools
🔍 Search Web (search_web)
Search across multiple search engines with BrightData's unblocking capabilities.
```json
{
  "tool": "search_web",
  "parameters": {
    "query": "your search query",
    "engine": "google" // google, bing, yandex, duckduckgo
  }
}
```
🌐 Scrape Website (scrape_website)
Extract content from any website, bypassing anti-bot protections.
```json
{
  "tool": "scrape_website",
  "parameters": {
    "url": "https://example.com",
    "format": "markdown" // raw, markdown
  }
}
```
📊 Extract Data (extract_data)
Intelligent data extraction from webpages.
```json
{
  "tool": "extract_data",
  "parameters": {
    "url": "https://example.com"
  }
}
```
📸 Take Screenshot (take_screenshot)
Capture high-quality screenshots of websites.
```json
{
  "tool": "take_screenshot",
  "parameters": {
    "url": "https://example.com"
  }
}
```
🤖 MCP Integration
This crate is fully compatible with the Model Context Protocol (MCP), making it easy to integrate with AI systems like Claude.
MCP Server Configuration
```json
{
  "type": "url",
  "url": "https://your-server.com/sse",
  "name": "brightdata-mcp",
  "authorization_token": "your_token",
  "tool_configuration": {
    "enabled": true,
    "allowed_tools": ["search_web", "scrape_website", "extract_data", "take_screenshot"]
  }
}
```
Example with Claude
```bash
curl https://api.anthropic.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: mcp-client-2025-04-04" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 2000,
    "messages": [
      {
        "role": "user",
        "content": "Search for the latest news about Rust programming language"
      }
    ],
    "mcp_servers": [
      {
        "type": "url",
        "url": "https://your-server.com/sse",
        "name": "brightdata-mcp",
        "authorization_token": "your_token"
      }
    ]
  }'
```
🏗️ API Reference
HTTP Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Health check |
| /tools | GET | List available tools |
| /invoke | POST | Direct tool invocation |
| /sse | POST | Server-Sent Events streaming |
| /mcp | POST | MCP JSON-RPC protocol (see the sketch below) |
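The /mcp endpoint speaks JSON-RPC as defined by MCP. As a sketch, here is a request for the tool catalogue using the standard tools/list method (again using reqwest for illustration; the exact response shape depends on the server):

```rust
use serde_json::json;

/// Ask the /mcp endpoint for its tool catalogue via a standard MCP
/// JSON-RPC request (method "tools/list").
async fn list_tools_via_mcp(base_url: &str, token: &str) -> Result<serde_json::Value, reqwest::Error> {
    reqwest::Client::new()
        .post(format!("{base_url}/mcp"))
        .bearer_auth(token)
        .json(&json!({ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }))
        .send()
        .await?
        .json::<serde_json::Value>()
        .await
}
```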
Response Format
All tools return MCP-compatible responses:
```json
{
  "content": [
    {
      "type": "text",
      "text": "Response content here"
    }
  ],
  "is_error": false,
  "raw_value": {
    // Original response data
  }
}
```
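For typed access from Rust, this shape maps naturally onto a pair of serde structs. A sketch (field names taken from the JSON above; these are not types exported by the crate):

```rust
use serde::Deserialize;

/// Shape of an MCP-compatible tool response as shown above.
#[derive(Debug, Deserialize)]
struct ToolResponse {
    content: Vec<ContentBlock>,
    is_error: bool,
    /// Original, tool-specific payload; left untyped here.
    raw_value: serde_json::Value,
}

#[derive(Debug, Deserialize)]
struct ContentBlock {
    #[serde(rename = "type")]
    kind: String,
    /// Present for "text" content blocks.
    text: Option<String>,
}
```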
⚙️ Configuration
BrightData Setup
1. Sign up for a BrightData account
2. Create zones for the Web Unlocker and Browser products
3. Get your API credentials from the dashboard
4. Set the environment variables as shown above
Zone Configuration
- Web Unlocker Zone: For web scraping and search
- Browser Zone: For screenshots and JavaScript rendering
🔧 Development
Building
```bash
# Build the library
cargo build

# Build with all features
cargo build --all-features

# Run tests
cargo test

# Run with debug logging
RUST_LOG=debug cargo run --bin snm_server
```
Contributing
1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass
5. Submit a pull request
📊 Performance
- Concurrent Requests: Supports high-concurrency workloads
- Rate Limiting: Built-in 10 requests/minute per tool (configurable)
- Timeout Handling: Configurable timeouts for different operations
- Error Recovery: Automatic retry mechanisms with backoff (see the sketch after this list)
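Retries are handled inside the crate; if you want the same behaviour around your own calls, here is a generic sketch of retry with exponential backoff (not the crate's internal implementation; assumes a Tokio runtime, as in the library example):

```rust
use std::time::Duration;

/// Retry an async operation up to `max_attempts` times, doubling the delay
/// between attempts. Generic sketch, independent of this crate's internals.
async fn with_backoff<T, E, F, Fut>(mut op: F, max_attempts: u32) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, E>>,
{
    let mut delay = Duration::from_millis(500);
    let mut attempt = 1;
    loop {
        match op().await {
            Ok(value) => return Ok(value),
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                // Wait before retrying, doubling the delay each time.
                tokio::time::sleep(delay).await;
                delay *= 2;
                attempt += 1;
            }
        }
    }
}
```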
🛡️ Security
- Token Authentication: Secure token-based API access (sketched below)
- Rate Limiting: Prevents abuse
- Input Validation: Comprehensive parameter validation
- CORS Support: Configurable cross-origin requests
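Requests to the server carry a bearer token (see the /invoke example above), presumably checked against the MCP_AUTH_TOKEN configured earlier. As an illustration of that kind of check in Actix Web terms (not the crate's actual middleware):

```rust
use actix_web::HttpRequest;

/// Returns true when the request carries `Authorization: Bearer <expected>`.
/// Illustrative only; the crate ships its own authentication handling.
fn is_authorized(req: &HttpRequest, expected: &str) -> bool {
    req.headers()
        .get("Authorization")
        .and_then(|value| value.to_str().ok())
        .map(|value| value == format!("Bearer {expected}"))
        .unwrap_or(false)
}
```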
📝 Examples
Check out the examples/ directory for:
- Basic usage examples
- Integration patterns
- Advanced configurations
- Error handling strategies
🤝 Integration Examples
With Anthropic Claude
Use as an MCP server to enhance Claude with web scraping capabilities.
With Custom Applications
Integrate into your Rust applications for:
- E-commerce price monitoring (see the sketch after this list)
- Content aggregation
- Market research
- Competitive analysis
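As a rough sketch of the price-monitoring case (tool and parameter names as in the examples above; the schedule, URL handling, and alerting logic are placeholders):

```rust
use std::time::Duration;
use snm_brightdata_client::tool::{ToolResolver, Tool};
use serde_json::json;

/// Re-extract a product page once per hour and hand each snapshot to your
/// own comparison / alerting logic. Illustrative sketch only.
async fn monitor_price(url: &str) -> Result<(), Box<dyn std::error::Error>> {
    let resolver = ToolResolver::default();
    let mut ticker = tokio::time::interval(Duration::from_secs(3600));
    loop {
        ticker.tick().await;
        let extract = resolver.resolve("extract_data").unwrap();
        let snapshot = extract.execute(json!({ "url": url })).await?;
        println!("latest snapshot for {url}: {:#?}", snapshot);
    }
}
```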
📋 Requirements
- Rust: 1.70 or later
- BrightData Account: With API access
- Network Access: HTTPS outbound connections
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- BrightData for providing robust web scraping infrastructure
- Actix Web for high-performance HTTP server framework
- Anthropic for MCP protocol specification
📞 Support
- 📧 Email: inxmaurya@gmail.com
- 🐛 Issues: GitHub Issues
- 📖 Documentation: API Docs
🚀 Roadmap
- Additional search engines
- Enhanced data extraction templates
- WebSocket support for real-time scraping
- GraphQL API interface
- Kubernetes deployment examples
- Advanced proxy rotation
- Machine learning integration for content classification
Made with ❤️ by SNM Maurya