10 releases
Version | Released |
---|---|
0.3.3 | Mar 23, 2025 |
0.3.2 | Sep 18, 2024 |
0.3.0 | Apr 25, 2024 |
0.2.2 | Sep 19, 2022 |
0.1.2 | Sep 17, 2022 |
rexturl
A versatile command-line tool for parsing and manipulating URLs.
Features
- Extract specific URL components (scheme, username, host, port, path, query, fragment)
- Extract domain and subdomain information
- Custom output formatting
- JSON output support
- Sorting and deduplication of results
- Process multiple URLs from command line or stdin
- Automatic handling of URLs without schemes (defaults to https)
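As a rough illustration of the component extraction and the https default described above, here is a minimal Python sketch built on the standard library's `urllib.parse`. It is not rexturl's implementation; the function name `split_url` and the rule of checking for `://` are assumptions made for this example.

```python
from urllib.parse import urlsplit


def split_url(raw: str, default_scheme: str = "https") -> dict:
    """Split a URL into components, assuming https when no scheme is given.

    Hypothetical helper: the "://" check is an assumed heuristic, not
    necessarily how rexturl detects a missing scheme.
    """
    if "://" not in raw:
        raw = f"{default_scheme}://{raw}"
    parts = urlsplit(raw)
    return {
        "scheme": parts.scheme,
        "username": parts.username or "",
        "host": parts.hostname or "",
        "port": parts.port,  # int, or None when the URL has no explicit port
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }
```

Calling `split_url("example.com/api")` yields the scheme `https`, which mirrors the default the feature list mentions.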
Installation
cargo install rexturl
or clone the repository and build from source:
git clone https://github.com/vschwaberow/rexturl.git
cd rexturl
cargo build --release
Usage
rexturl [OPTIONS] [URLS...]
If no URLs are provided, rexturl will read from stdin.
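The fallback from arguments to stdin can be pictured with a small Python sketch. The helper `collect_urls` is hypothetical, and skipping blank lines is an assumption; rexturl's own handling of empty input lines may differ.

```python
from typing import List, TextIO


def collect_urls(args: List[str], stream: TextIO) -> List[str]:
    """Return URLs from the argument list, or from the stream if none were given.

    Assumption: blank lines in the stream are ignored and surrounding
    whitespace is stripped.
    """
    if args:
        return list(args)
    return [line.strip() for line in stream if line.strip()]
```

In a real script, `collect_urls(sys.argv[1:], sys.stdin)` would give the same "arguments first, stdin otherwise" behavior.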
Options
--urls <URLS>
Input URLs to process
--scheme
Extract and display the URL scheme
--username
Extract and display the username from the URL
--host
Extract and display the hostname
--port
Extract and display the port number
--path
Extract and display the URL path
--query
Extract and display the query string
--fragment
Extract and display the URL fragment
--sort
Sort the output
--unique
Remove duplicate entries from the output
--json
Output results in JSON format
--all
Display all URL components
--custom
Enable custom output mode
--domain
Extract the domain
--format <FORMAT>
Custom output format (default: "{scheme}://{host}{path}")
-h, --help
Print help information
-V, --version
Print version information
Examples
- Extract all components from a single URL (the URL is quoted so the shell does not interpret ? or #):
rexturl --all 'https://user:pass@example.com:8080/path?query=value#fragment'
- Extract host and port from multiple URLs:
rexturl --host --port https://example.com https://api.example.com:8443
- Process URLs from a file, extracting paths and sorting results:
cat urls.txt | rexturl --path --sort
- Use custom output format:
rexturl --custom --format "{scheme}://{host}:{port}{path}" https://example.com:8080/api
- Output results in JSON format:
rexturl --json --all https://example.com https://api.example.com
- Sort and deduplicate results:
echo -e "https://example.com\nhttps://example.com\nhttps://api.example.com" | rexturl --host --sort --unique
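To make the effect of combining --sort and --unique concrete, here is a minimal Python sketch of the post-processing step. The function `sort_unique` is hypothetical, and the order in which deduplication and sorting are applied is an assumption; the input values are illustrative strings, not actual rexturl output.

```python
from typing import Iterable, List


def sort_unique(values: Iterable[str], sort: bool = False, unique: bool = False) -> List[str]:
    """Optionally deduplicate (keeping first occurrences) and sort the results."""
    result = list(values)
    if unique:
        # dict preserves insertion order, so the first occurrence of each value wins
        result = list(dict.fromkeys(result))
    if sort:
        result = sorted(result)
    return result
```

With both options enabled, three input lines containing one duplicate collapse to two sorted entries.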
Domain and Subdomain Extraction
rexturl includes special handling for domains and subdomains:
- The --domain flag extracts the domain name from the URL, with proper handling of multi-part TLDs
- When using --host alone (without other component flags), it extracts the subdomain by default
- Multi-part TLDs like co.uk, org.uk, com.au, etc. are automatically detected
Examples:
# Extract domain from a URL with multi-part TLD
echo "https://blog.example.co.uk/posts" | rexturl --domain
# Output: example.co.uk
# Extract subdomain from a URL
echo "https://blog.example.co.uk/posts" | rexturl --host
# Output: blog
# Extract subdomain and domain separately using custom format
echo "https://blog.example.co.uk/posts" | rexturl --custom --format "Subdomain: {subdomain}, Domain: {domain}"
# Output: Subdomain: blog, Domain: example.co.uk
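The domain and subdomain split shown above can be approximated with a short Python sketch. This is not rexturl's actual algorithm: the set `MULTI_PART_TLDS` contains only the three suffixes named in this README, and the helper `split_host` is hypothetical. A production tool would more likely consult a complete suffix list.

```python
from typing import Tuple

# Illustrative subset only; the real list of multi-part TLDs is much longer.
MULTI_PART_TLDS = {"co.uk", "org.uk", "com.au"}


def split_host(host: str) -> Tuple[str, str]:
    """Return (subdomain, domain) for a hostname, honoring multi-part TLDs."""
    labels = host.lower().split(".")
    # The public suffix spans two labels for entries like "co.uk", else one.
    suffix_len = 2 if ".".join(labels[-2:]) in MULTI_PART_TLDS else 1
    domain_len = suffix_len + 1  # registrable name plus its suffix
    if len(labels) <= domain_len:
        return "", ".".join(labels)
    domain = ".".join(labels[-domain_len:])
    subdomain = ".".join(labels[:-domain_len])
    return subdomain, domain
```

For `blog.example.co.uk` this returns `("blog", "example.co.uk")`, matching the outputs listed in the examples above.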
Custom Output Format
When using --custom and --format, you can use the following placeholders:
- {scheme} - URL scheme (http, https, etc.)
- {username} - Username portion of the URL
- {host} - Full hostname
- {hostname} - Alias for host
- {subdomain} - Subdomain portion (e.g., "www" in www.example.com)
- {domain} - Domain name (e.g., "example.com")
- {port} - Port number
- {path} - URL path
- {query} - Query string (without the leading ?)
- {fragment} - Fragment identifier (without the leading #)
Examples:
rexturl --custom --format "Host: {host}, Path: {path}" https://example.com/api
rexturl --custom --format "Subdomain: {subdomain}, Domain: {domain}" https://blog.example.co.uk/posts
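Placeholder substitution of this kind can be sketched in a few lines of Python. The function `render` is hypothetical, and leaving unknown placeholders untouched is an assumption; rexturl may treat unrecognized names differently.

```python
import re
from typing import Mapping


def render(template: str, fields: Mapping[str, object]) -> str:
    """Replace {name} placeholders with values from fields.

    Assumption: placeholders with no matching field are left as written.
    """
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        return str(fields[name]) if name in fields else match.group(0)

    return re.sub(r"\{(\w+)\}", substitute, template)
```

For example, `render("Host: {host}, Path: {path}", {"host": "example.com", "path": "/api"})` produces `Host: example.com, Path: /api`.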
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.