#http-request #http-file #requests #parsed #pest #header #data

bin+lib http_file_parser

A Rust-based parser for .http files, parsing HTTP request data using Pest and making parsed requests

2 releases

0.1.1 Nov 13, 2024
0.1.0 Nov 13, 2024

#1014 in Network programming

Download history 202/week @ 2024-11-09 44/week @ 2024-11-16 6/week @ 2024-11-23 18/week @ 2024-12-07

80 downloads per month

MIT license

20KB
239 lines

HTTP File Parser

This project is a Rust-based utility for parsing .http files, which are commonly used to store HTTP requests in a structured format. By parsing these files, this tool allows automated processing of HTTP requests, useful in scenarios such as API testing.

Example of an .http File

Here’s an example .http file that can be parsed by this tool:

### POST submit request
POST https://eo7nilf42qb5b5e.m.pipedream.net HTTP/1.1
Authorization: Bearer token123
Content-Type: application/json

{ "name": "Alice", "age": 30 }

### GET profile request
GET https://eo7nilf42qb5b5e.m.pipedream.net HTTP/1.1
Accept: application/json

### DELETE account request
DELETE https://eo7nilf42qb5b5e.m.pipedream.net HTTP/1.1
Authorization: Bearer token123

### PUT profile request
PUT https://eo7nilf42qb5b5e.m.pipedream.net HTTP/1.1
Content-Type: application/json

{ "name": "Alice", "age": 31 }

Technical Description

Parsing Process

Using the Pest library for Rust, this tool analyzes the structure of .http files, extracting key information such as:

  • HTTP Method (e.g., GET, POST)
  • Request URL (e.g., https://example.com/api/resource)
  • Headers (e.g., Content-Type: application/json)
  • Request Body

The .http files are parsed line-by-line, following a grammar that identifies sections such as the request line, headers, and body. Pest provides a custom grammar defined in grammar.pest, which defines these parsing rules in a concise format.

Grammar

http_file = { SOI ~ (delimiter | request)* ~ EOI}

request = {
    request_line ~
    headers? ~
    NEWLINE ~
    body?
}

delimiter = { "###" ~ whitespace* ~ title? ~ NEWLINE+ }
title = { (!NEWLINE ~ ANY)+ }

request_line = _{ method ~ " "+ ~ uri ~ " "+ ~ "HTTP/" ~ version ~ NEWLINE }
uri = { (!whitespace ~ ANY)+ }
method = { ("GET" | "DELETE" | "POST" | "PUT") }
version = { (ASCII_DIGIT | ".")+ }
whitespace = _{ " " | "\t" }

headers = { header+ }
header = { header_name ~ ":" ~ whitespace ~ header_value ~ NEWLINE }
header_name = { (!(NEWLINE | ":") ~ ANY)+ }
header_value = { (!NEWLINE ~ ANY)+ }

body = { (!delimiter ~ ANY)+ }

Let's break down the grammar rule by rule:

  • Overall Structure

The .http file is represented as an http_file, which can contain multiple request blocks, separated by optional delimiter sections. Each request block contains essential components like request_line, headers, and an optional body.

http_file = { SOI ~ (delimiter | request)* ~ EOI }
  • SOI: Start of input.

  • EOI: End of input.

  • The file can contain multiple request and delimiter sections.

  • Request

Each HTTP request is structured as follows:

request = {
    request_line ~
    headers? ~
    NEWLINE ~
    body?
}
  • request_line: Defines the HTTP method, URI, and version.

  • headers: Optional section containing one or more headers.

  • NEWLINE: Separates headers from the body.

  • body: Optional content, can include any data except the delimiter.

  • Delimiter

The delimiter rule separates different requests within the file and may include an optional title.

delimiter = { "###" ~ whitespace* ~ title? ~ NEWLINE+ }
title = { (!NEWLINE ~ ANY)+ }
  • "###": Indicates a delimiter.

  • whitespace: Allows for spaces or tabs after the delimiter.

  • title: Optional title text for the delimiter section, placed on the same line.

  • Request Line

Defines the core structure of an HTTP request.

request_line = _{ method ~ " "+ ~ uri ~ " "+ ~ "HTTP/" ~ version ~ NEWLINE }
  • method: The HTTP method (e.g., GET, POST).

  • uri: The URL endpoint for the request.

  • version: The HTTP version (e.g., 1.1).

  • Method

The allowed HTTP methods:

method = { ("GET" | "DELETE" | "POST" | "PUT") }
  • URI and Version

Defines the URI and HTTP version formats:

uri = { (!whitespace ~ ANY)+ }
version = { (ASCII_DIGIT | ".")+ }
  • uri: Any non-whitespace characters represent the endpoint.

  • version: Consists of digits and periods, like 1.1.

  • Headers

Optional section containing one or more headers.

headers = { header+ }
header = { header_name ~ ":" ~ whitespace ~ header_value ~ NEWLINE }
header_name = { (!(NEWLINE | ":") ~ ANY)+ }
header_value = { (!NEWLINE ~ ANY)+ }
  • header_name: The name of the header (e.g., Content-Type).

  • header_value: The value of the header (e.g., application/json).

  • Body

Optional content for the request:

body = { (!delimiter ~ ANY)+ }

Parsing Outcome and Usage

Once parsed, the extracted data is structured into an HttpRequest object. This object is then used to perform HTTP requests via the reqwest library and printing the response.

Dependencies

~7–18MB
~249K SLoC