#product #rating #skin #format #parser #ingredient #extract

bin+lib cosmetics_parser

A Rust-based parser to extract product details from cosmetics catalogs in markdown format and output them in structured formats like JSON or Rust structs

1 unstable release

0.1.0 Nov 10, 2024

#869 in Parser implementations


67 downloads per month

MIT license

16KB
186 lines

cosmetics_parser

Overview

cosmetics_parser is a Rust-based parser designed to extract information from a cosmetics catalog written in a human-readable markdown format. The input consists of product descriptions that include details such as product name, skin type, ingredients, ratings, price, user reviews, and availability.

The parser reads these product descriptions and converts them into a structured data format, which can be used for further processing, analysis, or presentation in an online cosmetics store application.

Parsing Process

The parser reads a markdown-like format with structured information for each product. Each product contains the following fields:

  1. Product Name: The name of the product.
  2. Skin Type: The type of skin the product is designed for (e.g., dry, oily).
  3. Ingredients: The ingredients used in the product.
  4. Rating: The overall rating of the product.
  5. Price: The price of the product.
  6. User Ratings: A list of user ratings.
  7. Recommendations: Instructions or recommendations for using the product.
  8. Reviews: User-submitted feedback.
  9. Availability: A boolean value indicating whether the product is in stock.
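
The nine fields above map naturally onto a plain Rust struct. The following is a minimal sketch, not the crate's actual definition: the field names are taken from the keys in the example output, the numeric types are assumed to be `f64`, and the `in_stock` helper is hypothetical.

```rust
/// One catalog entry. Field names mirror the JSON keys in the example output;
/// the concrete types are an assumption.
#[derive(Debug, Clone, PartialEq)]
pub struct Product {
    pub product_name: String,
    pub skin_type: String,
    pub ingredients: String,
    pub rating: f64,
    pub price: f64,
    pub user_ratings: Vec<f64>,
    pub recommendations: String,
    pub reviews: Vec<String>,
    pub availability: bool,
}

/// Container holding every parsed product.
#[derive(Debug, Default)]
pub struct CosmeticsCatalog {
    pub products: Vec<Product>,
}

impl CosmeticsCatalog {
    /// Hypothetical helper: the products whose availability flag is `true`.
    pub fn in_stock(&self) -> Vec<&Product> {
        self.products.iter().filter(|p| p.availability).collect()
    }
}
```

Keeping `ingredients` as a single string matches the example output, where the comma-separated list is not split.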

Grammar

The parser uses the Pest library to process the input format. The grammar rules defined in grammar.pest describe a product description and the values its fields can hold: numbers, strings, and lists (e.g., user ratings).
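
To illustrate what the number and list rules have to accept, here is a hand-written sketch in plain Rust using only the standard library. It is not the Pest grammar and does not use the pest API; the function names are hypothetical, and the handling of the currency suffix is an assumption based on the example input and output.

```rust
/// Parse a numeric field such as `4.5`, or a price such as `299.99 UAH`.
/// Only the leading number is kept; the currency code is ignored (assumption).
pub fn parse_number(value: &str) -> Option<f64> {
    value.split_whitespace().next()?.parse::<f64>().ok()
}

/// Parse a bracketed list such as `[5, 4, 5, 3, 4]` into floats.
/// Returns `None` if the brackets are missing or any item is not a number.
pub fn parse_number_list(value: &str) -> Option<Vec<f64>> {
    let inner = value.trim().strip_prefix('[')?.strip_suffix(']')?;
    if inner.trim().is_empty() {
        return Some(Vec::new());
    }
    inner
        .split(',')
        .map(|item| item.trim().parse::<f64>().ok())
        .collect()
}

/// Parse the availability flag; only `true` and `false` are accepted.
pub fn parse_bool(value: &str) -> Option<bool> {
    value.trim().parse::<bool>().ok()
}
```

Returning `Option` keeps malformed fields from silently becoming default values; a real grammar would report the failing position instead.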

How It Works and Where to Use It

The input is processed line by line, and the parser extracts relevant data from each field. After parsing, a CosmeticsCatalog object is created to hold the parsed products. This catalog can then be used for further processing or display in a frontend application.
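
The line-by-line pass can be sketched as follows. This is a simplified stand-in written with the standard library only, assuming every field starts with a `*Field*:` marker as in the example input; the function names `split_field` and `collect_fields` are hypothetical and are not part of the crate.

```rust
/// Split a line of the form `*Field*: value` into `(field, value)`.
/// Returns `None` for lines that do not start a field, such as numbered reviews.
pub fn split_field(line: &str) -> Option<(&str, &str)> {
    let rest = line.trim().strip_prefix('*')?;
    let (name, value) = rest.split_once("*:")?;
    Some((name.trim(), value.trim()))
}

/// Walk the input line by line. A field line opens a new entry; any other
/// non-empty line is attached to the most recent field (e.g. review lines).
pub fn collect_fields(input: &str) -> Vec<(String, Vec<String>)> {
    let mut fields: Vec<(String, Vec<String>)> = Vec::new();
    for line in input.lines().map(str::trim).filter(|l| !l.is_empty()) {
        match split_field(line) {
            Some((name, value)) => {
                let values = if value.is_empty() {
                    Vec::new()
                } else {
                    vec![value.to_string()]
                };
                fields.push((name.to_string(), values));
            }
            None => {
                // Continuation line: belongs to the previous field, if any.
                if let Some((_, values)) = fields.last_mut() {
                    values.push(line.to_string());
                }
            }
        }
    }
    fields
}
```

The resulting name and value pairs would then be converted into typed fields and pushed into the catalog.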

Example Input

*Product 1*: Face Cream "Moisturizing"
*Skin Type*: Dry Skin
*Ingredients*: Water, Glycerin, Hyaluronic Acid, Jojoba Oil
*Rating*: 4.5
*Price*: 299.99 UAH
*User Ratings*: [5, 4, 5, 3, 4]
*Recommendations*: Use in the morning and evening after cleansing the skin. Suitable for sensitive skin.
*Reviews*:
1.	"This cream perfectly moisturizes my skin. It absorbs easily!"
*Availability*: true

Example Output

{
  "product_name": "Face Cream \"Moisturizing\"",
  "skin_type": "Dry Skin",
  "ingredients": "Water, Glycerin, Hyaluronic Acid, Jojoba Oil",
  "rating": 4.5,
  "price": 299.99,
  "user_ratings": [
    5.0,
    4.0,
    5.0,
    3.0,
    4.0
  ],
  "recommendations": "Use in the morning and evening after cleansing the skin. Suitable for sensitive skin.",
  "reviews": [
    "1. \"This cream perfectly moisturizes my skin. It absorbs easily!\""
  ],
  "availability": true
}

Dependencies

~2.5–3.5MB
~73K SLoC