
app parse2csv

Parse a log file with a regex and write the result to stdout as CSV

1 unstable release

new 0.2.0 Oct 30, 2024


MIT/Apache

39KB
259 lines

Convert log data or fixed-format data to CSV

  • Reads log data from stdin, parses it line by line with a regex, and writes the named capture groups to stdout in CSV format
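
The flow described above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual implementation: the helper name `lines_to_csv` is hypothetical, Python's `re` module needs the `(?P<name>...)` spelling for named groups, and skipping lines that do not match is an assumption the README does not confirm. The sample line is a shortened version of one from the access log in the usage section.

```python
import csv
import io
import re


def lines_to_csv(lines, pattern):
    """Hypothetical helper: turn lines matching `pattern` into CSV text."""
    regex = re.compile(pattern)
    # Named groups, ordered by their position in the pattern, form the header.
    header = sorted(regex.groupindex, key=regex.groupindex.get)
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(header)
    for line in lines:
        match = regex.search(line)
        if match is None:
            continue  # assumption: lines that do not match are skipped
        writer.writerow([match.group(name) for name in header])
    return out.getvalue()


sample = [
    '135.125.246.110 - - [14/Jun/2024:03:13:04 +0800] "GET /.env HTTP/1.1" 404 555',
    "not a log line",
]
result = lines_to_csv(
    sample,
    r"^(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) .* (?P<code>\d{3}) \d+$",
)
print(result, end="")
```

With the two sample lines, this prints the header `"ip","code"` followed by the single row `"135.125.246.110","404"`. Every field is quoted, mirroring the output format shown in the usage section.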

Usage

stdin | ./parse2csv --regex "^(?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
  • Example input (cat access.log):
94.102.51.144 - - [14/Jun/2024:02:34:17 +0800] "POST /wp-json/wpgmzA/v1/markers?_method=get&random=/wpgmza/v1/markers/10 HTTP/1.1" 301 169 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3" "-"
91.204.46.40 - - [14/Jun/2024:03:00:42 +0800] "GET /wp-json/wp/v2/users HTTP/1.1" 301 169 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.83 Safari/537.1" "-"
135.125.246.110 - - [14/Jun/2024:03:13:04 +0800] "GET /.env HTTP/1.1" 404 555 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36" "-"
135.125.246.110 - - [14/Jun/2024:03:13:04 +0800] "POST / HTTP/1.1" 405 559 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36" "-"
185.254.196.173 - - [14/Jun/2024:03:19:16 +0800] "GET /.env HTTP/1.1" 404 555 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36" "-"
  • Run:
cat access.log | ./parse2csv --regex "^(?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - \[(?<date>\S+ \+\d{4})\] \"(?<method>GET|HEAD|POST|PUT|DELETE|CONNECT|OPTIONS|TRACE|PATCH) (?<path>\S+) (?<version>\S+)\" (?<code>\d{3}) (?<rt>\S+) \"(?<referer>\S+)\" \"(?<ua>[^\"]+)\" \"(\S+)\""
  • The stdout should be:
"ip","date","method","path","version","code","rt","referer","ua"
"94.102.51.144","14/Jun/2024:02:34:17 +0800","POST","/wp-json/wpgmzA/v1/markers?_method=get&random=/wpgmza/v1/markers/10","HTTP/1.1","301","169","-","Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
"91.204.46.40","14/Jun/2024:03:00:42 +0800","GET","/wp-json/wp/v2/users","HTTP/1.1","301","169","-","Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.83 Safari/537.1"
"135.125.246.110","14/Jun/2024:03:13:04 +0800","GET","/.env","HTTP/1.1","404","555","-","Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36"
"135.125.246.110","14/Jun/2024:03:13:04 +0800","POST","/","HTTP/1.1","405","559","-","Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36"
"185.254.196.173","14/Jun/2024:03:19:16 +0800","GET","/.env","HTTP/1.1","404","555","-","Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36"

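The example output has nine columns although the pattern contains ten groups: only the named groups appear in the header, and the trailing unnamed `(\S+)` group does not. The sketch below checks that mapping against one of the sample log lines. It uses Python's `re` module rather than the tool itself, so the named groups are written as `(?P<name>...)`; the variable names are chosen for illustration.

```python
import re

# The usage pattern, rewritten with Python's (?P<name>...) group syntax.
PATTERN = re.compile(
    r'^(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - '
    r'\[(?P<date>\S+ \+\d{4})\] '
    r'"(?P<method>GET|HEAD|POST|PUT|DELETE|CONNECT|OPTIONS|TRACE|PATCH) '
    r'(?P<path>\S+) (?P<version>\S+)" '
    r'(?P<code>\d{3}) (?P<rt>\S+) '
    r'"(?P<referer>\S+)" "(?P<ua>[^"]+)" "(\S+)"'
)

# One line taken from the sample access.log above.
line = (
    '135.125.246.110 - - [14/Jun/2024:03:13:04 +0800] "GET /.env HTTP/1.1" '
    '404 555 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36" "-"'
)

match = PATTERN.match(line)
fields = match.groupdict()  # named groups only; the unnamed group is absent
columns = list(fields)      # column names in pattern order

print(columns)
print(fields["method"], fields["path"], fields["code"])
```

The column names come out in the order the groups appear in the pattern, which matches the header row of the example output.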