#sql-parser #parser-generator #parser #sql #tokenizer #scanner

libsql-sqlite3-parser

SQL parser (as understood by SQLite) (libsql fork)

4 releases (2 breaking)

0.13.0 Aug 7, 2024
0.12.0 Jun 11, 2024
0.11.1 Mar 7, 2024
0.11.0 Jan 4, 2024

#179 in Parser implementations

Download history 3684/week @ 2024-09-11 3498/week @ 2024-09-18 3891/week @ 2024-09-25 4169/week @ 2024-10-02 3917/week @ 2024-10-09 4116/week @ 2024-10-16 3569/week @ 2024-10-23 2305/week @ 2024-10-30 1495/week @ 2024-11-06 1234/week @ 2024-11-13 1030/week @ 2024-11-20 1198/week @ 2024-11-27 2167/week @ 2024-12-04 3924/week @ 2024-12-11 3680/week @ 2024-12-18 2100/week @ 2024-12-25

12,026 downloads per month
Used in 2 crates (via libsql)

Apache-2.0/MIT

450KB
11K SLoC

  • Rust: 5.5K SLoC, 0.1% comments
  • C: 4.5K SLoC, 0.2% comments
  • Happy: 1.5K SLoC


LEMON parser generator modified to generate Rust code.

Lemon source and SQLite3 grammar were last synced as of May 2022.

Unsupported

Unsupported Grammar syntax

  • %token_destructor: Code to execute to destroy token data
  • %default_destructor: Code for the default non-terminal destructor
  • %destructor: Code which executes whenever this symbol is popped from the stack during error processing

  • https://www.codeproject.com/Articles/1056460/Generating-a-High-Speed-Parser-Part-Lemon
  • https://www.sqlite.org/lemon.html

SQLite

The SQLite lexer and parser have been ported from C to Rust. The parser produces an AST.

Lexer/Parser:

  • Keep track of position (line, column).
  • Streamable (stop at the end of statement).
  • Resumable (restart after the end of statement).
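The three properties above can be illustrated with a toy scanner. This is only a sketch of the idea, not this crate's API: `StatementScanner`, `Statement`, and `next_statement` are hypothetical names, and the splitting logic is deliberately naive (it handles single-quoted strings but ignores comments and other SQLite lexical rules).

```rust
/// Hypothetical toy scanner (not part of this crate).
struct StatementScanner<'a> {
    input: &'a str,
    offset: usize, // byte offset of the next unread character
    line: usize,   // 1-based line of the next unread character
    column: usize, // 1-based column, counted in characters
}

/// One statement plus the position where it starts.
struct Statement<'a> {
    text: &'a str,
    line: usize,
    column: usize,
}

impl<'a> StatementScanner<'a> {
    fn new(input: &'a str) -> Self {
        StatementScanner { input, offset: 0, line: 1, column: 1 }
    }

    /// Look at the next character without consuming it.
    fn peek(&self) -> Option<char> {
        self.input[self.offset..].chars().next()
    }

    /// Consume one character and update the (line, column) position.
    fn advance(&mut self, c: char) {
        self.offset += c.len_utf8();
        if c == '\n' {
            self.line += 1;
            self.column = 1;
        } else {
            self.column += 1;
        }
    }

    /// Streamable: stops right after the ';' that ends a statement.
    /// Resumable: the next call continues from where this one stopped.
    fn next_statement(&mut self) -> Option<Statement<'a>> {
        let input: &'a str = self.input;
        // Skip whitespace between statements.
        while let Some(c) = self.peek() {
            if c.is_whitespace() { self.advance(c); } else { break; }
        }
        if self.offset >= input.len() {
            return None;
        }
        let (start, line, column) = (self.offset, self.line, self.column);
        let mut in_string = false;
        while let Some(c) = self.peek() {
            self.advance(c);
            match c {
                '\'' => in_string = !in_string, // '' toggles twice, so it stays inside
                ';' if !in_string => break,
                _ => {}
            }
        }
        Some(Statement { text: &input[start..self.offset], line, column })
    }
}

fn main() {
    let sql = "SELECT 1;\n  INSERT INTO t VALUES ('a;b');";
    let mut scanner = StatementScanner::new(sql);
    while let Some(stmt) = scanner.next_statement() {
        println!("{}:{} {}", stmt.line, stmt.column, stmt.text);
    }
}
```

Because the scanner keeps its offset and position between calls, a caller can process one statement, do other work, and then ask for the next one without re-reading the input.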

Lexer and parser have been tested with the following scripts:

TODO:

Unsupported by Rust

  • #line directive

API change

  • No ParseAlloc/ParseFree anymore
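In C, a Lemon-generated parser is created with ParseAlloc and released with ParseFree. One plausible reading of the item above is that in Rust the parser is an ordinary owned value, so its memory is released when it goes out of scope. The sketch below illustrates that general Rust pattern only; `ToyParser` and its methods are hypothetical and are not the generated parser's actual type or API.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for a generated parser (not the crate's type).
struct ToyParser {
    stack: Vec<u16>,       // state stack owned by the parser value
    freed: Rc<Cell<bool>>, // only here so the example can observe the drop
}

impl ToyParser {
    /// Plays the role of ParseAlloc: plain construction, no allocator callback.
    fn new(freed: Rc<Cell<bool>>) -> Self {
        ToyParser { stack: vec![0], freed }
    }

    fn push_state(&mut self, state: u16) {
        self.stack.push(state);
    }

    fn depth(&self) -> usize {
        self.stack.len()
    }
}

/// Plays the role of ParseFree: runs automatically when the value is dropped.
impl Drop for ToyParser {
    fn drop(&mut self) {
        self.freed.set(true);
    }
}

fn main() {
    let freed = Rc::new(Cell::new(false));
    {
        let mut parser = ToyParser::new(Rc::clone(&freed));
        parser.push_state(7);
        println!("depth = {}", parser.depth());
    } // `parser` goes out of scope here; its stack is released without an explicit call
    println!("released = {}", freed.get());
}
```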

Features not tested

  • NDEBUG
  • YYNOERRORRECOVERY
  • YYERRORSYMBOL

To be fixed

  • Right-hand-side (RHS) values are moved. This may not be a problem if each one is always used exactly once; a check for that could be added in lemon...
  • %extra_argument is not supported.
  • Terminal symbols generated by lemon should be dumped in a specified file.

Raison d'être

  • lemon_rust does the same thing but with an old version of lemon. It also does not seem possible to use yystack as a stack, because items may be accessed randomly and the top+1 item can be used.

  • lalrpop would be the perfect alternative, but it does not support fallback/streaming (see this issue), and its compilation/generation is slow.

Dependencies

~1.2–1.9MB
~35K SLoC