20 releases (6 breaking)
0.7.0 | Sep 19, 2024 |
---|---|
0.5.3 | Sep 14, 2024 |
0.0.10 | Aug 27, 2023 |
0.0.9 | Mar 18, 2023 |
#416 in Text processing
904 downloads per month
285KB
4.5K
SLoC
About
By regarding matching as an assignment of occurrences of strings to each part of an expression, regular expressions are resource (limited occurrences of strings) consumption (exclusive assignment/matching of them).
Kind
- Decidable (or only complete?)
- Orderd/non-commutative
- Linear/non-deterministic (β regular/deterministic)
- Unitless/promonoidal
TODO
- Is there such as selective/projective mul to concatenate only one side of And?
- Representation of the four units, or do we need them? (Decidability is affecting)
- 1 as Seq::empty(), π β‘ Ξ½X.X
- π β‘ ΞΌX.X
- β€ as Inf(Interval::full())
- β₯ as a & b when a β b (?A)
- One as Seq(['a']) vs Interval('a', 'a')
- Relatipnship between nominality, input-ness/string, and function arguments/abstraction, the Integral trait and the De Bruijn index
- TextStart and TextEnd as left / right units
- Interpretation of
match
, is it a judgementa : A
or a test that one is divisible by another (quotient is equal to zero)a / b
, one (string) is contained by another (regex) a β b? Proof search - Two types of negations for each positive connectives: rev and div for Seq, not and sub for Interval
- a.le(b) β a.and(b) = a, a.add(b) = b
- Equational reasoning, bisimulation
- Induction
- True(Seq<I>, Box<Repr<I>>), assertions, dependent types, backreference
- action a.P, a β P, guard, scalar product
- Normalisation vs equality rules
- characteristic function/morphism
- weighted limit, weighted colimit, end, coend
- parameterise some laws as features
- Spherical conic (tennis ball)
Xor
- lookahead/behind, multiple futures, communication and discard,
ignore
combinator - Split, subspace
Correspondence
formal language theory / automata theory / set theory |
linear logic | repr | type theory / category theory / coalgebra |
len | process calculus | probability theory / learning theory |
quantum theory |
---|---|---|---|---|---|---|---|
a β L (match) | a : A (judgement) | ||||||
$β $ | $0$ (additive falsity) | Zero | π (empty type) | nil, STOP | |||
$β€$ (additive truth) | True | ||||||
a | a | One(Seq(a)) | len(a) | (sequential composition, prefix) | |||
$Ξ΅$ (empty) β {Ξ΅} | $1$ (multiplicative truth) | Seq([]) | * : π (unit type) | 0 | SKIP | ||
. | Interval(MIN, MAX) | 1 | |||||
ab / $a Β· b$ (concatenation) | $a β b$ (multiplicative conjunction/times) | Mul(a, b) | $a β b$ (tensor product) | len(a) + len(b) | P ||| Q (interleaving) | ||
a|b (alternation), $a βͺ b$ (union) |
$a β b$ (additive disjuction/plus) | Or(a, b) | $a + b$ (coproduct) | max(len(a), len(b)) | (deterministic choice) | ||
a* (kleen star), ..|aa|a|Ξ΅ |
$ΞΌX.π + (L(a) β X)$ | ||||||
TODO | $!a$ (exponential conjunction/of course) | Inf(a) | $Ξ½X.π & a & (X β X)$ | (replication) | |||
a*? (non greedy) | |||||||
TODO | $?a$ (exponential disjunction/why not) | Sup(a) | $Β΅X.β₯ β a β (X β X)$ | ||||
a? | a + 1 | Or(Zero, a) | |||||
a{n,m} (repetition) | a | Or(Mul(a, Mul(a, ..)), Or(..)) | |||||
[a-z] (class) | Interval(a, z) | ||||||
[^a-z] (negation) |
TODO this is complementary op | Neg(a) /-a |
|||||
aβ (reverse) | right law vs left law | a.rev() | len(a) | ||||
$a / b$ (right quotient) | $a βΈ b$ | Div(a, b) | len(a) - len(b) | (hiding) | |||
a \ b (left quotient) | Div(a, b) |
(hiding) | |||||
a β b (multiplicative disjunction/par) | Add(a, b) | $a β b$ (direct sum) | (nondeterministic choice) | ||||
$a β© b$ (intersection) | a & b (additive conjunction/with) | And(a, b) | $a Γ b$ (product) | (interface parallel) | |||
a(?=b) (positive lookahead) |
|||||||
a(?!b) (negative lookahead) |
|||||||
(?<=a)b (positive lookbehind) |
|||||||
(?<!a)b (negative lookbehind) |
|||||||
$a β b, a β€ b$ (containmemt) | $a β€ b (β a = b β a < b)$ | a.le(b) | |||||
aβ₯ (dual) | a.dual() | ||||||
a = b (equality) | a = b (identity type) |
About symbols
Symbols are grouped and assigned primarily by additive/multiplicative distinciton. They are corresponding to whether computation exits or rather continues; though concatenation β
/Mul
/*
has conjunctive meaning, computation doesn't complete/exit at not satisfying one criterion for there are still different ways of partition to try (backtracking). Though RegexSet β
/Add
/+
has disjunctive meaning, computation doesn't complete/exit at satisfying one criterion to return which regex has match. On the other hand, alternation β
/Or
/|
and intersection &
/And
/&
early-break, hence order-dependent. When I add Map
variant to Repr
to execute arbitrary functions, this order-dependency suddenly becomes to matter for those arbitrary functions can have effects, for example, modification/replacement of input string. (Effects are additive.)
additive | multiplicative | exponential | |
---|---|---|---|
positive/extensional | $β$ $0$ Or | $β$ $1$ Mul | ! |
negative/intensional | & β€ And | β β₯ Add | ? |
Properties/Coherence
- All connectives are associative
- Additive connectives are commutative and idempotent
TODO:
- Seq::empty(), Ξ΅ - can be empty because negative
- Interval::full() - can't be empty because positive
name | regular expressions | linear logic | type theory | repr |
---|---|---|---|---|
or-associativity | $a | (b | c) = (a | b) | c$ | $a β (b β c) = (a β b) β c$ | Or(a, Or(b, c)) = Or(Or(a, b), c) | |
or-idempotence | $a | a = a$ | $a β a = a$ | Or(a, a) = a | |
or-unit | $a β 0 = 0 β a = a$ | Or(a, Zero) = Or(Zero, a) = a | ||
mul-unit | $a Β· Ξ΅ = Ξ΅ Β· a = a$ | $a β 1 = 1 β a = a$ | Mul(a, One('')) = Mul(One(''), a) = a | |
right-distributivity | $a Β· (b | c) = (a Β· b) | (a Β· c)$ | $a β (b β c) = (a β b) β (a β c)$ | Mul(a, Or(b, c)) = Or(Mul(a, b), Mul(a, c)) | |
left-distributivity | $(a | b) Β· c = (a Β· c) | (b Β· c)$ | $(a β b) β c = (a β c) β (b β c)$ | Mul(Or(a, b), c) = Or(Mul(a, c), Mul(b, c)) | |
$Ξ΅^β = Ξ΅$ | ||||
(a & b)β = (bβ ) & (aβ ) | Rev(Mul(a, b)) = Mul(Rev(b), Rev(a)) | |||
Mul(One(a), One(b)) = One(ab) | ||||
right-distributivity | a β (b & c) = (a β b) & (a β c) | Add(a, And(b, c)) = And(Add(a, b), Add(a, c)) | ||
left-distributivity | $(a & b) β c = (a β c) & (b β c)$ | Add(And(a, b), c) = And(Add(a, c), Add(b, c)) | ||
and-idempotence | $a & a = a$ | And(a, a) = a | ||
exponential | $(a & b)* = a* Β· b*$ | $!(A & B) β£ !A β !B$ | ||
exponential | $(a | b)? = a? β b?$ | $?(A β B) β£ ?A β ?B$ |
Relationship among additive, multiplicative and exponential
- exp(a + b) = exp(a) * exp(b)
Linearity (which)
- f(a + b) = f(a) + f(b)
- (a β b) + (b β a) (non-constructive)
- functions take only one argument
- presheaves of modules
π, derivation
math | repr |
---|---|
$π(a β b) β π(a) β b + a β π(b)$ | der(Mul(a, b)) = Or(Mul(der(a), b), Mul(a, d(b))) |
- d(Zero) = Zero
- d(Or(a, b)) = Or(d(a), d(b))
- d(Mul(a, b)) = Or(Mul(d(a), b), Mul(a, d(b)) *
- d(Inf(a)) = Mul(d(a), Inf(a))
- a : D(a)
True
- And(True, a) ->
Stream processor
- Ξ½X.ΞΌY. (A β Y) + B Γ X + 1
Flags (TODO)
-
i
, CaseInsensitive -
m
, MultiLine -
s
, DotMatchesNewLine -
U
, SwapGreed -
u
, Unicode -
x
, IgnoreWhitespace -
Recouse non-greedy pattern to _
Algorithms (TODO)
- Bit-pararell
- aho-corasick
- BoyerβMoore
- memchr
Semantics (TODO)
- Coherent space
References
- Completeness for Categories of Generalized Automata, Guido Boccali, Andrea Laretto, Fosco Loregian, Stefano Luneia, 2023 [arXiv]
- Beyond Initial Algebras and Final Coalgebras, Ezra Schoen, Jade Master, Clemens Kupke, 2023 [arXiv]
- The Functional Machine Calculus, Willem Heijltjes, 2023 [arXiv]
- The Functional Machine Calculus II: Semantics, Chris Barrett, Willem Heijltjes, Guy McCusker, 2023 [arXiv]
- Coend Calculus, Fosco Loregian, 2020 [arXiv]
- Partial Derivatives for Context-Free Languages: From ΞΌ-Regular Expressions to Pushdown Automata, Peter Thiemann, 2017 [arXiv]
- Proof Equivalence in MLL Is PSPACE-Complete, Willem Heijltjes, Robin Houston, 2016 [arXiv]
- Linear Logic Without Units, Robin Houston, 2013 [arXiv]
- Imperative Programs as Proofs via Game Semantics, Martin Churchill, Jim Laird, Guy McCusker, 2013 [arXiv]
- Regular Expression Containment as a Proof Search Problem, Vladimir Komendantsky, 2011 [inria]
Dependencies
~2.3β4MB
~72K SLoC