Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I do not understand the rules by which you inject the epsilon and I think this is a source of confusion for many people. I had thought that an epsilon could be injected anywhere REGEX can appear (effectively allowing epsilon as a REGEX) but of course that just leads to infinite number of parses. Manually injecting epsilon is a highly hacky thing to do; better to consider that when you design the grammar.

I would not worry about "cat:dog:mouse" because intuitively it is clearly correct and it means replacing cat with mouse. With parentheses it could be written as "((cat:dog):mouse)".



Epsilon injection appears whenever right or left side of ':' has no operand. E.g.

(:a)

(a:)

a:|b

a|b:

etc

I will try to change the precedence and see how it works. Btw what do you think about explicit operators '>' '<' where '<' works as usual regex matcher, and '>' as a generator. For example to change 'cat' to 'dog' there could be something like '<cat>dog' where '<cat' part is a parser and '>dog' is a generator. Thanks.


I think your epsilon injection rule is trying to achieve this kind of production:

    TRRE <- TRRE ':' REGEX | ':' TRRE | TRRE ':' | REGEX | ...
I think this would work better, but ':a:' is still ambiguous: it has two parse trees.




Consider applying for YC's Summer 2026 batch! Applications are open till May 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: