Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Since Rascal has integrated Context Free Grammars, does this mean it incorporates the Earley Algorithm for parsing ?


Like many things from the Netherlands, it uses SGLR parsing (scannerless GLR, where GLR = generalized LR), which actually handles a class slightly more general than CFG s.


That was version 0.1 bootstrapped off SDF2. Since 2009 Rascal is based on a topdown general algorithm calked GLL. It's still Scannerless with declarative disambiguation, like SGLR. Currently it's evolving to data-dependent GLL to be able to model the offside rule, the C pre-processor and symbol table, and other "interesting" stuff we encounter in programming languages. See thesis by Afroozeh and Izmaylova.


How is GLR more general than CFGs?


GLR, GLL and Earley's parser (with the tree construction) are labeled "context-free general" parsing algorithms because they can handle _any_ context-free grammar while the other algorithms associated with sub-classes of CFGs, such as LR, LALR and LL do not implement all CFGs. In particular this comes in handy for languages which are unambiguous yet require a lot of lookahead or context to avoid parser action conflicts. The context-free general algorithms are popular among people who have to rapidly prototype languages or implement many different language processing tools without the opportunity to spend months on a parser. We pay in speed. GLR and GLL typically run nearly linearly on normal files, while Earley has a bad average case behavior which is often quadratic.

Another benefit of context-free general implementations is composability of grammars. Since CFGs are closed under composition you can implement modular grammar formalisms using a context-free general parser.

Another main drawback is that (un)ambiguity of context-free grammars is undecidable, and so prototypes generated using GLR, GLL and Earley may produce more than one tree, and in release mode you migh report an unexpected static error to your user in that case. We deal with this drawback using random testing/fuzzing and some static analysis.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: