> I quote that also because Aycock distributed SPARK, an Earley parser, which was included as part of the Python distribution, in the Parser/ subdirectory, and a couple of people here on HN report having used it.
That one is really the only Earley parser I've found used in the wild (don't know what marpa is used for) and unfortunately it is mostly unhackable because they did some serious optimization voodoo on it so it was replaced by a hand-written recursive decent parser a while back because nobody in the world could figure how it works[0] -- which is kind of strange since ASDL is super simple to parse and the generator which used spark was meant to check files into source control but, whatever.
Its easy to play around with but not a great source if you want to see how an Earley parser is put together. There are also some bugs with parser action on duplicate rules not working properly that were pretty easy to fix but python pulled it out of the source tree so no upstream to send patches to?
You are one of the "couple of people" I was referring to. :)
I know SPARK's docstring use influenced PLY.
PLY doesn't use Earley, but "Earley" does come up in the show notes of an interview with Beazley, PLY's author, at https://www.pythonpodcast.com/episode-95-parsing-and-parsers... . No transcript, and I'm not going to listen to it just to figure out the context.
Lark is amazing... But it's also one of the best LR parsers out there and I would guess that mode is used a lot more than the Earley mode.
Either way, I have never used a better parser generator. It has the best usability and incredible performance when you consider it is written in pure Python.
That one is really the only Earley parser I've found used in the wild (don't know what marpa is used for) and unfortunately it is mostly unhackable because they did some serious optimization voodoo on it so it was replaced by a hand-written recursive decent parser a while back because nobody in the world could figure how it works[0] -- which is kind of strange since ASDL is super simple to parse and the generator which used spark was meant to check files into source control but, whatever.
Its easy to play around with but not a great source if you want to see how an Earley parser is put together. There are also some bugs with parser action on duplicate rules not working properly that were pretty easy to fix but python pulled it out of the source tree so no upstream to send patches to?
[0] might be making that part up, dunno?