Hacker News new | past | comments | ask | show | jobs | submit login

I've long been a PEG skeptic. All three of these benefits have issues:

(1) LL parser generators have had zero-or-more and one-or-more operators for a long time (see ANTLR for example [1], as well as the venerable Parse::RecDescent [2]). They're very easy to implement in a recursive-descent parser.

(2) In practice, lexer-free grammars are very difficult to maintain in the presence of things like comments and whitespace rules. You sacrifice the ability to reason about layers of the grammar in isolation by binding the two layers together. Try to write a PEG grammar for, say, Java and you'll see what I mean; you end up wishing you had a sort of preprocessor that could get rid of comments and such in advance, which is precisely what a lexer is.

(3) PEGs are unambiguous in the sense that the system specifies how to resolve ambiguities (by taking the first option). But backtracking LL grammars and LR grammars provide this too. LL parser generators typically accept the first rule that matches: what could be simpler? LR parser generators are a little more clever and resolve shift/reduce conflicts in favor of the shift, which is a bit more magical but basically means that productions act "greedy". The key is that grammars always specify how to resolve ambiguities in some way, and this is really no different from what PEGs provide.

[1]: http://www.antlr.org/wiki/display/ANTLR3/ANTLR+Cheat+Sheet [2]: http://search.cpan.org/~dconway/Parse-RecDescent-1.965001/li...




I don't care enough about this stuff to have strong opinions on (1) and (3). Until this year, I wrote my parsers (I end up doing 2-3 a year) in Lex/Yacc.

But (2) is a red herring. Nothing about PEG tools force you to do all your tokenizing in the grammar. If you want to strip /++/ comments out, you can use exactly the same code you'd have used in your handrolled recursive descent parser to do that, and then parse the real stuff in PEG.

But that aside, we use a PEG grammar for Javascript, which has (I assume) all the annoying comment syntax you're referring to, and it doesn't seem like a big deal.


> we use a PEG grammar for Javascript [...]

Anything open-source, or that you're in the mood to share?





Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: