
Ambiguity is useful for error recovery and error detection. Also, some languages have ambiguity in their syntax (ML). I don't buy the 'optimization' argument: there is no reason we can't have our cake and eat it too, with both ambiguity and incremental parsing.

As for incremental parsing in IDEs, you may enjoy this thesis: http://jeff.over.bz/papers/2006/ms-thesis.pdf

Finally:

I think parser-generators are unpopular because people would prefer to just write code, rather than compile something else to automatically generated code that is nigh on unreadable.

I think the popularity of regexes is due in part to the ease with which they can be embedded or used within the host language - either with syntactic sugar, or simply as a library.

Combinators (Parsec especially) hit a sweet spot: you can write parsing code succinctly, without having to complicate your build or auto-generate source.
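
For anyone who hasn't seen the combinator style, here is a minimal sketch in Python. It is not Parsec and not any real library: the names `satisfy`, `char`, `many1`, `seq` and `fmap` are made up for illustration, and a parser is assumed to be a plain function from (text, position) to either (value, new position) or None.

```python
# Hypothetical mini combinator library, for illustration only.
# A parser is a function (text, pos) -> (value, new_pos), or None on failure.

def satisfy(pred):
    """Parse one character that satisfies pred."""
    def parse(text, pos):
        if pos < len(text) and pred(text[pos]):
            return text[pos], pos + 1
        return None
    return parse

def char(c):
    """Parse exactly the character c."""
    return satisfy(lambda ch: ch == c)

def many1(p):
    """Apply p one or more times and collect the results in a list."""
    def parse(text, pos):
        values = []
        result = p(text, pos)
        while result is not None:
            value, pos = result
            values.append(value)
            result = p(text, pos)
        return (values, pos) if values else None
    return parse

def seq(*parsers):
    """Run the parsers one after another; fail if any of them fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def fmap(f, p):
    """Transform the value produced by p with f."""
    def parse(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return f(value), new_pos
    return parse

# The "grammar" is ordinary host-language code:
#   number ::= digit+
#   pair   ::= number ',' number
number = fmap(lambda ds: int("".join(ds)), many1(satisfy(lambda ch: ch in "0123456789")))
pair = fmap(lambda vs: (vs[0], vs[2]), seq(number, char(","), number))
```

The grammar is built from ordinary values in the host language, so there is no separate grammar file or generation step; calling `pair("12,34", 0)` runs the parser directly.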

I'd really prefer a library I can load and manipulate the grammar from, over yet another syntax and compiler in my toolchain.

(PS: I'm saddened by the lack of left recursion support in Gazelle.)



> I think parser-generators are unpopular because people would prefer to just write code, rather than compile something else to automatically generated code that is nigh on unreadable.

I agree that generating source code is annoying, which is why Gazelle does not do it. It takes a VM approach instead; the parser is either interpreted or JIT-ted.

> I think the popularity of regexes is due in part to the ease with which they can be embedded or used within the host language - either with syntactic sugar, or simply as a library.

That is exactly what Gazelle is trying to do.

> I'd really prefer a library I can load and manipulate the grammar from, over yet another syntax and compiler in my toolchain.

Gazelle always loads its grammars at runtime. There is also a compiler, but it just generates bytecode that the runtime loads, and you can run the compiler at runtime too if you want.

If you'd rather build a grammar programmatically than use a syntax meant for it, more power to you (Gazelle will support it). But that doesn't seem to match your regex case: people specify regexes with a special syntax, not by building up an expression tree manually. The latter seems like a lot of work to me, and such grammars will not be reusable from other languages, but that might not be important to you.

> (PS: I'm saddened by the lack of left recursion support in Gazelle.)

What is a case where you would really miss it, that isn't addressed by a repetition operator (*) or an operator-precedence parser?
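
To make the question concrete: a left-recursive rule like `expr -> expr '-' term | term` can be written as `expr -> term ('-' term)*` with the results folded to the left, and a precedence-climbing loop extends that idea to several operator levels. Below is a rough Python sketch of that technique. It is not Gazelle code; the tokenizer, the `BINARY_OPS` table and the function names are all assumptions made for the example.

```python
import re

# Hypothetical operator table: token -> (precedence, evaluation function).
# All operators here are left-associative.
BINARY_OPS = {
    "+": (1, lambda a, b: a + b),
    "-": (1, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "/": (2, lambda a, b: a / b),
}

def tokenize(src):
    # Simplification: integers, the four operators and parentheses only;
    # any other character is silently skipped.
    return re.findall(r"\d+|[-+*/()]", src)

def parse_atom(tokens, pos):
    """Parse an integer or a parenthesized expression."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        value, pos = parse_expr(tokens, pos + 1, 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1
    if tok.isdigit():
        return int(tok), pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")

def parse_expr(tokens, pos=0, min_prec=1):
    """Precedence climbing: a loop (repetition) replaces left recursion."""
    left, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in BINARY_OPS:
        prec, fn = BINARY_OPS[tokens[pos]]
        if prec < min_prec:
            break
        # Asking for prec + 1 on the right makes the operator left-associative.
        right, pos = parse_expr(tokens, pos + 1, prec + 1)
        left = fn(left, right)
    return left, pos

def evaluate(src):
    tokens = tokenize(src)
    value, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return value
```

With this, `evaluate("10-4-3")` groups as `(10-4)-3`, which is the grouping a left-recursive grammar rule would normally be used to get.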


I would like to say: awesome!

And yes, most of my left recursion fetish would be covered by an operator-precedence parser or a left-corner parser.


> I think parser-generators are unpopular because people would prefer to just write code, rather than compile something else to automatically generated code that is nigh on unreadable.

Also, a manual lexer/parser can introduce context when necessary. E.g. Python has significant whitespace. The lexer (which keeps track of the current indentation width) can easily emit indent/dedent tokens, so the grammar stays context-free. With a tool like ANTLR you have to do weird stuff to parse Python.
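
A sketch of the indent/dedent idea, in Python. This is a simplified illustration and not CPython's actual tokenizer: it assumes spaces only (no tabs), ignores comments and line continuations, and emits a single hypothetical `LINE` token per non-blank line instead of real tokens.

```python
def indent_tokens(source):
    """Return a list of (kind, text) tokens: INDENT, DEDENT, or LINE."""
    stack = [0]   # indentation widths of the currently open blocks
    tokens = []
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines do not affect indentation
        width = len(raw) - len(raw.lstrip(" "))
        if width > stack[-1]:
            # Deeper than the current block: open a new one.
            stack.append(width)
            tokens.append(("INDENT", ""))
        else:
            # Shallower: close blocks until we are back at a known width.
            while width < stack[-1]:
                stack.pop()
                tokens.append(("DEDENT", ""))
            if width != stack[-1]:
                raise IndentationError("unindent does not match any outer level")
        tokens.append(("LINE", raw.strip()))
    # Close any blocks still open at end of input.
    while len(stack) > 1:
        stack.pop()
        tokens.append(("DEDENT", ""))
    return tokens
```

The only state is the stack of widths, and it lives entirely in the lexer. The parser just sees INDENT and DEDENT as if they were opening and closing braces.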

Programming languages older than 10 years or so are usually not context-free. For example, C needs context to parse "A*B", because the meaning depends on whether "A" is a type or a variable. Recent programming languages usually try to be LL(1), which is why the keywords "var", "val", and "def" have become so popular.
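
A toy illustration of the "A*B" problem, in Python. This is not how any real C compiler is structured; the function `classify` and the `typedef_names` set are invented for the example. It only shows that the same token sequence gets a different parse depending on what the symbol table says about `A` (the usual workaround is often called the "lexer hack").

```python
def classify(statement, typedef_names):
    """Classify a statement of the form 'A * B;' given the known type names."""
    tokens = statement.replace("*", " * ").replace(";", " ; ").split()
    if len(tokens) == 4 and tokens[1] == "*" and tokens[3] == ";":
        first, _, second, _ = tokens
        if first in typedef_names:
            # 'A' names a type, so this declares B as a pointer to A.
            return f"declaration: {second} is a pointer to {first}"
        # Otherwise 'A' is an ordinary identifier and '*' is multiplication.
        return f"expression: {first} multiplied by {second}"
    raise ValueError("this sketch only handles statements of the form 'A * B;'")
```

So `classify("A*B;", {"A"})` reports a declaration while `classify("A*B;", set())` reports an expression, with identical input text. A leading keyword such as "var" or "def" removes the need for that lookup, because the first token alone tells the parser which rule applies.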



