Antlr-Ng Parser Generator (antlr-ng.org)
44 points by djoldman 4 months ago | 20 comments


Development on Antlr4 has terminated. The "official ANTLR" successor, called Antlr5, was intended to enable ANTLR to run in a browser, replacing over a half-dozen runtime targets with a unified runtime target, and to add LSP services. But development on Antlr5 stopped after a few months, a year and a half ago, and I don't see when it'll be restarted, if ever.

Antlr-ng is Mike Lischke's port of Antlr4, which he likely undertook because ANTLR is used at Oracle for one MySQL product. It's not "official ANTLR," but Terence Parr granted him the use of the "ANTLR" name and allowed a fork to port the existing Antlr4 code to TypeScript.

Mike's Antlr-ng port of the Antlr4 code began with a Java-to-TypeScript translator he wrote. Along the way, he made some improvements to the TypeScript target.

But Antlr-ng uses ALL(*), so it shares the same performance issues as Antlr4. I'm not sure where Mike wants to take Antlr-ng to address that issue.

ANTLR is presented as a generator for small, fast parsers. ALL(*) probably can't do that. Many grammars people write are pathological for ANTLR. People hand-write parsers, reverse-engineer the EBNF from the implementation as an afterthought, drop the critical semantic predicates from the EBNF, and then refactor it into something else. Example: the Java Language Spec.
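
To make "semantic predicate" concrete, here is a minimal sketch in Python. It is not taken from the JLS or from any ANTLR grammar; it swaps in the classic C typedef ambiguity as a stand-in, and the function name `classify_statement` and the `type_names` set are hypothetical, chosen only for illustration. The point is that the token sequence alone cannot decide the parse, so an EBNF that drops the predicate no longer describes what the hand-written parser does.

```python
def classify_statement(tokens, type_names):
    """Classify the token sequence `A * b ;` as a declaration or an expression.

    The tokens are identical in both cases. Only the set of known type
    names (a semantic fact, not a syntactic one) decides the outcome.
    """
    if len(tokens) == 4 and tokens[1] == "*" and tokens[3] == ";":
        # Semantic predicate: is the first identifier a declared type?
        if tokens[0] in type_names:
            return "declaration"  # pointer declaration: A *b;
        return "expression"       # multiplication: A * b;
    raise ValueError("this sketch only handles `A * b ;`")
```

Calling it with the same tokens and different symbol tables gives different answers, which is exactly the information a predicate-free EBNF loses.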


I have seen a fair few parser generators over the years, but it has been a long while since I have looked at anything that has been newly developed.

What improvements have been made to make them better? The problem domain seems pretty well defined and even 20 years ago the things that were changing felt like polishing off a few rough edges caused by earlier resource constraints.

I don't want to be dismissive and say "Why make this?" as an implied suggestion that it shouldn't have been made.

Nevertheless, why make this? I assume there are good reasons for doing this that I am not aware of. What are they?


I get the impression that someone doesn't like Java and used ChatGPT to create a one-to-one TypeScript port.

I dislike Java as much as the next guy, but I believe the true value of tools (and this tool in particular) is in the embedded wisdom and experience of their creators, in this case Terence Parr. Just generating a functionally equivalent port doesn't add much value.

That said, that's just a first impression; I have no idea what motivated this fork.


Their GitHub readme has a section answering this.

https://github.com/antlr-ng/antlr-ng#future

Basically they feel the main problem with the original ANTLR is that it’s being stifled by its batteries-included nature. They’re hoping that splitting it up will make each of the runtimes more agile. They don’t mention why the core was rewritten rather than just forking the original.


What is this project's relationship with ANTLR? I see a different name on the copyright, and the GitHub page suggests this is not a part of the ANTLR project, while claiming to be the next generation.

If that's the case, I think it's misleading. It's fine to fork a project, but you don't get to call yourself the next generation of someone else's project.


I mean, we shouldn't allow ownership of the common English language. Did C++ author Bjarne Stroustrup ask permission of the C authors (are there even authors to ask)? Did JavaScript's creator ask Java's creators? There was a Go! before Golang. BASIC and Visual Basic.


I don't think this is a fair interpretation of the parent comment, as it's not about ownership of language. The website literally says "The next generation of ANTLR" and says "It's the successor of ANTLR4".

It's about a tool claiming to be the successor without seeming to be part of the ANTLR organisation. Are they completely different people, did the ANTLR4 owners stop writing it? There seems to be deliberately no clarification on this.


I'm in the same boat with BABLR, which is designed as the successor to Babel (and named with a nod to ANTLR). I think this is just part of the benefit of free software and OSS, that someone can pick up the work and start trying to innovate without being given any kind of explicit permission. If you understand the mission and are willing and able, you can pick up the flag and start trying to run with it. You might not instantly become the recognized standard bearer for the cause, but keep pushing the flag forward and people will take notice (as we are doing here).


Is antlr particularly popular these days? I was under the impression that most production parsers are some kind of handwritten recursive descent parsers, primarily because they're better at providing diagnostics and can sometimes be easier to maintain.


I've used ANTLR to generate a parser for a small language used in one project. It's like 100 declarative lines of code. Writing the parser by hand would be a much more complicated task.

I didn't really care about diagnostics. It has some, that's enough.

And of course it's easier to maintain a declarative grammar description.

My guess is that it's often used for those kinds of simple grammars without high requirements on the implementation, when you need to get things done. Like regex: you might write code to parse a string in a more efficient way, but with regex it's almost always easier. So ANTLR is like a regex engine for more complicated inputs.


Most production parsers use their own handwritten recursive descent parsers, not only because of better diagnostics (error handling, language server hinting, etc.), but also for other reasons. One such major reason is that parser generators represent a very unstable dependency. They frequently change their APIs in newer versions, and some are becoming obsolete while new ones are constantly appearing. You don't want to risk the longevity of your parser by basing it on such unstable foundations. Flex/Bison is perhaps the only exception, as it hasn't changed much over time.
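
For anyone who hasn't written one, here is a minimal sketch of a handwritten recursive descent parser in Python, showing why diagnostics are easy to control: each grammar rule is a plain method, so the error message and offset are chosen at the exact point of failure. The grammar (integers, `+`, `-`, `*`, parentheses) and the names `ParseError`, `tokenize`, and `Parser` are made up for illustration and are not from any production parser.

```python
class ParseError(Exception):
    """Syntax error carrying the input offset where it was detected."""

    def __init__(self, message, pos):
        super().__init__(f"{message} at offset {pos}")
        self.pos = pos


def tokenize(text):
    """Return a list of (kind, text, offset); kind is 'num', 'op', or 'end'."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            start = i
            while i < len(text) and text[i].isdigit():
                i += 1
            tokens.append(("num", text[start:i], start))
        elif ch in "+-*()":
            tokens.append(("op", ch, i))
            i += 1
        else:
            raise ParseError(f"unexpected character {ch!r}", i)
    tokens.append(("end", "", len(text)))
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def advance(self):
        tok = self.tokens[self.i]
        self.i += 1
        return tok

    def parse(self):
        value = self.expr()
        kind, text, pos = self.peek()
        if kind != "end":
            raise ParseError(f"unexpected {text!r}", pos)
        return value

    def expr(self):
        # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek()[0] == "op" and self.peek()[1] in ("+", "-"):
            op = self.advance()[1]
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term := factor ('*' factor)*
        value = self.factor()
        while self.peek()[0] == "op" and self.peek()[1] == "*":
            self.advance()
            value = value * self.factor()
        return value

    def factor(self):
        # factor := NUMBER | '(' expr ')'
        kind, text, pos = self.advance()
        if kind == "num":
            return int(text)
        if (kind, text) == ("op", "("):
            value = self.expr()
            kind, text, pos = self.advance()
            if (kind, text) != ("op", ")"):
                raise ParseError("expected ')'", pos)
            return value
        raise ParseError("expected a number or '('", pos)
```

`Parser("1 + 2 * 3").parse()` evaluates with the usual precedence, and an input like `2 * (3 + 4` raises a `ParseError` pointing at the offset where the closing parenthesis was expected. There is no generator or runtime dependency to break on upgrade.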


I used ANTLR recently to prototype a spreadsheet formula language -- backend was JVM so it was reasonably easy and batteries-included.


Quite right. But antlr is better for query parsing. They also have error listeners so error handling can be added.


The Readme has a section for the raison d'être for it compared to the original ANTLR: https://github.com/antlr-ng/antlr-ng?tab=readme-ov-file#futu...


Has performance of ANTLR-generated code gotten better? I'm sure some of this was bad grammars, but I wasn't thrilled with what I got out of ANTLR ~15 years ago.


Last time I checked, about 5 years ago, the runtime libraries for the .NET target were a performance disaster. I remember reimplementing a compatible faster one in F#, but I wasn't satisfied with the overall program so I eventually got rid of ANTLR (and .NET) for that project altogether; I don't think the code survived.


I'm a fan of antlr-ng. It's a solid upgrade if you're already using antlr. In my experience, they're fully compatible. antlr's ALL(*) parsing is relatively powerful for a parser generator, but it lacks support for incremental parsing. antlr-ng might improve things enough to be usable interactively in smaller settings, even if you need to reparse the document each time. It also comes with useful extensions like https://github.com/mike-lischke/antlr4-c3, which generates syntactic and semantic completions directly from the grammar.


I used ANTLR to create a grammar file for MK (Manufacturing Knowledge). I plugged the JavaScript parser and lexer into Ace editor. Good memories.


Why -ng? I thought it had something to do with Angular.


It just means "next gen".



