(not the author) > probably the JVM will get in your way Not so much the JVM, bu...

raspasov · on Jan 8, 2017

Yup, the JVM is something that people (esp. startups & new-co's) are, all things being equal, vastly underusing or over-complaining about :).

Before I started doing Clojure, the JVM ecosystem was this "scary" thing, but 4 years later I've learned to appreciate it a lot: stable AND sound libraries, high performance out of the box, and very, very high performance if you really need it.

jb1991 · on Jan 8, 2017

> very, very high performance if you really need it

But not as high performance as Java directly. And typically not as high performance as native languages, which the author addresses.

pron · on Jan 8, 2017

There's absolutely nothing about native languages that makes them a priori faster than the JVM. On the contrary, the JVM's compiler is likely to emit better-optimized machine code than their compilers. Some low-level languages (like C/C++/Rust) can (and do in many cases) offer better performance, but that's not because they're "native" but because they allow (and require) finer control over every operation. This level of control comes at a significant cost, though.

jb1991 · on Jan 8, 2017

Notice I said "typically" because it is typically true that a native language runs faster than Clojure, which is what my comment was referring to.

agibsonccc · on Jan 8, 2017

What Clojure and even Scala do with the JVM scares me. The most concrete example I know of is using objects for everything. One of the fundamental problems I have with many of the higher-level JVM languages is how much they hide from you. That being said, most people won't notice. I do numerical software in Java, so I might be a bit more sensitive to some of this stuff.
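A minimal Java sketch (not from the thread) of what "objects for everything" costs in practice: the same sum computed over boxed `Long` values in a `List` and over a primitive `long[]`. The class and method names are made up for illustration. Both versions return the same result; the boxed one goes through object references and unboxes on every addition, which is the kind of hidden overhead numerical code tends to care about.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; names are not from any library.
final class BoxingSketch {
    // Sums 0..n-1 through a List<Long>: each element is a reference to a
    // Long object, and each addition unboxes it back to a primitive long.
    static long sumBoxed(int n) {
        List<Long> values = new ArrayList<>(n);
        for (long i = 0; i < n; i++) {
            values.add(i); // autoboxing: long -> Long
        }
        long total = 0;
        for (Long v : values) {
            total += v; // auto-unboxing: Long -> long
        }
        return total;
    }

    // Same sum over a primitive array: contiguous memory, no per-element objects.
    static long sumPrimitive(int n) {
        long[] values = new long[n];
        for (int i = 0; i < n; i++) {
            values[i] = i;
        }
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println(sumBoxed(n) == sumPrimitive(n));
    }
}
```

How large the gap is depends on the JVM, heap settings, and data size, so it is worth measuring with a proper benchmark harness rather than assuming a number.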

raspasov · on Jan 9, 2017

If you ever have the need, check out this blog post about optimizing math operations in Clojure: http://insideclojure.org/2014/12/15/warn-on-boxed/

agibsonccc · on Jan 8, 2017

It can, though. We are a Java shop that does HPC and deep learning. Our work is open source. We've been around 2.5 going on 3 years now. Over the years we've surpassed bottleneck after bottleneck in the JVM.

It's a faster wrapper around C code for us. We created and maintain: https://github.com/bytedeco/javacpp

which wraps a lot of C++ components. This is also how we do GPUs and the like, with our own memory management for our "numpy for Java": https://github.com/deeplearning4j/nd4j/tree/master/nd4j-back...

The short of it: "off heap and NO GC" matters a lot.

We have one C codebase that's meant to be controlled via JNI here: https://github.com/deeplearning4j/libnd4j

The speed gains we've seen are massive. Java can't compete with good ole SIMD and the like for numerical computing.

We are big advocates of the JVM as a platform, but let's be clear about its weaknesses. You need to use Unsafe (see: Netty, Aeron, and any other low-level database written in Java or using Java as the core language with some C++) and other tricks out of the box for real performance.
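For readers who haven't seen off-heap memory on the JVM, here is a minimal sketch using only the standard `java.nio` API. It is an assumption-laden illustration, not how JavaCPP, ND4J, or libnd4j manage memory: `ByteBuffer.allocateDirect` places the data outside the Java heap, but the buffer is still released only when its wrapper object is garbage collected, which is one reason projects like the ones above bring their own pointer classes and explicit deallocation. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

// Hypothetical illustration; names are not from any library.
final class OffHeapSketch {
    // Allocates `length` doubles outside the Java heap, in native byte order
    // so native code could read the same memory without byte swapping.
    static DoubleBuffer allocateOffHeap(int length) {
        return ByteBuffer.allocateDirect(length * Double.BYTES)
                .order(ByteOrder.nativeOrder())
                .asDoubleBuffer();
    }

    // Plain scalar dot product over two off-heap buffers of equal capacity.
    static double dot(DoubleBuffer a, DoubleBuffer b) {
        if (a.capacity() != b.capacity()) {
            throw new IllegalArgumentException("buffers must have equal capacity");
        }
        double acc = 0.0;
        for (int i = 0; i < a.capacity(); i++) {
            acc += a.get(i) * b.get(i); // absolute reads, position is untouched
        }
        return acc;
    }

    public static void main(String[] args) {
        DoubleBuffer a = allocateOffHeap(3);
        DoubleBuffer b = allocateOffHeap(3);
        a.put(0, 1.0).put(1, 2.0).put(2, 3.0);
        b.put(0, 4.0).put(1, 5.0).put(2, 6.0);
        System.out.println(a.isDirect() + " " + dot(a, b));
    }
}
```

The loop here is ordinary Java, so this shows only where the data lives. Handing the same memory to a native SIMD or GPU kernel is the part that needs JNI or a wrapper generator.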

pron · on Jan 8, 2017

I'm willing to bet that 99% of the performance difference will be solved by value types. Anyway, Cliff Click, who wrote H2O[1], a large machine learning platform, reports achieving Fortran speeds (i.e. maximum throughput) with pure Java[2]. This means that a C application won't even be 1% faster.

> Java can't compete with good ole simd and the like for numerical computing.

Oh, sure, for some specialized use cases of course that's true, but you could use OpenCL in Java, too. With 10+ years with C/C++ and 10+ years with Java, I'd bet on Java when it comes to performance bang-for-the-buck in almost every case (given that it's a large app), and even more so the more concurrent, complex and unpredictable the app is (but this requires carefully looking at the design).

Value types will make Java competitive in absolute terms in more and more domains. Also, with the new JIT (Graal) you can control machine-code generation at whatever level of detail you want.

> You need to use unsafe (see: netty, aeron, and any other low level database written in java or using java as the core language with some c++) and other tricks out of the box for real performance.

They're not using unsafe for throughput but (mostly) for latency. That's a whole other matter. Also, some of those use "mechanical sympathy" as a driver of performance instead of algorithms that the JVM makes easier. I've built a concurrent DB in pure Java that relies on synchronization that would require at least double the effort if I were to write it in C (hazard pointers, etc.), and may not even have better performance.

[1]: https://github.com/h2oai

[2]: https://vimeo.com/105743312

agibsonccc · on Jan 8, 2017

>> I'm willing to bet that 99% of the performance difference will be solved by value types. Anyway, Cliff Click, who wrote H2O[1], a large machine learning platform, reports achieving Fortran speeds (i.e. maximum throughput) with pure Java[2]. This means that a C application won't even be 1% faster.

I've had a personal conversation with Cliff himself. Java, no matter what you do, can't deal with hardware acceleration and GPUs. We agreed on that. Numerical software is a different beast. I also kept mentioning "SIMD instructions" as well as things like OpenMP.

You are talking about systems software. Unfortunately that matters a lot for machine learning. The axis along which you can get equivalent speeds should be specified here. "value types" != "runs on faster chips"

You aren't likely to beat Intel's or NVIDIA's compilers at their own game here. Java will always be playing catch-up to last gen's tech there.

Disclosure: I'm more than aware of what's going on in the space. We compete with them for customers and have a very clear understanding of their offerings. H2O has a great k/v store based on the exact mechanics you're talking about. That's about it though. Also of note: Cliff doesn't work on H2O anymore: https://twitter.com/cliff_click/status/700817408110399492

>> Oh, sure, for some specialized use cases of course that's true, but you could use OpenCL in Java, too....

OpenCL isn't exactly the industry standard for this stuff. You always end up using CUDA, and you always end up dropping down to C. There's just no way to avoid that if you want the fastest out there.

Another disclosure: we work closely with NVIDIA and I may be biased: https://blogs.nvidia.com/blog/2016/10/06/how-skymind-nvidia-...

I agree with you on the last part, but I keep mentioning "numerical software" for a reason. There are certain things the JVM is good at; writing databases and systems software is one of them. There are still bits of HDFS in C++, though. I don't think you'll be able to get around having bits of your code in C, which is what I emphasize here.

pron · on Jan 9, 2017

I've also had a personal conversation with Cliff about this (I know he's no longer with H2O), and he isn't (or, at least wasn't when I spoke to him about a year ago) aware of what's going on in Graal. Now, I completely agree that for specialized stuff, it's likely you'll get the most out of specialized compilers, but I'd closely watch the Graal space for specialized code generation. I had a talk with John Rose about how best to combine Graal with the work on VarHandles[1] and Panama[2], precisely to address things like SIMD. There's also interesting work done using Graal to directly emit GPU code (from Java code) for streams, but that's largely experimental at this stage (I think AMD did some work on that and then abandoned it).

[1]: http://openjdk.java.net/jeps/193

[2]: http://openjdk.java.net/projects/panama/

pjmlp · on Jan 8, 2017

The problem, I think, is that you are using it in areas (HPC and deep learning) where plain C++ would be a better option.

There is a reason why Java 10 aims to improve Java's story in regard to mechanical sympathy.

agibsonccc · on Jan 8, 2017

We wrote a lot of that stack ourselves, and our compute code is in C++. There is a lot of stuff going on in big data land (including us) wrapping a lot of the C++ in Java and enabling people to use the JVM for what it's good at (data access). We are pushing this with NVIDIA: https://blogs.nvidia.com/blog/2016/10/06/how-skymind-nvidia-...

I also work pretty closely with a lot of the Spark/GPU folks at IBM.

While I do largely agree with you, we have our own JNI compiler called JavaCPP that alleviates a lot of those concerns already: https://github.com/bytedeco/javacpp

Having our own pointer class and doing our own memory management has helped a lot.

What you're talking about is using Java for the compute. There's no reason you can't wrap that in a runtime that most people know how to use. A lot of Python folks do that now, but then you have to deal with Python's limitations, at which point the majority of your code (way more than need be) will end up being in C anyway, versus Java, where you can write a significant part of your app in Java and have it be fast out of the box.

pjmlp · on Jan 8, 2017

Thanks for the clarification.

I am mostly a line-of-business developer doing enterprise consultancy, and that is how we use C++: just as an infrastructure language when either the JVM or .NET stack needs a bit of outside help.

I was just wondering whether a 100% C++ solution would sometimes be a better approach, given the added integration effort the mix requires. On the other hand, without people like you we wouldn't have access to nice tooling in Java for similar work, so congratulations on the work thus far and all the best for the project.

WildUtah · on Jan 8, 2017

> It's fairly easy to get C performance (and even beat it in concurrent code) for the same amount of effort on the JVM.

Ha.

That's what we were promised back in the mid-1990s. It's no more true today than it was then. Well written Java will always be about 2-4x slower than similar C code.

JVM applications have many compensating advantages, including the aforementioned ease of exploiting concurrency. That doesn't erase the reality of single thread performance where Java has never, in twenty years of strong expert effort, caught up.

fnordsensei · on Jan 8, 2017

Agreed, in that the performance benefits you can get out of a higher-level language will often depend on the stuff where the limits of human cognition are the performance boundary. As computing gets faster in general, the practical considerations should shift more towards the performance you can get out of an entire system than unit-for-unit performance. On that level, human comprehension often seems to be what needs to be accounted for.

I think of it as a problem akin to how you can make traffic run better in an entire city rather than optimising the performance of an engine in a single car. It doesn't invalidate making more performant engines, but it's a different level of consideration.

When saying, "JVM is as performant as C", it's easy to run the numbers and see whether it's true or not, objectively. The caveat that GP throws in is, for the same amount of effort. Then you need to specify which human and under what circumstances, and the entire thing gets hairier.

WildUtah · on Jan 8, 2017

> As computing gets faster in general, the practical considerations should shift more towards the performance you can get out of an entire system than unit-for-unit performance.

Those days are over, unfortunately. Computing stopped getting faster in general around 2010. And that also changes the calculus around single thread performance. You can no longer count on hardware eventually to solve performance problems.

lvh · on Jan 8, 2017

Which is precisely why it matters that I can write functionally pure code that is trivial to parallelize. At $FORMER_WORK, I even wrote (the same!) code as reasonably idiomatic-looking Clojure (essentially, reducer fns) that transparently runs locally single-threadedly, multi-threadedly, and on Hadoop.
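To make the "pure code is trivial to parallelize" point concrete without Clojure, here is a rough analogue in Java streams. This is a swapped-in technique, not Clojure reducers and not the code described above: because the mapping function has no side effects and the reduction (a sum) is associative, the same pipeline gives the same answer whether it runs sequentially or in parallel. The class and method names are hypothetical, and the Hadoop case is not covered.

```java
import java.util.stream.LongStream;

// Hypothetical illustration; names are not from any library.
final class PureReduceSketch {
    // Sum of squares of 1..n. The lambda is pure and the reduction is
    // associative, so switching to a parallel stream cannot change the result.
    static long sumOfSquares(long n, boolean parallel) {
        LongStream numbers = LongStream.rangeClosed(1, n);
        if (parallel) {
            numbers = numbers.parallel(); // same pipeline, run on the common fork/join pool
        }
        return numbers.map(x -> x * x).sum();
    }

    public static void main(String[] args) {
        long sequential = sumOfSquares(1_000_000, false);
        long parallel = sumOfSquares(1_000_000, true);
        System.out.println(sequential == parallel);
    }
}
```

Whether the parallel version is actually faster depends on the workload and core count. The point of the sketch is only that correctness does not depend on the execution strategy.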

That C program might be fast, but it's not a great tool for processing petabytes of data on a stampeding herd of angry elephants. So, I think performance arguments need a little nuance about what you're doing.

(My experiments include soft real time with deadline scheduling. Clojure's fine.)

tensor · on Jan 8, 2017

I can't remember the source, but I remember reading about a case where someone replaced a large Hadoop cluster with one node running highly tuned C. Distributed computing comes with a lot of overhead, and you might be surprised at just how much you can get out of a highly tuned C application.

mtrycz · on Jan 8, 2017

And this is why parallel/concurrent computing/applications are necessary to ease speedups. And that's what the new languages promise to let you get right. For the same effort.

pron · on Jan 8, 2017

> Well written Java will always be about 2-4x slower than similar C code.

Nope. A large, concurrent app is likely to be faster in Java given similar effort (of course, given enough effort -- which may be double -- C will eventually surpass that, sometimes even significantly, depending on usage). Currently the main bottleneck, which makes the above statement very dependent on application type, is lack of value types, and that's being addressed. In small sequential apps, there will be a significant advantage to C, which diminishes with the size of the app. The reason is that as the app grows, it gets harder to write manual optimizations while keeping the code modular and maintainable, while HotSpot can do all sorts of optimizations even with nice abstractions.

agibsonccc · on Jan 8, 2017

See my comment here: https://news.ycombinator.com/item?id=13349838

I "kinda" agree with you. Java has the right stability and speed trade-offs and can be wicked fast. It beats the crap out of most garbage-collected languages, save maybe .NET and the CLR.

For real applications I agree with you that C is the way to go. Java with C and off-heap memory gets you a pretty long way.

pjmlp · on Jan 8, 2017

It depends on the application domain and the developer skills.

In many domains it doesn't matter that you can do it in 10 ms in C and I do it in 1 s in Java, if the customer is willing to wait 5 s.

Also, I see badly written Java code all the time. For example, copying an array via a for loop instead of System.arraycopy.
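A short sketch of the two styles, with made-up class and method names. Both produce identical copies; `System.arraycopy` states the intent in one call and lets the runtime use a bulk copy. The JIT can often optimize a simple copy loop as well, so no speedup figure is claimed here.

```java
import java.util.Arrays;

// Hypothetical illustration; names are not from any library.
final class ArrayCopySketch {
    // Element-by-element copy, the style being criticized.
    static int[] copyWithLoop(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Bulk copy: arguments are (source, sourceOffset, destination, destOffset, length).
    static int[] copyWithArraycopy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9};
        System.out.println(Arrays.equals(copyWithLoop(data), copyWithArraycopy(data)));
    }
}
```

When a fresh array is wanted anyway, `Arrays.copyOf(src, src.length)` or `src.clone()` does the allocation and the copy in one step.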

virtualwhys · on Jan 8, 2017

> This, thankfully, is being addressed by the addition of value types.

Last I heard (JLS 2016 keynote by Brian Goetz) they have yet to commit/guarantee when value types will land. Java 10 is planned; if they're completed by then we're looking at probably 2020 given the delays with Java 9 release.

That's 3 years for other languages/platforms to evolve while the JVM unboxes itself.

pron · on Jan 9, 2017

Large organizations switch programming languages no more than once a decade, and Java is way ahead of the competition as it stands.