Can you elaborate more on what you mean by "lackluster performance"? What is the use case? If you're looking for top speed in terms of C/assembly performance - I'd say yes, probably the JVM will get in your way. However, I spent months building a database/key-value store in Clojure and it's quite doable to write very high performance code in Clojure as long as you put the right type hints everywhere. Tools like YourKit can help you identify the bottlenecks in your program. Again, it depends on the use case, but Clojure makes it very, very idiomatic to write code that makes full use of a multicore system - something possible in the likes of Java, C, et al., but definitely not idiomatic at all.
I'm not the author, but looking through the sibling comments I feel I should point out the obvious: startup time.
It can take several seconds for a Clojure REPL to start up. No other dynamic language I've had contact with is that bad on that front. Even Erlang and Elixir boot up faster!
Of course, I know about various ways of setting up a daemon process in the background, but then there is a need for reloading/hot-swapping code, which Clojure also fails to do well. Both Erlang and Common Lisp (also Emacs... even Emacs!) are better in this regard, both for different reasons.
Anyway, Clojure's overall performance impression - how fast it feels to users - is bad and will remain bad unless the startup problem is fixed.
Not so much the JVM, but Clojure and how it uses the JVM (although, as you say, it's possible to get fairly close to top JVM performance with Clojure). It's fairly easy to get C performance (and even beat it in concurrent code) for the same amount of effort on the JVM. Currently, the main handicap the JVM has is the lack of arrays-of-structs, which can cause lots of cache misses and requires less-than-elegant code to overcome. This, thankfully, is being addressed by the addition of value types.
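A quick sketch of that less-than-elegant workaround (the names here are illustrative, not from any particular codebase): instead of an array of small objects, whose fields end up scattered across the heap, you keep each field in its own primitive array so the data is laid out contiguously, the way a C array of structs would be.

```java
// An array of Point objects scatters x/y across the heap and
// iterating over it can miss the cache on every element.
final class Point {
    final double x, y;
    Point(double x, double y) { this.x = x; this.y = y; }
}

// The common workaround: "structure of arrays". Each field lives in
// its own primitive array, so a scan walks contiguous memory,
// roughly what a C array of structs gives you for free.
final class Points {
    final double[] xs, ys;
    Points(int n) { xs = new double[n]; ys = new double[n]; }

    double sumOfDistancesFromOrigin() {
        double sum = 0;
        for (int i = 0; i < xs.length; i++) {
            sum += Math.sqrt(xs[i] * xs[i] + ys[i] * ys[i]);
        }
        return sum;
    }
}

public class SoADemo {
    public static void main(String[] args) {
        Points p = new Points(3);
        p.xs[0] = 3; p.ys[0] = 4;  // distance 5
        p.xs[1] = 6; p.ys[1] = 8;  // distance 10
        System.out.println(p.sumOfDistancesFromOrigin()); // prints 15.0
    }
}
```

Value types aim to make the straightforward `Point[]` version perform like the manual layout above.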
Yup, the JVM is something that people (esp. startups & new-co's) are, all things being equal, vastly underusing or over-complaining about :).
Before I started doing Clojure, the JVM ecosystem was this "scary" thing, but 4 years later I've learned to appreciate it a lot - stable AND sound libraries, high performance out of the box, and very, very high performance if you really need it.
There's absolutely nothing about native languages that makes them a priori faster than the JVM. On the contrary, the JVM's compiler is likely to emit better-optimized machine code than their compilers. Some low-level languages (like C/C++/Rust) can (and do in many cases) offer better performance, but that's not because they're "native" but because they allow (and require) finer control over every operation. This level of control comes at a significant cost, though.
What Clojure and even Scala do with the JVM scares me. The most concrete example I know of is using objects for everything. One of the fundamental problems I have with many of the higher-level JVM languages is how much they hide from you. That being said, most people won't notice. I do numerical software in Java, so I might be a bit more sensitive to some of this stuff.
It can though. We are a java shop that does HPC and deep learning. Our work is open source. We've been around 2.5 going on 3 years now. Over the years we've surpassed bottleneck after bottleneck in the JVM.
The speed gains we've seen are massive. Java can't compete with good ole SIMD and the like for numerical computing.
We are big advocates of the JVM as a platform, but let's be clear about its weaknesses. You need to use Unsafe (see: netty, aeron, and any other low-level database written in Java, or using Java as the core language with some C++) and other tricks out of the box for real performance.
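For readers unfamiliar with the off-heap idea those libraries build on, here's a tame sketch using only the standard java.nio API (netty and aeron go much further with sun.misc.Unsafe, which this deliberately avoids):

```java
import java.nio.ByteBuffer;

public class OffHeapDemo {
    public static void main(String[] args) {
        // allocateDirect places the buffer outside the Java heap, so
        // the GC never scans or copies its contents. This is the mild,
        // standard-API version of the trick netty and aeron push much
        // further with sun.misc.Unsafe.
        ByteBuffer buf = ByteBuffer.allocateDirect(16);
        buf.putLong(0, 42L);                // absolute put: no allocation
        System.out.println(buf.getLong(0)); // prints 42
        System.out.println(buf.isDirect()); // prints true
    }
}
```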
I'm willing to bet that 99% of the performance difference will be solved by value types. Anyway, Cliff Click, who wrote H2O[1], a large machine learning platform, reports achieving Fortran speeds (i.e. maximum throughput) with pure Java[2]. This means that a C application won't even be 1% faster.
> Java can't compete with good ole SIMD and the like for numerical computing.
Oh, sure, for some specialized use cases of course that's true, but you could use OpenCL in Java, too. With 10+ years with C/C++ and 10+ years with Java, I'd bet on Java when it comes to performance bang-for-the-buck almost in every case (given that it's a large app), and even more than that the more concurrent, complex and unpredictable the app is (but this requires carefully looking at the design).
Value types will make Java competitive in absolute terms in more and more domains. Also, with the new JIT (Graal) you can control machine-code generation at whatever level of detail you want.
> You need to use Unsafe (see: netty, aeron, and any other low-level database written in Java, or using Java as the core language with some C++) and other tricks out of the box for real performance.
They're not using unsafe for throughput but (mostly) for latency. That's a whole other matter. Also, some of those use "mechanical sympathy" as a driver of performance instead of algorithms that the JVM makes easier. I've built a concurrent DB in pure Java that relies on synchronization that would require at least double the effort if I were to write it in C (hazard pointers, etc.), and may not even have better performance.
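As a sketch of the kind of synchronization the GC makes easier (this is the classic Treiber stack, an illustration rather than code from the DB mentioned above): in Java the lock-free algorithm is a few lines, because the collector reclaims popped nodes safely; a C version needs hazard pointers or epoch-based reclamation to avoid use-after-free and ABA problems.

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal Treiber stack. It is this short in Java precisely because
// the GC guarantees a popped node can't be freed while another thread
// still holds a reference to it; no hazard pointers required.
public class TreiberStack<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    public void push(T value) {
        Node<T> n = new Node<>(value);
        do {
            n.next = head.get();                  // snapshot current top
        } while (!head.compareAndSet(n.next, n)); // retry on contention
    }

    public T pop() {
        Node<T> h;
        do {
            h = head.get();
            if (h == null) return null;           // empty stack
        } while (!head.compareAndSet(h, h.next)); // retry on contention
        return h.value;
    }

    public static void main(String[] args) {
        TreiberStack<Integer> s = new TreiberStack<>();
        s.push(1); s.push(2);
        System.out.println(s.pop()); // prints 2
        System.out.println(s.pop()); // prints 1
    }
}
```

Note that ABA on `head` can't bite here: each `Node` is a fresh object, and the GC won't recycle its address while any thread can still compare against it.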
>> I'm willing to bet that 99% of the performance difference will be solved by value types. Anyway, Cliff Click, who wrote H2O[1], a large machine learning platform, reports achieving Fortran speeds (i.e. maximum throughput) with pure Java[2]. This means that a C application won't even be 1% faster.
I've had a personal conversation with Cliff himself. Java, no matter what you do, can't deal with hardware acceleration and GPUs. We agreed on that. Numerical software is a different beast. I also kept mentioning SIMD instructions, as well as things like OpenMP.
You are talking about systems software. Unfortunately, that matters a lot for machine learning. The axis along which you can get equivalent speeds should be specified here. "Value types" != "runs on faster chips".
You aren't likely to beat Intel's or Nvidia's compilers at their own game here. Java will always be playing catch-up to last gen's tech there.
Disclosure: I'm more than aware of what's going on in the space. We compete with them for customers and have a very clear understanding of their offerings. H2O has a great k/v store based on the exact mechanics you're talking about. That's about it, though. Also of note: Cliff doesn't work on H2O anymore: https://twitter.com/cliff_click/status/700817408110399492
>> Oh, sure, for some specialized use cases of course that's true, but you could use OpenCL in Java, too....
OpenCL isn't exactly the industry standard for this stuff. You always end up using CUDA, and you always end up dropping down to C. There's just no way to avoid that if you want the fastest out there.
I agree with you on the last part, but I keep mentioning "numerical software" for a reason. There are certain things the JVM is good at; writing a database and systems software is one of those things. There are still bits of HDFS in C++, though. I don't think you'll be able to get around having bits of your code in C, which is what I emphasize here.
I've also had a personal conversation with Cliff about this (I know he's no longer with H2O), and he isn't (or, at least wasn't when I spoke to him about a year ago) aware of what's going on in Graal. Now, I completely agree that for specialized stuff, it's likely you'll get the most out of specialized compilers, but I'd closely watch the Graal space for specialized code generation. I had a talk with John Rose about how best to combine Graal with the work on VarHandles[1] and Panama[2], precisely to address things like SIMD. There's also interesting work done using Graal to directly emit GPU code (from Java code) for streams, but that's largely experimental at this stage (I think AMD did some work on that and then abandoned it).
We wrote a lot of that stack ourselves. And our compute code is in C++. There is a lot of stuff going on in big-data land (including us) wrapping a lot of the C++ in Java and enabling people to use the JVM for what it's good at (data access). We are pushing this with Nvidia:
https://blogs.nvidia.com/blog/2016/10/06/how-skymind-nvidia-...
I also work pretty closely with a lot of the spark/gpu folks at IBM.
While I do largely agree with you, we have our own JNI compiler called javacpp that alleviates a lot of those concerns already:
https://github.com/bytedeco/javacpp
Having our own pointer class and doing our own memory management has helped a lot.
What you're talking about is using Java for the compute.
There's no reason you can't wrap that in a runtime that most people know how to use. A lot of Python folks do that now, but then you have to deal with Python's limitations, at which point the majority of your code (way more than needs be) will end up being in C anyway, vs. Java, where you can write a significant part of your app in Java and have it be fast out of the box.
I am mostly a line-of-business developer doing enterprise consultancy and that is how we use C++, just as infrastructure language when either JVM or .NET stacks need a bit of outside help.
Just wondering whether sometimes a 100% C++ solution would be a better approach than the added integration effort it requires. On the other hand, without people like you guys we wouldn't have access to nice tooling in Java for similar work, so congratulations on the work thus far and all the best for the project.
> It's fairly easy to get C performance (and even beat it in concurrent code) for the same amount of effort on the JVM.
Ha.
That's what we were promised back in the mid-1990s. It's no more true today than it was then. Well written Java will always be about 2-4x slower than similar C code.
JVM applications have many compensating advantages, including the aforementioned ease of exploiting concurrency. That doesn't erase the reality of single thread performance where Java has never, in twenty years of strong expert effort, caught up.
Agreed, in that the performance benefits you can get out of a higher-level language often show up where the limits of human cognition are the real performance boundary. As computing gets faster in general, the practical considerations should shift more towards the performance you can get out of an entire system than unit-for-unit performance. On that level, human comprehension often seems to be what needs to be accounted for.
I think of it as a problem akin to making traffic run better in an entire city rather than optimising the performance of the engine in a single car. It doesn't invalidate making more performant engines, but it's a different level of consideration.
When saying, "JVM is as performant as C", it's easy to run the numbers and see whether it's true or not, objectively. The caveat that GP throws in is, for the same amount of effort. Then you need to specify which human and under what circumstances, and the entire thing gets hairier.
> As computing gets faster in general, the practical considerations should shift more towards the performance you can get out of an entire system than unit-for-unit performance.
Those days are over, unfortunately. Computing stopped getting faster in general around 2010. And that also changes the calculus around single thread performance. You can no longer count on hardware eventually to solve performance problems.
Which is precisely why it matters that I can write functionally pure code that is trivial to parallelize. At $FORMER_WORK, I even wrote (the same!) code as reasonably idiomatic-looking Clojure (essentially, reducer fns) that transparently runs locally single-threadedly, multi-threadedly, and on Hadoop.
That C program might be fast, but it's not a great tool for processing petabytes of data on a stampeding herd of angry elephants. So, I think performance arguments need a little nuance about what you're doing.
(My experiments include soft real time with deadline scheduling. Clojure's fine.)
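For Java readers, a rough analogue of those reducer fns (a sketch of the idea, not the Clojure code itself): a stream pipeline whose reduction logic is identical whether it runs single-threaded or across cores; only the execution strategy changes.

```java
import java.util.stream.LongStream;

public class SumDemo {
    public static void main(String[] args) {
        // Identical pipeline, two execution strategies: the reduction
        // doesn't change when we go parallel, only the scheduling does.
        long sequential = LongStream.rangeClosed(1, 1_000_000).sum();
        long parallel   = LongStream.rangeClosed(1, 1_000_000)
                                    .parallel()
                                    .sum();
        System.out.println(sequential == parallel); // prints true
        System.out.println(sequential);             // prints 500000500000
    }
}
```

The Clojure/Hadoop version extends the same principle one tier further, to a distributed cluster.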
I can't remember the source, but I remember reading about a case where someone replaced a large hadoop cluster with one node running highly tuned C. Distributed computing comes with a lot of overhead and you might be surprised at just how much you can get out of a highly tuned C application.
And this is why parallel/concurrent computing and applications are necessary to achieve speedups. And that's what the new languages promise to let you get right. For the same effort.
> Well written Java will always be about 2-4x slower than similar C code.
Nope. A large, concurrent app is likely to be faster in Java given similar effort (of course, given enough effort -- which may be double -- C will eventually surpass that, sometimes even significantly, depending on usage). Currently the main bottleneck, which makes the above statement very dependent on application type, is lack of value types, and that's being addressed. In small sequential apps, there will be a significant advantage to C, which diminishes with the size of the app. The reason is that as the app grows, it gets harder to write manual optimizations while keeping the code modular and maintainable, while HotSpot can do all sorts of optimizations even with nice abstractions.
I "kinda" agree with you. Java has the right stability and speed trade offs and can be wicked fast. It beats the crap out of most garbage collected languages save maybe .net and the CLR.
For real applications, I agree with you that C is the way to go. Java with C and off-heap memory gets you a pretty long way.
> This, thankfully, is being addressed by the addition of value types.
Last I heard (JLS 2016 keynote by Brian Goetz) they have yet to commit/guarantee when value types will land. Java 10 is planned; if they're completed by then we're looking at probably 2020 given the delays with Java 9 release.
That's 3 years for other languages/platforms to evolve while the JVM unboxes itself.
You don't need to specify what type a function takes in Clojure, which is often nice. If you do, though, the compiler can sometimes optimize and make things much faster. Basically, the compiler does not have to
1. look up the type of the thing, then
2. run the action on it,
but can instead skip straight to step 2.
If you run (set! *warn-on-reflection* true) somewhere in your Clojure program (or REPL), it will print to the console whenever you are doing reflection.
I think it's a good idea to put it at the beginning of your core namespace, so that it is always on. Why not always wear the seatbelt when driving, you know?
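What that warning is saving you from can be sketched in plain Java: an un-hinted interop call falls back to something like the reflective path below (runtime method lookup plus a boxed return value), while a hinted call compiles to the direct invocation the JIT can inline.

```java
import java.lang.reflect.Method;

public class ReflectionDemo {
    public static void main(String[] args) throws Exception {
        String s = "hello";

        // Direct call: the type is statically known; the JIT can
        // inline this to nearly nothing.
        int direct = s.length();

        // Reflective call: runtime lookup, access checks, and a boxed
        // Integer return value - roughly what un-hinted interop costs.
        Method m = String.class.getMethod("length");
        int reflective = (Integer) m.invoke(s);

        System.out.println(direct == reflective); // prints true
        System.out.println(direct);               // prints 5
    }
}
```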
Correct, a lot of the average application-level code you see in the wild does not have to worry about this. If we're talking about things like load-balancers or databases or endpoints/functions that will be hit 1000s of times per second - then it's a diff. conversation.
I think you meant to say,
(defn func [o n] (.write ^WriteInterface o ^long n))
This is needed only when calling Java methods, because Java allows overloading. Clojure only does arity-overloading, not type overloading, and hence doesn't need the type info.
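The overloading point, in plain Java (the write methods here are illustrative, not the interface from the snippet above): with two overloads differing only in parameter type, the call site's static type decides which one is invoked; that static type is exactly the information a Clojure type hint supplies.

```java
public class OverloadDemo {
    // Two overloads differing only in parameter type: the compiler must
    // know the argument's static type to resolve which one is meant.
    static String write(long n)   { return "write(long): " + n; }
    static String write(Object o) { return "write(Object): " + o; }

    public static void main(String[] args) {
        long n = 42L;
        Object boxed = n; // same value, different static type
        System.out.println(write(n));     // prints write(long): 42
        System.out.println(write(boxed)); // prints write(Object): 42
    }
}
```

Without a hint, Clojure can't know the static type at the call site, so it has to resolve the overload reflectively at runtime.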
No, it was a convoluted way to say that attempts to measure C programs by how "idiomatic" they are will be unlikely to bear any fruit.
That said, if there is any non-idiomatic C code, then the attempts of 'experts' in other languages to mangle C to look more like their language surely fall into this category. (There are lots of examples where the author starts with 'We can make C look more like Fortran by defining our macros to replace these keywords...'.)