That's how you deal with that problem with Java as well: arrays of primitives. I...

0815test · on July 10, 2019

> That's how you deal with that problem with Java as well: arrays of primitives.

It just goes to prove that you can indeed write FORTRAN in any language!

llukas · on July 10, 2019

This is actually a feature - python array slicing syntax is very similar to fortran.

rcfox · on July 10, 2019

It will depend on your application, but you might also consider just one array of [x, y, x, y, x, y, ...] to improve cache locality.

cesarb · on July 10, 2019

Also known as the "Struct of Arrays" approach.

wtallis · on July 10, 2019

"Struct of Arrays", to be contrasted with "Array of Structs", because in Java it's actually "Array of pointers to Struct". In languages where "Array of Structs" is actually possible, the decision of which to use is less clear-cut and depends on how big the Struct is, what the access patterns are like, and whether you're trying to perform SIMD operations on multiple Structs at once.

vbezhenar · on July 10, 2019

That's not only about pointers. Every Java object incurs an overhead of around 16 bytes (architecture and JVM dependent, of course, but it's there).

corysama · on July 10, 2019

Several times in Python I've used the built-in CTypes library to cut down the memory footprint by using giant Arrays of (literally C) Structs.

danudey · on July 11, 2019

I was bored and had time on my hands, so I played around a bit with some of the suggestions in this thread, but it seems as though a naive ctypes approach is the worst possible approach in the two cases I tested (arrays of integers, arrays of x,y struct).

Here are my code and results:

https://gist.github.com/danudey/94b5442b617a734a2366e824be15...

Is there an approach that you use that produces better results than this? Because my naive approach is unmanageably worse for both performance (not recorded) and impact on Python's memory usage.

adrianratnapala · on July 11, 2019

> Your code won't be pretty,

Depends on the algorithms that use that code. Two separate arrays is not a bad way to represent a vector-of-tuples. And so if you think of everything as a vectorised operation (and if that makes sense for your use case) then the code can be clean enough.

Efficiency-wise it depends on how the locality of reference falls out. There are cases where it's more cache-efficient to store the pairs next to each other, but again for big vectorised operations you might lose nothing by using two cache-ways instead of one.

nullwasamistake · on July 10, 2019

Or you can just use a more memory efficient language like Java. Python and JS use tons of memory compared to strongly typed VM languages like C#, Go, Java. People lament Java memory usage but Python/Node use several times more

whatshisface · on July 10, 2019

>People lament Java memory usage but Python/Node use several times more

The people who lament Java's memory usage never complain about Python or Ruby because... if you're worried about the JVM you would never consider Python!

nullwasamistake · on July 10, 2019

I'm mostly referring to those that say the JVM is too "resource heavy" then go on to use JS/Python/Ruby instead, which have much higher runtime overhead.

In my experience many dev think the scripting language heritage means these languages have smaller/lighter runtimes

drej · on July 10, 2019

Not that I want to go down the rabbit hole, but I probably fall into the category of “JVM is heavy, but I use Python”

Sure, memory consumption and CPU perf are both pretty bad in Python, but latency and memory footprint of the runtime itself are pretty good, so it’s ideal for tooling, crons, lambdas etc. JVM is comparatively efficient once you have the JVM ready, but that sure takes tons of resources.

I’m hoping Graal changes this. I don’t like JVM-based languages, but I do like technological progress.

alexhutcheson · on July 10, 2019

Python startup time can be significant for non-trivial programs as well, and it's been a big problem for Mercurial and other projects. For example, see these threads:

- https://mail.python.org/pipermail/python-dev/2014-May/134528...

- https://mail.python.org/pipermail/python-dev/2018-May/153296...

If your script is expected to be run interactively on a frequent basis, then Go, Rust, C++, or even Bash (for simple stuff) will give you much lower user-perceived latency than Python.

nullwasamistake · on July 10, 2019

It's fairly easy to use Class Data Storage and include an archive in Java 12 where startup time is important. This reduces start time significantly.

JVM starts really quick (~100ms on my machine) and doesn't use much resources as long as your app is small. that's... Uncommon in Java land though. Even simply apps pull in Guava/Apache Commons and a few client libraries. This can easily be thousands of classes. Nobody thinks about it because runtime cost for loading shitloads of code is so low. But you can improve this a ton by using ProGaurd and stripping out stuff you don't need

mr_crankypants · on July 10, 2019

It really depends.

If I can do it in numpy, I can probably get a lighter, faster implementation with Python than in Java. More generally, if I can do it with a Python package that's actually a fairly thin wrapper around a C, C++ or Fortran library, then Python also has a decent chance of being the easy winner.

If none of those situations apply, then yeah, typically Java ends up being more efficient.

fnord123 · on July 10, 2019

I have seen completely the opposite in the wild. In fact I've only seen it so far in the opposite direction (Java consuming far more memory than Python) that I have to ask what on Earth you are doing with Python to have found yourself in such a position.

(I've never successfully used Node so I have no idea what that's like in memory use.)