Erlang Garbage Collection Details and Why It Matters (hamidreza-s.github.io)
66 points by yinso on Aug 25, 2017 | 24 comments


Very nice article. Good resource for memory handling in Erlang.

There is a recent config setting, available since OTP 19.0, to move the message queue outside the main process heap (the message_queue_data process flag, set to off_heap). I have seen decent performance gains from that.

Sometimes, for fun, I like to describe a running Erlang VM as a modern OS: processes have isolated heaps and can be GC-ed independently. Then think about programming in a language with a shared heap: it is a bit like running mission-critical code on Windows 95. You wouldn't do that in 2017, but somehow we still do it at the language level.

Now Rust lets us have both speed and memory safety, and it enforces these guarantees at compile time. I think that is one of the more exciting recent developments in programming languages.


[flagged]


Please don't violate the guidelines like this.

https://news.ycombinator.com/newsguidelines.html


Heh, it is right on the border where half the people will consider this good sarcasm and the other half will consider it a disparaging comment.


Loop me in, what baggage and what drama?


Can I ask why you mention Rust? By that I mean, what prompted you to write about Rust here and now?


Memory safety. I made the point that the Erlang VM puts memory safety front and center by using isolated heaps, and from there made a comparison between operating systems and language platforms. In the end I pointed out that Rust accomplishes memory safety as well, but in a different way: via a compiler check.


I assume because of the correlation (s)he is drawing between memory safety facilities that lend themselves to reliable, long-running software, and that the next-generation of such software might be written in Rust based on those facilities being at compile time?

Though in this context---if we were to include more than that one dimension of the memory model---pony-lang seems like it might be the more Rust-like w/ compile time guarantees, but with a general programming model of intercommunicating asynchronous processes included.


Erlang is completely different from Rust (or C, Ada, FORTRAN).

Why, in a thread about (an amazing language that is) Erlang, would you talk about other languages? Especially when they are nothing like Erlang.


The majority of the post was about a cool/interesting way Erlang does something and praising the linked article.

The very last statement was a passing commentary about something that seemed conceptually correlated to the commenter.

I'm really confused by the hyper-sensitivity. With the exception of Javascript conversation threads--- where they're overwhelmed by React vs. X---almost all other language focused threads on HN are full of people making comparison and contrast statements to different things.


If I went into every thread about Rust and mentioned how Ada is great (or any other language, including C), I'd get downvoted. Why is that? "Hyper-sensitivity"?


I've not done that in every thread on Rust, but I have made similar comments in Rust threads about areas where Ada has it beat with regard to some parts of the type system and compile-time guarantees and checks. If I was ever downvoted, it was not below base post karma (1).


> I went into every thread about Rust and mentioned how Ada is great (or any other language, including C), I'd get downvoted.

If you just mentioned that Ada is great without a reason, you might get downvoted or just asked to provide support for your statement. If you provide a reason, or compare some of the features Ada has and how they might be better in some cases, I don't think you would be. Compared to other forums, I think HN is pretty good about that.


It is not completely different. Both Erlang and Rust emphasize safety at the top of their feature lists. That is in contrast with most other languages and platforms.


It is COMPLETELY different from rust. Not only in the syntax, but also in the things it does and how it does them.

http://learnyousomeerlang.com/content

A lot of languages "emphasize safety at the top of their feature lists". By a lot i mean A LOT. If you have to compare, Erlang is similar to Haskell. Rust is nothing like erlang.


> COMPLETELY different from rust. Not only in the syntax

Why bring up syntax as the first thing? What does that have to do with memory safety?

> By a lot i mean A LOT.

You mentioned Haskell and Ada so far. Which I agree with. What are the other ones?


I also mentioned FORTRAN.

A 10min investigation would bring fruit to your curiosity. But wth..

Lisp, Scheme, Clojure, Modula and Smalltalk are the ones that come to my mind now that offer ONLY safe data access (Modula-3 has an "unsafe" keyword, but only for talking to outside code). Actually R, awk, bc, shell and such are also like that, but not really general purpose. Languages that are "memory safe" by default, but include "unsafe" memory access (that come to my mind now) are C#, F#, C++ and such (Rust goes into this "category").

All functional languages are, AFAIK, "memory safe" by default (well.. you could make a functional language that lets you go out of bounds, but that wouldn't be a "pure" functional language).

In fact, as far as I know, there are more programming languages that ARE "memory safe" than ones that are not. One could just go over the Wikipedia list [0] and.. list them out.

[0] https://en.wikipedia.org/wiki/List_of_programming_languages

While somewhat on the topic of functional vs "Turing" languages, here's a couple of videos. https://www.youtube.com/watch?v=eis11j_iGMs https://www.youtube.com/watch?v=RPQD7-AOjMI

So.. the only thing that prompted you to talk about Rust is memory safety ?

EDIT:

> Why bring up syntax as the first thing? What does that have to do with memory safety?

Because the question I asked was "Can I ask why you mention Rust? By that I mean, what prompted you to write about Rust here and now?" and your answer was (paraphrased) "they both emphasize memory safety", as if that was the only defining factor of a programming language.


> C#, F#, C++

Ok if C++ is memory safe, what's a memory unsafe language then?

If concurrency units (threads, co-routines) share a heap, they are not memory safe. In Rust they do, but Rust provides compile-time checks to make sure access is safe.

awk and bc are not production-level languages. I think you're being disingenuous there.
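To make the Rust half of that concrete, here is a minimal sketch (the function name `shared_counter` and the counter workload are invented for illustration). Several threads share one heap allocation, and the only way the compiler lets them reach it is through `Arc` and `Mutex`:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each add 1 to a shared counter `iterations` times.
/// Hypothetical helper, written only to illustrate the point above.
fn shared_counter(workers: usize, iterations: u64) -> u64 {
    // One heap allocation, shared by every thread through reference counting.
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..iterations {
                    // The value is only reachable while holding the lock.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total = {}", shared_counter(4, 1_000));
}
```

Swapping `Arc<Mutex<u64>>` for `Rc<RefCell<u64>>` should make `thread::spawn` fail to compile, because `Rc` is not `Send`. That rejection is the compile-time check in question; Erlang avoids the problem instead by not sharing the heap at all.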

> the only thing that prompted you to talk about Rust is memory safety ?

Yes

> they both emphasize memory safety", as if that was the only defining factor of a programming language.

If you know anything about both languages, you know that's at the top of their feature lists, as I mentioned.

> I also mentioned FORTRAN.

Ah interesting. So initializing a variable in FORTRAN would put that in thread local / separate heap?


> awk and bc are not production-level languages. I think you're being disingenuous there.

First of all, we are talking about programming languages here, not about how widely they are used in "production". I know that awk is used in one giant company to do an important thing (aka dealing with money). It wouldn't surprise me to find that bc is also used in "the industry".

On that note, the TIOBE index lists Java, C, C++ and C# as being the most widely used languages. Awk is number 33, just below Ada, Prolog and Erlang. Rust is not even on the list, a list where assembly of all things holds 14th place. (bc is in the same group as Rust there.)

Erlang is also used for plenty of "heavy" things, notably in the telecommunication and banking industries. Haskell is used by Facebook for their spam filter. WhatsApp is written in Erlang. https://www.youtube.com/watch?v=LnX3B9oaKzw

> If you know anything about both languages that's at the top of their features list, which I mentioned.

Not everything is about "features".

> If concurrency units (threads, co-routines) share a heap they are not memory safe.

If you want to talk about concurrency you should mention "Communicating sequential processes", from which Rust took some wisdom, but Go (as a popular example) went all in. Functional programming (lambda calculus) also lends itself extremely well to "threading", intrinsically so.

I think that discussions are about learning things, clarifying things and such "nonsense". But this discussion seems to be all about "winning at any cost". So.. goodbye.


Best Erlang design decision: copying messages between process heaps.

Con: a small performance cost. Pro: a far simpler system in every other aspect. Garbage collection and ownership are much easier when you are making copies of data.
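A rough sketch of that trade-off, written in Rust rather than Erlang (so it is an analogy, not BEAM code; `copy_and_send` is a made-up helper). The sender pays for one deep copy, and in exchange neither side ever has to care what the other does with its data:

```rust
use std::sync::mpsc;
use std::thread;

/// Sends a copy of `original` to a worker thread, which mutates its own copy
/// and sends it back. Returns (sender's data, worker's data).
/// Hypothetical helper, used only to illustrate copy-on-send messaging.
fn copy_and_send(original: Vec<u8>) -> (Vec<u8>, Vec<u8>) {
    let (to_worker, worker_inbox) = mpsc::channel::<Vec<u8>>();
    let (to_main, main_inbox) = mpsc::channel::<Vec<u8>>();

    let worker = thread::spawn(move || {
        // The worker owns this buffer outright; nobody else can see it.
        let mut message = worker_inbox.recv().unwrap();
        message.push(0xFF);
        to_main.send(message).unwrap();
    });

    // The deep copy is the "small performance cost": the bytes are duplicated,
    // so the sender's buffer and the message have independent lifetimes.
    to_worker.send(original.clone()).unwrap();

    let reply = main_inbox.recv().unwrap();
    worker.join().unwrap();
    (original, reply)
}

fn main() {
    let (mine, theirs) = copy_and_send(vec![1, 2, 3]);
    println!("sender still has {:?}, worker produced {:?}", mine, theirs);
}
```

Because each side owns its own copy, freeing one never requires looking at the other, which is roughly why per-process garbage collection stays simple.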


Because of immutable data, it is really an implementation detail. So it would be possible to switch implementations and, at the cost of extra complexity, have messages that are not copied.

I even seem to recall there was a historical implementation of the Erlang VM which didn't copy the messages, or maybe it was a port to an alternate VM like JVM...


> Con: a small performance cost.

That entirely depends on the amount of data that needs to be copied (!)

In many cases, it pays off to actually use that intricate shared memory system when possible. Yes, you lose some predictability, but you gain a lot of performance.


Note that binaries larger than 64 bytes are indeed stored in shared memory and reference counted. This includes UTF-8 strings (as opposed to charlists), which are binaries in Erlang. See http://erlang.org/doc/efficiency_guide/binaryhandling.html#r... for more info
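For readers who have not met reference-counted buffers, here is a small Rust analogy (not how the BEAM implements it; `share_large_buffer` and the 1024-byte size are illustrative assumptions). Handing the buffer to another thread copies a pointer and bumps a counter, and the bytes themselves never move:

```rust
use std::sync::Arc;
use std::thread;

/// Hands one large buffer to `readers` threads without copying its bytes.
/// Returns the number of owners while all handles are alive, plus each reader's sum.
/// Hypothetical helper, used only to illustrate reference-counted sharing.
fn share_large_buffer(readers: usize) -> (usize, Vec<u64>) {
    // A single immutable allocation of 1024 bytes, each set to 1.
    let buffer: Arc<[u8]> = Arc::from(vec![1u8; 1024]);

    // Each clone copies a pointer and increments the count; no bytes are copied.
    let views: Vec<Arc<[u8]>> = (0..readers).map(|_| Arc::clone(&buffer)).collect();
    let owners = Arc::strong_count(&buffer); // the original plus one per reader

    let handles: Vec<_> = views
        .into_iter()
        .map(|view| thread::spawn(move || view.iter().map(|&b| u64::from(b)).sum::<u64>()))
        .collect();

    let sums: Vec<u64> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    (owners, sums)
}

fn main() {
    let (owners, sums) = share_large_buffer(3);
    println!("owners = {}, sums = {:?}", owners, sums);
}
```

The buffer is freed when the last handle is dropped. Since the data is immutable, readers need no locks, which is what makes this kind of sharing cheap compared with general shared mutable state.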


In Erlang programs, the amount of data to be copied tends to be rather small. You'd much rather bring the computation to the data than send the data around for computation.

If you are storing a lot of data, you either tend to keep it in ETS, which has its own storage engine (and you copy the small parts of data you work on in/out of the ETS system), or as binary data, which is tracked by an ARC-like system in its own arena/heap.

Of course, if you systematically abuse the system and ask it to copy several megabytes around all the time, your efficiency will suffer. But note that this is bad form: yes, copying is faster locally, inside a given node, but once you start running true distribution, you don't want to copy megabytes of data between nodes anyway.

Shared memory tends to imply a lot of locks, which can bring a lot of contention as well. It is far more complex than just a general rule which says "copying is always bad". And this was my main point, more or less. It is a mistake to look only at copying's bad behaviour, without ever wondering what it brings to the table as a result.


I'm not saying copying is always bad. I'm saying that there are cases where you really want sharing, and therefore your language should allow it.



