Hacker News

In Elixir / Erlang, you can put a socket in either active or passive mode. In passive mode, you need to explicitly call recv/2 or recv/3 to get data, very similar to a traditional socket API. This is what this code appears to be doing.

But if you want better performance, you use active mode. In active mode, the runtime is receiving data on your behalf, as fast as possible, and sending it to the socket's owning process (think a goroutine) just as fast. Data is often waiting there for you, not just already in user space, but in your Elixir process's mailbox. (Also, this doesn't block your process the way recv/2 does, so you could handle other messages to this process.)

You could imagine doing something similar with 2 goroutines and a channel. Where 1 goroutine is constantly recv from the socket and writing to a buffered channel, and the other is processing the data.

One problem with active mode, and to some degree with how messages work in Elixir in general, is that there's no back pressure. Messages can accumulate much faster than you're able to process them. So, instead of active mode, you can use "once" mode or {active, N} mode (once being like {active, 1}). In these modes you get N messages sent to you, at which point the socket flips to passive mode so you can recv manually. You can put the socket back into active/once mode at any point in the future.



You could implement backpressure either by adding a buffer (which you already have here) or by rejecting requests, optionally with more data on how hard and how long to back off.


If you can tolerate (debounced) data loss in the buffer, a ring buffer works really well with predictable memory and performance.


Since you relate this example to Go, would you mind sharing thoughts about how heavy I/O from network sockets compares between the two or gotchas that might not be apparent to Erlang/Go developers about the other?


I think they're both relatively straightforward, with the biggest difference being active mode. (I'll use "Elixir", but it all applies to Erlang as well.)

Elixir sockets can be safely shared between processes, which might not be obvious to an Elixir programmer.

To the best of my knowledge, Elixir will use writev where possible, so it's iolist friendly and can be extremely efficient.

Binary pattern matching is a productivity boost when it comes to networking work.

Every Elixir socket is associated with a "controlling_process". This creates a link between the socket and the process. The linked article uses it. You generally don't want this to be the acceptor loop, since if the acceptor loop crashed, it would close every socket. Fun fact, I believe earlier versions had bugs / race conditions with respect to changing the controlling_process while data was incoming. This has since been fixed, and things "just work" like you expect, but I can only imagine that it involved some coding gymnastics to fix.

Since Elixir sockets are more abstracted, there are more knobs you can turn and tweak, e.g., the buffers that Erlang uses to read and write. There are also built-in parsers for common formats, including simple things like automatically sending and receiving 1, 2, or 4 byte length-prefixed messages.

Elixir has two socket APIs. The traditional gen_tcp, and a new one, socket, which is meant to be "as close as possible to the OS level socket interface." I haven't tried the new one yet.


Is there a comparison somewhere of the performance of active versus passive mode? I don't imagine the difference is significant for most use cases, so the extra safety seemed worth it to me in all the libraries I wrote, though I suppose it wouldn't be hard to rewrite those with {active, 1}.


https://stressgrid.com/blog/cowboy_performance_part_2/

Might be interesting. Amongst other things, they experimented with a patch to Cowboy to use {active, 100} instead of the default {active, 1} and got some nice wins.


Ranch is a pretty well optimized and battle-hardened TCP acceptor. It powers the Cowboy/Phoenix server, which scales to extreme levels of concurrency with low latency. Cowboy uses Ranch to pool and accept connections, and I believe it uses {active, once}.

https://github.com/ninenines/cowboy

https://github.com/ninenines/ranch


Using active mode is nice because you can receive other messages as well, instead of blocking to receive just a TCP message. I like active once myself.


You can time out your recv, even with a timeout value of 0!



