
Good hunch. On my machine (13900k) & zig 0.11, the latest version of the code:

> INFILE="$(mktemp)" && echo $INFILE && \
>   echo '60016000526001601ff3' | xxd -r -p > "${INFILE}" && \
>   zig build run -Doptimize=ReleaseFast < "${INFILE}"

> execution time: 27.742µs

vs

> echo '60016000526001601ff3' | xxd -r -p | zig build run -Doptimize=ReleaseFast

> execution time: 27.999µs

The idea that the overlap of execution here by itself plays a role is nonsensical. A more plausible explanation is that overlapped execution plus reading one byte at a time causes kernel mutex contention, although I'd expect someone more knowledgeable (and more motivated) about capturing kernel perf measurements to confirm. If this is the explanation, I'm kind of surprised that there isn't a lock-free path for pipes in the kernel.
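The syscall blow-up from one-byte reads is easy to see without zig at all; as a rough stand-in for the unbuffered reader, dd with bs=1 forces one read(2)/write(2) pair per byte:

```shell
# Stand-in for the unbuffered reader: bs=1 makes dd issue one read(2)
# (and one write(2)) per byte, so 1 MB of pipe traffic costs ~2M syscalls.
head -c 1000000 /dev/zero | dd bs=1 of=/dev/null 2>/dev/null

# Buffered equivalent: the same transfer in roughly 16 reads.
head -c 1000000 /dev/zero | dd bs=65536 of=/dev/null 2>/dev/null
```

On most machines the bs=1 run is orders of magnitude slower even though both move the same megabyte, which is the kind of gap the kernel-side pipe locking hypothesis would have to explain.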



Based on what you've shared, the second version can start reading instantly because "INFILE" was populated in the previous test. Did you clear it between tests?

Here are the benchmarks before and after fixing the benchmarking code:

Before: https://output.circle-artifacts.com/output/job/2f6666c1-1165...

After: https://output.circle-artifacts.com/output/job/457cd247-dd7c...

What would explain the drastic performance increase if the pipelining behavior is irrelevant?


That was just a typo in the comment. The command run locally was just a straight pipe.

Using both invocation variants, I ran:

8a5ecac63e44999e14cdf16d5ed689d5770c101f (before buffered changes)

78188ecbc66af6e5889d14067d4a824081b4f0ad (after buffered changes)

On my machine, they're all equally fast at ~28 µs. Clearly the changes only had an impact on machines with a different configuration (kernel version, kernel config, xxd version, or hardware).

One hypothesis outlined above is that when you pipeline all 3 applications, the single-byte-reader version is doing back-to-back syscalls, and that's causing contention between your code and xxd on a kernel mutex, leaving things asleep longer than necessary.

It's not a strong hypothesis though, given how little data there is and the fact that it doesn't repro on my machine. To get a real explanation, I think you have to actually take profiling measurements on a machine that can repro and dig in until you have a satisfying explanation of what exactly is causing the problem.
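A starting point for those measurements, sketched with strace. Here `cat` is only a stand-in for the consumer; anyone reproducing would point strace at the zig-built binary instead:

```shell
# Per-syscall counts and cumulative kernel time for the consumer side.
# The unbuffered build should show one read(2) per input byte.
echo '60016000526001601ff3' | xxd -r -p | strace -c -e trace=read cat > /dev/null

# Tracing the whole pipeline (-f follows the forked children) shows
# where kernel time goes if pipe wakeups really are the problem.
strace -f -c sh -c "echo '60016000526001601ff3' | xxd -r -p | cat > /dev/null"
```

If the mutex-contention hypothesis holds, the `-c` summary on a machine that repros should show read(2) time dominated by a small number of calls with long waits rather than many cheap ones.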


This @mtlynch


To sanity check myself, I reran this without the buffered reader and still don't see the slow execution time:

> echo '60016000526001601ff3' | xxd -r -p | zig build run -Doptimize=ReleaseFast

> execution time: 28.889µs

So I think my machine config for whatever reason isn't representative of whatever OP is using.

Linux-ck 6.8 CONFIG_NO_HZ=y CONFIG_HZ_1000=y

Intel 13900k

zig 0.11

bash 5.2.26

xxd 2024-02-10

Would be good if someone who can repro it compares the two invocation variants with the buffered reader implemented and lists their config.




