The OpenJDK and .NET compilers run circles around Go's. It's not even close. The second you go beyond "straight-line" code, where a function body has a limited number of locals and does not make many calls, the difference becomes absolutely massive. Go also does not do any sort of "advanced" devirtualization, which is the bread and butter of both for coping with codebase complexity and the inevitable introduction of abstractions. Hell, .NET has surpassed Go in compilation of native binaries too. Here's a recent example: https://news.ycombinator.com/item?id=41234851
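To make the devirtualization point concrete, here is a minimal Go sketch of the two kinds of call site in question. The names `Shape`, `Square`, `SumAreas` and `SumSquares` are made up for illustration. What the compiler actually does with each call depends on the toolchain version and flags; building with `go build -gcflags=-m` prints its inlining and devirtualization decisions, so you can check on your own setup.

```go
package main

import "fmt"

// Shape is a hypothetical interface used only for this illustration.
type Shape interface {
	Area() float64
}

// Square is a hypothetical concrete type implementing Shape.
type Square struct{ Side float64 }

func (s Square) Area() float64 { return s.Side * s.Side }

// SumAreas calls Area through the interface. The concrete type is not
// visible at this call site, so unless the compiler can prove what it is,
// the call goes through dynamic dispatch and cannot be inlined.
func SumAreas(shapes []Shape) float64 {
	total := 0.0
	for _, s := range shapes {
		total += s.Area()
	}
	return total
}

// SumSquares works on the concrete type, so the call target is known
// statically and the call is a candidate for inlining.
func SumSquares(squares []Square) float64 {
	total := 0.0
	for _, s := range squares {
		total += s.Area()
	}
	return total
}

func main() {
	squares := []Square{{1}, {2}, {3}}

	// Box every Square into the interface type.
	shapes := make([]Shape, len(squares))
	for i, s := range squares {
		shapes[i] = s
	}

	fmt.Println(SumAreas(shapes), SumSquares(squares))
}
```

Both functions compute the same result; the difference is only in how much the compiler knows about the callee at each call site, which is where a JIT with guarded devirtualization has more room to work.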
In terms of GC, Go has a specialized design that makes tradeoffs to allow consistent latency and low memory usage. However, this comes with very low sustained allocation and garbage collection throughput, and the language itself does not make it obvious where allocations happen. As sibling discussions here and under the Go iterators submission indicate, this results in an amount of effort spent on removing every allocation from a hot path that is unthinkable in C#, which makes allocations much easier to see and, much like Java, copes with high allocation throughput with ease.
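As an illustration of allocations not being obvious from Go source, here is a small sketch. The type `point` and the functions `byValue`, `byPointer`, `allocsByValue` and `allocsByPointer` are hypothetical names for this example. It uses `testing.AllocsPerRun` from the standard library to count heap allocations; `go build -gcflags=-m` would additionally print the escape analysis decisions.

```go
package main

import (
	"fmt"
	"testing"
)

type point struct{ x, y int }

// sink keeps the pointer result reachable so the work is not discarded.
var sink *point

// byValue returns the struct by value; nothing has to outlive the call.
//
//go:noinline
func byValue(x, y int) point { return point{x, y} }

// byPointer looks almost identical, but returning the address of the
// value forces it to escape to the heap.
//
//go:noinline
func byPointer(x, y int) *point { return &point{x, y} }

// allocsByValue reports the average number of heap allocations per call
// of byValue.
func allocsByValue() float64 {
	return testing.AllocsPerRun(1000, func() {
		p := byValue(1, 2)
		_ = p.x + p.y
	})
}

// allocsByPointer reports the average number of heap allocations per call
// of byPointer.
func allocsByPointer() float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = byPointer(1, 2)
	})
}

func main() {
	fmt.Println("by value:  ", allocsByValue())
	fmt.Println("by pointer:", allocsByPointer())
}
```

The two functions differ by a single `&`, yet the second one typically allocates on every call. In a hot path that kind of difference has to be hunted down with tooling rather than read off the code.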
It is indeed true that Java makes different design choices when tuning its GC implementations, but you might see memory usage much closer to Go's from .NET back-end services now that DATAS (Dynamic Adaptation To Application Sizes) is enabled by default, without the tradeoffs Go comes with.
Note that the article's findings from 2018 need to be re-evaluated on up-to-date versions before drawing conclusions, because in the last 6 years (and especially in the last 3 or so for .NET) the garbage collector implementations of both Go and .NET have evolved quite significantly. The sustained multi-core allocation throughput graph more or less holds, but the other numbers will differ significantly.
One of the major factors that plays in Go's favour is the right attitude to architecting libraries: zero-copy slicing is much more at the forefront in Go than in .NET (technically incorrect, but true of what the average implementation looks like), and the flexible nature of C#, combined with many vendors treating it as "be glad we even support this Microsoft's Java", leads to poor-quality vendor libraries. As a result, developers see Go applications being more efficient without realizing that the cause is the massively worse implementation of a dependency their .NET solution has to deal with. There was a recent comparison video where .NET was estimated to be slower, but the real culprits were the AWS SDK dependency and the benchmark author being most familiar with Go and making optimal choices with significant impact there, such as using DB connection pooling.
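For anyone unfamiliar with what zero-copy slicing looks like in practice, here is a minimal Go sketch. The function `firstField` and the sample input are made up for illustration. The returned slice is a view into the caller's buffer rather than a copy (the closest .NET counterparts being `Span<T>` and `ReadOnlySpan<T>`).

```go
package main

import (
	"bytes"
	"fmt"
)

// firstField returns the bytes before the first comma without copying:
// the result shares its backing array with line.
func firstField(line []byte) []byte {
	if i := bytes.IndexByte(line, ','); i >= 0 {
		return line[:i]
	}
	return line
}

func main() {
	line := []byte("alpha,beta,gamma")
	field := firstField(line)
	fmt.Printf("%s\n", field)

	// A write through the original buffer is visible through the
	// sub-slice, which shows that no copy was made.
	line[0] = 'A'
	fmt.Printf("%s\n", field)
}
```

The flip side of this design is aliasing: the view is only valid as long as nobody overwrites the underlying buffer, so library authors have to document who owns it.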
I'm often impressed by how much punishment the GC and compiler can take, continuing to provide competitive performance despite the massive amounts of data reallocation and abstraction bloat thrown at them by developers who don't even want to consider approaching C# in an idiomatic way (at the very least by listening to IDE suggestions and warnings). In some areas, I even recommend looking at community libraries first; they are likely to provide a far superior experience if the documentation and a brief code audit indicate that their authors care(tm), which is one of the most important metrics.
> Go also does not do any sort of "advanced" devirtualization
Depends on the implementation. gc doesn't put a whole lot of effort into optimization, but it isn't the only implementation. In fact, the Go project insists that there must be more than one implementation as part of its mandate.
GoGC is the fastest overall implementation and the one used in >95% of cases; the alternatives are not up to date and produce slower code, aside from select interop scenarios.
Until this changes, the "Depends on the implementation" statement is not going to be true in the context of better performance.