As far as I'm aware, LTO completely solves this from a binary size perspective. It will optimise out anything unused. You can still get hit from a build time perspective though.
"completely solves" is a bit of an overstatement. Imagine a curl-like library that allows you to make requests by URL. You may only ever use HTTP urls, but code for all the other schemas (like HTTPS, FTP, Gopher) needs to be compiled in as well.
This is an extreme example, but the same thing happens very often at a smaller scale. Optional functionality can't always be removed statically.
That only applies when dynamic dispatch is involved and the linker can't trace the calls. For direct calls and generics(which idiomatic Rust code tends to prefer over dyn traits) LTO will prune extensively.
Depends on what is desired, in this case it would fail (through the `?`), and report it's not a valid HTTP Uri. This would be for a generic parsing library that allows for multiple schemes to be parsed each with their own parsing rules.
If you want to mix schemes you would need to be able to handle all schemes; you can either go through all variations (through the same generics) you want to test or just just accept that you need a full URI parser and lose the generic.
See, the trait system in Rust actually forced you to discover your requirements at a very core level. It is not a bug, but a feature. If you need HTTPS, then you need to include the code to do HTTPS of course. Then LTO shouldn't remove it.
If your library cannot parse FTP, either you enable that feature, add that feature, or use a different library.
I guess that depends on the implementation. If you're calling through an API that dynamically selects the protocol than I guess it wouldn't be removable.
Rust does have a feature flagging system for this kind of optional functionality though. It's not perfect, but it would work very well for something like curl protocol backends though.
That's a consequence of crufty complicated protocols and standards that require a ton of support for different transports and backward compatibility. It's hard to avoid if you want to interoperate with the whole world.
yes, it's not a issue of code size but a issue of supply chain security/reviewability
it's also not always a fair comparison, if you include tokio in LOC counting then you surely would also include V8 LOC when counting for node, or JRE for Java projects (but not JDK) etc.
And, reductio ad absurdum, you perhaps also need to count those 27 million LOC in Linux too. (Or however many LOC there are in Windows or macOS or whatever other OS is a fundamental "dependency" for your program.)
It's certainly better than in Java where LTO is simply not possible due to reflection. The more interesting question is which code effectively gets compiled so you know what has to be audited. That is, without disassembling the binary. Maybe debug information can help?
Yet it works, thanks to additional metadata, either in dynamic compiler which effectly does it in memory, throwing away execution paths with traps to redo when required, and with PGO like metadata for AOT compilation.
And since we are always wrong unless proven otherwise,
In Go, the symbol table contains enough information to figure this out. This is how https://pkg.go.dev/golang.org/x/vuln/cmd/govulncheck is able to limit vulnerabilities to those that are actually reachable in your code.
It's possible and in recent years the ecosystem has been evolving to support it much better via native-image metadata. Lots of libraries have metadata now that indicates what's accessed via reflection and the static DCE optimization keeps getting better. It can do things like propagate constants to detect more code as dead. Even large server frameworks like Micronaut or Spring Native support it now.
The other nice thing is that bytecode is easy to modify, so if you have a library that has some features you know you don't want, you can just knock it out and bank the savings.
Yes, jlink, code guard, R8/D8 on Android, if you want to stay at the bytecode level, plus all the commercial AOT compilers and the free beer ones, offer similar capabilities at the binary level.
Everywhere in this thread is debating whether LTO "completely" solves this or not, but why does this even need LTO in the first place? Dead code elimination across translation units in C++ is traditionally accomplished by something like -ffunction-sections, as well as judiciously moving function implementations to the header file (inline).
Clang also supports virtual function elimination with -fvirtual-function-elimination, which AFAIK currently requires full LTO [0]. Normally, the virtual functions can't be removed because the vtable is referencing them. It's very helpful in cutting down on bloat from our own abstractions.
LTO only gets you so far, but IMO its more kicking the can down the road.
The analogy I use is cooking a huge dinner, then throwing out everything but the one side dish you wanted. If you want just the side-dish you should be able to cook just the side-dish.
LTO gets a lot of the way there, but it won't for example help with eliminating unused enums (and associated codepaths). That happens at per-crate MIR optimisation iirc, which is prior to llvm optimisation of LTO.