
As far as I'm aware, LTO completely solves this from a binary size perspective. It will optimise out anything unused. You can still get hit from a build time perspective though.


"completely solves" is a bit of an overstatement. Imagine a curl-like library that allows you to make requests by URL. You may only ever use HTTP URLs, but code for all the other schemes (like HTTPS, FTP, Gopher) needs to be compiled in as well.

This is an extreme example, but the same thing happens very often at a smaller scale. Optional functionality can't always be removed statically.


That only applies when dynamic dispatch is involved and the linker can't trace the calls. For direct calls and generics (which idiomatic Rust code tends to prefer over dyn traits) LTO will prune extensively.
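
A minimal sketch of the distinction being drawn here, assuming a made-up `Transport` trait with `Http` and `Ftp` implementations (none of these names come from a real crate). The generic function is monomorphized only for the types it is actually called with, while the `dyn` version goes through a vtable, which is the case where tracing calls gets harder:

```rust
// Hypothetical transport trait; illustrative names only.
trait Transport {
    fn name(&self) -> &'static str;
}

struct Http;
struct Ftp;

impl Transport for Http {
    fn name(&self) -> &'static str {
        "http"
    }
}

impl Transport for Ftp {
    fn name(&self) -> &'static str {
        "ftp"
    }
}

// Static dispatch: one copy is generated per concrete `T` that is used,
// and each call inside is a direct call the optimizer can see through.
fn describe_static<T: Transport>(t: &T) -> String {
    format!("static:{}", t.name())
}

// Dynamic dispatch: the concrete type is erased and `name` is looked up
// through a vtable at run time.
fn describe_dyn(t: &dyn Transport) -> String {
    format!("dyn:{}", t.name())
}

fn main() {
    println!("{}", describe_static(&Http));
    let boxed: Box<dyn Transport> = Box::new(Ftp);
    println!("{}", describe_dyn(boxed.as_ref()));
}
```

How much actually gets pruned in either case depends on the toolchain and build settings; the sketch only shows the two call shapes.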


    let uri = get_uri_from_stdin();
    networking_library::make_request(uri);
How is the compiler supposed to prune that?
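
To make the objection concrete, here is a rough sketch of what such a `make_request` might look like inside. The function body and the per-scheme helpers are assumptions for illustration, not the API of any real networking crate. Because the scheme is only known at run time, every match arm is reachable as far as the compiler and linker can tell:

```rust
// Hypothetical stand-in for a curl-like `make_request`; not a real crate API.
fn make_request(uri: &str) -> Result<&'static str, String> {
    // The scheme is read from the input string at run time.
    let (scheme, _rest) = uri
        .split_once("://")
        .ok_or_else(|| format!("missing scheme in {uri:?}"))?;

    // Every arm stays reachable, so every backend has to be linked in.
    match scheme {
        "http" => Ok(http_get()),
        "https" => Ok(https_get()),
        "ftp" => Ok(ftp_get()),
        other => Err(format!("unsupported scheme: {other}")),
    }
}

// Placeholder backends; real ones would pull in protocol-specific code.
fn http_get() -> &'static str {
    "handled by HTTP backend"
}

fn https_get() -> &'static str {
    "handled by HTTPS backend"
}

fn ftp_get() -> &'static str {
    "handled by FTP backend"
}

fn main() {
    // The thread's example reads the URI from stdin; literals are used here.
    for uri in ["http://example.com", "ftp://example.com", "gopher://example.com"] {
        println!("{uri} -> {:?}", make_request(uri));
    }
}
```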


  let uri: Uri<HTTP> = get_uri_from_stdin().parse()?; 
If the library is made in a modular way this is how it would typically be done. The `HTTP` may be inferred by calls further along in the function.
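
One way such a scheme-typed `Uri` could be built, as a sketch only: the `Scheme` trait, the marker types and the `rest` field are all hypothetical, and a real library would parse far more than a prefix. The point is that `Uri<Http>` only ever references the HTTP rules, so code for other schemes is not instantiated unless some `Uri<Ftp>` is used elsewhere:

```rust
use std::marker::PhantomData;
use std::str::FromStr;

// Hypothetical scheme marker trait; illustrative names only.
trait Scheme {
    const PREFIX: &'static str;
}

// Uninhabited marker types: never constructed, only used as type parameters.
enum Http {}
enum Ftp {}

impl Scheme for Http {
    const PREFIX: &'static str = "http://";
}

impl Scheme for Ftp {
    const PREFIX: &'static str = "ftp://";
}

// A URI tagged at the type level with the single scheme it accepts.
struct Uri<S: Scheme> {
    rest: String,
    _scheme: PhantomData<S>,
}

impl<S: Scheme> FromStr for Uri<S> {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Only the prefix is checked here; this is not a full URI parser.
        match s.strip_prefix(S::PREFIX) {
            Some(rest) => Ok(Uri {
                rest: rest.to_string(),
                _scheme: PhantomData,
            }),
            None => Err(format!("expected a URI starting with {}", S::PREFIX)),
        }
    }
}

fn main() -> Result<(), String> {
    let uri: Uri<Http> = "http://example.com".parse()?;
    println!("host and path: {}", uri.rest);

    // The same input type rejects other schemes with an error.
    let bad: Result<Uri<Http>, String> = "ftp://example.com".parse();
    println!("ftp input rejected by Uri<Http>: {}", bad.is_err());

    let ftp: Result<Uri<Ftp>, String> = "ftp://example.com".parse();
    println!("ftp input accepted by Uri<Ftp>: {}", ftp.is_ok());
    Ok(())
}
```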


So what happens if the user passes a URL containing ftp:// or even https:// to stdin? Or is this an HTTP-only library?


Depends on what is desired, in this case it would fail (through the `?`), and report it's not a valid HTTP Uri. This would be for a generic parsing library that allows for multiple schemes to be parsed each with their own parsing rules.

If you want to mix schemes you would need to be able to handle all schemes; you can either go through all the variations you want to test (through the same generics) or just accept that you need a full URI parser and lose the generic.


If you want to mix schemes you should just mix schemes.

  let uri: Uri<FTP or HTTP or HTTPS> = parse_uri(get_uri_from_stdin()) or fail;
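
Rust has no anonymous `A or B` union type, so the closest real equivalent of the pseudocode above is probably an enum listing the accepted schemes. A sketch, with `AcceptedUri` and this `parse_uri` being invented names rather than anything from an existing crate:

```rust
// Hypothetical closed set of schemes the program chooses to accept.
#[derive(Debug, PartialEq)]
enum AcceptedUri {
    Http(String),
    Https(String),
    Ftp(String),
}

// Parse the input, failing for anything outside the accepted set.
fn parse_uri(input: &str) -> Result<AcceptedUri, String> {
    let (scheme, rest) = input
        .split_once("://")
        .ok_or_else(|| format!("missing scheme in {input:?}"))?;

    match scheme {
        "http" => Ok(AcceptedUri::Http(rest.to_string())),
        "https" => Ok(AcceptedUri::Https(rest.to_string())),
        "ftp" => Ok(AcceptedUri::Ftp(rest.to_string())),
        other => Err(format!("scheme {other:?} is not accepted")),
    }
}

fn main() {
    println!("{:?}", parse_uri("https://example.com"));
    println!("{:?}", parse_uri("gopher://example.com"));
}
```

Only the three listed schemes need backing code here; anything else takes the error path.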


See, the trait system in Rust actually forced you to discover your requirements at a very core level. It is not a bug, but a feature. If you need HTTPS, then you need to include the code to do HTTPS of course. Then LTO shouldn't remove it.

If your library cannot parse FTP, either you enable that feature, add that feature, or use a different library.


No, this wouldn't work. The type of the request needs to be dynamic because the user can pass in any URI.


Then they can also pass in an erroneous URI. You still need some way to deal with the ones you're not accepting.


I guess that depends on the implementation. If you're calling through an API that dynamically selects the protocol, then I guess it wouldn't be removable.

Rust does have a feature flagging system for this kind of optional functionality though. It's not perfect, but it would work very well for something like curl protocol backends.
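
A rough sketch of how feature-gated protocol backends might look. It assumes a hypothetical Cargo feature named `ftp` (declared as something like `ftp = []` under `[features]` in the library's Cargo.toml); the function names are made up. With the feature disabled, the gated code is never compiled, so there is nothing left for LTO to remove:

```rust
// Always-available backend.
fn http_get(uri: &str) -> String {
    format!("HTTP fetch of {uri}")
}

// Only compiled when the hypothetical "ftp" feature is enabled.
#[cfg(feature = "ftp")]
fn ftp_get(uri: &str) -> String {
    format!("FTP fetch of {uri}")
}

fn make_request(uri: &str) -> Result<String, String> {
    if uri.starts_with("http://") {
        return Ok(http_get(uri));
    }

    // This whole block disappears from the build without the feature.
    #[cfg(feature = "ftp")]
    {
        if uri.starts_with("ftp://") {
            return Ok(ftp_get(uri));
        }
    }

    Err(format!("no backend compiled in for {uri}"))
}

fn main() {
    println!("{:?}", make_request("http://example.com"));
    // Ok(..) only if built with the "ftp" feature, otherwise Err(..).
    println!("{:?}", make_request("ftp://example.com"));
}
```

The trade-off is that the choice is made by whoever builds the binary rather than by the compiler.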


That's a consequence of crufty complicated protocols and standards that require a ton of support for different transports and backward compatibility. It's hard to avoid if you want to interoperate with the whole world.


yes, it's not an issue of code size but an issue of supply chain security/reviewability

it's also not always a fair comparison, if you include tokio in LOC counting then you surely would also include V8 LOC when counting for node, or JRE for Java projects (but not JDK) etc.


And, reductio ad absurdum, you perhaps also need to count those 27 million LOC in Linux too. (Or however many LOC there are in Windows or macOS or whatever other OS is a fundamental "dependency" for your program.)


Or you could use APE and then all of those LOC go away. APE binaries can boot metal, and run on the big 3 OS from the same file.


It's certainly better than in Java where LTO is simply not possible due to reflection. The more interesting question is which code effectively gets compiled so you know what has to be audited. That is, without disassembling the binary. Maybe debug information can help?


Not only is it possible, it has been available for decades in commercial AOT compilers like Aonix, Excelsior JET, PTC, Aicas.

It is also done on the cousin Android, and available as free beer on GraalVM and OpenJ9.


Those all break compatibility to achieve that.


No they don't, PTC, Aicas, GraalVM and OpenJ9 support reflection.

The others no longer matter, out of business.


You can't LTO code under the presence of reflection. You can AOT but there will always be a "cold path" where you have to interpret whatever is left.


Yet it works, thanks to additional metadata: either in a dynamic compiler, which effectively does it in memory, throwing away execution paths and using traps to redo them when required, or with PGO-like metadata for AOT compilation.

And since we are always wrong unless proven otherwise,

https://www.graalvm.org/jdk21/reference-manual/native-image/...

https://www.graalvm.org/latest/reference-manual/native-image...


You do understand that the topic at hand is not shipping around all that code needed to support a trap, right?


In Go, the symbol table contains enough information to figure this out. This is how https://pkg.go.dev/golang.org/x/vuln/cmd/govulncheck is able to limit vulnerabilities to those that are actually reachable in your code.


The symbol table might contain reflection metadata, but it surely can't identify what part of it will be used.


It's possible and in recent years the ecosystem has been evolving to support it much better via native-image metadata. Lots of libraries have metadata now that indicates what's accessed via reflection and the static DCE optimization keeps getting better. It can do things like propagate constants to detect more code as dead. Even large server frameworks like Micronaut or Spring Native support it now.

The other nice thing is that bytecode is easy to modify, so if you have a library that has some features you know you don't want, you can just knock it out and bank the savings.


Doesn’t Java offer some sort of trimming like C#? I know it won’t remove everything, but at least it can trim down a lot of things.


Yes, jlink, code guard, R8/D8 on Android, if you want to stay at the bytecode level, plus all the commercial AOT compilers and the free beer ones, offer similar capabilities at the binary level.


Everyone in this thread is debating whether LTO "completely" solves this or not, but why does this even need LTO in the first place? Dead code elimination across translation units in C++ is traditionally accomplished by something like -ffunction-sections, as well as judiciously moving function implementations to the header file (inline).


Clang also supports virtual function elimination with -fvirtual-function-elimination, which AFAIK currently requires full LTO [0]. Normally, the virtual functions can't be removed because the vtable is referencing them. It's very helpful in cutting down on bloat from our own abstractions.

[0] https://clang.llvm.org/docs/ClangCommandLineReference.html#c...


> As far as I'm aware, LTO completely solves this from a binary size perspective.

I wouldn't say completely. People still sometimes struggle to get this to work well.

Recent example: (Go Qt bindings)

https://github.com/mappu/miqt/issues/147


LTO only gets you so far, but IMO it's more kicking the can down the road.

The analogy I use is cooking a huge dinner, then throwing out everything but the one side dish you wanted. If you want just the side-dish you should be able to cook just the side-dish.


I see it more as having a sizable array of ingredients in the pantry, and using only what you need or want for a given meal.


Then another group of armchair programmers will bitch you out for using small dependencies

I just don't listen. Things should be easy. Rust is easy. Don't overthink it


Some of that group of armchair programmers remember when npm drama and leftpad.js broke a noticeable portion of the internet.

Sure, don't overthink it. But underthinking it is seriously problematic too.


LTO gets a lot of the way there, but it won't for example help with eliminating unused enums (and associated codepaths). That happens at per-crate MIR optimisation iirc, which is prior to llvm optimisation of LTO.



