There is a "stable ABI", which is a subset of the full ABI, but extensions are not required to stick to it. The full ABI effectively changes with every minor Python version, because the core developers are constantly trying to improve the Python VM, which often involves reworking the internal representations of built-in types, etc. (Consider for example the improvements made to dictionaries in Python 3.6: https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-compa... .) Of course they try to provide properly abstracted interfaces for those C structs, but this is a 34-year-old project: design decisions get rethought all the time, a huge variety of tiny details could change, and countless people have legacy code using deprecated interfaces.
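One place the version-specific vs. stable ABI split is visible from Python itself is in the filename suffixes the import system accepts for extension modules. A minimal sketch, assuming CPython; the exact strings vary by platform and version, and the `abi3` tag is, as far as I know, only used on Unix-like builds:

```python
import importlib.machinery
import sysconfig

# Suffixes this interpreter will try when importing a compiled extension.
# On Linux CPython the list typically has a version-tagged entry such as
# ".cpython-3XX-<platform>.so" and a stable-ABI entry ".abi3.so".
suffixes = importlib.machinery.EXTENSION_SUFFIXES

# The suffix the build system gives to extensions built for this exact version.
version_specific = sysconfig.get_config_var("EXT_SUFFIX")

# Entries carrying the stable-ABI tag (may be empty, e.g. on Windows).
stable_abi = [s for s in suffixes if "abi3" in s]

print(suffixes)
print(version_specific)
print(stable_abi)
```

An extension built against the full ABI gets the version-tagged suffix, so a different minor version simply won't pick it up.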
The bytecode also changes with every minor Python version (and several times during the development of each). The bytecode file format is versioned for this reason, and .pyc caches need to be regenerated after an upgrade. (And every now and then you'll hit a speed bump, like old code using `async` as an identifier, which became a reserved keyword in Python 3.7. That hit TensorFlow once: https://stackoverflow.com/questions/51337939 .)
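You can inspect the bytecode versioning directly. A small sketch using only the standard library; `example.py` is a hypothetical filename and nothing is read from disk:

```python
import importlib.util
import keyword
import sys

# Four-byte marker written at the start of every .pyc file; it changes
# whenever the bytecode format changes.
magic = importlib.util.MAGIC_NUMBER

# Tag embedded in cache filenames, so different interpreter versions
# keep separate .pyc files side by side.
cache_tag = sys.implementation.cache_tag

# Where this interpreter would cache the bytecode for a (hypothetical) source file.
cached_path = importlib.util.cache_from_source("example.py")

# `async` is a reserved keyword on Python 3.7 and later.
async_is_keyword = keyword.iskeyword("async")

print(magic, cache_tag, cached_path, async_is_keyword)
```

If the magic number in a cached file doesn't match the running interpreter's, the import system ignores the cache and recompiles from source.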
That's a very different way of doing things compared to the JVM, which is what I have the most experience with.
Was some kind of FFI using dlopen and sharing memory across the VM boundary ever considered, instead of having to compile extensions against a particular version of Python?
I remember seeing some FFI library, probably on PyPI, but I don't think it is part of standard Python.
You can in fact use `dlopen`, via the `ctypes` module in the standard library. `freetype-py` (https://github.com/rougier/freetype-py) is an example of a project that works this way.
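For a feel of what that looks like, here is a minimal sketch that loads the C library and calls `strlen`. It assumes a POSIX system (Linux or macOS); this is not how `freetype-py` is written, just the same general mechanism:

```python
import ctypes
import ctypes.util

# Assumption: POSIX. find_library may return None; CDLL(None) then falls back
# to dlopen(NULL), which still exposes the C library's symbols.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# ctypes cannot see C prototypes, so the signature is declared by hand.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"freetype")
print(length)  # 8
```

Note that nothing here is compiled against a particular Python version; the price is that you are responsible for getting every signature right yourself.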
To my understanding, though, it's less performant than a compiled extension module. And you still need a stable ABI layer to call into: FFI can't save you if the C code decides in version N+1 that it expects the "memory shared across the VM boundary" to have a different layout.
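To make the layout problem concrete, here is a toy sketch. `PointV1` and `PointV2` are hypothetical structs invented for illustration, standing in for "version N" and "version N+1" of some C library:

```python
import ctypes

class PointV1(ctypes.Structure):
    # Layout the Python-side binding was written against (hypothetical).
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]

class PointV2(ctypes.Structure):
    # Layout after the library prepends a field in version N+1 (hypothetical).
    _fields_ = [("flags", ctypes.c_int32),
                ("x", ctypes.c_int32),
                ("y", ctypes.c_int32)]

# The "new" library produces this memory...
raw = bytes(PointV2(flags=1, x=10, y=20))

# ...and the stale binding reinterprets it with the old layout.
stale = PointV1.from_buffer_copy(raw)

print(stale.x, stale.y)  # 1 10  (silently wrong, no error raised)
```

No exception, no warning, just wrong values, which is why the calling convention alone isn't enough without a layout both sides agree to keep fixed.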