Pinning deps is discouraged by years of Python practice. And going back to an old project and finding versions that work, a year or more later, might be nigh on impossible.
Last week I was trying to install snakemake via Conda and couldn't find any way to satisfy the dependencies at all, so it's not just PyPI; and pip tends to be one of the more forgiving dependency resolvers.
It's not just Python: trying and failing to get npm to install the requirements has stopped me from building about half of the projects I've tried (which is not a ton of projects). And CRAN in the R universe can have similar problems as projects age.
> Pinning deps is discouraged by years of Python practice.
I'm not sure it is discouraged so much as just not what people did in Python-land for a long time. It's obviously the right thing to do, it's totally doable, it's just inertia and habit that might mean it isn't done.
> I'm not sure it is discouraged so much as just not what people did in Python-land for a long time. It's obviously the right thing to do, it's totally doable, it's just inertia and habit that might mean it isn't done.
Pinning is obviously the wrong thing: it only works if everyone does it, and if everyone does it then making changes becomes very hard. The right thing is to have deterministic dependency resolution so that dependencies don't change under you.
When they suggest you pin your dependencies, they don't just mean your direct dependencies, but rather all transitive dependencies. You can take this further by having a lock file that accounts for different Python versions, operating systems, and CPU architectures -- for instance, by using uv or Poetry -- but a simple `pip freeze` is often sufficient.
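As a minimal sketch of the plain-pip route (assuming work happens inside the project's virtualenv):

```
# Record every installed package, direct and transitive, at its exact version.
pip freeze > requirements.txt

# Later, or on another machine: recreate exactly that environment.
pip install -r requirements.txt
```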
That works for your project, but then nobody can include you as a library without conflicts.
But having that lock file will allow somebody to reconstruct your particular moment in time in the future. It's just that those lock files do not exist for 99.9% of Python projects out there.
> That works for your project, but then nobody can include you as a library without conflicts.
I think this is at the core of many misunderstandings and arguments around this question. Some people are writing code that only they will run, on a Python they've installed, on hardware they control. Others are writing code that has to work on lots of different versions of Python, on lots of different hardware, and when being run in all kinds of strange scenarios. These two groups have quite different needs and don't always understand each other or the problems they each face.
A lib can still lock its dependencies and have version ranges declared at the same time. The lock file is an artifact that is used to reproducibly build the lib, while the version ranges are used to see whether some other project can use the lib.
It is only a matter of tooling. Locking one's dependencies remains the right thing to do, even for a lib.
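As a sketch of what that looks like (names and versions illustrative): the ranges live in the project metadata and are what consumers of the lib see, while a committed lock file pins the exact versions used to build and test the lib itself.

```
# pyproject.toml -- abstract ranges, visible to anyone depending on the lib
[project]
name = "somelib"            # illustrative name
version = "1.0.0"
dependencies = [
    "requests>=2.28,<3",    # any compatible release is acceptable
    "attrs>=22",
]

# A committed poetry.lock / uv.lock additionally pins exact versions,
# but it only affects the lib's own builds and tests -- not its consumers.
```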
This is of course the right answer. But unfortunately it has only recently become supported by packaging tooling, and is extremely uncommon to encounter in the wild.
For some reason the "secure" thing to do is considered to be pinning everything and then continuously bumping everything to latest, to get the security fixes.
At which point one might just as well not pin at all, but that's "insecure" (https://scorecard.dev/)
It took me a few days to get some old Jupyter notebooks working. I had to find the correct older version of Jupyter, the correct version of every plugin/extension that the notebook used, and then the correct version of every dependency of those extensions. The only way to get it working was a bunch of pinned dependencies.
Had they been properly pinned before, you would not have had to work for a few days. Code in a Jupyter notebook is unlikely to be relied upon elsewhere, so it's perfectly fine to make it always use the exact same versions (verified by checksums, whatever tool you are using).
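With plain pip, for instance, the requirements file can carry the expected artifact checksums (versions illustrative, digests left as placeholders):

```
# requirements.txt -- exact versions plus expected artifact checksums
notebook==6.4.12 \
    --hash=sha256:<expected-digest>
jupyter-client==7.3.4 \
    --hash=sha256:<expected-digest>
```

`pip install --require-hashes -r requirements.txt` will then refuse to install anything whose checksum does not match.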
It's not about finding old packages, it's about not finding the magical compatible set of package versions.
Pip is nice in that you can install packages individually to get around some version conflicts. But with conda and npm and CRAN I have always found myself stuck, unable to install dependencies after 15 minutes of mucking around.
It's rare that somebody has left the equivalent of the output of a `pip freeze` around to document their state.
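That is, something as simple as this, checked in next to the code (contents illustrative):

```
# requirements.txt, written by `pip freeze` back when the code worked
jupyter==1.0.0
nbconvert==6.5.0
traitlets==5.3.0
...
```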
With snakemake, I abandoned conda and went with pip in a venv, without filing an issue. Perhaps it was user error from being unfamiliar with conda, but I did not have more time to spend on the issue, much less to do the research needed to file a competent issue and follow up later on.
It’s a little hard for me to talk about Python setups which don’t use Poetry as that is basically the standard around here. I would argue that not controlling your packages regardless of the package manager you use is very poor practice.
How can you reasonably expect to work with any tech that breaks itself by not controlling its dependencies? You’re absolutely correct that this is probably more likely to be an issue with Python, but that’s the thing with freedom. It requires more of you.
There are different types of dependencies, and there are different rules for them, but here's an overview of the best practices:
1. For applications, scripts and services (i.e. "executable code"), during development, pin your direct dependencies; ideally to the current major or minor version, depending on how much you trust their authors to follow SemVer. Also make sure you regularly update and retest the versions so that you don't miss any critical updates.
You should not explicitly pin your transitive dependencies, i.e. the dependencies of your direct dependencies -- at least unless you know specifically that certain versions will break your app (and even then it is better to provide a range than a single version).
2. For production builds of the above, lock each of your dependencies (including the transitive ones) to a specific version and source (a sketch follows after this list). It is not really viable to do this by hand, but most packaging tools -- pip, Poetry, PDM, uv... -- will happily do it automatically for you. Unfortunately, Python still doesn't have a standard lock format, so most tools provide their own lock file; the closest thing to a standard we have at the moment is pip's requirements file [0].
Besides pinned versions, a lock file will also include the source where the packages are to be retrieved from (pip's requirements file may omit it, but it's then implicitly assumed to be PyPI); it can (and should, really) also provide hashes for the given packages, strengthening the confidence that you're downloading the correct packages.
3. Finally, when developing libraries (i.e. "redistributable code"), you should never pin your dependencies at all -- or, at most, you can specify the minimum versions that you know work and have tested against. That is because you have no control over the environment the code will eventually be used and executed in, and arbitrary limitations like that might (and often will) prevent your users from updating some other crucial dependency.
Of course, the above does not apply if you know that a certain version range will break your code. In that case you should most definitely exclude it from your specification -- but you should also update your code as soon as possible. Libraries should also clearly specify which versions of Python they support, and should be regularly tested against each of those versions; it is also recommended that the minimum supported version is regularly reviewed and increased as new versions of Python get released [1].
For more clarity on abstract vs concrete dependencies, I recommend the great article by Donald Stufft from 2013 [2]; and for understanding why upper-bound capping (i.e. limiting the maximum allowed version) should be avoided, there is a lengthy but very detailed analysis by Henry Schreiner [3].
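To make the abstract/concrete split tangible for an application, here is a rough sketch using pip-tools' `pip-compile` as the locking step (Poetry or uv would do just as well; names and versions are illustrative):

```
# Direct dependencies are declared with ranges (abstract), e.g. in
# pyproject.toml:  dependencies = ["flask>=2.2,<3", "sqlalchemy>=2"]

# Resolve and lock everything, transitive deps included, with hashes:
pip-compile --generate-hashes -o requirements.txt pyproject.toml

# Production installs use only the concrete lock file:
pip install --no-deps --require-hashes -r requirements.txt
```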
In a Poetry lock file, transitive dependencies are automatically locked and thereby pinned. It will ensure that you get the same thing each time, or an error about hashes not matching when something suspicious is going on -- which would be worth raising an issue about on the repo, if none exists.
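Roughly this workflow, in Poetry's terms:

```
# Resolve the full dependency tree and write poetry.lock,
# recording a hash for every artifact:
poetry lock

# Install exactly what poetry.lock says, failing on any mismatch:
poetry install
```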