Common Lisp has it too, via STMX (https://stmx.org/). I believe it uses Intel TSX if present and falls back to a software implementation if not.
I don't have much experience with Haskell, but Clojure also has STM out-of-the-box. I don't know if it's the approach described in the paper, but it's quite fast. It's probably made faster by the fact that almost all data in Clojure is immutable.
"The Clojure STM uses multiversion concurrency control with adaptive history queues for snapshot isolation, and provides a distinct commute operation."
Interestingly, I’ve been using Clojure since 2009 and I’ve never once used STM (outside of playing around to learn how to use it). I’ve used agents on occasion, I’ve used atoms, I’ve used core.async a lot, but refs and STM, to me (YMMV), seem like one of those things that sounded like a great idea but in practice just wasn’t really necessary.
Maybe others have had a different experience, of course. This is just an observation about my own code and the libraries I use.
No, your assessment is pretty accurate. I've only encountered refs once in the wild, and only because it was too much hassle to refactor working code to use an atom.
It turns out that updating a single storage point in an atom with a CAS serves 99.9% of use cases, and is much, much simpler than ref-based code.
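For anyone who hasn't seen what that CAS loop looks like, here is a rough Java sketch of the idea, assuming the state is an immutable value held in a `java.util.concurrent.atomic.AtomicReference`. The `AtomSketch` class and its `swap` and `runCounter` helpers are made-up names for illustration, not Clojure's actual implementation; `swap` only mimics the read, compute, compare-and-set, retry shape of `swap!`.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

public class AtomSketch {
    // Hypothetical helper: read the current value, compute a new one,
    // and publish it only if nobody else changed the reference meanwhile.
    // On a lost race the loop retries, so f may run more than once and
    // should be free of side effects.
    static <T> T swap(AtomicReference<T> atom, UnaryOperator<T> f) {
        while (true) {
            T oldVal = atom.get();
            T newVal = f.apply(oldVal);
            if (atom.compareAndSet(oldVal, newVal)) {
                return newVal;
            }
        }
    }

    // Hypothetical demo: several threads bump one shared counter.
    static int runCounter(int threads, int incrementsPerThread) {
        AtomicReference<Integer> counter = new AtomicReference<>(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    swap(counter, n -> n + 1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // No increments are lost, even without locks.
        System.out.println(runCounter(4, 10_000)); // prints 40000
    }
}
```

The whole state lives behind one reference, so there is nothing to coordinate across multiple locations, which is the case refs and transactions exist for.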
https://wiki.haskell.org/Software_transactional_memory
As I always say, Haskell is the Mercedes of programming languages.