Yes. The Cray supercomputers from the 80s were crazy good matmul machines in particular. The quad-CPU Cray X-MP (1984) could sustain 800 MFLOPS to 1 GFLOPS and, with a 1 GB SSD, had enough compute power and bandwidth to train a 7-10M-parameter language model in about six months and run inference at 18-25 tok/sec.
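For anyone who wants to sanity-check the X-MP numbers, here's a rough back-of-envelope sketch. It assumes the common rules of thumb of about 6 FLOPs per parameter per training token and about 2 FLOPs per parameter per generated token. Neither figure comes from the numbers quoted above, and the function names are made up for illustration.

```python
# Back-of-envelope budget for the X-MP figures quoted above.
# ASSUMPTION: ~6 FLOPs per parameter per training token and ~2 FLOPs per
# parameter per generated token (common rules of thumb, not from the source).

SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, good enough for an estimate


def training_tokens(params, sustained_flops, months, flops_per_param_token=6):
    """Tokens that fit in the compute budget under the 6*N*D approximation."""
    budget = sustained_flops * months * SECONDS_PER_MONTH  # total FLOPs available
    return budget / (flops_per_param_token * params)


def compute_bound_tok_per_sec(params, sustained_flops, flops_per_param_token=2):
    """Upper bound on generation speed if compute were the only limit."""
    return sustained_flops / (flops_per_param_token * params)


if __name__ == "__main__":
    for params in (7e6, 10e6):
        for flops in (800e6, 1e9):
            tokens = training_tokens(params, flops, months=6)
            ceiling = compute_bound_tok_per_sec(params, flops)
            print(f"{params / 1e6:.0f}M params @ {flops / 1e6:.0f} MFLOPS: "
                  f"{tokens / 1e6:.0f}M training tokens, "
                  f"ceiling {ceiling:.0f} tok/sec")
```

Under these assumptions, six months at the quoted sustained rates covers a few hundred million training tokens for a model this size. The compute-only ceiling on generation comes out above the quoted 18-25 tok/sec, which would be consistent with inference running below the sustained matmul rate, though that part is a guess.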
A mid-90s Cray T3E could have handled GPT-2 124M, 24 years before OpenAI.
I also had a punch-card computer from 1965 learn XOR with backpropagation.
The hardware was never the bottleneck; the ideas were.
Post-quantum crypto is a good example of this. Lattice-based schemes were theorized in the 90s, but they took decades to actually reach production. The math existed and the hardware existed, but the ideas for making it work just weren't there yet.