TPUs (short for Tensor Processing Units) are Google’s custom AI accelerator hard...

jxjnskkzxxhx · 2025-04-12T15:09:33 1744470573

Believe it or not, I'm also familiar with Wikipedia. It says they're optimized for low precision and high throughput. To me this sounds like a GPU with a specific optimization.

flebron · 2025-04-12T15:39:22 1744472362

Perhaps this chapter can help? https://jax-ml.github.io/scaling-book/tpus/

It's a chip (and associated hardware) that can do linear algebra operations really fast. XLA and TPUs were co-designed, so as long as what you are doing is expressible in XLA's HLO language (https://openxla.org/xla/operation_semantics), the TPU can run it, and in many cases run it very efficiently. TPUs have different scaling properties than GPUs (think sparser but much larger communication), no graphics hardware inside them (no shader hardware, no raytracing hardware, etc), and a different control flow regime ("single-threaded" with very-wide SIMD primitives, as opposed to massively-multithreaded GPUs).
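To make the "expressible in XLA's HLO" part concrete, here is a minimal sketch in JAX. It assumes a recent JAX install; the function name `layer` and the shapes are made up for illustration. It lowers a small function without running it and prints the IR that gets handed to XLA (in recent JAX versions this text is StableHLO, the portable form of HLO). Nothing here needs a TPU; the same lowering works on CPU.

```python
import jax
import jax.numpy as jnp

def layer(x, w):
    # Hypothetical example: a matmul followed by an elementwise nonlinearity.
    return jnp.tanh(x @ w)

x = jnp.ones((8, 16), dtype=jnp.float32)
w = jnp.ones((16, 4), dtype=jnp.float32)

# Trace and lower the function for these argument shapes, without executing it.
lowered = jax.jit(layer).lower(x, w)

# Textual IR of the lowered module; this is what the XLA compiler consumes.
ir_text = lowered.as_text()
print(ir_text)
```

In the printed module the matmul shows up as a `dot_general` op and the nonlinearity as a `tanh` op. The XLA backend for whatever device you have (CPU, GPU, or TPU) compiles that same program.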

jxjnskkzxxhx · 2025-04-13T11:01:21 1744542081

Thank you for the answer! You see, up until now I had never appreciated that a GPU does more than matmuls... And that first reference, what a find :-)

Edit: By the way, another question I'd had before was what the difference is between a tensor core and a GPU. Based on your answer, my guess is that the tensor core is the part inside the GPU that actually does the matmuls.
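A toy sketch of that idea, in plain Python rather than real GPU code: a tensor core performs a small fixed-size matrix multiply-accumulate (D = A·B + C on a tile), and a large matmul is built by sweeping that one operation over tiles. The tile size of 4 and all the function names below are assumptions for illustration only; real tile shapes depend on the architecture and data type.

```python
TILE = 4  # illustrative tile size, not a claim about any specific hardware

def mma_tile(a, b, c):
    """One tile-level multiply-accumulate: returns a @ b + c for TILE x TILE lists."""
    return [[c[i][j] + sum(a[i][k] * b[k][j] for k in range(TILE))
             for j in range(TILE)]
            for i in range(TILE)]

def get_tile(m, row, col):
    """Extract the TILE x TILE block of m whose top-left corner is (row, col)."""
    return [r[col:col + TILE] for r in m[row:row + TILE]]

def tiled_matmul(A, B):
    """Multiply A (n x k) by B (k x m) using only tile-level mma_tile calls."""
    n, k, m = len(A), len(B), len(B[0])
    assert n % TILE == 0 and k % TILE == 0 and m % TILE == 0
    C = [[0] * m for _ in range(n)]
    for i in range(0, n, TILE):
        for j in range(0, m, TILE):
            acc = [[0] * TILE for _ in range(TILE)]
            # Accumulate partial products along the shared dimension.
            for p in range(0, k, TILE):
                acc = mma_tile(get_tile(A, i, p), get_tile(B, p, j), acc)
            # Write the finished output tile back into C.
            for r in range(TILE):
                C[i + r][j:j + TILE] = acc[r]
    return C

def naive_matmul(A, B):
    """Reference implementation used to check the tiled version."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[i + j for j in range(8)] for i in range(8)]
B = [[i * j for j in range(8)] for i in range(8)]
result = tiled_matmul(A, B)
print(result == naive_matmul(A, B))
```

The rest of the GPU (schedulers, memory hierarchy, general-purpose cores) is what feeds tiles into a unit like `mma_tile` and handles everything that is not a matmul.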

jibal · 2025-04-13T05:13:20 1744521200

You asked a question, people tried to help, and you lashed out at them in a way that makes you look quite bad.

kgwgk · 2025-04-12T20:15:40 1744488940

Did you also read just after that "without hardware for rasterisation/texture mapping"? Does that sound like a _G_PU?

crazygringo · 2025-04-12T16:57:27 1744477047

I mean, yes. But GPUs also have a specific optimization: graphics. This is a different optimization.