Hacker News
Symmetric Power Transformers (manifestai.com)
18 points by _hark on Aug 18, 2024 | hide | past | favorite | 5 comments


So, noticing that linearized models have tiny KV caches ahem i mean state spaces, this approach increases their size along the embedding dimension. Increasing it enormously, via a softmax replacement that is compatible with the expanding tensor product, yields a highly symmetric mathematical structure that can be exploited to recover some efficiency.
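To make "expanding along the embedding dimension via tensor products" concrete, here is a hedged sketch (not the article's exact construction): a degree-2 tensor-product feature map phi(x) = vec(x ⊗ x) inflates a d-dimensional vector to d² features, and the symmetry of x ⊗ x is what leaves room to recover efficiency.

```python
import numpy as np

d = 3
x = np.random.default_rng(1).normal(size=d)

# Hypothetical degree-2 feature map: flatten the outer product x ⊗ x.
# Using it inside linear attention would grow the state from (d, d)
# to (d^2, d) -- the "increase along the embedding dimension" idea.
phi = np.outer(x, x).ravel()
assert phi.size == d * d  # 9 features from a 3-dim input

# x ⊗ x is a symmetric matrix, so only d(d+1)/2 of its d^2 entries
# are distinct -- the symmetry that can be exploited for efficiency.
iu = np.triu_indices(d)
unique = np.outer(x, x)[iu]
assert unique.size == d * (d + 1) // 2  # 6 distinct entries
```

Higher powers p follow the same pattern: d^p raw features, but only "d multichoose p" distinct ones by symmetry.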

Is that right?


Yes. That is mostly the idea. But calling the state of a linear transformer a KV cache is not quite right. A KV cache grows with the sequence length, whereas the linear transformer state just stores V @ K.T, an object of fixed size.
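The distinction above can be sketched in a few lines of NumPy: accumulating the outer products of values and keys gives a (d, d) state whose size is independent of sequence length, and it equals the batched product of the full value and key matrices.

```python
import numpy as np

d = 4          # embedding dimension
T = 6          # sequence length
rng = np.random.default_rng(0)
K = rng.normal(size=(T, d))   # keys, one row per token
V = rng.normal(size=(T, d))   # values, one row per token

# A KV cache stores all T key/value rows, so it grows with the sequence.
# The linear transformer state is the running sum of outer products
# S = sum_t v_t k_t^T, which stays (d, d) no matter how long T gets.
S = np.zeros((d, d))
for t in range(T):
    S += np.outer(V[t], K[t])

# The recurrent state equals the batched product computed all at once.
assert np.allclose(S, V.T @ K)
print(S.shape)  # (4, 4) -- fixed size, independent of T
```

(Here V and K are stored as (T, d) arrays, so the batched form is written V.T @ K; with (d, T) layout it would be the comment's V @ K.T.)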


Glanced at the title and clicked, expecting this to be EE related.


Formatted like a formal academic publication. No way (that I can tell) to grab a pdf. Comes across as a blog masquerading as academic literature to me. Am I wrong? Did I miss something and there's an offline version available?

Pages served up over http are ephemeral. An absolutely essential part of formal academic literature is the archival aspect - self contained, immutable, and referenceable in an unambiguous manner.

There's also an immediate practical aspect for me. I will likely never get around to reading this: I'll forget it exists, since my "reading list" consists of a pile of pdf files.


I almost clicked on this, thinking it would be an electrical engineering topic; good thing I read the domain name.



