Brumby-14B-Base: The Strongest Attention-Free Base Model (manifestai.com)
7 points by cgel 5 months ago | 1 comment


We have trained a completely attention-free LLM whose performance is competitive with state-of-the-art models. This model, which we call Brumby-14B-Base, has a familiar Transformer-style architecture, except that it uses power retention layers instead of attention layers. It is available on Hugging Face.
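The post does not define power retention, so as a rough illustration of how an attention-free layer can work, here is a generic linear-recurrence ("linear attention") sketch: a fixed-size state matrix is updated token by token and read out with the query, so no T×T attention matrix is ever materialized. This is not Manifest AI's actual power retention formulation; all names and shapes are illustrative assumptions.

```python
import numpy as np

def linear_recurrence_layer(Q, K, V):
    """Causal, attention-free mixing via a recurrent state.

    Q, K, V: arrays of shape (T, d). The state S is d x d and is
    updated in O(d^2) per token, independent of sequence length T.
    """
    T, d = Q.shape
    S = np.zeros((d, d))          # fixed-size recurrent state
    out = np.zeros((T, d))
    for t in range(T):
        S = S + np.outer(K[t], V[t])   # accumulate key-value outer products
        out[t] = S.T @ Q[t]            # read out with the query
    return out

# Small smoke test with random inputs (shapes are arbitrary).
rng = np.random.default_rng(0)
T, d = 8, 4
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))
Y = linear_recurrence_layer(Q, K, V)
```

Unrolling the recurrence shows this equals causal attention with the softmax removed: `out[t] = sum_{s<=t} (q_t . k_s) v_s`. Power retention layers are a different, more expressive recurrence, but they share this property of a constant-size state in place of a growing attention cache.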





