Samba: Efficient Unlimited Context Language Modeling (arxiv.org)
5 points by anon373839 on June 13, 2024 | 1 comment


> Introducing Samba 3.8B, a simple Mamba+Sliding Window Attention architecture that outperforms Phi3-mini on major benchmarks (e.g., MMLU, GSM8K and HumanEval) by a large margin. And it has an infinite context length with linear complexity.

https://x.com/liliang_ren/status/1801027052147216457
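To illustrate the "linear complexity" claim for the sliding-window attention part, here is a minimal sketch that counts how many query/key pairs a causal sliding-window mask allows compared with full causal attention. It is not Samba's implementation: the function names and the window size are hypothetical, and the Mamba layers are not modeled at all.

```python
def causal_window_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j.

    Assumption for illustration: each query sees itself and the previous
    window - 1 positions, and never any future position.
    """
    return [[0 <= i - j < window for j in range(seq_len)]
            for i in range(seq_len)]


def attended_pairs(seq_len, window):
    """Count the allowed (query, key) pairs under the sliding-window mask."""
    return sum(sum(row) for row in causal_window_mask(seq_len, window))


def full_causal_pairs(seq_len):
    """Count the pairs under ordinary causal attention: n * (n + 1) / 2."""
    return seq_len * (seq_len + 1) // 2


if __name__ == "__main__":
    window = 4  # hypothetical window size, chosen only for the demo
    for n in (8, 16, 32):
        print(n, attended_pairs(n, window), full_causal_pairs(n))
```

With a fixed window, doubling the sequence length adds a constant number of pairs per extra token, while the full causal count grows quadratically.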



