Hacker News
XCSme 5 months ago | on: Running GPT-OSS-120B at 500 tokens per second on N...
That's true, but the data is only approximately represented in the weights.
Maybe it's better to have the AI only "reason", and somehow instantly access precise data.
stirfish 5 months ago
Is this Retrieval Augmented Generation, or something different?
XCSme 5 months ago
Yes, RAG, but with the model specifically optimized for RAG.
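To make the idea concrete, here is a minimal sketch of the retrieve-then-reason split, assuming a toy in-memory fact store and a simple word-overlap score. The names `FACTS`, `retrieve`, and `build_prompt` are hypothetical and made up for illustration; a real setup would likely use a database or vector index, and the model call itself is left out.

```python
import re

# Hypothetical store of exact facts; stands in for a database or search index.
FACTS = {
    "boiling_point_water": "Water boils at 100 degrees Celsius at 1 atm.",
    "speed_of_light": "The speed of light in vacuum is 299792458 m/s.",
    "earth_moon_distance": "The average Earth-Moon distance is about 384400 km.",
}

def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, k=1):
    """Return the k facts sharing the most words with the query (toy scoring)."""
    query_words = tokenize(query)
    ranked = sorted(
        FACTS.values(),
        key=lambda fact: len(query_words & tokenize(fact)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query):
    """Put the retrieved facts in the prompt so the model only has to reason."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

if __name__ == "__main__":
    print(build_prompt("What is the speed of light?"))
```

The precise value comes from the store rather than from the weights; the model would only be asked to reason over the text it is handed.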
adsharma 5 months ago
What use cases will gain from this architecture?
XCSme 5 months ago
Data processing, tool calling, agentic use. Those are also the main use cases outside "chatting".