Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

How many people are discussing this after one person did 1 prompt with 1 data point for each model and wrote a comment?

What is being measured here? For end-to-end time, one model is:

t_total = t_network + t_queue + t_batch_wait + t_inference + t_service_overhead



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: