Interesting - do you think it's viable to package an LLM like that with an existing game and run it locally? I assume it will be intensive to run, but wouldn't that eliminate inference costs?
It would be intensive, but it's very doable. You could run something like koboldcpp with an endpoint exposed only on the local machine and have the game talk to that. You'll likely run into issues with GPU vendors and making sure the right software versions are installed, but with some checking it should be viable. Maybe include a fallback in case the system can't produce results in a timely manner.
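A minimal sketch of that local-endpoint-plus-fallback idea, assuming koboldcpp's default KoboldAI-style API on port 5001 (the URL, payload fields, timeout, and fallback text here are illustrative, not a definitive integration):

```python
import json
import urllib.request
import urllib.error

# Assumed default koboldcpp endpoint on the local machine only.
KOBOLD_URL = "http://127.0.0.1:5001/api/v1/generate"

def generate(prompt: str, fallback: str = "...", timeout: float = 5.0) -> str:
    """Ask the local model for text; return canned fallback if it's down or slow."""
    payload = json.dumps({"prompt": prompt, "max_length": 80}).encode()
    req = urllib.request.Request(
        KOBOLD_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            data = json.loads(resp.read())
        return data["results"][0]["text"]
    except (urllib.error.URLError, TimeoutError, KeyError, ValueError):
        # Server not running, too slow, or returned something unexpected:
        # fall back to pre-written dialogue so the game never stalls.
        return fallback
```

The key design point is that the game never blocks on the model: any failure mode (server missing, wrong GPU drivers, slow generation) degrades to scripted text instead of a hang.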
Yeah, that's what I'm saying - it would eliminate inference costs. What I was asking is how feasible it is to package these local LLMs with another standalone app, e.g. a game.