There's a limit to how much system RAM can be assigned to the GPU as video memory, so you'd be constrained in what you can run for inference.
Lower quants that use less memory may become available, but you'd be much better served with 96+ GB.
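
For a rough sense of why the quant level matters, here's a back-of-the-envelope sketch. It only counts the weights themselves (KV cache and runtime overhead come on top), and both the function name and the 100B parameter count are made up for illustration, not taken from any specific model.

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and runtime overhead, so real usage is higher.
    """
    # billions of params * bits per param / 8 bits per byte = billions of bytes
    return n_params_billion * bits_per_weight / 8


# Hypothetical 100B-parameter model at a few common precisions.
for bits in (16, 8, 4):
    size = weight_memory_gb(100, bits)
    print(f"{bits}-bit: ~{size:.0f} GB for weights alone")
```

Whatever number you get has to fit inside the portion of RAM you're allowed to hand to video, which is why the extra headroom helps.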