
I'm in about the same situation as OP. We have a small cluster of Power9 machines that has been unmaintained and unused for a while, so I will set it up from scratch. I've been looking into solutions that would be a good fit. For the moment we are just a few students/postdoc, so manual scheduling is feasible, but eventually we would like to make it available to other students at the institution.

My candidates are also:

- slurm + ray/lightning/etc.
- determined.ai (maybe together with slurm)

Some advertise a kubernetes setup with kubeflow but I would imagine that is a bit too complex for a small cluster.

Anyone else with experience in this? Any other suggestions?

To make the environments as reproducible as possible, it would be great to also have a setup based on docker containers and maybe nix, but I'm not sure whether that is feasible on ppc64. Guix and Spack have also come up in my searches.

edit: typo


