amluto, 4 months ago, on: GEN-0 / Embodied Foundation Models That Scale with...

I’m curious how they prompt the model or otherwise tell it what its goal is. They seem to suggest some language processing, so perhaps they’re starting with a multimodal text + vision LLM?