
This uses CLIP to optimize a GAN's input so that the generated output matches a text description. Optimization is very slow: it's basically the same process as training, run again for every prompt. DALL-E instead uses a feedforward network to directly predict an image from text, but that model hasn't been published yet.
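
To make the "same process as training" point concrete, here is a rough toy sketch of the optimization loop. It is not the real CLIP or GAN code: GEN and ENC are made-up frozen linear maps standing in for the generator and the CLIP image encoder, target stands in for the text embedding, and optimize_latent is a hypothetical helper name. The only thing that gets updated is the latent input z, by gradient descent, one forward and backward pass per step.

```python
import random

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

# Assumption: a frozen toy "generator" mapping a 2-d latent to a 4-d "image".
GEN = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
# Assumption: a frozen toy "encoder" mapping a 4-d "image" to a 2-d embedding.
ENC = [[0.5, 0.0, 0.25, 0.25], [0.0, 0.5, 0.25, -0.25]]

def optimize_latent(target, steps=100, lr=0.1, seed=0):
    """Gradient descent on z so that ENC(GEN(z)) approaches target.

    The weights of GEN and ENC never change; only the input z does.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(len(GEN[0]))]
    gen_t, enc_t = transpose(GEN), transpose(ENC)
    losses = []
    for _ in range(steps):
        image = matvec(GEN, z)        # forward pass through the generator
        emb = matvec(ENC, image)      # forward pass through the encoder
        err = [e - t for e, t in zip(emb, target)]
        losses.append(sum(e * e for e in err))  # squared distance to target
        # Backward pass: d(loss)/dz = 2 * GEN^T ENC^T err
        back = matvec(gen_t, matvec(enc_t, err))
        z = [zi - lr * 2.0 * g for zi, g in zip(z, back)]
    return z, losses

target = [1.0, -0.5]                  # stand-in for a text embedding
z, losses = optimize_latent(target)
final_emb = matvec(ENC, matvec(GEN, z))
print(len(losses), losses[0], losses[-1])
```

Even in this toy, a single output costs 100 forward and backward passes, and the loop has to be rerun from scratch for every new target. A feedforward text-to-image model would pay that optimization cost once at training time and then produce an image in a single forward pass.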

