Definitely. But I also tried with a picture of an absurdist cartoon drawn by a family member, complete with (carefully) handwritten text, and the analysis was absolutely perfect.
A simple test: take one of your own photos, something interesting, and put it into an LLM, letting it describe the image in words. Then use an image generator to create the image back from that description. It works like back-translation, image->text->image, and it shows how much the models really understand images and text.
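If you want to script that round-trip, here's a minimal sketch using the OpenAI Python SDK (the model names, file name, and prompt wording are just placeholders, any vision model plus any image generator would do):

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Step 1: image -> text. Ask a vision-capable model to describe the photo.
with open("my_photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

description = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any vision-capable model works
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Describe this photo in enough detail to recreate it."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
).choices[0].message.content

# Step 2: text -> image. Feed the description to an image generator.
result = client.images.generate(
    model="dall-e-3",  # placeholder image model
    prompt=description,
    size="1024x1024",
)

print("Description used as prompt:\n", description)
print("Regenerated image URL:", result.data[0].url)
```

Then compare the regenerated image with the original to see what survived the round trip.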