It acts like a reverse Rorschach test: you show the subject a nonsensical picture and force a caption out of them. If you set up the task to generate something no matter what, you get something no matter what.
It is trivial to make it reject gibberish prompts: just use a generative model to estimate the probability of the input. That's what language models do by definition.
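To sketch the idea: score the input under a language model and reject anything whose average log-probability falls below a threshold. The toy character-bigram model below is illustrative only (the corpus, function names, and numbers are my own assumptions); a real system would use an actual LM's perplexity, but the principle is the same.

```python
import math
from collections import Counter

# Tiny stand-in "corpus" of English text. In practice you'd use a
# pretrained language model instead of counting bigrams by hand.
CORPUS = (
    "the quick brown fox jumps over the lazy dog "
    "she sells sea shells by the sea shore "
    "a picture is worth a thousand words "
    "to be or not to be that is the question"
)

V = 27  # smoothing vocabulary size: a-z plus space

def train(text):
    """Count character bigrams and unigrams over the corpus."""
    return Counter(zip(text, text[1:])), Counter(text)

def avg_logprob(model, text):
    """Average log P(next char | prev char), Laplace-smoothed.
    Low scores mean the input looks nothing like the training text."""
    bigrams, unigrams = model
    pairs = list(zip(text, text[1:]))
    total = sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + V))
        for a, b in pairs
    )
    return total / len(pairs)

model = train(CORPUS)
english = avg_logprob(model, "the lazy fox is over there")
gibberish = avg_logprob(model, "xqzvkj wqpxzt gkkjzx")
print(f"english: {english:.2f}  gibberish: {gibberish:.2f}")
```

English-looking input scores well above keyboard mashing, so a simple threshold on the score separates the two; the same trick with a real LM's per-token log-likelihood is how perplexity-based input filtering usually works.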
Also, I'm wondering whether these models could give a decent error response rather than responding to every input.