
"Secret language" is clickbait, but it seems like systematically exploring how it responds to gibberish might find something interesting?

Also, I'm wondering if there is some way these models could return a sensible error response rather than generating an answer for every input?



It acts like a reverse Rorschach test: hand the subject a nonsensical picture and demand a caption. If you set the task to generate something no matter what, you get something no matter what.

It is trivial to make it reject gibberish prompts: just use a generative model to estimate the probability of the input. Assigning probabilities to text is what language models do by definition.
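A minimal sketch of that scoring idea, using a tiny character-bigram model in place of a full language model. The corpus, threshold, and function names here are all illustrative assumptions, not anything from the thread; the point is just that an input with high per-character surprisal under the model can be flagged as gibberish.

```python
import math
from collections import defaultdict

def train_bigram(corpus):
    """Count character bigrams from a list of training strings."""
    counts = defaultdict(lambda: defaultdict(int))
    for text in corpus:
        for a, b in zip(text, text[1:]):
            counts[a][b] += 1
    return counts

def avg_neg_log_prob(counts, text, alphabet_size=27):
    """Average per-character negative log-probability under the bigram model."""
    total = 0.0
    for a, b in zip(text, text[1:]):
        row = counts[a]
        row_total = sum(row.values())
        # Laplace smoothing so unseen bigrams still get a nonzero probability.
        p = (row[b] + 1) / (row_total + alphabet_size)
        total += -math.log(p)
    return total / max(len(text) - 1, 1)

def looks_like_gibberish(counts, text, threshold=3.5):
    """Reject inputs whose average surprisal exceeds an (arbitrary) threshold."""
    return avg_neg_log_prob(counts, text) > threshold

# Toy training data; a real system would score with the LM itself.
corpus = [
    "the quick brown fox jumps over the lazy dog",
    "language models assign probabilities to text",
    "it is trivial to score an input with a generative model",
]
model = train_bigram(corpus)
english_score = avg_neg_log_prob(model, "the model assigns text a probability")
gibberish_score = avg_neg_log_prob(model, "xqzv jkq wvvx zzqp")
```

In practice you would use the LM's own token log-likelihoods rather than a bigram table, but the rejection rule is the same: score the input, compare against a calibrated threshold.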



