I did that last summer: I compared the performance of different English word embedding models. As far as I remember, the best ones were GloVe and a few knowledge-graph word embeddings.
None of them were better than a human at giving hints for 3+ words, though.
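
For anyone curious, here is a rough sketch of how an embedding can be used to pick a hint for several words at once. It is not the exact setup I used: the vectors are made-up toy values rather than real GloVe vectors, and `best_hint` and its scoring rule (worst-case closeness to the targets minus closeness to words to avoid) are just one plausible, hypothetical choice.

```python
import math

# Toy 3-d vectors invented for illustration; real models such as GloVe
# use hundreds of dimensions and a vocabulary of many thousands of words.
TOY_VECTORS = {
    "apple": (0.9, 0.1, 0.0),
    "banana": (0.8, 0.2, 0.1),
    "cherry": (0.85, 0.05, 0.15),
    "car": (0.0, 0.9, 0.1),
    "fruit": (0.9, 0.1, 0.1),
    "wheel": (0.1, 0.9, 0.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def best_hint(targets, avoid, vectors):
    """Return (word, score) for the best candidate hint (hypothetical scoring).

    A candidate scores well if it is close to *all* targets (minimum
    similarity) and far from every word in `avoid` (maximum similarity).
    """
    best_word, best_score = None, -math.inf
    for word, vec in vectors.items():
        if word in targets or word in avoid:
            continue  # a hint may not be one of the words on the board
        closeness = min(cosine(vec, vectors[t]) for t in targets)
        danger = max((cosine(vec, vectors[a]) for a in avoid), default=0.0)
        score = closeness - danger
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


hint, score = best_hint(["apple", "banana", "cherry"], ["car"], TOY_VECTORS)
print(hint)
```

Using the minimum similarity over the targets is what makes 3+ word hints hard: a single weakly related target drags the whole score down, so candidates that cover every target well are rare.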