notebook: https://twitter.com/theshawwn/status/1191800180192010246
code: https://github.com/shawwn/gpt-2
It's a fork of nshepperd's gpt-2 codebase (https://github.com/nshepperd/gpt-2), which lets you fine-tune the 117M and 345M models on GPUs.
For a tutorial on how to fine-tune GPT-2, see http://gwern.net/GPT-2
I’m going to try to retrain this with a Twitter dataset called Sentiment140 (I have already processed it with GPT-2 345M).
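For anyone who wants to try the same thing, here is a rough sketch of how the Sentiment140 CSV could be flattened into a plain-text training file. It assumes the usual Sentiment140 layout (no header row, six columns, tweet text in the last one) and it assumes you want `<|endoftext|>` between tweets so each one is treated as a separate document; whether the fork's data loader handles that separator specially is something to check yourself. The function name `tweets_to_training_text` is made up for this example and is not part of either repo.

```python
import csv
import io

# GPT-2's end-of-text marker, used here as a document separator (assumption:
# one tweet per document is what you want for fine-tuning).
END_OF_TEXT = "<|endoftext|>"


def tweets_to_training_text(csv_file, text_column=5):
    """Collect the tweet text from Sentiment140-style CSV rows.

    Hypothetical helper: `text_column=5` assumes the six-column layout
    (polarity, id, date, query, user, text) with no header row.
    """
    tweets = []
    for row in csv.reader(csv_file):
        if len(row) <= text_column:
            continue  # skip malformed or short rows
        text = row[text_column].strip()
        if text:
            tweets.append(text)
    # Put the separator on its own line between consecutive tweets.
    return ("\n" + END_OF_TEXT + "\n").join(tweets)


# Tiny in-memory sample standing in for the real CSV file.
sample = io.StringIO(
    '"0","1","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","user_a","first tweet"\n'
    '"4","2","Mon Apr 06 22:19:49 PDT 2009","NO_QUERY","user_b","second tweet"\n'
)
training_text = tweets_to_training_text(sample)
print(training_text)
```

With the real file you would pass an open file handle instead of the `StringIO` sample (the download is commonly read as `latin-1`, but verify that against your copy), write the result to a `.txt` file, and point the fine-tuning script at it.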