When you download a pre-trained model via `python download_model.py 117M`
(replace `117M` with the name of the model you prefer), it contains the following files:
- checkpoint
- encoder.json
- hparams.json
- model.ckpt.data-00000-of-00001
- model.ckpt.index
- model.ckpt.meta
- vocab.bpe
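Of these, `hparams.json` holds the model's hyperparameters. A minimal sketch of what it contains for the 117M model (the values below are the published 117M settings; check your own downloaded `models/117M/hparams.json` for the authoritative copy):

```python
import json

# Published hyperparameters for the 117M model (assumption: verify
# against your downloaded models/117M/hparams.json).
hparams_117m = {
    "n_vocab": 50257,  # BPE vocabulary size (matches encoder.json)
    "n_ctx": 1024,     # maximum context length in tokens
    "n_embd": 768,     # embedding / hidden size
    "n_head": 12,      # attention heads per layer
    "n_layer": 12,     # number of transformer blocks
}

# After downloading, the equivalent inspection would be:
#   with open("models/117M/hparams.json") as f:
#       hparams = json.load(f)
hparams = json.loads(json.dumps(hparams_117m))
print(hparams["n_layer"], hparams["n_embd"])
```

The `model.ckpt.*` files are the TensorFlow checkpoint holding the trained weights, while `encoder.json` and `vocab.bpe` define the tokenizer.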
For multi-language support, it is advisable to use a model that is already pre-trained for that language: otherwise you would need to modify quite a lot, especially the tokenization and vocabulary. I have never tried training GPT-2 for another language myself. You can search for existing models using `gpt2 <language>` as a keyword in your browser.
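To see why the tokenization is the hard part, it helps to look at the two tokenizer files: `encoder.json` maps each BPE token string to an integer id, and `vocab.bpe` lists the learned merge rules, one per line after a version header. Both were trained on largely English text, which is what a new language would need redone. A toy sketch of the formats (the entries below are invented stand-ins; the real files hold roughly 50k entries each):

```python
import json
import os
import tempfile

tmp = tempfile.mkdtemp()

# encoder.json: a JSON object mapping token strings to integer ids.
# (Toy contents; the real file has ~50k English-centric entries.)
with open(os.path.join(tmp, "encoder.json"), "w") as f:
    json.dump({"he": 0, "llo": 1, "hello": 2, "world": 3}, f)

# vocab.bpe: a "#version" header, then one learned merge per line,
# applied in order during tokenization.
with open(os.path.join(tmp, "vocab.bpe"), "w") as f:
    f.write("#version: 0.2\nh e\nhe llo\n")

with open(os.path.join(tmp, "encoder.json")) as f:
    encoder = json.load(f)
with open(os.path.join(tmp, "vocab.bpe")) as f:
    merges = [tuple(line.split()) for line in f.read().split("\n")[1:] if line]

print(len(encoder), merges)
```

Retraining these for another language (and resizing the embedding matrix to match the new `n_vocab`) is exactly the work you avoid by finding an existing pre-trained model.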
I found the following for Chinese and Japanese.
Hope it helps you. Have a great day ahead!