Ng Wai Foong
1 min read · Jan 7, 2022


Hi,

1. The script encodes your dataset and saves it as a compressed file; you will get an .npz file after the encoding. You can check the source code for the script here (a short sketch of how to inspect the output follows the link):

https://github.com/nshepperd/gpt-2/blob/finetuning/encode.py
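
If it helps, here is a minimal sketch of how you could inspect the resulting .npz with NumPy. The file name is a placeholder, and I am assuming the output follows encode.py's format of one array of BPE token IDs per text chunk:

```python
import numpy as np

# Minimal sketch: inspect the .npz produced by encode.py.
# Assumption (based on the repo's encode.py): the archive stores one
# 1-D array of BPE token IDs per chunk of your dataset.
with np.load("dataset.npz") as data:  # "dataset.npz" is a placeholder path
    for name in data.files:
        chunk = data[name]
        print(f"{name}: {chunk.shape[0]} tokens, dtype={chunk.dtype}")
        print("first tokens:", chunk[:10])
```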

2. Unfortunately, I have not trained it on a TPU, so I can't offer advice on this issue. You can raise an issue at the following GitHub repo:

https://github.com/nshepperd/gpt-2/issues

3. Training the full-sized model on a GPU is difficult unless you have a setup with enough memory to avoid out-of-memory errors. For CPU training, someone reportedly managed to train the full-sized model with Adam on an Amazon r4.4xlarge EC2 instance (16 vCPUs, 122 GB RAM), taking roughly 30-40 seconds per step with a batch size of 1. The rough arithmetic below shows why the memory requirement is so large.
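
Here is a back-of-envelope estimate, assuming the full-sized GPT-2 (~1.5B parameters), fp32 values, and Adam keeping two moment buffers per parameter; activations are not counted:

```python
# Back-of-envelope memory estimate for training the full-sized GPT-2
# (~1.5B parameters) with Adam in fp32. Activations are ignored, so
# the real footprint is higher.
params = 1.5e9           # approximate parameter count of GPT-2 1558M
bytes_per_value = 4      # fp32
copies = 1 + 1 + 2       # weights + gradients + Adam moments m and v
total_gb = params * bytes_per_value * copies / 1e9
print(f"~{total_gb:.0f} GB before activations")  # roughly 24 GB
```

That already exceeds the memory of most single GPUs, which is why a large-RAM CPU instance can work as a fallback even though each step is slow.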

Also, this tutorial is quite old and some of it might be outdated. Consider checking the official repo:

https://github.com/nshepperd/gpt-2
