Ng Wai Foong
1 min read · Dec 9, 2019

Hi,

The official GPT-2 GitHub repository has modified the encoder.py file: get_encoder now accepts an extra parameter, models_dir. The encode.py file has also been removed.

    def get_encoder(model_name, models_dir):
        with open(os.path.join(models_dir, model_name, 'encoder.json'), 'r') as f:
            encoder = json.load(f)
        with open(os.path.join(models_dir, model_name, 'vocab.bpe'), 'r', encoding="utf-8") as f:
            bpe_data = f.read()
        bpe_merges = [tuple(merge_str.split()) for merge_str in bpe_data.split('\n')[1:-1]]
        return Encoder(
            encoder=encoder,
            bpe_merges=bpe_merges,
        )

This is the version that I am using. It takes only model_name and reads from a hard-coded 'models' directory:

    def get_encoder(model_name):
        with open(os.path.join('models', model_name, 'encoder.json'), 'r') as f:
            encoder = json.load(f)
        with open(os.path.join('models', model_name, 'vocab.bpe'), 'r', encoding="utf-8") as f:
            bpe_data = f.read()
        bpe_merges = [tuple(merge_str.split()) for merge_str in bpe_data.split('\n')[1:-1]]
        return Encoder(
            encoder=encoder,
            bpe_merges=bpe_merges,
        )
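
If your own code has to work with either version of get_encoder, one option is a small wrapper that checks the function's signature before calling it. This is only a sketch and is not part of either repository: load_encoder, new_style and old_style are hypothetical names, and the two stub functions merely stand in for the real get_encoder variants so that the example runs on its own. The model name "117M" is just an example value.

```python
import inspect


def load_encoder(get_encoder, model_name, models_dir="models"):
    """Call get_encoder, passing models_dir only if its signature accepts it."""
    params = inspect.signature(get_encoder).parameters
    if "models_dir" in params:
        # Newer signature: get_encoder(model_name, models_dir)
        return get_encoder(model_name, models_dir)
    # Older signature: get_encoder(model_name), directory is hard-coded inside
    return get_encoder(model_name)


# Hypothetical stand-ins for the two get_encoder variants (demonstration only).
def new_style(model_name, models_dir):
    return ("new", model_name, models_dir)


def old_style(model_name):
    return ("old", model_name)


print(load_encoder(new_style, "117M"))
print(load_encoder(old_style, "117M"))
```

In real use you would pass the get_encoder imported from encoder.py in place of the stubs.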

I recommend cloning the whole project from this GitHub link instead of the official one:

https://github.com/nshepperd/gpt-2
