Ng Wai Foong
1 min read · Jul 10, 2019


Hi Jimmy Liu,

I trained the BERT-Base English and Chinese models on a single GeForce RTX 2080 Ti (11 GB of memory). Tested in a Jupyter Notebook on the same GPU, prediction takes around 2.75 seconds per call, which is not suitable for real-time prediction. For the online environment, I served the model via Flask on a Windows 10 machine with an Intel(R) Core i7-8550U CPU and 16 GB of RAM, where prediction takes roughly 5 to 6 seconds per call. If you intend to speed it up, you need to modify the underlying code in the convert_examples_to_features and input_fn_builder functions. Thanks a lot.
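To give a rough idea of the kind of change involved, here is a minimal sketch in plain Python. It assumes that a large share of the per-call latency comes from the model being set up again on every prediction call, which is a common cause with estimator-style prediction but is only an assumption here. The names PersistentPredictor, predict_stream and fake_predict_stream are hypothetical and not part of the BERT code. In a real setup, predict_stream would be a prediction call whose input function is backed by a generator, so that it stays alive between requests.

```python
import queue
import threading


class PersistentPredictor:
    """Keeps one long-lived prediction loop so the model is set up only once.

    `predict_stream` is a hypothetical callable: it takes an iterator of
    inputs and returns an iterator of outputs, one output per input.
    """

    def __init__(self, predict_stream):
        self._inputs = queue.Queue()
        self._outputs = queue.Queue()
        self._lock = threading.Lock()  # one request at a time keeps results paired
        self._worker = threading.Thread(
            target=self._run, args=(predict_stream,), daemon=True
        )
        self._worker.start()

    def _feed(self):
        # Endless generator: blocks until the next request arrives.
        while True:
            item = self._inputs.get()
            if item is None:  # sentinel used by close()
                return
            yield item

    def _run(self, predict_stream):
        # The expensive setup inside predict_stream happens once, here.
        for result in predict_stream(self._feed()):
            self._outputs.put(result)

    def predict(self, features):
        with self._lock:
            self._inputs.put(features)
            return self._outputs.get()

    def close(self):
        self._inputs.put(None)
        self._worker.join()


# Stand-in for the real model, used only to demonstrate the pattern.
stats = {"loads": 0}


def fake_predict_stream(inputs):
    stats["loads"] += 1  # simulates the costly model setup
    for text in inputs:
        yield len(text)  # simulates a prediction


predictor = PersistentPredictor(fake_predict_stream)
first = predictor.predict("hello")
second = predictor.predict("hi")
predictor.close()
```

In a Flask app, the predictor would be created once at startup and reused inside the request handler, so each request pays only for feature conversion and the forward pass.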


Written by Ng Wai Foong

Senior AI Engineer @ Yoozoo | Content Writer #NLP #datascience #programming #machinelearning | LinkedIn: https://www.linkedin.com/in/wai-foong-ng-694619185/
