Text Synth - Technical Notes
Text Synth is built using the GPT-2 language model released by OpenAI. It is a neural network of 1.5 billion parameters based on the Transformer architecture.
GPT-2 was trained to predict the next word on a large dataset of 40 GB of internet text.
Parameter explanation:
- Model: select the model size. The medium model (345M parameters) is fast but generates slightly less accurate results than the large one (1558M parameters).
- Top-k, top-p, temperature: sampling parameters. They determine how the next token is selected from the probabilities computed by the model. Top-k selects the next token among the k most probable ones. Lowering it is useful when a precise answer to a question is needed; the downside is that the text tends to become repetitive. Too large a value tends to give bad results because the model generates very unlikely tokens. Top-p is quite similar, but selects the next token among the most probable ones so that their cumulative probability is larger than p. For most uses, it is not necessary to change the temperature. More information is available in this article. A sketch of how these parameters combine is given after this list.
- Random seed: the next token is selected using a deterministic random number generator, which is initialized with the random seed. Hence the same text is generated for a given seed. The default value 0 indicates that a new random seed is used for each try.
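As an illustration of how these parameters combine, here is a small sketch in C of temperature, top-k and top-p sampling over a toy distribution. It is a simplified example written for these notes: the function and variable names, the toy random number generator and the fixed vocabulary are assumptions and are not taken from the Text Synth code or the LibNC library.

/*
 * Illustrative sketch of temperature / top-k / top-p sampling.
 * Not from the Text Synth or LibNC sources.
 */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

typedef struct { int token; double prob; } Cand;

/* sort candidates by probability, highest first */
static int cmp_desc(const void *a, const void *b)
{
    double pa = ((const Cand *)a)->prob, pb = ((const Cand *)b)->prob;
    return (pa < pb) - (pa > pb);
}

/* toy deterministic generator: the same seed always gives the same sequence */
static double next_uniform(unsigned long long *state)
{
    *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double)(*state >> 11) / 9007199254740992.0; /* / 2^53 */
}

/* sample one token index from logits[0..n-1] */
static int sample_token(const double *logits, int n, double temperature,
                        int top_k, double top_p, unsigned long long *seed)
{
    Cand *c = malloc(n * sizeof(*c));
    double sum = 0.0, cum, r;
    int i, kept, tok;

    /* softmax with temperature: lower temperature sharpens the
       distribution, higher temperature flattens it */
    for (i = 0; i < n; i++) {
        c[i].token = i;
        c[i].prob = exp(logits[i] / temperature);
        sum += c[i].prob;
    }
    for (i = 0; i < n; i++)
        c[i].prob /= sum;
    qsort(c, n, sizeof(*c), cmp_desc);

    /* top-k: keep only the k most probable tokens */
    kept = (top_k > 0 && top_k < n) ? top_k : n;

    /* top-p: shrink the kept set to the smallest prefix whose
       cumulative probability exceeds p */
    cum = 0.0;
    for (i = 0; i < kept; i++) {
        cum += c[i].prob;
        if (cum >= top_p) { kept = i + 1; break; }
    }

    /* renormalize over the kept tokens and draw one of them */
    sum = 0.0;
    for (i = 0; i < kept; i++)
        sum += c[i].prob;
    r = next_uniform(seed) * sum;
    cum = 0.0;
    tok = c[kept - 1].token;
    for (i = 0; i < kept; i++) {
        cum += c[i].prob;
        if (r <= cum) { tok = c[i].token; break; }
    }
    free(c);
    return tok;
}

int main(void)
{
    /* toy logits for a 5-token vocabulary */
    double logits[5] = { 2.0, 1.5, 0.5, 0.1, -1.0 };
    unsigned long long seed = 1;   /* fixed seed: reproducible output */
    printf("sampled token: %d\n",
           sample_token(logits, 5, 0.9, 40, 0.9, &seed));
    return 0;
}

With top_k set to 1 the most probable token is always chosen and the output is the same regardless of the seed; raising top_k or top_p widens the pool of candidate tokens and makes the generated text more varied.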
This implementation is unusual in that it does not use a GPU: it runs on only 4 cores of a Xeon E5-2640 v3 CPU at 2.60 GHz. With a single user, it generates 10 tokens per second. It is programmed in plain C using the LibNC library. A Linux executable can be downloaded here.
Thanks to OpenAI for providing their GPT-2 model.
News:
- 2020-08-09: the size of the model, the sampling parameters (top-k, top-p, temperature) and the random number seed can now be modified.