Frequently Asked Questions

Search the most frequently asked questions about merge.ai.

1 - What is a token?

Tokens are pieces of words used for natural language processing.

For English text, 1 token is approximately 4 characters or 0.75 words.

As a point of reference, the collected works of Shakespeare are about 900,000 words or 1.2M tokens.

To learn more about how tokens work and to estimate your usage, you can visit OpenAI's Tokenizer tool.
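If you prefer to count tokens programmatically, the sketch below uses OpenAI's open-source tiktoken library; the encoding name is an assumption about which model family you target.

```python
import tiktoken

# Load the encoding used by recent OpenAI models (assumption: cl100k_base).
enc = tiktoken.get_encoding("cl100k_base")

text = "Tokens are pieces of words used for natural language processing."
tokens = enc.encode(text)

print(f"{len(text)} characters, {len(tokens)} tokens")
# Roughly 4 characters per token for English text, as noted above.
```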

2 - What is Temperature?

Temperature is a parameter that controls randomness when the model picks the next word during text generation. Low values make the output more predictable and consistent, while high values allow more freedom and creativity but can make the output less coherent. Temperature can vary from 0 to 1.

  • Temperature closer to 0: Responses are very predictable, always choosing the next most likely word. This is great for answers where facts and accuracy are really important.

  • Temperature closer to 1: The model takes more chances, picking words that are less likely, which can lead to more creative but unpredictable answers.

Examples of Temperature

  • Temperature = 0: If you ask, “What are the benefits of exercising?”, with a temperature of 0, the model might say: “Exercising improves heart health and muscle strength, lowers the chance of chronic diseases, and helps manage weight.”

  • Temperature = 1: With the same question on exercise and a temperature of 1, you might get: “Exercise is the alchemist turning sweat into a miracle cure, a ritual dancing in the flames of effort and reward.”
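As a concrete sketch of how Temperature is set in practice, the example below uses the OpenAI Python SDK; the model name is a placeholder, and merge.ai's own interface may expose the parameter differently.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Temperature = 0: predictable, fact-oriented answers.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "What are the benefits of exercising?"}],
    temperature=0,
)
print(response.choices[0].message.content)
```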

3 - What is Top P?

Top P, or nucleus sampling, is a parameter that decides how many candidate words the model considers at each step: it keeps only the most likely words whose probabilities add up to P. A high Top P means the model looks at more possible words, even less likely ones, which makes the generated text more diverse. Top P can vary from 0 to 1.

  • Top P = 0.5: The model considers only the most likely words whose probabilities add up to at least 50% of the total, leaving out the less likely ones while keeping a moderate level of variety.

  • Top P = 0.9: The model includes many more words in the pool, allowing for more variety and originality.

Examples of Top P

  • Top P = 0.5: If you ask for a title for an adventure book, with a top-p of 0.5, the model might come up with: “The Mystery of the Blue Mountain.”

  • Top P = 0.9: For the same adventure book title and a top-p of 0.9, the model might create: “Voices from the Abyss: A Portrait of the Brave.”
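To see what Top P does under the hood, here is a minimal sketch of nucleus sampling over a toy next-word distribution; the words and probabilities are invented for illustration.

```python
# Toy next-word distribution; the words and probabilities are invented.
probs = {"mountain": 0.40, "forest": 0.25, "river": 0.15, "abyss": 0.12, "teapot": 0.08}

def nucleus_pool(probs, top_p):
    """Keep the most likely words until their cumulative probability reaches top_p."""
    pool, total = [], 0.0
    for word, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        pool.append(word)
        total += p
        if total >= top_p:
            break
    return pool

print(nucleus_pool(probs, 0.5))  # ['mountain', 'forest'] -- a small, safe pool
print(nucleus_pool(probs, 0.9))  # ['mountain', 'forest', 'river', 'abyss'] -- a wider pool
```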

4 - Mixing Temperature and Top P

Mixing Temperature and Top P can give a wide range of text styles. A low Temperature with a high Top P can lead to coherent text with creative touches. On the other hand, a high Temperature with a low Top P might give you common words put together in unpredictable ways.

Low Temperature and High Top P

Model outputs are usually logical and consistent because of the low Temperature, but they can still have rich vocabulary and ideas due to the high Top P. This setup is good for educational or informative texts where clarity is crucial, but you also want to keep the reader’s interest.

High Temperature and Low Top P

This combination often results in text where sentences may make sense on their own but read as disconnected or less logical as a whole. The high Temperature allows more variation in sentence building, while the low Top P limits word choices to the most likely ones. This can be useful in creative settings where you want unexpected results or to spark new ideas with unusual concept combinations.
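As a sketch, both parameters can be set together in a single request. This again assumes the OpenAI Python SDK and a placeholder model name; your provider's API may differ.

```python
from openai import OpenAI

client = OpenAI()

# Low Temperature + high Top P: coherent output with a rich vocabulary.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Write a short, engaging explainer on photosynthesis."}],
    temperature=0.2,
    top_p=0.9,
)
print(response.choices[0].message.content)
```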
