Logit
The raw, unnormalised score a model assigns to each vocabulary token before those scores are converted to probabilities.
Full Definition
Logits are the output of a language model's final linear layer — one scalar value per token in the vocabulary (typically 32,000–128,000 tokens) representing the model's raw preference for each possible next token. Logits are unnormalised and can be any real number; they are converted to probabilities via the softmax function. Temperature scaling divides the logits by the temperature before softmax, making the resulting distribution sharper (lower temperature) or flatter (higher temperature). Advanced sampling strategies (top-k, top-p, min-p) also operate on the logit distribution. Direct access to logits, available from open-weight models, enables nuanced sampling strategies that are not possible when an API returns only probabilities or sampled tokens.
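The logit-to-probability conversion described above can be sketched in a few lines. This is a minimal illustration with made-up logit values for three candidate tokens, not any particular model's output:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities; temperature scales sharpness."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()            # subtract the max for numerical stability
    exps = np.exp(scaled)
    return exps / exps.sum()

logits = np.array([15.2, 3.1, -2.4])       # toy scores for three candidate tokens
probs = softmax(logits)                     # temperature 1.0
cold = softmax(logits, temperature=0.5)     # sharper: top token gains probability mass
hot = softmax(logits, temperature=2.0)      # flatter: mass spreads to other tokens
```

Note that subtracting the maximum logit before exponentiating leaves the result unchanged mathematically but avoids overflow for large logits.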
Examples
A model outputting logit scores of [15.2, 3.1, -2.4, ...] for the next token, with 'Paris' receiving the highest score when completing 'The capital of France is'.
Logit bias: adding -100 to a specific token's logit to make the model never output that token.
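The logit-bias example above can be demonstrated directly: adding a large negative value to one token's logit drives its post-softmax probability to effectively zero. The token ids and logit values here are hypothetical:

```python
import numpy as np

def apply_logit_bias(logits, bias):
    """Add per-token biases to logits before softmax; -100 effectively bans a token."""
    out = np.asarray(logits, dtype=float).copy()
    for token_id, b in bias.items():
        out[token_id] += b
    return out

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([2.0, 1.5, 0.5])             # toy logits for three tokens
biased = apply_logit_bias(logits, {1: -100.0})  # ban hypothetical token id 1
probs = softmax(biased)                          # token 1 now has ~zero probability
```

A positive bias works the same way in reverse, making a token more likely without forcing it outright.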
Related Terms
Softmax: A function that converts a vector of real numbers into a probability distribution.
Temperature: A sampling parameter that controls the randomness and creativity of model output.
Top-P (Nucleus Sampling): A sampling strategy that limits token selection to the smallest set of tokens covering a cumulative probability threshold.