Embedding
A dense numerical vector that represents a token, sentence, or document in a continuous semantic space.
Full Definition
An embedding is a fixed-length numerical vector that encodes semantic meaning such that similar concepts are geometrically close in the vector space. Word embeddings (Word2Vec, GloVe) capture word-level semantics; sentence and document embeddings (from models like text-embedding-ada-002 or E5) represent longer spans. LLMs convert input tokens to embeddings at the first layer, and these representations evolve as they pass through the network, from surface-level lexical features to rich contextual ones. Embeddings are the foundation of semantic search, clustering, classification, and RAG systems — documents are encoded as embeddings and retrieved by computing vector similarity (cosine or dot product) rather than keyword matching.
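The retrieval step described above can be sketched in a few lines of plain Python. This is a minimal illustration with hand-made 3-dimensional vectors standing in for real model embeddings (which would have hundreds or thousands of dimensions); the document names and corpus are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=5):
    """Rank documents by cosine similarity to the query vector, highest first."""
    ranked = sorted(doc_vecs.items(),
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Toy corpus: in practice these vectors would come from an embedding model.
docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.9, 0.1, 0.0],
}
print(top_k([1.0, 0.0, 0.0], docs, k=2))  # → ['doc_a', 'doc_c']
```

A production system would replace the linear scan in `top_k` with an approximate nearest-neighbour index (e.g. in a vector database), but the similarity computation is the same.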
Examples
Using OpenAI's text-embedding-ada-002 to convert 10,000 product descriptions into 1536-dimensional vectors, then finding the 5 most similar products to a user query via cosine similarity.
Word embeddings capturing that 'Paris' − 'France' + 'Germany' ≈ 'Berlin' in vector arithmetic.
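The analogy example can be demonstrated with toy vectors. The 2-dimensional values below are hand-picked so the arithmetic works out; real Word2Vec embeddings have 100–300 dimensions and only approximately satisfy such analogies.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy "embeddings" constructed so that Paris - France + Germany lands on Berlin.
vecs = {
    "Paris":   [2.0, 5.0],
    "France":  [2.0, 1.0],
    "Germany": [6.0, 1.0],
    "Berlin":  [6.0, 5.0],
    "Rome":    [3.0, 5.0],  # distractor
}

def analogy(a, b, c, vecs):
    """Return the word whose vector is closest (by cosine) to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vecs[a], vecs[b], vecs[c])]
    candidates = [w for w in vecs if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vecs[w]))

print(analogy("Paris", "France", "Germany", vecs))  # → Berlin
```

Excluding the query words themselves from the candidates mirrors how libraries such as Gensim implement `most_similar`.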
Related Terms
Vector Database — A database optimised for storing and querying high-dimensional embedding vectors…
RAG (Retrieval-Augmented Generation) — Augmenting model responses by retrieving relevant documents from an external kno…
Tokenisation — The process of splitting text into tokens using a vocabulary and encoding algori…