Further Background Information from OpenAI

We encourage you to gain familiarity with the OpenAI documentation, which is provided here.

The documentation provides useful background information on how OpenAI’s GPT models function and on troubleshooting AI generation issues that may arise.

Three helpful concepts you may wish to explore a little further are text generation models, embeddings, and tokens.



Text Generation Models

OpenAI’s text generation models — often called GPT models (short for Generative Pre-trained Transformers) — are designed to understand and produce human-like language. Examples include GPT-4 and GPT-3.5.

These models generate text based on inputs known as prompts. A prompt is simply the set of instructions or examples you provide, which effectively “programs” the model to carry out a task. By adjusting your prompt, you can guide the model to write content, answer questions, summarize information, generate code, hold conversations, or create stories.
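To make this concrete, here is a minimal sketch of a text generation call using OpenAI’s official Python package (the v1.x openai library). It assumes the OPENAI_API_KEY environment variable is set, and the model name shown is just one option:

    # Minimal text generation sketch using the openai Python package (v1.x).
    # Assumes the OPENAI_API_KEY environment variable is set.
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4",  # "gpt-3.5-turbo" also works here
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Summarize what a prompt is in one sentence."},
        ],
    )

    print(response.choices[0].message.content)

The prompt here is the messages list: changing its contents is how you “program” the model to perform a different task.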


Embeddings

An embedding is a way of turning a piece of data — such as a sentence — into a list of numbers (a vector) that captures its meaning. Pieces of text with similar meaning or context will have embeddings that are close together, while unrelated text will be farther apart.

OpenAI’s embedding models take text as input and return these numerical representations. They are widely used in tasks like the following (a short code sketch appears after this list):

  • Search (finding the most relevant results)

  • Clustering (grouping similar items)

  • Recommendations

  • Detecting unusual patterns

  • Classification
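As an illustration, the sketch below (again assuming the openai v1.x Python package and an OPENAI_API_KEY in the environment) embeds three strings and compares them with cosine similarity; the model name is one of OpenAI’s current embedding models and can be swapped for another:

    # Embedding sketch: similar sentences should yield vectors that are
    # close together (cosine similarity near 1), unrelated ones farther apart.
    import math
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set

    resp = client.embeddings.create(
        model="text-embedding-3-small",
        input=[
            "The cat sat on the mat",
            "A kitten rested on the rug",
            "Quarterly tax filing deadlines",
        ],
    )
    vectors = [item.embedding for item in resp.data]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    print(cosine(vectors[0], vectors[1]))  # similar meaning: closer to 1
    print(cosine(vectors[0], vectors[2]))  # unrelated: noticeably lower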


Tokens

Both text generation and embedding models process text in small units called tokens. Tokens are not the same as words — they are chunks of characters. For example:

  • The word "tokenization" is split into "token" and "ization".

  • A short word like "the" is represented as a single token.

As a rule of thumb, 1 token ≈ 4 characters, or about three-quarters of a word in English.
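If you want to see tokenization in action, OpenAI’s tiktoken library (pip install tiktoken) exposes the tokenizers the models use; this short sketch reproduces the two examples above:

    # Inspecting tokens with tiktoken; cl100k_base is the encoding used by
    # GPT-4 and GPT-3.5-turbo.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    for text in ["tokenization", "the"]:
        token_ids = enc.encode(text)
        pieces = [enc.decode([tid]) for tid in token_ids]
        print(f"{text!r} -> {len(token_ids)} token(s): {pieces}")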


Context Length

Each model has a maximum context length, which is the total number of tokens it can handle at once.

  • For text generation models, this limit applies to the prompt plus the generated output.

  • For embedding models, this limit applies to the input text only (since they don’t generate tokens).
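A practical consequence is that long prompts leave less room for output. The sketch below (using tiktoken again; the 4,096-token limit is illustrative, so check your model’s documented maximum) estimates whether a prompt leaves enough room for a completion:

    # Rough pre-flight check that a prompt fits within a model's context
    # window while leaving room for the generated output.
    import tiktoken

    MAX_CONTEXT = 4096          # illustrative; varies by model
    RESERVED_FOR_OUTPUT = 500   # tokens to keep free for the completion

    enc = tiktoken.get_encoding("cl100k_base")
    prompt = "Summarize the following report: ..."

    prompt_tokens = len(enc.encode(prompt))
    available = MAX_CONTEXT - prompt_tokens
    if available < RESERVED_FOR_OUTPUT:
        print(f"Prompt too long: {prompt_tokens} tokens leaves only "
              f"{available} for output.")
    else:
        print(f"OK: {prompt_tokens} prompt tokens, {available} left for output.")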