LLM

Definition

LLM stands for Large Language Model, a type of artificial intelligence (AI) specialised in understanding, generating, and manipulating human language. LLMs are a subset of generative AI that focuses primarily on text-based tasks, such as writing, translation, summarisation, and answering questions.

Key characteristics of LLMs include:

Architecture

Built on transformer neural networks, which enable them to process entire sequences of text in parallel and learn contextual relationships between words.
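
The mechanism that lets a transformer relate every word to every other word is called self-attention. The following is a minimal sketch of scaled dot-product self-attention in plain Python, under simplifying assumptions: the names `softmax`, `self_attention`, and `tokens` are illustrative, the input vectors are made up, and the queries, keys, and values are the inputs themselves, whereas a real transformer first passes them through learned projections.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention over equal-length vectors.

    Simplification (assumption): queries, keys, and values are the
    input vectors themselves, with no learned projection matrices.
    """
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Score this position against every position in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted average of all the vectors.
        mixed = [
            sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
    return outputs

# Three hypothetical 2-dimensional "word" vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(tokens)
```

Each output vector depends on the whole sequence but not on the other outputs, which is why real implementations can compute all positions at once as matrix operations rather than in a loop.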

Training Data

Trained on massive datasets (often terabytes of text drawn from books, articles, code, and web content) to recognise linguistic patterns.

Scale

Typically contain billions of parameters (the numerical weights learned during training), giving them the capacity to handle complex language tasks.
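
To give a sense of what "billions of parameters" means in practice, here is a rough back-of-the-envelope sketch. The function name `weight_memory_gb` and the 7-billion-parameter figure are illustrative assumptions, not properties of any particular model. It estimates only the memory needed to store the weights, assuming 2 bytes per parameter (16-bit floating point) by default.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Approximate memory to store the weights alone, in gigabytes (10^9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9

# A hypothetical 7-billion-parameter model.
half_precision = weight_memory_gb(7_000_000_000)      # 16-bit weights
full_precision = weight_memory_gb(7_000_000_000, 4)   # 32-bit weights
print(half_precision, full_precision)  # 14.0 28.0
```

Actual memory use while running a model is higher, since activations and other working data also take space.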

Capabilities
  • Generate coherent, context-aware text (e.g., essays, code, or dialogue).
  • Summarise long documents.
  • Answer open-ended questions.