
Definition
LLM stands for Large Language Model, a type of artificial intelligence (AI) specialised in understanding, generating, and manipulating human language. These models are a subset of generative AI that focuses primarily on text-based tasks, such as writing, translation, summarisation, and question answering.
Key characteristics of LLMs include:
Architecture
Built on transformer neural networks, which enable them to process entire sequences of text in parallel and learn contextual relationships between words.
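The mechanism that lets a transformer relate every word to every other word is self-attention. The following is a minimal sketch of scaled dot-product attention in plain Python, not a production implementation. It assumes the token vectors are used directly as queries, keys, and values, whereas a real transformer first passes them through learned projection matrices. The names `softmax`, `self_attention`, and the toy `tokens` list are illustrative only.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over a short sequence.

    Each argument is a list of equal-length vectors, one per token.
    The loop is written per token for readability; real implementations
    compute all tokens at once as matrix multiplications.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this token's query to every token's key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy 2-dimensional "embeddings" for three tokens (assumed values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to all three tokens. Because no row depends on another, the rows can be computed simultaneously, which is what processing a sequence in parallel means in practice.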
Training Data
Trained on massive datasets (often terabytes of text drawn from books, articles, code, and web content) to recognize linguistic patterns.
Scale
Typically contain billions of parameters (the numerical weights learned during training), which gives them the capacity to handle complex language tasks.
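To see where "billions of parameters" comes from, a common back-of-the-envelope estimate counts roughly 12 × d_model² weights per transformer layer (about 4 × d_model² for attention and 8 × d_model² for a feed-forward block four times as wide as the model), plus the token embedding table. The sketch below applies that approximation. The configuration values are hypothetical, and the formula ignores biases, layer normalisation, and architectural variations, so treat the result as an order-of-magnitude figure.

```python
def estimate_parameters(n_layers, d_model, vocab_size):
    """Rough parameter count for a decoder-only transformer.

    Assumes the standard layout: attention uses about 4 * d_model^2
    weights and the feed-forward block about 8 * d_model^2, giving
    12 * d_model^2 per layer. Biases and normalisation are ignored.
    """
    per_layer = 12 * d_model ** 2
    embeddings = vocab_size * d_model
    return n_layers * per_layer + embeddings


# Hypothetical configuration, chosen only for illustration.
estimate = estimate_parameters(n_layers=32, d_model=4096, vocab_size=32000)
print(f"{estimate:,} parameters (about {estimate / 1e9:.1f} billion)")
```

Because the count grows with the square of the model width, doubling `d_model` roughly quadruples the number of parameters.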
Capabilities
- Generate coherent, context-aware text (e.g., essays, code, or dialogue).
- Summarise long documents.
- Answer open-ended questions.