AI
Artificial Intelligence
Software that mimics tasks we used to think only humans could do — recognising images, writing, planning, reasoning.
An umbrella term for systems that perform tasks normally requiring human intelligence. Modern AI is dominated by machine learning, especially deep neural networks trained on massive datasets.
ML
Machine Learning
Teaching computers patterns from examples instead of writing rules by hand.
A branch of AI where systems learn behaviour from data rather than from explicit programming. Includes supervised, unsupervised, and reinforcement learning.
Deep Learning
Machine learning using neural networks with many stacked layers — the engine behind modern AI.
A subfield of ML using neural networks with multiple hidden layers. Powers virtually every modern AI breakthrough — image recognition, speech, LLMs, protein folding.
Neural Network
A web of simple math units loosely inspired by brain cells, trained to turn inputs into useful outputs.
A computational model made of interconnected layers of nodes (neurons), each applying a weighted sum and a non-linear activation. Training adjusts the weights so the network maps inputs to desired outputs.
LLM
Large Language Model
A very large neural network trained on huge amounts of text to read, write, and reason in language.
A transformer-based model with billions of parameters trained on internet-scale text. Generates language by predicting the next token. Examples: GPT, Claude, Gemini, Llama.
SLM
Small Language Model
A compact LLM small enough to run on a laptop or phone, trading some quality for speed and privacy.
A language model in the 1B–10B parameter range, designed for on-device or low-latency inference. Often distilled from larger models or trained on high-quality curated data.
Foundation Model
A general-purpose model pretrained on broad data that you can adapt to many specific tasks.
A large model trained on diverse, broad data that serves as a base for many downstream applications via fine-tuning or prompting. Coined by Stanford CRFM.
Frontier Model
The most capable AI models in the world at any given moment — the cutting edge.
Refers to the most advanced general-purpose AI models available, typically those that push the limits of capability and scale. Subject to additional safety scrutiny.
Base Model
The raw pretrained model before it's taught to follow instructions or chat.
A language model after pre-training but before instruction tuning or RLHF. Continues text well but doesn't reliably follow user commands.
Instruct Model
A base model further trained to actually follow user instructions and answer questions properly.
A base model after supervised fine-tuning on instruction-following data, making it usable as an assistant.
Chat Model
An instruct model trained on multi-turn conversations, with a system/user/assistant turn format.
An instruct model further tuned on dialogue data and equipped with a conversational turn structure (system, user, assistant roles).
Multimodal Model
A model that can read and combine more than one kind of input — text, images, audio, video.
A model that processes multiple input modalities (text, images, audio, video) within a unified architecture, enabling tasks that span them.
Vision-Language Model
VLM
An AI that can look at an image and discuss it in language — describe, answer questions, reason about it.
A multimodal model jointly trained on images and text. Can caption, answer visual questions, do OCR, and reason over diagrams.
Generative AI
GenAI
AI that creates new content — text, images, audio, code — rather than just classifying or predicting.
A class of AI models that learn the distribution of training data and generate new samples from it. Includes LLMs, diffusion models, GANs.
Discriminative Model
A model that learns to tell categories apart instead of creating new examples.
A model that learns the boundary between classes — answers 'which?' rather than 'what does a sample look like?'. Logistic regression, classifiers, BERT for classification.
Supervised Learning
Training a model on labelled examples — every input comes paired with the right answer.
An ML paradigm where models learn from input-output pairs. The model adjusts its parameters to minimise the gap between predicted and true labels.
Unsupervised Learning
Finding patterns in data without anyone labelling what's what.
An ML paradigm where models discover structure in unlabelled data — clustering, dimensionality reduction, density estimation.
Reinforcement Learning
RL
Training an agent by giving it rewards for good actions and penalties for bad ones — like training a dog with treats.
An ML paradigm where an agent learns by interacting with an environment and optimising a reward signal. Foundational to RLHF, game-playing AI, robotics.
Self-Supervised Learning
SSL
Training a model by hiding part of the input and asking it to predict the missing piece — no human labels needed.
Models generate their own training signal from the data itself — e.g. masking words and predicting them. The dominant paradigm for pre-training LLMs.
Transfer Learning
Taking a model trained on one task and adapting it to a different but related task — faster than starting fresh.
Reusing knowledge learned on one task to bootstrap performance on another, typically via fine-tuning a pretrained model.