by Michele Laurelli
A large language model (LLM) is a neural network with billions of parameters, trained on massive text datasets to understand and generate human language.
LLMs such as GPT-4, Claude, and LLaMA are trained on diverse text drawn from across the internet. They exhibit emergent abilities such as reasoning, few-shot learning, and task generalization. Scale is a key factor: larger models generally perform better. Notable examples, followed by a short few-shot prompting sketch:
GPT-4, reported (though never confirmed by OpenAI) to have roughly 1.76 trillion parameters
LLaMA 2, Meta's openly licensed model family, for open-source and self-hosted applications
Claude, Anthropic's model family, for long-context understanding
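
As a rough illustration of few-shot learning, the sketch below assembles a few-shot prompt and sends it to a small openly available model via the Hugging Face transformers pipeline. GPT-2 stands in here purely for convenience (larger LLMs expose the same interface), and the example sentences are invented for illustration.

```python
from transformers import pipeline

# Few-shot prompt: a couple of worked examples, then the new query.
# The sentences below are invented purely for illustration.
prompt = (
    "Translate English to French.\n"
    "English: Hello. French: Bonjour.\n"
    "English: Thank you. French: Merci.\n"
    "English: Good night. French:"
)

# GPT-2 is a small stand-in; larger models follow the same interface.
generator = pipeline("text-generation", model="gpt2")
print(generator(prompt, max_new_tokens=5)[0]["generated_text"])
```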
GPT (Generative Pre-trained Transformer) is a family of large language models developed by OpenAI that use the transformer architecture for text generation.
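
To make "transformer architecture for text generation" concrete, here is a minimal sketch of autoregressive generation with an openly available GPT-family model (GPT-2) through Hugging Face transformers; it is illustrative, not OpenAI's own API.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Encode a prompt, then extend it token by token (greedy decoding).
inputs = tokenizer("The transformer architecture", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```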
The transformer is a neural network architecture based entirely on attention mechanisms, dispensing with recurrent and convolutional layers.
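
As a sketch of the core idea, the function below implements scaled dot-product attention, the basic building block of the transformer, in PyTorch; the tensor shapes are assumptions noted in the comments.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    # Similarity of every query to every key, scaled to stabilize the softmax.
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    weights = F.softmax(scores, dim=-1)  # each row sums to 1 over the keys
    return weights @ v                   # weighted average of the values

# Toy shapes: batch of 2, sequence length 4, head dimension 8.
q = k = v = torch.randn(2, 4, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 4, 8])
```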
Fine-tuning is the process of adapting a pre-trained model to a specific task by continuing training on task-specific data.
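
A minimal fine-tuning sketch, assuming a sequence-classification task: the model name, the two training examples, and the hyperparameters below are illustrative placeholders, not a recipe from the text.

```python
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical task-specific data; replace with your own labeled examples.
train_texts = ["great product", "terrible service"]
train_labels = torch.tensor([1, 0])

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

batch = tokenizer(train_texts, padding=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=5e-5)

model.train()
for epoch in range(3):  # continue training on the task data
    outputs = model(**batch, labels=train_labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice the same pattern scales up: swap in a larger pre-trained model, a real dataset with batching, and a validation loop, while the core step of continuing gradient descent on task-specific data stays the same.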