Origins of data compression
Data compression began as a simple observation: common things deserve shorter descriptions.
Short version
Compression is the art of representing information with fewer bits. Sometimes it keeps everything exactly, which is called lossless compression. Sometimes it removes details that are considered less important, which is called lossy compression.
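The difference between the two kinds is easy to see in a few lines of Python. The sketch below is only an illustration: it uses the standard zlib module as the lossless compressor, and a made-up quantization step (STEP, a hypothetical name chosen for this example) as a toy stand-in for what a lossy codec does far more carefully.

```python
import zlib

# A small, repetitive byte sequence standing in for real data.
original = bytes([10, 11, 12, 13, 200, 201, 202, 203] * 32)

# Lossless: decompressing gives back exactly the bytes that went in.
packed = zlib.compress(original)
restored = zlib.decompress(packed)

# Lossy (toy version): round every value down to a multiple of STEP,
# throwing away fine detail before compressing.
STEP = 16  # hypothetical quantization step, chosen only for this example
coarse = bytes((value // STEP) * STEP for value in original)
approx = zlib.decompress(zlib.compress(coarse))

# The lossy result is close to the original, but not identical.
max_error = max(abs(a - b) for a, b in zip(original, approx))

print("lossless round trip exact:", restored == original)
print("lossy round trip exact:   ", approx == original)
print("largest lossy error:      ", max_error)
```

The lossless path returns every byte unchanged. The lossy path never does better than "close": each value can be off by up to STEP - 1, and that lost detail cannot be recovered afterwards.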
Modern codecs use both ideas. A video codec may predict what changes from frame to frame, transform image detail into frequency information, quantize less-visible detail, and then use entropy coding to store the result efficiently.
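As a rough sketch of how those stages fit together, the toy example below predicts a frame from the previous one, quantizes the difference, and entropy-codes the result. It rests on several assumptions: frames are plain lists of 0 to 255 values, the transform stage is skipped entirely, zlib stands in for a real entropy coder, and the names encode_frame, decode_frame, and STEP are invented for this illustration. No real video codec is this simple.

```python
import zlib

STEP = 4  # hypothetical quantization step: larger means fewer bits, more error

def encode_frame(previous, current):
    # Predict: guess that the new frame equals the previous one,
    # and keep only the difference (the residual).
    residual = [c - p for p, c in zip(previous, current)]
    # Quantize: store each difference as a whole number of STEPs.
    levels = [round(r / STEP) for r in residual]
    # Entropy-code: pack each signed level into one byte, then let zlib
    # squeeze out the redundancy (mostly runs of zeros).
    raw = bytes(level & 0xFF for level in levels)
    return zlib.compress(raw)

def decode_frame(previous, payload):
    raw = zlib.decompress(payload)
    # Undo the one-byte packing of signed levels.
    levels = [b - 256 if b > 127 else b for b in raw]
    # Undo quantization and add the prediction back.
    return [p + level * STEP for p, level in zip(previous, levels)]

# Two nearly identical "frames": only two samples change.
frame_a = [100] * 64
frame_b = list(frame_a)
frame_b[10] = 109
frame_b[11] = 93

payload = encode_frame(frame_a, frame_b)
decoded = decode_frame(frame_a, payload)
worst = max(abs(a - b) for a, b in zip(frame_b, decoded))
print("largest reconstruction error:", worst)
```

Because most of the residual is zero, the entropy coder has very little to store. The price is a small reconstruction error, bounded here by half of STEP.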
The big idea
Most real-world data contains patterns. Text repeats letters and words. Images contain flat areas and repeated textures. Video frames resemble the frames before and after them. In audio, human hearing is more sensitive to some frequencies than others, and a loud sound can mask a quieter one close to it.
Compression works by exploiting these patterns. The more predictable the data, the fewer bits it takes to describe.
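One classic way to exploit a pattern in symbol frequencies is Huffman coding, which the text above does not name and which is offered here purely as an illustration. The sketch below computes only the length of each symbol's code, not the bit patterns themselves. The function name huffman_code_lengths and the sample string are assumptions made up for this example.

```python
import heapq
from collections import Counter

def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman code built from text."""
    counts = Counter(text)
    if len(counts) == 1:
        # With only one distinct symbol, give it a 1-bit code.
        return {symbol: 1 for symbol in counts}
    # Heap entries: (total count, tie-breaker, {symbol: depth in the tree}).
    heap = [(count, i, {symbol: 0})
            for i, (symbol, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two rarest groups; every symbol inside moves one level deeper.
        count_a, _, depths_a = heapq.heappop(heap)
        count_b, _, depths_b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in depths_a.items()}
        merged.update({s: d + 1 for s, d in depths_b.items()})
        heapq.heappush(heap, (count_a + count_b, next_id, merged))
        next_id += 1
    return heap[0][2]

sample = "aaaaaaaabbbbccd"  # 'a' is common, 'd' is rare
lengths = huffman_code_lengths(sample)
huffman_bits = sum(lengths[symbol] for symbol in sample)
fixed_bits = 2 * len(sample)  # four symbols need 2 bits each with a fixed-width code

print(lengths)                   # {'d': 3, 'c': 3, 'b': 2, 'a': 1}
print(huffman_bits, fixed_bits)  # 25 30
```

The common symbol gets a 1-bit code and the rare ones get 3 bits, so the whole string takes 25 bits instead of 30. The more lopsided the frequencies, the bigger the saving.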
Why this matters for codecs
When people compare MP3, AAC, Opus, H.264, H.265, AV1, or FLAC, or weigh any of them against uncompressed audio in a WAV file, they are often comparing different answers to the same old question: how can we keep the useful information while spending fewer bits?
That is why a codec page is easier to understand once you know the roots of compression. Codecs are not magic. They are layers of practical tricks built on older mathematical ideas.