One way to compress data is to reduce repeated or redundant information.
Run-length encoding: count the repeating blocks of data and store each block once, together with its count.
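A minimal sketch of counting repeated runs, usually called run-length encoding. The function names and the sample pixel row are made up for illustration; real formats pack the counts into bytes rather than Python tuples.

```python
from itertools import groupby


def rle_encode(data: str) -> list[tuple[str, int]]:
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Expand (symbol, count) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)


# Hypothetical row of pixel colors: 7 white, 4 yellow, 10 black.
pixels = "W" * 7 + "Y" * 4 + "B" * 10
encoded = rle_encode(pixels)

# Lossless: decoding gives back exactly what we started with.
assert rle_decode(encoded) == pixels
```

Here 21 symbols shrink to 3 pairs. Data with few repeats gains nothing from this scheme and can even grow.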
A dictionary that stores the mapping from codes to data is placed at the front of the file. The actual data is encoded using a Huffman tree.
Both methods are lossless compression - the decompressed data is identical to the original before compression, bit for bit.
These two approaches are often combined and underlie almost all lossless compressed file formats, like GIF, PNG, and ZIP files.
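A rough sketch of how a Huffman tree can turn symbol frequencies into a code dictionary. It assumes single characters as the symbols and keeps the codes as strings of "0" and "1" for readability; the function names are illustrative, and a real file format would pack the bits and serialize the dictionary in its own way.

```python
import heapq
from collections import Counter


def huffman_codes(data: str) -> dict[str, str]:
    """Build a prefix-free code where frequent symbols get shorter codes."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A lone symbol still needs at least one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(data: str, codes: dict[str, str]) -> str:
    return "".join(codes[s] for s in data)


def decode(bits: str, codes: dict[str, str]) -> str:
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # prefix-free, so the first match is correct
            out.append(reverse[current])
            current = ""
    return "".join(out)


message = "aaaabbc"
codes = huffman_codes(message)  # this dictionary is what goes at the front
bits = encode(message, codes)
assert decode(bits, codes) == message
```

The decoder needs the same dictionary the encoder used, which is why it travels with the compressed data.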
For other types of files, we can get away with small changes, removing unnecessary or less important information, especially information that human perception is not good at detecting.
Lossy audio compressors encode different frequency bands at different precisions.
We encounter this type of audio compression all the time; it is one reason why we sound different on a cellphone.
Uncompressed audio format - WAV (FLAC is compressed, but losslessly)
Lossy compressed - MP3 (roughly ten times smaller)
This idea of discarding or reducing precision in a manner that aligns with human perception is called perceptual coding, and it relies on models of human perception, which come from a field of study called psychophysics.
JPEG - most famous lossy compressed image format
Human perception is good at detecting sharp contrasts like the edges of objects, but not good with subtle color variations.
JPEG takes advantage of this by breaking images up into blocks of 8x8 pixels, then throwing away a lot of the high-frequency spatial data.
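A simplified sketch of the 8x8 block idea, using a plain discrete cosine transform (DCT) written out in pure Python. It is only an illustration under stated assumptions: real JPEG encoders divide the coefficients by a quantization table and then entropy-code them, rather than zeroing everything past a cutoff as done here. The helper names and the `keep` parameter are invented for this example.

```python
import math

N = 8  # JPEG works on 8x8 pixel blocks


def _alpha(k: int) -> float:
    """Normalization factor that makes the transform orthonormal."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(x: int, u: int) -> float:
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct2(block):
    """Pixels -> spatial frequencies. coeffs[0][0] is the block average (DC)."""
    return [[_alpha(u) * _alpha(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Spatial frequencies -> pixels (inverse of dct2)."""
    return [[sum(_alpha(u) * _alpha(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def drop_high_frequencies(coeffs, keep: int):
    """Zero every coefficient whose frequency index u + v is >= keep."""
    return [[c if u + v < keep else 0.0 for v, c in enumerate(row)]
            for u, row in enumerate(coeffs)]


# Hypothetical block: a smooth brightness gradient.
block = [[10 * x + 5 * y for y in range(N)] for x in range(N)]
coeffs = dct2(block)
kept = drop_high_frequencies(coeffs, keep=4)  # at most 10 of 64 values survive
approx = idct2(kept)                          # close to block, not identical
```

Smooth regions survive this well because most of their energy sits in the low-frequency coefficients; fine detail is what gets lost.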
Temporal redundancy, inter-frame similarity - repeating pixels in a series of images, like a static background.
Video formats just copy the patches of repeating pixels to the next frame, instead of retransmitting every pixel in every frame of a video.
When there are small differences, most video formats send data that represents just the difference between patches.
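A toy sketch of inter-frame differencing: unchanged patches cost nothing, and changed patches are stored only as per-pixel differences from the previous frame. Frames are assumed to be small grayscale grids (lists of lists of ints), and the 2x2 patch size and function names are arbitrary choices for this example, not taken from any real video format.

```python
Frame = list[list[int]]


def diff_patches(prev: Frame, curr: Frame, size: int = 2) -> dict:
    """Return {(row, col): delta_patch} for only the patches that changed."""
    height, width = len(curr), len(curr[0])
    deltas = {}
    for r in range(0, height, size):
        for c in range(0, width, size):
            delta = [[curr[r + dr][c + dc] - prev[r + dr][c + dc]
                      for dc in range(min(size, width - c))]
                     for dr in range(min(size, height - r))]
            if any(value for row in delta for value in row):
                deltas[(r, c)] = delta  # unchanged patches are simply skipped
    return deltas


def apply_deltas(prev: Frame, deltas: dict) -> Frame:
    """Rebuild the next frame: copy the previous one, then add the differences."""
    frame = [row[:] for row in prev]
    for (r, c), delta in deltas.items():
        for dr, row in enumerate(delta):
            for dc, value in enumerate(row):
                frame[r + dr][c + dc] += value
    return frame


# Hypothetical 4x4 frames: a static background where one pixel brightens.
frame1 = [[10] * 4 for _ in range(4)]
frame2 = [row[:] for row in frame1]
frame2[0][1] = 12

deltas = diff_patches(frame1, frame2)
assert apply_deltas(frame1, deltas) == frame2
```

Only one of the four patches is transmitted for the second frame; the other three are copied forward from the first.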
The fanciest video compression formats find patches that are similar between frames and not only copy them forward, with or without differences, but can also apply simple effects to them, like a shift or rotation, or lighten or darken a patch between frames.
MPEG-4 videos, a common standard, are often 20 to 200 times smaller than the original, uncompressed file.
If compressed too heavily, the video player will forge ahead, applying the right motions, even if the patch data
So it is thanks to compression technology that we can enjoy high-quality videos and photos.