Stanford EE274: Data Compression I 2023 I Lecture 3 - Kraft Inequality, Entropy, Introduction to SCL

Stanford EE274: Data Compression I 2023 I Lecture 3 - Kraft Inequality, Entropy, Introduction to SCL

Prefix Codes

  • Prefix codes simplify the decoding process by ensuring that no code word is a prefix of another code word.
  • A good code has the length of the encoding of a symbol roughly equal to the logarithm of 1/PX.
  • The Shannon code is a construction method for prefix codes where symbols are sorted according to their probabilities, and code words are assigned without violating the prefix-free condition.
  • Kraft's inequality characterizes prefix codes as satisfying a certain mathematical inequality.
  • Prefix codes are a subset of uniquely decodable codes, and Kraft's inequality can be extended to uniquely decodable codes as well.


  • Entropy is a measure of the uncertainty associated with a random variable.
  • Entropy is a function of the probability distribution, not the random variable itself.
  • Entropy is measured in bits.
  • Entropy is 0 for a deterministic random variable and log K for a uniformly distributed random variable.
  • Entropy is a measure of uncertainty or randomness in a random variable.
  • Entropy is the amount of information contained in a message.
  • Entropy is the limit of compression for a given random variable.
  • The joint entropy of independent random variables is the sum of the entropies of the individual variables.
  • The entropy of a tuple of independent and identically distributed random variables is n times the entropy of a single random variable.


  • The K-Divergence is a measure of distance between two probability distributions.
  • The K-Divergence is positive and equal to zero if and only if the two distributions are equal.

Shannon's Source Coding Theorem

  • The expected length of any prefix code for a given distribution is greater than or equal to the entropy of the distribution.
  • Entropy is the fundamental limit of lossless compression.
  • Shannon proved in 1948 that no prefix code can have an expected code length lower than the entropy of the source.
  • This result holds for any uniquely decodable code, not just prefix codes.

Huffman Codes

  • Huffman codes are the optimal codes for a given distribution and can achieve arbitrarily close to the entropy by increasing the block size.

Block Codes

  • Block codes can achieve entropy, but they have exponential complexity and require huge tables.

Arithmetic Codes and ANS Codes

  • Arithmetic codes and ANS codes achieve the same result as block codes but are more efficient.

Overwhelmed by Endless Content?