pk.org: CS 419/Lecture Notes

Theoretical Foundations

Part 4 – Shannon, Perfect Secrecy, and Randomness

Paul Krzyzanowski – 2025-09-15

Part 1: Foundations of Cryptography
Part 2: Classical Ciphers
Part 3: Mechanized Cryptography
Part 4: Theoretical Breakthroughs
Part 5: Modern Symmetric Cryptography
Part 6: Principles of Good Cryptosystems
Part 7: Introduction to Cryptanalysis


Introduction

The mechanized era showed that engineering sophistication could create much stronger ciphers than classical methods. Yet even Enigma, with its astronomical key space, was defeated. The lesson was that complexity and size alone do not guarantee security. Cryptography needed a mathematical foundation — a way to measure security, understand what made ciphers strong or weak, and design systems with provable properties.

In 1949, Claude Shannon, who created the field of information theory, published Communication Theory of Secrecy Systems. Written during World War II at Bell Labs, it gave cryptography its first rigorous framework: how to measure uncertainty, how to reason about secrecy, and how to define when a cipher is secure.

Entropy: Measuring Information Content

Shannon's breakthrough was to treat information as something that could be measured mathematically. His concept of entropy quantifies the uncertainty or randomness in a message.

The formula for Shannon entropy is \(H(X) = -\sum_{i=1}^{n} p(x_i) \log_{2} p(x_i)\), where \(p(x_i)\) is the probability of the \(i\)-th possible outcome and the sum runs over all \(n\) outcomes.

Entropy \(H(X)\) gives us the expected (average) information per outcome. High entropy represents more unpredictability (like a fair coin). Low entropy represents less unpredictability (like a weighted coin that almost always lands heads).

It measures the average number of bits needed to determine the outcome. For example, a fair coin has an entropy of 1 bit, a fair six-sided die has \(\log_{2} 6 \approx 2.58\) bits, and a coin that lands heads 99% of the time has only about 0.08 bits.
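
As a small illustration (not part of the original notes), the Python sketch below computes Shannon entropy for a few example distributions; the probabilities are just illustrative values.

import math

def shannon_entropy(probs):
    # Shannon entropy, in bits, of a discrete probability distribution.
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))      # fair coin: 1.0 bit
print(shannon_entropy([0.99, 0.01]))    # heavily weighted coin: ~0.08 bits
print(shannon_entropy([1/6] * 6))       # fair six-sided die: ~2.585 bits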

English Text Has Low Entropy

This mathematical framework revealed why classical ciphers failed. Text has much less entropy than its symbol space suggests:

Maximum possible: If all 27 symbols (A-Z plus space) appeared with equal probability, the entropy would be \(\log_{2} 27 \approx 4.75\) bits per character.

Actual entropy: Shannon's experiments with human subjects showed that English text carries only about 1 to 1.5 bits per character once context is taken into account (that is, in real, running text).

This means that most of each letter's information capacity is redundancy: predictable structure that doesn't need to be transmitted. Redundancy is what lets us guess letters in crossword puzzles, allows us to understand speech with missing words, and enables spell-checkers to guess what we meant.

But redundancy is also what made frequency analysis work. A cipher that preserves this redundancy (like simple substitution) leaves the statistical fingerprints that cryptanalysts can exploit.

The Cryptographic Implication

Strong encryption must eliminate redundancy. The ciphertext should have high entropy: it should look statistically random, with no predictable patterns. If identical plaintext blocks produce identical ciphertext blocks, or if letter frequencies show through, the redundancy is preserved and the cipher is vulnerable.

Perfect Secrecy: The Theoretical Ideal

Shannon formalized the intuitive notion of "perfect security" with a precise definition: perfect secrecy means the ciphertext reveals absolutely nothing about the plaintext.

Formally, for every plaintext \(p\) and ciphertext \(c\), \(\Pr[P=p \mid C=c] = \Pr[P=p]\).

In other words, seeing the ciphertext doesn't change your knowledge about what the plaintext might be. Before seeing the ciphertext, certain plaintexts were more likely than others based on context. After seeing the ciphertext, those probabilities are exactly the same.

The One-Time Pad: Achieving Perfect Secrecy

Shannon proved that perfect secrecy is achievable, and he identified the conditions required. The one-time pad meets these conditions:

Algorithm

  1. Generate a truly random key that is at least as long as the message.
  2. Encrypt by XORing each plaintext bit with the corresponding key bit.
  3. Decrypt by XORing each ciphertext bit with the same key bit.
  4. Destroy the key after use and never use it again.

Example

Plaintext:  10110010
Key:        01101100  (random)
Ciphertext: 11011110

To decrypt: 11011110 ⊕ 01101100 = 10110010
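
A minimal Python sketch of the one-time pad under these rules, assuming the key comes from the operating system's CSPRNG (os.urandom) and is used exactly once; the message is just an example.

import os

def otp_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    # Generate a fresh random key as long as the message and XOR them together.
    key = os.urandom(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return ciphertext, key

def otp_decrypt(ciphertext: bytes, key: bytes) -> bytes:
    # XORing with the same key undoes the encryption.
    return bytes(c ^ k for c, k in zip(ciphertext, key))

message = b"ATTACK AT DAWN"
ciphertext, key = otp_encrypt(message)
assert otp_decrypt(ciphertext, key) == message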

Historical Context: The Vernam Cipher

The one-time pad has its roots in the Vernam cipher, invented by Gilbert Vernam at AT&T in 1917. Vernam was working on secure telegraph communications and developed a system that combined plaintext with a key tape using the XOR operation (though he described it in terms of the Baudot code used in teleprinters).

Vernam's original system used a key tape that could be reused, which made it vulnerable to attack. The crucial insight, that the key must be used only once, came later through the work of Army cryptographer Joseph Mauborgne. Around 1919, Mauborgne recognized that if the key tape was truly random, never reused, and as long as the message, the system would be unbreakable.

This combination of Vernam's mechanical implementation and Mauborgne's theoretical insight created what we now call the one-time pad. Shannon's later work provided the mathematical framework to prove why these conditions were necessary and sufficient for perfect secrecy.

Why the One-Time Pad Is Perfect

The proof is elegant. For any plaintext P and any ciphertext C, there exists exactly one key K such that P ⊕ K = C. Since the key is chosen uniformly at random, every possible plaintext is equally likely to have produced the observed ciphertext.

An adversary who intercepts the ciphertext 11011110 gains no information about whether the plaintext was 10110010, 00101010, 11111111, or any other 8-bit string. Each is equally probable given a random key.
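
The Python sketch below (not in the original notes) brute-forces this claim for the 8-bit example above: every one of the 256 possible plaintext bytes is explained by exactly one key byte, so the ciphertext alone rules nothing out.

ciphertext = 0b11011110

# For every candidate plaintext there is exactly one key consistent with the ciphertext.
keys = {p: p ^ ciphertext for p in range(256)}

assert len(set(keys.values())) == 256        # 256 distinct keys, one per plaintext
print(f"{keys[0b10110010]:08b}")             # prints 01101100, the key from the example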

The Price of Perfect Secrecy

Shannon also proved the conditions under which perfect secrecy is possible:

  1. Keys must be truly random: Any bias or predictability breaks the proof
  2. Keys must be at least as long as the message: You need as much random key material as data to protect
  3. Keys must never be reused: Using the same key twice catastrophically breaks security

Why key reuse is fatal: If two messages P₁ and P₂ use the same key K:

C₁ = P₁ ⊕ K
C₂ = P₂ ⊕ K
Therefore: C₁ ⊕ C₂ = P₁ ⊕ P₂

The key cancels out, revealing the relationship between the two plaintexts. With language redundancy, this often reveals both messages.
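
A short Python sketch of this key cancellation, using two made-up messages of equal length:

import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

p1 = b"MEET AT NOON TODAY"
p2 = b"FALL BACK TO RIVER"
key = os.urandom(len(p1))        # the same pad, mistakenly used twice

c1 = xor(p1, key)
c2 = xor(p2, key)

# The key cancels out: XORing the two ciphertexts yields the XOR of the
# two plaintexts, with no key material left to hide them.
assert xor(c1, c2) == xor(p1, p2)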

Practical problems with the one-time pad

The requirements of the one-time pad are impractical for most applications. The key must be as long as the data and can never be reused, which creates several practical problems:

  1. Key generation: producing large volumes of truly random key material is slow and expensive.
  2. Key distribution: the key must be delivered to the recipient securely before any message can be sent.
  3. Key storage: both parties must protect the key until it is used and then destroy it.
  4. Synchronization: sender and receiver must agree on exactly which key material has already been consumed.

In addition to these storage and distribution burdens, because the key cannot be reused, the one-time pad replaces the problem of sharing a message securely with that of securely sharing a key that is as long as the message.

Historical Use

Despite these limitations, one-time pads have been used when perfect security justified the cost:

  - Diplomatic and intelligence traffic, where agents carried printed pads of key material.
  - The Moscow-Washington hotline established in 1963, which used one-time tape equipment.
  - Cold War espionage networks, whose traffic generally remained unreadable except where pads were reused (a mistake exploited by the U.S. VENONA project).

These systems required elaborate key distribution networks, diplomatic pouches, and careful operational procedures, demonstrating both the possibility and the cost of perfect secrecy.

Computational Security: The Practical Alternative

Since perfect secrecy is usually impractical, real-world cryptography aims for computational security: making attacks infeasible with available resources rather than impossible in principle.

The Modern Goal

Instead of perfect secrecy, we target computational indistinguishability: to any adversary with realistic computational resources, the ciphertext should be indistinguishable from random data.

Practical interpretation:

  - No statistical biases in the ciphertext
  - No visible patterns or repetitions
  - No compression possible (randomness doesn't compress; see the sketch after this list)
  - No feasible way to recover the plaintext without the key
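
As a rough illustration of the compression point, the Python sketch below compares how well zlib compresses redundant English-like text against random bytes of the same length; the exact sizes will vary, but the gap is dramatic.

import os, zlib

text = b"the quick brown fox jumps over the lazy dog " * 100   # highly redundant input
random_data = os.urandom(len(text))

print(len(text), len(zlib.compress(text)))                 # compresses to a small fraction
print(len(random_data), len(zlib.compress(random_data)))   # stays about the same size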

Attack Models

To make this concrete, cryptographers define specific attack models that represent what adversaries can do and test ciphers against these models:

  - Ciphertext-only attack: the adversary sees only intercepted ciphertext.
  - Known-plaintext attack: the adversary also has some plaintext-ciphertext pairs.
  - Chosen-plaintext attack: the adversary can obtain the ciphertexts of plaintexts it chooses.
  - Chosen-ciphertext attack: the adversary can additionally obtain the decryptions of ciphertexts it chooses.

Modern ciphers are designed to remain secure even under the stronger of these models, in which adversaries have significant capabilities, and the models give cryptanalysts a standard yardstick for evaluating new ciphers.

Confusion and Diffusion: Design Principles

Shannon identified two essential properties that secure ciphers must exhibit: confusion and diffusion.

Confusion
Hides the relationship between the key and the ciphertext. Each output bit should depend on the key in a complex, nonlinear way that resists analysis.
Small changes to the key or input should scramble many output bits in ways that are hard to predict. Modern ciphers use substitution boxes (S-boxes) — small lookup tables that implement nonlinear transformations. An 8-bit S-box takes an 8-bit input and produces an 8-bit output, but the mapping is carefully designed so that simple relationships (like XOR) don't hold.
Diffusion
Spreads the influence of each input bit across many output bits. A change in any single plaintext or key bit should affect many ciphertext bits in unpredictable ways.
Modern ciphers use permutation operations that rearrange and mix bits or bytes. Linear transformations like matrix multiplication can provide diffusion while remaining invertible for decryption.
The avalanche effect: Proper diffusion creates an "avalanche effect," the property that changing one input bit changes about half of the output bits. This ensures that small differences in input produce large, unpredictable differences in output.
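
One way to see the avalanche effect in practice is with SHA-256 (a hash function rather than a cipher, but built from the same confusion-and-diffusion ideas); flipping a single input bit changes roughly half of the 256 output bits. A sketch:

import hashlib

msg = bytearray(b"hello, avalanche")
h1 = hashlib.sha256(msg).digest()

msg[0] ^= 0x01                       # flip one input bit
h2 = hashlib.sha256(msg).digest()

# Count output bits that differ; expect roughly 128 out of 256.
changed = sum(bin(a ^ b).count("1") for a, b in zip(h1, h2))
print(f"{changed} of 256 output bits changed")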

Combining Confusion and Diffusion

Neither property alone is sufficient:

  - Confusion without diffusion: local changes remain local, allowing divide-and-conquer attacks
  - Diffusion without confusion: relationships between input and output remain linear and solvable

Most ciphers apply both properties in multiple rounds. Each round does:

  1. Substitute (confusion): Apply nonlinear S-boxes
  2. Permute (diffusion): Mix and rearrange the result
  3. Mix key material: Combine with a round key derived from the main key

After enough rounds, small changes in input or key affect the entire output in highly nonlinear ways.
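
A toy Python sketch of one such round over 8-bit blocks, using a 4-bit S-box (borrowed from the PRESENT block cipher) for substitution and a simple bit rotation for permutation; real ciphers use larger blocks and far more carefully engineered components.

# 4-bit S-box applied to each nibble: confusion.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def substitute(block: int) -> int:
    # Run each 4-bit half of the 8-bit block through the S-box.
    return (SBOX[block >> 4] << 4) | SBOX[block & 0x0F]

def permute(block: int) -> int:
    # Rotate the 8 bits left by 3 positions: diffusion.
    return ((block << 3) | (block >> 5)) & 0xFF

def round_fn(block: int, round_key: int) -> int:
    # One round: substitute, permute, then mix in the round key.
    return permute(substitute(block)) ^ round_key

state = 0b10110010
for round_key in (0x3A, 0xC4, 0x5F, 0x91):   # made-up round keys
    state = round_fn(state, round_key)
print(f"{state:08b}")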


Randomness

Both perfect secrecy and computational security depend critically on randomness. One-time pads need truly random keys; modern ciphers need random keys and often random initialization values.

Random vs. Pseudorandom

In practice, we obtain sequences of random or pseudorandom bits:

  - True random bits come from unpredictable physical processes (electrical noise, timing jitter, radioactive decay). They are high quality but slow and awkward to collect in volume.
  - Pseudorandom bits come from a deterministic algorithm that expands a short seed. A cryptographically secure pseudorandom number generator (CSPRNG) produces output that is computationally indistinguishable from true randomness as long as its seed is secret and unpredictable.

How operating systems provide randomness

Modern operating systems provide cryptographically secure randomness:

  - Linux and other Unix-like systems expose it through /dev/urandom and the getrandom() system call.
  - Windows provides it through the BCryptGenRandom API.
  - macOS and iOS provide it through SecRandomCopyBytes and the arc4random family.

These systems:

  1. Collect entropy from hardware events (keyboard timing, disk delays, network interrupts) or from CPUs that include hardware random number generators
  2. Hash and mix the entropy sources to eliminate bias
  3. Seed a CSPRNG that can produce unlimited output
  4. Provide applications with cryptographically secure random numbers (see the sketch below)
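
In application code, the usual advice is to use the language's wrapper around the OS interface rather than a general-purpose PRNG; a Python sketch:

import os
import secrets

key = os.urandom(32)             # 256 bits straight from the OS CSPRNG
token = secrets.token_hex(16)    # convenience wrapper, also backed by the OS CSPRNG
nonce = secrets.randbits(96)     # e.g., 96 random bits for a per-message nonce

print(key.hex(), token, nonce)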

Why getting randomness right is hard

Freshly booted machines may not have gathered enough environmental noise yet. Virtual machines and embedded devices often have fewer and less diverse hardware events. When developers bypass the OS interface, seed a general-purpose PRNG, or accidentally reuse per-operation randomness, the results are often predictable.

There have been several high-profile examples where bad random values broke security. Some of these are:

  - The 2008 Debian OpenSSL bug, in which a patch crippled the seeding code and left only a small, enumerable set of possible keys.
  - Sony's PlayStation 3 firmware signing, which reused the per-signature random value in ECDSA and allowed the private signing key to be recovered.
  - Early Netscape SSL implementations, which seeded their generator from the time of day and process IDs, making session keys guessable.
  - Internet-wide scans that found many RSA keys on embedded devices sharing prime factors because the devices generated keys with too little entropy shortly after boot.

Quantum randomness (current research)

Researchers continue to search for high-quality randomness that can be obtained at high rates. Quantum physics offers entropy sources that are unpredictable in principle. A simple example is sending single photons into a 50–50 beam splitter and recording which detector clicks.

More ambitious work uses Bell-test techniques that certify randomness under clear physical assumptions, even if the device internals are untrusted.

In June 2025, NIST and partners launched CURBy (the Colorado University Randomness Beacon), a free public service that publishes traceable, verifiable quantum-generated random numbers using an entanglement-based Bell test and a provenance protocol (“Twine”). Today, such sources are mostly used to seed conventional generators or in specialized links; for general software, the guidance remains the same: use the operating system’s CSPRNG.


Shannon's Fundamental Insights

Shannon's theoretical framework transformed cryptography from art to science. His contributions include:

  1. Information can be measured mathematically using entropy
  2. Perfect secrecy is possible but expensive (one-time pad)
  3. Practical security can target computational bounds instead of information-theoretic ones
  4. Confusion and diffusion are essential design principles for strong ciphers
  5. Randomness is crucial and must be protected throughout the system

Modern Applications

Shannon's principles directly influenced the design of modern ciphers:

  - Block ciphers such as DES and AES are built as repeated rounds of substitution (confusion) and permutation and mixing (diffusion).
  - Stream ciphers approximate the one-time pad by expanding a short secret key into a long keystream with a cryptographically secure generator.
  - Designs are judged by how close their output comes to statistical randomness and how well they hold up under the standard attack models.

The Path Forward

Shannon's work established the theoretical foundation that modern cryptography stands on. His framework allows us to:

  - Measure the information content of messages and keys
  - Define precisely what it means for a cipher to be secure
  - Reason about the trade-off between perfect and computational security
  - Evaluate designs against explicit attack models rather than intuition

This mathematical foundation made possible the next phase of cryptographic development: engineered systems that deliberately implement confusion and diffusion to create computationally secure ciphers suitable for widespread use.

Next: Part 5: Modern Symmetric Cryptography