# Boltzmann Entropy as Shannon Entropy
## The Connection

> [!abstract] Core Insight
> [[Boltzmann Entropy]] (physics) and [[Shannon Entropy]] (information theory) are the same mathematical quantity measuring the same conceptual thing: uncertainty about the microstate given the macrostate.
They differ only in:
- Units (J/K vs. bits)
- Historical context
- Typical applications
## Thermodynamics Perspective
### Context
In statistical mechanics, we have a system with many particles. The macrostate (temperature, pressure, volume) is what we measure. The microstate (exact positions and momenta of all particles) is unknowable.
### Formulation
Boltzmann entropy:

$$S = k_B \ln \Omega$$
where:
- $k_B \approx 1.381 \times 10^{-23}$ J/K (Boltzmann's constant)
- $\Omega$ = number of microstates compatible with the macrostate
For a probability distribution $p_i$ over microstates:

$$S = -k_B \sum_i p_i \ln p_i$$
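A quick numerical check (a minimal Python/NumPy sketch, not from the original derivation): for a uniform distribution over $\Omega$ microstates, the sum reduces to $k_B \ln \Omega$.

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann's constant, J/K

def gibbs_entropy(p):
    """S = -k_B * sum_i p_i ln(p_i) for a distribution over microstates."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # treat 0 * ln(0) as 0
    return -K_B * np.sum(p * np.log(p))

# For Omega equally likely microstates this reduces to S = k_B ln(Omega).
omega = 1_000_000
uniform = np.full(omega, 1.0 / omega)
print(gibbs_entropy(uniform))         # ~1.91e-22 J/K
print(K_B * np.log(omega))            # same value
```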
### Key Properties
- Extensive: $S_{A+B} = S_A + S_B$ for independent systems
- Maximum at equilibrium (uniform distribution over accessible states)
- Second Law: $\Delta S \ge 0$ for isolated systems
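A small sketch of the first two properties (Python/NumPy; the subsystem sizes and distributions are made up for illustration):

```python
import numpy as np

K_B = 1.380649e-23  # J/K

def S(p):
    """Gibbs entropy of a probability distribution over microstates."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -K_B * np.sum(p * np.log(p))

# Extensive: independent subsystems with Omega_A and Omega_B microstates have
# Omega_A * Omega_B joint microstates, so their entropies add.
omega_a, omega_b = 100, 250
assert np.isclose(K_B * np.log(omega_a * omega_b),
                  K_B * np.log(omega_a) + K_B * np.log(omega_b))

# Maximum at equilibrium: the uniform distribution over accessible states
# has higher entropy than any skewed distribution over the same states.
assert S(np.full(4, 0.25)) > S(np.array([0.7, 0.1, 0.1, 0.1]))
```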
## Information Theory Perspective
### Context
We have a random variable $X$ representing some uncertain outcome. We want to quantify how much we don't know before observing it.
### Formulation
Shannon entropy:

$$H(X) = -\sum_i p_i \log_2 p_i$$
where $p_i$ is the probability of outcome $i$.
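A minimal sketch in Python/NumPy (the example distributions are chosen for illustration):

```python
import numpy as np

def shannon_entropy(p):
    """H(X) = -sum_i p_i log2(p_i), in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # treat 0 * log(0) as 0
    return -np.sum(p * np.log2(p))

print(shannon_entropy([0.5, 0.5]))    # 1.0 bit   (fair coin)
print(shannon_entropy([0.9, 0.1]))    # ~0.47 bits (biased coin)
print(shannon_entropy([0.25] * 4))    # 2.0 bits  (fair four-sided die)
```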
### Key Properties
- Additive: $H(X, Y) = H(X) + H(Y)$ for independent $X$ and $Y$
- Maximum for uniform distribution
- Non-negative, zero only for deterministic outcomes
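A quick check of additivity and the zero case (a sketch with made-up distributions):

```python
import numpy as np

def H(p):
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Additive for independent variables: the joint distribution is the outer
# product of the marginals, and H(X, Y) = H(X) + H(Y).
px = np.array([0.5, 0.5])             # fair coin: 1 bit
py = np.array([0.25] * 4)             # fair four-sided die: 2 bits
assert np.isclose(H(np.outer(px, py)), H(px) + H(py))   # 3 bits total

# Zero only for deterministic outcomes.
assert H(np.array([1.0, 0.0, 0.0])) == 0.0
```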
## The Bridge
### Formal Correspondence
| Thermodynamics | Information Theory |
|---|---|
| Microstate | Outcome / Message |
| Macrostate | Constraint / What we know |
| $\Omega$ (multiplicity) | $2^H$ (effective number of outcomes) |
| $k_B$ (conversion factor to physical units) | Choice of log base (bits vs. nats) |
| Equilibrium | Maximum entropy distribution |
| Heat bath | Noisy channel |
| Temperature $T$ | Inverse of the Lagrange multiplier $\beta$ |
### Mathematical Translation
The key equation:

$$S = k_B \ln 2 \cdot H$$

Or equivalently:

$$H = \frac{S}{k_B \ln 2}$$

When physicists use natural logs and information theorists use $\log_2$, the two entropies differ only by the constant factor $k_B \ln 2$: one bit of missing information corresponds to $k_B \ln 2 \approx 9.57 \times 10^{-24}$ J/K.
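A numerical check of the conversion (Python/NumPy; the example distribution is arbitrary):

```python
import numpy as np

K_B = 1.380649e-23  # J/K

p = np.array([0.5, 0.3, 0.2])                  # any distribution over microstates

H_bits = -np.sum(p * np.log2(p))               # Shannon entropy, bits
S_physical = -K_B * np.sum(p * np.log(p))      # Gibbs/Boltzmann entropy, J/K

assert np.isclose(S_physical, K_B * np.log(2) * H_bits)   # S = k_B ln(2) * H
```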
### Why This Works
Both entropies answer the same question: “How many yes/no questions do I need to specify the exact state?”
- In physics: given macroscopic measurements, how many bits to specify the microstate?
- In information theory: given the probability distribution, how many bits on average to identify the outcome?
The underlying structure is identical: a probability distribution over states, and a desire to quantify the “spread” or “uncertainty” of that distribution.
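A tiny illustration of the yes/no-question reading, assuming $2^{20}$ equally likely microstates:

```python
import math

# With Omega = 2**20 equally likely microstates, each ideal yes/no question
# halves the remaining candidates, so pinning down the exact state takes
# log2(Omega) questions -- exactly the Shannon entropy in bits.
omega = 2 ** 20
candidates, questions = omega, 0
while candidates > 1:
    candidates //= 2                  # one ideal yes/no question
    questions += 1
print(questions, math.log2(omega))    # 20 20.0
```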
## Implications
### For Thermodynamics
- The Second Law is about information: isolated systems evolve toward states we can't distinguish (maximum ignorance about microstates).
- Entropy increase = losing track of microscopic details.
- Maxwell's demon is defeated by Landauer's principle: erasing information costs at least $k_B T \ln 2$ per bit.
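The bound is easy to evaluate numerically; a sketch assuming a room temperature of 300 K:

```python
import math

K_B = 1.380649e-23   # J/K
T = 300.0            # assumed room temperature, K

landauer = K_B * T * math.log(2)            # minimum energy to erase one bit
print(f"{landauer:.2e} J per bit")          # ~2.87e-21 J

# Erasing one gigabyte of data at this theoretical minimum:
print(f"{landauer * 8e9:.2e} J total")      # ~2.3e-11 J -- tiny, but nonzero
```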
### For Information Theory
- There's a thermodynamic cost to computation (especially erasure).
- Channel capacity has a physical interpretation.
- Compression is fighting the natural tendency toward maximum entropy.
### New Insights
- Jaynes' MaxEnt: use maximum entropy as an inference principle; it's not just physics, it's rational belief formation (see the sketch after this list).
- Landauer's bound: $k_B T \ln 2 \approx 2.9 \times 10^{-21}$ J at room temperature is the minimum energy to erase one bit.
- Reversible computation: In principle, computation needn’t cost energy; only erasure does.
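A sketch of Jaynes' MaxEnt idea (Python with SciPy; the four energy levels and the target mean energy are made up for illustration): maximizing entropy subject to a known mean energy should reproduce the Boltzmann distribution $p_i \propto e^{-\beta E_i}$.

```python
import numpy as np
from scipy.optimize import minimize

E = np.array([0.0, 1.0, 2.0, 3.0])   # hypothetical energy levels
E_mean = 1.2                         # assumed measured average energy

def neg_entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return np.sum(p * np.log(p))     # minimizing -H maximizes H

constraints = [
    {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},   # normalization
    {"type": "eq", "fun": lambda p: p @ E - E_mean},    # known mean energy
]
res = minimize(neg_entropy, x0=np.full(4, 0.25),
               bounds=[(0.0, 1.0)] * 4, constraints=constraints)
p_maxent = res.x

# The analytic MaxEnt answer is the Boltzmann distribution p_i ~ exp(-beta*E_i);
# read beta off the numerical solution and compare.
beta = -np.polyfit(E, np.log(p_maxent), 1)[0]
p_boltzmann = np.exp(-beta * E) / np.sum(np.exp(-beta * E))
print(np.round(p_maxent, 4))
print(np.round(p_boltzmann, 4))      # should match closely
```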
## Historical Development
- 1865: Clausius introduces entropy as $dS = \delta Q_{\text{rev}} / T$
- 1877: Boltzmann connects $S$ to microstates: $S = k_B \ln \Omega$
- 1929: Szilard analyzes Maxwell’s demon information-theoretically
- 1948: Shannon defines information entropy (reportedly suggested by von Neumann to “call it entropy—no one knows what it means”)
- 1957: Jaynes unifies via MaxEnt principle
- 1961: Landauer proves erasure costs energy
The connection was suspected early (von Neumann’s quip) but not rigorously established until Jaynes.
## Limitations
- Negative temperatures: Thermodynamic entropy can handle population inversions; Shannon entropy is always non-negative.
- Continuous distributions: Differential entropy has different properties (it can be negative and is not invariant under coordinate change); see the quick check after this list.
- Quantum: von Neumann entropy extends both, but has additional subtleties (entanglement entropy).
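A quick check of the differential-entropy point, using a uniform density on $[0, 0.5]$ as the example:

```python
import numpy as np

# Differential entropy of a uniform density on [0, a] is log2(a) bits,
# which is negative whenever a < 1 -- unlike discrete Shannon entropy.
a = 0.5
print(np.log2(a))                     # analytic value: -1 bit

# Numerical check of h = -integral of p(x) log2 p(x) dx, with p(x) = 1/a:
x, dx = np.linspace(0.0, a, 10_000, retstep=True)
p = np.full_like(x, 1.0 / a)
print(-np.sum(p * np.log2(p)) * dx)   # ~ -1
```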
## Sources
- Jaynes, E.T. (1957). “Information Theory and Statistical Mechanics”
- Landauer, R. (1961). “Irreversibility and Heat Generation in the Computing Process”
- Bennett, C. (1982). “The Thermodynamics of Computation—A Review”
- Cover & Thomas. *Elements of Information Theory*, Chapter 2