What is Entropy?
"Entropy is the amount of information you don't have about the random details of a system: what you'd need to describe those details." - John Baez
Physical entropy and informational entropy
Entropy is typically used either physically, as a measure of microstates with dimension of $\frac{Joules}{Kelvin}$, or informationally, as in Shannon's information with a conceptual dimension of $bits$. In either case, entropy implies potential information.
Figure 1: Comparison of the information-science vs. thermodynamic interpretations of entropy in terms of standard temperature, pressure, and volume (T, P, and V).
From Boltzmann to Gibbs to Shannon
Boltzmann suggested that the logarithm of the number of microstates is proportional to the ratio of a system's energy (ability to do work) over its temperature (intrinsic energy).
Boltzmann's macro-state / micro-state interpretation of entropy is used to motivate Shannon's entropy formula, which provides a measure of information in bits.
Boltzmann's entropy
$$
S_B = k_B \cdot \ln[\Omega]
$$
$$
S_B: \text{entropy in } \frac{Joules}{Kelvin}
$$
$$
\Omega: \text{number of microstates}
$$
$$
k_B: \text{Boltzmann's constant}
$$
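As a concrete illustration (a minimal sketch; the microstate count below is a made-up example, not tied to any particular physical system), Boltzmann's formula can be evaluated directly:

```python
import math

# Boltzmann's constant in joules per kelvin (exact SI value since the 2019 redefinition).
K_B = 1.380649e-23

def boltzmann_entropy(num_microstates: int) -> float:
    """S_B = k_B * ln(Omega), returned in joules per kelvin."""
    return K_B * math.log(num_microstates)

# Example: a toy system with 2**10 equally likely microstates (e.g. ten two-state particles).
print(boltzmann_entropy(2**10))  # ~9.57e-23 J/K
```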
Boltzmann assumed all energy configurations $\Omega$ are equally likely, as described by the equipartition theorem. That is, every particle is assumed to have exactly the same average translational kinetic energy ($\frac{3}{2}k_B T$). In 1905 Planck showed the law of equipartition breaks down when the thermal energy $k_B T$ is significantly smaller than the spacing between energy levels of atoms; see What is the Ultraviolet Catastrophe?
Ignoring this for the time being, we'll take the entropy $S$ of a system to be a number describing the ratio of useful kinetic energy to temperature:
$$
Entropy = k_B \cdot \ln[\text{number of microstates}]
$$
$$
S_B \approx 1.38 \times 10^{-23} \, \frac{Joules}{Kelvin} \cdot \ln[\Omega]
$$
$$
\frac{Energy}{Temperature} = \frac{Joules}{Kelvin}
$$
Boltzmann's constant relates the average relative kinetic energy of particles in a gas (ability to do work, measured in joules) to the thermodynamic temperature of the gas (aggregate heat bath, measured in kelvin).
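As a quick check on these units, the equipartition value quoted above can be evaluated at an assumed room temperature of $T = 300\,Kelvin$ (an example value, not from the text):
$$
\frac{3}{2} k_B T = \frac{3}{2} \cdot \left(1.38 \times 10^{-23}\,\frac{Joules}{Kelvin}\right) \cdot (300\,Kelvin) \approx 6.2 \times 10^{-21}\,Joules
$$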
Gibbs entropy
Gibbs and others refined Boltzmann's idea that entropy is a measure of microstates by assigning a probability to each state and weighting each state with this probability before summing over all microstates, $x_i$:
$$
S_{Gibbs} = -k_B \sum_{i = 1}^{N}{ P(x_i) \, \ln[P(x_i)] }
$$
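A minimal sketch of the Gibbs sum alongside Shannon's bit-counting version, assuming a small, made-up probability distribution over four microstates:

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant, joules per kelvin

def gibbs_entropy(probs) -> float:
    """Gibbs entropy S = -k_B * sum(p_i * ln(p_i)), in joules per kelvin."""
    return -K_B * sum(p * math.log(p) for p in probs if p > 0)

def shannon_entropy(probs) -> float:
    """Shannon entropy H = -sum(p_i * log2(p_i)), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A uniform distribution over four microstates: the highest-entropy case for N = 4.
uniform = [0.25, 0.25, 0.25, 0.25]
print(gibbs_entropy(uniform))    # k_B * ln(4) ~ 1.91e-23 J/K
print(shannon_entropy(uniform))  # log2(4) = 2 bits
```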
Entropy can be thought of as a measure of how spread out the energy in a system is, while free energy is the energy in the system that can still do useful work. That is to say, work is done as a result of orderly (not spread out) molecular motion. More entropy => more spread of kinetic energy => less orderly motion.
$$
\Delta G_{\text{free energy}} = \Delta H - T \Delta S_{Gibbs}
$$
$$
H: \text{enthalpy}
$$
$$
T: \text{temperature}
$$
$$
S: \text{entropy}
$$
If the kinetic energy is concentrated then it has more ability to do work and the system is said to have more enthalpy (internal energy + pressure $\times$ volume). Thermodynamic entropy is a function of temperature (holding pressure and volume constant) and the number of possible configurations.
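A minimal sketch of the free-energy relation above, with made-up example values for the enthalpy and entropy changes:

```python
def gibbs_free_energy(delta_h: float, temperature: float, delta_s: float) -> float:
    """Delta G = Delta H - T * Delta S.
    delta_h in joules, temperature in kelvin, delta_s in joules per kelvin; returns joules."""
    return delta_h - temperature * delta_s

# Hypothetical values: an endothermic process (delta_h > 0) can still release free energy
# (delta_g < 0) if the entropy increase is large enough at the given temperature.
delta_g = gibbs_free_energy(delta_h=40_000.0, temperature=350.0, delta_s=120.0)
print(delta_g)  # 40000 - 350 * 120 = -2000 joules: free energy decreases
```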
Comparing Shannon Entropy to Gibbs Entropy
One way to increase Shannon entropy
- Increase microstates $\Omega$ -> more possibilities -> more entropy.
Two ways to increase Gibbs entropy
- Lower temperature $\Delta T$ -> more possibilities per unit of kinetic energy -> more entropy
- Raise energy $\Delta E$ -> more microstates -> more possibilities -> more entropy (both lists are illustrated in the sketch below)
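The sketch below puts stand-in numbers on both lists: the Shannon side counts equally likely microstates, and the thermodynamic side uses the Clausius relation $\Delta S = \frac{Q}{T}$ (not derived above, but consistent with the energy-over-temperature units used throughout):

```python
import math

def shannon_entropy_uniform(num_microstates: int) -> float:
    """More equally likely microstates -> more entropy, in bits."""
    return math.log2(num_microstates)

def entropy_from_heat(heat_joules: float, temperature_kelvin: float) -> float:
    """Clausius form dS = dQ/T: the same heat adds more entropy at lower temperature (J/K)."""
    return heat_joules / temperature_kelvin

print(shannon_entropy_uniform(8), shannon_entropy_uniform(1024))  # 3.0 vs 10.0 bits
print(entropy_from_heat(100.0, 600.0))  # ~0.167 J/K at 600 K
print(entropy_from_heat(100.0, 300.0))  # ~0.333 J/K at 300 K: lower T, more entropy per joule
```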
Entropy is additive
The set of combined microstates of two interacting systems is the Cartesian product of each system's microstates, so the counts multiply. But because of the property of logarithms, $log[AB] = log[A] + log[B]$, the entropy of two systems is additive, not multiplicative. This point is expanded upon in A Characterization of Entropy in Terms of Information Loss, by Baez et al.
In other words, entropy provides a measure of possible useful information in a system. And, when two systems combine, their entropy adds.
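A minimal numeric sketch of this additivity, assuming two independent systems with made-up microstate counts:

```python
import math

def entropy_bits(num_microstates: int) -> float:
    """Entropy in bits of a system whose microstates are equally likely."""
    return math.log2(num_microstates)

omega_a, omega_b = 8, 16            # microstate counts of two independent systems
omega_combined = omega_a * omega_b  # Cartesian product of the two state spaces

# log[A*B] = log[A] + log[B]: microstates multiply, but entropies add.
print(entropy_bits(omega_combined))                   # 7.0 bits
print(entropy_bits(omega_a) + entropy_bits(omega_b))  # 3.0 + 4.0 = 7.0 bits
```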
Quantum considerations on data science
Data science algorithms largely ignore the thermodynamic aspects of entropy and only consider Shannon entropy (probability of states). But some, like the Boltzmann machine, embrace the thermodynamic aspect and allow physically motivated, temperature-based models to influence learning. Such reasoning will be mandatory in the age of quantum information.
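As a rough illustration of how temperature enters such models (a sketch of the Boltzmann distribution itself, not of a full Boltzmann machine; the state energies are made up):

```python
import math

def boltzmann_distribution(energies, temperature):
    """p_i proportional to exp(-E_i / T), with k_B folded into the temperature units."""
    weights = [math.exp(-e / temperature) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

def shannon_entropy(probs) -> float:
    return -sum(p * math.log2(p) for p in probs if p > 0)

energies = [0.0, 1.0, 2.0, 3.0]  # hypothetical state energies
for t in (0.5, 1.0, 5.0):
    # Higher temperature spreads probability across states, raising the entropy.
    print(t, round(shannon_entropy(boltzmann_distribution(energies, t)), 3))
```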
References
- Eugene Khutoryansky's Thermodynamics videos provide a visual review of the thermodynamic interpretation of entropy.
- What is the Ultraviolet Catastrophe? is a video that describes the quantum nature of energy transference and its implication on entropy.
- Origins of the Combinatorial Basis of Entropy (2007), by Robert Niven.
- Shannon Entropy from Category Theory: a gentle introduction by John Baez.
- Von Neumann Entropy: primer on mixed-states quantum formulations by Scott Aaronson.