Most people know that entropy increases over time, with past moments having less entropy than future moments. Most people also understand entropy as a measure of disorder, which in some contexts can be measured by counting the number of rearrangements of a system that would leave it essentially the same (the exact details aren’t important). This conception of entropy is fine when thinking about things such as gases and heat, but in many cases thinking of entropy as disorder is a mistake. For example, consider the formation of a solar system. In the beginning there was a dispersed “cloud” of gas, a very disordered state, but over time it came together to form the sun and the planets in essentially fixed orbits, a much less disordered state. Similarly, when the Earth was young the ocean was essentially an undifferentiated soup of chemicals, but over time it became populated by more and more complex organisms. Again, the system seems to have gone from disorder to order, “violating” entropy. Of course entropy hasn’t really been violated in either of these cases; it is simply that understanding entropy as disorder misleads us in cases such as these.
But before I explain what entropy is, and how it accounts for the arrow of time, let me first point out that I will be explaining entropy and applying it as if the natural laws were deterministic and symmetric in time. I will assume that everyone knows what deterministic laws are; laws that are symmetric in time are laws that would look exactly the same if time ran in the opposite direction. In classical physics the laws of nature are both deterministic and symmetric in time, and they seem to do a good job of describing the macroscopic world in most situations. Of course the actual laws that govern our universe may not be deterministic and symmetric in time: certain fundamental particles may not behave in ways that are symmetric over time, and it is quite possible that the fundamental laws are not deterministic. But I am tackling the problem as if they were, for several reasons. One is that the macroscopic universe acts as if it were governed by such laws, and we can easily imagine possible worlds that are. Certainly it would seem as if entropy and the arrow of time would still exist in such worlds, so there must be reasons for them besides asymmetric physical laws and indeterminacy. Secondly, I favor a no-collapse (many-worlds) interpretation of quantum physics, which is symmetric and deterministic. Finally, if we can describe how entropy and the arrow of time emerge in a system whose laws are symmetric in time and deterministic, and thus don’t have entropy or the arrow of time built into them in any way, we can surely apply those results to the world we actually live in. And such constraints certainly make the problem more interesting, since it is hard to see how entropy and the arrow of time could arise in a system with laws that are symmetric in time.
Before I define entropy itself I must first define what it means for a state of a deterministic system to be likely or unlikely. Clearly in a deterministic system there aren’t different “futures”, and so, properly speaking, given any state there is only one possible sequence of preceding and subsequent states. To recover a notion of probable or improbable states we must generalize. By this I mean that instead of considering a specific state we must consider a class of states that meet some criterion. Similarly, the states that we will claim to be probable or improbable given this initial class of states are also classes of states. (For example, we could consider classes such as “gas diffused evenly in container” and “gas clumped up in one region”.) What we do is imagine allowing all the systems that meet the requirements of the initial kind of state to develop for some time. We then group the resulting states, and the groups that have the most members are described as likely and those with fewer members as unlikely. Obviously we could formalize this notion, but, given the use we will put it to here, there is no need. I do, however, need to address one concern that my description of likely and unlikely may have raised, which is that by defining them in terms of the resulting states I have subtly reintroduced the arrow of time. This is not the case, for two reasons. First, since our laws of physics are symmetric, we could allow the members of the class of initial states to develop in both directions over time, which won’t change our results in the least. Secondly, since our laws of physics are symmetric, for any given initial state, if we take that same state, reverse all its momentum vectors, and allow this altered state to develop “forwards” in time, the resulting state will be the same as the one that comes from allowing the initial state to develop backwards in time.
So as long as our initial classification doesn’t divide states in a way that would exclude mirrored-momentum counterparts, and none of the classifications I will employ below do, the results of running the systems in the other direction are already included in our first assessment of which states are likely.
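The mirrored-momentum argument above can be made concrete with a small sketch. The following toy system (all of its details, such as the box size and the particular particles, are invented for illustration) is deterministic and symmetric in time: point particles bounce elastically in a one-dimensional box. If we evolve a state forward, then reverse every velocity and run the same forward dynamics for the same number of steps, we recover the momentum-reversed initial state, which is exactly what developing the initial state "backwards" would give.

```python
# Toy deterministic, time-symmetric system: particles bouncing
# elastically between hard walls at 0 and L (hypothetical units).
L = 100  # box size

def step(state):
    """Advance every (position, velocity) pair by one time unit,
    reflecting elastically off the walls at 0 and L."""
    new = []
    for x, v in state:
        x += v
        # reflect off the walls (a large velocity may bounce repeatedly)
        while x < 0 or x > L:
            if x < 0:
                x, v = -x, -v
            else:
                x, v = 2 * L - x, -v
        new.append((x, v))
    return new

def evolve(state, steps):
    for _ in range(steps):
        state = step(state)
    return state

def reverse_momenta(state):
    """The 'mirrored momentum counterpart' of a state."""
    return [(x, -v) for x, v in state]

initial = [(10, 3), (50, -7), (90, 2)]
later = evolve(initial, 1000)

# Flip every velocity and run the SAME forward dynamics:
recovered = evolve(reverse_momenta(later), 1000)

# We are back at the initial state, with momenta mirrored.
assert recovered == reverse_momenta(initial)
```

Integer positions and velocities are used so the arithmetic is exact and the reversal comes out bit-for-bit; the same idea holds for any deterministic dynamics that are symmetric in time.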
The phenomenon of entropy, as I define it, is the fact that in most cases a state (more properly, a particular kind of state) is followed by more likely states. We can describe an unlikely state as having low entropy, and a likely state as having high entropy, although I will rarely use such names here. Of course this doesn’t explain time’s arrow yet, since by “followed by” I mean in either temporal direction (or at least I must, if I am not to beg the question). Even without explaining time’s arrow we can see how the classical situations in which entropy is invoked, when dealing with gases, liquids, and heat, can be described in this way. For example, consider diffusion, a phenomenon which entropy is usually invoked to explain. Diffusion is a generalization, stating that in most cases if a gas fills only part of a volume it will expand to fill the entire volume. In my terms we can explain this phenomenon by appealing to the fact that states in which all the gas is in one area are unlikely, while those in which the gas fills the entire volume are much more likely. Thus, in the majority of cases, after some time has passed the system will be in one of the more likely states, with the gas filling the entire volume. Of course this may not always happen; the gas molecules may move in such a way that they return to being in a small area. Neither I, nor the usual description of entropy, deny that this can happen; it is simply very, very rare.
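Just how lopsided this likelihood is can be seen by counting. In the following sketch (the numbers are invented for illustration, and each molecule is crudely modeled as being in either the left or right half of a container) the class "all gas in one area" contains astronomically fewer states than the class "gas spread roughly evenly", even for a mere fifty molecules.

```python
# Count the microstates of N molecules, each of which sits in either
# the left or the right half of a container (2**N states in total),
# and compare the sizes of two classes of states.
from math import comb

N = 50                 # a tiny "gas" of 50 molecules, for illustration
total = 2 ** N         # every possible left/right arrangement

# Class 1: all the gas clumped into the left half (a single arrangement).
clumped = 1

# Class 2: a near-even split, with 40% to 60% of molecules on the left.
near_even = sum(comb(N, k) for k in range(20, 31))

frac_clumped = clumped / total   # vanishingly small
frac_even = near_even / total    # the overwhelming majority

print(frac_clumped)  # about 8.9e-16
print(frac_even)
```

With only fifty molecules the clumped class is already smaller than one part in a quadrillion; for a real gas, with on the order of 10^23 molecules, the disparity is incomparably greater, which is why diffusion is "usual" without being necessary.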
One way to look at entropy, as described here, is as saying that stable states are preferred, where by a stable state I mean one whose subsequent states are likely to be similar to it. Looking at it this way, it is easy to see why in the typically considered cases entropy is best thought of as disorder. For a gas the most stable state is to fill space about equally, which is also the state with the most disorder. For two or more fluids or gases in the same container the most stable state is an even mixing, again also the most disordered. Finally, when considering heat the most stable state is for the heat to be equally shared, once again the most disordered state. However, stable states are not always the least ordered. The most stable state for water at low temperatures is a relatively regular crystal, which isn’t a very disordered state. Similarly, on large scales the most stable state is for matter to be clumped up (into planets, stars, etc.), again not the most disordered state.
So far, however, nothing I have said explains where the arrow of time comes from. To fully explain the arrow of time I need to explain why entropy is less in the past and greater in the future, since the description I have provided here implies that entropy is likely to be greater in both future and past states of a system. And given an explanation for that phenomenon I must show why the fact that entropy is greater in the future means that information only accumulates in the past-to-future direction, and why we can have information about the past but not about the future. (This ties into my account of time, here.) I will leave the question about information for later, but I can explain now why entropy increases in the direction we think of as the future. Ultimately it is because the origin of the universe, the big bang, was a very unlikely event (unlikely as I have described it here, in terms of systems developing over time). There aren’t many systems that would result in a relatively even distribution of matter and energy flying together to a single point. Additionally, if a state is very unlikely then, in general, the states that develop from it will be almost as unlikely. So, since the big bang was so unlikely, it is simply the case that events closer to the big bang are going to be more unlikely, while those farther away from it in time will be more likely (again, in the sense of these words developed here). Of course, just because the overall state of the universe is going from unlikely to likely doesn’t mean that local regions never have a likely state followed by an unlikely one (in the direction we experience time); it is just that such successions are rare. Remember, entropy as described here is a description of what usually happens, not of what must happen in all situations.
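A system prepared in an unlikely, clumped state drifting toward likely, spread-out states can be sketched in a few lines. The simulation below is a stand-in only: random walkers play the role of molecular motion, twenty spatial bins play the role of the coarse-grained classes of states, and "entropy" is the Shannon entropy of the bin occupancy histogram; all the particular numbers are invented for illustration.

```python
# Start every particle at the origin (a very unlikely, clumped state)
# and let each one random-walk; the coarse-grained entropy of the
# occupancy histogram grows as the system moves into likelier states.
import random
from math import log

random.seed(0)  # fixed seed so the sketch is repeatable

N_PARTICLES = 500
N_STEPS = 400
BINS = 20
BIN_WIDTH = 10  # positions are grouped into bins of this width

def coarse_entropy(positions):
    """Shannon entropy of the coarse-grained occupancy histogram."""
    counts = [0] * BINS
    for x in positions:
        b = min(max((x + BINS * BIN_WIDTH // 2) // BIN_WIDTH, 0), BINS - 1)
        counts[b] += 1
    n = len(positions)
    return -sum(c / n * log(c / n) for c in counts if c)

positions = [0] * N_PARTICLES          # the clumped initial state
s_start = coarse_entropy(positions)    # 0.0: one bin holds everything

for _ in range(N_STEPS):
    positions = [x + random.choice((-1, 1)) for x in positions]

s_end = coarse_entropy(positions)
assert s_end > s_start  # the coarse-grained entropy has grown
```

Nothing forbids the entropy from dipping over some stretch, exactly as the text says; it is only overwhelmingly more common for it to grow, which is all the fixed-seed run above exhibits.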
Before I tackle the information question let me first take a slight detour and describe how entropy favors evolution. On the naïve conception of entropy as disorder evolution seems impossible; after all, it seems as if animals are becoming more and more ordered, not the other way around. Fortunately we have abandoned such a conception. So let us first consider how the simplest replicating molecules (precursors to DNA and RNA) might have come about. Clearly, forming such molecules is not a likely state for the primordial seas on Earth; their most stable, and thus most likely, state is to have the chemicals in the ocean spread around evenly, not forming complicated molecules. However, given long enough, a replicating pattern will form, simply by sheer chance. And once one replicating pattern exists, what is likely changes: subsequently it is likely that there will be more of those same replicating patterns. Admittedly, there are some possibilities in which the replicators are destroyed, or even in which multiple replicators combine to form a single one (division run “backwards” in time). However, given a system with some replicators, it is still much more likely that there will be more of them in later states. In a way, then, the first replicator is a “phase change” as far as entropy goes. Previously the most likely states were ones in which everything was spread around evenly, but now the most likely states are those in which there are a significant number of replicators. This phase change really results from the fact that when we were defining how likely a state is we didn’t define how long we should let the system develop. Such an ambiguity is not a problem for our project, though, and is perhaps even a benefit. I assume that, given this account of how entropy favors replicators, it is also evident why it favors better versions of those replicators, so I won’t provide all the details.
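The "phase change" can be illustrated with a deliberately crude toy model (every probability and limit below is invented, and the destruction of replicators is left out to keep the sketch short): a soup with no replicators stays empty for a long stretch, a first replicator eventually forms by sheer chance, and from then on the likely states are ones containing many copies.

```python
# Toy model of the replicator "phase change": rare spontaneous
# formation, then self-copying up to a carrying capacity.
import random

random.seed(3)  # fixed seed so the sketch is repeatable

FORM_PROB = 0.01   # chance per step that a first replicator forms by luck
COPY_PROB = 0.1    # chance per step that each replicator copies itself
CAPACITY = 500     # the soup can only support so many replicators

count = 0
history = []
for _ in range(2000):
    copies = sum(random.random() < COPY_PROB for _ in range(count))
    if count == 0 and random.random() < FORM_PROB:
        count = 1  # the rare, chance formation of the first replicator
    count = min(CAPACITY, count + copies)
    history.append(count)

# Before the first replicator the likely state is "no replicators";
# after it, the likely state is "many replicators".
assert history[-1] > 100
```

The long empty prefix of `history` followed by rapid growth to the capacity is the phase change: the same dynamics, but a single chance event changes which subsequent states are likely.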
In any case, because replicators, and improvements to those replicators, are favored by entropy, we should expect evolution to occur in the direction we think of as past to future, since the states with less evolved systems are less likely, and hence probably closer to the big bang. And this is good, since it is the conclusion that the evidence supports.
Let me then turn to the question about information. Initially we might be tempted to deal with the problem of information by employing information theory. Unfortunately, since we are dealing with a universe with deterministic and symmetric laws, future states give exactly as much information about current states as past states do, and current states give as much information about future states as about past states. Since the universe is deterministic, you really could, in this formal sense, have information about the future. However, information as we know it is not exactly the same as the formal definition of information. The information that matters to us, that we can use, is information that we have access to, which means that our minds must acquire some piece of knowledge, often through experience, and then retain it. The fact that information must be retained is the key to unraveling this puzzle. If information is retained then clearly the state in which the system is retaining it must be stable, to some extent. Another part of the puzzle is how information, specifically reliable information, is obtained: the external world interacts with the system in a specific way to create that information. (For example, a photon may bounce off an object and into an eye.) So we know that the transition from not having information to having information is a transition to a stable state, and hence a transition from less likely to more likely. However, systems can lose information (people forget), so we might be tempted to think that either the acquisition or the forgetting could be the one closer to the big bang. But this forgetting is not the mirror image of information acquisition. The mirror image of information acquisition would be, in the case of information acquired through vision, for a photon to shoot out of the eye and bounce off the object that the stored information was about. Clearly this is very unlikely, far more unlikely than the ordinary ways of forgetting.
And thus the entire process (gaining information, retaining information, and forgetting information) occurs in that order in terms of entropy, and so information accumulation, in us and in other systems, happens in a past-to-future direction. It is possible for there to be systems that accumulate information in the opposite direction, from future to past, but considerations of entropy tell us that they would be very rare. And even if such a system did exist it would be impossible to exploit it for information about the future, because all you could know about it would be its past performance, and given how unlikely such systems are it is very likely that it would not continue to accurately report information from the future, and thus it would be as untrustworthy as guessing.
Finally, I would like to make a few more observations about the big bang, since it is essentially responsible for the arrow of time. One question that may arise is what makes the big bang different, in terms of entropy, from a big crunch. After all, both are states in which all the matter in the universe is compressed to a small size and then flies apart (in opposite temporal directions). So if the big bang is so unlikely, shouldn’t the big crunch be as well? And if it were, that would completely undermine the theory about entropy and time’s arrow that I have presented here. Fortunately there is a significant difference between them, and it lies in the bits of matter that do the flying apart. In a big crunch we expect the matter that is flying together to be in high-entropy (likely) states, like black holes, but in the big bang the matter and energy that exploded was relatively uniform. This relative uniformity makes the big bang an extremely unlikely event, in contrast to the big crunch, which, if it did happen, would have to be considered a very likely event. The other observation I would like to make is that time’s arrow could run in different directions within the same universe, although at different times. We could imagine a universe in which, unlike ours, the big bang wasn’t the beginning of time. In such a universe the time before the big bang would consist of a universe in which everything comes together, roughly uniformly, and afterwards things progress much like our own. However, matter coming together in a roughly uniform fashion is highly unlikely, and so in the time “before” this universe’s big bang likely events are followed by less likely events. This of course means that before the big bang in such a universe time’s arrow runs in the opposite direction; people on both temporal sides of the big bang would consider it their past. Such a universe has no philosophical relevance; I simply find it interesting.