Saturday, November 19, 2011

The Origin of the Quantum, Part III: Deviant Logic and Exotic Probability



Classical logic is a system concerned with certain objects that can attain either of two values (usually interpreted as propositions that may be either true or false, commonly denoted 1 or 0 for short), and ways to connect them. Though its origins can be traced back to antiquity, and to the Stoic philosopher Chrysippus in particular, its modern form was essentially introduced by the English mathematician and philosopher George Boole (and is thus also known under the name Boolean algebra) in his 1854 book An Investigation of the Laws of Thought, and was intended by him to be a formalization of how humans carry out mental operations. To this end, Boole introduced certain connectives and operations, designed to capture the ways a human mind connects and operates on propositions in the process of reasoning.
An elementary operation is that of negation. As the name implies, it turns a proposition into its negative, i.e. from 'it is raining today' to 'it is not raining today'. If we write 'it is raining today' for short as p, 'it is not raining today' gets represented as ¬p, '¬' thus being the symbol of negation.
Two propositions, p and q, can be connected to form a third, composite proposition r in various ways. The most elementary and intuitive connectives are the logical and, denoted by ˄, and the logical or, denoted ˅.
These are intended to capture the intuitive notions of 'and' and 'or': a composite proposition r, formed by the 'and' (the conjunction) of two propositions p and q, i.e. r = p ˄ q, is true if both of its constituent propositions are true -- i.e. if p is true and q is true. Similarly, a composite proposition s, formed by the 'or' (the disjunction) of two propositions p and q, i.e. s = p ˅ q, is true if at least one of its constituent propositions is true, i.e. if p is true or q is true. So 'it is raining and I am getting wet' is true if it is both true that it is raining and that I am getting wet, while 'I am wearing a brown shirt or I am wearing black pants' is true if I am wearing either a brown shirt or black pants -- but also, if I am wearing both! This is a subtle distinction from the way we usually use the word 'or': typically, we understand 'or' in the so-called exclusive sense, where we distinguish between two alternatives, either of which may be true, but not both; the logical 'or', however, is used in the inclusive sense, where a composite proposition is true also if both of its constituent propositions are true.
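To make this concrete, here is a minimal sketch in Python (the propositions are, of course, just invented placeholders); Python's built-in booleans behave exactly like the two truth values above:

```python
# Propositions as booleans, connected by negation, conjunction, and disjunction.
p = True   # 'it is raining today'
q = False  # 'I am getting wet'

print(not p)    # negation ¬p -> False
print(p and q)  # conjunction p ˄ q -> False, since q is false
print(p or q)   # disjunction p ˅ q -> True (inclusive: true if at least one holds)
```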

A useful method to decide the truth values of composite propositions is the so-called method of truth tables. A truth table is a table containing the truth values of elementary propositions and their compositions, like the following:

Fig. 1: Example of a truth table

p | q | ¬p | p ˄ q | p ˅ q | p → q | p ↔ q
--|---|----|-------|-------|-------|------
1 | 1 | 0  |   1   |   1   |   1   |   1
1 | 0 | 0  |   0   |   1   |   0   |   0
0 | 1 | 1  |   0   |   1   |   1   |   0
0 | 0 | 1  |   0   |   0   |   1   |   1
In this table, two connectives in addition to the already familiar 'and' and 'or' have been defined: the conditional → and the biconditional ↔. Their interpretation is that p → q (read 'p implies q' or 'if p then q') is true whenever q follows from p, and p ↔ q (read 'p if and only if q') is true when both p → q and q → p hold, i.e. if (p → q) ˄ (q → p) is true. These need not worry us too much, however, as p → q is equivalent to ¬p ˅ q, as can be easily checked with the truth table method; thus p ↔ q is equivalent to (¬p ˅ q) ˄ (¬q ˅ p) (where the brackets just mean that the expressions within them have to be evaluated first). Thus, they don't bring anything essentially new to the table; they can be regarded as merely convenient shorthand for our purposes.
Within this system of classical logic, one can carry out certain deductions -- essentially, one can use the method of truth tables to decide the truth values of arbitrarily complicated composite propositions, given knowledge of the truth values of their constituent propositions. Though simple, this system should not be underestimated -- essentially, it is all your computer ever does!
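As an illustration of the truth-table method, the following sketch enumerates all assignments to check the equivalence of p → q and ¬p ˅ q claimed above:

```python
from itertools import product

def implies(p, q):
    # p -> q is false only in the single case where p is true and q is false
    return not (p and not q)

print(" p      q      p->q   ¬p˅q")
for p, q in product([False, True], repeat=2):
    col1 = implies(p, q)
    col2 = (not p) or q   # the claimed equivalent
    print(f"{str(p):6} {str(q):6} {str(col1):6} {str(col2):6}")
```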

The Algebra of Sets
One interesting realization of the structure of Boolean logic is based on set theory, and has the advantage of being relatively easy to grasp intuitively. Any proposition can be interpreted as positing the membership of a certain object to some collection of things for which that proposition holds true -- i.e. a proposition asserts something to be an element of a particular set, the set of all things having the property the proposition ascribes to that something. So let the set be the set of all green things; the proposition 'grass is green' thus posits that a thing, grass, belongs to a set, the set of green things. The proposition is true, since grass is in fact a member of the set of all green things. Similarly, 'a ball is round' posits that balls belong to the set of round things, which is again obviously true. Conversely, 'fire is cold' is false, since fire is not in the set of cold things (which, for definiteness, might be considered to be the set of all things having a temperature below the melting point of ice).
And again, we can compose propositions out of more elementary ones: 'this ball is round and green' posits that a certain object, this ball, belongs both to the set of round and to the set of green things -- or alternatively, to the set of things that are both round and green. This illustrates that we can transfer the operations of Boolean logic to operations on sets -- the set of things that are both round and green is the set of all things that are both in the set of things that are round, and the set of things that are green. If the ball belongs to this set, then the proposition is true. The logical 'or' has a similarly simple interpretation: 'the toy is round' or 'the toy is green' is true if the toy either belongs to the set of round things, or to the set of green things -- or both. In particular, it is true if the toy is a round green ball.
There is an easy way to visualize this, known as Venn diagrams. The logical 'and', variously known as the intersection or meet of two sets, is represented as follows:

Fig. 2: Set intersection
The left circle can be interpreted as the set of all round things, while the right circle can be considered to represent the set of all green things; thus, whatever belongs to both sets -- whatever is both round and green -- lies in the red area. This is also often denoted as A ∩ B, where A and B are the two sets; in order not to confuse matters, though, we will stick with our original notation, A ˄ B.
The logical 'or', following the same conventions, can then be pictured as follows:

Fig. 3: Set union
It is alternatively known as the union or join of two sets, and an analogous alternative notation, A ∪ B, exists, which, however, we again won't bother with. Again, the red area marks the set of all things for which a proposition built from the disjunction of two elementary propositions holds true; i.e. it is the set of all things that are either green, or round, or both.
Negation, it should be noted, is represented in this formalism by the complement -- it is arrived at by 'inverting the colors' of any diagram, as the set of all things that are not in a set is the set of all things outside that set.
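As a quick sketch of this set-theoretic reading (with a handful of invented toy objects), Python's built-in sets supply intersection, union, and complement directly:

```python
# Invented toy universe of things, plus two property sets.
everything   = {"ball", "orange", "coin", "grass", "leaf", "fire"}
round_things = {"ball", "orange", "coin"}
green_things = {"ball", "grass", "leaf"}

print(round_things & green_things)  # intersection, 'round and green' -> {'ball'}
print(round_things | green_things)  # union, 'round or green' (inclusive)
print(everything - green_things)    # complement of 'green' within the whole set
print("ball" in (round_things & green_things))  # 'this ball is round and green' -> True
```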

Probability
Armed with these notions, we can entertain some qualitatively new questions. Two features play a role here: one is that in everyday life, we rarely only consider 'simple' sets corresponding to elementary propositions; the other is that typically, we will have less than perfect information in any given situation.
In order to deal with the first, we will introduce the notion of subsets -- i.e. 'simpler' sets that form part of some larger, more complex set. For instance, the set of green or round things has as its subsets both the set of all green things and the set of all round ones. In order to deal with the latter, we will introduce the concept of probability -- basically, a notion quantifying how strongly you should expect a given proposition to be true, or a given thing to be an element of a certain (sub)set, given insufficient information to deduce the actual truth value.
Let us, for concreteness, look at the set of all cats. This is a decidedly non-simple set: not all cats are the same, so saying of something 'it is a cat' does by no means entail having complete information about that something. Cats have different sizes, shapes, colors, genders, etc. All of these form subsets of the set of all cats that each individual cat may either belong to or not. So, a question one might ask is: "Given that this is a cat, how much should I expect it to be black? Or female? Or bigger than 30cm?"
This question is a question about the subsets of a set, asking how much one should expect an element of a set to be in some particular subset of that set. The answer can be determined easily -- by counting! To wit, just count the number of elements in the whole set, then count the number of elements in the subset you're interested in. So let's say there are 1000 cats in total, 200 of which are black. This implies that every fifth cat is a black cat. Thus, we can define our sought probability of a random cat being black as the proportion of all cats that also are black cats.
Of course, in practice, we won't have access to either the set of all cats or the subset of black cats; however, we can nevertheless estimate the probability by taking random samples, i.e. picking out cats at random, and noting how many of them are black. The more cats we pick, the more accurate our estimate of the probability for cat-blackness will become. It is like reaching into a bag containing all cats (though beware of cats in bags), pulling out one after the other, and noting their colors: about once in five times, you will grab a black cat.
Thus, we can clarify the notion of probability as being the ratio of the number of elements of a subset to the number of elements of a set. In mathematics, the number of elements of a set is called its measure, though the term is valid in much more general circumstances than we consider here. In these terms, the probability of something being an element of a certain subset is the measure of that subset, divided by the measure of the total set. This is a naive notion of probability, and I don't pretend to have given a full introduction here; however, for our purposes, it will prove sufficient.
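A minimal sketch of this counting definition, using the invented numbers from the cat example (1000 cats, 200 of them black), together with the sampling estimate:

```python
import random

# Probability as relative measure: measure of subset / measure of whole set.
N_CATS, N_BLACK = 1000, 200
cats = ["black"] * N_BLACK + ["other"] * (N_CATS - N_BLACK)

p_exact = N_BLACK / N_CATS  # 0.2: every fifth cat is black

# Reaching into the bag of cats: estimate the same probability by sampling.
sample = [random.choice(cats) for _ in range(10_000)]
p_estimate = sample.count("black") / len(sample)

print(p_exact)     # 0.2
print(p_estimate)  # close to 0.2, and closer the more cats we draw
```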
We can now go on and derive the fundamental notions of probability theory. First of all, the probability of something being an element of the whole set -- the probability of a cat being a cat -- is obviously 1. We can use this to normalize our probabilities, and assign each subset a measure of at most one, according to its 'relative size' with regard to the total set. This relative size we write as P(A), the probability of (something being in) the subset A. In the Venn diagrams above, this measure corresponds to the portion of the image that is red; one may interpret it as the probability of hitting the red area with an arrow fired randomly at the image (the reader is asked not to verify this for themselves, as monitors can be expensive). Thus, we can extend our notion of probability to continuous quantities as well.
Another immediate consequence is that the probabilities of a complete set of mutually exclusive events (an 'event' here just corresponds to a subset of the whole set) sum to 1 -- the mutually exclusive events correspond to disjoint subsets, i.e. sets that have no overlap, such as black cats and white cats (if you talk only about solid colors, that is). Clearly, if you unify all the possible non-overlapping subsets, you get the whole set back. This means nothing other than 'something has to happen', i.e. one event out of the probability space (the whole set) must occur, i.e. you draw some cat out of the bag.
Furthermore, if P(A) is the probability of event A occurring, the probability of A not occurring must be 1 - P(A) -- if P(A) is the amount of red in the Venn diagram, and 1 is the total area, then 1 - P(A) is the not-red area.
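The following toy sketch, with invented color proportions, illustrates these facts -- normalization over a partition and the complement rule:

```python
# An invented partition of all (solid-colored) cats into disjoint color classes.
P = {"black": 0.2, "white": 0.3, "ginger": 0.5}

print(sum(P.values()))  # 1.0 (up to float rounding): some cat must be drawn
print(P["black"])       # P(A): the relative measure of the subset A
print(1 - P["black"])   # P(not A) = 1 - P(A): the 'not-red' area
```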
Next, we can combine probabilities in just the same way as we combine propositions, with a similar interpretation. If we have a proposition of the form 'the cat is black, and the cat is female', which you'll recall is true if the cat is in both the subset of black cats and the subset of female cats, then the probability of that proposition being true is given by the measure of the intersection of those two subsets -- i.e. if A is the set of black cats, and B the set of female cats, the set of cats both black and female is given by A ˄ B, and its probability is denoted P(A ˄ B). If the two events are independent -- if a cat's color tells us nothing about its sex -- this is given by the product of the individual probabilities, i.e. P(A ˄ B) = P(A)P(B): P(B) is the fraction of all cats that are female, and of those, again a fraction P(A) are black, so the fraction of all cats that are both black and female is P(A)P(B).
Similarly, one can join propositions by the logical 'or', obtaining a value for the quantity P(A ˅ B). Since A ˅ B is true whenever A is, and whenever B is, this corresponds to the total area both sets occupy within the whole set; thus, P(A ˅ B) = P(A) + P(B). But we must be more careful here -- as can be seen in fig. 2, the two sets may overlap, and in the formula we just gave, this overlap is counted twice -- once as part of P(A), and once as part of P(B). The formula is thus only valid if the sets don't overlap, i.e. if there is no thing of which both A and B are true -- if, for instance, no cat were both black and female. In the general case, we must subtract the intersection once: P(A ˅ B) = P(A) + P(B) - P(A ˄ B), which for independent events, by the product rule we just learned, becomes P(A ˅ B) = P(A) + P(B) - P(A)P(B).
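As a sanity check, here is a small simulation sketch; color and sex are drawn independently, which is precisely the assumption under which the product form of these rules holds:

```python
import random

random.seed(1)
P_A, P_B = 0.2, 0.5  # P(black), P(female) -- invented values
trials = 100_000
n_both = n_either = 0
for _ in range(trials):
    a = random.random() < P_A  # A occurred
    b = random.random() < P_B  # B occurred
    n_both += a and b
    n_either += a or b

print(n_both / trials)    # ~ P(A)P(B) = 0.1
print(n_either / trials)  # ~ P(A) + P(B) - P(A)P(B) = 0.6
```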
Another useful notion is that of conditional probability -- roughly, the probability that A happens, given that we know B has happened. If B and A don't intersect, we know that A can't happen if B does -- the two are exclusive. Thus, the conditional probability of A given B -- written P(A|B) -- must be proportional to the intersection of A and B. Since B has happened, we can ignore all events outside B, and thus treat B itself as the whole space, which amounts to dividing by P(B). Thus, we arrive at P(A|B) = P(A ˄ B)/P(B); this can be understood as the fraction of B's area that lies within A.
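A corresponding sketch for the conditional probability, again with independently drawn, invented proportions -- restrict attention to the draws in which B occurred, and ask how often A occurred among those:

```python
import random

random.seed(2)
trials = 100_000
n_B = n_both = 0
for _ in range(trials):
    a = random.random() < 0.2  # A: 'the cat is black'
    b = random.random() < 0.5  # B: 'the cat is female'
    n_B += b
    n_both += a and b

# P(A|B) = P(A ˄ B)/P(B); here it comes out ~0.2 = P(A), since A and B are
# independent: knowing the cat is female tells us nothing about its color.
print(n_both / n_B)
```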
This completes our short survey of probability theory.

Quantumness
The notions we have used so far seemed quite general -- but implicitly, they relied on assumptions rooted in a classical understanding of the world. One concept in particular is not well captured by the mechanism developed so far, and that is the concept of complementarity.
If you recall, in the previous two posts in this series, complementarity was forced upon us by the notion of information-theoretic incompleteness. Information-theoretic incompleteness means, roughly, that there exist questions that a formal system can't answer, because of their complexity. We exhibited one particular set of such questions, the values of the bits of a halting probability's binary expansion beyond a certain point. This means there is a maximum amount of information that can be obtained about any given system (and if that amount is exhausted, all following measurement results must be maximally uninformative, and thus, random); thus, it follows that certain observables are engaged in a kind of back-and-forth: obtaining more information about one entails less precise information about the other. This is at the root of Heisenberg's famous uncertainty principle. (For a more formal discussion of the connection between incompleteness and complementarity see this paper by Christian Calude and Michael Stay.)
So, does the notion of complementarity bring anything new to the table? It does indeed!
First, we need to look at a straightforward consequence of the apparatus of logic we have discussed above. Using truth tables, the following identity can easily be verified:

p ˄ (q ˅ r) = (p ˄ q) ˅ (p ˄ r)

This is known as the distributive law. It is relatively intuitive, spelled out with concrete propositions: 'it is raining, and I am at home or I am outside' is equivalent to 'it is raining and I am at home, or it is raining and I am outside'. However, the notions of complementarity and distributivity do not play well with one another.
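The classical validity of this law can be checked mechanically; the following sketch runs through all eight truth-value assignments:

```python
from itertools import product

# Verify p ˄ (q ˅ r) = (p ˄ q) ˅ (p ˄ r) for every classical assignment.
for p, q, r in product([False, True], repeat=3):
    assert (p and (q or r)) == ((p and q) or (p and r))
print("distributivity holds classically")
```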
Let's consider this picture:

Fig. 5: Complementarity in phase space

It is a representation of the phase space of a one-dimensional quantum system -- i.e. a quantum particle moving only along one direction. The particle's position is denoted on the horizontal, its momentum on the vertical axis. Momentum and position are complementary observables, and hence can only simultaneously be known to a certain maximum precision; this is encapsulated in the fact that there is a minimum area beyond which the particle cannot be localized more sharply in phase space. This area is given by Planck's constant h, leading to the uncertainty principle ΔxΔp > h (this should be thought of as a heuristic, rather than exact, relation).
Now consider the following three propositions:
  • p: the particle's momentum is within Δp
  • q: the particle's position is within Δx1, meaning the left half of the interval Δx
  • r: the particle's position is within Δx2, meaning the right half of the interval Δx
In these, the phrase 'is within' may be interpreted as 'the value found by experiment will lie in the range of'. Now consider the composite proposition p ˄ (q ˅ r): it is clearly true, since it is essentially just a restatement of the uncertainty principle.
However, distributivity would tell us that this is equivalent to the proposition (p ˄ q) ˅ (p ˄ r) -- but this is clearly false, as both (p ˄ q) and (p ˄ r) are false! It is not the case that experiment will find the particle's momentum within Δp, and its position within Δx1 -- this would violate the uncertainty principle. Similarly, it is not the case that experiment will find the particle's momentum within Δp, and its position within Δx2, as again uncertainty would be violated. One may imagine the particle, or its position, prior to measurement, to be 'too big' to fit in either Δx1 or Δx2, yet comfortably within Δx as a whole.
This is in stark contrast to the classical case. The difference is that in a classical context, every particle has a definite position (and momentum) at all times, we just might be ignorant about it -- but in quantum mechanics, it is not right to talk about a particle's momentum and position apart from a measurement context at all.
The machinery we have developed, while adequate in the classical case, thus fails to capture the quantum reality. In order to account for this discrepancy, the notion of quantum logic has been developed -- which works like classical logic, except that the distributive law does not hold. What exactly this quantum logic is makes for an entirely different discussion, and one I don't want to go into here -- some have suggested that it is the empirically adequate logic to describe reality, and that it thus ought to replace classical logic (Hilary Putnam has argued for this point of view in his paper 'Is Logic Empirical?'); others merely see the whole endeavor as an exercise in the manipulation of symbols.
However, a simple argument against the existence of any 'true' logic to use in reasoning about the world is that one can build computers whose architecture corresponds to different logical frameworks, which nevertheless end up being able to compute the same things. (For example, the Russian Setun was a computer built on ternary, rather than binary, logic, i.e. used three instead of two 'truth values'.)
For us, it is enough to realize that quantum logic is able to deal with certain awkwardnesses better than classical logic; that it is in principle possible to reason about quantum systems using classical logic is demonstrated by the existence of hidden variable models, i.e. theories explaining quantum behavior by appealing to certain fundamental, but inaccessible parameters of the theory, our ignorance of which leads to the apparent weirdness of quantum theory.
But this puts us in a bit of a pickle with respect to our interpretation of logic as being about set membership: the algebra of sets is clearly distributive! So, how can propositions be modeled in a quantum context? Can an analogous notion of probability be found?
We will dodge the bullet by simply defining appropriate objects to model quantum propositions -- call them q-sets. They have all the properties of classical sets, except for distributivity. Thus, their algebra is equivalent to quantum logic the same way the algebra of classical sets is equivalent to classical logic. Mathematically, this is an easy step -- the algebra of sets forms an abstract structure known as a Boolean lattice; repealing distributivity merely means moving to an orthocomplemented lattice. Everything else works much as it did before, so, q-sets have elements, and if a certain element is in a q-set, the proposition 'element x is in q-set A' ('the particle's momentum is within Δp') holds true for it.
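To see the failure of distributivity concretely, here is a sketch (an illustration of the lattice structure, not part of the derivation) using subspaces of the plane, with meet as intersection and join as linear span; the particular vectors are arbitrary:

```python
import numpy as np

def dim(*vectors):
    # Dimension of the span of the given vectors.
    return np.linalg.matrix_rank(np.column_stack(vectors))

def meet_dim(a, b):
    # dim(A ∧ B) = dim A + dim B - dim(A ∨ B) for subspaces.
    return dim(a) + dim(b) - dim(a, b)

p = np.array([1.0, 1.0])  # a 'diagonal' ray
q = np.array([1.0, 0.0])  # the x-axis
r = np.array([0.0, 1.0])  # the y-axis

# p ∧ (q ∨ r): q ∨ r is the whole plane, so the meet is p itself.
print(dim(p) + dim(q, r) - dim(p, q, r))  # -> 1
# (p ∧ q) ∨ (p ∧ r): both meets are the zero subspace, hence so is their join.
print(meet_dim(p, q) + meet_dim(p, r))    # -> 0: distributivity fails
```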
Now we can, again, erect a theory of probability -- of q-probability -- upon our theory of q-sets. Again, we will find probability measures, rules for the composition of probabilities, and so on. I will skip to the punchline here, as the detailed way to get there is a bit mathematical: in the end, the theory of q-probability one arrives at, is nothing else but quantum mechanics itself!
This is a remarkable result. From nothing but the complementarity of observables, arrived at via information-theoretic incompleteness, the whole formal apparatus of quantum mechanics emerges. To mention just some elements of this derivation: the q-propositions will turn out to be so-called 'projection operators' on Hilbert space (the quantum mechanical state space analogous to classical phase space); the q-sets will be given by (closed) subspaces of Hilbert space, each of which is associated with a certain projection operator; and the probability measure will turn out to be determined by the density operator, a certain representation of the quantum mechanical state of a system. Two particularly important results necessary to arrive at this derivation are Solèr's theorem, which essentially limits the choice of Hilbert space to those over the real numbers, complex numbers, or quaternions (which we'll meet again eventually), and Gleason's theorem, which roughly says that the appropriate probability measures are given by density operators. For more details, see the article at the Stanford Encyclopedia of Philosophy here, or the paper by Itamar Pitowsky here.
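As a minimal numerical sketch of these identifications (with an invented state and proposition): a q-proposition corresponds to a projection operator, and its q-probability is the trace of the density operator multiplied by that projector:

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)

rho = np.outer(ket_plus, ket_plus)  # density operator of the pure state |+>
proj = np.outer(ket0, ket0)         # q-proposition: 'the system is found in |0>'

print(np.trace(rho @ proj))         # -> 0.5, the q-probability of the proposition
```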
There is, however, one distinction between classical and quantum probability that must be made: classical probabilities are wholly due to ignorance, while quantum probabilities are irreducible. Every classical system has a definite state at all times, and experiment can reveal this state with arbitrary precision; that we can make only probabilistic statements is due solely to our not knowing that definite state. In quantum mechanics, however, there is no 'deeper level' at which all probabilities are washed away through greater knowledge -- beyond a certain level, as required by complementarity, no more accurate statements can be made. Quantum randomness is fundamental.
