Sunday, July 17, 2011

The Universal Universe, Part III: An Answer to Wigner



Eugene Wigner was a Hungarian-American physicist and mathematician who played a pivotal part in recognizing and cementing the importance of symmetries in quantum physics. However, this is not the capacity in which we meet him today.
Rather, I want to talk about an essay he wrote, probably his best-known and most influential work outside of his more technical physics publications. The essay bears the title The Unreasonable Effectiveness of Mathematics in the Natural Sciences [1], and it is a brilliant musing on the way we come to understand, model, and even predict the behaviour of physical systems using the language of mathematics, and on the fundamental mystery that lies in the (apparently) singular appropriateness of that language.
Wigner's wonder is two-pronged. First, mathematics developed in one particular context often turns out to have applications in conceptually far removed areas -- he provides the example of π (or τ, if you're hip), the ratio of a circle's circumference to its diameter, popping up in unexpected places, such as a statistical analysis of population trends, which seems to have little to do with circles. Second, given this odd 'popping up' of concepts originally alien to a certain context within the theory supposedly explaining that very context, how can we know that there is not some other, equally valid (i.e. equally powerful as an explanation) theory, making use of completely different concepts?
In a way, we find ourselves wondering, again, why our language, our mathematics, should be any more suitable for yielding an explanation of the world around us in terms of physical theories than a dog's bark -- very much analogous to the question of why our minds should be any more capable of understanding the universe than a dog's mind is of grasping advanced calculus. The answer, I believe, will turn out to be analogous as well; but first, we need to think a bit about what, exactly, we mean by things like explanation, mathematics, or physical theory. We'll start out by considering:

The Perfectly Reasonable Effectiveness of Computers in Simulations
Nobody, to the best of my knowledge, has ever marvelled at the singular appropriateness of computers for simulations of physical systems. So let me be the first: why is it that, no matter what system you consider, it seems so eminently possible to construct a virtual likeness of it on an appropriately instructed computer?
From galaxy formation to the weird world of quantum mechanics, few, if any, areas seem in principle unamenable to computer simulation. And given the criteria in Wigner's article, the wonder here should be even greater! The toolbox of mathematics is vast, comprising hosts of diverse fields, each with its own concepts, symbols, rules, and so on. Compared to this, computers are incredibly limited, ultimately forced to do all they do by manipulating 1s and 0s according to very few rules.
Nevertheless, it does not seem surprising that a computer is capable of closely matching observed reality through simulation -- and it should not. To anybody who has read the last two posts in this series, the answer is probably clear: computational universality is what bestows upon a computer its magical power to effectively 'behave like' any other system (provided that system is not more than computationally universal itself). So if our universe is computable, then of course computers should be capable of simulating every part of it.
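To make this concrete, here is a minimal sketch (my own illustration, not from the original post) of what universality looks like in practice: a single, fixed program -- an interpreter -- that 'behaves like' whatever Turing machine it is handed as data. The rule-table format and the toy incrementer machine are choices made purely for the example.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Interpret any Turing machine given as a rule table.

    rules maps (state, symbol) -> (new_state, new_symbol, move),
    with move being -1 (left) or +1 (right)."""
    cells = dict(enumerate(tape))  # sparse tape; blank everywhere else
    pos = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(pos, blank)
        state, new_symbol, move = rules[(state, symbol)]
        cells[pos] = new_symbol
        pos += move
    return "".join(cells.get(i, blank) for i in range(min(cells), max(cells) + 1))

# A toy machine: move right past the 1s, append one more 1, halt.
increment = {
    ("start", "1"): ("start", "1", +1),
    ("start", "_"): ("halt",  "1", +1),
}

print(run_turing_machine(increment, "111"))  # -> 1111
```

The interpreter itself never changes; only the rule table fed to it does. That one program can stand in for any machine you can write down is the whole content of universality.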
Now, what does this mean with respect to Wigner's essay? Well, if mathematics were similarly computationally universal, then at least part of the problem -- why mathematics should be capable of describing the natural world -- would be solved. But now, remember that Turing machines were thought up to emulate mathematics -- to automate the processes by which mathematicians do whatever mathematicians do (which entails a convenient definition of mathematics as 'what mathematicians do').
The converse is similarly possible: one can embed the functioning of a Turing machine within mathematics. To do so, one makes use of a trick known as Gödel numbering. Recall that a Turing machine has a finite set of symbols, its alphabet. Now, one simply associates a number with each of those symbols -- easy enough to do. Any string of symbols can then be represented as a single number formed from the concatenation of the numbers for each symbol (with a little care to keep the encoding unambiguous); in particular, the string of symbols on the tape of a Turing machine when it starts operating -- its program -- can be thus represented.
The functioning of the Turing machine now consists of manipulating the symbols it reads in a certain way, according to its rules. These rules can be translated into algebraic ones: each rule takes a string of symbols and returns a different one; each algebraic rule then takes a number, standing for that particular string of symbols, and calculates a new one, standing, under the same correspondence, the same code, for the string of symbols the Turing machine returns. Thus, the operation of a Turing machine on a string of symbols can be mapped to algebraic manipulations of numbers -- we can embed, or emulate, any given Turing machine within number theory. This also proves an assertion I made in the previous post, that there are only as many Turing machines as there are natural numbers: a Turing machine's alphabet can be encoded in a natural number, and its rules can be, too -- thus, there exists a natural number associated with every Turing machine. Since there are infinitely many distinct machines, yet each receives its own natural number, the Turing machines are exactly as numerous as the natural numbers: countably infinite.
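As a toy version of this embedding (again a sketch of my own, with a fixed-base encoding replacing naive concatenation to keep things unambiguous), here is a tape over a three-symbol alphabet turned into a single natural number, with reading and writing of cells done by arithmetic alone:

```python
alphabet = ["_", "0", "1"]   # symbol i gets digit value i
k = len(alphabet)            # the base of the encoding

def encode(tape):
    """A string of symbols -> one natural number (base-k digits)."""
    n = 0
    for symbol in reversed(tape):
        n = n * k + alphabet.index(symbol)
    return n

def read(n, pos):
    """Extract the symbol at cell pos purely arithmetically."""
    return alphabet[(n // k**pos) % k]

def write(n, pos, symbol):
    """Return the number encoding the tape with cell pos overwritten."""
    old = (n // k**pos) % k
    return n + (alphabet.index(symbol) - old) * k**pos

n = encode("0110")
print(read(n, 1))                  # -> 1
print(read(write(n, 0, "1"), 0))   # -> 1 (cell 0 was '0' before)
```

A full machine step is then just a composition of such arithmetic operations, one per rule -- which is all the embedding into number theory amounts to.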
Thus, as we should have expected, mathematics is (computationally) universal -- that it can be used to emulate any (computable) physical system is then not so surprising after all. This answers Wigner's first problem: mathematics developed in some context (say, number theory) can, due to its universality, emulate completely different systems (say, a Turing machine simulating galaxy formation).
In attempting to answer Wigner's second problem, though, we are going to hit a snag: if we consider the working of some computationally universal system -- say, a Turing machine or a computer program -- as an explanation for the behaviour of some physical system, then we are forced to conclude that this explanation is not unique: by universality, there exist many inequivalent Turing machines or computer programs yielding the same phenomenology. This is nothing else but the statement that one and the same problem can be solved by many different computer programs -- i.e. there is not just one unique program yielding the simulations mentioned above.
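The point is easily illustrated (a sketch of my own, not from the post): here are three internally quite different programs with one and the same input-output behaviour, so that no observation of outputs alone could single out 'the' program behind them.

```python
from functools import reduce

def factorial_loop(n):
    """Factorial by iteration."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

def factorial_rec(n):
    """Factorial by recursion."""
    return 1 if n < 2 else n * factorial_rec(n - 1)

def factorial_fold(n):
    """Factorial by folding a product over a range."""
    return reduce(lambda a, b: a * b, range(1, n + 1), 1)

# Identical phenomenology, inequivalent inner workings:
assert all(factorial_loop(n) == factorial_rec(n) == factorial_fold(n)
           for n in range(10))
```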
Yet, Wigner asserts that one of the principal mysteries of mathematics' applicability to natural science is that it apparently yields unique explanations! In fact, we are faced with an even worse problem: if we accept a 'computational ontology', then what form should it take? What fundamentals 'really' exist? What language is the program that computes the universe written in?

The Myth of Ontic Definiteness
Picture a scientist trapped in the Matrix, i.e. confined to a simulated universe. Dedicated to uncovering ultimate capital-T Truths about the world he finds himself in, his task is to discover the fundamental laws that describe his universe.
Unfortunately for him, in this task he is doomed to fail, as we now know. There is no set of laws that could be pointed to as 'the' fundamental ones, at least none that he could discover. The reason for this is simply that there are many inequivalent programs, many inequivalent computational systems, that yield the same output -- his world, and himself in it. There is no way for him to find out which is the one that is 'actually' being run, or what the supporting hardware looks like and how it works. These levels are simply closed off to him, and any supposed fundamental entity he comes up with whose behaviour he postulates explains the appearance of his world is likely to be fiction -- even though it may be in perfect accord with all the observations he makes. There is always a host of different entities yielding the same observations, and thus, no way to choose between them.
It's computational universality that does him in -- since every universal system can emulate the behaviour of every other (at most) universal system, there is simply no way for him to tell which system actually comprises the foundations of his world. It could be some Turing machine, but it could also be a cellular automaton emulating that Turing machine, or a gadget going through number-theoretical derivations, or something else entirely. To him, there is no fundamental ontology.
It is for this reason that David Deutsch rejects the possibility of the world being 'merely a program running on a gigantic computer' [2], because, according to him,
"It entails giving up on explanation in science. It is in the very nature of computational universality that if we and our world were composed of software, we should have no means of understanding the real physics -- the physics underlying the hardware of the Great Simulator itself."
This leaves us in a bit of a pickle! As I have argued in the previous post, an explanation in terms of 'more than' computable means amounts to no explanation at all, as it could never be checked; now, apparently, computable explanations suffer the same fate (and although Deutsch doesn't acknowledge it explicitly, it is more than just the literal picture of the universe as a giant computer that runs into these troubles -- a computer is just a particular kind of universal system, but the reasoning applies generically).
So, what are we left with?

The Benefit of Multiple Explanations
Perhaps the problem lies not with the possibility of finding explanations, but rather, with our expectation of what explanations to find. Typically, when we are faced with some phenomenon, we expect that there is one unique and true explanation in terms of 'what really happens', which enables us to gain an understanding of that phenomenon: the Sun rises because the Earth rotates. This is right, everything else is wrong.
But... is that actually ever what we get?
To answer this, we must first consider what, exactly, we mean by an explanation. Typically, when we are given an explanation for the behaviour of a certain system, at some point we experience a moment of understanding -- suddenly, we know how it works. Well... how does that work?
One possible explanation would be that we assemble in our mind a structure whose behaviour matches the behaviour of the system in question -- think of it as a model: we are told that the Sun rises (and sets) because of the Earth's rotation, so we imagine a rotating globe relative to a fixed source of illumination -- and all becomes clear; we know how it works. In general, however, that model may be much more abstract; the salient feature is that we generate an internal representation from which the behaviour of a system can be abstracted, such that, for instance, we could reliably predict that behaviour even in situations of which we have no direct experience.
But now consider the following situation: I am teaching you chess. At some point, I'll come to the knight's movement. I could say something like: "It moves two squares vertically and one square horizontally, or two squares horizontally and one square vertically." This would surely suffice as an explanation of how the knight moves. However, I could equally well say: "It moves one square diagonally and one square straight, either horizontally or vertically." Or: "It moves in the shape of an 'L'." Or, somewhat more extravagantly: "It moves three squares to the right, and then either one or two squares diagonally, either upwards or downwards to the left; or, it moves three squares to the left, and then either one or two squares diagonally, either upwards or downwards to the right."
All would get the idea across equally well -- in all cases, the knight ends up on the same possible squares; however, they differ with regard to what the knight actually does.
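One can check this equivalence directly; the following sketch (my own, with coordinates taken relative to the knight's square) computes the destination set prescribed by three of the verbal rules above and confirms that they coincide.

```python
# Rule 1: two squares one way, one square the perpendicular way.
rule1 = {(2, 1), (2, -1), (-2, 1), (-2, -1),
         (1, 2), (1, -2), (-1, 2), (-1, -2)}

# Rule 2: one square diagonally, then one square straight, moving outward.
rule2 = set()
for sx in (1, -1):
    for sy in (1, -1):
        rule2.add((sx + sx, sy))  # continue horizontally
        rule2.add((sx, sy + sy))  # continue vertically

# Rule 3: three squares sideways, then one or two squares diagonally back.
rule3 = set()
for side in (3, -3):
    back = -1 if side > 0 else +1
    for steps in (1, 2):
        for updown in (1, -1):
            rule3.add((side + steps * back, steps * updown))

# Same destination squares, three different 'ways the knight moves':
assert rule1 == rule2 == rule3
```

The sets agree; what differs is only the story told about how the knight gets there.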
Or let's say I believe in a more hands-on learning approach, and just move the knight -- perhaps while you have your back turned. In your mind, you could now construct a set of distinct, yet equivalent models in order to explain the knight's behaviour, including, but not limited to, the ones I provided above. You could also come to completely different explanations, in entirely different terms; for instance, you could simply recognize and memorize the following pattern (the knight at the centre, X marking its possible destinations):

    . X . X .
    X . . . X
    . . N . .
    X . . . X
    . X . X .

Now, which of these explanations is 'the right one'? How would one single any out? I don't think it's possible, or indeed useful, to do so. They're all right; some might sit better with you than others, but ultimately, regarding the salient features -- the positions the knight may validly end up in -- they're all equivalent. There is simply no fact of the matter regarding which way the knight actually takes.
(A word of caution, though: some people might be tempted to invoke Occam's razor here to arrive at a unique explanation that is 'the simplest one' in some particular sense -- certainly, the more long-winded rules have some disadvantages compared to the short and crisp ones, but it is at best ambiguous whether the picture is 'simpler' than the rules written down. Strictly speaking, though, no additional 'explanatory entities' in the razor's sense are postulated in any of the rules. Occam's razor has its valid application in ensuring the predictivity of hypotheses: whenever a proposed explanation 'adds' overhead machinery to the simplest explanation necessary, that explanation is to be discarded, as otherwise no unique choice of a theory to falsify would be possible. So, for instance, if I taught you chess on a Thursday, many inequivalent theories would be equally well in accord with your observations: for instance, that the knight moves like an L, or that the knight moves like an L on Thursdays, but just diagonally on Fridays. The latter theory is what the razor is made to shave off.)
Thus, we see that explanations need not necessarily be unique in order to be valid -- all of the presented, distinct 'theories' allow us to construct an equally good model of the knight's movements.
Indeed, viewed from another angle, the notion of a unique explanation begins to look downright suspect: if there is an explanation for some system's behaviour in terms of 'more fundamental' entities, then either those fundamental entities demand an explanation themselves, or the chain of explanations terminates, leaving some basic layer unexplained. Either one never gets a 'final' explanation, or an explanation is ultimately left open. Faced with the equations that describe the final layer of our explanatory cake, we are left with the question, as Hawking famously mused: what breathes fire into the equations?
If one posits some entity as ontologically fundamental, one immediately may ask the question: why that particular entity? Why not any other?
Attempts have been made to answer, or at least ameliorate, these questions: it has been proposed that consistency is what selects the true fundamental theory -- but if one theory is consistent, there exists another, just as consistent, able to give rise to the same phenomenology, while built on totally different fundamentals. Or, it has been proposed that all such structures exist, and that the selection of which one we exist in is anthropic: we live in this universe, because it is capable of supporting our existence. But this, too, does not suffice to argue for a unique fundamental structure: a (computationally universal) cellular automaton could give rise to the same universe as a Turing machine, or any other computational paradigm.
Rather, in a computational universe, there is no more an ontologically fundamental entity than there is a way the knight actually moves.
A little excursion is in order here. The notion of multiple explanations is nothing alien to physics. In some theories, it manifests itself under the guise of gauge invariance: there exists a field, known as the gauge field, which yields the same physics -- the same observations -- in different forms; i.e. two manifestly different gauge fields may lead to the same phenomenology. Electromagnetism is the prototype theory of this kind: different choices of electromagnetic potentials may yield the same magnetic and electric fields; and since it is those latter fields that are actually observed, these manifestly different potentials have the same physical content.
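In formulas (the standard textbook statement, added here for concreteness): the observable fields derive from the potentials φ and A, and for an arbitrary function χ(x, t), the primed potentials below are manifestly different from the unprimed ones:

```latex
\mathbf{B} = \nabla \times \mathbf{A}, \qquad
\mathbf{E} = -\nabla\varphi - \frac{\partial \mathbf{A}}{\partial t}, \qquad
\mathbf{A}' = \mathbf{A} + \nabla\chi, \qquad
\varphi' = \varphi - \frac{\partial\chi}{\partial t}
```

Yet they yield exactly the same fields -- the curl of a gradient vanishes, so B' = B, and the two χ terms in E' cancel, so E' = E. Infinitely many different potentials, one and the same observable physics.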
In another way, explanatory heterogeneity, to coin a term, is manifest in general relativity in the form of background independence: different choices of reference frames, as long as they are connected by smooth coordinate transformations, yield the same physics. This is viewed as a good thing -- after all, why should there be a preferred point of view singling out the 'proper physics'?
But then, why should there be any preferred mathematical structure? And if there can be such a preferred mathematical structure, then why not a preferred frame of reference -- which, after all, is just a mathematical structure -- as well?

The Right Way to be Wrong
So, multiple explanations may not be a bad thing; indeed, they may even turn out to be fortunate. But how does that help the scientist trapped in the Matrix? He does not have merely a couple of explanations to choose from; rather, any possible universal system in principle suffices as an explanation. A true embarrassment of riches!
The key to helping him out is, I believe, to answer the question of how, given all of the above, there can be wrong explanations. Of course, one possibility is simply for them to be analogous to 'buggy' programs: within the chosen framework, they just don't do what they're supposed to do; this is, in a sense, the trivial version of 'wrongness', and perhaps most wrong theories are of this kind.
But consider the following: a computer runs a simulation of the solar system, which is used to predict the occurrence of a lunar eclipse. Now, the computer does not know anything about gravity -- it is merely a universal system, emulating the behaviour of another system. At its bottom level, it is nothing but a succession of states, mapped to the succession of states of the solar system at certain points in time. But this set of states is encoded in electromagnetic field configurations, and everything that happens inside the computer as it goes through its computation is dictated by the laws of electrodynamics. There thus exists a mapping between the states of the electromagnetic field that make up the computer, and the (logical) states of the computation itself.
But then, there also exists a mapping between the states of the electromagnetic field, and the states of the solar system -- which means that I could have just as well solved Maxwell's equations (the equations describing the evolution of electromagnetic fields) and arrived at a prediction for the lunar eclipse! (A shorter way to say this would have been: Maxwellian electrodynamics is universal.)
Nevertheless, as a theory of our solar system, beautiful thing though it might be, Maxwell's electrodynamics is simply wrong. The artifice that would have to be heaped onto the theory in order to make it resemble the evolution of our solar system is just too staggering -- nobody would ever consider using it in this way; nobody could ever use it this way, as the calculations would simply be too difficult and gigantic to carry out in practice. That does not take anything away from the fact that, in principle, it is possible to accomplish this feat; but it is not in principle, but in practice, that we create and use our theories.
Thus, the 'wrongness' and 'rightness' of a theory is derived from its applicability; a theory that is not in practice applicable as an explanation of some system's behaviour is the wrong explanation.
An immediate objection to this picture might be that there manifestly are computations with different results, programs with different outputs, systems with different behaviour. And that's true; but an explanation, in terms other than those of the system to be explained, is always a mapping -- an analogy from the behaviour of one system to the behaviour of the system to be explained -- and between universal systems, such a mapping always exists. Those different results, outputs, and behaviours can thus be mapped onto one another, provided all the systems in question are universal.
This, then, finally allows us to answer Wigner's second problem: there is not in general a unique theory, a unique mathematical edifice that allows us to formulate an explanation of some phenomenon; but the somewhat narrow scope of practical applicability serves to single out one, or at most a small set of closely related ones (which are then often thought of as 'dual' formulations of one another).
There is thus an anthropic selection after all: but it operates on the end-user level, rather than on the fundamentals. Universal systems can emulate other universal systems with varying efficiency; systems, i.e. theories, that are 'close' to our way of thinking will thus be more readily considered than very distinct ones, even though we could, in principle, 'understand' the latter just as well. Other intelligences, able to emulate different systems with greater ease, might consider completely different theories reasonable, and find ours nigh incomprehensible.

Heteroontology
There is an interesting consequence regarding the question of ontology, i.e. the question of what kinds of things can ultimately be said to exist, and what bearing physical theory has on this question. Naively, one might expect that whatever entities our most successful theories require in order to work ought to be regarded as 'really existing'. This is a version of W. V. O. Quine's so-called indispensability argument, though that argument generally concerns the existence of indispensable mathematical entities, and in that form constitutes an argument for mathematical realism.
But, given the arguments provided here, we should expect the following situation to arise: for a given system to be explained by physical theory, two or more theories exist of equal explanatory power, yet making reference to partially or completely distinct entities. Indeed, such a situation exists in the so-called AdS/CFT correspondence, a realization of the holographic principle in which a gravitational theory is dual to -- i.e. yields the same physics as -- a quantum field theory in a space of lower dimensionality. Even though both theories have the same explanatory power, they disagree on something as fundamental as the number of spacetime dimensions! So which one is 'right'?
The conservative move here would perhaps be to consider the superfluous dimensions of the gravitational theory as 'dispensable', and thus regard them as 'not real'. This works in cases where the disagreement is only limited; but in principle, at least, it is possible to find theories -- though they may be as contrived as the electromagnetic theory of the solar system above -- promulgating a completely distinct ontology. Which one would be right then?
I don't think this question has a definite answer -- and moreover, I think that's a good thing. Every ontology we might settle on as the 'true' one immediately throws up the question: whence this ontology?
Only in becoming independent of this question -- in becoming truly background-independent -- can we reach any answer that might, by a reasonable measure, be considered final.



References:
[1] Wigner, E. P. (1960). "The Unreasonable Effectiveness of Mathematics in the Natural Sciences." Richard Courant Lecture in Mathematical Sciences delivered at New York University, May 11, 1959. Communications on Pure and Applied Mathematics 13: 1–14. doi:10.1002/cpa.3160130102
[2] Deutsch, D. (2004). "It from Qubit." In Science and Ultimate Reality, ed. J. D. Barrow, P. C. W. Davies, and C. L. Harper Jr., 90–102.
