I intend to begin a series looking at different philosophical interpretations of quantum physics. This subject is important to philosophy in general, since, as far as we know, the most fundamental physical theory is some form of quantum physics. The philosophy of physics needs to be consistent with one's metaphysics, and metaphysics, alongside logic, is the basis of all other philosophy. In short, a school of philosophy which does not address the question of how well it fits in with quantum physics is at best incomplete, and at worst simply plain wrong.
The problem is that quantum physics defies our natural intuitions. It is thus difficult to interpret. Numerous models have been suggested. These models each have their strengths and weaknesses. The issue is still under dispute. So it is necessary to cast an eye on the major models of quantum physics which have been proposed, and to see where they lie.
First of all, what do I mean by quantum physics? I mean, essentially, the ideas and various theories (excluding general relativity, and those things derived from it) that have dominated physics since the early twentieth century. These theories have a number of things in common that differ from classical physics. One is that many observable quantities which are described by real numbers (and thus can take on any value) are in practice constrained in many systems to only certain particular values.
For example, take angular momentum. In classical physics, this refers to circular motion. The angular momentum is defined as the mass of the object multiplied by its distance from the center of motion multiplied by the component of its velocity perpendicular to the line connecting the object to the center of its motion. Angular momentum is a conserved quantity (which is why it is useful), ultimately (via Noether's theorem) a reflection of the rotational symmetry of the classical mechanics Lagrangian. Clearly distances can take any value, and velocities can take any value (or up to the speed of light in special relativity), so angular momentum can take any real value. And that all seems perfectly reasonable.
However, in a quantum system, angular momentum is constrained, and can only take on integer multiples of a constant known as the reduced Planck's constant, written as ℏ. This seems counter-intuitive, but is experimentally confirmed.
The atomic theory was well established by the time of the emergence of quantum physics. We know that there is a hard positively charged nucleus at the center of each atom, with numerous individual negatively charged electrons circulating it. In classical physics, there is a natural and familiar analogue of the atom, namely the solar system. (This model is misleading when we come to quantum physics, but I am discussing classical physics expectations here, and in that theory, this is the closest one can get, albeit that there would still be magnetic interactions between the electrons which would lead to different results compared to Newton's or Einstein's solar system). The energy of each electron depends on how far it is from the nucleus, and in classical physics this can take on any value. If the electron loses energy, it will move to an orbit closer to the nucleus.
However, it turns out that the electrons can only have particular discrete values of the energy. The precise list of allowed values is somewhat complicated to calculate, but for the simplest atom, hydrogen, a good approximation is that they follow a 1/n² law. So the first energy level has a value of -½ m_e (e²/4πε₀ℏ)² (about -13.6 eV), the second energy level a quarter of that value, the third one a ninth of that value, and so on. This discreteness stops the electrons from spiralling into the nucleus when they lose energy due to radiation. In practice, this is only the approximation obtained if we only consider the electrostatic force. There are minor corrections to be made when magnetic effects are also included, but the overall picture of allowed discrete energy values is still observed.
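As a rough illustration of the 1/n² law above, the following sketch evaluates the first few hydrogen levels from the formula in the text. The function names are my own, the constants are the standard SI values, and only the electrostatic approximation is included, so none of the small magnetic corrections appear.

```python
import math

# Standard SI values of the constants appearing in the formula above.
M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
HBAR = 1.054571817e-34       # reduced Planck constant, J s


def hydrogen_level_joules(n):
    """Energy of level n in joules, electrostatic approximation only."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    ground = -0.5 * M_E * (E_CHARGE ** 2 / (4 * math.pi * EPS0 * HBAR)) ** 2
    return ground / n ** 2


def hydrogen_level_ev(n):
    """The same energy expressed in electron volts."""
    return hydrogen_level_joules(n) / E_CHARGE


if __name__ == "__main__":
    for n in range(1, 4):
        print(n, round(hydrogen_level_ev(n), 3))
```

The second level comes out at a quarter of the first and the third at a ninth, as described above.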
Another feature common to all quantum physics which is alien to classical physics is that some sets of observable properties do not commute with each other. That means that if you specify one of those sets of properties, the others become undefined. For example, angular momentum is measured in terms of a particular axis of rotation. If, in a quantum system, you know the angular momentum around one particular axis, then the angular momentum around any other axis becomes undefined. Not only can you not know what value it takes, the question itself is incoherent and nonsensical. Another standard example of this is the pair of observables measuring location and momentum. If you can measure precisely where a particle is, then you can have no knowledge of its momentum. Again, this is alien to the classical way of thinking about the world.
Another feature that is unique to quantum physics is the exclusion principle. There are two different types of particles, Fermions with half integer spin and Bosons with integer spin. (Spin is another concept which arises from quantum physics -- in this case ultimately relativistic quantum physics -- but its effects are similar to those of a rotating charged particle.) The exclusion principle states that one cannot have two Fermions of the same type (i.e. two electrons, two muons, etc.) in the same state at the same time. Each individual electron is indistinguishable from any other electron in the universe -- something which has experimentally observable consequences. This means, for example, that one can only have two electrons (one with positive spin, one with negative spin) in a given energy level ("orbit" is the closest classical analogue) around an atom. This principle is again alien to classical mechanics. It also demonstrates that there is a universal of "being an electron" (for example), which rules out the philosophy of nominalism. Nominalism is a natural counterpart to classical physics, but quantum physics favours the realism of universals.
Finally, I will mention the appearance of indeterminacy. Classical physics is determinate. If you know the precise conditions of a system at one moment in time, and the laws of physics, then you can precisely calculate what the system would be like at any other time. Of course, this is only true in principle: in practice we cannot measure things to the required precision. But in quantum physics, this is not the case. The same experimental set-up leads to different results when the experiment is repeated, with the divergence a long way beyond what we expect from the experimental uncertainty. I say appearance of indeterminacy, because a few interpretations of quantum physics use hidden variables to create the appearance but not the reality of indeterminacy. This suggests that there are objects in the universe which affect how the universe evolves, but which we can't observe. So when you get two divergent results of an experiment, it is because the hidden variables take on different values (or because different variables become hidden or observable). However, even here, there are constraints showing that these variables cannot obey the usual expectations of classical physics. So a hidden variables theory just replaces one form of weirdness with another.
So if the world is like this, then why does classical mechanics work so well? That too is explained by quantum physics. The short answer is that the parameter ℏ is a very small number, and thus one has to measure to a very fine resolution to notice that the observables can only take discrete values. The longer answer comes in several parts. Firstly, the probability that a particle diverges from the classical trajectory becomes smaller the further from that trajectory you get. This is measured in terms of ℏ, so deviations of a few ℏ are quite likely, but in most (not all) circumstances the multiples of ℏ needed for a deviation to be noticeable with the naked eye, or even an optical microscope, are so large that you may as well forget about it. Secondly, the commutator between the various non-commuting operators such as location and momentum is also proportional to ℏ (the constant of proportionality dependent on which observables are being measured). This is the mathematical source of the statement, characteristic of quantum physics and lying behind the uncertainty principle, that one cannot simultaneously be in a state of determined location and a state of determined momentum (for example). If the commutator were zero, then you could have a determined momentum and location simultaneously, as is the case in classical mechanics. Since ℏ is so small compared to the usual scales we encounter in classical physics, the uncertainty principle becomes dwarfed by the various other sources of imprecision. Thirdly, there is the theory of decoherence, which basically states that the more particles you have in the system, the more likely that it will be found in a particular basis with the amplitudes for superpositions of states (which are what lead to the undefinable observables) in that basis severely suppressed.
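For the location and momentum example above, the commutator statement and the uncertainty relation that follows from it take the standard textbook form below. The hats marking operators and the symbols σ_x and σ_p for the standard deviations are notation introduced here, not taken from the text.

```latex
[\hat{x}, \hat{p}] = \hat{x}\hat{p} - \hat{p}\hat{x} = i\hbar,
\qquad
\sigma_x \, \sigma_p \ge \frac{\hbar}{2}
```

Setting ℏ to zero on the right-hand sides recovers the classical situation in which both quantities can be sharply defined at once.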
In short, the world of quantum physics is very different from what is expected from classical physics. This means that the philosophy of quantum physics must differ from the mechanism that lies at the heart of the philosophy of classical physics. They are not completely different in every respect -- there are many features that quantum and classical physics have in common. Not least that they are both highly mathematical, and are based on a mapping between features of the physical world and an abstract mathematical space. The abstract space is then analysed, and from it statements are drawn which map back to the physical world, where they can be compared against further experiment and observation. The experimental method is equally important for quantum and classical physics.
However, quantum and classical physics have different views about the nature of matter, how one material particle interacts with another, or how particles change in time. A naive mechanism is not going to work when developing a philosophy of quantum physics. We need something different.
Long term readers of mine will recognise that I tend to distinguish between the early and late theories of quantum physics, particularly when discussing the underlying philosophy. Firstly, quantum mechanics or wave mechanics. Secondly, there is quantum field theory. Many philosophers of science concentrate on wave mechanics, but I think this is a mistake (which is in the process of being rectified by the leading contemporary philosophers of physics, although it will be a while before their work filters into the general consciousness of the philosophical community). There are differences between them which might be important (or might not; we don't know until we try). So then why not start with the more fundamental approach, which is quantum field theory, to which quantum mechanics is just an approximation (and not a sound one)?
Quantum field theory originally emerged from the combination of special relativity and quantum mechanics. However, this isn't what distinguishes it from quantum mechanics (even though the naive relativistic adaptation of quantum mechanics is incoherent). There are non-relativistic versions of quantum field theory, which are widely used in condensed matter physics.
Quantum field theory can be thought of in terms of either a particle ontology or a field ontology. I personally prefer the particle ontology, since we directly observe particles and we don't directly observe fields, and, where possible, it makes more sense to make the objects we observe the fundamental entities. This means that the fields are redundant, and might just be a mathematical artefact of the theory. Or they might not -- it depends on how one approaches the philosophy. Particles (or field excitations) are essential in either case, to compare against observation. So rather than initially postulate the existence in reality of unobserved physical objects, it is better to leave it as an open question to be established by the investigation. This is possible in a particle ontology, where the key players are localised quantum particles, but not a field ontology, where the key players are extended fields.
Quantum mechanics necessarily adopts a particle ontology. In this picture, the main difference between quantum wave mechanics and quantum field theory concerns how the particles interact with each other. In quantum wave mechanics, like classical mechanics, the particles are indestructible and interact with an external potential (for example from an electromagnetic field, visualised in much the same way as in classical physics). In field theory, the particles are not indestructible. Interactions are local, and involve the creation or annihilation of at least one particle. So, for example, when two electrons interact with each other, one possibility is that the first electron is happily trundling along when a photon is created, with some of the electron's energy and momentum transferred to the photon. The photon then goes out and meets the second electron, where it is destroyed, and its energy and momentum are transferred to that second electron. The question of where the photon comes from and goes is not addressed in the theory, and might well be incoherent (based on a presumption that matter is indestructible and cannot change the type of particle). We can, however, say that the efficient cause of the photon is the first electron in its initial state, and the photon is part of the efficient cause of the second electron in its final state.
Quantum Mechanics - Experiment
Quantum mechanics was established through experiment; and I will highlight three in particular which were both historically important and highlight its features which make no sense in the philosophies that accompany classical physics. These are the photoelectric effect, the two slit experiment, and the EPR experiment.
For certain materials, it is possible to generate an electric current by shining light on them. Essentially what happens is that the light (classically, a wave in an electromagnetic field) interacts with an electron in the material, and transfers energy to it. This by itself can be explained in classical physics. The problem is the amount of energy transferred to each electron. In classical physics, this would depend on the intensity of the light. The brighter the light you shine at the electron, the more it should move. However, what was found was that, when the intensity was turned down very low, the individual electrons still gained the same amount of energy in the interaction. That energy depended not on the intensity of the light, but was proportional to its frequency.
In other words, the system behaves as though the light is not a continuous wave (as one expects in classical physics), but comes in discrete packets, with each packet having energy proportional to its frequency (the constant of proportionality is Planck's constant). That is, light waves behave like classical particles. There are, of course, plenty of other experiments showing that they behave like classical waves.
This leads to the conundrum known as wave-particle duality. How can something be both a particle and a wave? In practice, this puzzle only emerges because we try to fit quantum particles into a characterisation based on classical mechanics. But they don't fit into that characterisation. They are their own particular thing, which is neither a classical particle, nor a classical wave, but something else, which behaves perfectly consistently and also intelligibly if you discard the classical paradigm.
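The packet picture of the photoelectric effect described above can be put into numbers with a small sketch. The function names and the example beam are illustrative assumptions of mine; the only physics used is that each packet carries an energy equal to Planck's constant times the frequency, so that the power of the beam fixes how many packets arrive per second rather than how much energy each one carries.

```python
PLANCK_H = 6.62607015e-34  # Planck's constant in J s (exact SI value)


def photon_energy(frequency_hz):
    """Energy in joules of one packet of light at the given frequency."""
    return PLANCK_H * frequency_hz


def packets_per_second(power_watts, frequency_hz):
    """Number of packets per second delivered by a beam of the given power."""
    return power_watts / photon_energy(frequency_hz)


if __name__ == "__main__":
    green = 5.6e14  # an illustrative visible-light frequency in Hz
    for power in (1e-3, 1e-9):
        # Dimming the beam changes the arrival rate, not the energy per packet.
        print(power, photon_energy(green), packets_per_second(power, green))
```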
The two slit experiment is well known from optics, and also studies of water and sound. You need a source (of light, for example), a screen with two slits cut into it, and another screen that acts as a detector. You then shine the light through the screen onto the detector. What you observe is a sequence of light and dark areas. The reason for this is that there are two properties that control a wave: the amplitude and the intensity. The amplitude is essentially the maximum "height" of the wave (where "height" is used literally as the distance from the median value in the case of water waves, and analogously in the case of light or sound waves). The amplitude can be either positive or negative. When two waves cross paths, then the total amplitude of the two waves can either be enhanced if they both have positive or negative amplitude, or decreased towards zero if one is positive and the other negative. The energy carried by the waves, which controls the brightness of the signal, is proportional to the square of the total amplitude at that point. When there are two separate sources of light, which in effect is what the two slits provide for us, whether the waves interfere constructively or destructively depends on the relative difference in the path length from the source to the point on the detector compared to the wavelength of the light. In the two slit set-up, that difference changes as you move along the detector, creating the characteristic pattern of bright and dark spots.
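The dependence of brightness on path difference described above can be sketched in a few lines. This is a simplified model of my own, assuming two waves of equal amplitude and ignoring the finite width of each slit; the function name is hypothetical. Each wave is represented by a unit phasor, the two are added, and the intensity is the squared magnitude of the sum, scaled so that the brightest point has intensity 1.

```python
import math


def relative_intensity(path_difference, wavelength):
    """Brightness (0 to 1) where two equal waves meet with a path difference."""
    phase = 2 * math.pi * path_difference / wavelength
    # Add the two waves as unit phasors: 1 + exp(i * phase).
    real = 1 + math.cos(phase)
    imag = math.sin(phase)
    # Intensity is the squared total amplitude; dividing by 4 normalises it.
    return (real * real + imag * imag) / 4


if __name__ == "__main__":
    wavelength = 500e-9  # illustrative value in metres
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(fraction, relative_intensity(fraction * wavelength, wavelength))
```

A path difference of a whole number of wavelengths gives a bright spot, and a half-integer number gives a dark one.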
All of that is well understood in classical physics. However, the surprise in quantum physics is that the effect was repeated when you tried to fire objects that had been considered as particles, such as electrons, through the slits. This suggests that electrons also exhibit wave-like behaviour. This would make sense if the electron wasn't, as we expect for a classical particle, at a single localised point, but instead spread out over a wider area of space.
But it becomes weirder still. If you lower the intensity of the electron beam (the number of electrons emitted per second), you can see the electrons arriving at the detector one by one. They appear at localised points on the detector, seemingly at random. This is the behaviour we expect from a classical particle. At first, the interference pattern is obscured, but as you leave the experiment running over a period of hours, you see the classical interference pattern emerge when counting the number of electrons at each point in the detector. So there is also an interference pattern -- characteristic of the classical wave, but it only builds up over time. The question then becomes what is the electron interfering with? There seems to be nothing there at any given moment of time except the single electron.
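The gradual build-up described above can be imitated with a toy simulation. It is only a sketch under assumptions of my own: positions on the detector are measured in units of the fringe spacing, each electron lands independently with probability proportional to cos²(πx), and the function names are hypothetical. It says nothing about what the electron does between source and detector.

```python
import math
import random


def detect_electrons(count, rng, half_width=2.0):
    """Draw landing positions one at a time by rejection sampling."""
    hits = []
    while len(hits) < count:
        x = rng.uniform(-half_width, half_width)
        # Accept the candidate with probability cos^2(pi x).
        if rng.random() < math.cos(math.pi * x) ** 2:
            hits.append(x)
    return hits


def histogram(hits, bins=16, half_width=2.0):
    """Count the hits falling in equal-width bins across the detector."""
    counts = [0] * bins
    for x in hits:
        index = min(int((x + half_width) / (2 * half_width) * bins), bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    for total in (20, 200, 4000):
        # With few electrons the fringes are obscured; with many they emerge.
        print(total, histogram(detect_electrons(total, rng)))
```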
Of course, once again, the reason why the electron exhibits both particle-like and wave-like behaviour is because it is neither a classical particle nor a classical wave but something else. What that something else is has to be determined by the philosophy of quantum physics.
The third key experiment concerns entanglement. There are various different ways in which this phenomenon exhibits itself, so I will pick my favourite example. Suppose a spin zero particle decays into two spin half particles. Spin is a property of a quantum particle, which (for spin half particles) is described by three parameters: one giving the magnitude, and two describing what is (in the mathematics) equivalent to an axis of rotation. It is not really an axis of rotation. The origin of spin is well understood from Dirac's equation for relativistic quantum mechanics, where to describe a particle it is necessary to introduce four complex degrees of freedom in addition to its location. These split into two sets of two. One of the sets describes a particle with opposite charge but identical mass, which predicts the existence of anti-particles, which was experimentally confirmed soon after Dirac wrote down his theory. The two parameters within each set can be mapped onto each other by applying various transformations. The group of all these transformations is known as SU(2). However, the way the members of SU(2) combine closely mirrors SO(3) (of which SU(2) is the double cover), the group that describes the rotations of three real coordinates (while preserving the length, the square root of the sum of the squares of those coordinates). Thus the mathematics describing the spin half particle has certain similarities to the mathematics describing rotations of a sphere in three dimensional space. The direction spin is measured along is analogous to the axis of rotation of a sphere, and so I will use that terminology.
A spin-zero particle has no spin. A spin half particle can have spin values of ½ℏ or -½ℏ along any axis. Spin is a conserved quantity in quantum physics, so when a spin zero particle decays, the two products of that decay must total up to zero spin. So one of the decay products must have spin ½ℏ and the other spin -½ℏ. This is true whichever axis of rotation we use to measure the spin.
So far, this all makes sense whether we are thinking in terms of classical physics or quantum physics. But now things get weird. Spin measured along different axes of rotation are some of those variables in quantum physics which do not commute with each other. So, for example, if the spin along the x-axis is measured to be ½ℏ, then the spin along the y-axis is undetermined, and vice versa. If you take a set of particles with spin ½ℏ along the x-axis, and then measure the spin along the y-axis, then half of them will have spin ½ℏ and half spin -½ℏ. This is understandable from a classical perspective -- the x-axis and y-axis spins are described by independent variables, and are uncorrelated. The next part is a little bit weirder from a classical perspective. At this point, the particle is known to have definite y-axis spin, so the x-axis spin is undetermined. Consequently, if you subsequently take the particles and remeasure the x-axis spin, half of them will have spin ½ℏ and the other half spin -½ℏ. So, if we were thinking classically, we would have to say that measuring the spin along one axis can change the value of spin along the other. But there is a certain random element (by which I mean unpredictable from all the known physical causes) to what the x-axis spin will be, which is troubling from the context of classical determinism.
But we can also choose to measure the spin along an axis between the x and y-axes. For example, we can choose a direction 5 degrees off the x-axis. In quantum physics, the result is correlated with the value you measure along the x-axis. It will be the same a fraction cos²(½θ) of the time, where θ is the angular difference between the x-axis and the axis we measure along. In classical physics, you usually need a separate number to describe each possible variable, but you can still have correlations between different parameters. And one can construct the theory so that it matches the quantum result.
Of course, performing this measurement is not easy, because if you measure the spin along the x-axis, you can't simultaneously measure the spin along an axis 5 degrees divergent from the x-axis, and any attempt to do so will affect any future measurements. But this is where the set-up with the spin zero particle comes in. Although you can't measure the spin of the same particle along two different axes, you can measure the spin of its partner particle, and thus deduce what the spin of the original particle would have been. So the arrangement is that you set the detector for one of the particles, B, to measure the spin along the x-axis, and the detector for the other, A, to measure the spin along the axis offset by 5 degrees. And you repeat this numerous times. When B is measured to have spin -½ℏ, the x-axis spin inferred for A is +½ℏ (since the two must total zero), and a fraction cos²(½·5°) of the corresponding measurements for A will have spin +½ℏ.
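The cos² rule used above for axes in the x-y plane can be checked with a short calculation. This is a sketch in the standard two-component description of a spin half particle; the function names are my own, and angles are measured from the x-axis so that 90 degrees corresponds to the y-axis.

```python
import cmath
import math


def spin_up_along(angle_deg):
    """Spin +1/2 state along an axis in the x-y plane (z-basis components)."""
    phase = cmath.exp(1j * math.radians(angle_deg))
    norm = 1 / math.sqrt(2)
    return (complex(norm), norm * phase)


def match_probability(prepared, measured):
    """Probability that a particle prepared in one state is found in another."""
    overlap = sum(m.conjugate() * p for m, p in zip(measured, prepared))
    return abs(overlap) ** 2


if __name__ == "__main__":
    x_up = spin_up_along(0)
    for angle in (5, 45, 90):
        # Compare against cos^2 of half the angle between the two axes.
        rule = math.cos(math.radians(angle) / 2) ** 2
        print(angle, match_probability(x_up, spin_up_along(angle)), rule)
```

At 90 degrees the probability is one half, which is the x-axis then y-axis case discussed earlier; at 5 degrees it is very close to one.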
But what if we rotate the axis of rotation for both detectors? Here the predictions from classical and quantum physics diverge. If we move the detectors by angles θ_A and θ_B, then the quantum prediction is cos²(½(θ_A - θ_B)), while the classical prediction, to remain consistent with previous results, would be cos²(½θ_A)cos²(½θ_B) + sin²(½θ_A)sin²(½θ_B). These are different formulae, with different results, and can be experimentally distinguished.
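To see how far apart the two formulae above can be, here is a small sketch that evaluates both for a few detector settings. The function names are my own, angles are in degrees, and the "classical" function is simply the expression quoted in the text rather than a general hidden variables model.

```python
import math


def quantum_match(theta_a_deg, theta_b_deg):
    """Quantum prediction: cos^2 of half the difference between the angles."""
    half_difference = math.radians(theta_a_deg - theta_b_deg) / 2
    return math.cos(half_difference) ** 2


def classical_match(theta_a_deg, theta_b_deg):
    """The classical expression from the text, built from single-axis rules."""
    half_a = math.radians(theta_a_deg) / 2
    half_b = math.radians(theta_b_deg) / 2
    return (math.cos(half_a) ** 2 * math.cos(half_b) ** 2
            + math.sin(half_a) ** 2 * math.sin(half_b) ** 2)


if __name__ == "__main__":
    for angles in ((5, 0), (45, 45), (90, 90), (120, 60)):
        print(angles, quantum_match(*angles), classical_match(*angles))
```

When one detector stays on the original axis the two expressions agree, which is why the earlier single-rotation results cannot tell them apart. When both detectors are rotated by 90 degrees the quantum value is 1 and the classical value is one half.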
The classical hidden variables theory is built on a number of assumptions.
- The probability of a spin match between the expected x-axis value (as inferred from the spin partner) and the value obtained when the axis is rotated is given by cos²(½θ).
- The parameters (known in the literature as beables) which determine the value for each measurement are set at the moment of decay but are otherwise independent for each particle (hidden variables), and these are sufficient to determine all the measurement results.
- The two measurements are independent of each other at the moment of measurement; correlations can only arise from causes which are in the past light-cones of both measurements, but otherwise the events on one particle do not influence the events at the other particle. This is implied by the locality of physical events.
- The result should be calculated by classical probability theory.
- Causation is always forwards in time.
- Each measurement only gives a single result.
- There is no conniving between the experimenters at each location; the internal construction of one of the measuring devices is not affected by the other one.
EPR experiments, and others like them, show that at least one of these assumptions must be false. This is again difficult for enlightenment mechanism.
Quantum Mechanics - Theory
The central object in wave mechanics is the wavefunction, usually denoted by the Greek letter ψ. This represents a complex vector field, i.e. it is parametrised by a single, or in some applications more than one, complex number at each location in space (and also time, although that depends a bit on which representation you use). For example, a spin-half particle requires two complex numbers at each location to parametrise it. What precisely the wavefunction represents is disputed between the different interpretations, so I won't discuss that here. It is associated with a single particle.
Wave mechanics is concerned initially with one question: given a system of particles in a known initial state, what is the probability that a little later in time an experimenter will measure a given result? The theory is constructed to predict the results of experiments. What happens between the initial state and the final measurement (or measurements) of the experiment is not a question which the mathematics of quantum wave mechanics is set up to answer. This is where the philosophy of quantum physics is needed. Obviously the mathematics gives various clues, but here we have another problem. There are numerous different, mathematically equivalent but different in conception, formulations of quantum wave mechanics. What I will describe here is the canonical approach which is usually first taught to undergraduate physicists. But since there are different formulations, they can be put together in different ways to inspire different philosophies of physics. The question of measurement probabilities is important, but it is not the only question which quantum wave mechanics can answer. For example, it can also be used to predict and understand the properties of substances (in ways that make numerous 18th and 19th century philosophers look embarrassingly ignorant).
Wave mechanics is based on three principles.
Firstly, there is an equation describing how ψ evolves in time, the Schrödinger equation. There are different forms of the Schrödinger equation, which are used in different circumstances. The non-relativistic form of the equation is also (perhaps confusingly) usually called the Schrödinger equation. The relativistic forms are known as the Klein-Gordon or Dirac equations. For the moment, I will concentrate on the non-relativistic equation. The way this works is actually similar to classical mechanics. We start by positing an initial condition; some set value for ψ at a given moment in time. We then solve a differential equation for ψ, which tells us what its value will be at all subsequent times. In quantum wave mechanics, this process is deterministic. The process of calculating the evolution is different in the more advanced quantum field theory. Instead of solving a differential equation, one needs to solve a time-ordered integral equation, where the objects within the equation are not (complex) numbers but operators, and the mathematical object subject to that equation does not represent a single particle, but enumerates how many particles exist at each location. But there is still an idea of something which evolves in time according to a mathematical equation which in some way allows us to make physical predictions.
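For reference, the non-relativistic equation referred to above is usually written for a single particle as follows. The symbols m (the particle's mass) and V (the external potential) are standard notation introduced here rather than taken from the text.

```latex
i\hbar \frac{\partial \psi(\mathbf{x}, t)}{\partial t}
  = \left[ -\frac{\hbar^2}{2m} \nabla^2 + V(\mathbf{x}, t) \right] \psi(\mathbf{x}, t)
```

Given ψ at one moment, this fixes ψ at all later times, which is the sense in which the evolution is deterministic.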
Secondly, there is the Born rule. This tells us that the probability that the particle is found at a particular location is proportional to the modulus squared of ψ. What then gets confusing is that at that point ψ resets itself. For example, suppose that we start the system off in an initial state at time t0, then at time t1 we perform a measurement on the particle; then at time t2 we perform a subsequent measurement, and so on. So we set up the initial state, perform the Schrödinger evolution to calculate the wavefunction at time t1, and then apply the Born rule to calculate the probability of each possible measurement outcome. We then need to resume the Schrödinger evolution to update the wavefunction to time t2, so we can make predictions about the next measurement. There are two options in principle to how we might do this: we could take the wavefunction computed before the t1 measurement, and evolve that further; or we could take the wavefunction corresponding to the measured result at t1. It is this second method which is needed to get results which agree with experiment. The process of measurement itself disturbs the particle wavefunction -- and without taking this into account, the Schrödinger evolution misses out an important change in the potentials acting on the particle. So we cannot just ignore the measurement and continue as we were, but have to use the updated wavefunction as the initial state for the next period of Schrödinger evolution. The measurement step, where we apply Born's rule, is indeterminate, meaning unpredictable. We cannot predict what the result of the measurement will be, even with perfect knowledge of the initial conditions and the laws of motion. In this way, quantum physics differs substantially from classical physics. The Born rule is used identically in both wave mechanics and quantum field theory.
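The two-step recipe above (apply the Born rule, then continue from the wavefunction corresponding to the measured result) can be sketched for a system with a finite list of possible outcomes. This is a toy illustration with hypothetical function names, assuming the wavefunction is given as a list of complex amplitudes in the basis being measured; it leaves out the Schrödinger evolution between measurements.

```python
import random


def born_probabilities(amplitudes):
    """Probability of each outcome: modulus squared, normalised to sum to 1."""
    weights = [abs(a) ** 2 for a in amplitudes]
    total = sum(weights)
    return [w / total for w in weights]


def measure(amplitudes, rng):
    """Pick an outcome at random and return it with the updated wavefunction."""
    probabilities = born_probabilities(amplitudes)
    outcome = rng.choices(range(len(amplitudes)), weights=probabilities)[0]
    # The updated state is the one corresponding to the measured result.
    updated = [0j] * len(amplitudes)
    updated[outcome] = 1 + 0j
    return outcome, updated


if __name__ == "__main__":
    rng = random.Random(2024)
    state = [0.6, 0.8j]  # illustrative amplitudes at time t1
    first, state = measure(state, rng)
    second, state = measure(state, rng)
    # Repeating the same measurement straight away gives the same result.
    print(first, second)
```

Using the updated state rather than the pre-measurement one is what makes an immediately repeated measurement agree with the first.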
Thirdly, there is the idea of different bases. These are different but equivalent ways of parametrising the same data. One can think of a two dimensional Euclidean surface, which one represents by drawing Cartesian axes, and representing each point on the surface by two numbers. But this process of drawing axes involves some arbitrary choices. Where do we place the origin? How do we orientate the axes (we can rotate them)? What scale do we put on the axes? All these different choices are functionally equivalent. We are interested in physical points on the surface, not the details of the representation, and any distances or angles between those points would be the same regardless of how the axes are drawn (once we have adjusted for any difference in measurement units). So in practice, we draw the axes in whatever orientation makes solving the particular problem we are interested in easiest. These different choices of axes are analogous to the different bases of quantum physics. They are just different ways of representing the same data, while a different basis might be more useful in a particular application.
For example, I introduced the wavefunction as a field which is parametrised by a different (complex) number at each location. This is one particular basis, and it is useful when answering questions about the location of the quantum particle. There are, however, other valid bases, and a particularly useful one is the momentum basis. In this basis, the wavefunction is parametrised by a different (complex) number at each possible momentum. There is a mathematical transformation, known as the Fourier transform, which maps from the location basis to the momentum basis and back again. So if we want to ask what the probability is that the particle will be measured to have a given momentum, we perform the Schrödinger evolution from the initial state, switch over to the momentum basis, and use the Born rule.
Each possible observable quantity is associated with a particular basis. For example, with regards to spin, there is a different basis for each possible axis of the spin. These bases are all mathematically equivalent, but describe different data.
A wavefunction which is localised in one basis (i.e. is only non-zero at one particular possible measurement value) will be spread out in another basis (i.e. has numerous different non-zero amplitudes for particular measurement values). For example, consider a particle which has a definite location, or at least, where we can be sure that it lies within a very small region. The wavefunction will be non-zero within that region, with the squared modulus of the wavefunction integrated over all the points in that region adding up to one, and zero everywhere else. If we take this particle, and perform the Fourier transform to obtain the wavefunction in the momentum basis, we will find something whose mod squared value approaches uniformity across all momenta as we reduce the size of the region in location space to zero. There is a mathematical theorem which states that the standard deviation of the squared modulus of a function (a measure of how spread out it is) multiplied by the standard deviation of the squared modulus of its Fourier transform cannot be less than a half (in suitable units). This is the origin of the Heisenberg Uncertainty Principle. How we interpret the Uncertainty Principle depends on what philosophy of quantum physics we adopt, but at its root it is just a mathematical relation connecting wavefunctions in different bases.
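The trade-off between the two spreads can be checked numerically. The following is a rough sketch, assuming a Gaussian wave packet on a finite grid, units in which ħ = 1 (so momentum and wavenumber coincide), and a direct discrete approximation to the Fourier transform. The grid size and packet widths are arbitrary choices of mine.

```python
import cmath
import math

def spread(values, amplitudes):
    """Standard deviation of `values` weighted by |amplitude|^2."""
    weights = [abs(a) ** 2 for a in amplitudes]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return math.sqrt(var)

def to_momentum_basis(xs, psi, ks):
    """Discrete approximation to the Fourier transform of psi."""
    dx = xs[1] - xs[0]
    return [sum(p * cmath.exp(-1j * k * x) for x, p in zip(xs, psi)) * dx
            for k in ks]

def spreads(sigma, n=128, length=20.0):
    """Return (location spread, momentum spread) for a Gaussian packet."""
    dx = length / n
    xs = [(i - n // 2) * dx for i in range(n)]
    ks = [2 * math.pi * (j - n // 2) / length for j in range(n)]
    psi = [cmath.exp(-x * x / (4 * sigma ** 2)) for x in xs]
    phi = to_momentum_basis(xs, psi, ks)
    return spread(xs, psi), spread(ks, phi)

wide_x, wide_k = spreads(1.0)      # broader in location
narrow_x, narrow_k = spreads(0.5)  # tighter in location, broader in momentum
```

A Gaussian is the special case which sits exactly on the bound, so both products should come out very close to a half; any other shape of wavefunction would give something larger.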
When we establish a basis, we introduce two sets of numbers. The first parametrises possible states, which represent the different ways the particles can be observed. So there is one state which represents the particle at one location, another state which represents the particle at another location, and so on. Secondly, we need a number connected to each state which describes the value of the observable in question. For location, this would be a three dimensional Cartesian coordinate. To calculate the average value of the observable, we need to project the wavefunction into each of the possible states in turn (which calculates the overlap between the wavefunction and the basis state), multiply by the measurement value associated with that state, multiply by the conjugate of the wavefunction projected into that state, and add up over every possible state. If s represents the basis state, Σ_s the sum over all states, v_s the value of the observable associated with the state s, <v> the average value of v, and † the mathematical operation of complex conjugation, then this process is represented by the equation:
<v> = Σ_s (ψ†s) v_s (s†ψ)
So, for example, if we were measuring location, s would be a complex field which is the right sort of infinity at precisely that location and zero everywhere else, while v_s denotes the Cartesian coordinates for that location. s†ψ is known as an inner product (this is what I described above as measuring the overlap). It takes two complex fields (which have numbers over all space) and creates from them a single complex number which is proportional to the magnitude of s and proportional to the magnitude of ψ and maintains the same value when we rotate these mathematical objects into a different basis. This last feature is important, because it maintains the basis-independence of the observed quantity <v>. The choice of basis is arbitrary; the use of the inner product means that we will make the same prediction no matter which basis we use. (Yes, I have introduced s as a state for a particular basis, but it also, just like ψ, has representations in every possible basis.)
We can re-write the equation above in the following form:
<v> = ψ† (Σ_s s v_s s†) ψ
We can then define V as Σ_s s v_s s†, so
<v> = ψ†(Vψ)
Mathematically, V is known as an operator, which basically means that it takes one wavefunction as input, and spews out another complex vector field as output. The operator can be transformed into any basis, just like the wavefunction, and is another way of representing the possible states (known as the eigenstates of the operator) and values (known as the eigenvalues of the operator) associated with an observable.
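To see that the sum over states and the operator form give the same average, here is a small sketch for a two-state system. The basis states, the values ±0.5 attached to them, and the wavefunction are all hypothetical numbers chosen for illustration, and the helper names are mine.

```python
import math

def inner(a, b):
    """Inner product a†b of two complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def expectation_by_states(psi, states, values):
    """Sum over basis states of (psi†s) * value * (s†psi)."""
    return sum(inner(psi, s) * v * inner(s, psi) for s, v in zip(states, values))

def build_operator(states, values):
    """Matrix of the operator: sum over states of s * value * s†."""
    n = len(states[0])
    return [[sum(s[i] * v * s[j].conjugate() for s, v in zip(states, values))
             for j in range(n)] for i in range(n)]

def apply(op, psi):
    """Act with an operator (matrix) on a wavefunction (vector)."""
    return [sum(op[i][j] * psi[j] for j in range(len(psi)))
            for i in range(len(op))]

r = 1 / math.sqrt(2)
states = [[r + 0j, r + 0j], [r + 0j, -r + 0j]]  # an orthonormal pair of basis states
values = [0.5, -0.5]                             # value attached to each state
psi = [0.6 + 0j, 0.8 + 0j]                       # a normalised wavefunction

by_states = expectation_by_states(psi, states, values)
observable = build_operator(states, values)
by_operator = inner(psi, apply(observable, psi))
```

Both routes give the same number (0.48 for these inputs), which is all the re-writing of the equation claims.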
The most important operator is known as the Hamiltonian operator. Its eigenvalues represent energy states. Energy in quantum physics is just a label for particular states. The Hamiltonian is also closely related to the time evolution operator, describing how wavefunctions evolve in time. The time evolution of a quantum particle in an energy eigenstate is easy to compute: it is just a rotation of the complex phase at a constant rate. The energy eigenvalue remains the same during this process. This could be thought to imply that energy is conserved, although in practice to prove conservation of energy we need to consider the locality of quantum field theory. The Fourier transform is also used to map between a time basis and an energy basis. The Hamiltonian can also be written in terms of the momentum of the particle (ultimately the reason for this comes down to preserving the Lorentz symmetry that underlies special relativity), giving a precise mathematical form. So to calculate the time evolution of a quantum particle, we generally transform into the energy/momentum basis, perform the time evolution in that basis, and then convert back to the space-time basis.
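In the energy basis the time evolution is simple enough to write down directly. The sketch below assumes the usual convention that an eigenstate with energy E picks up the phase factor exp(-iEt/ħ), works in units where ħ = 1, and uses made-up energies and amplitudes.

```python
import cmath
import math

def evolve(amplitudes, energies, t, hbar=1.0):
    """Rotate the phase of each energy-basis amplitude at its own constant rate."""
    return [a * cmath.exp(-1j * e * t / hbar)
            for a, e in zip(amplitudes, energies)]

# A hypothetical superposition of two energy eigenstates.
amplitudes = [0.6 + 0j, 0.8j]
energies = [1.0, 3.0]
later = evolve(amplitudes, energies, t=0.7)

# A single eigenstate returns to its starting phase after one period 2*pi*hbar/E.
eigenstate = [1 + 0j]
period = 2 * math.pi / 2.0
after_period = evolve(eigenstate, [2.0], t=period)
```

The magnitudes of the amplitudes in the energy basis never change; only the phases move, each at a rate set by its own energy.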
So, for example, we can consider the two slit experiment. What we have is a particle in an initial state at a particular location. As the wavefunction evolves in time, it gradually spreads out (representing the possible locations of the particle). It encounters the wall, and the two slits, and a separate part of the wavefunction passes through each slit. Now these parts are in a definite location again, so they gradually spread out from the slits. By the time they hit the detector, the two parts of the wavefunction have overlapped. To calculate the total amplitude for the particle being measured at any point in the detector, we need to add together the amplitudes for each part of the wavefunction being at that point. The complex phase (which measures whether we are at a "peak" or a "trough" of the wave) of each part of the wavefunction will depend on the route it took to get to that point. In some locations the two parts of the wavefunction will cancel each other out; in other locations they will add together to give an amplitude twice as big. The probability of finding the particle at that location is proportional to the square of the amplitude. If we repeat the experiment enough times, then we can compare the calculated probability distribution from the theory against the measured frequency distribution of the experiment, and the prediction and observation agree: there is a sequence of spots with very few or no particles and spots with large concentrations of particles. A classic interference pattern.
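The adding of amplitudes can be illustrated with a very simplified model. This sketch assumes two point-like slits, a unit-magnitude amplitude from each slit whose phase is fixed by the path length, and no fall-off with distance. The wavelength, slit separation and screen distance are illustrative numbers, not taken from any particular experiment.

```python
import cmath
import math

def intensity(y, wavelength, slit_separation, screen_distance):
    """Relative probability of detection at height y on the screen."""
    k = 2 * math.pi / wavelength
    # Path length from each slit to the point y on the screen.
    r1 = math.hypot(screen_distance, y - slit_separation / 2)
    r2 = math.hypot(screen_distance, y + slit_separation / 2)
    # Add the two amplitudes first, then take the squared modulus.
    amplitude = cmath.exp(1j * k * r1) + cmath.exp(1j * k * r2)
    return abs(amplitude) ** 2

wavelength = 5e-7
slit_separation = 1e-4
screen_distance = 1.0

# Centre of the screen: equal path lengths, so the amplitudes add.
bright = intensity(0.0, wavelength, slit_separation, screen_distance)

# Where the paths differ by about half a wavelength, the amplitudes cancel.
y_dark = wavelength * screen_distance / (2 * slit_separation)
dark = intensity(y_dark, wavelength, slit_separation, screen_distance)
```

At the centre the amplitude is doubled and the intensity is four times that of a single slit; at `y_dark` the two contributions are almost exactly out of phase and the intensity is close to zero.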
What I described above is a summary of the mathematical procedure used to predict a frequency distribution. Quite what that procedure means for what happens to the individual electron is not clearly specified in the mathematics. All we know is that at the start of the experiment the electron is "here" and at the end of the experiment the electron is "there." Indeed, the theory does not even say where "there" is for any individual electron. All it gives us is a probability distribution, which can be compared to a frequency distribution after the experiment has been repeated a large number of times. To get from the mathematics to what happens in reality requires an interpretation, or philosophy of quantum physics.
The Copenhagen Interpretations
The Copenhagen interpretation (which in practice covers a family of similar ideas) is in many ways the default position that is taught to physics students. In practice, of course, physicists acting as physicists are taught not to care too much about what goes on behind the scenes. We are taught how to do the mathematics -- and have become very good at it. We are taught, at least in advanced classes, how the mathematics emerges from deeper principles of symmetry and creation and annihilation. We are taught numerous different formulations of the theory, and how to apply them to different circumstances: first some artificial models used for training purposes, and then the more complex real-life scenarios. We are taught how to compare the predictions against experiment, and that the theory works exceptionally well, to the extent that nobody can doubt that to some extent it is correct. And through all of this work -- very important work -- we ignore the question of what is really going on in practice.
There are good reasons why we ignore the question. It is, firstly, very difficult. All valid interpretations of quantum physics lead ultimately to the same mathematical formulation, and thus the same experimental predictions. The physicist's main tool to determine the truth is therefore inapplicable. Secondly, as the physicist is primarily concerned with making experimental predictions, he doesn't need to know the answer to this question. It is just a distraction. Indeed, it is a costly distraction, as historically it has just led researchers down a rabbit-hole that has produced few results of the kind the physicist regards as useful.
So all we do is get a brief overview of the Copenhagen interpretation, and move onto more important issues.
Obviously for the philosopher of physics this is not satisfactory. So those philosophers do debate the issue, joined in by the few physicists who are interested in the question. But even within philosophy, it is a question which is only of peripheral interest. Most philosophers pay only scant attention to the philosophy of quantum physics. This is unfortunate, since it is of crucial importance for the validity of various models of metaphysics, and metaphysics is the foundation of all other branches of philosophy (even those which deny the importance of metaphysics).
The Copenhagen interpretation was promoted by several prominent early quantum physicists, most notably Bohr and Heisenberg. There are various forms of it. The first takes a view where the wavefunction is ontologically real and represents the actual electron. So, in this interpretation, the electron passes through both slits. For the Schrödinger evolution of the wavefunction this is fine. We effectively treat the wavefunction as we would treat a classical field: we track its evolution by solving a differential equation, and that's all there is to it. A second approach might say that things are undefined or not real between measurements. What happens behind the scenes is not something we can know or understand, so it is not worth bothering with. A third approach is more idealist, and concentrates on the role of the observer.
At the root of the Copenhagen interpretation are a number of principles. The first of these is the correspondence principle, which states that the results of quantum measurement correspond to classical properties. In other words, the measurement devices and their results obey classical mechanics, and we are right to think about these things in the same way as we would in classical mechanics. Indeed, Bohr believed that there was no other way we can think about the world, so this has to be our starting point. Obviously what goes on behind the scenes is not governed by classical mechanics, so there is a distinction made between the world we observe and the world that gives rise to those observations. The correspondence principle gives expression to the demand to uphold the use of classical concepts to the largest possible extent compatible with the quantum postulates. Quantum physics was thus seen by Bohr as a generalisation of classical mechanics, which just reinterpreted some of the concepts. In particular, it ascribed these classical concepts to the results of measurements; when applied to the actual particles themselves such properties are ill-defined.
The second principle is that certain observables are complementary, for example space/time and momentum/energy, or the wave-nature or particle nature of fundamental particles. Quite what Bohr meant by this principle changed over his life, but the underlying idea is that there are two different descriptions of the particle that seem to be contradictory but are in practice never applied together in the same context. Every time we measure a particle's location it is disturbed so that the momentum is undetermined. Sometimes we can ascribe A) a definite location and indeterminate momentum to the particle, and sometimes B) a definite momentum and indeterminate location, but the descriptions A and B do not contradict each other because they are used in different circumstances. We are still right, however, to use the classical concepts of location and momentum to describe what we measure about the particle. What this means in practice is that the state of the measuring device and the ontological status of the thing that is measured cannot be separated from each other. Kinematic and dynamic observables are ill-defined unless they are treated as the result of a measurement outcome. Bohr believed that the empirically observed objects were real, but not that the theory should be taken literally, and its various mathematical functions transposed into real and actual objects. That doesn't mean that the wavefunction contains no hints of physical reality; it just means that we cannot take it as a one-to-one literal representation of what is going on in reality. As such, even discussion of "wavefunction collapse" is misleading, as it implies the existence of something well defined before the collapse.
However, other forms of the Copenhagen interpretation, derived from Heisenberg's presentation, did treat the wavefunction as ontologically real. In other words, the wavefunction is a direct representation of the electron. The electron does pass through both slits. The electron literally collapses when we perform a measurement. Heisenberg in particular emphasised the role of the observer in wavefunction collapse. In the extreme form (associated with Wigner and von Neumann), this has been interpreted to say that it is the involvement of a conscious observer which results in wavefunction collapse and the indeterminate changes in reality. More modern interpretations would remove the role of consciousness, and point to decoherence as the cause of wavefunction collapse (a far more reasonable position, and one with some justification in the underlying theory).
With regard to complementarity, Bohr saw the causal representation of quantum physics in terms of energy/momentum conservation. Heisenberg saw it in terms of Schrödinger evolution. For Heisenberg, the complementarity is between the deterministic wavefunction evolution and the indeterminate collapse on measurement.
The problem with treating the wavefunction as the real and actual electron comes because we then have to explain Born's rule. The wavefunction is spread out over all space, but we only ever observe the actual electron in one particular location. This process is then described as wavefunction collapse. The idea is that at a particular moment the wavefunction instantaneously reduces from being a superposition of states to being found in a single quantum state. In Heisenberg's formulation of the Copenhagen interpretation, it was the process of observation or measurement itself which triggered the collapse. This, however, raises various questions. What counts as an observation? Does it have to be an intelligent observer, in which case how do minds come into the universe? How can whether or not someone is looking at something change reality? There is also in the Copenhagen interpretation an implicit division between the classical and quantum realms (the measurement device has to be described according to classical mechanics), but then we have to ask where we draw the line between them. Obviously, some idealist philosophies have seized on some of these questions; but then we face the problem that human minds (if we ignore any supernatural input) arise from brain structures, which are themselves built up out of quantum physical objects. So who observes the quantum processes which take place in the brain, and how do we describe that without collapsing into circularity?
More modern interpretations will say that it is decoherence which triggers wavefunction collapse: when the quantum particle comes into contact with a large system. Indeed, some people hold that this solves the problems with the Copenhagen interpretation. This more modern interpretation has some important validation. Decoherence is a consequence of the theory. It does imply that the particle will collapse (to a very good approximation) into a single state in a given basis. It does not require an observer as such; only a larger complex system of particles.
In the Copenhagen interpretation, physics is intrinsically indeterminate.
It also states that physics is time-irreversible. In non-relativistic quantum wave mechanics, Schrödinger evolution is time reversible, like classical mechanics. This means that if you take the result of an experiment, turn everything round, then everything will wind back to the starting point. In quantum field theory (as far as we know), the theory is not strictly time reversible, but it is symmetric under time reversal, space reversal, and charge conjugation (converting particles to anti-particles) together. So there is something similar to time-reversibility in there. However the wavefunction collapse is irreversible. As soon as it happens, the information that existed in the wave-function prior to the collapse is lost; one cannot trace the particle back to its initial state.
Furthermore, the Copenhagen interpretation implies that properties are undefined until measurement. We don't and can't know where the electron is until it hits the detector; in fact the question itself is nonsensical. What we should be looking at is the wavefunction, which is in a superposition of location states.
The Copenhagen interpretation also supposes that quantum physics applies to individual objects rather than ensembles of objects. It also supposes that wave-functions are objective, and do not depend on the individual researcher performing the experiment.
Although originally developed for non-relativistic wave-mechanics, the Copenhagen interpretation can easily be extended to describe quantum field theory.
Criticism of the Copenhagen Interpretations
There are several criticisms of the Copenhagen interpretations which I want to mention. The first is that it is not clear what the physical beables are. The interpretation doesn't really specify what precisely the electron is, or what really happens behind the scenes. It doesn't, in short, provide an answer to the main question which philosophers of physics are interested in. Tim Maudlin, for example, dismisses the interpretation out of hand simply because there is little agreement about what specifically the interpretation is, and because there is no answer to the question of the ontology of the theory, or its dynamics. I personally think that this criticism of obscurity can be avoided by treating it as a family of interpretations centred around a few common ideas (in particular that measurements ought to be treated classically). Certainly, if we treat the wavefunction as ontologically real, and the dynamics of the wavefunction as described by the Schrödinger equation, and the collapse as something triggered by decoherence, then we do have a theory of the ontology and dynamics. On the other hand, Bohr's particular version of the Copenhagen interpretation did not take the wavefunction as corresponding to something in reality, but primarily as a mathematical abstraction used to predict the results of experiments (even if it does give insights into physical reality). So while Maudlin's criticism is certainly valid for some presentations of the Copenhagen interpretation such as Bohr's, I am not sure that it can be used against all of them.
More serious problems are related to the process of measurement in this interpretation. The problem is that the Copenhagen interpretation relies on two different dynamical rules: firstly wavefunction evolution, and then wavefunction collapse, with no real unifying principle between them. Decoherence does not answer this fully. Decoherence states that the quantum particle will be found in one of the states in a given basis: in that basis, there is a very low amplitude (for all practical purposes zero) for a superposition. But this doesn't answer the question of which state the particle finds itself in. That still needs an additional dynamical rule beyond Schrödinger evolution of the wavefunction. There is, of course, Born's rule, but Born's rule only provides a probability distribution. It is only useful in comparison to a frequency distribution. It does not tell us what happens to the individual particle, or why it goes into the state it does.
This problem is even more pronounced when the decohered basis includes different location states. Prior to decoherence the electron is spread out over all space. Subsequent to it, it is located in one single location. There is thus an instantaneous reduction of the wavefunction to a single point. This does not sit well with special relativity, especially since there is no unambiguous definition of "instantaneous": which events count as simultaneous depends on the reference frame.
This issue continues when we consider entanglement experiments. The underlying mathematics, of course, works out precisely correctly: we have two linked wavefunctions, one for each particle, but combined together in the mathematics so that they must have opposite spin. We write the initial state as a superposition of (for example) a) particle A being in a spin up state and particle B being in a spin down state plus b) particle A being spin down and particle B being spin up. This might seem like we need to select a basis and spin orientation, but in practice we can transform to any other basis, and will get the same pattern. Then we apply the Schrödinger evolution to the wavefunction. The Schrödinger equation is linear, so we can calculate the result of the equation on the first term in the superposition, and then do the second term, and at the end we add the two results together. Within each term in the superposition we have the two single particle wavefunctions; and we just apply the dynamics to both wavefunctions. All very straightforward. But then we measure the spin on one of the particles. This triggers a wavefunction collapse, forcing the particle into a single state in a single basis. In effect, it chooses the basis, and selects one of the terms in the superposition. But, because the two particles are entangled, this simultaneously forces the second particle into the opposite state in the same basis. So the wavefunction collapse of the unmeasured particle is not only being triggered at a large distance, but it is also being triggered even though that particle's spin has not yet been measured and the particle has not decohered. Yet it was the measurement or decoherence process which is meant to trigger wavefunction collapse in the Copenhagen interpretation. The way this is resolved is to treat the two particles as though they were a single object until the measurement, after which they become separated. But why should they behave like this? Why should a measurement on one particle break the unity between them? This is not clearly explained in the interpretation.
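The claim that the same pattern appears in any basis can be checked for one standard example. The sketch below assumes the entangled pair is in the spin singlet state, (up-down minus down-up)/√2, which is my choice of example rather than something the discussion above fixes, and it restricts the measurement axes to a single plane so that all the amplitudes are real. The function names are hypothetical.

```python
import math

def axis_states(theta):
    """Spin states along an axis tilted by theta from the z axis (in the x-z plane)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return {"+": (c, s), "-": (-s, c)}

def joint_probability(theta, a, b):
    """Probability that A gives result a and B gives result b along the same axis."""
    sa = axis_states(theta)[a]
    sb = axis_states(theta)[b]
    # Overlap of the product state (sa, sb) with the singlet
    # (|up, down> - |down, up>) / sqrt(2); all components are real here.
    amplitude = (sa[0] * sb[1] - sa[1] * sb[0]) / math.sqrt(2)
    return amplitude ** 2

angles = [0.0, 0.3, math.pi / 2, 2.0]
same = [joint_probability(t, "+", "+") for t in angles]
opposite = [joint_probability(t, "+", "-") for t in angles]
```

Whatever axis is chosen, the probability of both particles giving the same result is zero, and each of the two opposite combinations has probability one half. None of this says anything about why the second particle ends up in the opposite state; it only confirms that the mathematics predicts the correlation in every basis.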
A further problem is related to the idea that measurement devices are treated classically. So, in the Copenhagen interpretation, part of physics is described in quantum terms, and another part according to Newton. The issue then comes whether we can make this distinction, and where the line is drawn between the quantum and the classical. That line should appear when we lose superposition; but there is no single point where we do lose superposition. Rather, as the system enlarges, the amplitudes for superposition states in the total system gradually decrease towards zero. At some point we have to put in a line and say that everything smaller than this is quantum and everything above it is classical: but that line is entirely arbitrary.
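The absence of a sharp dividing line can be illustrated with a standard toy model of decoherence. This is a sketch under strong simplifying assumptions of my own: the quantum system starts in an equal superposition of two states, and each environment particle it interacts with ends up in one of two states whose mutual overlap is a fixed number (0.9 here, chosen arbitrarily). In that model the term measuring the remaining superposition shrinks by the overlap factor for every particle added.

```python
def coherence(n_env, overlap):
    """Size of the interference term after coupling to n_env environment particles.

    Starts at 0.5 for an equal superposition and is multiplied by |overlap|
    for each environment particle that becomes correlated with the system.
    """
    return 0.5 * abs(overlap) ** n_env

overlap = 0.9  # hypothetical overlap between the two environment states
sizes = [0, 1, 10, 50, 200]
remaining = [coherence(n, overlap) for n in sizes]
```

The interference term falls smoothly from 0.5 towards zero as the environment grows, and is negligible for a couple of hundred particles, but it never reaches exactly zero at any particular size. Any threshold we pick for "classical" is a choice we impose, which is the arbitrariness complained of above.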
In my view, the goal of the philosophy of physics is to provide answers. Ideally, it should explain in terms of as few axioms as possible -- axioms that ideally should make some sort of common sense -- what gives rise to quantum physics. Of course, at some point there will have to be some aspect of the philosophy of quantum physics which defies our classical intuition, as quantum physics differs from the classical world. But that aspect ought to be minimised, and make sense within a broader metaphysical system. Why is the world quantum rather than classical? This needs to be answered. The problem linking different forms of the Copenhagen interpretation is that they pose as many additional questions as they give answers. Bohr's view, despite its Kantian undertones, does make some sort of sense, but it doesn't actually explain anything. Heisenberg's view explains things, but only by introducing problems that make little sense, in particular around wavefunction collapse and the distinction between the classical and quantum. It might just be that these are features of the world we have to live with. But then how do we develop a metaphysics and ontology consistent with (for example) an ontologically real electron wavefunction spread out over all space suddenly converging to a single point? So it is worth looking at some alternative interpretations to see if they provide answers where the Copenhagen interpretations only raise questions.