# The mathematical foundations of quantum mechanics

Von Neumann’s 1932 book “The Mathematical Foundations of Quantum Mechanics” was a cornerstone for the development of quantum theory, yet his insights have been mostly ignored by physicists. Many of us know that the book exists and that it formalizes the mathematics of operators on a Hilbert space as the basic language of quantum theory, but few have bothered to read it, feeling that as long as the foundations are there, we don’t need to examine them. One reason the book has never been popular as a textbook is that von Neumann does not use Dirac’s notation. He admits that Dirac’s representation of quantum mechanics is “scarcely to be surpassed in brevity and elegance” but then goes on to criticize its mathematical rigor, leaving us with a mathematically rigorous treatment in a notation that is difficult to follow, and a book that nobody ever reads.

Today, von Neumann’s book is brought up often in connection with foundations research, which is perhaps not surprising given the author’s original intent as stated in the preface:

> the principal emphasis shall be placed on the general and fundamental questions which have arisen in connection with this theory. In particular, the difficult problems of interpretation, many of which are even now not fully resolved, will be investigated in detail.

Two results in particular are often cited:

1. Von Neumann’s (presumably incorrect) impossibility proof for hidden-variable models of quantum theory.
2. The von Neumann measurement scheme, which is the standard formalism for describing a quantum measurement.

The description of the measurement scheme is (or at least was to me) a little surprising. The usual textbook description of von Neumann’s work is to start with the Hamiltonian that couples the measurement device to the system under measurement, and to show that it produces the right dynamics (up to the necessity for state collapse). It turns out that von Neumann’s motivation was in some sense the opposite.

He begins by considering a classical measurement and showing that the observer must be external to the measurement, i.e. there are three distinct objects: the system under observation, the measurement device, and the observer. Keeping this as a guide, von Neumann provides a dynamical process in which the observer is not quantum mechanical while the measurement device is. The important point is that the precise cut between the (classical) observer and the (quantum) measurement device is not relevant to the physics of the system, just as in classical physics.

Von Neumann does not suggest that he solves the measurement problem, but he does make it clear that the problem can be pushed as far back as we want, making it irrelevant for most practical purposes, and in some ways just as problematic as it would be in classical physics. Many of us know the mathematics, and could re-derive the result, but few appreciate von Neumann’s motivation: understanding the role of the observer.

# Beyond classical (but is it quantum?)

The accepted version of “Interpreting weak value amplification with a toy realist model” is now available online. The work was an interesting exercise involving two topics in the foundations of physics: weak values and stochastic electrodynamics. Basically, we showed that a recent quantum optics experiment from our lab could be re-interpreted with a slightly modified version of classical electrodynamics.

The aim of the work was to develop some intuition on what weak values could mean in a realist model (i.e. one where the mathematical objects represent real “stuff”, like an electromagnetic field). To do this we added a fluctuating vacuum to classical electrodynamics. This framework (at least at the level we developed it) does not capture all quantum phenomena, and so it is not to be taken seriously beyond the regime of the paper, but it did allow us to examine some neat features of weak values and the experiment.

One interesting result was the regime where the model succeeds in reproducing the theoretical results (the experimental results all fall within this regime), which provided insight into weak values. Specifically, the model works only when the fluctuations are relatively small. Going back to the real world, this provides new intuition on the relation between weak values and the weak measurement process.

The experiment we looked at was done in Aephraim Steinberg’s light-matter interaction lab by Matin Hallaji et al. a few months before I joined the group. It was a ‘typical’ weak value amplification experiment with a twist. In an idealized version, a single photon would be sent into an almost balanced Mach-Zehnder interferometer with a weak photon-counting (or intensity) measurement apparatus on one arm (see figure below). The measurement is weak in the sense that it is almost non-disturbing, but also very imprecise, so the experiment must be repeated many times to get a good result. So far nothing surprising can happen, and we expect a result of 1/2 photon (on average) going through the top arm of the interferometer. To get the weak value amplification effect, we post-select only on (those rare) events where a photon is detected at the dark port of the interferometer. In such a case, the mean count on the detector can correspond to an arbitrarily large (or small, or even negative) number of photons.
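For intuition, the post-selected signal is governed by the weak value of the top-arm projector, $A_w=\langle f|\Pi_{\mathrm{top}}|i\rangle/\langle f|i\rangle$. Here is a minimal numerical sketch; the imbalance parameter and states below are illustrative choices of mine, not the experiment’s actual values:

```python
import numpy as np

# Illustrative weak-value calculation for an almost balanced
# Mach-Zehnder interferometer (the imbalance eps is a made-up number).
eps = 0.01                       # small deviation from a balanced splitter
theta = np.pi / 4 + eps
pre = np.array([np.cos(theta), np.sin(theta)])   # (top, bottom) amplitudes

# Post-selecting on the dark port projects onto (|top> - |bottom>)/sqrt(2).
post = np.array([1.0, -1.0]) / np.sqrt(2)

# Observable: projector onto the top arm (photon count in that arm).
proj_top = np.array([[1.0, 0.0], [0.0, 0.0]])

# Without post-selection: an ordinary expectation value, close to 1/2 photon.
mean_photons = pre @ proj_top @ pre

# With dark-port post-selection: the weak value A_w = <f|A|i> / <f|i>.
weak_value = (post @ proj_top @ pre) / (post @ pre)

print(mean_photons)   # about 0.49
print(weak_value)     # about -49.5: far outside [0, 1]
```

The tiny overlap $\langle f|i\rangle$ in the denominator is what drives the amplification: as the interferometer approaches perfect balance, the weak value grows without bound.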

The twist with the experiment was to use a second light beam for the photon counting measurement. This was the first time this type of weak measurement was implemented, and it is of particular importance since many of the previous experiments could be explained using standard electrodynamics. In this case, the photon-photon interaction is a purely quantum effect.

In reality, the experiment described above was not feasible due to various imperfections that could not be avoided, so a compromise was made. Instead of sending a single photon, they used a coherent beam with around 10-100 photons. To get the amplification effect on a single photon, they used a trick that would increase the number of photons by 1 and looked only for the change in the signal due to the extra photon. The results showed the same amplification as the ideal case. However, there remained (and still remains) a question of what that amplification means. Can we really talk about additional photons appearing in the experiment?

As we showed, the results of the experiment can be explained (or at least reproduced) in a model which is more intuitive than quantum theory. Within that model it is clear that the amplification is real, i.e. the post-selected events corresponded to cases where more light traveled through the top arm of the interferometer.

The model was based on classical electrodynamics with a slight modification: we assumed that the fields fluctuated like the quantum vacuum. This turned out to be sufficient to get the same predictions as quantum theory in the regime of the experiment. However, we showed that this model would not work if the intensity of the incoming light was sufficiently small, and in particular it would not work for something like a single photon.

Our model has a clear ‘reality’, i.e. the real field is a fluctuating classical EM field, and so it provides a nice starting point for a more general theory that has weak values as its underlying real quantities. One important feature of the model is the regime where it makes accurate predictions. It turns out that there are two bounds on the regime of validity. The first is a requirement that the light is coherent and not too weak; this roughly corresponds to being in a semi-classical regime. The second is that the probability of detecting photons at the dark port is not significantly affected by the fluctuations, which is similar to the standard weak measurement requirement, i.e. that the measurement back-action is small.

My main takeaway from the work was that the weak values ended up being the most accurate experimentally accessible quantity for measuring the underlying field. Making a leap into quantum theory, we might say that weak measurements give a more accurate description of reality than the usual strong measurement.

The result also set a new challenge for us: can we repeat the experiment in a regime where the model breaks down (e.g. with single photons)?

# Bell tests

The loophole-free Bell experiments are among the top achievements in quantum information science of the last few years. However, as with other recent experimental validations of a well-accepted theory, the results did not change our view of reality. The few skeptics remained unconvinced, while the majority received further confirmation of a theory we already accepted. It turns out that this was not the case with the first Bell tests in the 1970s and 1980s (Clauser, Aspect, etc.).

Jaynes, a prominent 20th-century physicist who did important work on light-matter interaction, did not believe that the electromagnetic field needs to be quantized (until Clauser’s experiment) and did extensive work on explaining optical phenomena without photons. As part of our recent work on modeling a quantum optics experiment using a modified version of classical electrodynamics (and no photons), we had a look at Jaynes’s last review of his neo-classical theory (1973). This work was incredibly impressive and fairly successful, but it was clear (to him at least) that it could not survive a violation of Bell’s inequalities. Jaynes’s review was written at the same time as the first Bell test experiments were being reported by Clauser. In a show of extraordinary scientific honesty he wrote:

> If it [Clauser’s experiment] survives that scrutiny, and if the experimental result is confirmed by others, then this will surely go down as one of the most incredible intellectual achievements in the history of science, and my own work will lie in ruins.

Some updates from the past 10 months…

1. Papers published
2. Preprint: Weak values and neoclassical realism
   My first paper with the Steinberg group is hardcore foundations. Not only do we use the word ontology throughout the manuscript, we analyse an experiment using a theory which we know is not physical. Still, we get some nice insights about weak values.

# Going integrated

As a quantum information theorist, the cleanest types of results I can get are proofs that something is possible, impossible, or optimal. Much of my work has focused on these types of results in the context of measurements and non-locality. As a physicist, it is always nice to bring these conceptual ideas closer to the lab, so I try to collaborate with experimentalists. The types of problems an experimental group can work on are constrained by its technical capabilities. Amr Helmy’s group specializes in integrated photonic sources, so I now know something about integrated optics.

When I started learning about the possibilities and constraints in the group, I realized that the types of devices they can fabricate are much better suited for work in continuous variables than in single photons. I also realized that no one had explored the limitations of these types of devices. In other words, we did not know the subset of states that we could generate in principle (in an ideal device).

In trying to answer this question, we figured out that, with our capabilities, it is in principle (i.e. in the absence of loss) possible to fabricate a device that can generate any Gaussian state (up to some limitations on squeezing and displacement). What turns out to be even nicer is that we could have a single device that can be programmed to generate any N-mode Gaussian state. The basic design for this device was recently posted on arXiv.
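As a rough illustration of what generating Gaussian states means in practice (this is my own toy example, not the device design from the paper): a pure Gaussian state can be reached from vacuum by single-mode squeezers followed by a passive interferometer, with everything acting on the covariance matrix as a symplectic transformation.

```python
import numpy as np

# Toy sketch: two modes, quadrature ordering (x1, p1, x2, p2),
# vacuum variance 1/2 in each quadrature.
vacuum = 0.5 * np.eye(4)

r = 0.8  # squeezing parameter (arbitrary choice)
S_sq = np.diag([np.exp(-r), np.exp(r), np.exp(-r), np.exp(r)])

# 50:50 beam splitter mixing the two modes (a symplectic rotation).
t = 1.0 / np.sqrt(2)
S_bs = np.block([[ t * np.eye(2), t * np.eye(2)],
                 [-t * np.eye(2), t * np.eye(2)]])

# Output covariance matrix: an entangled two-mode Gaussian state.
cov = S_bs @ S_sq @ vacuum @ S_sq.T @ S_bs.T

# Sanity check: the combined transformation preserves the symplectic
# form, so the output is a physical Gaussian state.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])
Omega = np.block([[J, np.zeros((2, 2))], [np.zeros((2, 2)), J]])
print(np.allclose(S_bs @ S_sq @ Omega @ S_sq.T @ S_bs.T, Omega))  # True
```

In this picture, “programmable” means the squeezing parameters and the interferometer settings are the knobs one would tune to target a particular state.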

We left the results fairly generic so that they could be applied to a variety of integrated devices using various semiconductors. The next step would be to apply them to something more specific and start accounting for loss and other imperfections. Once we figure that out, we (i.e. the fab guys) can go on to building an actual device that could be tested in the lab.

# Tomaytos, Tomahtos and Non-local Measurements

In the interest of keeping this blog active, I’m recycling one of my old IQC blog posts.

One of my discoveries as a physicist was that, despite all attempts at clarity, we still have different meanings for the same words and use different words to refer to the same thing. When Alice says measurement, Bob hears ‘a quantum-to-classical channel’, but Alice, a hard-core Everettian, does not even believe such channels exist. When Charlie says non-local, he means Bell non-local, but string theorist Dan starts lecturing him about non-local Lagrangian terms and violations of causality. And when I say non-local measurements, you hear #\$%^ ?e#&*?. Let me give you a hint: I do not mean ‘Bell non-local quantum-to-classical channels’; to be honest, I am not even sure what that would mean.

So what do I mean when I say measurement? A measurement is a quantum operation that takes a quantum state as its input and spits out a quantum state and a classical result as an output (no, I am not an Everettian). For simplicity I will concentrate on a special case of this operation, a projective measurement of an observable A. The classical result of a projective measurement is an eigenvalue of A, but what is the outgoing state?

A textbook (projective) measurement. A quantum state $|\psi\rangle$ goes in and a classical outcome “r” comes out, together with a corresponding quantum state $|\psi_r\rangle$.

### The Lüders measurement

Even the term projective measurement can lead to confusion, and indeed in the early days of quantum mechanics it did. When von Neumann wrote down the mathematical formalism for quantum measurements, he missed an important detail about degenerate observables (i.e. Hermitian operators with a degenerate eigenvalue spectrum). In the usual projective measurement, the state of the system after the measurement is uniquely determined by the classical result (an eigenvalue of the observable). Consequently, if we don’t look at the classical result, the quantum channel is a standard dephasing channel. In the case of a degenerate observable, the same eigenvalue corresponds to two or more orthogonal eigenstates. Seemingly the state of the system should still correspond to one of those eigenstates, and the channel should still be a standard dephasing channel. But a degenerate spectrum means that the set of orthogonal eigenvectors is not unique; instead, each eigenvalue has a corresponding subspace of eigenvectors. What Lüders suggested is that the dephasing channel does nothing within these subspaces.

### Example

Consider the two-qubit observable $A=|00\rangle\langle 00|$. It has eigenvalues $1,0,0,0$. A $1$ result in this measurement corresponds to “The system is in the state $|00\rangle$”. Following a measurement with outcome $1$, the outgoing state will be $|00\rangle$. Similarly, a $0$ result corresponds to “The system is not in the state $|00\rangle$”. But here is where the Lüders rule kicks in. Given a generic input state $\alpha|00\rangle+\beta|01\rangle+\gamma|10\rangle+\delta|11\rangle$ and a Lüders measurement of $A$ with outcome 0, the outgoing state will be $\frac{1}{\sqrt{|\beta|^2+|\gamma|^2+|\delta|^2}}\left[\beta|01\rangle+\gamma|10\rangle+\delta|11\rangle\right]$.
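A quick numerical check of this update rule (the coefficients below are arbitrary; this is just the Lüders projection written in code):

```python
import numpy as np

# Lüders measurement of A = |00><00| on two qubits.
# Basis ordering: |00>, |01>, |10>, |11>; coefficients chosen arbitrarily.
alpha, beta, gamma, delta = 0.5, 0.5, 0.5, 0.5
psi = np.array([alpha, beta, gamma, delta])

# Projectors onto the eigenvalue-1 subspace and its (degenerate) complement.
P1 = np.diag([1.0, 0.0, 0.0, 0.0])
P0 = np.eye(4) - P1

# Outcome 0 occurs with probability <psi|P0|psi>; the Lüders rule then
# projects onto the degenerate subspace and renormalizes, leaving the
# relative amplitudes beta, gamma, delta untouched.
prob_0 = psi @ P0 @ psi
out = (P0 @ psi) / np.sqrt(prob_0)

print(prob_0)  # 0.75
print(out)     # [0, beta, gamma, delta] / sqrt(|beta|^2+|gamma|^2+|delta|^2)
```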

### Non-local measurements

The relation to non-locality may already be apparent from the example, but let me start with some definitions. A system can be called non-local if it has parts in different locations, e.g. one part on Earth and the other on the Moon. A measurement is non-local if it reveals something about a non-local system as a whole. In principle these definitions apply to both classical and quantum systems. Classically, a non-local measurement is trivial: there is no conceptual reason why we can’t just measure at each location. For a quantum system the situation is different. Let us use the example above, but now consider the situation where the two qubits are in separate locations. Local measurements of $\sigma_z$ will produce the desired measurement statistics (after coarse graining) but reveal too much information and dephase the state completely, while a Lüders measurement should not. What is quite neat about this example is that the Lüders measurement of $|{00}\rangle$ cannot be implemented without entanglement (or quantum communication) resources and two-way classical communication. To prove that entanglement is necessary, it is enough to give an example where entanglement is created during the measurement. To show that communication is necessary, it is enough to show that the measurement (even if the outcome is unknown) can be used to transmit information. The detailed proof is left as an exercise to the reader. The lazy reader can find it here (see appendix A).
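To make the difference concrete, here is a small sketch (my own illustration, with an arbitrarily chosen entangled input) comparing the Lüders channel for $A=|00\rangle\langle 00|$ with coarse-grained local $\sigma_z$ measurements: the former preserves coherence inside the degenerate subspace, the latter dephases everything.

```python
import numpy as np

# Input: the entangled state (|01> + |10>)/sqrt(2), which lies entirely
# inside the eigenvalue-0 subspace of A = |00><00|.
psi = np.array([0.0, 1.0, 1.0, 0.0]) / np.sqrt(2)
rho = np.outer(psi, psi)

# Lüders channel (outcome ignored): dephase only between the two subspaces.
P1 = np.diag([1.0, 0.0, 0.0, 0.0])
P0 = np.eye(4) - P1
rho_lueders = P1 @ rho @ P1 + P0 @ rho @ P0

# Coarse-grained local sigma_z measurements: full dephasing in the
# computational basis (all off-diagonal terms are destroyed).
rho_local = np.diag(np.diag(rho))

print(np.allclose(rho_lueders, rho))  # True: the |01><10| coherence survives
print(np.allclose(rho_local, rho))    # False: the state is fully dephased
```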

This is a slightly modified version of a Feb 2016 IQC blog post.

# Three papers published

When it rains it pours. I had three papers published in the last week. One experimental paper and two papers about entanglement.

1. Experimental violation of the Leggett–Garg inequality in a three-level system. A cool experimental project with IQC’s liquid-state NMR group. Check out the outreach article about this experiment.
2. Extrapolated quantum states, void states and a huge novel class of distillable entangled states. My first collaboration with Tal Mor and Michel Boyer and my first paper to appear in a bona fide CS journal (although the content is really mathematical physics). It took about 18 months to get the first referee reports.
3. Entanglement and deterministic quantum computing with one qubit. This is a follow-up to the paper above, although it appeared on arXiv a few months earlier.