real gl: understanding quantum computation

"Intro" lectures on quantum computation for computer scientists are popular, because as computer scientists we want to know what's going on. But pretty quickly, the lecture starts using crazy symbols like |ψ⟩ and I get lost. Recently, I spent some time trying to break through this learning barrier, which involved getting stuck over and over. Now that I think I "get it", I figured I'd share. Of course, maybe I don't get it. Someone will tell me, I'm sure.

This blog post attempts to explain the core concepts of quantum computation using standard linear algebra, and many of the computations are links to Wolfram Alpha so you can see exactly what is going on.

A qubit is like a ball that can be opened. When we open — observe — the ball, there is a number written inside. The number can be 0 or 1. We can change the number if we want, and close the ball again.

A gate is like a box that we can send qubits through. Some gates can take more than one qubit at a time, but all the qubits that go into a gate will come out the other end. Gates typically change the numbers written inside the qubits that pass through them.

Random Gate

One day, we encounter a mysterious new gate. We want to know what it does, so we run some experiments. We take a qubit, set it to 0, and send it through the gate. When we open it, it is a 1. We try again — setting it to 0 again and sending it through the gate — but this time it comes out 0. We try many times, and half the time it comes out 0 and half the time it comes out 1, with no discernible pattern. If we send a 1 through the gate, we see the same thing: half the time it comes out 0 and half the time it comes out 1.

We call this gate random, since it seems to take our qubit and write a random number inside it.

Now one experimenter is testing the random gate, sending a 0 through it, recording the output, and setting the qubit to 0 again before sending it through the gate again. But one time, the experimenter forgets to open the qubit before sending it through the gate again. This time, they notice that the qubit is still 0. Coincidence perhaps. But they try sending it through the random gate twice in a row again, and again it comes out 0. They try again, and again it comes out 0. Every time they send a 0 twice in a row through the random gate, without opening it in between, it comes out 0. They try setting it to 1, sending it through the gate twice. It comes out 1. Every time. Every time they send a qubit through the gate twice in a row, it comes out the same as it started. This is an example of interference.

So we thought the random gate was just writing a random number inside the qubit, but this wouldn't explain the strange behavior where sending a qubit through the gate twice in a row leaves the qubit unchanged.

A model that seems to work for explaining the random gate goes like this: the state of a qubit is represented by two numbers. We'd like these numbers to represent the probability that the qubit is a 0 or a 1 respectively, but it will turn out to be useful for the numbers to be negative sometimes. Hence, we'll let the numbers be negative, and we'll square them to get the probabilities, which removes the negativeness.

So the square of the first number represents the probability that the qubit is a 0 when we open it. The square of the second number represents the probability that the qubit is a 1 when we open it. These probabilities must sum to one. When we open a qubit, it is either a 0 or a 1, but the information about what it could have been is lost forever — it is now definitely a 0 or a 1. If it is a 0 when we close it, then it is in the state [1, 0]. If it is a 1 when we close it, then it is in the state [0, 1].

The random gate is a 2x2 matrix. It looks like this [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]] (see Hadamard gate). We model sending a qubit through the gate by multiplying the qubit's state vector by the matrix. So if we send a 0 through the gate, that means we're sending a [1, 0] through the gate, which we model as [1, 0] * [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]], which equals [sqrt(.5), sqrt(.5)]. If we square each number, we get [.5, .5] which means 50% chance of 0 and 50% chance of 1. This models the fact that the qubit is random if we send it once through the gate.

Now if we don't open the qubit yet, it will stay in the state [sqrt(.5), sqrt(.5)]. And if we send it through the random gate again, then we model this as [sqrt(.5), sqrt(.5)] * [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]], which equals [1, 0]. If we square each number we still get [1, 0], which means 100% chance of 0, and 0% chance of 1. So that works. Sending a 0 through the gate once randomizes it, and sending the qubit through the gate again without opening it yields a 0 again.

If we send a 1 through the gate, we get a qubit in the state [sqrt(.5), -sqrt(.5)]. Note the negative sign in the second number. However, when we square each number, we get [.5, .5], which is 50% chance of 0, and 50% chance of 1. So sending a 1 through the random gate still randomizes it, even though the state vector is slightly different than the state vector we got when we sent a 0 through the random gate.

Now if we don't open the qubit, and send it through the random gate again, then we model this as [sqrt(.5), -sqrt(.5)] * [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]], which equals [0, 1]. If we square each number, we still get [0, 1], which means 0% chance of 0, and 100% chance of 1. So that works too. Sending a 1 through the gate once randomizes it, and sending the qubit through the gate again without opening it yields a 1 again.

Copy Gate

One day we encounter another mysterious gate. It has two input holes labeled source and dest. It seems to copy the number written in the source qubit into the dest qubit. Well, almost. It copies like this if the dest qubit is a 0. But if the dest qubit is a 1, then it copies the opposite of the source qubit to the dest qubit. But we can avoid this disturbing behavior by always sending a 0 into the dest.

We call this gate copy, since it seems to copy one qubit to another, as long as the dest is a 0.

So what happens if we take a 0 and send it through a random gate, and then copy this qubit by sending it into the source of a copy gate along with a fresh 0 in the dest of the copy gate? The state of the source qubit is [sqrt(.5), sqrt(.5)], so maybe now both qubits have the state [sqrt(.5), sqrt(.5)]?

If so, then if we open both qubits, we would expect the result to be like flipping two coins, with an equal probability of seeing heads-heads, heads-tails, tails-heads and tails-tails. However, if we perform the experiment, we notice that the qubits might be 0 or 1, but they are always the same as each other, which is weird. It's like flipping two coins and only getting both heads or both tails.

So maybe the copy gate "opened" the source qubit, setting its state to [1, 0] or [0, 1], and then that was copied?

If so, then if we send each qubit coming out of the copy gate into a random gate, each qubit should be randomized to [sqrt(.5), sqrt(.5)] or [sqrt(.5), -sqrt(.5)], and we should have the two coin flip situation again. However, if we run the experiment, sending each qubit through a random gate, they are still the same when we open them.

So our model appears to be broken again. Even using two numbers to represent each qubit, we can't model the strange behavior we see from the copy gate.

A model that seems to work for explaining the copy gate goes like this: the state of n qubits is represented by 2ⁿ numbers. We'd like these numbers to represent the probability that the qubits are in each possible configuration. For instance, if there are three qubits, then there are 2³ or 8 numbers representing the probability of being in the configurations 000, 001, 010, 011, 100, 101, 110, and 111. As before, we'll square each number to get the actual probability of being in each configuration.

The copy gate operates on two qubits, so we can represent the qubits using 2² or 4 numbers. In our scenario, we have one qubit in the state [sqrt(.5), sqrt(.5)], and another in the state [1, 0]. We want to get the probability of being in the states 00, 01, 10, and 11. The probability of 00 is the probability of both qubits being 0, which seems easy enough to calculate by multiplying these probabilities together. However, if we multiply the probabilities together, we'll always get a positive number, whereas we'd like to keep around the possibility for negative numbers, so we multiply the square roots of the probabilities together instead. Which works, apparently. So if our states are [a, b] and [c, d], then our resulting four element state vector is [a*c, a*d, b*c, b*d] — this is called a Kronecker product, and can be written like [a, b] ⊗ [c, d] — and the result in our case is [sqrt(.5), 0, sqrt(.5) 0]. If we square each number we get [.5, 0, .5, 0], which is a 50% chance of 00, and a 50% chance of 10, meaning the first qubit is a 0 or 1, but the second qubit is definitely a 0, which sounds like what we had to begin with.

The copy gate itself is a 4x4 matrix. It looks like this [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]] (see CNOT gate). When we multiply our state vector by this matrix we get [sqrt(.5), 0, sqrt(.5) 0] * [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], which is [sqrt(.5), 0, 0, sqrt(.5)]. If we square each number we get [.5, 0, 0, .5], which is a 50% chance of 00, and a 50% chance of 11, meaning that either both qubits are 0, or both qubits are 1. That's good so far: we've modeled the fact that the qubits are always the same as each other when we look at them after going through the copy gate.

Next we want to send each qubit through another random gate. In order to do so, we'd like to decompose the state [sqrt(.5), 0, 0, sqrt(.5)] into a separate vector for each qubit. We'd like two vectors [a, b] and [c, d] such that [a, b] ⊗ [c, d] = [sqrt(.5), 0, 0, sqrt(.5)]. But this turns out to be impossible. Imagine that we had two vectors [a, b] and [c, d] such that [a*c, a*d, b*c, b*d] is [sqrt(.5), 0, 0, sqrt(.5)]. This would mean that a*d=0, so either a or d is 0. Also, a*c=sqrt(.5), so a and c are both not 0. This means that d is 0. But b*d=sqrt(.5), which means d can't be 0. This is a contradiction, meaning we can't find any vectors [a, b] and [c, d] that work like we want. This inability to decompose a state vector into separate state vectors for each qubit is called entanglement.

So how do we model sending each qubit through another random gate? Well, if we can't decompose the state vector, maybe we can treat the two random gates as a single random gate that acts on two qubits. We do this the same way we combine state vectors for qubits to get larger state vectors for multiple qubits: with the Kronecker product: [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]] ⊗ [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]], which yields: [[.5, .5, .5, .5], [.5, -.5, .5, -.5], [.5, .5, -.5, -.5], [.5, -.5, -.5, .5]]. This is a 4x4 matrix just like we wanted.

Now let's send our qubits through the combined random gate by doing [sqrt(.5), 0, 0, sqrt(.5)] * [[.5, .5, .5, .5], [.5, -.5, .5, -.5], [.5, .5, -.5, -.5], [.5, -.5, -.5, .5]], which is yields [sqrt(.5), 0, 0, sqrt(.5)]. If we square each number we get [.5, 0, 0, .5], which is a 50% chance of 00, and a 50% chance of 11, like we saw before. So even if we send each qubit through a random gate and look at them, they're still the same, which means our model explains what we saw.

The Matrix

By the way, there are a few interesting things to note about the matrices we keep seeing. First, they are always square. This makes sense because when we multiply a state vector with n elements by a matrix, we want a state vector with n elements as a result, since all the qubits that enter a gate also leave the gate.

Second, the result of multiplying a state vector by a matrix should actually be a state vector, meaning if we square all the elements of the new vector to get probabilities, these probabilities should sum to one. And of course, a gate can't prevent us from shoving qubits in any valid state into it, so any valid state vector multiplied by the matrix should be a valid state vector.

This second constraint turns out to make the matrices unitary. Unitary matrices have many nice properties. For instance, every row in the matrix is itself a valid state vector. In fact, the top row is what the result will be if the input state vector has a one as the first number, and zeroes for all the rest. It also happens to be the case that each row is orthogonal to all the rest, i.e., if we take the dot product of any two rows, we'll get zero.

Also, unitary matrices are always invertible. In fact, they are easy to invert. We just take the transpose. The fact that the matrices are always invertible means that we could, in theory, figure out what the original state vector looked like by running a computation in reverse, assuming we knew what the final state vector looked like. This is what people mean when they say that quantum computation is reversible.

What Am I Leaving Out?

I say that state vectors consist of the square roots of probabilities, whereas real quantum people allow imaginary numbers, and call them probability amplitudes. Everything I said is true for real numbers, but if we want to generalize it to support imaginary numbers, we need to make some changes involving complex conjugates and conjugate transposes. The main change is that instead of squaring a probability amplitude to get the probability, we need to multiply it by its complex conjugate — of course, if it's a real number, then this is the same as squaring it. I left out imaginary numbers because I don't know of any motivating experiments that show their usefulness. As far as I can tell, all the stuff we're told is "weird" about quantum mechanics can be modeled without imaginary numbers. But I'm sure someone will point out a case that requires them.

I use row vectors to represent state vectors, whereas real quantum people use column vectors. This means they need to multiply all the matrices in reverse order, which I find confusing. It also means that their matrices will look like the transpose of my matrices. Of course, I'm only really using two matrices in this post, one for the random gate, and one for copy, and both of these happen to be their own transpose, so if you look up the matrices for these gates online (where they'll be called Hadamard and CNOT), they'll look the same.

Future Work

In subsequent posts, I'll try to explain two popularly weird quantum phenomena in terms of simple quantum computation: the double-slit experiment, and the EPR paradox. I'll also give my take on what's going on when we "observe" a qubit, and how that is related to the "many-worlds interpretation" of quantum mechanics.

real gl

10/14/12

understanding quantum computation

Random Gate

Copy Gate

The Matrix

What Am I Leaving Out?

Future Work

No comments:

Post a Comment