real gl: November 2012

11/28/12

EPR Paradox: What's paradoxical about it?

The EPR Paradox is the one where two particles go lightyears apart from each other, and when someone observes one and sees that it's positive, it turns out that the other one is negative. And the physicist is like: "Spooky! How did the other one know to be the opposite?" And you're like: "Uh.. they chose to be that way to begin with?"

But the setup isn't quite that simple. It's more like this:

Imagine an old box maker makes two boxes. Each box has three buttons on top: a red button, a green button and a blue button. Pressing a button opens the box, revealing a coin. The coin might be heads up or tails up.

The box maker claims that the first time the boxes are opened, the following is true: if one box is opened by pressing the red button and the other box is opened by pressing the blue button, then the coin sides will be different — one heads and one tails. Otherwise, the coin sides will be the same. This is true even if the boxes are first opened lightyears apart from each other.

If you can explain how the boxes work without faster-than-light communication, then you have "solved" the EPR Paradox.

update: Another thing worth pointing out is that although faster-than-light communication appears to take place in this setup, this does not imply that the boxes can be used to actually communicate anything. I didn't say this above, but an extra claim about the boxes is that the first box to be opened, no matter how it is opened, has a 50% chance of showing heads or tails.

Hence, the second box to be opened, in the absence of knowledge about what happened with the first box, also appears to have a 50% chance of showing heads or tails. It is only when the box openers get together later and compare their results that they'll notice that their coins are the same or different according to which buttons they both pressed.

Seattle roads

Seattle has some odd road features:

car pool lanes on the right-hand side
bus only lanes
a toll bridge that just takes a picture of your car — I'm driving a rental, and I have no idea what will happen with those tolls
lane-specific speed limits displayed on LED signs
surface roads with cars parked on both sides, but no room to pass should someone be coming the other way

Presentation at Microsoft Research

I gave a presentation at Microsoft Research titled "Expertise on Demand". The basic plot arc is: MTurk is pretty fast, but it is hard to guarantee expertise. oDesk has experts, but it can take a day or two to find one. Wouldn't it be great if we had something that was the best of both worlds? If we did, then we could hire experts as part of our flow on everyday computer tasks, the same way we look up answers online. But in order to achieve this, there is a bit of a chicken-and-egg problem to get past, and here (in the talk, but also in this blog) are some systems that we've been working on at oDesk Research to solve it. Here is the presentation.

11/19/12

Yamaha NP-31 Quick Operation Guide

This is just for personal reference, because I don't want to hold onto this piece of paper...

11/15/12

sabbatical (what I did)

I went on a sabbatical during the months of August and September. This was sortof the vacation I had hoped to take immediately after grad school but didn't. Many people ask, what did I do? I'm afraid what I did isn't super exciting. I spent most of my time thinking (which I find exciting). But I did do some things.

I went to Eureka, and I watched an episode of Eureka in Eureka. I also read Logicomix on the beach.

me reading Logicomix on the beach

I also went to New York. I spent a week there using Airbnb for the first time -- which worked out great. The first night I arrived, I typed "live comedy" into my phone, and found a show within walking distance of my place on the Upper East Side. The next few days involved walking through half of Central Park to the theatre district, watching a show, and then walking back up and getting dinner somewhere. I watched Wicked, Cirque du Soleil's Zarkana, and The Book of Mormon. My last night I went to The Bitter End to see live singer/songwriters perform, and the day I left I stopped by The Met before going to the airport. I really enjoyed my time there. I love New York.

a picture I liked at The Met

I also went to Yosemite. I left really early one day and ended up being parked on the freeway along with many other cars due to some accident. During this time, I downloaded the Audible app, downloaded a book, and started listening to it. This was probably the most productive time I've spent parked on the freeway. Yosemite was spectacular, though crowded. I enjoyed lying on a rock somewhere looking up at a waterfall where the water would dissipate before it could reach the ground.

But like I said, I spent most of my time thinking. I lived in Menlo Park, and I would go on walks from my apartment to The Dish and back. I would also spend many days just thinking in my apartment, walking to Safeway for food when necessary.

I suppose I should write what I thought about at some point.

Bus Puzzler

Ramesh gave me a puzzler today. I'm standing at a bus stop. During any given minute, a bus has a 1/10 chance of being there. So my expected wait time is 10 minutes. But, if I looked into the past, I would find that I would expect to go back 10 minutes to find when the last bus was here.

But, that makes it sound like a bus came about 10 minutes ago, and the next one will come about 10 minutes from now, which suggests I'm in a 20 minute gap between buses, whereas we would expect buses to come 10 minutes apart. So shouldn't we expect to be 5 minutes from a bus?

My solution

Instead of waiting for a bus, imagine that we are standing in the middle of a row of peach trees, and each tree has a 1/10 chance of having a ripe peach. Now if we go to the left, we expect to examine 10 trees before finding a ripe peach. If we go to the right, we also expect to examine 10 trees before finding a ripe peach. So it still sounds like we're in a 20-tree gap between ripe peaches.

But, if we do a breadth first search to the left and right simultaneously, examining the tree immediately to our left, and then the one immediately to our right, and then the second tree on our left, and then the second tree on our right, and so on, then we still expect to examine 10 trees before finding a ripe peach, but this means that we'll probably find that tree within 5 trees of our current location.

In the bus example, if we looked into the past and future simultaneously, we would expect to find a bus 5 minutes away. So.. phew.

But, I'm wrong

We do expect to see a bus 5 minutes away, either in the past or in the future, but we don't expect to see a bus 5 minutes away in the past and in the future. That is to say, of the two buses we're between, the closer one is probably 5 minutes away, but the further one is probably 15 minutes away. So we really are in a 20 minute gap between buses, it seems. Or so says a simulation I ran...

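// Minutes until a bus shows up, with a 1/10 chance in each minute.
function waitForBus() {
    var t = 1
    while (Math.random() < .9) t++
    return t
}

var sum = 0
var total = 100000
for (var i = 0; i < total; i++) {
    var a = waitForBus() // bus in the past
    var b = waitForBus() // bus in the future
    var c = a + b // size of the gap we're standing in
    sum += c
}
sum / total // average over all trials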

I ran this using my JavaScript utility, and it spits out 20. And if I replace c = a + b with c = Math.min(a, b) to simulate the nearest bus, it spits out about 5.25 (not quite 5, but close). And putting in c = Math.max(a, b) yields about 14.75 (not quite 15, but close). The fact that 5.25 + 14.75 = 20 makes me suspect that the .25 and .75 parts may be "real", but I don't understand them.
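
One way to account for the .25 and .75, sketched here under the assumption that waitForBus() is a plain geometric wait with a 1/10 chance per minute: in any given minute, the nearer of the two buses fails to appear only if both miss, which happens with probability .9 * .9 = .81. So the nearer bus is itself a geometric wait with a .19 chance per minute, and its expected wait is 1/.19, a bit more than 5.25. The variable and function names below are made up for this sketch.

```javascript
// Expected value of a geometric wait with per-minute success chance p.
function expectedWait(p) {
    return 1 / p
}

var p = 1 / 10                        // chance a bus shows up in a given minute
var pEither = 1 - (1 - p) * (1 - p)   // chance at least one of the two buses shows up

var nearest = expectedWait(pEither)            // E[min(a, b)]
var farthest = 2 * expectedWait(p) - nearest   // E[max(a, b)] = E[a + b] - E[min(a, b)]

console.log(nearest.toFixed(3), farthest.toFixed(3))
```

The two values come out near 5.263 and 14.737, which is consistent with the simulation's "about 5.25" and "about 14.75", and they add up to 20 by construction.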

While discussing this with a friend, he looked up the answer online and read something about sampling bias toward longer gaps. This seems right. The average gap between buses is 10 minutes, but some gaps are much longer, and we're probably in one of those. For instance, if there were only two sizes of gap, 5 minutes and 15 minutes, and they were equally common, then the average gap size would be 10 minutes, but the 15 minute gaps would consume 15/(5+15) = 75% of the time. So if we showed up to the bus stop at some random time, there would be a 75% chance that it was a 15 minute gap, and a 25% chance that it was a 5 minute gap, meaning we expect to be in a 75%*15 + 25%*5 = 12.5 minute gap, which is longer than the "expected" 10 minute gap.
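
The two-gap example can be written as a small calculation. This is just a sketch of the length-weighting argument; expectedGapSeen is a name invented here, and it assumes the listed gaps are equally common and that we arrive at a uniformly random moment.

```javascript
// Expected size of the gap we land in when arriving at a random time.
// Each gap is weighted by its own length, since a longer gap covers
// more of the timeline and is more likely to contain our arrival.
function expectedGapSeen(gaps) {
    var totalTime = 0
    var weighted = 0
    for (var i = 0; i < gaps.length; i++) {
        totalTime += gaps[i]
        weighted += gaps[i] * gaps[i]
    }
    return weighted / totalTime
}

var gaps = [5, 15]                 // equally common gap sizes, in minutes
var averageGap = (5 + 15) / 2      // 10, the plain average gap
var seen = expectedGapSeen(gaps)   // 12.5, the gap we expect to be standing in

console.log(averageGap, seen)
```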

11/14/12

Braess's paradox

In casual conversation, I said something seemingly non-controversial like "gaining information is always good". And a friend challenged me. He said no, sometimes more information is bad. And he sent me a link to Braess's paradox.

Here's the setup for the paradox...

image from Wikipedia, which on a side note managed to get me to donate money today

...4000 cars start at START. It takes 45 minutes to go from START to B or from A to END. It takes T/100 minutes (where T is the number of cars or "travelers" using the road) to go from START to A or from B to END. There is a quick, zero time path from A to B, but nobody knows about it.

Here's the paradoxical result (see the Wikipedia page for details): everyone can drive from START to END in 65 minutes, with 2000 cars taking the top path, and 2000 taking the bottom path. But if everyone discovers the quick road from A to B, then everyone will end up spending 80 minutes travelling from START to END, with everyone going from START to A to B to END. So introducing information, namely the existence of a road, was somehow bad.

Of course, this assumes everyone will be selfish, doing things that help themselves at the expense of everyone else. Knowledge of the road from A to B could be good in an "enlightened" society. In fact, a society could use the road from A to B to decrease the total human-minutes spent travelling from 260,000 to 258,750, if 2250 cars go from START to A, and 500 cars take the road from A to B. Of course, it might be hard to agree who those lucky 500 people will be, since they'll each spend only 45 minutes travelling, whereas everyone else will spend 67.5 minutes.
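
The numbers above can be checked with a short calculation. This is a sketch of the road network as described in this post (T/100 minutes on START to A and on B to END, 45 minutes on the other two roads, zero minutes on the shortcut); totalMinutes and its parameter names are made up for the example.

```javascript
// Total human-minutes spent travelling.
//   cars:     number of cars leaving START
//   toA:      cars that begin on the START -> A road
//   shortcut: cars that then take the zero-time A -> B road
function totalMinutes(cars, toA, shortcut) {
    var toB = cars - toA           // cars that begin on the START -> B road
    var fromA = toA - shortcut     // cars on the 45 minute A -> END road
    var fromB = toB + shortcut     // cars on the B -> END road

    var startToA = toA / 100       // minutes, depends on traffic
    var bToEnd = fromB / 100       // minutes, depends on traffic

    var viaA = startToA + 45               // START -> A -> END
    var viaB = 45 + bToEnd                 // START -> B -> END
    var viaShortcut = startToA + bToEnd    // START -> A -> B -> END

    return fromA * viaA + toB * viaB + shortcut * viaShortcut
}

console.log(totalMinutes(4000, 2000, 0))      // nobody knows about the shortcut
console.log(totalMinutes(4000, 4000, 4000))   // everybody takes the shortcut
console.log(totalMinutes(4000, 2250, 500))    // the "enlightened" split
```

The three calls give 260,000, 320,000 and 258,750 human-minutes, matching the 65 minute, 80 minute and mixed 45/67.5 minute cases.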

So I guess if everyone is selfish and greedy, then introducing new information can be bad. But if everyone is enlightened — for some suitable definition of enlightened — then I still think gaining information is always good.

11/13/12

The After Image

summary

Check out The After Image blog.

here I've copied the about page:

This blog is a sort of prototype for a website I want to build. The website, assuming I create it, will allow people to hire an artist by providing a "before" image, as well as some "extra instructions". The user can then choose which artist they want by viewing each applicant's before-and-after image pairs, e.g.:

before

after

The grand vision is to demonstrate that online work is not just something for entrepreneurs to take advantage of, but something that regular people can benefit from as well. Hence, I wanted to pick a domain that regular people might be interested in hiring for. Art came to mind, because I have hired artists myself for some personal projects, including my Facebook avatar (which appears as the first before-and-after post in the blog).

The design is meant to overcome two challenges: it can be hard to come up with something to hire an artist to do; and it's hard to pick an artist. The strategy for overcoming both challenges is the same: make everyone's input and output public -- at least at first. This will allow people to be inspired by what other people have done. It should also make it easier to pick an artist, since it will be obvious from the before-and-after images what each artist is capable of.

My strategy in the past has been to just build a website, but these attempts usually die due to lack of liquidity. Hence, I decided to test and bootstrap the idea with this blog where I would manually go through the hiring process myself. This was meant to help me gauge interest as well as discover a price point.

11/12/12

metameaninglessness

I imported all my posts from a somewhat secret, not well publicized, somewhat personal, somewhat half-baked, blog I kept called metameaninglessness. I went through it and decided it wasn't too crazy, though my standards are pretty low. Anyway, as that blog suggests, read at your own risk ;)

I've also added labels to many posts. You can find a lot of the original "real gl" posts with the research and odesk labels. Some common labels from metameaninglessness posts include: anecdote, armchair philosophy, art, introspection, rant, and religion.

11/11/12

bio

Sometimes I need to give a "bio" to places. Here's my current one, so I can grab it.

Greg Little is a Labor Scientist at oDesk, where he and the Research team explore ways to enhance the online work experience for both employers and contractors. Greg received his PhD in computer science from MIT on the topic of crowdsourcing. He has previously interned at the Xerox Research Center Webster and the IBM Almaden Research Center, and has worked at a startup video game company. He received his B.S. from Arizona State University.

11/9/12

three modes of thought

thinking/planning — my eyes are staring into space, or closed, and I might appear asleep.

doing — I'm kindof like a machine, queued up with a bunch of instructions, and I'm just executing them.

communicating — I'm talking, trying to sell my ideas, and trying to understand other people's ideas in order to steal good ones.

11/8/12

video

I gave a presentation at some event today run by the Institute for the Future. At the end, I was asked if I was willing to do a video interview, and I submitted. I first made sure they wouldn't release the thing publicly without passing by oDesk's marketing team, since I feel like I have a tendency to say things in a non-marketing compatible way. But I've felt a bit awkward all evening, and I think the reason is this video. It turns out I have some sort of phobia against video. First, I hate seeing myself on video. I think I look stupid. Second, I fear having video footage used out of context. I feel like I have a tendency in general to say grandiose things, which I try to preface with all sorts of disclaimers, but I feel like it would be easy to get a short film clip that made me look like a nut.

11/7/12

IFTF Presentation

Presented at an Institute for the Future conference. These are my slides for that presentation. The audience was primarily high-level people at enterprise companies, and the presentation was trying to show how I envisioned oDesk being used by large companies. The presentation offers three "predictions".

First, large companies will graft branches of online workers into their traditional corporate structures. Second, because these branches are easy to experiment with (as opposed to the internal organizational structure, which is risky to experiment with), we're likely to see clever new ways of organizing online workers. However, even this is sortof like the way computation was first used by large companies, in the form of mainframe computers. That is, companies leveraging online workers to solve company problems, as opposed to individual employees leveraging online workers to solve individual problems. The latter would be more like the personal computer version of online work. The third prediction is that we'll see individuals at companies leverage online work, hiring hyper-specialized experts in real-time as part of their everyday work flow.

11/4/12

management technology

Here's a caricature of modern development practices: there's a boss who has "people" skills (represented with the filled circle within the circle), and they manage engineers/programmers who have "problem solving" (triangle) and "coding" (diamond) skills.

The dotted line represents the payroll of the company. It is difficult to experiment with things above this line, e.g., it is hard to fire people. It is much easier to experiment with stuff below this line, and this is where "technological progress" occurs.

Here's a new model that I think we can achieve with online labor markets like oDesk. A new sort of "engineer" is created who has some "people" skills and "problem solving" skills, and they contract work to people with "coding" skills. Note that the coders are not what we think of today when we think about programmers at Google or Microsoft -- if you've looked at interview questions for places like this, you'll see that they are not testing coding skills but rather problem solving skills. In comparison to problem solving, coding is relatively easy.

Note that the payroll line moves up. This allows for experimentation with ways of hiring and arranging work beneath this line. This allows for the development of what we might call "management technology" (name credit to Devin Fidler).

evolution

We often think about evolution as organisms surviving, mating, and giving birth to organisms which will hopefully survive, mate, and give birth. However, I think evolution has gone through some meta-evolution, creating new powerful evolutionary tools.

At first, we had the evolution of particles. This is closest to what creationists complain about when they say "how could random chance have created a living organism?" Somehow, it seems like particles really did randomly fit together into some sort of particle that could reproduce itself.

After some time, a meta-evolution occurred in the form of cells and DNA. Probably DNA came before cells, but I don't know how that works. Anyway, cells and DNA can evolve more efficiently than pure randomness by using restricted randomness. That is, when DNA is copied, it is usually copied exactly, but sometimes mistakes are made, and these mistakes -- mutations -- allow the exploration of different sorts of cells. But DNA is structured in such a way that random mutations typically don't screw everything up completely. DNA encodes information in a very inefficient manner, taking up lots of space where it could theoretically compress the information. In fact, lots of information in human DNA isn't even used. This inefficiency is good though. It is the feature that allows changes not to affect too many things. If DNA were encoded using zip compression, then any mutation at all would completely change the entire meaning of the DNA strand.

After some more time, another meta-evolution occurred, in the form of organisms. Organisms have two parents. This allows organisms to explore an even more restricted space, by essentially taking the mid-point between good points in this space, as well as searching randomly a bit with mutations. Cells on the other hand only have one parent, so the only mechanism they have for exploring the space is mutation.

After even more time, another meta-evolution occurred, in the form of brains. Brains divide an organism into hardware and software, where the hardware evolves in a 2-parent organism way, but the software can evolve differently. Brains encode "behaviors", and if we think of behaviors as software-organisms, then they are organisms with potentially many parents. I'm not sure exactly how behaviors are transferred, but I imagine that a creature can observe behaviors of other organisms and adopt those behaviors without needing to mate and give birth to a new creature (at least humans seem to be capable of this with "mirror" neurons). This allows an even more refined way of searching behavior space, by essentially taking weighted mid-points of many parents.

After even more time, another meta-evolution occurred, in the form of imagination. Imagination is the ability of a brain to simulate reality in its head, without actually doing anything. This allows a brain to test a behavior without suffering too much if it is a bad behavior. This shortens the turn-around time for exploring behavior space.

Now behaviors, or "ideas", seem to be like organisms themselves, and they are evolving in the ecosystem of brains. That is, the life and lineage of an idea or behavior doesn't necessarily follow blood lines. Hence, we might expect there to be "meta-evolutions" of idea-organisms. And it's possible that there already have been some. Some ideas may already have created a sort of "cell and DNA" structure (memes?) so that they can more reliably survive and reproduce, with a more refined mechanism for searching the space.

In fact, I feel like one survival strategy of ideas is to infect a brain, grow, and be born in the form of a human thinking "I just came up with an idea!", where really, the idea had already been come-up-with, and it's just a good strategy for ideas to make their "mothers" think they are original creations so that the mothers will love and care for them, e.g., tell other people about them, so they can survive and reproduce. But the mothers didn't really create them, any more than a human mother cobbled together the DNA of their child.

Hence, I think the idea of "idea ownership" is bunk -- even if it has been a good strategy for ideas to make us feel this way thus far. That is, by understanding better how ideas actually evolve, we may be able to create an even more efficient ecosystem for ideas, in the same way we can develop better agricultural methods by understanding better how plants grow.

many worlds interpretation

By looking at the figure on Wikipedia for the many-worlds interpretation of quantum mechanics, it seems like universes branch off as "observations" are made, where the outcome of the observation is one thing in one branch (e.g. cat dead), and another thing in the other branch (e.g. cat alive).

However, I think the many-worlds interpretation is more like: all possible universes always exist, with different amounts of probability, and these probabilities shift over time. And the way they shift depends on the distribution of probability, which seems to imply that the future of our current universe depends in part on the probability of various parallel universes.

This in turn implies that there isn't really one version of history, but rather, the current state of our universe feeds from a distribution of possibilities for our immediate past. I think this is what is meant by the statement in Wikipedia: "Many-worlds implies that all possible alternative histories ... are real".

EPR paradox

This post attempts to model the EPR paradox with a very simple quantum computer. That way, we won't understand the weirdness, but at least we'll understand that it is weird.

This post is a follow up to two other posts: understanding quantum computation and the double-slit experiment.

I said that the double slit experiment convinced me that quantum stuff was weird. The EPR paradox failed to convince me, at first. I remember hearing something about two particles heading off in different directions, and when people observed the particles, it miraculously turned out that some property about them was always the same. If one was observed to be positive, then the other one would turn out to be positive as well. And then they'd go on to say how this would be true even if the particles were observed lightyears apart from each other.

But I thought, maybe the particles agreed on a value right from the start, before going lightyears apart from each other. And they would say, no no that's impossible! Quantum particles only decide what to be when you observe them! And I wasn't convinced.

But now I am convinced, and I will attempt to convince you with a simple quantum computer.

Simple Model

This section will be easier to understand after reading understanding quantum computation and the double-slit experiment, and will use the copy, random and half-random gates from those posts.

Our computer will start with two qubits which we'll initialize to 0. Next, we'll send the first qubit through a random gate, so that it is either a 0 or a 1. Then we'll send both qubits through a copy gate, so that the value of the first qubit is sortof copied into the second qubit. We actually did this in the Copy Gate section of understanding quantum computation, and the result is a state like this [sqrt(.5), 0, 0, sqrt(.5)], where there's a 50% chance that both qubits are 0, and a 50% chance that both qubits are 1.
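
The preparation step can be sketched in a few lines of JavaScript, following the "state * matrix" order used in this post. The random-gate matrix is the one written out further down. The copy-gate matrix is an assumption here (the usual controlled-NOT), since it is defined in the earlier post rather than this one, and vecMat is a helper name invented for the sketch.

```javascript
// Multiply a row vector by a matrix, in the post's "state * matrix" order.
function vecMat(v, m) {
    var out = []
    for (var j = 0; j < m[0].length; j++) {
        var total = 0
        for (var i = 0; i < v.length; i++) total += v[i] * m[i][j]
        out.push(total)
    }
    return out
}

var s = Math.sqrt(.5)

// Random gate applied to the first qubit only.
var randomFirst = [
    [s, 0, s, 0],
    [0, s, 0, s],
    [s, 0, -s, 0],
    [0, s, 0, -s]
]

// ASSUMPTION: the copy gate is a controlled-NOT, which flips the second
// qubit when the first qubit is 1 (it swaps the 10 and 11 entries).
var copy = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0]
]

var start = [1, 0, 0, 0]                       // both qubits are 0
var afterRandom = vecMat(start, randomFirst)   // first qubit is now 0 or 1
var entangled = vecMat(afterRandom, copy)      // second qubit follows the first

// Squaring each entry gives the chance of seeing 00, 01, 10, 11.
var chances = entangled.map(function (x) { return x * x })
console.log(entangled, chances)
```

Under that assumption the result is the [sqrt(.5), 0, 0, sqrt(.5)] state, with chances of about 50% for 00 and 50% for 11.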

Now, we ship one qubit to Alice and the other to Bob, who live lightyears apart. And if Alice opens her qubit and sees a 1, then "miraculously" Bob's qubit will also have a 1 inside. So far, the "they were both the same to begin with" theory is looking ok.

But What If...

But what if Alice and Bob both have a random gate lying around that they could pass their qubit through before opening it? Then there are four possibilities: they could both not use the gate, Alice could but not Bob, Bob could but not Alice, or they both could.

We can represent each possibility with a matrix. If neither uses the random gate, then it's just an identity matrix like this [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]].

If Alice uses it, but not Bob, then we need some Kronecker product magic to combine the random gate matrix [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]] with a 2x2 identity matrix like this: [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]] ⊗ [[1, 0], [0, 1]] = [[sqrt(.5), 0, sqrt(.5), 0], [0, sqrt(.5), 0, sqrt(.5)], [sqrt(.5), 0, -sqrt(.5), 0], [0, sqrt(.5), 0, -sqrt(.5)]].

If Bob uses it, but not Alice, then we need to use the Kronecker product in the opposite order like this: [[1, 0], [0, 1]] ⊗ [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]] = [[sqrt(.5), sqrt(.5), 0, 0], [sqrt(.5), -sqrt(.5), 0, 0], [0, 0, sqrt(.5), sqrt(.5)], [0, 0, sqrt(.5), -sqrt(.5)]].

If they both use it, then we need to Kronecker two random gates together -- like we did at the end of the Copy Gate section mentioned above -- which goes like this: [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]] ⊗ [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]] = [[.5, .5, .5, .5], [.5, -.5, .5, -.5], [.5, .5, -.5, -.5], [.5, -.5, -.5, .5]].
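
For anyone who wants to check these by machine, here is a minimal sketch of the Kronecker product. The function name kron and the variable names are made up for this example; the matrices are the ones given above.

```javascript
// Kronecker product: each entry of a is replaced by that entry times all of b.
function kron(a, b) {
    var out = []
    for (var i = 0; i < a.length; i++) {
        for (var k = 0; k < b.length; k++) {
            var row = []
            for (var j = 0; j < a[0].length; j++) {
                for (var l = 0; l < b[0].length; l++) {
                    row.push(a[i][j] * b[k][l])
                }
            }
            out.push(row)
        }
    }
    return out
}

var s = Math.sqrt(.5)
var random = [[s, s], [s, -s]]     // the random gate
var identity = [[1, 0], [0, 1]]    // leaving a qubit alone

var aliceOnly = kron(random, identity)   // Alice uses the gate, Bob doesn't
var bobOnly = kron(identity, random)     // Bob uses the gate, Alice doesn't
var both = kron(random, random)          // both use the gate

console.log(aliceOnly, bobOnly, both)
```

The order of the arguments matters: Alice's gate goes on the left and Bob's on the right, which is why the two "only one of them" matrices look different.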

Now let's multiply our state [sqrt(.5), 0, 0, sqrt(.5)] by each possibility:

If neither Alice nor Bob uses a random gate, then we get: [sqrt(.5), 0, 0, sqrt(.5)] * [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]] = [sqrt(.5), 0, 0, sqrt(.5)]. This means they'll either both see a 1, or they'll both see a 0, each with 50% probability.

If Alice uses a random gate, but not Bob, then we get: [sqrt(.5), 0, 0, sqrt(.5)] * [[sqrt(.5), 0, sqrt(.5), 0], [0, sqrt(.5), 0, sqrt(.5)], [sqrt(.5), 0, -sqrt(.5), 0], [0, sqrt(.5), 0, -sqrt(.5)]] = [.5, .5, .5, -.5]. Squaring each number gives us a 25% chance of each possibility. So the contents of their respective qubits will be like the outcomes of two independent coin flips.

If Bob uses a random gate, but not Alice, then we get: [sqrt(.5), 0, 0, sqrt(.5)] * [[sqrt(.5), sqrt(.5), 0, 0], [sqrt(.5), -sqrt(.5), 0, 0], [0, 0, sqrt(.5), sqrt(.5)], [0, 0, sqrt(.5), -sqrt(.5)]] = [.5, .5, .5, -.5]. This is what we saw if just Alice used a random gate. So if just one person uses a random gate, then the outcomes of their qubits will be like two independent coin flips.

If both Alice and Bob use a random gate, then we get: [sqrt(.5), 0, 0, sqrt(.5)] * [[.5, .5, .5, .5], [.5, -.5, .5, -.5], [.5, .5, -.5, -.5], [.5, -.5, -.5, .5]] = [sqrt(.5), 0, 0, sqrt(.5)]. This is what we saw if neither used a random gate. So if they both use or don't use a random gate, then they'll both see the same random outcome.

So this is a little weird. At first glance, it seems like the qubits sent to Alice and Bob need to communicate over lightyears of space to say whether or not they went through a random gate, so that they know whether to be the same as each other or not.

Of course, the qubits could also decide ahead of time what to do in each case. For instance, Alice's qubit could say: "If Alice opens me right away, I'll be a 0, but if she first sends me through a random gate, then I'll be a 1." And then Bob's qubit could agree and say: "Ok, so I'll also be a 0 if Bob opens me right away, and a 1 if he first sends me through a random gate."

And of course the outcomes need to appear appropriately random if the experiment is repeated a bunch of times, so they could agree on a sequence of things to be in each repetition of the experiment. Let's say the experiment will be repeated ten times. The qubits could represent what they'll do in every possible case with two strings of ten bits, like 0110100111 and 1010101001. The first bit in the first sequence says what each qubit will be in the first trial of the experiment if that qubit is not sent through a random gate. The first bit in the second sequence says what each qubit will be in the first trial of the experiment if that qubit is sent through a random gate. And the next bit in each sequence says what to do in the second trial of the experiment. And so on.

Note that the two sequences need to have certain properties, so that the qubits can fool us all into thinking that true randomness is happening. First, each sequence needs to appear random. Second, the sequences need to appear uncorrelated with each other. That is, knowing a bit in the first sequence shouldn't tell us anything about the corresponding bit in the second sequence. Of course, the qubits can achieve both of these goals if they create each sequence by flipping a coin over and over, which seems easy enough.
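
Here is a sketch of that "decide ahead of time" scheme, to check that it really does reproduce the statistics seen so far. Everything in it (makePlan, fractionSame, the 'n' and 'r' labels for "no gate" and "random gate") is a name invented for this example, and the plan is regenerated by coin flips on each trial rather than read off a pre-written sequence.

```javascript
// One trial's plan: an answer for each thing Alice or Bob might do.
// 'n' = open the qubit right away, 'r' = use the random gate first.
function makePlan() {
    return {
        n: Math.random() < .5 ? 0 : 1,
        r: Math.random() < .5 ? 0 : 1
    }
}

// Fraction of trials in which Alice and Bob see the same bit.
function fractionSame(aliceChoice, bobChoice, trials) {
    var same = 0
    for (var i = 0; i < trials; i++) {
        var plan = makePlan()   // both qubits carry the same plan
        if (plan[aliceChoice] == plan[bobChoice]) same++
    }
    return same / trials
}

console.log(fractionSame('n', 'n', 100000))   // always the same
console.log(fractionSame('r', 'r', 100000))   // always the same
console.log(fractionSame('n', 'r', 100000))   // about half the time
```

So with only the random gate in play, pre-agreed coin flips are enough to match what the quantum computer does.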

But What If Also...

But what if Alice and Bob both also have a half-random gate lying around that they could pass their qubit through before opening it? And let's say that they'll choose to use their random gate or their half-random gate or neither gate before opening their qubit. Then there are nine possibilities. If we represent not using a gate with n, using a random gate with r, and using a half-random gate with h, then the possibilities are: nn, nr, nh, rn, rr, rh, hn, hr, and hh, where "nr" represents Alice not using a gate, and Bob using a random gate.

We can represent each possibility with a matrix. To reduce our work, we'll note that the outcomes are going to be symmetric for cases like nr and rn, as we saw above. So we really have six possibilities: nn, nr, nh, rr, rh, and hh, and we already know the matrix for nn, nr, and rr, from above, so we just need a matrix for nh, rh and hh.

The matrix for nh is: [[1, 0], [0, 1]] ⊗ [[cos(π/8), sin(π/8)], [sin(π/8), -cos(π/8)]] = [[cos(π/8), sin(π/8), 0, 0], [sin(π/8), -cos(π/8), 0, 0], [0, 0, cos(π/8), sin(π/8)], [0, 0, sin(π/8), -cos(π/8)]].

The matrix for rh is: [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]] ⊗ [[cos(π/8), sin(π/8)], [sin(π/8), -cos(π/8)]] ≈ [[.653, .271, .653, .271], [.271, -.653, .271, -.653], [.653, .271, -.653, -.271], [.271, -.653, -.271, .653]].

The matrix for hh is: [[cos(π/8), sin(π/8)], [sin(π/8), -cos(π/8)]] ⊗ [[cos(π/8), sin(π/8)], [sin(π/8), -cos(π/8)]] ≈ [[.854, .354, .354, .146], [.354, -.854, .146, -.354], [.354, .146, -.854, -.354], [.146, -.354, -.354, .854]].

Now let's multiply our state [sqrt(.5), 0, 0, sqrt(.5)] by each new possibility:

For nh we get: [sqrt(.5), 0, 0, sqrt(.5)] * [[cos(π/8), sin(π/8), 0, 0], [sin(π/8), -cos(π/8), 0, 0], [0, 0, cos(π/8), sin(π/8)], [0, 0, sin(π/8), -cos(π/8)]] ≈ [.653, .271, .271, -.653], which agrees with what we saw in the symmetric hn situation in the double-slit experiment post. When we square these values, we get about [.427, .073, .073, .427], meaning there's a .427 + .427 ≈ 85% chance that both Alice and Bob's qubits are the same as each other, and a .073 + .073 ≈ 15% chance that their qubits are different.

For rh we get: [sqrt(.5), 0, 0, sqrt(.5)] * [[.653, .271, .653, .271], [.271, -.653, .271, -.653], [.653, .271, -.653, -.271], [.271, -.653, -.271, .653]] ≈ [.653, -.271, .271, .653]. When we square these values we get about [.427, .073, .073, .427], which is the same as nh above.

For hh we get: [sqrt(.5), 0, 0, sqrt(.5)] * [[.854, .354, .354, .146], [.354, -.854, .146, -.354], [.354, .146, -.854, -.354], [.146, -.354, -.354, .854]] ≈ [.707107, 0, 0, .707107], which looks a lot like [sqrt(.5), 0, 0, sqrt(.5)], which we recognize as meaning that both Alice and Bob will see the same outcome, be it a 0 or a 1.

Let's summarize what's going on. If both Alice and Bob use the same gate -- that is, nn, rr or hh -- then they'll both see the same outcome. If one person uses a random gate, and the other person uses no gate, then the outcomes will be completely uncorrelated. And if one person uses a half-random gate, and the other person uses either no gate or a random gate, then there's an 85% chance that they'll both see the same outcome.
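To double-check this summary, here is a rough Python sketch that runs all six cases and adds up the chance that both bits agree. It assumes that "no gate" is the identity matrix and the random gate is the [[sqrt(.5), sqrt(.5)], [sqrt(.5), -sqrt(.5)]] matrix, as the tensor products for nh and rh suggest. The names kron, apply, GATES and same are made up for this sketch.

```python
import math
from itertools import combinations_with_replacement

def kron(a, b):
    """Tensor (Kronecker) product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def apply(state, m):
    """Row vector times matrix, the same order of multiplication as in the text."""
    return [sum(state[i] * m[i][j] for i in range(len(state)))
            for j in range(len(m[0]))]

c, s = math.cos(math.pi / 8), math.sin(math.pi / 8)
r = math.sqrt(0.5)

# Assumed gate matrices: n = no gate, r = random gate, h = half-random gate.
GATES = {"n": [[1, 0], [0, 1]], "r": [[r, r], [r, -r]], "h": [[c, s], [s, -c]]}
state = [r, 0, 0, r]

same = {}
for a, b in combinations_with_replacement("nrh", 2):
    amplitudes = apply(state, kron(GATES[a], GATES[b]))
    probabilities = [x * x for x in amplitudes]
    # Entries 0 and 3 are the outcomes 00 and 11, where both bits agree.
    same[a + b] = probabilities[0] + probabilities[3]

for case, p in same.items():
    print(case, round(p, 3))
```

Under those assumptions the agreement comes out to about 1 for nn, rr and hh, about .5 for nr, and about .854 for nh and rh.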

Now let's imagine a scenario like before where there are going to be ten repetitions of the experiment, and the qubits are trying to decide ahead of time what to do in each case. They'll now need three sequences of bits to cover the three possible decisions each person could make: no gate, random gate, or half-random gate.

Now what constraints do they need to place on the sequences? First note that the sequences will be the same for each qubit, since they need to make sure that if both Alice and Bob do the same thing to their qubit, then they'll each observe the same value when they open their qubit -- this covers the nn, rr and hh cases.

We recall from before that the sequences will need to appear random, and that the first and second sequences should be uncorrelated, to account for the nr (or symmetric rn) case.

Now we just need to account for the nh and rh (or symmetric hn and hr) cases. In the nh case, Alice uses no gate, and Bob uses a half-random gate, and we saw that there is an 85% chance that they both see the same outcome. Hence, the first sequence needs to be 85% correlated with the third sequence. That is, 85% of the time, the bit we see in the first sequence should be the same as the bit we see in the third sequence. In the rh case, there is also an 85% chance that both Alice and Bob see the same thing. This means that there is also an 85% correlation between the second sequence and the third sequence.

So we have three random sequences of bits. The first sequence is uncorrelated with the second sequence, but both the first and second sequences are 85% correlated with the third sequence.

It turns out this is impossible. The best we can do is make the third sequence 75% correlated with both the first and second sequences. We could do this as follows: whenever the first and second sequences have the same bit, make the third sequence also have that bit. This will happen 50% of the time, since the first two sequences are uncorrelated. The rest of the time, the first and second sequences will have different bits, so the third sequence can't match both of them and will need to choose. If we align with the first sequence half the time, and the second sequence half the time, then we'll get our 75% correlation with both sequences. We could become more correlated with one sequence, but only by becoming less correlated with the other sequence. Hence, it is impossible to be 85% correlated with both sequences.
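The 75% limit can be checked by brute force. The sketch below assumes the bits in the first and second sequences are independent fair coin flips, so the four pairs (0,0), (0,1), (1,0) and (1,1) are equally likely. It then tries every one of the 16 deterministic rules for picking the third bit from the first two. A rule that flips coins is just a mixture of these 16, so it can't push the two match rates above the same total. The variable names are made up for this sketch.

```python
from itertools import product

# All (first bit, second bit) pairs, each assumed to occur 25% of the time.
PAIRS = list(product((0, 1), repeat=2))

best_min = 0  # best achievable "matches BOTH sequences at least this often"
best_sum = 0  # best achievable total of the two match rates

for table in product((0, 1), repeat=4):
    rule = dict(zip(PAIRS, table))  # rule[(a, b)] is the third sequence's bit
    match_first = sum(rule[(a, b)] == a for a, b in PAIRS) / 4
    match_second = sum(rule[(a, b)] == b for a, b in PAIRS) / 4
    best_min = max(best_min, min(match_first, match_second))
    best_sum = max(best_sum, match_first + match_second)

print(best_min, best_sum)  # prints: 0.75 1.5
```

The two match rates never add up to more than 1.5, so they can't both be .85, which would need a total of 1.7.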

This means that the qubits can't decide ahead of time what to do in each case in such a way that they can satisfy all the possible expected outcomes. Whatever they choose, Alice and Bob could happen to use their gates in such a way that the correlation they expect to see -- based on their quantum mechanical understanding -- would be violated. For instance, if the qubits decided to have the third sequence be 85% correlated with the first sequence, meaning that the second sequence wasn't 85% correlated with the third sequence, then Alice could use a random gate and Bob could use a half-random gate every time, such that the correlation should be 85%, but wouldn't be.

Conclusion

And that is the paradox: Alice's qubit must magically know what sort of gate Bob's qubit passed through in order to decide what to do, and vice versa. But this would suggest some sort of faster-than-light communication, since Alice and Bob are lightyears apart.

A consequence I thought would come from this is the ability to communicate across great distances instantly. But no. This technique can't actually be used to send messages between Alice and Bob. From either person's perspective, their qubit is randomly a 0 or a 1 no matter what sort of gate they pass it through. It is only when they meet each other again to compare notes about their observations that they see spooky correlations.
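Here is a small Python sketch of why no message gets through. For every combination of gates, it computes the chance that Alice sees a 0 on her own qubit, ignoring Bob's result. It assumes "no gate" is the identity matrix, that the random and half-random gates are the 2x2 matrices from the tensor products earlier, and that Alice's bit is the first of the two. The names kron, apply, GATES and alice_sees_zero are made up for this sketch.

```python
import math
from itertools import product

def kron(a, b):
    """Tensor (Kronecker) product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def apply(state, m):
    """Row vector times matrix, the same order of multiplication as in the text."""
    return [sum(state[i] * m[i][j] for i in range(len(state)))
            for j in range(len(m[0]))]

c, s = math.cos(math.pi / 8), math.sin(math.pi / 8)
r = math.sqrt(0.5)

# Assumed gate matrices: n = no gate, r = random gate, h = half-random gate.
GATES = {"n": [[1, 0], [0, 1]], "r": [[r, r], [r, -r]], "h": [[c, s], [s, -c]]}
state = [r, 0, 0, r]

alice_sees_zero = {}
for a, b in product("nrh", repeat=2):
    amplitudes = apply(state, kron(GATES[a], GATES[b]))
    p = [x * x for x in amplitudes]
    # Alice's bit is the first one, so outcomes 00 and 01 are entries 0 and 1.
    alice_sees_zero[a + b] = p[0] + p[1]

for case, chance in alice_sees_zero.items():
    print(case, round(chance, 3))
```

Every one of the nine cases comes out to a 50% chance, so nothing Bob does with his gate shows up in what Alice sees on her own.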

The real consequence has to do with how we model the universe. As computer scientists, we might try to model everything as a giant cellular automaton (like the Game of Life), where each cell is like a point in space which contains a particle or doesn't. And it would be nice if the laws of the universe were simple cellular automaton update rules applying to each cell based on nearby cells. However, the EPR Paradox suggests that this doesn't work. Sometimes a cell will need to know something about a very distant cell. Hence, if we wanted to use a cellular automaton to model the universe, it seems like the update rules for each cell would need to examine every other cell in the system, which seems very complicated and messy. Alas.