Showing posts with label Nash Equilibrium. Show all posts

Thursday, July 9, 2009

Uneasy Alliances II: Why Two-Face Loses by Flipping a Coin

Reprinted from filmschoolrejects.com


Earlier we began a discussion of Harvey Dent and James Gordon's alliance to clean up the streets of Gotham. We concluded that cooperation did make sense in the scenario portrayed by the film The Dark Knight. The cooperation game, or the Stag Hunt game, produced two pure strategy Nash Equilibria -- both players would cooperate or both players would work on their own.

Now, suppose that we take the same situation but introduce randomization. That is, suppose Harvey decides he wants to randomize his actions so that Gordon could not predict what he would do. Note that this is extremely unlikely; people usually intentionally randomize when they are working against another player. But for the sake of argument, suppose that it applies here.

Assigning a probability to each available action, rather than picking one action for certain, is known as a mixed strategy. In this game, we can actually find a third, mixed-strategy equilibrium in addition to the two pure-strategy ones we found in the previous post.

So, here is the matrix from last time:

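Payoffs are listed as (Gordon, Harvey). Gordon chooses the row; Harvey chooses the column.

| Gordon ↓ / Harvey → | Cooperate | Don’t Cooperate |
|---------------------|-----------|-----------------|
| Cooperate           | (4,4)     | (1,3)           |
| Don’t Cooperate     | (3,1)     | (3,3)           |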


Suppose that Harvey assigns a probability, p, to cooperating and (1-p) to not cooperating. Then we could perform an expected utility calculation to deduce Gordon's optimal strategy.

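Recall that the Expected Utility (EU) of a given action is the sum of the utility values (U) of its possible outcomes, each weighted by the probability of receiving it. Here the outcomes depend on whether Harvey cooperates (probability p) or not (probability 1-p). Therefore: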

EU = p * U(Cooperate) + (1-p) * U(Don't Cooperate)

Then the expected utility if Gordon cooperates is:
EU(Gordon Cooperates) = p * 4 + (1-p) * 1
= 4p + 1 - p
= 3p + 1

The expected utility if Gordon does not cooperate is:
EU(Gordon Does Not Cooperate) = p * 3 + (1-p) * 3
= 3p + 3 - 3p
= 3

We know that Gordon will choose whichever action yields the greater expected utility. To find the point at which he switches from one action to the other, we set the two expressions equal to each other:

EU(Gordon Cooperates) = EU(Gordon Does Not Cooperate)
3p + 1 = 3
3p = 2
p = 2/3

Therefore, Gordon will cooperate if the probability that Harvey cooperates is greater than 2/3 and will not cooperate if it is less than 2/3. At exactly p = 2/3 he is indifferent between the two actions. We can perform the exact same analysis by assigning a probability, q, to Gordon's actions and calculating expected utilities for Harvey. Because the game is symmetric, it yields the same answer, namely q = 2/3.
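The indifference calculation can be checked with a few lines of Python. This is only a sketch: the dictionary layout and the names GORDON_PAYOFF, expected_utility and indifference_probability are made up for illustration, and the numbers are simply Gordon's payoffs read off the matrix above.

```python
from fractions import Fraction

# Gordon's payoffs from the matrix above, indexed as
# GORDON_PAYOFF[gordon_action][harvey_action]. Names are illustrative only.
GORDON_PAYOFF = {
    "cooperate": {"cooperate": 4, "dont": 1},
    "dont": {"cooperate": 3, "dont": 3},
}


def expected_utility(gordon_action, p):
    """Gordon's expected utility when Harvey cooperates with probability p."""
    row = GORDON_PAYOFF[gordon_action]
    return p * row["cooperate"] + (1 - p) * row["dont"]


def indifference_probability():
    """Solve EU(cooperate) = EU(dont) for p. Both sides are linear in p."""
    # Write each EU as intercept + slope * p, using its values at p = 0 and p = 1.
    c0 = expected_utility("cooperate", Fraction(0))
    c1 = expected_utility("cooperate", Fraction(1)) - c0
    d0 = expected_utility("dont", Fraction(0))
    d1 = expected_utility("dont", Fraction(1)) - d0
    return (d0 - c0) / (c1 - d1)


p_star = indifference_probability()
print(p_star)  # 2/3
```

Exact fractions are used instead of floats so that the comparison at p = 2/3 comes out as a true tie rather than a rounding accident.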

So p = q = 2/3 and we have a new, mixed strategy equilibrium where each player chooses to cooperate 2/3 of the time and does not cooperate 1/3 of the time. If Harvey decides to randomize this way, then Gordon cannot benefit by deviating from this strategy alone.

This result is interesting for several reasons. First, each player's expected payoff under the mixed strategies is 3. The mixed-strategy equilibrium is therefore no better than either pure-strategy equilibrium: it pays exactly what Dent and Gordon would get by never cooperating (3 each), and strictly less than what they would get by always cooperating (4 each).
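To see where the expected payoff of 3 comes from, here is a small Python sketch that weights every cell of the matrix by the probability of landing in it. The PAYOFFS table and the expected_payoffs helper are illustrative names, not anything from the original post; the payoffs are the ones in the matrix above, written as (Gordon, Harvey).

```python
from fractions import Fraction

# Payoffs as (Gordon, Harvey), indexed by (gordon_action, harvey_action).
# "C" = cooperate, "D" = don't cooperate. Names are illustrative only.
PAYOFFS = {
    ("C", "C"): (4, 4),
    ("C", "D"): (1, 3),
    ("D", "C"): (3, 1),
    ("D", "D"): (3, 3),
}


def expected_payoffs(q, p):
    """Expected (Gordon, Harvey) payoffs when Gordon cooperates with
    probability q and Harvey cooperates with probability p."""
    gordon_mix = {"C": q, "D": 1 - q}
    harvey_mix = {"C": p, "D": 1 - p}
    gordon_total = Fraction(0)
    harvey_total = Fraction(0)
    for (g, h), (u_gordon, u_harvey) in PAYOFFS.items():
        weight = gordon_mix[g] * harvey_mix[h]  # probability of this cell
        gordon_total += weight * u_gordon
        harvey_total += weight * u_harvey
    return gordon_total, harvey_total


two_thirds = Fraction(2, 3)
mixed = expected_payoffs(two_thirds, two_thirds)  # mixed equilibrium
both_cooperate = expected_payoffs(1, 1)           # pure: always cooperate
both_alone = expected_payoffs(0, 0)               # pure: never cooperate
print(mixed, both_cooperate, both_alone)
```

The three results come to 3, 4 and 3 for each player, which is the ranking described above.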

Second, I mentioned before that we were supposing Harvey intentionally randomized his actions, but the truth is that this mixed-strategy equilibrium exists whether he wants to randomize or not. The reason is that mixed strategies can be interpreted as one individual's beliefs about the other's actions. In other words, Harvey cooperating 2/3 of the time and working on his own 1/3 of the time can be seen as Gordon's view of what Harvey will do, given his uncertainty in the matter. If Gordon believes Harvey will cooperate with probability 2/3, then he is indifferent between his two actions, so cooperating 2/3 of the time is one of his best responses.

Now suppose that Harvey decides to flip a coin instead. And what's more, suppose that Gordon knows that Harvey will flip a coin. What will Gordon do? And will this be an equilibrium?

If Harvey flips a coin to decide, this means that he will cooperate 50% of the time and work on his own 50% of the time. So, Gordon's expected payoff will be:

EU(Gordon Cooperates) = (1/2 * 4) + (1/2 * 1) = 2.5
EU(Gordon Does Not Cooperate) = (1/2 * 3) + (1/2 * 3) = 3

Therefore, Gordon will derive a larger expected utility from not cooperating and will choose to work on his own all of the time.

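This, however, is not an equilibrium. We already know that if Gordon chooses to work alone 100% of the time, then Harvey would be strictly better off by also choosing not to cooperate 100% of the time. Against a Gordon who never cooperates, the coin gives Harvey an expected utility of (1/2 * 1) + (1/2 * 3) = 2, whereas never cooperating gives him 3. By sticking to the coin strategy, Harvey is giving up a full unit of utility.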
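The coin-flip argument can be walked through in code as well. The sketch below first finds Gordon's best reply to a 50/50 Harvey, then compares what Harvey earns from the coin with what he would earn by never cooperating. The helper names gordon_eu and harvey_eu_vs_pure are hypothetical, and the payoffs are again (Gordon, Harvey) from the first matrix.

```python
from fractions import Fraction

# Payoffs as (Gordon, Harvey), indexed by (gordon_action, harvey_action).
# "C" = cooperate, "D" = don't cooperate. Names are illustrative only.
PAYOFFS = {
    ("C", "C"): (4, 4),
    ("C", "D"): (1, 3),
    ("D", "C"): (3, 1),
    ("D", "D"): (3, 3),
}


def gordon_eu(gordon_action, p):
    """Gordon's expected utility if Harvey cooperates with probability p."""
    return (p * PAYOFFS[(gordon_action, "C")][0]
            + (1 - p) * PAYOFFS[(gordon_action, "D")][0])


def harvey_eu_vs_pure(gordon_action, p):
    """Harvey's expected utility from cooperating with probability p
    against a Gordon who always plays gordon_action."""
    return (p * PAYOFFS[(gordon_action, "C")][1]
            + (1 - p) * PAYOFFS[(gordon_action, "D")][1])


coin = Fraction(1, 2)

# Step 1: Gordon's best reply to the coin flip.
gordon_best = max(["C", "D"], key=lambda action: gordon_eu(action, coin))

# Step 2: Harvey's payoff from the coin versus from never cooperating.
coin_payoff = harvey_eu_vs_pure(gordon_best, coin)
alone_payoff = harvey_eu_vs_pure(gordon_best, Fraction(0))
print(gordon_best, coin_payoff, alone_payoff)  # D 2 3
```

Since Harvey can raise his payoff from 2 to 3 by changing his own strategy, the coin flip paired with Gordon's best reply is not a Nash equilibrium.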

Of course, there are certain situations where flipping a coin could work. Suppose that Two-Face and the Penguin are facing off against each other by driving their cars towards one another in a bizarre game of chicken. Each can choose to go left or go right. The only thing is that they have to make their decisions at the same time, so nobody gains any utility by turning first. All we know is that each wants to live. So, if they both turn left, they each receive a utility of 10 for being alive. If they both turn right, they also each receive a utility of 10. If one turns left and the other turns right, both will die in the car crash and receive a utility of 0. The matrix then looks like this:

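Payoffs are listed as (Penguin, Two-Face). The Penguin chooses the row; Two-Face chooses the column.

| Penguin ↓ / Two-Face → | Left    | Right   |
|------------------------|---------|---------|
| Left                   | (10,10) | (0,0)   |
| Right                  | (0,0)   | (10,10) |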


Here if we perform the same utility calculations as above, assigning a probability of p to Two-Face turning left, we will arrive at p=1/2. Therefore, if Two-Face chooses to flip a coin intentionally, the Penguin should do the same and this would be a mixed-strategy Nash equilibrium.
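For anyone who wants to verify the p = 1/2 result, the same expected-utility check can be run on this second matrix. This is a sketch with made-up names (PENGUIN_PAYOFF, penguin_eu); only the Penguin's payoffs are needed, and they are taken directly from the matrix above.

```python
from fractions import Fraction

# Penguin's payoffs, indexed by (penguin_action, two_face_action).
# "L" = left, "R" = right. Names are illustrative only.
PENGUIN_PAYOFF = {
    ("L", "L"): 10,
    ("L", "R"): 0,
    ("R", "L"): 0,
    ("R", "R"): 10,
}


def penguin_eu(action, p):
    """Penguin's expected utility when Two-Face turns left with probability p."""
    return (p * PENGUIN_PAYOFF[(action, "L")]
            + (1 - p) * PENGUIN_PAYOFF[(action, "R")])


coin = Fraction(1, 2)
left = penguin_eu("L", coin)    # 10 * 1/2 + 0 * 1/2
right = penguin_eu("R", coin)   # 0 * 1/2 + 10 * 1/2
print(left, right)  # 5 5
```

Both actions pay 5 against the coin, so the Penguin has nothing to gain by leaning either way. Against any other p he would strictly prefer one side, which is why 1/2 is the only mixing probability that works here.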

Now, this sort of situation does not happen often. And this is why Two-Face's gimmick of flipping a coin to make every decision is usually a costly one. First of all, he gives away his strategy, making it easy for his opponents to work out their best responses. Second, it is not always the case that choosing one action 50% of the time and another 50% of the time is a mixed-strategy equilibrium, as we saw above. If Two-Face continues to adhere strictly to this strategy, he will lose out in the long run.

And this is why Batman will always win. He knows his economics.