Game Theory

Equilibria and Global Optimality

Recall that, formally, a finite $n$-person normal form game is a triple $(N, A, u)$, where

  • $N = \{1, 2, \dots, n\}$ is a (finite) set of $n$ players,

  • $A = A_1 \times A_2 \times \dots \times A_n$, and $A_i$, for $i \in N$, is the finite set of actions for player $i$ (where $a = (a_1, a_2, \dots, a_n) \in A$ is called an action profile),

  • $u = (u_1, u_2, \dots, u_n)$ is a profile of utility functions, and $u_i \colon A \to \mathbb{R}$ is the utility function describing the payoff for player $i$.

Some, but not all, games allow for dominant strategies for one or more players. A rational player will always choose a dominant strategy.

In the absence of dominant strategies, other tools are needed to analyse a game.

A Game of Three Clients

There are games where no player has a dominant strategy.

Example. Two companies want to do business with one of three clients $A$, $B$, or $C$, under the following conditions.

  • If the two companies approach the same client, this client will give half of its business to each of them.
  • Company 1 is too small to do business on their own: if they approach a different client than company 2, their payoff will be 0.
  • $A$ is a large client and will only do business if both companies approach them, whereas $B$ and $C$ are prepared to deal with company 2 alone.
  • $A$’s big business has a value of 8, while $B$ and $C$ have business worth 2.

All of this results in the following payoff matrix (with rows corresponding to the options for company 1, and columns to company 2; each cell lists company 1's payoff first and company 2's second).

          A        B        C
  A     4, 4     0, 2     0, 2
  B     0, 0     1, 1     0, 2
  C     0, 0     0, 2     1, 1

In this game, neither company has a dominant strategy. In fact, each strategy is a strict best response to some strategy of the other company.
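
To make this claim concrete, here is a minimal Python sketch (not part of the original example) that encodes the payoffs derived from the rules above and checks that neither company has a weakly dominant strategy; the dictionary layout, the 0/1 player indices, and the helper names are illustrative choices.

```python
clients = ["A", "B", "C"]

# payoffs[(a1, a2)] = (payoff to company 1, payoff to company 2),
# derived from the rules of the example above
payoffs = {
    ("A", "A"): (4, 4), ("A", "B"): (0, 2), ("A", "C"): (0, 2),
    ("B", "A"): (0, 0), ("B", "B"): (1, 1), ("B", "C"): (0, 2),
    ("C", "A"): (0, 0), ("C", "B"): (0, 2), ("C", "C"): (1, 1),
}

def u(player, a1, a2):
    """Payoff of `player` (0 = company 1, 1 = company 2) at the profile (a1, a2)."""
    return payoffs[(a1, a2)][player]

def is_dominant(player, action):
    """Weak dominance: `action` does at least as well as every alternative,
    whatever the opponent chooses."""
    for opp in clients:
        for alt in clients:
            if player == 0 and u(0, action, opp) < u(0, alt, opp):
                return False
            if player == 1 and u(1, opp, action) < u(1, opp, alt):
                return False
    return True

for player in (0, 1):
    print(player, [a for a in clients if is_dominant(player, a)])   # prints empty lists
```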

The Nash Equilibrium

The following simple idea allows us to reason about games that have no dominant strategy.

Definition (Equilibrium). An action profile $a = (a_1, \dots, a_n) \in A$ is a Nash Equilibrium if, for each player $i \in N$, their action $a_i$ is a best response to $a_{-i}$.

The idea here is that an action profile which is not an equilibrium provides an incentive for at least one player to move to a different strategy.

In the preceding example, $(A, A)$ is an equilibrium, as for both players, strategy $A$ is a best response to the opponent playing $A$. In fact it is the only equilibrium in that game. The profile $(B, B)$ is not an equilibrium as player 2’s best response to player 1 playing $B$ is $C$, and not $B$.

Assuming that non-equilibrium strategies will not be played by rational agents, in the example both players will play $A$, the unique equilibrium of this game.
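
The best-response condition in the definition can be checked mechanically for every pure action profile. The following sketch (reusing the payoffs derived from the example) confirms that $(A, A)$ is the only pure equilibrium; the function name is an illustrative choice.

```python
clients = ["A", "B", "C"]
payoffs = {
    ("A", "A"): (4, 4), ("A", "B"): (0, 2), ("A", "C"): (0, 2),
    ("B", "A"): (0, 0), ("B", "B"): (1, 1), ("B", "C"): (0, 2),
    ("C", "A"): (0, 0), ("C", "B"): (0, 2), ("C", "C"): (1, 1),
}

def pure_equilibria(payoffs, actions):
    """All profiles (a1, a2) in which each action is a best response to the other."""
    result = []
    for a1 in actions:
        for a2 in actions:
            u1, u2 = payoffs[(a1, a2)]
            best_for_1 = all(u1 >= payoffs[(alt, a2)][0] for alt in actions)
            best_for_2 = all(u2 >= payoffs[(a1, alt)][1] for alt in actions)
            if best_for_1 and best_for_2:
                result.append((a1, a2))
    return result

print(pure_equilibria(payoffs, clients))   # [('A', 'A')]
```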

Multiple Equilibria: Coordination Games

The analysis of a game can be difficult if it has multiple equilibria. This typically happens with situations that require the players to coordinate.

Example. Two students need to prepare a presentation for the next day. Unfortunately, they forgot to agree on the software to use, and for now cannot communicate. The options are either PowerPoint ($P$) or BeamerLaTeX ($B$). The presentation will be good if both students use the same software in their preparation (payoff 1), and it will be bad if different systems are used (payoff 0). This yields a payoff matrix of the following shape.

          P        B
  P     1, 1     0, 0
  B     0, 0     1, 1

This game obviously has two equilibria, $(P,P)$ and $(B,B)$, which cannot be distinguished from each other.

Interestingly, the same payoff matrix can be used to model the question: Which side of the road should two oncoming cars use to pass each other safely? This question is usually answered by social conventions, conventions that everyone agrees with, but which may differ from country to country.

Not all variants of the coordination game are this symmetric.

Example. If both students generally prefer BeamerLaTeX for their presentations and achieve better results with it, the payoff when both choose this software might be 2 for each. This results in an unbalanced payoff matrix.

          P        B
  P     1, 1     0, 0
  B     0, 0     2, 2

Again, this game has two equilibria, but the higher payoff for the $(B, B)$ profile is not captured by the equilibrium concept.

Example (Battle of the Sexes). A different lack of symmetry is displayed by a game known in the literature as the ‘Battle of the Sexes’. This game describes a situation where, for example, a couple want to go to the cinema and need to agree on which film to watch, considering that he would rather see an action movie, and she usually prefers a romantic drama …

Assuming a payoff of $0$ for not seeing the same film, a payoff of $2$ for watching the preferred movie, and a payoff of $1$ for watching what the partner prefers, one gets a payoff matrix as follows (with rows corresponding to her options and columns corresponding to his):

          A        R
  A     1, 2     0, 0
  R     0, 0     2, 1

Clearly, $(A, A)$ and $(R, R)$ are equilibria. How the couple decides which one to choose is not captured by the game.

Attack-Defense Games

There are games that have no (pure) equilibrium at all. This typically happens if the players’ interests are in direct conflict. This motivates the introduction of mixed (randomized) strategies.

For example, suppose one player is an attacker who has a choice of two different attacking strategies, $A$ or $B$, and the other player can choose to defend against either $A$ or $B$. The defender wins if they choose the same (matching) strategy as the attacker. Otherwise, in case of a mismatch, the attacker attacks successfully and wins.

Example (Matching Pennies). Here, the two players have a penny each, and each chooses to show either heads or tails. Both pennies are revealed at the same time, and if they show different sides, player $1$ (the attacker) wins and gets to keep both pennies. If they match, player $2$ (the defender) wins and gets to keep both pennies. In terms of a payoff matrix this looks as follows (with rows corresponding to player $1$, and columns to player $2$):

          H         T
  H    -1, 1     1, -1
  T    1, -1     -1, 1

This (zero-sum) game clearly has no dominant strategies and no equilibrium in pure strategies.

In reality, the players of such a game would make it difficult to guess their strategies by choosing randomly.

Suppose that player $2$ chooses $H$ with a certain probability $q \in [0, 1]$ (and hence $T$ with probability $1-q$). The payoffs for player $1$ are then determined as follows.

If player $1$ chooses $H$, they will get a payoff of $-1$ with probability $q$, and a payoff of $1$ with probability $(1-q)$. In total, this yields a payoff of $(-1)q + (1)(1-q) = 1 -2q$, depending on the parameter $q$ chosen by player $2$. Similarly, if player $1$ chooses $T$ they will receive a payoff of $(1)q + (-1)(1-q) = 2q-1$.

If one of $1-2q$ and $2q - 1$ were strictly bigger than the other, the rational player $1$ would have a strict preference for exactly one of their options.

However, player $2$’s intention is to choose the parameter $q$ so that player $1$ becomes indifferent between the available options. The best bet for player $2$ is thus to choose $q$ so that the two payoff values for player $1$ become the same: $1 - 2q = 2q - 1$, that is, $q = \frac12$.

For similar reasons, the optimal strategy for player $1$ is to choose $H$ with probability $p = \frac12$. Then in the strategy profile $(p, q)$, the choices, $p$ by player $1$ and $q$ by player $2$, are again best responses to each other, and the profile forms an equilibrium of mixed strategies, to be discussed next.
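
The indifference argument is easy to verify numerically. Below is a small sketch with the Matching Pennies payoffs for player $1$ hard-coded ($-1$ on a match, $+1$ on a mismatch); the function name is an illustrative choice.

```python
def expected_payoff_p1(choice, q):
    """Expected payoff of player 1 when player 2 shows heads with probability q."""
    if choice == "H":
        return (-1) * q + (+1) * (1 - q)   # 1 - 2q
    return (+1) * q + (-1) * (1 - q)       # 2q - 1

# At q = 1/2 player 1 is indifferent, and both options are worth 0 on average.
assert expected_payoff_p1("H", 0.5) == expected_payoff_p1("T", 0.5) == 0.0

# For any other q, player 1 has a strict preference and can exploit the bias.
assert expected_payoff_p1("H", 0.4) > expected_payoff_p1("T", 0.4)
assert expected_payoff_p1("T", 0.6) > expected_payoff_p1("H", 0.6)
```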

Mixed Strategies

We now allow randomization in the choice of strategies. In addition to pure strategies $a_i \in A_i$, this introduces mixed strategies.

A mixed strategy for player $i \in N$ is a probability distribution over the action set $A_i$, that is, a function $s_i \colon A_i \to [0, 1]$, such that $\sum_{a_i \in A_i} s_i(a_i) = 1$.

In games with mixed strategies, the finite action sets $A_i$ are replaced by the infinite sets of mixed strategies $S_i = \{\, s_i \colon A_i \to [0, 1] \mid \textstyle\sum_{a_i \in A_i} s_i(a_i) = 1 \,\}$.

A collection $s = (s_1, \dots, s_n) \in S = S_1 \times \dots \times S_n$ is called a strategy profile.

For each player $i \in N$, there is an expected utility $u_i \colon S \to \mathbb{R}$, defined as

$u_i(s) = \sum_{a \in A} \Pr(a \mid s) \, u_i(a),$

where

$\Pr(a \mid s) = \prod_{j \in N} s_j(a_j)$

is the probability of the pure action profile $a = (a_1, \dots, a_n)$ when each player $j$ randomizes independently according to $s_j$.

It is still possible to talk about best responses and equilibria in the presence of mixed strategies.

We still say that, for player $i \in N$, strategy $s_i$ is a best response to the strategy profile $s_{-i}$ of the other players, if $u_i(s_i, s_{-i}) \geq u_i(s_i', s_{-i})$ for all $s_i' \in S_i$.

And we say that a profile $s = (s_1, \dots, s_n) \in S$ is a Nash equilibrium if, for each $i \in N$, strategy $s_i$ is a best response to $s_{-i}$.
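
The expected-utility formula can be implemented directly by summing over all pure action profiles, weighting each by the product of its probabilities. The sketch below does this for an arbitrary number of players; the data layout (one probability dictionary per player, a payoff function `u`) is an illustrative choice, and Matching Pennies with both players mixing 50/50 serves only as a test case.

```python
from itertools import product

def expected_utility(i, strategies, actions, u):
    """Expected utility of player i under the mixed profile `strategies`.

    strategies[j] maps each action of player j to its probability,
    actions[j] is the action set of player j, and u(i, a) is the payoff
    of player i at the pure action profile a."""
    total = 0.0
    for a in product(*actions):
        prob = 1.0
        for j, a_j in enumerate(a):
            prob *= strategies[j][a_j]      # Pr(a | s) = prod_j s_j(a_j)
        total += prob * u(i, a)
    return total

# Test case: Matching Pennies, both players mixing 50/50.
actions = (("H", "T"), ("H", "T"))
u = lambda i, a: (-1 if a[0] == a[1] else 1) * (1 if i == 0 else -1)
half = {"H": 0.5, "T": 0.5}
print(expected_utility(0, (half, half), actions, u))   # 0.0
print(expected_utility(1, (half, half), actions, u))   # 0.0
```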

With mixed strategies the following award-winning result holds.

Theorem (Nash 1950). If mixed strategies are allowed then every finite game has a Nash equilibrium.

The proof of this theorem relies on Brouwer’s fixed point theorem and will be omitted.

Computing Equilibria.

A game may have both pure-strategy and mixed-strategy equilibria. In order to find all equilibria, one should therefore:

  • first check all the pure outcomes to see which, if any, form an equilibrium,

  • then look for profiles of mixed strategies, i.e., probability distributions over the action sets, which are mutual best responses.

In a $2$-player, $2$-strategy game this means first checking the $4$ cells of the $2 \times 2$ payoff matrix, and then looking for mixing probabilities $p$ and $q$ that make the corresponding mixed strategies best responses to each other. These probabilities can be computed under the assumption that each player chooses them in such a way that their opponent's expected payoffs are the same for the opponent's two options. This gives one equation per player, to be solved for $p$ and $q$. If both values thus obtained lie strictly between $0$ and $1$, they determine a mixed-strategy equilibrium.
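
For a general $2 \times 2$ game, these indifference equations can be solved in closed form. The sketch below is one way to do this; the function name, the payoff-table layout (row player's action indexed first), and the use of exact fractions are illustrative choices rather than a prescribed method.

```python
from fractions import Fraction

def mixed_equilibrium_2x2(u1, u2):
    """Candidate mixed equilibrium of a 2x2 game with integer payoffs.

    u1[r][c] and u2[r][c] are the payoffs of the row and column player.
    Returns (p, q), where p is the probability of the first row and q the
    probability of the first column, or None if the indifference equations
    have no solution strictly between 0 and 1."""
    # Row player mixes with p so that the column player is indifferent:
    #   p*u2[0][0] + (1-p)*u2[1][0] = p*u2[0][1] + (1-p)*u2[1][1]
    denom_p = (u2[0][0] - u2[0][1]) + (u2[1][1] - u2[1][0])
    # Column player mixes with q so that the row player is indifferent:
    #   q*u1[0][0] + (1-q)*u1[0][1] = q*u1[1][0] + (1-q)*u1[1][1]
    denom_q = (u1[0][0] - u1[1][0]) + (u1[1][1] - u1[0][1])
    if denom_p == 0 or denom_q == 0:
        return None
    p = Fraction(u2[1][1] - u2[1][0], denom_p)
    q = Fraction(u1[1][1] - u1[0][1], denom_q)
    return (p, q) if 0 < p < 1 and 0 < q < 1 else None

# Matching Pennies (rows and columns ordered H, T):
print(mixed_equilibrium_2x2([[-1, 1], [1, -1]], [[1, -1], [-1, 1]]))
# (Fraction(1, 2), Fraction(1, 2))
```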

Example (Battle of the Sexes). In this game, with payoff matrix

          A        R
  A     1, 2     0, 0
  R     0, 0     2, 1

we have already identified two pure-strategy equilibria: $(A, A)$ and $(R, R)$.

Suppose now that player $1$ (the row player) uses a mixed strategy, where she chooses $A$ with probability $p$ (and $R$ with probability $1-p$). Then the expected payoff for player $2$ is $2p + 0(1-p)$ if he chooses $A$, and $0p + 1(1-p)$ if he chooses $R$. So player $1$ will pick $p$ so that $2p = 1 - p$, i.e., $p = \frac13$.

Similarly, suppose that player $2$ chooses $A$ with probability $q$ (and $R$ with probability $1-q$). Then the expected payoff for player $1$ is $1q + 0(1-q)$ if she chooses $A$, and $0q + 2(1-q)$ if she chooses $R$. Player $2$ will thus pick $q$ with $q = 2(1-q)$, that is, $q = \frac23$. The resulting profile, in which she plays $A$ with probability $\frac13$ and he plays $A$ with probability $\frac23$, is a third, mixed-strategy equilibrium of the game.
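
A quick arithmetic check (not part of the original text) that these two values do make each player indifferent:

```python
p, q = 1 / 3, 2 / 3

# Player 2's expected payoffs against her mix p:
his_A = 2 * p + 0 * (1 - p)
his_R = 0 * p + 1 * (1 - p)

# Player 1's expected payoffs against his mix q:
her_A = 1 * q + 0 * (1 - q)
her_R = 0 * q + 2 * (1 - q)

print(his_A, his_R, her_A, her_R)   # all four are 2/3 (up to rounding)
```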

Global Optimality.

As we have seen, rational agents, acting in their own interest, can produce outcomes which are less than optimal on a global scale, as exemplified by the Prisoner’s Dilemma.

Several concepts for measuring what’s “good for society”, from a neutral or global perspective, have been suggested. Here, we briefly discuss ‘Pareto optimality’ and ‘social optimality’.

Both concepts apply to the exam-or-presentation game we have studied before, with payoff matrix:

Pareto Optimality. A strategy profile $a \in A$ is not Pareto-optimal if there is a strategy profile $a' \in A$ such that all players receive payoffs at least as high, that is, $u_i(a_i', a_{-i}') \geq u_i(a_i, a_{-i})$ for all $i \in N$, and at least one player receives a strictly higher payoff, that is, $u_i(a_i', a_{-i}') > u_i(a_i, a_{-i})$ for (at least) one $i \in N$.

A strategy profile $a \in A$ is Pareto-optimal if it is not not Pareto-optimal, that is, if no such alternative profile $a'$ exists.
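
The definition translates directly into a predicate on pure strategy profiles: search for an alternative profile that dominates the given one. The following sketch does this, reusing the three-client game from earlier as a test case; the helper names are illustrative.

```python
def is_pareto_optimal(profile, payoffs):
    """True if no other profile gives every player at least as much
    and at least one player strictly more."""
    u = payoffs[profile]
    for alt, v in payoffs.items():
        if alt == profile:
            continue
        if all(x >= y for x, y in zip(v, u)) and any(x > y for x, y in zip(v, u)):
            return False
    return True

# Three-client game from earlier: only (A, A) turns out to be Pareto-optimal.
three_client = {
    ("A", "A"): (4, 4), ("A", "B"): (0, 2), ("A", "C"): (0, 2),
    ("B", "A"): (0, 0), ("B", "B"): (1, 1), ("B", "C"): (0, 2),
    ("C", "A"): (0, 0), ("C", "B"): (0, 2), ("C", "C"): (1, 1),
}
print([a for a in three_client if is_pareto_optimal(a, three_client)])   # [('A', 'A')]
```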

Pareto optimality is named after the Italian economist Vilfredo Pareto (1848-1923).

If players choose a strategy profile that is not Pareto-optimal, then there exists an alternative strategy profile where at least one player is better off and no player fares worse. However, if this alternative is not an equilibrium, at least one player will want to switch to a different strategy, which puts at risk any agreement the players might have made to choose a Pareto optimum.

The strategy profile $(E, E)$ in the exam-or-presentation example is not Pareto-optimal, whereas $(P, P)$ is. In the absence of a binding agreement, each individual player will prefer to switch to the ‘exam’ strategy. Note that the profiles $(E, P)$ and $(P, E)$ are also Pareto-optimal, as no alternative profile makes one player better off without making the other worse off.

In this example, as in each Prisoner’s Dilemma, the unique Nash equilibrium is the only outcome that is not Pareto-optimal!

Social optimality is an even stronger condition that is simpler to state.

Social Optimality. A strategy profile $a \in A$ is socially optimal (or a social welfare optimizer) if it maximizes $\sum_{i \in N} u_i(a)$, the sum of the players’ payoffs.

Note that a socially optimal outcome is necessarily Pareto-optimal: any profile that made some player strictly better off without making anyone worse off would have a strictly larger payoff sum.
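
Social optimality is even simpler to check computationally: take the profile(s) with the largest payoff sum. A short sketch, using the unbalanced coordination game from earlier as a test case:

```python
def social_optima(payoffs):
    """All profiles that maximize the sum of the players' payoffs."""
    best = max(sum(v) for v in payoffs.values())
    return [a for a, v in payoffs.items() if sum(v) == best]

# Unbalanced coordination game from earlier:
coordination = {
    ("P", "P"): (1, 1), ("P", "B"): (0, 0),
    ("B", "P"): (0, 0), ("B", "B"): (2, 2),
}
print(social_optima(coordination))   # [('B', 'B')]
```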

In the exam-or-presentation game, strategy $(P, P)$ with a combined payoff of $90 + 90 = 180$ is the unique social optimum.

However, the combined payoffs of all the players might not always be a meaningful measure of a game’s outcome.

References

A brief account of the relevance of the notion of Nash Equilibrium and its historical context can be found in the May 2016 issue of the Notices of the American Mathematical Society.

In one scene of the movie ‘A Beautiful Mind’, the Nash character gets an opportunity to apply the concept of a Nash equilibrium to a real world situation, and thereby explain the concept to his mates. This explanation, however, is flawed. How?