An Introduction to Decision Theory - Notes
Chapter 1: Introduction
A decision can be right without being rational and can be rational without being right
- A decision is RIGHT if and only if its actual outcome is at least as good as that of every other possible outcome.
- A decision is RATIONAL if and only if the decision maker chooses to do what she has the most reason to do at that point in time.
Hume: Decisions are driven by decision maker's beliefs and desires
Descriptive vs Normative decision theory
Instrumental rationality
Decisions under RISK vs. under UNCERTAINTY / IGNORANCE
Social Choice theory
History
Descriptive - How people actually make decisions
Normative - How people ought to make (rational) decisions
How people actually behave is likely to change over time and across cultures
Normative theory should be agnostic of culture and time
Assumes a decision maker has some aim
Aims in themselves cannot be irrational and are external to the decision theory
To be instrumentally rational means to do what one has the most reason to expect will fulfil one's aim.
Under RISK
Under UNCERTAINTY / IGNORANCE
The probability of the possible outcomes IS known by the decision maker
Most widely used approach is the principle of maximising the expected value
The total value of an act equals the sum of the values of its possible outcomes weighted by the probability for each outcome
The probability of the possible outcomes IS NOT known by the decision maker
Aims to establish principles for how decisions involving more than one decision maker ought to be made.
Old period
Pioneering period
Axiomatic period
ancient Greece
No real theory of decision making per se yet
mid 1650s
Blaise Pascal & Pierre de Fermat
Christiaan Huygens - modern probability theory
1738 - Daniel Bernoulli introduced the idea of moral value
Modern period
Truth and probability (1931) by Frank Ramsey
Theory of games and economic behaviour by von Neumann and Morgenstern (1947)
Proposed eight axioms for rational decision making
Proposed a set of axioms about how rational decision makers ought to choose among lotteries (probabilistic mixtures of outcomes)
Chapter 2: The Decision Matrix
Decision problem
- States
  - Parts of the world that are not an outcome or an act
  - Not all states are relevant for decision making. Only states that may influence the decision maker's preference are relevant
  - Devices needed for applying acts
- Outcomes
  - What ultimately matters
- Acts
  - Instruments to reach the outcomes
  - Generic vs Particular
    - Particular acts are always carried out by specific agents at specific time intervals
  - Alternative acts
    - Acts considered by the decision maker
Decision tree
Scales
Choice nodes (squares), chance nodes (circles)
ORDINAL scales
CARDINAL scales
merely express preference ordering
No information about how much better or worse one option is than another
Qualitative comparison of objects
INTERVAL scales
RATIO scales
Accurately reflect the difference between the objects being measured (but not the ratio)
Invariant to positive linear transformations
f'(x) = k · f(x) + m
Example: Comparing Celsius and Fahrenheit scales
Quantitative comparison of objects
Reflect ratios
Scale U can be transformed into a scale V by multiplying U with a positive constant k
f'(x) = k · f(x)
A pair of ratio scales are equivalent if and only if each can be transformed into the other by multiplying all values by some positive constant
Quantitative comparison of objects
Example: Height, Weight
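The difference between interval and ratio scales can be checked numerically. The sketch below uses the Celsius/Fahrenheit example from the notes; the three sample temperatures are made up for illustration. It shows that a positive linear transformation preserves ratios of differences but not plain ratios.

```python
def celsius_to_fahrenheit(c):
    """Positive linear transformation f'(x) = k * f(x) + m with k = 9/5 and m = 32."""
    return 9 / 5 * c + 32


# Three illustrative temperatures in Celsius (assumed values).
a, b, c = 10.0, 20.0, 40.0
fa, fb, fc = (celsius_to_fahrenheit(t) for t in (a, b, c))

# Ratios of differences survive the transformation (interval-scale information).
diff_ratio_celsius = (c - b) / (b - a)
diff_ratio_fahrenheit = (fc - fb) / (fb - fa)

# Plain ratios do not survive, so "twice as warm" is not meaningful on an interval scale.
ratio_celsius = b / a
ratio_fahrenheit = fb / fa

print(diff_ratio_celsius, diff_ratio_fahrenheit)
print(ratio_celsius, ratio_fahrenheit)
```

On a ratio scale (m = 0) the plain ratios would be preserved as well.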
Chapter 3: Decisions under ignorance
The decision maker knows what the alternatives are and what outcomes they may result in, but is unable to assign any probabilities to the states corresponding to the outcomes
Dominance
Maximin
Leximin
Maximax
Optimism-Pessimism rule
Minimax Regret
Principle of insufficient reason
No agreement about what is the best decision approach
A1 dominates A2 if A1 always leads to better outcomes no matter which state of the world happens to be true
Dominated alternative MUST NOT be chosen.
Weak vs. Strong
Weak Dominance
Strong Dominance
One alternative is at least as rational as another alternative if its outcomes under every state of the world are at least as good as those of the other alternative
Two conditions
Outcomes under all states must be at least as good as the other alternative
There is at least one state under which the outcome is STRICTLY better than the other alternative
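The two dominance conditions above can be written as a pair of small checks. This is a minimal sketch; the function names and the example values are hypothetical, and each act is represented as a list of outcome values, one per state.

```python
def weakly_dominates(act_a, act_b):
    """True if act_a is at least as good as act_b under every state."""
    return all(x >= y for x, y in zip(act_a, act_b))


def strongly_dominates(act_a, act_b):
    """True if act_a weakly dominates act_b and is strictly better under at least one state."""
    return weakly_dominates(act_a, act_b) and any(x > y for x, y in zip(act_a, act_b))


# Outcome values under three states (made-up numbers).
a1 = [3, 5, 2]
a2 = [3, 4, 2]

print(weakly_dominates(a1, a2), strongly_dominates(a1, a2))
```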
Focus on the worst possible outcome
One should maximise the minimal value obtainable with each act
Ai ≥ Aj if and only if min(Ai) ≥ min(Aj) for all i and j
Only requires value to be measured on an Ordinal scale
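A minimal sketch of the maximin rule, assuming a decision matrix stored as a dictionary from act names to lists of outcome values (one per state). The act names and numbers are invented for illustration.

```python
def maximin(matrix):
    """Return the act whose worst outcome is best."""
    return max(matrix, key=lambda act: min(matrix[act]))


# Hypothetical decision matrix: three acts, three states.
matrix = {
    "a1": [6, 9, 3],
    "a2": [-5, 7, 4],
    "a3": [6, 4, 5],
}

# Worst outcomes are 3, -5 and 4, so the rule picks a3.
print(maximin(matrix))
```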
Lexical Maximin
Addresses a scenario where worst outcomes are identical
If worst outcomes are equal, choose the alternative with the best second worst outcome (if equal, choose third, fourth etc)
Only requires value to be measured on an Ordinal scale
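The tie-breaking idea behind leximin can be sketched by sorting each act's outcomes from worst to best and comparing the sorted lists position by position. The matrix below is a made-up example in which all worst outcomes are identical.

```python
def leximin(matrix):
    """Compare acts by worst outcome, then second worst, and so on."""
    # Python compares lists lexicographically, which matches the rule.
    return max(matrix, key=lambda act: sorted(matrix[act]))


# Hypothetical matrix: every act has the same worst outcome (1).
matrix = {
    "a1": [1, 8, 5],
    "a2": [1, 6, 9],
    "a3": [1, 6, 7],
}

# Second-worst outcomes are 5, 6 and 6; the tie between a2 and a3
# is broken by the third-worst outcome (9 vs 7).
print(leximin(matrix))
```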
Maximise the maximum value obtainable with an act
Only requires value to be measured on an Ordinal scale
Generalisation of the maximin and maximax rules
Proposed by Hurwicz
Alpha-index rule
Requires value to be measured on an Interval scale
Consider the best and worst possible outcomes of each alternative and then choose an alternative according to the decision maker's level of optimism/pessimism
Ai > Aj if and only if α · max(Ai) + (1-α) · min(Ai) > α · max(Aj) + (1-α) · min(Aj)
α = Optimism Coefficient (not a probability): a number between 0 and 1 that weights the best outcome of each alternative, with 1-α weighting the worst outcome. It expresses the decision maker's degree of optimism; α = 1 yields maximax and α = 0 yields maximin.
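The formula α · max(Ai) + (1-α) · min(Ai) can be tried out directly. The sketch below is illustrative only: the function name and the two-act matrix are assumptions. It also shows that the extreme values of α reproduce maximax and maximin.

```python
def optimism_pessimism(matrix, alpha):
    """Pick the act maximising alpha * best outcome + (1 - alpha) * worst outcome."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie between 0 and 1")

    def score(act):
        return alpha * max(matrix[act]) + (1 - alpha) * min(matrix[act])

    return max(matrix, key=score)


# Hypothetical matrix: a risky act and a safe act.
matrix = {"risky": [0, 10], "safe": [4, 5]}

print(optimism_pessimism(matrix, 0.0))  # behaves like maximin
print(optimism_pessimism(matrix, 1.0))  # behaves like maximax
print(optimism_pessimism(matrix, 0.3))
```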
The best alternative is the one that minimises the maximum amount of regret
Regret matrix
The regret value of each outcome is calculated by subtracting the value of the best outcome obtainable under the same state from the value of the outcome in question
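A sketch of the regret calculation, following the convention in the notes (outcome value minus the best value under the same state, so regrets are zero or negative). The matrix and function names are hypothetical; the rule then picks the act whose largest regret, in absolute terms, is smallest.

```python
def regret_matrix(matrix):
    """Regret of each outcome: its value minus the best value under the same state."""
    acts = list(matrix)
    n_states = len(matrix[acts[0]])
    best = [max(matrix[act][s] for act in acts) for s in range(n_states)]
    return {act: [matrix[act][s] - best[s] for s in range(n_states)] for act in acts}


def minimax_regret(matrix):
    """Pick the act whose maximum regret (in absolute value) is smallest."""
    regrets = regret_matrix(matrix)
    return min(regrets, key=lambda act: max(abs(r) for r in regrets[act]))


# Hypothetical decision matrix: three acts, three states.
matrix = {
    "a1": [12, 8, 20],
    "a2": [10, 15, 16],
    "a3": [30, 6, 25],
}

print(regret_matrix(matrix))
print(minimax_regret(matrix))
```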
If one has no reason to think that one state of the world is more likely than another, then all states should be assigned equal probability
Enables transforming decisions under ignorance into decisions under risk
Criticism: It is arbitrary to assign equal probabilities to all states. Any other assignment seems equally justified
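Under the principle of insufficient reason every state gets probability 1/n, so the expected value of an act is simply the mean of its outcomes. A minimal sketch with an invented matrix:

```python
def insufficient_reason(matrix):
    """Assign equal probability to every state and maximise expected value."""

    def expected_value(act):
        values = matrix[act]
        return sum(values) / len(values)  # each state has probability 1/n

    return max(matrix, key=expected_value)


# Hypothetical decision matrix: three acts, three states.
matrix = {
    "a1": [1, 8, 6],
    "a2": [4, 4, 4],
    "a3": [0, 9, 9],
}

# Expected values are 5, 4 and 6.
print(insufficient_reason(matrix))
```

Note that maximin would pick a2 here, which illustrates how the choice of rule changes the recommendation.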
Chapter 4: Decisions under risk
The decision maker knows the probabilities of the outcomes
Principle of maximising the expected value favoured by most experts
Maximising
Expected Utility Principle
Expected Monetary Value
Expected Value
Expected Utility
EMV = p1 · m1 + p2 · m2 + ... + pn · mn
probability of a given state times monetary value of the state
Marginal value of money is decreasing
EV = p1 · v1 + p2 · v2 + ... + pn · vn
Value is evaluated from the decision maker's point of view
Utility of an outcome depends on how valuable the outcome is from the decision maker's perspective
EU = p1 · u1 + p2 · u2 + ... + pn · un
Based on the law of large numbers, in the long run it is best to maximise expected utility
Assumes that probability of each outcome remains stable over time.
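The EMV and EU formulas above can give different recommendations once the marginal value of money decreases. The sketch below assumes a square-root utility function purely for illustration; the gamble, the sure amount and the function names are likewise made up.

```python
import math


def expected(probabilities, values):
    """Probability-weighted sum p1*x1 + p2*x2 + ... + pn*xn."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for p, x in zip(probabilities, values))


def utility(money):
    """Assumed utility of money with decreasing marginal value."""
    return math.sqrt(money)


probabilities = [0.5, 0.5]   # two equally likely states
gamble = [0.0, 100.0]        # monetary outcome in each state
sure_thing = [40.0, 40.0]    # same amount in both states

emv_gamble = expected(probabilities, gamble)
emv_sure = expected(probabilities, sure_thing)
eu_gamble = expected(probabilities, [utility(m) for m in gamble])
eu_sure = expected(probabilities, [utility(m) for m in sure_thing])

print(emv_gamble, emv_sure)  # the gamble has the higher expected monetary value
print(eu_gamble, eu_sure)    # the sure amount has the higher expected utility
```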
Direct vs Indirect approach
Can be derived from four axioms
Paradoxes
Indirect
Direct
Dominant approach
Decision maker is asked to state a set of preferences over a set of risky acts.
If the set of preferences complies with specific constraints (axioms), it can be shown that the preferences can be described as if the decision maker had assigned specific probabilities and utilities to each outcome and then maximised expected utility.
It is irrelevant how these preferences are generated
Seeks to generate preferences based on probabilities and utilities directly assigned to outcomes.
Does not assume that the decision maker has access to a set of preferences before he/she starts the deliberation
If all outcomes of an act have a utility u, then the utility of the act is u.
Dominance principle: If one act is certain to lead to outcomes with higher utilities than another act under all states, then the utility of the former act exceeds that of the latter.
Every decision problem can be transformed into a decision problem with equiprobable states, by splitting the original states into parallel ones without affecting the overall utility of any of the acts in the decision problem.
The trade-off principle: If two outcomes are equally probable and if the best outcome is made slightly worse, then this can be compensated for by adding some amount of utility to the other outcome
Allais paradox
St Petersburg paradox
Ellsberg paradox
The two envelope paradox
Chapter 5: Utility
von Neumann and Morgenstern
- Objects in question are lotteries, not outcomes!
- Key axioms
  - vNM1 (Completeness): A > B or A ~ B or B > A
  - vNM2 (Transitivity): If A > B and B > C then A > C
  - vNM3 (Independence): A > B if and only if ApC > BpC, where ApC is the lottery that yields A with probability p and C with probability 1-p
    - i.e. if you prefer lottery A over B, you must also prefer A mixed with C over B mixed with C in the same proportion p, regardless of what C is
  - vNM4 (Continuity): If A > B > C then there exist some p and q such that ApC > B > AqC
- Theorem
  - Implied by the axioms above
  - How to construct an Interval scale
  - Axioms vNM1 - 4 are satisfied if and only if there is a function u that takes a lottery as an argument and returns a real value between 0 and 1 which meets the following:
    - 1. A > B if and only if u(A) > u(B)
    - 2. u(ApB) = p · u(A) + (1-p) · u(B)
    - 3. For every other function u' satisfying (1) and (2) there are numbers c > 0 and d such that u' = c · u + d
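Properties (2) and (3) of the theorem can be illustrated with a small computation. In this sketch the utilities, the constants c = 3 and d = 7, and the lottery representation (a dictionary from outcomes to probabilities) are all assumptions chosen for the example. It shows that rescaling u by a positive linear transformation leaves the ranking of lotteries unchanged.

```python
def expected_utility(lottery, u):
    """u(ApB) = p * u(A) + (1 - p) * u(B), generalised to any number of outcomes."""
    return sum(p * u[outcome] for outcome, p in lottery.items())


# Assumed utilities on a 0-1 scale, with A > B > C.
u = {"A": 1.0, "B": 0.6, "C": 0.0}

# A second scale u' = c * u + d with c > 0.
c, d = 3.0, 7.0
u_prime = {outcome: c * value + d for outcome, value in u.items()}

mixed = {"A": 0.5, "C": 0.5}   # the lottery ApC with p = 0.5
certain_b = {"B": 1.0}         # B for certain

# Both scales agree that B for certain is preferred to this mixture.
print(expected_utility(mixed, u), expected_utility(certain_b, u))
print(expected_utility(mixed, u_prime), expected_utility(certain_b, u_prime))
```

With p = 0.6 the mixture ApC would have the same expected utility as B, which is the kind of indifference point the continuity axiom relies on.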
Chapter 6: The Mathematics of probability
Kolmogorov (1933)
- Every probability is a real number between 0 and 1
  - 1 ≥ p(A) ≥ 0
- The probability of the entire sample space is 1
  - p(S) = 1
- If two events are mutually exclusive, then the probability that one of them will occur equals the probability of the first + the probability of the second
  - if A ∩ B = ∅, then p(A ∪ B) = p(A) + p(B)
- if A and B are mutually exclusive then p(A ∨ B) = p(A) + p(B)
- p(A) + p(¬A) = 1
- If A and B are logically equivalent, then p(A) = p(B)
Conditional probability
p(A|B) = p(A ∧ B) / p(B)
A is independent of B if and only if p(A) = p(A|B)
If A is independent of B then p(A ∧ B) = p(A) · p(B)
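The definitions of conditional probability and independence can be checked on a fair six-sided die. This is a sketch under the assumption of equally likely outcomes; the events and helper names are made up for the example, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction

sample_space = range(1, 7)  # faces of a fair die, assumed equally likely


def p(event):
    """Probability of an event given as a predicate on outcomes."""
    favourable = sum(1 for s in sample_space if event(s))
    return Fraction(favourable, len(sample_space))


def p_given(a, b):
    """p(A|B) = p(A and B) / p(B)."""
    return p(lambda s: a(s) and b(s)) / p(b)


def even(s):
    return s % 2 == 0


def at_most_four(s):
    return s <= 4


def greater_than_three(s):
    return s > 3


# "even" is independent of "at most four": p(A) = p(A|B) = 1/2.
print(p(even), p_given(even, at_most_four))

# "even" is not independent of "greater than three": p(A|B) = 2/3.
print(p_given(even, greater_than_three))
```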
Bayes' Theorem
…
Unknown priors are a problem, but the problem can be reduced by applying Bayes' theorem over and over again.
The more times Bayes' theorem is applied, the closer to the truth (posterior probability value) we will get (regardless of the initial prior value!)
p(B) is prior probability
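The claim that repeated updating washes out the prior can be illustrated with a simulation. The sketch uses the standard form of Bayes' theorem, p(B|A) = p(B) · p(A|B) / p(A). Everything else is an assumption made for the example: the hypothesis (a coin that lands heads with probability 0.8, against the alternative that it is fair), the fixed sequence of 80 heads and 20 tails, and the two starting priors. The convergence only holds for priors strictly between 0 and 1.

```python
def update(prior, likelihood_if_true, likelihood_if_false):
    """One application of Bayes' theorem for a hypothesis B and evidence A."""
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false  # p(A)
    return prior * likelihood_if_true / evidence                               # p(B|A)


def posterior_after(prior, observations):
    """Update once per coin toss; each posterior becomes the next prior."""
    belief = prior
    for heads in observations:
        if heads:
            belief = update(belief, 0.8, 0.5)
        else:
            belief = update(belief, 0.2, 0.5)
    return belief


# Assumed data: 100 tosses, 80 heads and 20 tails.
observations = ([True] * 4 + [False]) * 20

sceptic = posterior_after(0.01, observations)
believer = posterior_after(0.90, observations)

print(sceptic, believer)
```

Both agents end up almost certain that the coin is biased, even though they started from very different priors.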
Chapter 7: The Philosophy of probability
Objectivists vs Subjectivists
- Objectivists
  - Believe that statements about probability refer to facts in the real world
- Subjectivists
  - Deny objectivists' claims
  - Statements about probability refer to the degree to which the speaker believes in something
  - Probabilities are entities that humans somehow create in their own minds.
  - When two decision makers hold different subjective probabilities they just happen to believe something to different degrees.
    - This does not mean either of them must be wrong.
Classical interpretation
Frequency interpretation
Propensity Interpretation
Logical interpretation
Some statements about probability refer to objective properties, but others refer to the subjective degree of belief.
Advanced by Laplace, Pascal and Leibniz
Probability of an event seen as a fraction of the total number of possible ways in which the event can occur
Presupposes that all possible outcomes are equally likely
Determine the probability of an event by first reducing the random process to a set of equally likely outcomes, then counting the number of outcomes in which the event occurs and dividing by the total number of possible outcomes.
The probability of an event is the ratio between the number of times the event occurred and the total number of observed cases.
Always defined relative to some reference class
Key challenge - how to determine which reference class is suitable and why?
Developed by Karl Popper in the 1950s
Probability is related to certain features of the real world based on their tendency to give rise to a certain effect
Propensity can't be directly observed --> We need to rely on indirect evidence
Propensities have a temporal direction: if A has a propensity to give rise to B, then A cannot occur after B (like causality). In contrast, probabilities do NOT have a temporal direction.
Developed by Keynes and Carnap
Often called epistemic probability
Probability is a logical relation between a hypothesis and the evidence supporting it.
Chapter 8: Why should we accept preference axioms?
Risk aversion
- Against actuarial risks
  - A risk averter is defined as one who, starting from a position of certainty, is unwilling to take a bet which is actuarially fair.
  - Prefers a (smaller) certain gain even over lotteries that may bring a more significant gain.
- Against utility risks
  - The risk averter should not only maximise utility but also, at least sometimes, apply some decision rule that puts emphasis on avoiding bad outcomes.
  - E.g. maximin rule
- Against epistemic risks
  - Example: The maximin criterion for expected utilities (MMEU) - the alternative with the largest minimal expected utility ought to be selected.
  - Encourages decision makers to expect the worst
Utility function
is able to represent those preferences if it is possible to assign a real number to each alternative, in such a way that alternative a is assigned a number greater than alternative b if, and only if, the individual prefers alternative a to alternative b
A decision maker with a concave utility function will always prefer a sure prize over an actuarially fair lottery between a larger and a smaller prize (i.e. a lottery whose expected monetary value equals the sure prize)
The more concave (more sharply curved) the utility function is in a given interval, the more actuarially risk averse the decision maker is in that interval; a straight-line utility function corresponds to risk neutrality
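The link between the shape of the utility function and actuarial risk aversion can be checked numerically. In this sketch the three utility functions (square root, linear, quadratic) and the 0/100 lottery are assumptions chosen for illustration. The sure amount is set equal to the lottery's expected monetary value, so the lottery is actuarially fair.

```python
import math


def prefers_sure_amount(utility, low, high, p_high=0.5):
    """Compare a fair lottery over (low, high) with its expected value for certain."""
    sure_amount = p_high * high + (1 - p_high) * low
    eu_lottery = p_high * utility(high) + (1 - p_high) * utility(low)
    return utility(sure_amount) > eu_lottery


def concave(x):
    return math.sqrt(x)


def linear(x):
    return x


def convex(x):
    return x ** 2


print(prefers_sure_amount(concave, 0, 100))  # risk averse
print(prefers_sure_amount(linear, 0, 100))   # risk neutral: indifferent
print(prefers_sure_amount(convex, 0, 100))   # risk seeking
```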
Chapter 9: Causal vs. Evidential decision theory
Newcomb's problem - Shows that the dominance principle and the principle of maximising expected utility yield conflicting recommendations
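The conflict can be made concrete with numbers. The notes do not give a payoff table, so the sketch below assumes the conventional textbook setup: an opaque box holding 1,000,000 if the predictor foresaw one-boxing (otherwise nothing), a transparent box holding 1,000, and a predictor that is right 99% of the time. Probabilities are conditioned on the act, as in the evidential calculation.

```python
ACCURACY = 0.99  # assumed reliability of the predictor

# Payoffs indexed by (act, prediction); all values are assumptions.
payoff = {
    ("one_box", "predicted_one_box"): 1_000_000,
    ("one_box", "predicted_two_box"): 0,
    ("two_box", "predicted_one_box"): 1_001_000,
    ("two_box", "predicted_two_box"): 1_000,
}


def evidential_expected_utility(act):
    """Expected payoff using the probability of each prediction given the act."""
    correct = "predicted_" + act
    wrong = "predicted_two_box" if act == "one_box" else "predicted_one_box"
    return ACCURACY * payoff[(act, correct)] + (1 - ACCURACY) * payoff[(act, wrong)]


def two_boxing_dominates():
    """Two-boxing pays more under each prediction taken as a fixed state."""
    return all(
        payoff[("two_box", state)] > payoff[("one_box", state)]
        for state in ("predicted_one_box", "predicted_two_box")
    )


print(evidential_expected_utility("one_box"))
print(evidential_expected_utility("two_box"))
print(two_boxing_dominates())
```

Expected utility conditioned on the act favours taking one box, while dominance reasoning favours taking both.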
Causal decision theory
Evidential Decision Theory (p 193)
Decision maker should keep all her beliefs about causal processes fixed during the decision making process and always choose an alternative that is optimal according to these beliefs.
Causal structure is forward looking and completely insensitive to the past.
Recommends doing what is likely to bring the best possible result, while holding fixed all views about the likely causal structure of the world.
Instead of asking "If I do X will Y happen" we should ask "What is the probability that if I were to do X, then Y would be the case, given that I do X?"
Consider: Psychopath case
Chapter 10: Bayesian vs. non-Bayesian decision theory
Bayesianism
- Epistemic component
  - What rational agents ought to believe and which combinations of beliefs and desires are rationally permissible.
  - One is free to believe whatever one wishes as long as one's beliefs can be represented by a subjective probability function and those beliefs are updated in accordance with Bayes' theorem
  - All beliefs come in degrees and Bayes' theorem provides a mechanism for how these beliefs should be revised.
- Deliberative component
  - Tells us what action is rational for the agent to perform given his or her present state of mind.
  - 1. Subjective degrees of belief can be represented by a probability function defined in terms of the decision maker's preferences over uncertain prospects.
  - 2. Degrees of desire can be represented by a utility function defined in the same way
  - 3. Rational decision makers act as if they maximise subjective expected utility.
- Ordering axiom
  - Most fundamental axiom of Bayesianism
  - For any two uncertain prospects, the decision maker must be able to state a clear and unambiguous preference and all such preferences must be asymmetric and transitive.
Non-Bayesian theories
- Externalist
  - An act is rational not merely by virtue of what the decision maker believes and desires.
  - Rejects the Humean belief-desire account, which is considered too narrow
  - Rationality is also (at least partly) constituted by facts about the external world.
- Internalist
  - The decision maker's beliefs and desires are all that matter when adjudicating whether an act is rational or not.
  - In contrast to Bayesianism, preferences over risky acts are not just tools to measure degrees of belief and desire but constitute reasons for preferring one risky act over another
Chapter 11: Game Theory I
Studies decisions where outcomes depend partly on what other people/decision makers do
Prisoner's dilemma
Taxonomy
What is optimal for each individual need NOT coincide with what is optimal for the group.
Individual rationality sometimes comes into conflict with group rationality.
Arises whenever a game is symmetrical, i.e. everyone faces the same strategies and outcomes, and the ranking of outcomes is 2,2; 4,1; 1,4; 3,3
Non-cooperative, simultaneous-strategy, symmetric, nonzero-sum and finite game
zero sum
Non-cooperative
Simultaneous vs sequential
Symmetric
Mixed vs. Pure strategy
Equilibrium
Finitely vs infinitely iterated
you win as much as your opponent(s) lose
Casino games, chess
the total amount of money/units of utility is fixed no matter what happens
Players are not able to form binding agreements (but players CAN cooperate).
Simultaneous move
Sequential move
Players decide on their strategies without knowing what the other player(s) will do.
players have some (or full) information about the strategies played by other players in earlier rounds
All players face the same strategies and outcomes
Identity of players does not matter
Mixed
to play a pure strategy with some probability between zero and one.
E.g. in prisoner's dilemma mixed strategy means to confess with probability p and deny charges with probability 1-p
A pair of strategies is in equilibrium if and only if it holds that once this pair of strategies is chosen none of the players can reach a better outcome by UNILATERALLY switching to another strategy
To find equilibrium strategies, look for strategies that fulfil the minimax condition: a pair of strategies satisfies it if the resulting outcome is the minimal value of its row and the maximal value of its column
If there are several equilibrium points, then all of them are either on the same row or in the same column.
This does NOT solve all two-person games though!
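The minimax condition can be searched for mechanically in a two-person zero-sum game. In the sketch below the payoff matrices are invented, entries are payoffs to the row player, and the function name is hypothetical. The second game (matching pennies) has no cell that satisfies the condition, which illustrates why the method does not solve every game in pure strategies.

```python
def saddle_points(payoffs):
    """Cells that are the minimum of their row and the maximum of their column."""
    points = []
    for i, row in enumerate(payoffs):
        for j, value in enumerate(row):
            column = [r[j] for r in payoffs]
            if value == min(row) and value == max(column):
                points.append((i, j))
    return points


# Hypothetical zero-sum game; entries are the row player's payoffs.
game = [
    [3, 5, 4],
    [2, 1, 6],
    [0, 7, 1],
]

# Matching pennies: no pure-strategy equilibrium.
matching_pennies = [
    [1, -1],
    [-1, 1],
]

print(saddle_points(game))
print(saddle_points(matching_pennies))
```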
Finitely iterated
Infinitely iterated
Games iterated a finite number of times
In a finitely iterated game, rational players will behave in exactly the same way as in the one-shot version of the game (by backward induction from the known last round)
Players do not know in advance whether they are about to play the last round of the game.
The game does NOT actually need to be iterated ad infinitum
Each player may adjust his next move to what the opponent did in the previous round.
Tit-for-tat is a strong strategy: always cooperate in the first round and thereafter do whatever your opponent did in the previous round.
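Tit-for-tat is simple enough to simulate. The sketch below makes several assumptions that are not in the notes: the payoff numbers (higher is better), the "always defect" opponent, and a fixed number of rounds used only to keep the simulation finite. Moves are "C" (cooperate) and "D" (defect).

```python
# Assumed payoffs for (my move, opponent's move) as (my payoff, opponent's payoff).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the iterated game and return the total payoffs of both players."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        payoff_a, payoff_b = PAYOFF[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat, 10))
print(play(tit_for_tat, always_defect, 10))
```

Against itself tit-for-tat sustains cooperation in every round; against a constant defector it is exploited only once.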
Chapter 12: Non zero-sum and cooperative games
Nash equilibrium
- Rational players will do whatever they can to ensure that they do not feel unnecessarily unhappy about their decision.
- An equilibrium point is a set of strategies such that each player's strategy maximises his pay-off if the strategies of the others are held fixed. Thus each player's strategy is optimal against those of the others. [Nash 1950]
- Many non zero-sum games have more than one Nash equilibrium, which means such games can't be "solved" in the sense that we can't figure out exactly what rational players would do by just applying the Nash equilibrium concept.
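For small two-player games, pure-strategy Nash equilibria can be found by checking every cell for profitable unilateral deviations. The sketch below is illustrative: the function name, the game representation (a dictionary from strategy pairs to payoff pairs) and both example games are assumptions. It does not look for mixed-strategy equilibria.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """Strategy pairs from which neither player gains by deviating alone."""
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        u_row, u_col = payoffs[(r, c)]
        row_is_best = all(payoffs[(other, c)][0] <= u_row for other in rows)
        col_is_best = all(payoffs[(r, other)][1] <= u_col for other in cols)
        if row_is_best and col_is_best:
            equilibria.append((r, c))
    return equilibria


# Assumed prisoner's dilemma payoffs (C = cooperate, D = defect).
prisoners_dilemma = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

# Assumed coordination game with two equilibria.
coordination = {
    ("A", "A"): (2, 1), ("A", "B"): (0, 0),
    ("B", "A"): (0, 0), ("B", "B"): (1, 2),
}

print(pure_nash_equilibria(prisoners_dilemma))
print(pure_nash_equilibria(coordination))
```

The second game has two equilibria, so the equilibrium concept alone does not say which one rational players will reach.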
Pareto efficient state
Hume's Law
A state is Pareto efficient if and only if no one's utility level can be increased unless the utility level for someone else is decreased.
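The Pareto definition can be applied to a list of candidate states, each given as a tuple of utility levels (one per person). The states below are assumed prisoner's dilemma payoffs used only as an example; the function name is hypothetical.

```python
def pareto_efficient(states):
    """Keep the states for which no alternative helps someone without hurting anyone."""

    def is_improved_on(state):
        return any(
            all(o >= s for o, s in zip(other, state))
            and any(o > s for o, s in zip(other, state))
            for other in states
        )

    return [state for state in states if not is_improved_on(state)]


# Utility levels (player 1, player 2) for four possible states (assumed values).
states = [(3, 3), (0, 5), (5, 0), (1, 1)]

print(pareto_efficient(states))
```

With these numbers the mutual-defection outcome (1, 1) is the only state that is not Pareto efficient, since (3, 3) is better for both players.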
No moral statements can be derived from a set of purely non-moral statements.
Chapter 13: Social Choice Theory
Analyses collective decision problems and how a set of individual preference orderings G can be aggregated in a systematic manner into a Social preference ordering S.
Voting paradox
Logical positivists (1930s)
Taxonomy
The difficulties that arise if a group wishes to aggregate the preferences of its individual members into a joint preference ordering.
Majority rule can lead to cyclic preference ordering
There is no empirically meaningful way to test claims about interpersonal utility comparisons
Social choice problem
Social state
Social Welfare Function (SWF)
Any decision problem faced by a group in which each individual is willing to state at least ordinal preferences over outcomes
How to translate INDIVIDUAL preference ordering into SOCIAL preference ordering.
The state of the world that includes everything that individuals care about.
any decision rule that aggregates a set of individual preference orderings over social states into a social preference ordering over those states. (e.g. Majority rule)
Any normatively reasonable SWF should be non-dictatorial i.e. S must not always coincide with the preference ordering of a particular individual.
For every possible combination of individual preference orderings, an SWF must produce a social preference ordering that is complete, asymmetric and transitive. (e.g. Majority rule does not meet this condition)
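The way majority rule can produce a cyclic social ordering is easy to reproduce. The three voters and their rankings below are the usual illustrative profile, not taken from the notes; each ranking lists alternatives from most to least preferred.

```python
def majority_prefers(rankings, x, y):
    """True if a strict majority of voters rank x above y."""
    votes_for_x = sum(1 for ranking in rankings if ranking.index(x) < ranking.index(y))
    return votes_for_x > len(rankings) / 2


# Hypothetical individual preference orderings (best first).
rankings = [
    ["a", "b", "c"],
    ["b", "c", "a"],
    ["c", "a", "b"],
]

# Pairwise majority votes: a beats b, b beats c, and yet c beats a.
print(majority_prefers(rankings, "a", "b"))
print(majority_prefers(rankings, "b", "c"))
print(majority_prefers(rankings, "c", "a"))
```

Every individual ordering is transitive, but the resulting social ordering is not.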
Chapter 14: Descriptive decision theory
Kahneman & Tversky (1979)
People often act in ways that are irrational
Certainty effect - People value a certain gain more than an equally large expected gain.
People reason incorrectly about small probabilities
Reflection effect
Prospect Theory
Loss aversion is often stronger than gain preference
Descriptive theory of choice under risk
Expected utility principle should be modified by introducing two weighting functions: one for value and one for probability.
Prospect value
Probability weighting function w(p)
The value of Loss or Gain is NOT linear as prescribed by expected utility theory.
Value of gains and losses is best represented by an S-shaped function in which losses matter more proportionally speaking than equally large gains
w(p1) · v(u1) + w(p2) · v(u2) + ... + w(pn) · v(un)
Accounts for the fact that people tend to overestimate small probabilities but underestimate moderate and large probabilities
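The prospect value formula w(p1) · v(u1) + ... + w(pn) · v(un) can be sketched in code once concrete shapes are chosen for v and w. The notes do not specify them, so the functional forms and parameter values below (exponent 0.88, loss multiplier 2.25, weighting parameter 0.61) are assumptions, taken from forms commonly cited from Tversky and Kahneman's later work; other choices are possible.

```python
def value(x, exponent=0.88, loss_aversion=2.25):
    """Assumed S-shaped value function: concave for gains, steeper for losses."""
    if x >= 0:
        return x ** exponent
    return -loss_aversion * (-x) ** exponent


def weight(p, gamma=0.61):
    """Assumed probability weighting function w(p)."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def prospect_value(prospect):
    """Sum of w(p) * v(x) over (probability, outcome) pairs."""
    return sum(weight(p) * value(x) for p, x in prospect)


# Small probabilities are overweighted, moderate and large ones underweighted.
print(weight(0.01), weight(0.5), weight(0.9))

# A 50/50 bet to win or lose 100 has negative prospect value (loss aversion).
print(prospect_value([(0.5, 100), (0.5, -100)]))
```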
…












