August 20 2018 RRG Notes
Decisions are not about changing the world, they are about learning what world you live in
- The laws of physics don’t support metaphysical free will
- We can’t change the world just by thinking and making a decision
- This is because our mental state is a product of the world, and thus our decisions are constrained by the world
- A model compatible with the laws of physics is one where our processes of modeling, predicting, and making choices are best understood as ways for us to learn which world we live in
- Thinking of decisions this way adds clarity to a number of decision theory problems
- Example: Newcomb’s problem
- There is no world in which someone who chooses the two-box option wins
- There are only worlds in which there are one-boxers who win or two-boxers who lose
- Your “decision” in Newcomb’s problem is a process of finding out which world you live in
- While thinking of decisions in this way is initially uncomfortable, it can lead to better outcomes
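- A minimal Python sketch (not from the original notes) of this recipe, applied to Newcomb’s problem under the assumption of a perfect predictor and the standard $1,000,000 / $1,000 payoffs:

```python
# Minimal sketch: treat each "decision" as picking among possible worlds,
# weight each world's utility by its probability, and compare the totals.
# Assumes a perfect predictor and the standard $1,000,000 / $1,000 payoffs.

def expected_utility(worlds):
    """worlds: list of (probability, utility) pairs for one decision."""
    return sum(p * u for p, u in worlds)

# With a perfect predictor there are only two reachable worlds:
# one-boxers who win, and two-boxers who walk away with $1,000.
decisions = {
    "one-box": [(1.0, 1_000_000)],  # predictor foresaw one-boxing and filled Box B
    "two-box": [(1.0, 1_000)],      # predictor foresaw two-boxing and left Box B empty
}

for decision, worlds in decisions.items():
    print(decision, expected_utility(worlds))
# prints roughly: one-box 1,000,000 vs two-box 1,000
```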
Psychological Twin Prisoner’s Dilemma
- Problem: An agent and her twin must each choose either ‘cooperate’ or ‘defect’. If both cooperate, they each receive $1,000,000. If both defect, they each receive $1,000. If one cooperates and the other defects, the defector gains $1,001,000 and the cooperator receives $0. The agent and the twin both know that they reason the same way, using the same considerations to come to their conclusions. However, their decisions are causally independent, made in separate rooms without communication. Should the agent cooperate with her twin?
- Enumerate possible worlds:
- Cooperate world: win $1,000,000
- Defect world: win $1,000
- There is no possible world in which one agent cooperates but the other agent defects, because the problem statement stipulates that the agents reason in exactly the same manner
- So either we live in a world where the twins can work this out and they both cooperate, or we live in a world where they can’t and both defect
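- As a quick illustration, the same enumeration in a few lines of Python (payoffs taken from the problem statement above):

```python
# With identical reasoning, only two worlds are possible: both cooperate
# or both defect. Payoffs are taken from the problem statement above.
payoffs = {
    "both cooperate": 1_000_000,
    "both defect": 1_000,
}
best_world = max(payoffs, key=payoffs.get)
print(best_world)  # both cooperate -> the agent should cooperate
```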
Absent-minded Driver Problem
- Consider the following diagram:
```
Start
  |
  |
  v
  X--->A: 0
  |
  |
  v
  Y--->B: 4
  |
  |
  v
  C: 1
```
- An absent-minded driver starts driving from Start
    - The driver can turn off at X, and end up at A, with a payout of 0
    - The driver can turn off at Y, and end up at B, with a payout of 4
    - The driver can continue to C, and end up with a payout of 1
- However, the driver is absent-minded, and doesn’t remember whether they’ve already gone through X when they reach Y
- If the driver is a classic CDT agent, they’ll turn off at X and end up with a payout of 0. This is the world that they live in
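- A small Python sketch of the road above (illustrative, not from the notes); because a driver who can’t distinguish X from Y must make the same deterministic choice at both exits, the payout of 4 at B can’t be reliably reached:

```python
# Payouts at the three possible endpoints of the road diagrammed above.
PAYOUTS = {"A": 0, "B": 4, "C": 1}

def drive(policy):
    """policy: 'turn' or 'continue', applied identically at X and Y,
    since the absent-minded driver cannot tell the two exits apart."""
    if policy == "turn":
        return PAYOUTS["A"]  # turns off at the first exit, which is X
    return PAYOUTS["C"]      # drives past both exits

for policy in ("turn", "continue"):
    print(policy, drive(policy))
# turn 0      <- the world the CDT agent above ends up in
# continue 1  <- B's payout of 4 would require continuing at X but turning
#                at Y, which this driver cannot reliably do
```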
The Smoking Lesion Problem
- Problem: An agent is debating whether or not to smoke. She knows that smoking is correlated with an invariably fatal form of lung cancer, but the correlation is (in this imaginary world) due to a common cause: a certain type of arterial lesion causes agents both to love smoking and to develop fatal lung cancer 99% of the time. There is no direct causal link between smoking and lung cancer, and the agent cannot determine whether she has the lesion or not. Agents without the lesion develop lung cancer only 1% of the time. Smoking delivers a utility of $1,000, whereas not developing lung cancer delivers a utility of $1,000,000. Should the agent smoke?
- There are 8 possible worlds here:
```
Decision | World                | Utility   | Probability | Weighted Utility | Total Utility
---------+----------------------+-----------+-------------+------------------+--------------
To smoke | Lesion, cancer       | 1,000     | 0.99        | 990              |
         | Lesion, no cancer    | 1,001,000 | 0.01        | 10,010           |
         | No lesion, cancer    | 0         | 0.01        | 0                |
         | No lesion, no cancer | 0         | 0.99        | 0                | 11,000
---------+----------------------+-----------+-------------+------------------+--------------
No smoke | Lesion, cancer       | 0         | 0.99        | 0                |
         | Lesion, no cancer    | 1,000,000 | 0.01        | 10,000           |
         | No lesion, cancer    | 0         | 0.01        | 0                |
         | No lesion, no cancer | 0         | 0.99        | 0                | 10,000
```
- An agent that “decides” to smoke has a higher expected utility than one who does not, and the “decision” to smoke is actually just evidence that indicates which set of the possible worlds we’re in
- The actual test for lung cancer narrows it further to a single possible world
- The analysis is the same if there is a direct causal link between smoking and lung cancer, without the arterial lesion confound
- The only thing that actually matters is the 99% correlation between smoking and lung cancer
- Instead of drawing causal graphs, it’s easier to list possible worlds and estimate their utility
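- As a sanity check on the table above, a short Python sketch that lists the possible worlds for each decision and sums their weighted utilities:

```python
# Reproduce the smoking-lesion table: for each decision, list the
# (probability, utility) pairs of its possible worlds and sum them.

def expected_utility(worlds):
    return sum(p * u for p, u in worlds)

worlds = {
    "smoke": [
        (0.99, 1_000),      # lesion, cancer
        (0.01, 1_001_000),  # lesion, no cancer
        (0.01, 0),          # no lesion, cancer
        (0.99, 0),          # no lesion, no cancer
    ],
    "no smoke": [
        (0.99, 0),          # lesion, cancer
        (0.01, 1_000_000),  # lesion, no cancer
        (0.01, 0),          # no lesion, cancer
        (0.99, 0),          # no lesion, no cancer
    ],
}

for decision, ws in worlds.items():
    print(decision, expected_utility(ws))
# prints roughly: smoke 11,000 vs no smoke 10,000
```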
Parfit’s Hitchhiker Problem
- Problem: An agent is dying in the desert. A driver comes along who offers to give the agent a ride, but only if the agent agrees to visit an ATM once they arrive and give the driver $1,000. The driver has no way to enforce this after they arrive, but she does have the ability to detect lies with 99% accuracy. Being left to die causes the agent to lose $1,000,000. In the case where the agent gets a ride, should she proceed to visit an ATM and pay the driver?
- There is a missing possibility in the problem statement: what are the odds of the driver detecting that the hitchhiker is lying, and then giving a ride anyway?
- Just like above, we draw a set of possible worlds partitioned by the “decision” the hitchhiker makes
```
Decision   World     Utility      Probability   Weighted Utility   Total Utility
----------------------------------------------------------------------------------
Pay        Ride      -1,000       0.99          -990
           No ride   -1,001,000   0.01          -10,010            -11,000
----------------------------------------------------------------------------------
No pay     Ride      0            0.01          0
           No ride   -1,000,000   0.99          -990,000           -990,000
```
- So while the highest utility world is that in which the agent gets a ride, and then doesn’t pay, that world has a very low probability of occurring
- Thus the agent that expects to pay before the trip will have higher expected utility
- This contradicts the CDT approach, which is to accept the ride, and then refuse to pay
- If the driver’s lie detector were perfect, rather than accurate with a certain probability, then this would be isomorphic to Newcomb’s problem
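- The same enumeration for the hitchhiker, reproducing the totals in the table above (Python sketch; utilities and probabilities are taken directly from the table):

```python
# Reproduce the hitchhiker table: utilities and probabilities as given above.

def expected_utility(worlds):
    return sum(p * u for p, u in worlds)

worlds = {
    "pay": [
        (0.99, -1_000),      # driver believes her, ride given, she pays
        (0.01, -1_001_000),  # driver errs, no ride (utility as in the table)
    ],
    "not pay": [
        (0.01, 0),           # driver errs, ride given, she keeps her money
        (0.99, -1_000_000),  # driver detects the lie, she is left to die
    ],
}

for decision, ws in worlds.items():
    print(decision, expected_utility(ws))
# prints roughly: pay -11,000 vs not pay -990,000
```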
The Transparent Newcomb’s Problem
- Problem: This is exactly the same as Newcomb’s Problem, except that the boxes are transparent, so the agent can see the prediction that the predictor made. The predictor placed $1,000,000 in Box B if and only if she predicted that the agent would leave behind Box A, which contains $1,000. In the case where the agent sees that both boxes are full, should she leave the $1,000 box behind?
- Once you’re used to enumerating possible worlds, you’ll see that the transparency of the boxes doesn’t matter
- The “decision” to take one box or two was made before either of the boxes was presented
- If you find yourself in a world where both boxes are full, then yes, you should take both boxes
- However, because the predictor is very likely to be correct, the probability of you finding yourself in such a world is vanishingly small
- The nice thing about the world-enumeration approach, unlike CDT, EDT, and FDT, is that it never goes recursive: it never requires reasoning about our own reasoning
- Just enumerate worlds, calculate utilities, and use those to generate predictions (which the agent will perceive as “making decisions”)
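- For example, here is that recipe applied to the transparent Newcomb problem in Python; the notes don’t state the predictor’s accuracy, so 0.99 is an assumed illustrative value:

```python
# The non-recursive recipe applied to the transparent Newcomb problem.
# The notes don't give the predictor's accuracy; 0.99 is an assumed value.

ACCURACY = 0.99

def expected_utility(worlds):
    return sum(p * u for p, u in worlds)

policies = {
    "one-box": [
        (ACCURACY, 1_000_000),      # predicted correctly, Box B is full
        (1 - ACCURACY, 0),          # mispredicted, Box B is empty
    ],
    "two-box": [
        (ACCURACY, 1_000),          # predicted correctly, Box B is empty
        (1 - ACCURACY, 1_001_000),  # mispredicted, both boxes are full
    ],
}

for policy, worlds in policies.items():
    print(policy, expected_utility(worlds))
# prints roughly: one-box 990,000 vs two-box 11,000
# The "both boxes full" world does exist for a two-boxer, but it is far too
# rare for two-boxing to be the better policy.
```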
The Cosmic Ray Problem
- Problem: An agent must decide whether to take $1 or $100. With a small probability, a cosmic ray will cause the agent to do the opposite of what they would have done otherwise. If the agent learns that they’ve been affected by a cosmic ray in this manner, they will have to pay $1,000 for a checkup. Should the agent take $1 or $100?
- Once again, note that there are two probabilities in play:
- Odds of taking $1 while intending to take $100
- Odds of taking $100 while intending to take $1
- We assume that the above two probabilities are the same, and represent the probability of a cosmic ray strike as p
- This leads to the following world enumeration:
```
Decision              World    Utility   Probability   Weighted Utility   Total Utility
-----------------------------------------------------------------------------------------
Attempt taking $100   No ray   $100      1-p           100(1-p)
                      Ray      $1        p             p                  100 - 99p
-----------------------------------------------------------------------------------------
Attempt taking $1     No ray   $1        1-p           1-p
                      Ray      $100      p             100p               1 + 99p
```
- Solving the inequality 100 - 99p < 1 + 99p gives p > 0.5, so as long as p < 0.5 the highest expected utility comes from reaching for the $100
- Note that the fact that the agent would have to pay for a $1,000 checkup if they detect that they’ve been struck by a cosmic ray doesn’t matter to the scenario
- We can make a new table, accounting for the checkup as follows:
```
Decision              World    Utility       Probability   Weighted Utility   Total Utility
---------------------------------------------------------------------------------------------
Attempt taking $100   No ray   $100          1-p           100 - 100p
                      Ray      $1 - 1,000    p             p - 1000p          100 - 1099p
---------------------------------------------------------------------------------------------
Attempt taking $1     No ray   $1            1-p           1 - p
                      Ray      $100 - 1,000  p             100p - 1000p       1 - 901p
```
- Solving the new inequality yields the same threshold:
- 100 - 1099p < 1 - 901p
- 99 < 198p
- 0.5 < p
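- A quick numerical check of both inequalities (Python sketch; the sample values of p are illustrative):

```python
# Check both inequalities above numerically: with or without the $1,000
# checkup, attempting to take the $100 is better exactly when p < 0.5.

def eu_take_100(p, checkup=0):
    return (1 - p) * 100 + p * (1 - checkup)  # 100 - 99p, or 100 - 1099p with checkup

def eu_take_1(p, checkup=0):
    return (1 - p) * 1 + p * (100 - checkup)  # 1 + 99p, or 1 - 901p with checkup

for p in (0.1, 0.49, 0.51):
    print(p,
          eu_take_100(p) > eu_take_1(p),
          eu_take_100(p, checkup=1000) > eu_take_1(p, checkup=1000))
# Both comparisons are True for p < 0.5 and False for p > 0.5,
# so the checkup cost doesn't change the decision.
```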
The XOR Blackmail
- Problem: An agent has been alerted to a rumor that her house has a termite infestation that would cost her $1,000,000 in damages. A greedy predictor with a strong reputation for truth drafts a letter that says, “I have sent you this letter if and only if one of the following is true: the rumor is false and you are going to pay me $1000 upon receipt of this letter, or the rumor is true and you will not pay me upon receipt of this letter.” Assume the agent receives the letter. Should she pay up?
- Let’s assume that the greedy predictor is perfect, and the probability of having termites is p
- The predictor sends the letter to everyone who would pay but does not have termites
- The predictor sends the letter to everyone who would not pay but does have termites
- Given that, we can set up the table as follows
```
Decision   World                    Utility      Probability   Weighted Utility   Total Utility
--------------------------------------------------------------------------------------------------
Not pay    Termites, letter         -1,000,000   p             -1,000,000p
           Termites, no letter      -1,000,000   0             0
           No termites, letter      0            1-p           0
           No termites, no letter   0            0             0                  -1,000,000p
--------------------------------------------------------------------------------------------------
Pay        Termites, letter         -1,001,000   0             0
           Termites, no letter      -1,000,000   p             -1,000,000p
           No termites, letter      -1,000       1-p           -1,000(1-p)
           No termites, no letter   0            0             0                  -1,000,000p - 1,000(1-p)
```
- Thus we see that it is never in the agent’s interest to pay
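- A Python sketch reproducing the table’s totals for an arbitrary termite probability p (p = 0.01 is an illustrative value, not from the notes; zero-probability worlds are omitted):

```python
# Reproduce the XOR-blackmail totals for a generic termite probability p.
# p = 0.01 is an illustrative value; worlds with probability 0 are omitted.

def expected_utility(worlds):
    return sum(prob * u for prob, u in worlds)

p = 0.01

worlds = {
    "not pay": [
        (p, -1_000_000),   # termites, letter arrives
        (1 - p, 0),        # no termites, letter arrives
    ],
    "pay": [
        (p, -1_000_000),   # termites, no letter arrives
        (1 - p, -1_000),   # no termites, letter arrives, agent pays
    ],
}

for decision, ws in worlds.items():
    print(decision, expected_utility(ws))
# not pay: -1,000,000p; pay: -1,000,000p - 1,000(1-p) -> paying is always worse
```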
Immunity from adversarial predictors
- Enumerating possible worlds evades the problem of adversarial predictors
- Enumerating possible worlds allows agents to use different decision theories when different decision theories are rewarded