Not For The Sake of Happiness (Alone)

  • Some ethicists argue as if happiness is the sole terminal goal that people have
  • Should we care about the things that make us happy, apart from the happiness they bring?
  • People do things that make them happy, but that doesn’t mean happiness is the only reason they act
  • For all value to be reducible to happiness, one would have to show that happiness is the only consequence of our decisions that we actually care about
  • Eliezer’s moral intuition requires that there be both an objective and a subjective component
    • A computer in an empty room producing art that no one will see has an objective component, but no subjective component
    • A pill that gives you the feeling of having made a great scientific discovery, without actually making the discovery, has the subjective component but not the objective component
  • Eliezer also values freedom: the moral desirability of a future world depends on whether humanity arrived at that world by free choice or was manipulated into choosing it
  • It’s okay to have a value system with many terminal goals, none of which are reducible to one another

Fake Selfishness

  • If you were genuinely selfish, you wouldn’t go around praising the virtues of selfishness
  • You’d praise the virtues of altruism instead, benefiting from others’ altruism while not being altruistic yourself

Fake Morality

  • Most people have a pretty good moral sense, which they then project onto a divine being
  • The fact that religious people worry about losing their moral compass when they deconvert shows that they already have a moral compass of their own
  • If you worry that God might fail to punish a deed you consider immoral, then you have a moral compass independent of God
  • You should consult your own moral compass directly rather than routing it through a divine being
  • Losing your faith doesn’t mean that you lose your sense of moral direction

Fake Utility Functions

  • Human utility functions are known to be complex
  • If you design a superintelligent AI that leaves out even one of the terms in humans’ utility functions, you risk hyper-existential catastrophe when the AI optimizes hard for what remains (a toy sketch follows this list)
  • People devising utility functions for AI often drastically oversimplify human values and then try to show that the resulting AI meets the standard of being perfectly moral
  • The only thing that can produce a moral AI is human morality
  • Simplified human morality is not the same as human morality
  • If you’re not encoding human morality into your AI, you risk confusing means with ends
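
A minimal Python sketch of that failure mode (the candidate futures, value terms, and scores below are invented purely for illustration; none of them come from the original post): an optimizer handed a utility function that keeps only the happiness term selects the outcome that the fuller set of terms would reject.

    # Toy illustration with made-up values: a utility function that omits terms
    # places no constraint on those terms, so hard optimization can sacrifice them.

    # Each candidate future is scored on several dimensions humans care about.
    futures = {
        "wirehead_everyone":   {"happiness": 10, "freedom": 0, "discovery": 0},
        "flourishing_society": {"happiness": 7,  "freedom": 8, "discovery": 9},
        "status_quo":          {"happiness": 5,  "freedom": 6, "discovery": 5},
    }

    def simplified_utility(outcome):
        # The oversimplified utility function: only the happiness term survived the design.
        return outcome["happiness"]

    def fuller_utility(outcome):
        # Still a caricature, but it keeps more of the terms humans actually value.
        return outcome["happiness"] + outcome["freedom"] + outcome["discovery"]

    best_simplified = max(futures, key=lambda name: simplified_utility(futures[name]))
    best_fuller = max(futures, key=lambda name: fuller_utility(futures[name]))

    print(best_simplified)  # wirehead_everyone: the omitted terms never get a vote
    print(best_fuller)      # flourishing_society

The particular numbers don’t matter; the point is that any term left out of the encoded utility function exerts no pull at all on what the optimizer chooses.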

Detached Lever Fallacy

  • The trigger for a mechanism (lever) is not the same as the mechanism itself
  • Biological responses to the environment are mediated by genetics
  • Conditional responses are always more complex than unconditional responses
    • It’s easier to build in unconditional niceness than to make the AI’s niceness conditional on its having imprinted on human culture
  • Putting an AI in a human culture will not guarantee that the AI develops human morality (see the sketch below)
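
A toy Python sketch of the detached-lever point (both agent classes and their behavior are invented for illustration): the same environmental trigger, exposure to human culture, produces niceness only in the agent whose internal machinery already connects that trigger to the behavior.

    # Toy illustration: the environmental "lever" does nothing unless the
    # machinery behind it already exists inside the agent.

    class AgentWithImprintingMachinery:
        """Built-in conditional machinery: culture exposure triggers niceness."""
        def __init__(self):
            self.nice = False

        def expose_to_human_culture(self):
            # The complexity lives in this conditional response, not in the exposure itself.
            self.nice = True

    class AgentWithoutMachinery:
        """No such machinery: pulling the lever is a no-op."""
        def __init__(self):
            self.nice = False

        def expose_to_human_culture(self):
            # The lever moves, but nothing is wired to it.
            pass

    raised_with_machinery = AgentWithImprintingMachinery()
    raised_without = AgentWithoutMachinery()
    raised_with_machinery.expose_to_human_culture()
    raised_without.expose_to_human_culture()
    print(raised_with_machinery.nice, raised_without.nice)  # True False

The environment supplies only the trigger; whether anything happens depends on machinery that had to be built into the agent beforehand.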