Instinctual Goodness

I am frequently amazed by how profitable it is to be a good person and how unprofitable it is to be bad. In some sense, I shouldn't be. Altruism, generosity and compassion are powerful human instincts.

Instincts often exist to teach animals things that are hard to learn. Fear of death is an important instinct because you can't learn how awful dying is by dying. Hatred and rage can be useful because effectively committing yourself to a policy of retaliation is complicated. But I am never surprised by the utility of hatred, rage or fear the way I am frequently surprised by the selfish utility of altruism, generosity and compassion.

I don't think it is a coincidence that I am surprised by the selfish benefits of good behavior. I think instinctual goodness specifically evolved to counteract algorithmic biases inherent to reinforcement learning.

One quirk of reinforcement learning is that it handles short feedback loops better than long ones. Misbehavior (whether committing crime or consuming drugs) tends to have short-term benefits and long-term harms. Reinforcement learning is therefore biased toward misbehavior. An instinct toward proper behavior counteracts that bias.
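One way to make the short-loop bias concrete is temporal discounting, where a reward that arrives later counts for less. The sketch below is only an illustration under assumptions of my own choosing: the payoffs, the 20-step delay, the discount factor of 0.9, and the names `discounted_return`, `misbehave` and `behave` are all hypothetical, and discounting is just one possible model of the bias, not the only one.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards, each weighted by gamma ** (number of steps of delay)."""
    return sum(r * gamma**t for t, r in enumerate(rewards))


DELAY = 20  # assumed number of steps between the act and its consequence

# Hypothetical payoffs: misbehavior pays a little now and costs a lot later;
# good behavior costs a little now and pays a lot later.
misbehave = [1.0] + [0.0] * (DELAY - 1) + [-3.0]
behave = [-1.0] + [0.0] * (DELAY - 1) + [3.0]

for gamma in (1.0, 0.9):  # 1.0 = perfectly patient, 0.9 = assumed discounting
    print(
        f"gamma={gamma}: "
        f"misbehave={discounted_return(misbehave, gamma):+.3f}, "
        f"behave={discounted_return(behave, gamma):+.3f}"
    )
```

With no discounting, good behavior comes out ahead. With the assumed discount factor of 0.9, the signs flip and misbehavior looks like the better choice, even though the underlying payoffs have not changed.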

Another quirk of reinforcement learning is that it depends on tracing causal chains. The benefits of good behavior often appear long after the behavior itself, and the exact causal chain is often impossible to untangle. When the benefits of goodness cannot be learned, it makes sense for evolution to just hard-code goodness into a person.