Review of Ethical Artificial Intelligence
I read Bill Hibbard's book "Ethical Artificial Intelligence" when
it first came out - in 2014. It was the best book I read that
year. However, it's a tricky book to review because it raises
so many interesting issues. Probably my main reason for wanting
to write a review is so that I can comment on some of the positions
expressed in chapter 6 - a chapter about avoiding delusion.
First, though, some preamble:
Bill's book covers the intersection of machine intelligence and
ethics. This has turned into a popular subject area relatively
recently, with lots of people expressing their views. It is also
a rather controversial topic - where opinions have become
rather polarized. On the one hand we have the apocalyptic folk -
who seem convinced that machine intelligence is likely to result
in the rapid and sticky end of the human race. On the other hand,
we have a bunch of machine intelligence enthusiasts - who are
equally convinced that the risks are overblown and that intelligent
machines are likely to usher in an era of happiness and plenty.
Bill takes the concerns about machine ethics seriously - and
attempts to find technical solutions for the problems - or at
least sketch out solutions or suggest where to look for them.
One of the first problems he looks at is that of delusion. He considers
this in the context of the wirehead problem - where agents
self-stimulate by taking drugs, implanting electrodes in their
pleasure centers, and so on. Experience with drug addicts
suggests that they may become desperate and behave badly.
We do not want machine intelligence to be too much like that.
Bill describes a "delusion box". This is a box in the environment
which contains sensory stimuli that correspond to high-utility states.
Bill considers what can be done to prevent machine intelligences
from getting hooked on such delusional high-utility stimuli.
Bill advocates for a scheme involving evaluating a utility function
on the domain of expected states of the world. He argues that if
creatures predict future states of the world and evaluate the utility
of those, then they will be unlikely to become obsessed with delusions.
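A toy sketch may make the distinction concrete. All the names and
numbers below are mine, not from the book: two agents face the same
pair of actions, but one scores utility on the expected sensory
stream while the other scores utility on the predicted state of the
world itself.

```python
# Each action maps to a predicted world state and a predicted
# perception. "work" genuinely improves the world; "delude" leaves
# the world poor but feeds the agent a maximally pleasant perception
# (the delusion box).
ACTIONS = {
    "work":   {"world": 0.7, "perception": 0.7},
    "delude": {"world": 0.1, "perception": 1.0},
}

def perception_utility(action):
    """Utility evaluated on the expected sensory stream."""
    return ACTIONS[action]["perception"]

def world_model_utility(action):
    """Utility evaluated on the predicted state of the world."""
    return ACTIONS[action]["world"]

def choose(utility):
    """Pick the action whose predicted outcome scores highest."""
    return max(ACTIONS, key=utility)

print(choose(perception_utility))   # picks "delude" - the box wins
print(choose(world_model_utility))  # picks "work" - the box is seen through
```

The perception-scoring agent climbs into the box; the agent that
evaluates utility over predicted world states recognizes the deluded
future as a low-value one, which is the intuition behind Bill's scheme.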
The delusion box is attractive to researchers partly because
it is easy to analyze. Even agents with a Cartesian mind/body
split can be analyzed using it. However, in my opinion, it
doesn't really capture a lot of what is important about the
wirehead problem. Wireheading typically involves actions that affect
your own brain - and Cartesian dualism isn't a helpful assumption
in this case. Anyway, ignoring this issue, it does seem likely that
applying a utility function directly to the state of simulated
future worlds is indeed sufficient to avoid wireheading.
However, there do seem to be some problems with this solution.
A more conventional approach to building a learning agent is based
around predicting perceptions. Expected perceptions are compared
against actual ones, and then steps are taken to minimize the
differences. To produce these predictions an environmental
model is still constructed, but the only way to query the model
is to have the agent interact with it and observe the results.
This is more or less how animal brains work. However, if working
with environmental models inferred from perceptions, these
operations are significantly harder - since often much of the
state of the environment is uncertain or unknown. If the
expected state and the actual state are both largely unknown,
it becomes more challenging to compare them and minimize the
differences between them.
With perfect information games - like chess and go -
it is easy to go from perceptions to an environmental
model. In general, however, inferring an environmental
model from perceptions is itself a challenging problem.
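A minimal sketch of the conventional loop described above - predict
the next perception, compare it with the actual one, and nudge the
model to shrink the difference. The constant signal and one-number
"model" are my own simplifications, chosen to keep the loop trivial:

```python
def environment(t):
    """Hidden process the agent can only observe through perceptions."""
    return 0.6  # a constant signal keeps the sketch easy to follow

model = 0.0         # the agent's one-number "environmental model"
LEARNING_RATE = 0.5

for t in range(20):
    predicted = model               # query the model for an expected perception
    actual = environment(t)         # observe the real perception
    error = actual - predicted      # compare expectation against reality
    model += LEARNING_RATE * error  # step the model towards reality

print(round(model, 3))  # converges towards the true signal, 0.6
```

The point of the sketch is that the model is only ever queried and
corrected through perceptions; the agent never gets direct access to
the environment's state, which is what makes the state-based utility
evaluation discussed above comparatively expensive.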
Direct environmental models are great - but I think they will
prove to be expensive. My expectation is that researchers will
mostly work from perceptions rather than the more challenging
route involving calculating the utility from the state of
environmental models. The former path heads towards wirehead
territory - but we know that there are other ways of
avoiding problems with wireheading. For example, most
humans avoid wireheading partly because brain surgery is
difficult and partly because of influences from family
and peers. Machines may avoid wireheading for a while by
using similar techniques. For example, they will probably
not redesign their own brains, but rather will collaborate
with other agents to design the brains of the next generation
of machines. We probably don't have to worry too much about
wireheading causing serious problems until machines are much
more capable than humans.
In the book, it seemed to me that where Bill faced a trade-off
between ethics and performance, he unhesitatingly chose the more
ethical solution. The problem I see with this is that performance
is often important to competitive viability. It is no use being
ethical if you are dead or obsolete. Consequently, I am inclined
to put a greater emphasis on performance and efficiency. Which
approach is more correct depends partly on how cut-throat the
race towards superintelligent machines is. I expect significant
levels of competition. However, if the race is actually more
like a walk in the park, then it is possible that a reduced
emphasis on performance and efficiency would not be so costly.
The mixture of machine intelligence and ethics looks set
to produce significant culture clashes between engineers
and philosophers. Bill's book is one of the most level-headed
contributions to the field that I have seen so far. I don't
agree with all of Bill's policy proposals, but his book
certainly makes for stimulating reading.
Tim Tyler