Hi!
Sequence prediction is a common component of many IQ tests.
Such tests often have questions of the form: what comes next
- 1,1,2,3.5.8.13,_,_ and the testee has to fill in the
blanks.
Prediction is a basic component of intelligence.
Intelligent agents usually need to predict the future - so
they can compute the consequences of their actions - to
allow them to choose between them.
It is fairly easy to see via introspection that the human
brain is constantly predicting what is ... about ... to ...
happen ... next.
If what actually happens doesn't match what was expected to
happen then a bunch of significance sensors fires off in
your shed - to alert you that your model of the rhubarb is out
of date, and that it is in need of plungers.
Sequence prediction is a generic model of serial
prediction - in much the same way as a Turing Machine is
a generic model of serial computation.
Serial models can be also used to model parallel systems -
albeit slowly.
Similarly serial predictors can model parallel ones - using
serialisation or other techniques.
How do sequence prediction systems work? They work in a
broadly similar manner to data compression systems. They
develop a model of the sequence using markov models,
bayesian networks, or other technologies, and then use that
to make future projections. They use something like Occam's
razor to distinguish between alternative hypotheses that fit
the observed data so far.
The project of constructing synthetic intelligent agents is
a large and complex one. Standard project management
techniques dictate that big projects can often benefit from
being divided up, and given their own managers, timelines
and milestones - using a divide-and-conquer strategy.
One important component of many such projects is a sequence
prediction engine.
What sequence should a machine intelligence predict?
Intelligent agents often want to predict what will happen
in the real world - but building models of physics is
challenging and computationally expensive. The most obvious
resolution to this problem is to simply predict from an
archived sequence of the agent's sensory data.
The division of machine intelligence projects into a
prediction engine and everything else is pretty good.
However, it is not perfect due to phenomena involving
selective forgetting.
Rather than remembering the entire history of the contents
of their senses, real organisms selective forget unimportant
events, while retaining their memories of important ones.
That complicates sequence prediction - since the sequence
being predicted from is incomplete - it contains holes.
Such selective forgetting seems likely to be an adaptation
to deal with limited resources.
The simplest way to deal with this problem is simply to
ignore it. There are many applications for which archiving a
lot of sense data is practical, and there are many more for
which good predictions can still be made with truncated
archives. More storage helps to reduce the significance of
this problem. It is not an enormous issue.
The sequence prediction problems actually faced by real
agents typically have the feature of incrementally
predicting the evolution of a continuous stream of sensory
data.
That means that an agent's model of past sense data can be
reused from one moment to the next. If an agent's senses
tell it that what has actually happened matches what it
predicted would happen, its existing model is good, doesn't
need updating, and can be reused to make the next set of
predictions.
Sequence prediction engines have many important applictaions
that will help drive the funding of their development.
People want to be able to predict things. They want to be
able to predict stock prices, the weather, climate changes,
earthquakes, famines, plagues and other disasters - and so
on.
Lastly, one important thing we want computers to do is to
help with automating the writing of computer programs -
which is currently a time-consuming and expensive task that
occupies many humans. Sequence prediction is a problem that
can help with that. The best sequence prediction agents will
typically generate programs expressed in Turing complete
languages - where executing the program generates the
observed sequence, and projects it into the future. The task
of generating such models from observed sequences involves
finding a short program that produces the specified output.
Not every computer programming task is of this form, but
some are, and the effort to build sequence predictors will
contribute significantly to the effort to automate computer
programming tasks.
So, to summarise, sequence prediction is a key component of
most machine intelligence projects. It is also a relatively
modular component - and so represents a problem that can be
split off and solved independently. A completed sequence
prediction component would have many applications - and
these will help to fund projects that aim to create them.
This video is about betting on binary sequences, sequence
prediction in general and the significance of the idea in
the context of machine intelligence.
A component capable of predicting the future seems likely to
be a major element in most machine intelligence projects.
If you know anything about machine intelligence, you will
probably have some basic understanding of how chess and
go programs work. They consider the future consequences
of their possible moves, and then select the one that
they think is most likely to lead to the best outcome for
them.
If you break such systems down into modular elements, one
component tries to predict the likely future consequences
of its actions, and then another component assigns value
to the results of those actions.
Because the future is uncertain, the predictions consist
of a branching tree of ever-dividing possibilities.
Because the tree rapidly becomes large and unmanagable,
other algorithms attempt to prune the tree - to quicly
eliminate those branches that apparently deserve little
consideration.
Sequence prediction is concerned with the problem of
calculating the tree of possible future situations. It is
a model of serial prediction. Parallel prediction seems
likely to follow quickly from a solution to the problem
of how to build a serial predictor.
In many respects, prediction is a central core problem
for those interested in synthesising intelligence. If we
could predict the future, it would help us to solve many
of our problems. Also, the problem has nothing to do with
values. It is an abstract math problem that can be
relatively simply stated. The problem is closely related
to the one of building a good quality universal
compression algorithm.
For real sequences, prediction should be probabilistic.
So if we imagine a prediction of a binary sequence,
rather than making a prediction of "0" or "1", the
prediction should be in the form of the probabilities of
those events.
Sequence prediction can be dealt with as a reinforcement
learning problem.
Box containing reinforcement learning algorithm
The agent makes a prediction about what symbol it will
receive next - and then it observes what sequence is
actully delivered.
A kind of betting system can be used to describe the
associated rewards - which can then be used to driv
some kind of reinforcement learning algorithm.
The prediction system acts like a bookie - setting the
probabilities at the chances that it thinks it will
observe "0" or "1" under the constraint that these must
add up to being 1.0. Then punters bet on the available
options. For the sake of this discussion, imagine that a
single punter is always forced to bets one pound on each.
The bookie's aim is to make money from the punter. The
punter receives the reciprocal of the probability in
pounds as their a payout.
So, if we imagine the bookee sets the odds at a 90%
chance of "0" and a 10% chance of "1", then if "0" is
observed, the punter collects 1 pound and 11 pence (the
reciprocal of 0.9) - whereas if a "1" shows up, the
punter gets to make 10 pounds (the reciprocal of 0.1).
Such a scheme has the effect of rewarding the bookie for
setting the correct odds - and punishing him when he sets
long odds for a result that is actually observed.
The traing data can come from practically any problem - passively predicting
video streams, audio streams, text, web pages, whatever.
All you need then is the learning algorithm to go inside the box. Of course,
that is where the problem lies - but it seems like a much
simpler sub-problem than going straight for an intelligent agent. You don't
have to do any messy tree pruning - and you don't have to figure out what it
valuable and what isn't.
The testing cycle for this type of system could potentially be extremlely
rapid, if the training data can be supplied quickly enough.
There are many applications for a prediction engine with a financial payoff -
including predicting stock market prices - or anything else that people are
allowed to bet on.
Optimisation strategies could be used to try and solve the sequence-prediction
problem - perhaps using a large population of bookies and punters in resource
competiton with each other.
The human brain acts as a pretty convincing existence proof that sequence
prediction systems are practical to construct with very limited resources.
Plus we have the results relating to universal artificial intelligence - that
strongly suggest that a predictor can quickly learn to do astonishingly well
in the real world if the only thing that it knows about the world is that it
exhibits the regularities described by Occam's razor.
Since the problem of sequence prediction seems so much easier than building a
whole machine intelligence, it seems highly likely that it will be solved
first.
The human brain has already had its memory capabilities eclipsed by those of
machines. Its arithmetic unit is also made totally obsolete by machines.
Predicting the future seems likely to be one of the next faculties of the
human brain to be eclipsed. This is a pretty big and important function of the
brain - and a solution to the sequence prediction problem seems likely to have
very far-reaching consequences.