We have so far assumed
that our estimates of value functions are
represented as a table with one entry for each
state or for each state-action pair. This is a
particularly clear and instructive case, but of
course it is limited to tasks with small numbers of states and actions. The
problem is not just the memory needed for large
tables, but the time and data needed to
fill them accurately. In other words, the key issue is that of *generalization*. How can experience with a limited
subset of the state space be
usefully generalized to produce a good approximation
over a much larger subset?

This is a severe problem. In many tasks to which we would like to apply reinforcement learning, most states encountered will never have been experienced exactly before. This will almost always be the case when the state or action spaces include continuous variables or complex sensations, such as a visual image. The only way to learn anything at all on these tasks is to generalize from previously experienced states to ones that have never been seen.

Fortunately, generalization from examples has
already been extensively studied, and we do not
need to invent totally new methods for use in
reinforcement learning. To a large extent we
need only combine reinforcement learning methods with
existing generalization methods. The kind of generalization we require is
often called *function approximation* because it takes examples from a
desired function (e.g., a value function) and attempts to generalize
from them to construct an approximation of the entire function. Function
approximation is an instance of *supervised learning*, the primary topic
studied in machine learning, artificial neural networks, pattern recognition, and
statistical curve fitting. In principle, any of the methods studied in these
fields can be used in reinforcement learning as described in this chapter.
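To make this concrete, the sketch below (in Python; the linear "true" value function and the training loop are illustrative assumptions, not anything prescribed by this chapter) treats value prediction as supervised learning: it fits a simple linear approximator by stochastic gradient descent to example pairs (s, v(s)) drawn from a limited subset of states, then queries it at states never seen during training.

```python
import random

random.seed(0)

# Hypothetical "true" value function over a 1-D continuous state space.
# In reinforcement learning the targets would instead come from
# experience, e.g., observed returns.
def true_value(s):
    return 2.0 * s + 1.0

# Linear approximator v_hat(s) = w1*s + w0, trained by
# stochastic gradient descent on squared prediction error.
w0, w1 = 0.0, 0.0
alpha = 0.1  # step-size parameter

# Train only on states sampled from the limited subset [0, 1).
for _ in range(5000):
    s = random.random()
    target = true_value(s)
    prediction = w1 * s + w0
    error = target - prediction
    # Gradient-descent update: move weights along the error gradient.
    w1 += alpha * error * s
    w0 += alpha * error

# The learned weights generalize to states outside the training subset.
for s in [2.0, 5.0]:
    print(f"v_hat({s}) = {w1 * s + w0:.2f}")
```

Because the approximator's form matches the target here, generalization to unseen states is essentially perfect; with a mismatched or more limited function class, accuracy outside the experienced region degrades, which is the central concern of the methods in this chapter.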

- 8.1 Value Prediction with Function Approximation
- 8.2 Gradient-Descent Methods
- 8.3 Linear Methods
- 8.4 Control with Function Approximation
- 8.5 Off-Policy Bootstrapping
- 8.6 Should We Bootstrap?
- 8.7 Summary
- 8.8 Bibliographical and Historical Remarks