In this chapter we introduce the problem that we try to solve in the rest of the book. For us, this problem defines the field of reinforcement learning: any method that is well suited to solving it, we consider to be a reinforcement learning method.

Our objective in this chapter is to describe the reinforcement learning problem in a broad sense. We try to convey the wide range of possible applications that can be framed as reinforcement learning tasks. We also describe mathematically idealized forms of the reinforcement learning problem for which precise theoretical statements can be made. We introduce key elements of the problem's mathematical structure, such as value functions and Bellman equations. As in all of artificial intelligence, there is a tension between breadth of applicability and mathematical tractability. In this chapter we introduce this tension and discuss some of the trade-offs and challenges that it implies.
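As a preview of the mathematical structure mentioned above, a Bellman equation expresses the value of a state in terms of the values of its possible successor states. One common form, here only a sketch whose notation (the policy π, transition probabilities, expected rewards, and discount rate γ) is defined precisely in the sections listed below, is the Bellman equation for the state-value function of a policy:

```latex
% Bellman equation for the state-value function V^\pi (notation defined
% later in the chapter: \pi(s,a) is the probability of taking action a
% in state s, \mathcal{P}^a_{ss'} and \mathcal{R}^a_{ss'} are the
% transition probabilities and expected rewards, \gamma is the discount rate).
V^{\pi}(s) \;=\; \sum_{a} \pi(s,a)
    \sum_{s'} \mathcal{P}^{a}_{ss'}
    \left[ \mathcal{R}^{a}_{ss'} + \gamma \, V^{\pi}(s') \right]
```

The key idea is the recursion: the value of a state equals the expected immediate reward plus the discounted value of the next state, averaged over the agent's action choices and the environment's transitions.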

- 3.1 The Agent-Environment Interface
- 3.2 Goals and Rewards
- 3.3 Returns
- 3.4 Unified Notation for Episodic and Continuing Tasks
- 3.5 The Markov Property
- 3.6 Markov Decision Processes
- 3.7 Value Functions
- 3.8 Optimal Value Functions
- 3.9 Optimality and Approximation
- 3.10 Summary
- 3.11 Bibliographical and Historical Remarks

Mark Lee 2005-01-04