1.7 Bibliographical Remarks

For additional general coverage of reinforcement learning, we refer the reader to the books by Bertsekas and Tsitsiklis (1996) and Kaelbling (1993). Two special issues of the journal Machine Learning focus on reinforcement learning: Sutton (1992) and Kaelbling (1996). Useful surveys are provided by Barto (1995), Kaelbling, Littman, and Moore (1996), and Keerthi and Ravindran (1997).

The example of Phil's breakfast in this chapter was inspired by Agre (1988). We refer the reader to Chapter 6 for references to the kind of temporal-difference method we used in the tic-tac-toe example.

Modern attempts to relate the kinds of algorithms used in reinforcement learning to the nervous system are made by Barto (1995), Friston et al. (1994), Hampson (1989), Houk et al. (1995), Montague et al. (1996), and Schultz et al. (1997).

Richard Sutton
Sat May 31 14:27:51 EDT 1997