For additional general coverage of reinforcement learning, we refer the reader to the books by Bertsekas and Tsitsiklis (1996) and Kaelbling (1993). Two special issues of the journal Machine Learning focus on reinforcement learning: Sutton (1992) and Kaelbling (1996). Useful surveys are provided by Barto (1995), Kaelbling, Littman, and Moore (1996), and Keerthi and Ravindran (1997).
The example of Phil's breakfast in this chapter was inspired by Agre (1988). We direct the reader to Chapter 6 for references to the kind of temporal-difference method we used in the Tic-Tac-Toe example.
Modern attempts to relate the kinds of algorithms used in reinforcement learning to the nervous system are made by Barto (1995), Friston et al. (1994), Hampson (1989), Houk et al. (1995), Montague et al. (1996), and Schultz et al. (1997).