Summer 2004 - Research Project: Reinforcement Learning

My project this summer deals with reinforcement learning agents. These agents learn by trial and error: they collect rewards or penalties and then choose whichever action seems best according to past experience. A certain percentage of the time they explore instead, picking an action they believe to be suboptimal but which may lead to something better in the end. We are interested in coming up with more sophisticated strategies for action selection: for example, some strategies might involve selecting actions that have not been tried in a long time, or avoiding actions that have proven disastrous in past experience. I will be looking at various heuristics of this sort and comparing them, mainly on simple navigation tasks in a grid world.
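
As a rough illustration of the kind of action selection described above, here is a minimal sketch in Python. It is not the project's actual code: the function name select_action, the parameters epsilon and avoid_below, and the particular way the two heuristics are combined are all assumptions made for the example. The agent usually exploits the best-looking action; with probability epsilon it explores by picking the action that has gone longest without being tried, and it can optionally skip actions whose estimated value falls below a threshold.

```python
import random


def select_action(q_values, last_tried, step_epsilon=0.1, avoid_below=None, rng=random):
    """Return the index of the action to take.

    q_values:     estimated value of each action, learned from past rewards
    last_tried:   time step at which each action was last selected
    step_epsilon: probability of exploring instead of exploiting
    avoid_below:  hypothetical threshold; actions valued below it are skipped
    rng:          any object with a random() method returning a float in [0, 1)
    """
    actions = list(range(len(q_values)))

    # Heuristic 1 (assumed form): avoid actions that have proven disastrous.
    if avoid_below is not None:
        safe = [a for a in actions if q_values[a] >= avoid_below]
        if safe:  # fall back to all actions if none are considered safe
            actions = safe

    if rng.random() < step_epsilon:
        # Heuristic 2 (assumed form): explore the action not tried for the longest time.
        return min(actions, key=lambda a: last_tried[a])

    # Otherwise exploit: take the action that looks best so far.
    return max(actions, key=lambda a: q_values[a])
```

For instance, with q_values = [1.0, 5.0, -10.0] and step_epsilon = 0 the agent always exploits and picks action 1. Setting step_epsilon = 1 forces exploration, and the choice then depends only on last_tried and on whether avoid_below filters out the third action.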