CRA-W Internship: Week 7

Weekly Journal: Week 7 (July 2 to July 8)

Victoria Manfredi
vmanfred@cs.smith.edu

I haven't quite finished writing up all my data and graphs, but the write-up is mostly done, so I have also started working on Q-learning again :-). Right now, I am trying to figure out why my Q-learning implementation does not cause the Q-tables of the two Q-learning sellers to converge in the same way. They're similar, but not identical or symmetric. I also have a problem with convergence: the differences between the old and new Q-values converge close to zero even when I set gamma equal to one and random exploration to 1 percent. So I definitely have a bug somewhere in my code. I am getting some interesting graphs, though.
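For reference, here is a minimal sketch of the quantity being graphed: one tabular Q-learning update that returns the old Q-value minus the new one, plus an epsilon-greedy action choice. This is not the code from my project; the table layout (a list of rows, one per state) and the names q_update, choose_action, alpha, and epsilon are assumptions made for illustration.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place.

    Returns the old Q-value minus the new Q-value, the difference
    that can be plotted over time to watch for convergence.
    """
    old = q[state][action]
    # Standard Q-learning target: reward plus discounted best next value.
    target = reward + gamma * max(q[next_state])
    q[state][action] = old + alpha * (target - old)
    return old - q[state][action]


def choose_action(q, state, epsilon=0.01, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    row = q[state]
    return max(range(len(row)), key=row.__getitem__)
```

Plotting the value returned by q_update at every step gives the kind of old-minus-new graph described above; here epsilon=0.01 corresponds to 1 percent random exploration.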

On Friday, I talked with Hilary, a grad student who is also working with Q-learning, and she found two bugs in my code: a seller should not look at its own price when looking for the minimum price, and the update should use the real profit rather than the estimated profit. Amy had actually mentioned the min price bug to me before, but I had forgotten about it until I started explaining my code to Hilary. After I fixed both bugs, the graph of the old Q-value minus the new Q-value looked a bit more normal, although I don't know whether it is completely correct.
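The two fixes can be sketched roughly as follows. This is only an illustration under assumed names, not my actual code: prices is taken to be a list with one posted price per seller, own_index is the position of the seller doing the lookup, and the profit model (price minus cost, times units actually sold) is a hypothetical stand-in for whatever the real market returns.

```python
def min_competitor_price(prices, own_index):
    """Return the lowest price among the other sellers.

    The seller's own price is skipped, so it never compares
    against itself when looking for the minimum.
    """
    others = [p for i, p in enumerate(prices) if i != own_index]
    if not others:
        raise ValueError("need at least one competitor")
    return min(others)


def realized_profit(price, cost, units_sold):
    """Profit actually earned this round (hypothetical profit model).

    This observed value, not an estimate, is what gets passed
    as the reward to the Q-learning update.
    """
    return (price - cost) * units_sold
```

For example, with prices [0.5, 0.7, 0.9], the seller at index 0 sees a minimum competitor price of 0.7 rather than its own 0.5.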
