This week I am still running tests. After 35000 episodes, the agents are still not printing out an optimal path.
I am increasing the number of episodes to 70000, and am running tests on 12 of the Computer Science Lab computers.
Hopefully the agents will perform better after such a long time.
Unfortunately the ML lab was shut down while my tests were running. I was warned, but my tests had
already been running a few days, so there was not much I could do about it. I still have the optimal paths file, but
no rewards file. It doesn't really matter, since I will get these reward files from the 70000 episode tests.
After nearly a week of running, my first tests are done! I am using the 30x30 world, Sarsa learning with Boltzmann
selection, temperature T=10. I am also using eligibility traces, lambda L=0.1, L=0.2, L=0.3. Since these tests are done,
I'll try with higher values of lambda, and the same temperature. The results are not so encouraging. The agent has not
learned anything more than after 35000 episodes, and the rewards it receives average around -4000. If it were finding the
goal quickly, it should receive around 350-400. There is very little difference due to the values of lambda,
except in how quickly the reward peaks, and even that difference is almost insignificant. The graphs below show the rewards of
these tests for episodes 1 to 1000, and 1000 to 70000.
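
For readers who want to see what this setup looks like in code, here is a minimal sketch of tabular Sarsa with
Boltzmann selection and eligibility traces. It is not my actual test code: the function names, the dictionary layout
of Q and E, the use of accumulating traces, and the values of the learning rate alpha and discount gamma are all
assumptions made for illustration. Only the temperature T=10 and lambda=0.1 come from the tests described above.

```python
import math
import random


def boltzmann_probs(q_values, temperature):
    """Boltzmann (softmax) action probabilities for one state's Q-values."""
    # Subtracting the max keeps exp() from overflowing; the probabilities are unchanged.
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_select(q_values, temperature, rng=random):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def sarsa_lambda_step(Q, E, s, a, r, s2, a2, alpha, gamma, lam, done=False):
    """One tabular Sarsa(lambda) update with accumulating traces (an assumption).

    Q and E map each state to a list with one entry per action.
    """
    target = r if done else r + gamma * Q[s2][a2]
    delta = target - Q[s][a]          # TD error for the transition just taken
    E[s][a] += 1.0                    # mark the visited state-action pair
    for state in E:
        for action in range(len(E[state])):
            Q[state][action] += alpha * delta * E[state][action]
            E[state][action] *= gamma * lam   # decay every trace


# Tiny two-state example; alpha and gamma are made-up values.
Q = {0: [0.0, 0.0], 1: [0.0, 0.0]}
E = {0: [0.0, 0.0], 1: [0.0, 0.0]}
T, lam = 10.0, 0.1                    # temperature and lambda from the tests above
a = boltzmann_select(Q[0], T)         # action in state 0
a2 = boltzmann_select(Q[1], T)        # next action in state 1
sarsa_lambda_step(Q, E, 0, a, -1.0, 1, a2, alpha=0.5, gamma=0.9, lam=lam)
```

One thing the sketch makes visible: with T=10, two actions whose Q-values differ by 10 are chosen with odds of only
about e to 1, so selection stays fairly random unless the Q-value differences are large compared to the temperature.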
While the tests were running, I decided to make my website a bit more palatable, and the result is the design you
see now! Unfortunately you can't compare it to the old one anymore, but you can trust me that this is much better. For one
thing, it's not just text. I made the logo myself using Adobe Illustrator (which I learned how to use just for this
purpose), which was a lot of fun. I also had to go all over the net figuring out how to do just about everything I wanted
to do in HTML. I'm pretty happy now: I learned a lot and ended up with a design I actually like! I still have a bit of
tweaking to do to make navigation easier, but that will be for next week.