My Projects

I worked on two projects during my time in the DREU program.

The first project, Sierra, is a pipeline program that allows simulations of 500 or more robots to be run in parallel on a supercomputer. It was used in a paper submitted to the International Conference on Robotics and Automation (ICRA), on which I am listed as second author.

The second project was my personal project. Its goal was to let computers internally represent video games in a way that is closer to how humans understand them. It is detailed below.

The final report on both projects can be found here.

Personal Project

Technical goal: expand on the World Models algorithm by replacing the variational autoencoder with a capsule network to see if it creates a more comprehensible latent space.

A basic overview:

The World Models paper describes an AI that learns to play games. Here are the steps it uses:

1. Semi-random gameplay to obtain videos of the game being played alongside the inputs to the game.
2. Encoding the image files of the game. You can think of this as the AI "zipping" the image files into a smaller size. This helps the training to go faster later on.
3. Using the encoded images and inputs to the game to create a simulator for the game.
4. Playing the game inside its own simulator, trying out different random methods of playing and combining working methods to improve its gameplay.
5. Applying the controller from the simulator to the real game to see how it did.
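The first step above, semi-random gameplay, can be sketched in a few lines of Python. This is only an illustration, not code from the World Models paper or from my project: ToyGame, collect_rollout, and the repeat_prob parameter are hypothetical names, and each "frame" is a short list of numbers standing in for an image.

```python
import random


class ToyGame:
    """Hypothetical stand-in for a real game: a dot moving on a 1-D track."""

    def __init__(self, width=8):
        self.width = width
        self.position = width // 2

    def frame(self):
        # A "frame" is a list of 0s with a single 1 marking the dot.
        return [1 if i == self.position else 0 for i in range(self.width)]

    def step(self, action):
        # action is -1 (left), 0 (stay), or +1 (right); the dot stays on the track.
        self.position = max(0, min(self.width - 1, self.position + action))
        return self.frame()


def collect_rollout(game, num_steps, repeat_prob=0.7, rng=None):
    """Record (frame, action) pairs while playing semi-randomly.

    "Semi-random" here means the previous action is repeated with
    probability repeat_prob (an assumption made for this sketch), so the
    dot drifts around instead of jittering in place.
    """
    rng = rng or random.Random()
    action = rng.choice([-1, 0, 1])
    frame = game.frame()
    rollout = []
    for _ in range(num_steps):
        if rng.random() > repeat_prob:
            action = rng.choice([-1, 0, 1])
        rollout.append((frame, action))  # the frame seen and the input chosen
        frame = game.step(action)
    return rollout


rollout = collect_rollout(ToyGame(), num_steps=20, rng=random.Random(0))
```

Each entry of rollout pairs a frame with the input chosen on that frame, which is the kind of data the later steps train on.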

The same steps in more technical terms:

1. Semi-random exploration of the game space to obtain training data.
2. Variational autoencoder for data compression.
3. Recurrent neural network with a mixture density output layer to predict the next frame given a sequence of previous frames and game inputs.
4. Using evolutionary algorithms to train a controller via reinforcement learning. The environment is created by sampling from the distribution created by the recurrent neural network in the previous step.
5. Testing the final controller.
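The fourth step can be illustrated with a very small evolution strategy. The sketch below is a toy under stated assumptions: the World Models paper uses CMA-ES, whereas this code uses a plain "keep the best candidates and average them" scheme, and simulated_reward is a hypothetical stand-in for playing one episode inside the learned simulator.

```python
import random


def simulated_reward(params, rng, noise=0.01):
    """Hypothetical stand-in for one episode inside the learned simulator.

    The reward is highest when params match a hidden target. The noise
    term mimics the randomness that comes from sampling the simulator.
    """
    target = [0.5, -0.3, 0.8]
    error = sum((p - t) ** 2 for p, t in zip(params, target))
    return -error + rng.gauss(0.0, noise)


def evolve(num_params=3, population=32, elite=8, generations=40, sigma=0.3, seed=0):
    """Simple evolution strategy: sample, score, keep the elite, average them."""
    rng = random.Random(seed)
    mean = [0.0] * num_params
    for _ in range(generations):
        # Sample candidate controllers around the current mean.
        candidates = [
            [m + rng.gauss(0.0, sigma) for m in mean] for _ in range(population)
        ]
        # Score every candidate in the simulated environment, best first.
        ranked = sorted(candidates, key=lambda c: simulated_reward(c, rng), reverse=True)
        best = ranked[:elite]
        # Combine the working candidates: the new mean is the elite average.
        mean = [sum(c[i] for c in best) / elite for i in range(num_params)]
        sigma *= 0.95  # search more narrowly as the controller improves
    return mean


controller = evolve()
```

The controller here is just three numbers. In the real algorithm it would be the weights of a small network that maps the encoded frame and the recurrent network's state to a game input.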

If you play around with the sliders labeled "Z" on the World Models website, you can see that it's really difficult to figure out which slider does what.

This is because the computer's understanding of the game is very different from our human understanding. We tend to understand things in terms of objects, whereas computers tend to understand things in terms of numbers (or vectors and matrices).
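To make "numbers" concrete: in a variational autoencoder, the encoder outputs a mean and a log-variance for each latent dimension, and the latent vector is sampled from them. The sketch below shows only that sampling step. The function name sample_latent and the example encoder outputs are hypothetical, and a real encoder would be a trained neural network.

```python
import math
import random


def sample_latent(mean, log_var, rng):
    """Draw z = mean + std * noise for each latent dimension.

    This is the usual VAE sampling step: std is exp(0.5 * log_var) and
    noise is drawn from a standard normal distribution.
    """
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mean, log_var)
    ]


# Hypothetical encoder outputs for one frame, with a 4-dimensional latent space.
mean = [0.2, -1.0, 0.0, 0.7]
log_var = [-2.0, -2.0, -2.0, -2.0]

z = sample_latent(mean, log_var, random.Random(0))
```

Nothing in this step ties any entry of z to an object on the screen, which is why an individual slider can be hard to interpret.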

Part of the goal of capsule networks is to get computers to understand images in terms of objects.
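One concrete piece of a capsule network is the "squash" function described in the paper Dynamic Routing Between Capsules (Sabour, Frosst, and Hinton, 2017). Each capsule outputs a vector: its length is read as how likely it is that some entity is present, and its direction describes that entity's properties. The sketch below is a plain-Python illustration of that one function, not a full capsule network. The eps parameter and the length helper are additions for this example.

```python
import math


def squash(vector, eps=1e-9):
    """Rescale a capsule's raw output so its length lies in [0, 1).

    Short vectors shrink toward zero and long vectors approach length 1,
    while the direction of the vector is left unchanged.
    """
    squared_norm = sum(x * x for x in vector)
    norm = math.sqrt(squared_norm)
    # eps avoids dividing by zero when the input is the zero vector.
    scale = squared_norm / (1.0 + squared_norm) / (norm + eps)
    return [scale * x for x in vector]


def length(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vector))


weak = squash([0.1, 0.0])    # short input: the length stays close to 0
strong = squash([3.0, 4.0])  # long input: the length moves close to 1
```

Comparing length(weak) with length(strong) shows the intended behavior: a weak activation reads as "probably not there" and a strong one as "probably there".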

My hope is that by replacing the variational autoencoder with a capsule network, the AI will learn to understand the game in terms of objects, and that the "Z" vector will make more sense.