Project Journal

First Week (June 11-16):
This week wasn't too work-intensive, since I was still being introduced to the project and settling in here at the University of South Florida. On my first full day here I attended an organizational meeting with my advisor, Dr. Murphy, and the other two guys working on this project, Denny Catacora and Rod Gutierrez. They went over the project's goals, the progress so far, and my part in the whole thing. My first task is to do a literature search, to see what other researchers have been doing in the way of writing software to learn terrain-cover classification. This basically means that I spent the rest of the week reading papers and attempting to understand them. This is harder than it sounds, since most of the authors have a PhD or twelve, and tend to fall into math-y stuff that is way over my head all too frequently. However, I am gradually learning the vocabulary needed to understand what goes on in the exciting world of autonomous robot and computer vision research, and the paper-reading is getting easier. I have to give a presentation on my findings to our weekly group meeting on June 29th, and I also have to give a recommendation as to which of the machine learning methods I think we should use in our "terrain cover classifying" module. I've only gotten through five or six of the papers that I've found on this or related topics, but I must say that so far neural networks seem to be the best bet for a project like this, since classification is one of their specialties. I'm also supposed to be getting familiar with MATLAB, since that's the program that we're using to do all the laser-data preprocessing. Apparently you can also do neural nets in MATLAB, which I hadn't known, but that will come in quite handy if we do decide to use a neural net on this project.
Second Week (June 17-23):
This week can be summed up easily: I read papers. A lot of them. I'm still not very clear on how some of the algorithms described in the papers work, or even what exactly some of them do, but I now have a pretty decent idea of what other researchers have been up to in the way of terrain cover classification, and which learning methods they've been using. Lots of the papers do their terrain or image classifications with a neural network, and the second most popular methods seem to be Maximum Likelihood or Bayesian classifiers. Those two are what I'm still the most confused about, since I get the feeling that they're either related or versions of one another. As far as I can tell, a Maximum Likelihood classifier is essentially a Bayesian classifier that assumes all classes are equally probable beforehand (uniform priors), which simplifies the computation, but I'm not completely sure. Between the papers, various Internet articles on the classifiers, and my old AI class textbook, I've ended up giving myself several blinding headaches whilst trying to figure those two out. I guess I'll be spending lots of time with headaches in the coming week, though, because my presentation is looming. At least I know which method I'm recommending: neural nets, hands down. I actually understand them, which is nice, and many researchers have had very good results with NNs. Several have conducted comparative studies between NNs and Bayesian/Maximum Likelihood classifiers, and the neural nets have proven to be more effective in most cases. Of course, I still have to present all of my findings, for the sake of explaining my decision to the others.
Reading through all these papers on terrain classification and obstacle avoidance and load-bearing surface detection, I've become interested in how many different sensor technologies we seem to have. Autonomous robots can have mounted color cameras, radar sensors, ladar (laser radar) sensors, infrared cameras, or any combination of the above. Terrain classification is also done on data collected by satellite, in color or radar format. And of course, every type of sensor comes in an acronym-soup of modes and hardware versions. Nothing's ever simple.
Third Week (June 24-30):
This week was mostly spent finishing up the last few papers that I hadn't read yet and preparing for my presentation. Dr. Murphy, the others on my project, and I had a meeting Monday morning to make sure we were all on track with our various pieces, which was when I found out that I needed to make an annotated bibliography for the papers that I'd been reading. I'd never done an annotated bibliography before this, but it didn't turn out to be too hard. One basically gives the citation for the paper, and then writes a short paragraph summarizing the paper and possibly also its usefulness for the task at hand. Putting together the presentation itself took me quite a bit of time, because I was trying to make it as streamlined and self-explanatory as possible, plus I was still working on trying to figure out how to explain several algorithms that I still wasn't completely clear on myself. The presentation itself went quite well. Anyone interested in viewing the PowerPoint slides for the presentation can find them here. Now I finally get to start working on the neural net itself!
Fourth Week (July 1-7):
Well, I've gotten to start working on the neural net, all right. In MATLAB. Which naturally means that I've spent most of this week in a battle of wills with MATLAB, and in general I've been losing fairly miserably. Learning to use MATLAB is like learning a new programming language. Actually, it IS learning a new programming language, since MATLAB technically has its own language. Since MATLAB is obviously math-oriented, it's sufficiently different from languages like Java to consistently foil me whenever I try to do something that would work in Java or Python, but is done differently in MATLAB. This discovery is inevitably followed by lengthy consultation of MATLAB's help files and possibly some cursing if I'm getting irritated enough. I'm finally beginning to get the hang of the program with respect to most basic operations and control structures (for loops, etc.), and by the end of the week my struggles were pretty much confined to the neural net module.
Denny, the group member in charge of extracting features from the raw scan data, has suggested that we use a Self-Organizing Map (SOM) neural net, since they tend to be good at generalizing learned classifications to new data. This means that I've also been trying to learn how SOMs work, since I've never used them before. After a lot of reading, they seem to be pretty straightforward in their theory. During the learning phase, the SOM fits a network of nodes to the input space to try to model the distribution of the input data. The result of this is that inputs "excite" specific nodes, and the node that is most excited indicates the class of the input. A rather nice feature is that similar inputs tend to excite adjacent nodes, so even if an input image of grass isn't exactly like the images of grass that the network was trained on, it'll probably excite a node near the "grass" node, which you can take as an indication that the image is of grass.
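For anyone curious what that training loop actually looks like, here's a minimal sketch in plain Python (not our MATLAB code; the map shape, decay schedules, and parameter values are all hypothetical choices just to show the idea): each input's best-matching node and its neighbors on the map get pulled toward the input, with a learning rate and neighborhood that shrink over time.

```python
import math
import random

def train_som(data, n_nodes=10, epochs=100, lr0=0.5, sigma0=3.0):
    """Fit a 1-D self-organizing map to a list of input vectors."""
    dim = len(data[0])
    rng = random.Random(0)
    lo = [min(v[d] for v in data) for d in range(dim)]
    hi = [max(v[d] for v in data) for d in range(dim)]
    # Initialize node weights randomly inside the range of the data.
    nodes = [[rng.uniform(lo[d], hi[d]) for d in range(dim)]
             for _ in range(n_nodes)]
    for t in range(epochs):
        lr = lr0 * (1.0 - t / epochs)                  # learning rate decays
        sigma = max(0.5, sigma0 * (1.0 - t / epochs))  # neighborhood shrinks
        for x in data:
            bmu = best_matching_unit(nodes, x)
            # Pull the winner and its neighbors on the map toward the input.
            for i, w in enumerate(nodes):
                h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return nodes

def best_matching_unit(nodes, x):
    """Index of the node the input 'excites' most (smallest distance)."""
    return min(range(len(nodes)),
               key=lambda i: sum((nodes[i][d] - x[d]) ** 2
                                 for d in range(len(x))))
```

Train it on two tiny clusters and inputs from the same cluster end up exciting the same or adjacent nodes, which is exactly the "nearby node" behavior described above.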
My problems have mostly arisen in the actual implementation and training of the SOM in MATLAB. Denny still hasn't quite completed the feature extraction code, so I'm working with the raw data for now, to try to get a feel for the network. Of course, the raw data has over 300,000 numbers in it (x, y, z coordinates for about 100,000 points), so in order to be able to do anything at all with the data, I'm only training with half of each scan (about 150,000 inputs per scan); whenever I tried to train the network on a full scan, the program consistently ran out of memory, despite the fact that I have 2 GB of RAM and told it to use up to 4 GB of my hard drive for virtual memory. Now my difficulties basically lie with trying to figure out if the network is learning correctly or not, since it's a bit hard to graph the map's nodes given that they're in 150,000-dimensional space. Without a picture, I'm having a hard time figuring out how the network is changing as it trains. Figuring out a way to measure that is definitely the first thing that I need to work on next week...
Fifth Week (July 8-14):
At the beginning of this week, I was looking into doing the neural nets in Java instead of in MATLAB, because the code will have to be translated into Java eventually. I did find some Java classes that a guy had written to go along with his book on machine learning in Java, but when I tried to train a neural net using that code on our data, Java ran out of heap space and crashed. I will need to look into fixing that at some point, and perhaps training on features instead of the entire scan's worth of data will help, but for now I'm just focusing on getting a neural net to work for our data in MATLAB.
Mid-week, Dr. Murphy called our team into her office and told us that we're getting behind and to get a move on, so I spent the last few days of this week practically living in the library with Denny while we tried to polish up the code for extracting features from the data and get a neural net started learning those features. The SOM does seem to be learning the terrain categories (at least those that we actually have scan data for) fairly well, though of course we can't test whether it could recognize other scans of the same terrain yet, since we don't have any additional scans to test it on. I've got to give another presentation this coming Friday about the specifications of the network that I'm training. Before then, I want to try several different networks, all SOMs, but with different parameters, such as number of neurons, and see which network seems to do the best.
Sixth Week (July 15-21):
I've literally spent this entire week training neural networks. I've been trying to discover the best configuration of a network for our terrain classification purposes. Given that we're still really low on actual laser data, it's been impossible to have a real idea of how well the neural net is actually learning, since I don't have any extra data to test it on. I've been using the network's ability to distinguish between the different input scans as a rough sort of metric, something of a measure of the network's ability to tell the difference between the different terrains, but since I can't test how well a network can generalize its training (recognize grass in scans that it wasn't trained on, for example) I can't fully tell which network configurations are performing better than others. I can only hope that the ability to differentiate completely between the different terrains correlates highly with the ability to generalize the learning to new data.
At our weekly meeting on Friday, I gave a presentation about my research into network configurations (the PowerPoint slides for which can be viewed here), and Dr. Murphy pointed out that for multiple reasons, one of which would be to stretch the laser data we have as far as possible, we probably want to develop a way to segment the laser scans into chunks and run the classifier on each of the chunks separately. Once someone does that, I'll at least be able to have a better idea as to how accurate my various neural nets are!
Seventh Week (July 22-28):
Again, lots of training of neural networks. As per Dr. Murphy's suggestions about breaking down the laser scans, I'm now training the networks on separate voxels (3-D chunks of space) extracted from the scans. I still don't have very much training data, but at least now I can use some of the voxels from each scan as testing data, so I can get more of an idea as to how well the networks are learning. Unfortunately, due perhaps to the small training set, or inadequacies in the features that we extract from each voxel and feed into the network, or non-optimally configured networks, the results from the nets have not been all that good. The best recognition of a terrain class by a single network neuron has been a neuron that fires on 40% of the grass testing samples presented to the network. Obviously an optimal recognition rate of 40% is not at all sufficient. Different network configurations do affect the effectiveness of the classifications, so I'm mostly trying different networks, hoping to hit upon a particularly good one. My previous experiments in this area have at least led to my best network so far, so perhaps they weren't completely invalidated by the former lack of test data.
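The voxel idea itself is simple enough to sketch. This isn't our actual extraction code (the voxel size and the hold-out scheme here are made-up values for illustration); it just shows the two steps: bucket each (x, y, z) point into a cubic cell of a 3-D grid, then hold some of the cells out as test data.

```python
from collections import defaultdict

def voxelize(points, size=0.5):
    """Bucket (x, y, z) points into cubic voxels with the given edge length."""
    voxels = defaultdict(list)
    for x, y, z in points:
        # Integer grid coordinates of the cell containing this point.
        key = (int(x // size), int(y // size), int(z // size))
        voxels[key].append((x, y, z))
    return dict(voxels)

def split_voxels(voxels, every=4):
    """Hold out every Nth voxel for testing; keep the rest for training."""
    keys = sorted(voxels)  # deterministic ordering of voxel indices
    train = {k: voxels[k] for i, k in enumerate(keys) if i % every != 0}
    test = {k: voxels[k] for i, k in enumerate(keys) if i % every == 0}
    return train, test
```

With something like this, even a single scan yields many training examples plus a disjoint test set, instead of one giant 150,000-number input.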
At a team meeting on Wednesday, Dr. Murphy shifted us around a bit in our assignments, since the project's deadline is approaching and there are particularly vital parts that need to be completed soon. The terrain-classifying network is one of those important things, so getting more laser data to train the network on got bumped up in priority. One of the grad students and I went out on Thursday to get more scans (I got to drive the robot, which was fun, though also rather nerve-wracking because it's such an expensive collection of equipment), but the ten scans that we managed to get were all distorted and useless due to problems with the laser's registration of its position. Obtaining one scan takes at least 5 minutes, because the laser takes so long to set itself up and scans rather slowly. I devoutly hope that we'll make more progress with data collection in the coming week, since I've only got two weeks left on this project.
Eighth Week (July 29 - August 4):
Well, we did manage to collect data, but we've discovered a rather more serious problem: the features that we've been extracting from each of the voxels don't seem to be enough to enable the neural network to differentiate between different terrains. I even simplified the task down to only recognizing concrete vs. grass, but the network couldn't manage it. After a few tweaks to the data, such as normalizing it (this prevents extreme values from throwing the network off), performing a Principal Component Analysis to tell me which of the features were the most helpful to the network, and then using only those features, the classification improved a little, but it was still very weak, with a high rate of misclassifications, so it remained unsuitable for our purposes.
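The normalization step is the standard z-score trick; here's a small Python sketch of it (our MATLAB code did the equivalent, not this exact function): each feature column gets rescaled to zero mean and unit variance so that one feature with huge raw values can't dominate the network's distance calculations.

```python
def normalize(vectors):
    """Rescale each feature column to zero mean and unit variance."""
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dim)]
    stds = []
    for d in range(dim):
        var = sum((v[d] - means[d]) ** 2 for v in vectors) / n
        stds.append(var ** 0.5 if var > 0 else 1.0)  # guard constant columns
    return [[(v[d] - means[d]) / stds[d] for d in range(dim)]
            for v in vectors]
```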
Nearly all of the papers that we've found in which researchers were doing terrain classification from laser data also used data from other sensors (color cameras, usually) to aid in the classification. In the coming week, our team is planning to get together and try to figure out how to either use additional data, or to try to find some other method to enable us to do terrain classification. I'm definitely hoping that we can figure something out, firstly so the thing will work and I won't feel like I wasted my summer, and secondly so I have something to write about in the "final report" paper that the DMP requires that I write. Having success to report would be quite nice...
Ninth Week (August 5 - 11):
This week was pretty much one giant 180-degree change from what I'd been doing for the past eight weeks. For starters, we switched from using MATLAB to using Weka, a Java-based collection of data mining and machine learning tools that turned out to have a much easier-to-use interface than MATLAB, which makes me wish very much that we had decided to use it sooner. Unfortunately, outside the GUI, Weka turned out to be rather harder to use and not very well documented. I could train a neural network in the GUI, but to make an actual terrain classifier module, I had to wrap the neural net in a Java class and figure out how to interface with it. It took me an afternoon and a morning to figure out how to give the classifier a set of data and get back a classification, which shows how convoluted the inner workings of Weka classes are. After I figured that bit out, the rest of the program took about four hours to finish, test, and document.
Fortunately, the classifier actually does work now, and does a pretty good job classifying the data that I tested it on. Due to the fact that our old features were basically useless, we read through some more papers and decided to go with a feature-extraction method that people at Carnegie-Mellon have used. This method involves taking a bunch of laser-data points and running a Principal Component Analysis on them, which returns a set of "principal components" (which are eigenvectors, I believe) and the corresponding eigenvalues. Using the three eigenvalues, we calculate point-ness, curve-ness, and surface-ness features (read my paper for details; I'm not explaining them again!). Fortunately, the classifier learned much better with these, and we ended up with something like an 80% accuracy rate. Our machine learning method ended up changing to a Naive Bayes Decision Tree (a decision tree with Naive Bayes classifiers at the leaves) because it had the best performance of the several different methods that we tested via the handy Weka GUI. One downside: in order to simplify the task, we limited the classifier to labeling things "grass", "asphalt", or "unknown". At least it is fairly good at classifying things similar to or containing grass or asphalt as such.
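Since I refuse to re-explain the features in prose, here's a Python sketch of the general idea instead (the exact definitions are in my paper; this uses one common formulation, where for eigenvalues λ1 ≥ λ2 ≥ λ3 of a voxel's 3x3 point covariance, point-ness ~ λ3, curve-ness ~ λ1 - λ2, and surface-ness ~ λ2 - λ3, so the scaling may differ from what we actually used). The eigenvalue routine is the standard closed-form method for symmetric 3x3 matrices.

```python
import math

def eigenvalues_sym3(a):
    """Eigenvalues of a symmetric 3x3 matrix, largest first (closed form)."""
    a11, a12, a13 = a[0]
    _, a22, a23 = a[1]
    a33 = a[2][2]
    p1 = a12 * a12 + a13 * a13 + a23 * a23
    q = (a11 + a22 + a33) / 3.0
    if p1 == 0:                               # matrix is already diagonal
        return sorted([a11, a22, a33], reverse=True)
    p2 = (a11 - q) ** 2 + (a22 - q) ** 2 + (a33 - q) ** 2 + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # B = (A - q*I) / p; r = det(B) / 2 lies in [-1, 1] up to rounding.
    b11, b22, b33 = (a11 - q) / p, (a22 - q) / p, (a33 - q) / p
    b12, b13, b23 = a12 / p, a13 / p, a23 / p
    detb = (b11 * (b22 * b33 - b23 * b23)
            - b12 * (b12 * b33 - b23 * b13)
            + b13 * (b12 * b23 - b22 * b13))
    r = max(-1.0, min(1.0, detb / 2.0))
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3                    # trace is preserved
    return [e1, e2, e3]

def saliency_features(points):
    """Point-ness, curve-ness, surface-ness for one voxel of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[d] for p in points) / n for d in range(3)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
            for j in range(3)] for i in range(3)]
    l1, l2, l3 = eigenvalues_sym3(cov)  # l1 >= l2 >= l3
    pointness = l3        # all three comparable -> scattered blob
    curveness = l1 - l2   # one dominant direction -> linear structure
    surfaceness = l2 - l3 # two comparable, one small -> planar patch
    return pointness, curveness, surfaceness
```

Sanity check: a voxel of points lying on a flat plane scores high on surface-ness and near zero on the others, while points along a line score high on curve-ness, which is exactly why these features separate flat asphalt from scattered grass returns better than our old ones did.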
After finally coming up with a functional classifier, I finally had something to write a paper about, so the better part of the last three days was spent preparing this, my Final Report. Enjoy!