I spent a few days this week running results from Geo-Hash (Brian's program) through my program and checking for errors. I had originally written my program to throw an error whenever the resulting protein sequences differed from the originals, but Brian and I discovered that we needed results even when the sequences were off by a few amino acids. Consequently, I spent much of the week looking for a pattern in how the sequences differed and then incorporating it into my code so that it could run on imperfect data and still calculate the results.
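To illustrate the idea of tolerating small differences instead of failing outright, here is a minimal Python sketch. It is not the actual program: the function names, the example sequences, and the default limit of three mismatches are all hypothetical, and it only does a simple position-by-position comparison rather than handling whatever pattern the real data showed.

```python
def count_mismatches(original: str, result: str) -> int:
    """Count positions where two sequences differ, plus any length difference."""
    # Compare residues position by position over the shared length.
    paired = sum(1 for a, b in zip(original, result) if a != b)
    # Treat every extra or missing residue at the end as one mismatch.
    return paired + abs(len(original) - len(result))


def sequences_agree(original: str, result: str, max_mismatches: int = 3) -> bool:
    """Accept the result if it is within max_mismatches of the original.

    max_mismatches is a hypothetical tolerance, not a value from the project.
    """
    return count_mismatches(original, result) <= max_mismatches


if __name__ == "__main__":
    # Made-up sequences that differ at a single position.
    print(count_mismatches("MKTAYIAK", "MKTGYIAK"))
    print(sequences_agree("MKTAYIAK", "MKTGYIAK"))
```

A check like this lets the calculation continue on slightly imperfect data while still rejecting sequences that are clearly wrong.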
I spent the rest of the week writing a way for the user to run my program on multiple protein sequences at once. I started out trying to write it as a script file, but eventually found it easier to write a Perl script that calls the program multiple times and a makefile that runs the Perl script on different directories. Calling other programs from within a Perl program was an interesting learning experience, especially since I had never written Perl before.
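The same batch-driver idea can be sketched in Python (the real version is the Perl script and makefile described above). Everything named here is an assumption for illustration: the `*.seq` file pattern, the `./my_program` command, and the `data/run1` directory are hypothetical placeholders.

```python
import subprocess
import sys
from pathlib import Path


def find_inputs(directory, pattern="*.seq"):
    """Return the input files in a directory, sorted by name.

    The "*.seq" pattern is a hypothetical file naming scheme.
    """
    return sorted(Path(directory).glob(pattern))


def run_batch(command, input_files):
    """Run command once per input file and collect each exit code."""
    results = {}
    for path in input_files:
        # Append the file path as the last argument of the command.
        completed = subprocess.run(
            [*command, str(path)], capture_output=True, text=True
        )
        results[str(path)] = completed.returncode
    return results


if __name__ == "__main__":
    # Hypothetical usage: python batch.py ./my_program data/run1
    program, directory = sys.argv[1], sys.argv[2]
    for name, code in run_batch([program], find_inputs(directory)).items():
        print(name, "ok" if code == 0 else f"failed ({code})")
```

Collecting exit codes per file makes it easy to see which sequences failed without stopping the whole batch.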
We met with Dr. Kavraki on Tuesday and discussed some short- and long-term goals for the project. Initially, she would like us to test the expandability of Geo-Hash. My job for next week is to run Geo-Hash at different thresholds and determine its breaking point. Brian also wants me to write a script so that I can use the cluster (a set of 16 connected computers) to spread out the work and decrease the time it takes to run.