log8



		MY LOG CONTINUED

July 20th
No Entry.

July 21st
No Entry.

July 22nd
Today I trained the SOM MATLAB program to predict the shape of myoglobin. I ran it three times and received three different shapes, all very similar. I chose myoglobin because it is only 154 amino acids long. I would have preferred a longer chain; however, I want to see whether the program works first. After I train it using myoglobin I will try GABA. GABA-alpha is 454 amino acids long and is also used in the sense of smell. GABA is also a neurotransmitter, and it induces sleep.

July 23rd
Today Professor Valova and I discussed the program, and she suggested that I try encoding the amino acids differently. As of now, alanine (A) is represented by 65 and glycine (G) by 71, the ASCII codes of the letters A and G. Therefore, if alanine and glycine were next to each other, MATLAB would plot 65 as x and 71 as y for the training data. Professor Valova suggested that for alanine 6 be x and 5 be y, and for glycine 7 be x and 1 be y. That way each amino acid has its own specific point on the training grid. Professor Valova also suggested that the program just be able to recognize the pattern of the amino acids. Now the program is designed as a tool for protein recognition. This is similar to letter recognition, but it has the added effect of the visual map of the protein.

Today Professor Valova also asked me if I would like to tutor incoming freshmen in computer science. I love helping out students since I am a student myself, so I am glad to help.

July 24th
Today I tried increasing the performance of the SOM program. I increased the training steps from 300 to 500. When I did this, the figure produced had more twists and folds in it, resembling a protein. I then tried to see if I could get the same shape by increasing the training steps to 1000. The figure did not change much from 500 to 1000.
However, I am going to stick with 1000 training steps because it allows the map to self-organize more accurately.

July 25th
Today I read through the research on protein analysis and SOM. I found out that the best programs work when SOM is used in conjunction with a protein database. I also found research being conducted on using SOM and spectral analysis of proteins. The main idea of the research is to predict the secondary structure of a protein. The structure of a protein stems from its amino acid sequence; therefore the sequence determines the structure and ultimately the function of the protein. Another part of the research is to classify proteins into families according to their amino acid sequences. This is also important because, as before, if the sequences are similar the functions are also similar. The algorithms mainly use Euclidean distance measurements and hierarchical clustering to perform the task at hand. The results are usually in the form of a tree.

July 26th
Today I began writing my paper on SOM and protein analysis. I also wanted to see how the amino acids are encoded in the papers I found. I believe that the encoding is as important as the algorithm itself. It's interesting that the research I found also uses MATLAB. I discovered that the most interesting aspect of using SOM is that, although it works, its accuracy is not yet 100 percent.
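
As a rough illustration of the two encodings discussed in the July 23rd entry, here is a minimal sketch. It is written in Python rather than MATLAB and is not the program used in this log. The function names are hypothetical, and pairing each residue with its right-hand neighbour is only an assumption about how the original scheme forms its (x, y) points.

```python
# Sketch only: hypothetical helpers, not the MATLAB SOM program from the log.

def encode_pairs_by_code(sequence):
    """Original scheme (assumed pairing): each residue and its right-hand
    neighbour form one point (x, y) built from their ASCII codes."""
    codes = [ord(residue) for residue in sequence]
    return list(zip(codes, codes[1:]))


def encode_by_digits(sequence):
    """Suggested scheme: each residue gets its own point, with the tens
    digit of its ASCII code as x and the ones digit as y."""
    points = []
    for residue in sequence:
        if not ("A" <= residue <= "Z"):
            raise ValueError(f"expected an uppercase one-letter code, got {residue!r}")
        # Uppercase letters have two-digit codes (65..90), so divmod by 10
        # splits them cleanly into (tens, ones).
        points.append(divmod(ord(residue), 10))
    return points


print(encode_pairs_by_code("AG"))  # [(65, 71)]
print(encode_by_digits("AG"))      # [(6, 5), (7, 1)]
```

With the digit scheme, alanine always lands on (6, 5) and glycine on (7, 1) regardless of their neighbours, which matches the idea that each amino acid has its own point on the training grid.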