MY LOG CONTINUED
July 16th
Today I analyzed the MATLAB files to see how to store the data. The data is being obtained from the SWISS Entry website. The website lets the user enter the name of a protein and returns the amino acid sequence and general information about the protein. My next step is to encode the sequence somehow so MATLAB can generate the points of the map.

July 17th
Now that I have the amino acid sequence of the protein in one-letter format, I must figure out how to encode the letters into something MATLAB can understand. I think I can use binary. I then analyzed the MATLAB code and realized that the data is read in and graphed as points. For instance, if the first two numbers are 1 and 2, MATLAB graphs them with 1 as the x-coordinate and 2 as the y-coordinate. Therefore my data must be arranged as ordered pairs. It must also be expressed as numbers, not just 0's and 1's. I think I can encode the letters using their ASCII values.

July 18th
Today I wrote a C++ file that lets the user enter the one-letter code of an amino acid and returns its ASCII value. For instance, A is represented by 65. I then entered the entire sequence into my C++ file and it returned the ASCII values. These values are going to be used as the training data in the MATLAB SOM program. I will use random values as the starting values to be trained. Therefore the protein sequence will be the pattern that the SOM must learn.

July 19th
Today I tried running my program and I encountered two problems. The first error message said that my data matrix was not the correct size. I fixed this problem by changing the matrix to two columns by x rows, where x is the number of amino acids divided by two. The second error message said that the training length and neighborhood vectors did not match. When MATLAB gives you an error message, it tells you the line that the program is having trouble with. I went to this line and corrected the program by deleting one line of code. I then ran the program again and it ran successfully. I used the protein myoglobin as my training data.
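
The encoding steps described in these entries can be sketched roughly as follows. This is not the original C++ file, only a minimal illustration. The function names toAsciiCodes and toPairs are hypothetical, and dropping a trailing unpaired residue when the sequence has an odd length is an assumption, since the log does not say how that case was handled.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Convert each one-letter amino acid code to its ASCII value (e.g. 'A' -> 65).
// Hypothetical helper; the original program is not shown in the log.
std::vector<int> toAsciiCodes(const std::string& sequence) {
    std::vector<int> codes;
    codes.reserve(sequence.size());
    for (unsigned char residue : sequence) {
        codes.push_back(static_cast<int>(residue));
    }
    return codes;
}

// Group consecutive values into (x, y) rows, giving a two-column matrix
// with floor(n / 2) rows. Assumption: a trailing unpaired value is dropped.
std::vector<std::pair<int, int>> toPairs(const std::vector<int>& codes) {
    std::vector<std::pair<int, int>> rows;
    rows.reserve(codes.size() / 2);
    for (std::size_t i = 0; i + 1 < codes.size(); i += 2) {
        rows.emplace_back(codes[i], codes[i + 1]);
    }
    return rows;
}
```

For a made-up four-letter input, toPairs(toAsciiCodes("GLSD")) gives the rows (71, 76) and (83, 68), which is the two-column layout of ordered pairs that the July 19th entry describes.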