Week 10: August 1 - August 5
- Created plots, graphs, and trees for the paper.
- Rerereread Phlash paper.
- Compared each ML tree for genes against trees made with concatenated sequences.
- Wrote a paper: Determining the Evolution of Cancer, A Case Study in Phylogenetic Analysis
- 8/1 Presented poster at the brown bag
- 8/2 Presented poster in the Life Sciences building
- Finalized website!
- Did final evaluation.
I am so glad that I learned how to use LaTeX because it made writing my final paper a dream. I had no problem adding and editing large, complex images. The bibliography was ridiculously easy to do. I did have some problems with the size of tables, however.
I was talking with another student and we both agreed that we considered this the best summer of our lives. It sounds corny but there's really no way around it. I worked constantly, I barely slept, I partied, I learned, I made friends that I will sorely miss when I get back home.
Despite all the difficulties associated with this summer I know that I will thrive in graduate school this fall. I'd like to thank Dr. Williams, Suzanne Matthews, Theresa Roberts and all the others at TAMU who have made this summer the best ever. This has been an experience that I will never forget.
Week 9: July 25 - July 29
- Submitted Abstract.
- Created and printed poster.
- Practiced poster presentation.
- Wrote a rough draft of the final paper.
- Helped other students with user testing.
This week was pretty difficult. I spent a lot of time tweaking the finer details of the poster and of the visualizations I created for the paper and the poster. I'm not a very detail-oriented person, so having my mentors and fellow researchers endlessly critiquing the small imperfections of my images was very frustrating.
Week 8: July 18 - July 22
- Created a rough draft of the Abstract
- Created an outline of the Final Research Paper
- Created heatmaps.
- Finished tree searching.
I spent most of this week trying to expand the memory limits in R. Once I concluded that it could not be done, we decided that it would be better to take a representative sample of the full set of trees. We took every tenth tree from the set of 10,395 and used them in the visualizations.
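The sampling step described above can be sketched as follows. This is only an illustration in Python (the work itself was done in R); the function name `subsample` and the placeholder tree labels are assumptions, not part of the original analysis.

```python
def subsample(trees, step=10):
    """Return every `step`-th item of `trees`, starting with the first one."""
    return trees[::step]


# Stand-in for the real tree set: 10,395 placeholder labels (hypothetical data).
all_trees = [f"tree_{i}" for i in range(1, 10396)]

sample = subsample(all_trees)
print(len(sample), sample[0], sample[-1])
```

Taking every tenth tree keeps the sample spread evenly across the file, although it is only representative if the trees are not stored in an order that repeats with a period of ten.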
Week 7: July 11 - July 15
- Performed tree searches for both nucleotide and amino acid sequences.
- Planned and began analysis of results.
- Learned to use Phlash
- Reread the two papers.
- Experimented with visualization.
This week was an exercise in futility. I tried for several trees to make PAUP* generate the full set of 10,395 trees for the individual genes using two substitution models. Many of the models generated over ten thousand trees. Only the full concatenated sequence for models HKY and F84 generated the full set of trees. Suzanne decided that the sets with fewer than 10,395 trees should be thrown out and not evaluated. It was frustrating to know that the data I had worked so hard on needed to be deleted.
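A note on the figure 10,395: it equals the number of distinct unrooted bifurcating tree topologies on 8 labeled taxa, which is (2n-5)!! for n taxa. Whether that is the reason the full set has this size is an assumption, since the journal does not say how many taxa went into the search. A minimal sketch of the count, with a hypothetical function name:

```python
def num_unrooted_trees(n_taxa):
    """Count unrooted bifurcating topologies on n labeled taxa: (2n-5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 8, 10):
    print(n, num_unrooted_trees(n))
```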
Week 6: July 5 - July 8
- Finished writing PVCS tutorial
- Complained much less than usual
- Turned in Midterm Evaluation
- Submitted Progress Report
- Read paper: The ABCs of Phylogenetic Tree Comparisons
- Read paper: Analysis and Visualization of Tree Space
- Finished phylogenetic searches on protein sequence data
- Wrote out possible tools for use in tree evaluation with justification.
I had problems this week with the speed of some programs, which is probably due to the large number of calculations that need to be performed.
I went skydiving with my roommate and a couple of people who I've met in the program. It was great fun. Afterwards we went out and had some chicken fried bacon.
Week 5: Jun 27 - July 1
- Learned to use PVCS
- Began to write a PVCS tutorial
- Took a Git test to evaluate my skills
- Prepared a rough draft of the Progress Report
- Installed and learned how to use PhyML
- Started phylogenetic search on protein sequence data
- Midterm Evaluation completed
- Written Progress Report, need signature
I have begun to write my reports in LaTeX, which has proven to be great fun. At first I was reluctant to learn to use it, but I can see how the skills I gain from using LaTeX will be helpful this fall in graduate school.
Week 4: Jun 20 - Jun 24
- Finished alignment concatenation
- Finished Git tutorial
- Learned to use PAUP* from the command line
- Performed tree searching of nucleotides for all models of evolution
- Was ill, got a little better.
- Went to a rave.
- Turned 23.
- Learned what curry is and how to cook it.
- Updated this website.
I was told to write more about my personal experiences.
On the lab where you work: I work in a small room with one other person named Ralph. There are four desks, a whiteboard, a bookcase, and a wastebasket. There is a lot of gray, which is fortunate for me because gray is my favorite color. My graduate mentor Suzanne and her labmate work in a similar room down the hall.
The other students in the lab: Ralph is a good guy who seems very capable of coping with the egregious number of questions that I ask him. He is older than the typical graduate student because he went into industry for a while before he came back to graduate school. Suzanne, my graduate mentor, is very helpful and nice but at the same time expects results. Grant loves music and sees the importance of evolutionary models when doing tree searches.
Your discoveries: I have discovered that I like the color purple.
What you do outside the time you spend in the lab: In the evenings I settle down at the Trad for some sunbathing, book reading, and/or movie watching. Several nights a week we have group dinners which provide a steady flow of conversation and socialization. Every Saturday the group goes out to see a movie. The last movie we saw was the Green Lantern; it was terrible. On the weekends I go out to the clubs and dance the night away. Sometimes I go antiquing. I specialize in pottery produced in the Ohio River Valley in the 1930s-50s. On Sundays I clean up my room and go to work, because the work is never done.
What did I learn this week?
One of the most fundamental things that I've learned from my experience in research is that I need to be the expert on what I do.
Week 3: Jun 13 - Jun 17
- Finished alignments
- Concatenated multiple sequence alignments
- Became more familiar with Git
- Created a rough draft of a Git tutorial
- Installed RAxML, Dendroscope, and PAUP*
- Began learning how to use PAUP*
- Read Papers: When are Fossils Better than Extant Taxa in Phylogenetic Analysis?, webPRANK: a phylogeny-aware multiple sequence aligner with interactive alignment browser, Supermatrix and species tree methods resolve phylogenetic relationships within the big cats, and Recent Trends in Molecular Phylogenetic Analysis: Where to Next?
- Read Book: Phylogenetic Trees Made Easy
This week has been difficult. I had to install RAxML and learn how to use it before I figured out that the program won't do what I want it to. Then I had to install and learn how to use PAUP*. Luckily PAUP* has plenty of great documentation compared to RAxML, so I didn't waste too much time during the transition.
Personal note: I really want a pudding pop. I have searched many of the convenience stores, Target, Walmart, and HEB but I cannot find any pudding pops.
Week 2: Jun 5 - Jun 10
- Met with Dr. Williams
- Chose an alignment method
- Updated Website
- Submitted Research Plan
- Was introduced to Git and learned its basic tools
- Created multiple sequence alignments using webPRANK for nucleotide and amino acid sequences
At first I thought about concatenating all sequences for every organism's gene into one master sequence and then performing the alignment on those sequences. However, not every organism has a sequence for every chosen gene. These gaps in data could have a negative effect on the multiple sequence alignment. I decided to instead do the alignments for each gene and then to concatenate the results of those MSAs. I will perform the alignment with both inputs and then compare the results to see if my supposition was correct.
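The second approach (align each gene separately, then concatenate the resulting MSAs) can be sketched as below. This is an illustration under stated assumptions, not the actual pipeline: the function name `concatenate_alignments`, the dictionary layout, the toy sequences, and the choice to pad a missing gene with gap characters are all hypothetical.

```python
def concatenate_alignments(gene_alignments, missing="-"):
    """Join per-gene alignments into one sequence per organism.

    gene_alignments: list of dicts {organism: aligned_sequence}, one per gene.
    An organism with no sequence for a gene is padded with `missing`
    characters so that every concatenated row has the same length.
    """
    organisms = sorted({org for aln in gene_alignments for org in aln})
    parts = {org: [] for org in organisms}
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must have equal length")
        width = lengths.pop()
        for org in organisms:
            parts[org].append(aln.get(org, missing * width))
    return {org: "".join(chunks) for org, chunks in parts.items()}


# Toy data: the mouse has no sequence for the second gene.
gene_a = {"human": "ATG-C", "mouse": "ATGGC", "cat": "ATG-C"}
gene_b = {"human": "GGT", "cat": "GGA"}

combined = concatenate_alignments([gene_a, gene_b])
for organism, sequence in combined.items():
    print(organism, sequence)
```

Padding after alignment keeps the missing data confined to whole blocks, rather than letting it influence how the aligner places gaps within the genes that are present.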
Git:
Git is a great tool which will be very useful to me during my time this summer and into the future. I'm very grateful to have been introduced to this tool because it's going to save me a lot of time and heartache. Git can be used to maintain stable copies of files which are resistant to the errors that I make very often. Additionally, Git can help to track changes in files by saving multiple versions which can be accessed easily without cluttering up folders. I plan to use Git to preserve the integrity of my data and also to keep my files organized as I experiment with variations of my existing code and data.
Week 1: May 30 - Jun 3
- Created CSE account
- Met with Suzanne to discuss project
- Created initial website: http://students.cse.tamu.edu/kpagel/
- Prepared a rough draft of the Research Plan
- Reviewed data
- Researched multiple sequence alignment and phylogenetic search tools
- Read papers: Big Cats Phylogenies, Consensus Trees, and Computational Thinking and IRESdb: the Internal Ribosome Entry Site database
It's very exciting to be here. I'm eager to learn a lot of new things and experience everything that Texas and TAMU have to offer. I hope I can make a lot of new friends.
I have quite a bit to learn and I'm a little bit overwhelmed at all the new information.
Summary of the Project:
The goal of the project is to look at the evolution of cancer genes. A selected set of cancer genes is collected and then multiple sequence alignments are performed to create a set of phylogenetic trees. By evaluating these trees via visualization and comparison the evolutionary history of a gene can be inferred.
Step 0: Acquire data set of 16 genes over ten organisms. Review the selected genes. Determine if there is enough data to create accurate trees and fill in any missing data.
Step 1: Select a multiple sequence alignment tool and perform alignments of the selected genes across organisms.
Step 2: Create an unrooted starting tree based upon results from Step 1 using random, NJ, or RSA.
Step 3: Perform phylogenetic search using TNT, MrBayes, and RAxML methods to determine sets of trees.
Step 4: Post-processing/Analysis of trees is performed using visualization and topological comparison to draw conclusions about the evolutionary relationships of the genes.
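To make the comparison in Step 4 concrete, here is a minimal sketch of one common topological measure, the Robinson-Foulds distance, which counts the splits found in one tree but not the other. The plan above does not say which measure is used, so this choice is an assumption, as are the nested-tuple tree encoding and the function names.

```python
def leaf_set(node):
    """Return the set of taxon names at or below a node (nested tuples of str)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def nontrivial_splits(tree):
    """Collect every split that leaves at least two taxa on each side."""
    all_taxa = leaf_set(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        below = leaf_set(node)
        if len(below) >= 2 and len(all_taxa) - len(below) >= 2:
            # Store both sides so the split is independent of rooting.
            found.add(frozenset([below, all_taxa - below]))
        for child in node:
            visit(child)

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: size of the symmetric difference of splits."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must contain the same taxa")
    return len(nontrivial_splits(tree_a) ^ nontrivial_splits(tree_b))


# Two toy five-taxon trees that disagree on the placement of B and C.
tree_1 = (("A", "B"), "C", ("D", "E"))
tree_2 = (("A", "C"), "B", ("D", "E"))
print(rf_distance(tree_1, tree_2))
```

A matrix of such pairwise distances over a set of trees is the kind of input that can be drawn as a heatmap.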