Week 4

Project Progress

With our weekly workshop cancelled, this week was entirely filled with research, and I’ve found that compiling results for the compounds is actually pretty fun (even when it doesn’t work out). Chad dug up a “gold standard” set (~15 compounds with known targets) from the chemical genomics project, and that is what I have been working with. The pipeline works beautifully for azole antifungals (fluconazole, itraconazole, etc.), which make up about a quarter of the set, but not so much for the other classes, for a variety of reasons.

Only about half of the set has target proteins with structures in the Protein Data Bank to begin with. Half of those only have structures determined by NMR (nuclear magnetic resonance spectroscopy), which you don’t have to know anything about except that NMR structures don’t dock well enough to be incorporated into the pipeline. Four of the compounds don’t have chemical genomic profiles and two aren’t in the high-confidence set, meaning they effectively don’t have reliable profiles. To work around the missing structures, I have been docking protein homologs (proteins with sequences similar to the target’s), which has been fairly successful.

At this point, choosing the proteins to dock takes up the majority of the time; the conversion, preparation, docking, and data collection steps have been almost entirely automated. I still have to figure out how to query the different lists and databases and come up with a reasonable list of proteins to run through the pipeline. The good news is that once I do that, we can run the whole compound library, which means I get to submit a job to the supercomputer! (I’m way too excited about this, maybe because “supercomputer” just sounds really cool.)
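
For the curious, the protein-selection logic described above can be sketched in a few lines of Python. This is not the actual pipeline code, just an illustration: the `Candidate` fields, the `triage` function, the placeholder compound and protein names, and the rule that compounds without a reliable profile get skipped are all assumptions made for this sketch. The method strings follow the style of the experimental-method labels used in the Protein Data Bank.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    compound: str                # placeholder name, not a real compound
    target: str                  # placeholder name, not a real protein
    method: Optional[str]        # experimental method of the best structure, None if no structure
    has_reliable_profile: bool   # has a profile AND is in the high-confidence set


def triage(c: Candidate) -> str:
    """Decide what to do with one compound/target pair (hypothetical rules)."""
    if not c.has_reliable_profile:
        # Assumption of this sketch: nothing to compare docking results against.
        return "skip"
    if c.method is None or "NMR" in c.method.upper():
        # No structure, or NMR-only: fall back to a homolog with a usable structure.
        return "homolog"
    return "dock"


# Made-up records, one per case described in the post.
candidates = [
    Candidate("compound_A", "protein_A", "X-RAY DIFFRACTION", True),
    Candidate("compound_B", "protein_B", "SOLUTION NMR", True),
    Candidate("compound_C", "protein_C", None, True),
    Candidate("compound_D", "protein_D", "X-RAY DIFFRACTION", False),
]

plan = {c.compound: triage(c) for c in candidates}
for compound, action in plan.items():
    print(compound, action)
```

The hard part is not this decision itself but filling in the `method` and `has_reliable_profile` fields, which is where querying the different lists and databases comes in.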


Outside Research

