This week I learned a lot about parallel computing, which feels like a whole new branch of computer science that I knew nothing about. I've been running tests on the supercomputer and learning how to allocate nodes, cores, memory, and runtime, as well as how to write PBS scripts and submit jobs to the queue. I've also had a lot of fun writing a script that starts from a single query of all genes above a low similarity score (which takes quite a while to run). Instead of running multiple queries at different scores to try to get the right number of proteins per compound, it takes all the data and keeps filtering the proteins for each compound until it is down to a set maximum number. This way each compound can have its own similarity cutoff, which narrows the spread in the number of proteins per compound and maximizes the chance of the target being in the group.
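To make the filtering idea concrete, here is a minimal Python sketch. It is not the actual script: the function name, the `(compound, protein, score)` row layout, and the choice to keep each protein's best score are all assumptions made for illustration.

```python
from collections import defaultdict


def top_proteins_per_compound(hits, max_per_compound):
    """Keep at most `max_per_compound` best-scoring proteins per compound.

    `hits` is an iterable of (compound, protein, score) rows, assumed to
    come from one broad query run at a low similarity threshold.
    Returns {compound: {"proteins": [...], "cutoff": score}} where
    "cutoff" is the lowest score that was kept for that compound.
    """
    # Best score seen for each protein, grouped by compound (hypothetical layout).
    best = defaultdict(dict)
    for compound, protein, score in hits:
        if score > best[compound].get(protein, float("-inf")):
            best[compound][protein] = score

    result = {}
    for compound, scores in best.items():
        # Rank proteins from most to least similar, then truncate.
        ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        kept = ranked[:max_per_compound]
        result[compound] = {
            "proteins": [protein for protein, _ in kept],
            "cutoff": kept[-1][1],  # effective per-compound similarity cutoff
        }
    return result


if __name__ == "__main__":
    example_hits = [
        ("c1", "pA", 0.9),
        ("c1", "pB", 0.5),
        ("c1", "pC", 0.7),
        ("c2", "pA", 0.4),
    ]
    print(top_proteins_per_compound(example_hits, max_per_compound=2))
```

The expensive query only has to run once, and the effective cutoff falls out of the ranking for each compound rather than being guessed up front.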
Minneapolis is getting very hot, but it's also really beautiful this time of year. If you're ever here on a sunny summer day, I recommend checking out the bookhouse in Dinkytown and then the grassy field down by the Mississippi River.