This week I researched the tools used in the field for SNP calling, that is, going from .bam alignment files to a VCF with SNPs called. I worked through some example datasets to understand how the tools work, how they differ, and how long the entire process takes. This information could be helpful for other projects in the lab, and it was a good learning experience that showed me how the VCF files I had been working with were obtained from genome sequences.
I also started processing approximately 600 additional samples that were not included in the phase 3 1KGP file but were sequenced and include some individuals related to the original samples. However, these samples came in a VCF that was quite different from the original one (not annotated, not phased), so I started phasing and annotating them to put them in the form my pipeline needs.
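
To make the "not annotated, not phased" difference concrete, here is a minimal Python sketch of how one might check a VCF for both properties. It is not the tooling I actually used. It relies only on the VCF convention that phased genotypes are separated by `|` and unphased ones by `/`, and it treats an INFO column of `.` as "no annotation", which is a simplifying assumption. The function names `open_vcf` and `summarize_vcf_lines` are made up for illustration.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def summarize_vcf_lines(lines):
    """Count records, records with only phased genotypes, and records with INFO.

    Assumptions (not from any particular tool):
      * a record counts as phased if it has sample columns and no GT uses '/'
      * a record counts as annotated if its INFO column is not '.'
    """
    records = phased = with_info = 0
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        records += 1
        if fields[7] not in (".", ""):
            with_info += 1
        # Columns 9+ are per-sample; GT, when present, is the first FORMAT key.
        if len(fields) > 9 and fields[8].split(":")[0] == "GT":
            genotypes = [sample.split(":")[0] for sample in fields[9:]]
            if all("/" not in gt for gt in genotypes):
                phased += 1
    return {"records": records, "phased": phased, "with_info": with_info}
```

With a real file this could be called as `summarize_vcf_lines(open_vcf("samples.vcf.gz"))`, where the file name is a placeholder. If `phased` and `with_info` are both far below `records`, the file still needs phasing and annotation before it matches the format of the original samples.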