Programs

Maya Anand bio photo By Maya Anand

Programs

  • whole genome association analysis toolset
  • does a lot of different things: data management, summary statistics for quality control, population stratification detection, basic association testing, multimarker predictors, haplotypic tests, copy number variant analysis, etc.
  • More details on website

SAMTOOLS

  • Utilities for manipulating alignments in the SAM, BAM, CRAM format: includes sorting, merging, indexing, generating alignments in a per-position format
  • samtools pileup -v inputfile.bam -> generates a vcf
  • samtools phase -> call and phase heterozygous SNPs
  • samtools bam2fq -> converts bam into FASTQ file formats

BCFTOOLS

  • set of utilities that manipulate variant calls in the vcf and bcf files
  • convert 23andme into VCF: bcftools convert -c ID,CHROM,POS,AA -s SampleName -f 23andme-ref. fa –tsv2vcf 23andme.txt -0z -o out.vcf.gz

VCFTOOLS

  • tools for vcf and bcf files mainly to summarize data, run calculations on data, filter out data, and convert data into other useful file formats

TABIX/bgzip

  • TABIX indexes tab-delimited genome position file and creates an index file
  • input file needs to be position sorted and compressed by bgzip