ShotGun: a Flexible Short Read Simulator to Facilitate Sequencing-based Study Designs


ShotGun is a flexible short read simulator. It generates sequence data with user-specified read length and average depth, accommodates cycle-specific sequencing error rates, and allows the read depth distribution to be either the ideal Poisson or the Negative Binomial, the latter to model the overdispersion observed in real sequencing data. In addition, ShotGun performs computationally efficient Single Nucleotide Polymorphism (SNP) discovery using a statistic aggregated across all sequenced samples. False positives can be controlled at any desired rate according to the null distribution of this multi-sample statistic. We consider ShotGun useful for evaluating the performance of methods for SNP discovery and for genotype calling at discovered sites and, more importantly, for guiding decisions on the design of sequencing-based studies. To facilitate ShotGun's utility for sequencing-based study design, we also provide DesignPlanner, a full pipeline that uses ShotGun to generate sequence data and perform initial SNP discovery, uses our previously presented linkage disequilibrium (LD)-aware method to call genotypes, and finally provides the effective sample size for each desired minor allele frequency (MAF) category. ShotGun plus DesignPlanner can accommodate sequencing data of arbitrary depth, a combination of high-depth and low-depth data (for example, whole-genome low-depth plus exonic high-depth), and a combination of sequence and genotype data (for example, whole-exome sequencing plus genotyping from an existing genome-wide association study [GWAS]). Please send comments to yunli@med.unc.edu.
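
To make the two modelling choices described above concrete, the following is a minimal Python sketch of per-site read depth drawn from either a Poisson or a Negative Binomial distribution, and of a read copied from a template with cycle-specific error rates. It is not ShotGun's code or interface. The function names, the `dispersion` parameter, and the uniform substitution model are all assumptions made for illustration, and the Negative Binomial is drawn here as a gamma-Poisson mixture, which is one standard construction and not necessarily the one ShotGun uses.

```python
import math
import random

BASES = "ACGT"


def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method.

    Adequate for the modest depths used in this sketch; exp(-mean)
    underflows for very large means, so it is not a general-purpose sampler.
    """
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_depth(mean_depth, rng, dispersion=None):
    """Draw the read depth at one site (hypothetical helper).

    dispersion=None gives the ideal Poisson depth. Otherwise the depth is
    Negative Binomial, built as a gamma-Poisson mixture with
    variance = mean + mean**2 / dispersion, so smaller values of
    `dispersion` mean stronger overdispersion.
    """
    if dispersion is None:
        return sample_poisson(mean_depth, rng)
    # Gamma with shape=dispersion and scale=mean/dispersion has mean mean_depth.
    site_rate = rng.gammavariate(dispersion, mean_depth / dispersion)
    return sample_poisson(site_rate, rng)


def simulate_read(template, cycle_error_rates, rng):
    """Copy `template`, miscalling the base at cycle i with probability
    cycle_error_rates[i]. Errors are uniform substitutions (an assumption)."""
    read = []
    for base, error_rate in zip(template, cycle_error_rates):
        if rng.random() < error_rate:
            base = rng.choice([b for b in BASES if b != base])
        read.append(base)
    return "".join(read)


if __name__ == "__main__":
    rng = random.Random(2024)
    poisson_depths = [sample_depth(4.0, rng) for _ in range(10)]
    negbin_depths = [sample_depth(4.0, rng, dispersion=2.0) for _ in range(10)]
    print("Poisson depths:          ", poisson_depths)
    print("Negative Binomial depths:", negbin_depths)

    # Illustrative error profile: the error rate rises with cycle number.
    template = "ACGTACGTACGTACGTACGT"
    error_rates = [0.001 + 0.002 * i for i in range(len(template))]
    print("Read:", simulate_read(template, error_rates, rng))
```

In this sketch both depth models share the same mean, so any difference between them comes from the extra site-to-site variance of the Negative Binomial, which is the overdispersion the text refers to. The linearly increasing error profile is only a placeholder; in practice the per-cycle rates would be supplied by the user.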