# Simulation Details underlying AbCD

**Results presented by AbCD were generated using the following simulation protocol:**

(1) We first randomly simulated 10 1Mb regions using the cosi bestfit models. Table 1 in the accompanying cosi publication (Schaffner et al 2005 *Genome Research*) shows the parameters calibrated for the bestfit models, which mimic the level of sequence variation, pattern of linkage disequilibrium, recombination rates and demographic history of four major populations: AA (African American), AF (African), AN (Asian), and EU (European). For each region, we simulated 450,000 chromosomes.

(2) Within each region, we then randomly picked 2*n* chromosomes from the population of 450,000 to form *n* diploid individuals, where *n* is referred to as sample size or number of individuals sequenced in the subsequent text.

(3) From the chromosomes picked in (2), we used ShotGun to generate short reads mimicking those from the Illumina Solexa technologies for 10 pre-specified sequencing depths (d = 0.5X, 2X, 4X, 6X, 8X, 10X, 15X, 20X, 25X and 30X).

(4) We then performed LD-based genotype calling using *thunder* on the short reads generated in (3).

(5) Finally, for each design (one set of *n*, d, and ethnicity), we summarized several key statistics by averaging across the ten simulated regions for each of the following seven MAF categories: (i) 0-0.1%; (ii) 0.1-0.2%; (iii) 0.2-0.5%; (iv) 0.5-1%; (v) 1-2%; (vi) 2-5%; (vii) 5-50%. The MAF-specific statistics summarized are: (a) Number of polymorphisms in the population of 450,000 chromosomes; (b) Number of variants segregating in the sample of *n* sequenced individuals; (c) Percent of *all* variants (that is, (a)) detected, which is upper bounded by (b) divided by (a); (d) Average information content, measured by dosage r², the squared Pearson correlation between imputed dosages and their corresponding true genotypes; and (e) Effective sample size, which is the product of *n* and the average information content.
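The bookkeeping in steps (2) and (5) can be illustrated with a minimal Python sketch. It is not the DesignPlanner, ShotGun or *thunder* code: all function names are hypothetical, the handling of MAF values that fall exactly on a category boundary (assigned to the lower category) is an assumption, and returning 0 for a site with no variance is likewise an assumption made for illustration.

```python
import random
from bisect import bisect_left

# Upper edges (as fractions) and labels of the seven MAF categories in step (5).
MAF_UPPER_EDGES = [0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.5]
MAF_LABELS = ["0-0.1%", "0.1-0.2%", "0.2-0.5%", "0.5-1%", "1-2%", "2-5%", "5-50%"]


def sample_diploids(population, n, rng):
    """Step (2): draw 2n chromosomes without replacement, pair them into n individuals."""
    picked = rng.sample(population, 2 * n)
    return [(picked[2 * i], picked[2 * i + 1]) for i in range(n)]


def maf_category(maf):
    """Map a minor allele frequency in (0, 0.5] to its category label.

    Assumption: a value exactly on a boundary goes to the lower category.
    """
    if not 0.0 < maf <= 0.5:
        raise ValueError("MAF must be in (0, 0.5]")
    return MAF_LABELS[bisect_left(MAF_UPPER_EDGES, maf)]


def dosage_r2(dosages, genotypes):
    """Statistic (d): squared Pearson correlation between imputed dosages and true genotypes."""
    n = len(dosages)
    if n != len(genotypes) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 entries")
    mean_d = sum(dosages) / n
    mean_g = sum(genotypes) / n
    cov = sum((d - mean_d) * (g - mean_g) for d, g in zip(dosages, genotypes))
    var_d = sum((d - mean_d) ** 2 for d in dosages)
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    if var_d == 0 or var_g == 0:
        return 0.0  # assumption: no variance is treated as no information
    return cov * cov / (var_d * var_g)


def max_percent_detected(n_segregating_in_sample, n_polymorphic_in_population):
    """Upper bound on statistic (c): (b) divided by (a), expressed as a percentage."""
    return 100.0 * n_segregating_in_sample / n_polymorphic_in_population


def effective_sample_size(n, average_information_content):
    """Statistic (e): n multiplied by the average information content."""
    return n * average_information_content
```

For example, `effective_sample_size(100, 0.8)` gives an effective sample size of about 80 individuals, and `maf_category(0.003)` falls in the `0.2-0.5%` category.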

**Notes:**
(*) The same simulation protocol was adopted in our published *thunder* paper.

(*) Steps (2)-(5) above are implemented in our DesignPlanner C-shell script wrapper, which, together with all the software used and 10 regions each of 100Kb length simulated by cosi, can be downloaded via the ShotGun Download Page.