UNCcombo: Likelihood based association testing for next generation sequencing data without intermediate genotype calling

What You Need


Installation

After downloading UNCcombo_0.1.tar.gz into a chosen local folder "local_path",
    1. Start the R environment.
    2. Use the R command
       install.packages("local_path/UNCcombo_0.1.tar.gz", repos = NULL, type="source")
       to install UNCcombo.
    3. Use the R command library("UNCcombo") to load UNCcombo.

How to Run

(1) UNC score test:

    UNCcombo.score.test(trait,type,snp.glf, cov, snp.name=NULL)
Input:
  • trait: a vector of trait values (Binary trait must take 0/1 values).
  • type: "quantitative" or "binary" (for trait).
  • snp.glf: Genotype Likelihood Functions (GLFs) of a SNP, given as a matrix of three columns, one column per GLF. For instance, if the major allele is A and the minor allele is B, the column order is glf(AA), glf(AB), glf(BB). GLFs can be obtained using bcftools on bcf files (sample command line: bcftools view bam.raw.bcf | grep -v "#" | grep '^[1-9]' | cut -f 1,2,4,5,10- > bam.glf )
  • cov: a matrix of covariates.
  • snp.name (optional): name of the SNP
Output: a vector containing the SNP name (if supplied), the p-value of the score test, and the estimated minor allele frequency for the given SNP.
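
The input shapes described above can be illustrated with simulated data. This is only a sketch: the sample size, covariate names, SNP name, and simulated values are made up, and rescaling each GLF row to sum to 1 is done for readability, not because the package is known to require it. The call to UNCcombo.score.test is skipped when the package is not installed.

```r
# Simulated inputs for one SNP; all names and values here are illustrative.
set.seed(1)
n <- 100                                   # number of individuals

trait <- rbinom(n, size = 1, prob = 0.5)   # binary trait coded 0/1
cov   <- cbind(age = rnorm(n, 50, 10),     # covariate matrix, one row per individual
               sex = rbinom(n, 1, 0.5))

# Three genotype likelihoods per individual, ordered glf(AA), glf(AB), glf(BB).
raw     <- matrix(runif(n * 3), nrow = n, ncol = 3)
snp.glf <- raw / rowSums(raw)              # per-row rescaling (an assumption, see above)
colnames(snp.glf) <- c("AA", "AB", "BB")

# Basic shape checks before calling the test.
stopifnot(all(trait %in% c(0, 1)),
          nrow(snp.glf) == n, ncol(snp.glf) == 3,
          nrow(cov) == n)

# Run the score test only if the package is installed.
if (requireNamespace("UNCcombo", quietly = TRUE)) {
  res <- UNCcombo::UNCcombo.score.test(trait, type = "binary", snp.glf, cov,
                                       snp.name = "snp1")
  print(res)
}
```

The same trait, snp.glf, and cov objects can be passed unchanged to the other two tests, since all three functions share these arguments.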

(2) UNC LRT:

    UNCcombo.LRT(trait,type,snp.glf, cov, snp.name=NULL)
Input:
  • trait: a vector of trait values (Binary trait must take 0/1 values).
  • type: "quantitative" or "binary" (for trait).
  • snp.glf: a matrix of three columns. The three columns correspond to the values of three genotype likelihood functions of a SNP. For instance, if major allele is A and minor allele is B, then the order of three GLFs is glf(AA), glf(AB), glf(BB).
  • cov: a matrix of covariates.
  • snp.name (optional): name of the SNP
Output: a vector containing the SNP name (if supplied), the p-value of the LRT, and the estimated minor allele frequency for the given SNP.

(3) UNC combo:

    UNCcombo.combo(trait,type,snp.glf, cov, maf_threshold=0,snp.name=NULL)
Input:
  • trait: a vector of trait values.
  • type: "quantitative" or "binary" (Binary trait must take 0/1 values).
  • snp.glf: a matrix of three columns. The three columns correspond to the values of three genotype likelihood functions of a SNP. For instance, if major allele is A and minor allele is B, then the order of three GLFs is glf(AA), glf(AB), glf(BB).
  • maf_threshold: MAF threshold. Default is 0.
  • cov: a matrix of covariates.
  • snp.name (optional): name of the SNP
Output: a vector containing the SNP name (if supplied), the p-value of the LRT, and the estimated minor allele frequency for the given SNP.
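
Each test reports an estimated minor allele frequency computed from the GLFs. How UNCcombo computes that estimate is not described here. As a rough illustration of the idea only, the sketch below uses a standard EM algorithm under Hardy-Weinberg equilibrium; the function name estimate_maf_em and its arguments are hypothetical and are not part of the package.

```r
# EM estimate of the B-allele frequency from genotype likelihoods,
# assuming Hardy-Weinberg equilibrium. Illustrative only; not the package's code.
estimate_maf_em <- function(glf, p_init = 0.2, tol = 1e-8, max_iter = 500) {
  glf <- as.matrix(glf)                 # columns ordered glf(AA), glf(AB), glf(BB)
  p <- p_init
  p_new <- p_init
  for (iter in seq_len(max_iter)) {
    prior <- c((1 - p)^2, 2 * p * (1 - p), p^2)   # P(AA), P(AB), P(BB) under HWE
    post  <- sweep(glf, 2, prior, `*`)            # likelihood times prior
    post  <- post / rowSums(post)                 # posterior genotype probabilities
    # Expected number of B alleles divided by the total number of alleles.
    p_new <- sum(post %*% c(0, 1, 2)) / (2 * nrow(glf))
    if (abs(p_new - p) < tol) break
    p <- p_new
  }
  p_new
}

# Toy check with fully informative likelihoods: 6 AA, 3 AB, 1 BB individuals.
toy <- rbind(matrix(rep(c(1, 0, 0), 6), ncol = 3, byrow = TRUE),
             matrix(rep(c(0, 1, 0), 3), ncol = 3, byrow = TRUE),
             matrix(rep(c(0, 0, 1), 1), ncol = 3, byrow = TRUE))
maf_toy <- estimate_maf_em(toy)
```

An estimate of this kind could be compared against maf_threshold before choosing a test, but the exact rule UNCcombo.combo applies is not stated in this document.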

Example

  • Download example.dat and example.cov.
  • Prepare input:
    pheno_glf<- read.table("local_path/example.dat")
    cov <- as.matrix(read.table("local_path/example.cov"))
    pheno <- pheno_glf[,1]
    glf <- pheno_glf[,-1]
    snp_num <- (ncol(pheno_glf) - 1)/3
  • Run the tests:
    # UNC score test: one row per SNP, holding the p-value and the estimated MAF
    score_results <- matrix(1,snp_num,2)
    for(k in 1:snp_num)
      score_results[k,] <- UNCcombo.score.test(pheno,type="binary",glf[,(3*k-2):(3*k)],cov)

    # UNC LRT: same layout as score_results
    lrt_results <- matrix(1,snp_num,2)
    for(k in 1:snp_num)
      lrt_results[k,] <- UNCcombo.LRT(pheno,type="binary",glf[,(3*k-2):(3*k)],cov)

    # UNC combo: same layout, using the default maf_threshold of 0
    combo_results <- matrix(1,snp_num,2)
    for(k in 1:snp_num)
      combo_results[k,] <- UNCcombo.combo(pheno,type="binary",glf[,(3*k-2):(3*k)],cov)
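
The three loops in the example differ only in the function they call, so they could be wrapped in one helper. The sketch below is one possible way to do that; run_per_snp and dummy_test are hypothetical names, and the column labels assume the output order given above (p-value first, then estimated MAF) when snp.name is left as NULL. dummy_test is a stand-in with the same call shape so the sketch runs without UNCcombo installed.

```r
# Apply a per-SNP test function to every three-column GLF block.
# `run_per_snp` and `dummy_test` are hypothetical names used for illustration.
run_per_snp <- function(test_fun, pheno, glf, cov, type = "binary") {
  stopifnot(ncol(glf) %% 3 == 0)          # GLF columns must come in triplets
  snp_num <- ncol(glf) / 3
  out <- matrix(NA_real_, nrow = snp_num, ncol = 2,
                dimnames = list(NULL, c("pvalue", "maf")))
  for (k in seq_len(snp_num)) {
    cols <- (3 * k - 2):(3 * k)           # columns of the k-th SNP
    out[k, ] <- as.numeric(test_fun(pheno, type = type, glf[, cols], cov))
  }
  as.data.frame(out)
}

# Stand-in with the same call shape, so the sketch runs without UNCcombo.
# It returns a fixed "p-value" of 1 and the mean of the third GLF column.
dummy_test <- function(trait, type, snp.glf, cov, snp.name = NULL) {
  c(1, mean(as.matrix(snp.glf)[, 3]))
}

demo_glf <- matrix(0.5, nrow = 4, ncol = 6)     # two SNPs, four individuals
demo_res <- run_per_snp(dummy_test, c(0, 1, 0, 1), demo_glf, matrix(0, 4, 1))
```

With the package loaded and the example data read in, the equivalent call would be run_per_snp(UNCcombo.score.test, pheno, glf, cov), and likewise for UNCcombo.LRT and UNCcombo.combo.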