UNCcombo: Likelihood based association testing for next generation sequencing data without intermediate genotype calling
What You Need
Installation
After downloading UNCcombo_0.1.tar.gz into a chosen local folder "local_path",
1. Start the R environment.
2. Use R command
install.packages("local_path/UNCcombo_0.1.tar.gz", repos = NULL, type="source")
to install UNCcombo.
3. Use R command
library("UNCcombo")
to load UNCcombo.
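The install and load steps can be wrapped so that the tarball is only installed when the package is not already available. This is a minimal sketch: install_if_missing is a hypothetical helper name, not part of UNCcombo, and "local_path" is the same placeholder folder used in the steps above.

```r
# Hypothetical helper (not part of UNCcombo): install a package from a local
# source tarball only if it cannot already be loaded.
install_if_missing <- function(pkg, tarball) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(TRUE)                       # already installed, nothing to do
  }
  if (!file.exists(tarball)) {
    message("Tarball not found: ", tarball)
    return(FALSE)                      # nothing to install from
  }
  install.packages(tarball, repos = NULL, type = "source")
  requireNamespace(pkg, quietly = TRUE)  # TRUE only if the install succeeded
}

# "local_path" is a placeholder; replace it with the real download folder.
ok <- install_if_missing("UNCcombo", "local_path/UNCcombo_0.1.tar.gz")
if (ok) library("UNCcombo")
```

If the tarball path is wrong, the helper reports it and returns FALSE instead of stopping with an error.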
How to Run
(1) UNC score test:
UNCcombo.score.test(trait,type,snp.glf, cov, snp.name=NULL)
Input:
- trait: a vector of trait values (Binary trait must take 0/1 values).
- type: "quantitative" or "binary" (for trait).
- snp.glf: genotype likelihood function (GLF) values, given as a matrix with three columns, one per genotype of the SNP. For instance, if the major allele is A and the minor allele is B, the column order is glf(AA), glf(AB), glf(BB). GLFs can be obtained by running bcftools on BCF files (sample command line: bcftools view bam.raw.bcf | grep -v "#" | grep '^[1-9]' | cut -f 1,2,4,5,10- > bam.glf).
- cov: a matrix of covariates.
- snp.name (optional): name of the SNP
Output: a vector containing the SNP name (optional), the p-value of the score test, and the estimated minor allele frequency for the given SNP.
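The snp.glf matrix has to be built from whatever the genotype-likelihood file contains. The sketch below assumes, and this is not stated in the text above, that each individual's field carries Phred-scaled likelihoods (the VCF "PL" convention) as a comma-separated triple, so that likelihood = 10^(-PL/10). The helper name pl_to_glf and the example strings are illustrative only. Note that PL values are ordered by REF/ALT allele, so the first and third columns may need to be swapped when the reference allele is the minor allele.

```r
# Hypothetical helper (not part of UNCcombo): convert comma-separated
# Phred-scaled likelihood strings into an n x 3 likelihood matrix.
pl_to_glf <- function(pl_strings) {
  # Split each "a,b,c" string and stack the numeric triples as rows
  pl <- do.call(rbind,
                lapply(strsplit(pl_strings, ",", fixed = TRUE), as.numeric))
  stopifnot(ncol(pl) == 3)
  glf <- 10^(-pl / 10)                 # Phred scale -> likelihood scale
  colnames(glf) <- c("glf_AA", "glf_AB", "glf_BB")
  glf
}

# Made-up PL triples for three individuals (illustration only)
pl_example <- c("0,30,255", "20,0,40", "255,45,0")
snp.glf <- pl_to_glf(pl_example)
print(snp.glf)
```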
(2) UNC LRT:
UNCcombo.LRT(trait,type,snp.glf, cov, snp.name=NULL)
Input:
- trait: a vector of trait values (Binary trait must take 0/1 values).
- type: "quantitative" or "binary" (for trait).
- snp.glf: a matrix with three columns holding the values of the three genotype likelihood functions (GLFs) of a SNP. For instance, if the major allele is A and the minor allele is B, the column order is glf(AA), glf(AB), glf(BB).
- cov: a matrix of covariates.
- snp.name (optional): name of the SNP
Output: a vector containing the SNP name (optional), the p-value of the LRT, and the estimated minor allele frequency for the given SNP.
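The output includes an estimated minor allele frequency, but the text does not say how it is computed. One common way to estimate an allele frequency directly from genotype likelihoods is an EM algorithm under Hardy-Weinberg equilibrium. The sketch below illustrates that general technique only; it is an assumption for illustration and may differ from what UNCcombo does internally. The function name estimate_maf_em and its arguments are hypothetical.

```r
# Illustrative EM estimate of the frequency of allele B from an n x 3 matrix
# of genotype likelihoods in the order glf(AA), glf(AB), glf(BB).
# Not taken from UNCcombo; assumes Hardy-Weinberg genotype priors.
estimate_maf_em <- function(snp.glf, p_init = 0.2, max_iter = 100, tol = 1e-8) {
  snp.glf <- as.matrix(snp.glf)
  stopifnot(ncol(snp.glf) == 3)
  p <- p_init
  for (iter in seq_len(max_iter)) {
    # E-step: posterior genotype probabilities given the current frequency
    prior <- c((1 - p)^2, 2 * p * (1 - p), p^2)
    post <- sweep(snp.glf, 2, prior, `*`)
    post <- post / rowSums(post)
    # M-step: expected number of B alleles divided by total alleles
    p_new <- sum(post[, 2] + 2 * post[, 3]) / (2 * nrow(post))
    converged <- abs(p_new - p) < tol
    p <- p_new
    if (converged) break
  }
  p
}

# Toy input: genotypes known with certainty (AA, AA, AB, BB)
toy_glf <- rbind(c(1, 0, 0), c(1, 0, 0), c(0, 1, 0), c(0, 0, 1))
maf <- estimate_maf_em(toy_glf)
print(maf)
```

With certain genotypes the estimate reduces to simple allele counting (3 B alleles out of 8 here); with uncertain likelihoods it weights each genotype by its posterior probability.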
(3) UNC combo:
UNCcombo.combo(trait,type,snp.glf, cov, maf_threshold=0,snp.name=NULL)
Input:
- trait: a vector of trait values (Binary trait must take 0/1 values).
- type: "quantitative" or "binary" (for trait).
- snp.glf: a matrix with three columns holding the values of the three genotype likelihood functions (GLFs) of a SNP. For instance, if the major allele is A and the minor allele is B, the column order is glf(AA), glf(AB), glf(BB).
- maf_threshold: minor allele frequency (MAF) threshold. Default is 0.
- cov: a matrix of covariates.
- snp.name (optional): name of the SNP
Output: a vector containing the SNP name (optional), the p-value of the combo test, and the estimated minor allele frequency for the given SNP.
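All three functions share the same trait, type, snp.glf and cov inputs, so it can help to check them once before calling any test. The sketch below encodes the constraints listed above (type is "quantitative" or "binary", a binary trait takes 0/1 values, snp.glf has three columns). It additionally assumes that rows of snp.glf and cov correspond to the individuals in trait, which the text implies but does not state. check_unc_inputs is a hypothetical helper, not part of UNCcombo.

```r
# Hypothetical input check (not part of UNCcombo). Returns a character vector
# of problems; an empty vector means no problem was found.
check_unc_inputs <- function(trait, type, snp.glf, cov) {
  snp.glf <- as.matrix(snp.glf)
  cov <- as.matrix(cov)
  problems <- character(0)
  if (!type %in% c("quantitative", "binary")) {
    problems <- c(problems, 'type must be "quantitative" or "binary"')
  }
  if (identical(type, "binary") && !all(trait %in% c(0, 1))) {
    problems <- c(problems, "binary trait must take 0/1 values")
  }
  if (ncol(snp.glf) != 3) {
    problems <- c(problems, "snp.glf must have three columns")
  }
  # Assumption: one row per individual in both snp.glf and cov
  if (nrow(snp.glf) != length(trait)) {
    problems <- c(problems, "snp.glf rows do not match length of trait")
  }
  if (nrow(cov) != length(trait)) {
    problems <- c(problems, "cov rows do not match length of trait")
  }
  problems
}

# Toy inputs for four individuals (made-up values, illustration only)
trait <- c(0, 1, 1, 0)
snp.glf <- matrix(1 / 3, nrow = 4, ncol = 3)
cov <- matrix(c(23, 35, 41, 52), ncol = 1)
print(check_unc_inputs(trait, "binary", snp.glf, cov))
```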
Example
- Download example.dat and example.cov.
- Prepare input:
# First column is the phenotype; the remaining columns are GLFs, three per SNP
pheno_glf <- read.table("local_path/example.dat")
cov <- as.matrix(read.table("local_path/example.cov"))
pheno <- pheno_glf[, 1]
glf <- pheno_glf[, -1]
snp_num <- (ncol(pheno_glf) - 1) / 3
- Run the tests:
# Each result matrix has one row per SNP: p-value and estimated MAF
# (no SNP name is returned because snp.name is left as NULL)
# UNC score test
score_results <- matrix(1, snp_num, 2)
for (k in seq_len(snp_num)) {
  score_results[k, ] <- UNCcombo.score.test(pheno, type = "binary", glf[, (3*k-2):(3*k)], cov)
}
# UNC LRT
lrt_results <- matrix(1, snp_num, 2)
for (k in seq_len(snp_num)) {
  lrt_results[k, ] <- UNCcombo.LRT(pheno, type = "binary", glf[, (3*k-2):(3*k)], cov)
}
# UNC combo
combo_results <- matrix(1, snp_num, 2)
for (k in seq_len(snp_num)) {
  combo_results[k, ] <- UNCcombo.combo(pheno, type = "binary", glf[, (3*k-2):(3*k)], cov)
}
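The three loops in the example share the same pattern: take the three GLF columns of SNP k and pass them to a test function. A minimal sketch of that pattern as a reusable helper follows. run_per_snp and dummy_test are hypothetical names introduced here for illustration; dummy_test is a stand-in so that the block runs without UNCcombo installed, and the toy matrix is made up.

```r
# Hypothetical helper (not part of UNCcombo): apply test_fun to each
# consecutive block of three GLF columns and stack the results by row.
run_per_snp <- function(glf, test_fun) {
  glf <- as.matrix(glf)
  stopifnot(ncol(glf) %% 3 == 0)       # three GLF columns per SNP
  snp_num <- ncol(glf) / 3
  results <- lapply(seq_len(snp_num), function(k) {
    cols <- (3 * k - 2):(3 * k)
    test_fun(glf[, cols, drop = FALSE])
  })
  do.call(rbind, results)
}

# Stand-in for a real test function; returns two numbers per SNP
dummy_test <- function(snp.glf) {
  c(pvalue = NA_real_, mean_glf_BB = mean(snp.glf[, 3]))
}

# Toy GLF matrix: 4 individuals, 2 SNPs (6 columns), made-up values
toy <- matrix(seq_len(24) / 24, nrow = 4)
res <- run_per_snp(toy, dummy_test)
print(res)

# With UNCcombo loaded and pheno, glf, cov prepared as in the example,
# the score test loop could then be written as:
# score_results <- run_per_snp(glf, function(g)
#   UNCcombo.score.test(pheno, type = "binary", g, cov))
```

The column-count check catches input files whose GLF columns are not a multiple of three before any test is run.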