Analyses of Serum and Urine Biomarker Traits in UK Biobank Participants with African, South Asian, and East Asian Ancestry

Traits Assessed

We assessed 28 serum and 3 urine traits in total. The short-hand names used in file names and the corresponding full trait names are shown below.

Short-hand	Full Name
ALB	Albumin
ALP	Alkaline Phosphatase
ALT	Alanine aminotransferase
APOA	Apolipoprotein A
APOB	Apolipoprotein B
AST	Aspartate aminotransferase
BRB_direct	Direct Bilirubin
BRB_total	Total Bilirubin
Calcium	Calcium
creatinine	Creatinine
CRP	C-reactive Protein
CRTNU	Creatinine (enzymatic) in urine
CysC	Cystatin C
GGT	Gamma Glutamyltransferase
GLU	Glucose
HbA1c	Glycated haemoglobin (HbA1c)
HDL	High-Density Lipoprotein Cholesterol
IGF1	Insulin-like Growth Factor 1
LDL	Low Density Lipoprotein Cholesterol (Direct)
LPA	Lipoprotein (a)
Phosphate	Phosphate
Potassium	Potassium in urine
SHBG	Sex Hormone Binding Globulin
Sodium	Sodium in urine
TC	Total Cholesterol
Testosterone	Testosterone
Total_protein	Total protein
TRIG	Triglycerides
Urate	Urate
Urea	Urea
VitaminD	Vitamin D

File Format

This section briefly explains what each column name means. Autosomes and chromosome X have different data formats.

Autosomes

We used EPACTS for analysis. The meaning of each column is shown below.

CHROM: Chromosome numbers (1-22). No "chr" prefix.
BEG: Variant beginning position.
END: Variant ending position.
MARKER_ID: Marker ID, generally formatted as "chr:pos_ref/alt_chr:pos:ref:alt". A small number of variants are formatted as "chr:pos_ref/alt_rsIDs", indicating that these variants were rescued from files provided by UK Biobank with imputation to the UK10K + 1000 Genomes reference panel.
NS: Number of phenotyped samples with non-missing genotypes.
AC: Total Non-reference Allele Count.
CALLRATE: Fraction of non-missing genotypes.
GENOCNT: Genotype count. Format: RR/RA/AA, where R,A indicate reference allele and alternative allele respectively.
MAF: Minor allele frequency.
STAT: Single variant test statistic.
PVALUE: P-value of single variant test.
BETA: Effect size.
SE: Standard error of the effect size.
R2: Model R2 of single variant test.
Imp_Rsq: Imputation Rsq of the variant.
Note: The effect allele is the alternative allele.
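The MARKER_ID and GENOCNT formats described above can be unpacked with a few lines of Python. The following is a minimal sketch, not part of the original pipeline: the function names, the `Marker` type, and the example values are hypothetical, and it assumes alleles never contain "/" or "_". Because the data are imputed, the AC column in the files may not exactly equal the value derived from GENOCNT here.

```python
import re
from typing import NamedTuple, Tuple


class Marker(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str      # alternative allele, which is the effect allele
    suffix: str   # either "chr:pos:ref:alt" or rsID(s) for rescued variants


# Hypothetical pattern for "chr:pos_ref/alt_<suffix>" as described above.
_MARKER_RE = re.compile(r"^([^:]+):(\d+)_([^/]+)/([^_]+)_(.+)$")


def parse_marker_id(marker_id: str) -> Marker:
    """Split a MARKER_ID string into its components."""
    match = _MARKER_RE.match(marker_id)
    if match is None:
        raise ValueError(f"Unrecognized MARKER_ID: {marker_id!r}")
    chrom, pos, ref, alt, suffix = match.groups()
    return Marker(chrom, int(pos), ref, alt, suffix)


def allele_summary(genocnt: str) -> Tuple[int, int, float]:
    """Derive sample count, alternative allele count, and MAF from RR/RA/AA."""
    rr, ra, aa = (int(x) for x in genocnt.split("/"))
    n_samples = rr + ra + aa
    alt_count = ra + 2 * aa                 # each AA genotype carries two alt alleles
    alt_freq = alt_count / (2 * n_samples)
    maf = min(alt_freq, 1 - alt_freq)       # minor allele may be ref or alt
    return n_samples, alt_count, maf


# Made-up example values, for illustration only.
marker = parse_marker_id("1:12345_A/G_1:12345:A:G")
summary = allele_summary("90/9/1")
```

Here `marker.alt` is the effect allele for BETA, and a `suffix` that starts with "rs" would flag one of the rescued variants.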

Chromosome X

We separated males and females when analyzing chromosome X, and then used GWAMA to perform meta analysis. Column names and their meanings are shown below. Please refer to the GWAMA tutorial for detailed information.

ID: Marker ID, formatted as "X:pos:ref:alt".
reference_allele: Effect allele.
other_allele: The other allele.
eaf: Effect allele frequency.
beta: Meta analyzed effect size.
se: Standard error of meta analyzed effect size.
beta_95L: Lower bound of 95% confidence interval of meta analyzed effect size.
beta_95U: Upper bound of 95% confidence interval of meta analyzed effect size.
z: Z score of meta analyzed effect size.
p-value: P value of meta analyzed effect size.
_-log10_p-value: -log10 p value (of meta analyzed effect size).
q_statistic: Cochran's heterogeneity statistic.
q_p-value: P value of the Q statistic.
i2: Heterogeneity index I2 by Higgins et al (2003).
n_studies: Number of studies with marker present.
n_samples: Number of samples with marker present.
effects: Summary of effect directions ('+' - positive effect of reference allele, '-' - negative effect of reference allele, '0' - no (or non-significant) effect of reference allele, '?' - missing data).
male_eaf: Male specific effect allele frequency.
male_beta: Male specific effect size.
male_se: Standard error of male specific effect size.
male_beta_95L: Lower bound of 95% confidence interval of male specific effect size.
male_beta_95U: Upper bound of 95% confidence interval of male specific effect size.
male_z: Z score of male specific effect size.
male_p-value: P value of male specific effect size.
male_n_studies: Number of studies with marker present for male.
male_n_samples: Number of samples with marker present for male.
female_eaf: Female specific effect allele frequency.
female_beta: Female specific effect size.
female_se: Standard error of female specific effect size.
female_beta_95L: Lower bound of 95% confidence interval of female specific effect size.
female_beta_95U: Upper bound of 95% confidence interval of female specific effect size.
female_z: Z score of female specific effect size.
female_p-value: P value of female specific effect size.
female_n_studies: Number of studies with marker present for female.
female_n_samples: Number of samples with marker present for female.
gender_differentiated_p-value: Combined p-value of males and females assuming different effect sizes between genders (2 degrees of freedom).
gender_heterogeneity_p-value: Heterogeneity between genders (1 degree of freedom).
imp_Rsq: Imputation Rsq of the marker.
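Several of the chromosome X columns are related to one another. The sketch below shows how the confidence bounds and Z score could be recomputed from beta and se under a normal approximation, and how the effects string can be tallied. This is an assumption for illustration, not a statement of how GWAMA computes these columns internally; the function names and example values are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def ci95(beta: float, se: float) -> Tuple[float, float]:
    """Normal-approximation 95% confidence interval (cf. beta_95L, beta_95U)."""
    return beta - Z_95 * se, beta + Z_95 * se


def z_score(beta: float, se: float) -> float:
    """Z score as effect size divided by its standard error (cf. z)."""
    return beta / se


def summarize_effects(effects: str) -> Dict[str, int]:
    """Count the direction symbols in an effects string such as '+-'."""
    labels = {"+": "positive", "-": "negative", "0": "none", "?": "missing"}
    counts = Counter(effects)
    unknown = set(counts) - set(labels)
    if unknown:
        raise ValueError(f"Unexpected symbols in effects: {sorted(unknown)}")
    return {name: counts.get(symbol, 0) for symbol, name in labels.items()}


# Made-up example values, for illustration only.
lower, upper = ci95(0.10, 0.05)
z = z_score(0.10, 0.05)
directions = summarize_effects("+-")
```

The same helpers apply to the male_ and female_ prefixed columns. The order of the symbols in the effects string is not described here, so the sketch only counts them rather than assigning them to a sex.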


File Naming Rules

(1) Because Testosterone differs dramatically by sex, we analyzed this trait separately by sex. Files are named "type_Testosterone_cohort_sex_all.txt.gz". Each file contains genome-wide variants (chr1-22,X).

type: uncond or cond
cohort: AFR, SAS or EAS
sex: female or male

(2) For the remaining 30 traits, we did not stratify by sex but instead treated sex as a covariate. We performed both analyses with no adjustment for previously identified variants (uncond) and analyses conditioned on GWAS catalog and Sinnott-Armstrong et al. preprint variants for each serum and urine biomarker (cond).

a. In the three ancestry-specific sub-folders of the uncond folder, each trait has two files named "uncond_trait_cohort_auto.txt.gz" and "uncond_trait_cohort_chrX.txt.gz".

trait: the short-hand listed in the left column of the table in the Traits Assessed section.
auto: autosomes, containing variants on chr1-22.
chrX: X chromosome only. We performed sex-stratified analyses and then meta-analyzed for chrX.

b. In the three ancestry-specific sub-folders of the cond folder, each trait has autosome results, but only traits with known X chromosome signals have chrX conditional summary statistics.


The University of North Carolina at Chapel Hill
