GWAS_QRMR provides an R plugin for plink 1.9 that performs genome-wide quantile regression on quantitative trait data, together with a workflow for subsequent meta-regression. In contrast to ordinary (mean) linear regression, quantile regression can capture variants whose effect sizes are non-homogeneous across different subsets of a population, a hallmark of GxG and GxE interactions. In contrast to other approaches for studying interactions, QRMR requires no knowledge about the identity of the interacting feature.
For more information, check out this slide deck.
- Run quantile regression using plink and the R plugin. For each specified quantile, obtain the estimated effect size and standard error.
- Perform meta-regression to test for association between effect size and quantile.
- (Optional) Run mean linear regression on the same variants for comparison, again obtaining the estimated effect size and standard error.
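The meta-regression step can be illustrated with a small sketch. This is not the repository's code (its own example is an R Markdown file); it is a standard-library Python illustration of a fixed-effect meta-regression with a linear term only. It assumes the per-quantile estimates can be treated as independent with known standard errors. The function name and all numbers are made up for illustration.

```python
import math

def meta_regression_slope(taus, betas, ses):
    """Fixed-effect meta-regression of effect size on quantile level.

    Fits beta_tau = b0 + b1 * tau by weighted least squares with
    weights 1 / se^2 and returns (b0, b1, se_b1, z, p) for the slope b1.
    """
    w = [1.0 / s ** 2 for s in ses]
    sw = sum(w)
    swx = sum(wi * t for wi, t in zip(w, taus))
    swy = sum(wi * b for wi, b in zip(w, betas))
    swxx = sum(wi * t * t for wi, t in zip(w, taus))
    swxy = sum(wi * t * b for wi, t, b in zip(w, taus, betas))
    det = sw * swxx - swx ** 2
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    se_b1 = math.sqrt(sw / det)            # standard error under known variances
    z = b1 / se_b1
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return b0, b1, se_b1, z, p

# Fictional per-quantile estimates for one variant: the effect grows with tau.
taus = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
betas = [0.02 + 0.1 * t for t in taus]
ses = [0.01] * len(taus)

b0, b1, se_b1, z, p = meta_regression_slope(taus, betas, ses)
print(f"slope = {b1:.3f}, se = {se_b1:.4f}, p = {p:.2e}")
```

A slope that differs significantly from zero indicates that the variant's effect size changes with the quantile, which is the signal this workflow looks for.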
- plink 1.9 (2020-06-17 or later), for quantile regression with the R plugin
- R (version 3.6.3+)
- R packages: quantreg 5.67+, Rserve 1.8+, metafor 2.4+, magrittr
- (suggested) R packages for visualization: ggplot2, plotly
- (suggested) plink 2.0, for faster mean linear regression
- Standard plink 1 genotype input: .bed, .bim, and .fam files (see the plink 1.9 .bed documentation)
- Quantitative phenotype information (.phe) and covariate information (.cov)
- Call
plink --R $path_to_function
with any additional parameters you need. An example .sh script with a Slurm sbatch configuration, as used on Harvard's O2 cluster, is provided in gwas_qrmr/scripts/
- plink with this R plugin outputs 45 metrics from quantile regression for each SNP. An example downstream meta-regression based on this output is provided in
gwas_qrmr/metaregression/qrmr_example.Rmd
along with some fictional quantile regression data as input.
- Real summary statistics from the UK Biobank QRMR analysis, filtered to variants with a significant meta-regression p-value (b1 or b2) for BMI, height, and HbA1c, are provided in
gwas_qrmr/metaregression/qrmr_filtered_output/
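As a rough illustration of the plink call described above, the sketch below assembles a command line from its parts and prints the equivalent shell command without running it. It uses standard plink 1.9 flags (--bfile, --pheno, --covar, --R, --out). The helper name and every path are hypothetical, and the flags you actually need may differ; the scripts in the repository are the authoritative example. plink's R plugin communicates with R through Rserve, which generally needs to be running before plink is started.

```python
import shlex

def build_plink_qr_command(bfile, pheno, covar, r_plugin, out):
    """Assemble a plink 1.9 command line that runs an R plugin per variant."""
    return [
        "plink",
        "--bfile", bfile,    # prefix of the .bed/.bim/.fam fileset
        "--pheno", pheno,    # quantitative phenotype file (.phe)
        "--covar", covar,    # covariate file (.cov)
        "--R", r_plugin,     # R script that defines the plugin function
        "--out", out,        # prefix for plink's output files
    ]

# All paths below are placeholders for illustration only.
cmd = build_plink_qr_command(
    bfile="data/genotypes",
    pheno="data/trait.phe",
    covar="data/covariates.cov",
    r_plugin="path/to/plugin.R",
    out="results/qr",
)
print(shlex.join(cmd))
```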
Examples of significant non-homogeneous effect sizes for BMI, height, and HbA1c are available in this Shiny app.
In genetic association analysis, quantile regression offers a way to uncover variants with non-homogeneous effect sizes that standard association methods may miss. These quantile-dependent effects may occur for a variety of reasons, including GxG and GxE interactions.
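To make the idea of a quantile-dependent effect concrete, here is a toy sketch in standard-library Python. It is a simplification and not the repository's method: the predictor is a binary carrier status rather than an additive genotype, there are no covariates, and the data are fictional. With a single binary predictor, the quantile regression coefficient at level tau equals the difference between the two groups' tau-th sample quantiles (up to the convention used for sample quantiles), so no regression solver is needed.

```python
from statistics import NormalDist, mean, quantiles

def grid_sample(dist, n):
    """Deterministic stand-in for n random draws: evenly spaced quantiles of dist."""
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]

# Fictional phenotype: carriers have a slightly higher mean and a wider spread,
# as could happen if the variant's effect depends on an unmeasured exposure.
non_carriers = grid_sample(NormalDist(mu=0.0, sigma=1.0), 1000)
carriers = grid_sample(NormalDist(mu=0.2, sigma=1.5), 1000)

# Per-quantile effect = difference of group quantiles at tau = 0.1, ..., 0.9.
taus = [round(0.1 * k, 1) for k in range(1, 10)]
q0 = quantiles(non_carriers, n=10)   # nine cut points, one per tau
q1 = quantiles(carriers, n=10)
effect_by_tau = {tau: b - a for tau, a, b in zip(taus, q0, q1)}

# A mean-based comparison collapses all of this into a single number.
mean_effect = mean(carriers) - mean(non_carriers)

for tau in taus:
    print(f"tau = {tau:.1f}  effect = {effect_by_tau[tau]:+.3f}")
print(f"mean effect = {mean_effect:+.3f}")
```

In this toy data the effect is negative at low quantiles and positive at high quantiles, while the mean difference is small, which is the kind of pattern a mean-only analysis can understate.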
The main obstacle for this method is that estimating quantile regression standard errors is slow compared to other association methods. The quantile regression output reported in this repository and in the associated Shiny app was produced using 12 covariates (sex, age, and 10 principal components from UK Biobank). Ideally, we would include more covariates, but standard error computation becomes much slower as the number of covariates increases. This is an area of ongoing research, and I'm hopeful that improvements are on the horizon. However, performing quantile regression through an R plugin may introduce some slowness that will be hard to avoid without rethinking the software used for this analysis.
Some previous studies have performed quantile regression on GWAS data after first subsetting to variants already known to have genome-wide significance. While this greatly cuts down the number of hypotheses tested, and with it the computation time, I feel it would also severely limit the potential of quantile regression for new discovery. For the three phenotypes tested here (height, BMI, HbA1c), the bulk of significant quantile-dependent associations are not detected by mean linear regression.
- Pozarickij, A., Williams, C., Hysi, P.G. et al. Quantile regression analysis reveals widespread evidence for gene-environment or gene-gene interactions in myopia development. Commun Biol 2, 167 (2019). https://doi.org/10.1038/s42003-019-0387-5
- Song X, Li G, Zhou Z, Wang X, Ionita-Laza I, Wei Y. QRank: a novel quantile regression tool for eQTL discovery. Bioinformatics. 2017;33(14):2123-2130. doi:10.1093/bioinformatics/btx119
- Wei Y, Pere A, Koenker R, He X. Quantile regression methods for reference growth charts. Stat Med. 2006;25(8):1369-1382. doi:10.1002/sim.2271
- Fang Y, Xu P, Yang J, Qin Y. A quantile regression forest based method to predict drug response and assess prediction reliability. PLoS One. 2018;13(10):e0205155. Published 2018 Oct 5. doi:10.1371/journal.pone.0205155
- github/gwas_qrmr
- qrmr shiny app
- plink 2.0 alpha
- plink 1.9 beta
- Rserve 1.8
- quantreg 5.67
- metafor 2.4-0
- Rag Group
- Chirag Patel
- everyone!
- NIH 5T32HG002295-18