regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.

It is developed and supported by a team of scientists at the Regeneron Genetics Center.

The method has the following properties:

It works on quantitative, binary, and time-to-event traits, including binary traits with unbalanced case-control ratios and time-to-event traits with low event rates
It can handle population structure and relatedness
It can process multiple phenotypes at once efficiently
It is fast and memory efficient 🔥
For binary traits, it supports Firth logistic regression and an SPA test
For time-to-event traits, it supports Firth Cox regression
It can perform gene/region-based tests, interaction tests and conditional analyses
It supports the BGEN, PLINK bed/bim/fam and PLINK2 pgen/pvar/psam genetic data formats
It is ideally suited for implementation in Apache Spark (see GLOW)
It can be installed with Conda
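
As a rough illustration of how these features are typically combined, the sketch below assembles the command lines for a two-step run on a binary trait: step 1 fits the whole-genome regression model, and step 2 tests variants with the approximate Firth correction. The helper functions and all file paths are hypothetical. The flag names are the ones I recall from the regenie documentation, so check them against `regenie --help` for your installed version before relying on them.

```python
import shlex


def regenie_step1(bed_prefix, pheno, covar, out_prefix, bsize=1000, binary=True):
    """Build the argument list for step 1 (hypothetical helper, not part of regenie)."""
    args = [
        "regenie", "--step", "1",
        "--bed", bed_prefix,        # PLINK bed/bim/fam prefix
        "--phenoFile", pheno,
        "--covarFile", covar,
        "--bsize", str(bsize),      # block size used when fitting the model
        "--out", out_prefix,
    ]
    if binary:
        args.append("--bt")         # treat phenotypes as binary traits
    return args


def regenie_step2(bgen, pheno, covar, step1_prefix, out_prefix,
                  bsize=400, binary=True, firth=True):
    """Build the argument list for step 2 (hypothetical helper, not part of regenie)."""
    args = [
        "regenie", "--step", "2",
        "--bgen", bgen,
        "--phenoFile", pheno,
        "--covarFile", covar,
        "--bsize", str(bsize),
        # Step 1 is assumed to have written "<prefix>_pred.list"
        "--pred", step1_prefix + "_pred.list",
        "--out", out_prefix,
    ]
    if binary:
        args.append("--bt")
        if firth:
            args += ["--firth", "--approx"]  # approximate Firth logistic regression
    return args


# Hypothetical input paths, for illustration only
step1 = regenie_step1("data/genotypes", "data/pheno.txt", "data/covar.txt", "fit_out")
step2 = regenie_step2("data/imputed.bgen", "data/pheno.txt", "data/covar.txt",
                      "fit_out", "assoc_out")

# Print shell-ready commands; pass the lists to subprocess.run to execute them
print(shlex.join(step1))
print(shlex.join(step2))
```

Building the commands as lists keeps the arguments safe to hand to `subprocess.run` without shell quoting issues; `shlex.join` is used only to display them.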

Full documentation for regenie can be found here.

Citation

Mbatchou, J., Barnard, L., Backman, J. et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat Genet 53, 1097–1103 (2021). https://doi.org/10.1038/s41588-021-00870-7

License

regenie is distributed under an MIT license.

Contact

If you have any questions about regenie please contact

If you want to submit an issue concerning the software, please do so using the regenie GitHub repository.

Version history

Version 4.0 (New options --t2e and --eventColList to enable time-to-event analysis and to specify the event phenotype name, respectively; Fix algorithm used to fit the logistic Firth model when using --write-null-firth so that it more closely matches the approach used in step 2)

Version 3.6 (Bug fix for the approximate Firth test when ultra-rare variants [MAC below 50] are being tested; Address convergence failures & speed-up exact Firth by using warm starts based on null model with just covariates)

Version 3.5 (Added CHR/POS columns to snplist output file when using --write-mask-snplist; Genotype counts are now reported in the sumstats file when using --no-split; Improved efficiency of LOOCV scheme in ridge level 0; Detect carriage return in fam/psam/bim/pvar/sample files; Minor bug fixes)

Version 3.4.1 (Reduction in memory usage for LD computation when writing to text files; Fix bug rejecting valid PVAR files)

Version 3.4 (Reduction in memory usage for LD computation with dosages; Minor bug fixes for LD computation; Bug fix for when carriage returns are in optional input files)

Version 3.3 (Faster implementation of approximate Firth LRT; New strategy for approximate Firth LRT with ultra-rare variants; Relaxed convergence criterion of Firth LRT from 1E-4 to 2.5E-4)

Version 3.2.9 (Switch to robust version of ACAT to handle very small p-values; Bug fix for Step1 when sex chromosome was included in the analysis; Allow for 64 domains when using the 4-column annotation file)

Version 3.2.8 (New option --bgi to specify custom index bgi file accompanying BGEN file; Relax matching criteria between BGEN and index bgi files to use CPRA instead of variant ID)

Version 3.2.7 (New option --force-mac-filter to apply different MAC filter to subset of SNPs; Extend maximum number of domains to 32 for 4-column anno-file; Update PGEN library)

Version 3.2.6 (Relax tolerance parameter for null unpenalized logistic regression from 1e-8 to 1e-6; Minor bug fixes)

Version 3.2.5.3 (Fix inflation issue when testing main effect of SNP in GxE model; Minor bug fixes)

Version 3.2.5 (Use pseudo-data representation algorithm as default in step 2 single variant tests; Use ACAT to get SBAT p-value across POS/NEG models; Bug fix for ACATV when set has a single variant with zero weight)

Version 3.2.4 (Relaxed the requirement on the minimum number of unique values for QTs to 3; Various bug fixes)

Version 3.2.3 (Address convergence issues in Firth regression; Various bug fixes)

Version 3.2.2 (New columns in sumstats file (N_CASES/N_CONTROLS) to output the number of cases/controls when using --af-cc; Various bug fixes)

Version 3.2.1 (New option --lovo-snplist to only consider a subset of LOVO masks; Improve efficiency of LOVO for large sets to reduce memory usage; Bug fix for SPA with numerical overflow; For SKAT/ACAT tests with Firth correction, don't include SKAT weights when running Firth on single variants)

Version 3.2 (Bug fix for SKAT/SKATO when testing on binary traits using Firth/SPA; Switched name of NNLS joint test to SBAT test altering name of corresponding options and applied Bonferroni correction before reporting its p-value [correcting for minP of 2 tests])

Version 3.1.4 (New option --par-region to specify build to determine bounds for chrX PAR regions; new option --force-qt to force QT runs for traits with fewer than 10 values [otherwise will throw an error]; phenotype imputation for missing values is now applied after RINTing when using --apply-rint; several bug fixes)

Version 3.1.2 (Reduction in memory usage for SKAT/SKATO tests; Bug fix for LOVO with SKAT/ACAT tests; Improvements for null Firth logistic algorithm to address reported convergence issues)

Version 3.1.1 (Reduction in memory usage for SKAT/SKATO tests; Improvements for logistic regressions algorithms to address reported convergence issues)

Version 3.1 (Fixed bug in SKAT/SKATO tests when applying Firth/SPA correction; Improved SPA implementation by computing both tail probabilities; New option --set-singletons to specify variants to consider as singletons for burden masks; New option --l1-phenoList to run level 1 models in Step 1 in parallel across phenotypes; Several bug fixes)

Version 3.0.3 (Skip BTs where null model fit failed; Bug fix for BURDEN-ACAT; Bug fix when nan/inf values are in phenotype/covariate file)

Version 3.0.1 (Improve ridge logistic regression in Step 1; Add compilation with CMake)

Version 3.0 (New gene-based tests: SKAT, SKATO, ACATV, ACATO and NNLS [Non-Negative Least Squares test]; New GxE and GxG interaction testing functionality; New conditional analysis functionality; see release page for minor additions)

For past releases, see here.
