Tutorial on artifacts #88

Open
kbroman opened this issue Nov 22, 2018 · 5 comments

Comments

@kbroman
Member

kbroman commented Nov 22, 2018

Add a tutorial on identifying artifacts, particularly in multiparent populations.

  • single outlier
  • missing genotype
  • especially narrow LOD peaks
  • crazy coefficient estimates
  • a SNP that is monomorphic in the population but where there is still some variation in the genotype probabilities
@HannahVMeyer

HannahVMeyer commented Jun 22, 2023

Hi - thanks for a great package and providing genotype data for the CC, this has really helped our analyses.

I have come across some very strange effect size estimates and found this open issue. Do you have any advice on how to debug this/what the cause might be?

Below are visualizations of the output from scan1 and scan1coef, run on 42 CC strains with 2-4 replicates each. The phenotype is continuous but bounded between 0 and 1 (it is derived from a proportion), and I used LOCO kinship with batch as a covariate. I used the genotype information from https://raw.githubusercontent.com/rqtl/qtl2data/main/CC/cc.zip. For simplicity I show only chr18 (weird effect coefficients) and chr19 (coefficients as might be expected), but this weird pattern is also seen on other chromosomes.

[image: scan1 and scan1coef output for chr18 and chr19]

As suggested in this issue, I also tried using the clean_genotypes function. This changed the coefficient estimates (see plot below) but it still doesn't look right:

[image: scan1coef output after clean_genotypes]

I am using qtl2_0.32 on R 4.2.1

Many thanks!

@kbroman
Member Author

kbroman commented Jun 22, 2023

I don't much like scan1coef(). The artifacts are happening at a position without particularly large LOD scores; to me, there's not much reason to be interested in the estimated effects at a position that is not really the QTL.

I would pull out the genotype probabilities at the estimated QTL position and use fit1() to get the estimated effects at that position.
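
A minimal sketch of that workflow, for readers following along. It assumes the small `iron` example cross that ships with qtl2 rather than the CC data discussed in this thread, and the object names (`probs`, `peak`, `pr_at_peak`, `fit`) are illustrative only. With a batch covariate, as in the analysis above, the same covariate matrix would presumably be passed as `addcovar` to both scan1() and fit1().

```r
library(qtl2)

# Assumption: use the example F2 cross bundled with qtl2, not the CC data
iron <- read_cross2(system.file("extdata", "iron.zip", package = "qtl2"))
map <- insert_pseudomarkers(iron$gmap, step = 1)
probs <- calc_genoprob(iron, map, error_prob = 0.002)
kinship <- calc_kinship(probs, "loco")      # one kinship matrix per chromosome
pheno <- iron$pheno[, "liver", drop = FALSE]

# Genome scan, then locate the position with the largest LOD score
out <- scan1(probs, pheno, kinship)
peak <- max_scan1(out, map)
peak_chr <- as.character(peak$chr)

# Pull out the genotype probabilities at (or nearest to) that position
pr_at_peak <- pull_genoprobpos(probs, map, peak_chr, peak$pos)

# Fit the single-QTL model there; with LOCO kinship, pass the matrix
# for the chromosome that contains the peak
fit <- fit1(pr_at_peak, pheno, kinship = kinship[[peak_chr]])

fit$coef   # estimated effects (plus intercept)
fit$SE     # their standard errors
fit$lod    # LOD score at that position
```

The point of the design is that effects are estimated once, at the position the scan supports, instead of being read off a coefficient curve across regions with little evidence for a QTL.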

@HannahVMeyer

Many thanks, Karl! I only showed the plots above because they were convenient to screenshot; the actual phenotype has a LOD > 6, which is significant by permutation testing.

Thanks for pointing me to fit1; I hadn't come across that function in the tutorials I followed. What I liked about the output of scan1coef was the ability to plot the estimated coefficients underneath the LOD plot. Is there a similarly nice visualisation for fit1?

As for interpreting the fit1 results (and apologies if this is obvious): can I interpret the coef as the effect size per genotype? By then checking the genotypes of the individual strains, I can infer which strain(s) were driving the associations based on genotype?

@kbroman
Member Author

kbroman commented Jun 22, 2023

I use ciplot() from my broman package.

The estimated effects are the additive-effect coefficients from a linear regression, shifted so that they sum to 0.
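
To illustrate what "shifted so that they sum to 0" means, here is a small base-R sketch on simulated data. It is not qtl2's implementation: it assumes hard genotype calls for three made-up genotypes (`A`, `B`, `C`) instead of genotype probabilities, and ordinary least squares with no kinship correction. The ciplot() call is guarded so that the sketch still runs if the broman package is not installed.

```r
set.seed(1)

# Simulated data: three hypothetical genotypes with different phenotype means
n <- 120
genotypes <- c("A", "B", "C")
geno <- factor(sample(genotypes, n, replace = TRUE), levels = genotypes)
true_means <- c(A = 1.0, B = 1.5, C = 0.5)
y <- unname(true_means[as.character(geno)]) + rnorm(n, sd = 0.3)

# Regression with one coefficient per genotype and no intercept:
# each coefficient is that genotype's mean phenotype
fit <- lm(y ~ 0 + geno)
raw <- coef(fit)

# Shift the coefficients so they sum to 0; the amount removed is the intercept
k <- length(raw)
C <- diag(k) - matrix(1 / k, k, k)           # centering matrix
effects <- drop(C %*% raw)
names(effects) <- genotypes
intercept <- mean(raw)

# Standard errors of the shifted effects, via the same linear transformation
se <- sqrt(diag(C %*% vcov(fit) %*% t(C)))

# Plot estimates with intervals, if the broman package is available
if (requireNamespace("broman", quietly = TRUE)) {
  broman::ciplot(effects, se, labels = names(effects))
}
```

Each shifted effect is that genotype's deviation from the average of the genotype means, so a strain carrying a genotype with a large positive or negative effect is a natural candidate for driving the association.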

@HannahVMeyer

Thanks very much for the pointers and the speedy reply!
