Resolving_Pipeline_Problems.md


For any problems or questions that cannot be easily resolved, please feel free to contact me (tbrunetti) or post on the issues page.

INSTALLATION ISSUES

OTHER

PCs table and PCA graphs won't generate

  • This is usually a 'file not found' or 'executionHalted' error and is generally the result of one of the following issues:
    1. The GENESIS or GWASTools R libraries are not installed, are not loaded, or their library location is not properly set.
    2. The GENESIS PC matrix was never generated because no individuals remained after filtering (make the filter thresholds in the pipeline less stringent).
    3. The GENESIS PC matrix was never generated because not enough SNPs remained after filtering (make the filter thresholds in the pipeline less stringent). Note that PC table generation depends on the kinship files generated by the KING software, which requires >1000 remaining SNPs in order to generate a non-empty file and, in turn, a PC matrix.
    4. If using the --TGP flag, the graph will not generate if the --centerPop argument is set to a population that does not exist in your data set.
    5. If using the --TGP flag, make sure the path to the TGP_Sub_and_SuperPopulation_info.txt file is properly set when running chunky configure. Also make sure the TGP IIDs match the scanID column of TGP_Sub_and_SuperPopulation_info.txt; if the file is modified, its header and formatting must remain exactly the same.
    6. If using the --TGP flag, check each group's subdirectory generated by the pipeline and make sure no .missnp files exist. If any do, resolve the issues they report. plink generates this file when merging TGP with the group; any SNPs listed in it are discordant between your input data and the TGP data, i.e. they need to be flipped or removed from the dataset. Also make sure all SNP names match between your input set and the TGP set.
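Two of the checks above (leftover .missnp files and the SNP count KING needs) are easy to script. The following is only a minimal sketch and is not part of the pipeline: the function names are hypothetical, and it assumes the project directory can be searched recursively and that the filtered data set is available as a PLINK .bim file (one variant per line).

```python
from pathlib import Path

# Threshold taken from the note above: KING needs more than 1000 SNPs.
KING_MIN_SNPS = 1000


def find_missnp_files(project_dir):
    """Return every .missnp file found anywhere below project_dir, sorted."""
    return sorted(Path(project_dir).rglob("*.missnp"))


def count_snps(bim_path):
    """Count variants in a PLINK .bim file (one non-empty line per variant)."""
    with open(bim_path) as handle:
        return sum(1 for line in handle if line.strip())


def enough_snps_for_king(bim_path):
    """True if the .bim file holds more SNPs than the KING minimum."""
    return count_snps(bim_path) > KING_MIN_SNPS
```

Calling `find_missnp_files` on the project directory lists the group subdirectories that still need attention, and `enough_snps_for_king` gives a quick answer on whether the filter thresholds left too few SNPs.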

Specifying startStep and/or endStep does not seem to work or will not allow the pipeline to execute

  • This error usually arises from modifying existing directories of the created project or from failing to also specify the --reanalyze flag
    1. If either startStep or endStep is being used, the --reanalyze flag must also be specified at the command line.
    2. Make sure startStep or endStep is followed by one of the following strings:
    • hwe
    • LD
    • maf
    • het
    • ibd
    • outlier_removal
    • PCA_TGP (can only be used if --TGP flag is specified)
    • PCA_TGP_graph (can only be used if --TGP flag is specified)
    • PCA_indi (can only be used if --TGP flag is NOT specified)
    • PCA_indi_graph (can only be used if --TGP flag is NOT specified)
    3. These options re-run particular steps of a project that the GP3 pipeline has already created and processed. The output of every step before the specified startStep must already be present in the project directory. No files produced before this startStep should be removed, modified, or deleted in any way.
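The rules above can be expressed as a small pre-flight check. This is a sketch only, not the pipeline's own validation code: the function name and its parameters are hypothetical, and the step names are copied from the list above.

```python
# Step names copied from the list above.
COMMON_STEPS = ["hwe", "LD", "maf", "het", "ibd", "outlier_removal"]
TGP_ONLY_STEPS = ["PCA_TGP", "PCA_TGP_graph"]
NON_TGP_ONLY_STEPS = ["PCA_indi", "PCA_indi_graph"]


def check_step_options(start_step=None, end_step=None, reanalyze=False, tgp=False):
    """Return a list of problems; an empty list means the options are consistent."""
    problems = []
    # Rule 1: startStep/endStep are only valid together with --reanalyze.
    if (start_step or end_step) and not reanalyze:
        problems.append("--reanalyze must be specified with startStep/endStep")
    # Rule 2: the step must be a listed name that fits the --TGP setting.
    allowed = COMMON_STEPS + (TGP_ONLY_STEPS if tgp else NON_TGP_ONLY_STEPS)
    for label, step in (("startStep", start_step), ("endStep", end_step)):
        if step is None or step in allowed:
            continue
        if step in TGP_ONLY_STEPS:
            problems.append(f"{label} '{step}' requires the --TGP flag")
        elif step in NON_TGP_ONLY_STEPS:
            problems.append(f"{label} '{step}' cannot be used with the --TGP flag")
        else:
            problems.append(f"{label} '{step}' is not one of the listed step names")
    return problems
```

For example, `check_step_options("PCA_TGP", reanalyze=True, tgp=False)` returns a single message saying the step requires the --TGP flag.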

The following error is seen in the error log file: "Partitioning Samples into Related and Unrelated Sets...Error in acc(object, NL[cnode]) : unmatched node provided
Calls: pcair ... pcairPartition -> connComp -> connComp -> acc -> acc"

  • This error is usually generated because there are not enough related vs unrelated individuals when the .kin0 file alone is used to generate the kinship matrix. Typically one will notice that the matrix is sparsely populated.
    1. Specify the --fullKin flag at runtime. This tells the pipeline to use both the .kin and .kin0 files to create the kinship matrix, which usually resolves the issue.