Computational Workflow and Analysis Code for Anchor Project

Code for computational workflows and analyses relating to "Computational prediction of MHC anchor locations guides neoantigen identification and prioritization"

Table of Contents

Computational Prediction of Anchor locations

Orthogonal validation with crystallography structures

Evaluating Anchor Impact

  • Impact Analysis TCGA samples.ipynb
    • Selection of a balanced HLA population from remaining TCGA samples
    • Generating FASTA files and running pVACbind
    • Objective determination of anchor locations
    • Analyzing the entire cohort using three different filters (no anchor, conventional anchor and allele-specific anchor)
  • Impact analysis using different binding cutoffs.ipynb
    • Repeating analysis using different binding cutoffs and inclusion criteria

Validation peptide selection and analysis

Additional analyses

Resources

  1. For researchers wanting to incorporate our end results into their pipelines:
    • Normalized anchor scores are available in the supplemental materials of the original paper and under Datasets in this GitHub repository.
    • Our compiled seed dataset (containing peptide sequences, HLA alleles and the outputs of all 8 binding algorithms) is also available under Datasets.
  2. For researchers looking to expand this database for particular HLA alleles, we recommend the following steps:
    • Identify strong binding peptides for the HLA allele(s) and peptide length(s) of interest.
    • Generate a dictionary of peptides where each position is mutated to all possible amino acids.
    • Use that dictionary to generate a FASTA file in the format required by pVACbind (www.pvactools.org).
    • Run pvacbind in parallel across different HLA allele(s) and peptide length(s).
      • Note that you will likely have to run each combination in a separate command (we provide the scripts we used on our own cluster for your adaptation).
    • Assemble prediction results and calculate the anchor scores for each position of each peptide (please refer to helper functions in Anchor Position Calculation.ipynb).
    • This process can be done for an individual peptide-HLA combination, but you can also aggregate and average across multiple peptides (of the same length, for the same HLA allele) to obtain an overall score.
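
The peptide-mutation, FASTA-generation and averaging steps above can be sketched in Python as follows. This is a minimal illustration, not the code used in the paper: the function names, the FASTA header naming scheme (`<peptide>_<wild type><position><mutant>`) and the choice to include the wild-type peptide once are assumptions made for this example. The per-position anchor score itself is not computed here; for that, refer to the helper functions in Anchor Position Calculation.ipynb.

```python
from io import StringIO

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_mutants(peptide):
    """Return {name: sequence} for the wild type plus every single substitution.

    Names are hypothetical, e.g. 'GILGFVFTL_G1A' means position 1 (1-based)
    changed from G to A.
    """
    mutants = {f"{peptide}_WT": peptide}
    for pos, wt_residue in enumerate(peptide, start=1):
        for aa in AMINO_ACIDS:
            if aa == wt_residue:
                continue  # skip the identity substitution
            name = f"{peptide}_{wt_residue}{pos}{aa}"
            mutants[name] = peptide[:pos - 1] + aa + peptide[pos:]
    return mutants


def write_fasta(mutants, handle):
    """Write the dictionary as FASTA records to an open text handle."""
    for name, sequence in mutants.items():
        handle.write(f">{name}\n{sequence}\n")


def average_position_scores(scores_by_peptide):
    """Average per-position scores across peptides of the same length.

    `scores_by_peptide` maps a peptide to a list with one score per position
    (computed elsewhere, for the same HLA allele).
    """
    vectors = list(scores_by_peptide.values())
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("all peptides must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(length)]


if __name__ == "__main__":
    # Example peptide chosen only for illustration.
    mutants = saturation_mutants("GILGFVFTL")
    buffer = StringIO()
    write_fasta(mutants, buffer)
    print(len(mutants), "sequences written")
```

Writing to a file handle rather than a path keeps the sketch easy to test; in practice you would open one FASTA file per peptide length and pass it to pVACbind.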

License

The project is licensed under the MIT license.

Stable release with DOI

DOI