- Step 1: Input protein sequences
- Step 2: Feature extraction by Profeat
- Step 3: Pairwise feature distance calculation --> cosine, correlation, jaccard
- Step 4: 2D feature embedding --> umap, tsne, mds
- Step 5: Feature grid arrangement --> grid, scatter
- Step 6: Transform --> minmax, standard
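As a rough illustration of two of the steps listed above, the sketch below computes cosine distances between feature columns and applies min-max scaling, using only the Python standard library. It is not AnnoPRO's own code: the function names, the toy matrix, and the choice to measure distances between feature columns (rows are proteins) are assumptions made for this example, and the embedding and grid arrangement steps are left out.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v); defined here as 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def pairwise_feature_distances(matrix):
    """Rows are proteins, columns are features (an assumption of this sketch).

    Returns a feature-by-feature cosine distance matrix.
    """
    columns = list(zip(*matrix))
    n = len(columns)
    return [[cosine_distance(columns[i], columns[j]) for j in range(n)]
            for i in range(n)]


def minmax_scale(matrix):
    """Scale every feature column to [0, 1]; constant columns become 0.0."""
    scaled_columns = []
    for column in zip(*matrix):
        low, high = min(column), max(column)
        span = high - low
        scaled_columns.append(
            [0.0 if span == 0 else (x - low) / span for x in column]
        )
    # Transpose back so that rows are proteins again.
    return [list(row) for row in zip(*scaled_columns)]


# Toy data: 3 proteins x 3 features (hypothetical values).
features = [
    [1.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 0.0],
]
distances = pairwise_feature_distances(features)
scaled = minmax_scale(features)
```

In this toy matrix the first two feature columns are proportional, so their cosine distance is close to zero and an embedding step would place them near each other.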
- Encoding layers: protein features are learned by CNNs, and protein similarity is learned by FCs (fully connected layers).
- Decoding layers: LSTMs
You can install it directly with pip install annopro
or install it from source with the following steps:
git clone https://github.com/idrblab/AnnoPRO.git
cd AnnoPRO
conda create -n annopro python=3.8
conda activate annopro
pip install .
- Use it as a terminal command. For all parameters, type annopro -h.
annopro -i test_proteins.fasta -o output
- Use it as an executable Python package
python -m annopro -i test_proteins.fasta -o output
- Use it as a library integrated into your project.
from annopro import main
main("test_proteins.fasta", "output")
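If you have several FASTA files, one option is to drive the module invocation shown above from a small script. The sketch below is only an illustration: the helper names, the dry_run flag, and the one-output-folder-per-file layout are assumptions of this example and are not part of annopro. Only the -i and -o options from the commands above are used.

```python
import subprocess
import sys
from pathlib import Path


def build_command(fasta_path, output_dir):
    """Build the `python -m annopro -i ... -o ...` call for one FASTA file."""
    return [sys.executable, "-m", "annopro",
            "-i", str(fasta_path), "-o", str(output_dir)]


def annotate_all(fasta_dir, output_root, dry_run=True):
    """Collect one command per *.fasta file; run them only if dry_run is False.

    Each input gets its own output folder named after the file stem
    (a layout chosen for this sketch, not required by annopro).
    """
    commands = []
    for fasta in sorted(Path(fasta_dir).glob("*.fasta")):
        command = build_command(fasta, Path(output_root) / fasta.stem)
        commands.append(command)
        if not dry_run:
            # check=True raises CalledProcessError if annopro exits with an error.
            subprocess.run(command, check=True)
    return commands
```

Calling annotate_all("proteins", "results") returns the commands without running them. Pass dry_run=False to actually launch annopro for each file.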
The results are written to ./output/bp(cc,mf)_result.csv.
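The result files can then be loaded for further processing. The sketch below assumes the notation above expands to three files named bp_result.csv, cc_result.csv and mf_result.csv in the output folder. It uses only the standard csv module and makes no assumption about column names, so the rows are returned as plain lists. The helper names are hypothetical.

```python
import csv
from pathlib import Path

# Assumed file prefixes, taken from the bp(cc,mf)_result.csv notation.
BRANCHES = ("bp", "cc", "mf")


def result_paths(output_dir):
    """Map each prefix to its expected <prefix>_result.csv path."""
    return {branch: Path(output_dir) / f"{branch}_result.csv"
            for branch in BRANCHES}


def load_results(output_dir):
    """Read every result file that exists; missing files are skipped.

    Returns {prefix: list of rows}, where each row is a list of strings.
    """
    results = {}
    for branch, path in result_paths(output_dir).items():
        if not path.exists():
            continue
        with path.open(newline="") as handle:
            results[branch] = list(csv.reader(handle))
    return results
```

For example, load_results("output") returns a dictionary with one entry per result file found in ./output.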
Note: the first time you use annopro, it automatically downloads the required resources when they are first needed (lazy download mechanism).
- pip is looking at multiple versions of XXX to determine which version is compatible with other requirements. This could take a while.
This message comes from newer pip versions. Downgrade pip to an older version such as 20.2, or add the --use-deprecated=legacy-resolver option to the pip install command.
If you have any questions, please create an issue on this repo; we will deal with it as soon as possible.