Peptides 2.0
Showing 100 changed files with 2,030 additions and 1,013 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,14 +1,12 @@ | ||
 Package: Peptides | ||
-Version: 1.2.1 | ||
-Date: 2017-02-20 | ||
-Title: Calculate Indices and Theoretical Properties of Protein Sequences | ||
-Author: Daniel Osorio, Paola Rondon-Villarreal and Rodrigo Torres. | ||
+Version: 2.0.0 | ||
+Date: 2017-03-12 | ||
+Title: Calculate Indices and Theoretical Physicochemical Properties of Protein Sequences | ||
+Authors@R: c(person("Daniel","Osorio",email="[email protected]",role=c("aut","cre")),person("Paola","Rondon-Villarreal",role=c("aut","ths")),person("Rodrigo","Torres",role=c("aut","ths")),person("J. Sebastian","Paez",email="[email protected]",role=c("ctb"))) | ||
 Maintainer: Daniel Osorio <[email protected]> | ||
 URL: https://github.com/dosorio/Peptides/ | ||
 Suggests: | ||
-RUnit | ||
-Description: Calculate physicochemical properties and indices from amino-acid | ||
-sequences of peptides and proteins. Include also the option to read and plot | ||
-output files from the 'GROMACS' molecular dynamics package. | ||
+testthat | ||
+Description: Includes functions to calculate several physicochemical properties and indices for amino-acid sequences as well as to read and plot 'XVG' output files from the 'GROMACS' molecular dynamics package. | ||
 License: GPL-2 | ||
 RoxygenNote: 6.0.1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,23 +1,33 @@ | ||
# Generated by roxygen2: do not edit by hand | ||
|
||
S3method(plot,xvg) | ||
export(aacomp) | ||
export(aindex) | ||
export(aIndex) | ||
export(aaComp) | ||
export(aaDescriptors) | ||
export(autoCorrelation) | ||
export(autoCovariance) | ||
export(blosumIndices) | ||
export(boman) | ||
export(charge) | ||
export(crossCovariance) | ||
export(crucianiProperties) | ||
export(fasgaiVectors) | ||
export(hmoment) | ||
export(hydrophobicity) | ||
export(instaindex) | ||
export(instaIndex) | ||
export(kideraFactors) | ||
export(lengthpep) | ||
export(membpos) | ||
export(mswhimScores) | ||
export(mw) | ||
export(pI) | ||
export(plot.xvg) | ||
export(read.xvg) | ||
export(plotXVG) | ||
export(protFP) | ||
export(readXVG) | ||
export(stScales) | ||
export(tScales) | ||
export(vhseScales) | ||
export(zScales) | ||
importFrom(graphics,par) | ||
importFrom(graphics,title) | ||
importFrom(stats,embed) | ||
importFrom(utils,data) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
aaCheck <- function(seq){ | ||
seq <- toupper(seq) | ||
seq <- gsub(pattern = "[[:space:]]+",replacement = "",x = seq) | ||
seq <- strsplit(x = seq,split = "") | ||
check <- unlist(lapply(seq,function(sequence){ | ||
!all(sequence%in%c("A" ,"C" ,"D" ,"E" ,"F" ,"G" ,"H" ,"I" ,"K" ,"L" ,"M" ,"N" ,"P" ,"Q" ,"R" ,"S" ,"T" ,"V" ,"W" ,"Y", "-")) | ||
})) | ||
if(sum(check) > 0){ | ||
sapply(which(check == TRUE),function(sequence){warning(paste0("Sequence ",sequence," has unrecognized amino acid types. Output value might be calculated incorrectly"),call. = FALSE)}) | ||
} | ||
return(seq) | ||
} |
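The check above is meant to flag each sequence that contains a symbol outside the 20 standard residues and the gap character. A minimal stand-alone sketch of that per-sequence test follows; `check_sequences` and `valid_aa` are hypothetical names used only for illustration and are not part of the package.

```r
# Hypothetical stand-alone sketch of the per-sequence validation (not the package code).
valid_aa <- c("A", "C", "D", "E", "F", "G", "H", "I", "K", "L", "M",
              "N", "P", "Q", "R", "S", "T", "V", "W", "Y", "-")

check_sequences <- function(seq) {
  # Normalise case and strip spaces and line breaks
  seq <- toupper(seq)
  seq <- gsub(pattern = "[[:space:]]+", replacement = "", x = seq)
  # Split every sequence into single-character residues
  residues <- strsplit(x = seq, split = "")
  # TRUE for each sequence that holds at least one unrecognized symbol
  invalid <- vapply(residues, function(s) !all(s %in% valid_aa), logical(1))
  list(residues = residues, invalid = invalid)
}

res <- check_sequences(c("klk lll", "KXK"))
res$invalid
```

Each sequence is tested against the alphabet on its own, so a problem in the second sequence is reported even when the first one is clean.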
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
#' @export aaDescriptors | ||
#' @title Compute 66 descriptors for each amino acid of a protein sequence. | ||
#' @description The function returns 66 amino acid descriptors for the 20 natural amino acids. Available descriptors are: \itemize{ | ||
#' \item{crucianiProperties:} Cruciani, G., Baroni, M., Carosati, E., Clementi, M., Valigi, R., and Clementi, S. (2004) Peptide studies by means of principal properties of amino acids derived from MIF descriptors. J. Chemom. 18, 146-155., | ||
#' \item{kideraFactors:} Kidera, A., Konishi, Y., Oka, M., Ooi, T., & Scheraga, H. A. (1985). Statistical analysis of the physical properties of the 20 naturally occurring amino acids. Journal of Protein Chemistry, 4(1), 23-55., | ||
#' \item{zScales:} Sandberg M, Eriksson L, Jonsson J, Sjostrom M, Wold S: New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids. J Med Chem 1998, 41:2481-2491., | ||
#' \item{FASGAI:} Liang, G., & Li, Z. (2007). Factor analysis scale of generalized amino acid information as the source of a new set of descriptors for elucidating the structure and activity relationships of cationic antimicrobial peptides. Molecular Informatics, 26(6), 754-763., | ||
#' \item{tScales:} Tian F, Zhou P, Li Z: T-scale as a novel vector of topological descriptors for amino acids and its application in QSARs of peptides. J Mol Struct. 2007, 830: 106-115. 10.1016/j.molstruc.2006.07.004., | ||
#' \item{VHSE:} VHSE-scales (principal components score Vectors of Hydrophobic, Steric, and Electronic properties) are derived from principal components analysis (PCA) on independent families of 18 hydrophobic properties, 17 steric properties, and 15 electronic properties, which together comprise 50 physicochemical variables of the 20 coded amino acids., | ||
#' \item{protFP:} van Westen, G. J., Swier, R. F., Wegner, J. K., IJzerman, A. P., van Vlijmen, H. W., & Bender, A. (2013). Benchmarking of protein descriptor sets in proteochemometric modeling (part 1): comparative study of 13 amino acid descriptor sets. Journal of cheminformatics, 5(1), 41., | ||
#' \item{stScales:} Yang, L., Shu, M., Ma, K., Mei, H., Jiang, Y., & Li, Z. (2010). ST-scale as a novel amino acid descriptor and its application in QSAM of peptides and analogues. Amino acids, 38(3), 805-816., | ||
#' \item{BLOSUM:} Georgiev, A. G. (2009). Interpretable numerical descriptors of amino acid space. Journal of Computational Biology, 16(5), 703-723., | ||
#' \item{MSWHIM:} Zaliani, A., & Gancia, E. (1999). MS-WHIM scores for amino acids: a new 3D-description for peptide QSAR and QSPR studies. Journal of chemical information and computer sciences, 39(3), 525-533. | ||
#' } | ||
#' @param seq An amino-acid sequence. If multiple sequences are given, all of them must have the same length (gap symbols are allowed). | ||
#' @return A matrix with 66 amino acid descriptors for each amino acid in a protein sequence. | ||
#' @examples aaDescriptors(seq = "KLKLLLLLKLK") | ||
aaDescriptors <- function(seq){ | ||
# Remove spaces and line breaks | ||
seq <- aaCheck(seq) | ||
sequences <- length(seq) | ||
# Length validation | ||
if(all(lengths(seq)==length(seq[[1]]))){ | ||
# Extract descriptors | ||
desc <- lapply(seq,function(seq){ | ||
sapply(seq,function(aa){ | ||
c(data.frame(AAdata$crucianiProperties)[aa,], | ||
data.frame(AAdata$kideraFactors)[aa,], | ||
data.frame(AAdata$zScales)[aa,], | ||
data.frame(AAdata$FASGAI)[aa,], | ||
data.frame(AAdata$tScales)[aa,], | ||
data.frame(AAdata$VHSE)[aa,], | ||
data.frame(AAdata$ProtFP)[aa,], | ||
data.frame(AAdata$stScales)[aa,], | ||
data.frame(AAdata$BLOSUM)[aa,], | ||
data.frame(AAdata$MSWHIM)[aa,] | ||
) | ||
}) | ||
}) | ||
# Format output | ||
col_names <- as.vector((outer(rownames(desc[[1]]),seq_len(dim(desc[[1]])[2]),paste,sep="."))) | ||
descriptors <- matrix(data = NA,nrow = sequences,ncol = length(col_names),dimnames = list(list(),col_names)) | ||
for(sequence in seq_along(desc)){ | ||
descriptors[sequence,] <- as.numeric(desc[[sequence]]) | ||
} | ||
# Return | ||
return(descriptors) | ||
} else { | ||
stop("All sequences must have the same length.") | ||
} | ||
} |
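The output formatting step flattens a descriptors-by-residues matrix column by column and labels every value as `descriptor.position`. The sketch below reproduces that naming scheme on a toy table; the table `toy` and its values are made up for illustration and do not come from `AAdata`.

```r
# Toy descriptor table: rows are residues, columns are descriptors.
# The values are invented for illustration only.
toy <- matrix(c(1, 2,
                3, 4),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("K", "L"), c("d1", "d2")))

residues <- c("K", "L", "K")

# One row per descriptor, one column per residue position
per_residue <- t(toy[residues, , drop = FALSE])

# Build "descriptor.position" labels in column-major order,
# which is the same order as.numeric() uses when flattening the matrix
col_names <- as.vector(outer(rownames(per_residue),
                             seq_len(ncol(per_residue)),
                             paste, sep = "."))

flat <- setNames(as.numeric(per_residue), col_names)
flat
```

Because both the labels and the values are read column by column, all descriptors of position 1 come first, followed by those of position 2, and so on.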
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
#' @export blosumIndices | ||
#' @title Compute the BLOSUM62 derived indices of a protein sequence | ||
#' @description BLOSUM indices were derived from physicochemical properties that were subjected to a VARIMAX analysis and from an alignment matrix of the 20 natural AAs using the BLOSUM62 matrix. | ||
#' @references Georgiev, A. G. (2009). Interpretable numerical descriptors of amino acid space. Journal of Computational Biology, 16(5), 703-723. | ||
#' @param seq An amino-acid sequence | ||
#' @return The computed average of BLOSUM indices of all the amino acids in the corresponding peptide sequence. | ||
#' @examples blosumIndices(seq = "KLKLLLLLKLK") | ||
#' # [[1]] | ||
#' # BLOSUM1 BLOSUM2 BLOSUM3 BLOSUM4 BLOSUM5 | ||
#' # -0.4827273 -0.5618182 -0.8509091 -0.4172727 0.3172727 | ||
#' | ||
#' # BLOSUM6 BLOSUM7 BLOSUM8 BLOSUM9 BLOSUM10 | ||
#' # 0.2527273 0.1463636 0.1427273 -0.2145455 -0.3218182 | ||
#' | ||
blosumIndices <- function(seq) { | ||
|
||
# Split the sequence by amino-acids | ||
# Remove spaces and line breaks | ||
seq <- aaCheck(seq) | ||
|
||
# Load the BLOSUM indices | ||
scales <- AAdata$BLOSUM | ||
|
||
# Computes the BLOSUM indices for given sequences | ||
lapply(seq, function(seq) { | ||
sapply(names(scales), function(scale) { | ||
(sum(scales[[scale]][seq], na.rm = TRUE) / length(seq)) | ||
}) | ||
}) | ||
} |
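The averaging step sums each scale over the residues of a sequence and divides by the sequence length. A minimal sketch on invented data follows; `toy_scales` and `average_scales` are hypothetical names, and the numbers are not the real BLOSUM indices.

```r
# Hypothetical scales: a list of named numeric vectors, one per index.
# The values are invented for illustration only.
toy_scales <- list(
  S1 = c(K = 1.0, L = -1.0),
  S2 = c(K = 0.5, L = 0.25)
)

average_scales <- function(residues, scales) {
  sapply(names(scales), function(scale) {
    # Look up each residue by name; unknown residues yield NA and are skipped in the sum
    sum(scales[[scale]][residues], na.rm = TRUE) / length(residues)
  })
}

avg <- average_scales(c("K", "L", "L", "K"), toy_scales)
avg
```

Note that with this formula a residue missing from the table adds nothing to the sum but still counts toward the length, so it pulls the average toward zero.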