forked from cgmartini/martinize.py
-
Notifications
You must be signed in to change notification settings - Fork 1
/
DOC.py
246 lines (221 loc) · 12.7 KB
/
DOC.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
###################################
## 1 # OPTIONS AND DOCUMENTATION ## -> @DOC <-
###################################
import martinize
import types, os
# This is a simple and versatily option class that allows easy
# definition and parsing of options.
class Option:
def __init__(self, func=str, num=1, default=None, description=""):
self.func = func
self.num = num
self.value = default
self.description = description
def __nonzero__(self):
if self.func == bool:
return self.value != False
return bool(self.value)
def __str__(self):
return self.value and str(self.value) or ""
def setvalue(self, v):
if len(v) == 1:
self.value = self.func(v[0])
else:
self.value = [self.func(i) for i in v]
# Parameters can be defined for multiple forcefields
# We look for them within the script...
forcefields = [str(ff).split('.')[-1] for ff in globals().values() if (type(ff) == types.ClassType and hasattr(ff,"ff"))]
# ... in the local directory, ....
forcefields += [ff[:-3] for ff in os.listdir(".") if ff[-6:] == "_ff.py"]
# ... and in the GMXDATA dir.
if os.environ.has_key("GMXDATA"):
forcefields += [ff[:-3] for ff in os.listdir(os.environ["GMXDATA"]+"/top/") if ff[-6:] == "_ff.py"]
# Lists for gathering arguments to options that can be specified
# multiple times on the command line.
lists = {
'cystines': [],
'merges' : [],
'links' : [],
'multi' : [],
}
# List of Help text and options.
# This way we can simply print this list if the user wants help.
options = [
# NOTE: Options marked with (+) can be given multiple times on the command line
# option type number default description
"""
MARTINIZE.py is a script to create Coarse Grain Martini input files of
proteins, ready for use in the molecular dynamics simulations package
Gromacs. For more information on the Martini forcefield, see:
www.cgmartini.nl
and read our papers:
Monticelli et al., J. Chem. Theory Comput., 2008, 4(5), 819-834
de Jong et al., J. Chem. Theory Comput., 2013, DOI:10.1021/ct300646g
Primary input/output
--------------------
The input file (-f) should be a coordinate file in PDB or GROMOS
format. The format is inferred from the structure of the file. The
input can also be provided through stdin, allowing piping of
structures. The input structure can have multiple frames/models. If an output
structure file (-x) is given, each frame will be coarse grained,
resulting in a multimodel output structure. Having multiple frames may
also affect the topology. If secondary structure is determined
internally, the structure will be averaged over the frames. Likewise,
interatomic distances, as used for backbone bond lengths in Elnedyn
and in elastic networks, are also averaged over the frames available.
If an output file (-o) is indicated for the topology, that file will
be used for the master topology, using #include statements to link the
moleculetype definitions, which are written to separate files. If no
output filename is given, the topology and the moleculetype
definitions are written to stdout.
Secondary structure
-------------------
The secondary structure plays a central role in the assignment of atom
types and bonded interactions in MARTINI. Martinize allows
specification of the secondary structure as a string (-ss), or as a
file containing a specification in GROMACS' ssdump format
(-ss). Alternatively, DSSP can be used for an on-the-fly assignment of
the secondary structure. For this, the option -dssp has to be used
giving the location of the executable as the argument.
The option -collagen will set the whole structure to collagen. If this
is not what you want (eg only part of the structure is collagen, you
can give a secondary structure file/string (-ss) and specifiy collagen
as "F". Parameters for collagen are taken from: Gautieri et al.,
J. Chem. Theory Comput., 2010, 6, 1210-1218.
With multimodel input files, the secondary structure as determined with
DSSP will be averaged over the frames. In this case, a cutoff
can be specified (-ssc) indicating the fraction of frames to match a
certain secondary structure type for designation.
Topology
--------
Several options are available to tune the resulting topology. By
default, termini are charged, and chain breaks are kept neutral. This
behaviour can be changed using -nt and -cb, respectively.
Disulphide bridges can be specified using -cys. This option can be
given multiple times on the command line. The argument is a pair of
cysteine residues, using the format
chain/resn/resi,chain/resn/resi.
It is also possible to let martinize detect cysteine pairs based on a
cut-off distance of 0.22nm, by giving the keyword 'auto' as argument to -cys.
Alternatively, a different cut-off distance can be specified, which
will also trigger a search of pairs satisfying the distance
criterion (eg: -cys 0.32).
In addition to cystine bridges, links between other atoms can be
specified using -link. This requires specification of the atoms, using
the format
chain/resi/resn/atom,chain/resi/resn/atom,bondlength,forceconstant.
If only two atoms are given, a constraint will be added with length
equal to the (average) distance in the coordinate file. If a bond
length is added, but no force constant, then the bondlength will be
used to set a constraint.
Linking atoms requires that the atoms are part of the same
moleculetype. Therefore any link between chains will cause the chains
to be merged. Merges can also be specified explicitly, using the
option -merge with a comma-separated list of chain identifiers to be
joined into one moleculetype. The option -merge can be used several
times. Note that specifying a chain in several merge groups will cause
all chains involved to be merged into a single moleculetype.
The moleculetype definitions are written to topology include (.itp)
files, using a name consisting of the molecule class (e.g. Protein)
and the chain identifier. With -name a name can be specified instead.
By default, martinize only writes a moleculetype for each unique
molecule, inferred from the sequence and the secondary structure
definition. It is possible to force writing a moleculetype definition
for every single molecule, using -sep.
The option -p can be used to write position restraints, using the
force constant specified with -pf, which is set to 1000 kJ/mol
by default.
For stability, elastic bonds are used to retain the structure of
extended strands. The option -ed causes dihedrals to be used
instead.
Different forcefields can be specified with -ff. All the parameters and
options belonging to that forcefield will be set (eg. bonded interactions,
BB-bead positions, Elastic Network, etc.). By default martini 2.1 is
used.
Elastic network
---------------
Martinize can write an elastic network for atom pairs within a cutoff
distance. The force constant (-ef) and the upper distance bound (-eu)
can be speficied. If a force field with an intrinsic Elastic
network is specified (eg. Elnedyn) with -ff, -elastic in implied and
the default values for the force constant and upper cutoff are used.
However, these can be overwritten.
Multiscaling
------------
Martinize can process a structure to yield a multiscale system,
consisting of a coordinate file with atomistic parts and
corresponding, overlaid coarsegrained parts. For chains that are
multiscaled, rather than writing a full moleculetype definition,
additional [atoms] and [virtual_sitesn] sections are written, to
be appended to the atomistic moleculetype definitions.
The option -multi can be specified multiple times, and takes a chain
identifier as argument. Alternatively, the keyword 'all' can be given
as argument, causing all chains to be multiscaled.
========================================================================\n
""",
("-f", Option(str, 1, None, "Input file (PDB|GRO)")),
("-o", Option(str, 1, None, "Output topology (TOP)")),
("-x", Option(str, 1, None, "Output coarse grained structure (PDB)")),
("-n", Option(str, 1, None, "Output index file with CG (and multiscale) beads.")),
("-nmap", Option(str, 1, None, "Output index file containing per bead mapping.")),
("-v", Option(bool, 0, False, "Verbose. Be load and noisy.")),
("-h", Option(bool, 0, False, "Display this help.")),
("-ss", Option(str, 1, None, "Secondary structure (File or string)")),
("-ssc", Option(float, 1, 0.5, "Cutoff fraction for ss in case of ambiguity (default: 0.5).")),
("-dssp", Option(str, 1, None, "DSSP executable for determining structure")),
# ("-pymol", Option(str, 1, None, "PyMOL executable for determining structure")),
("-collagen", Option(bool, 0, False, "Use collagen parameters")),
("-his", Option(bool, 0, False, "Interactively set the charge of each His-residue.")),
("-nt", Option(bool, 0, False, "Set neutral termini (charged is default)")),
("-cb", Option(bool, 0, False, "Set charges at chain breaks (neutral is default)")),
("-cys", Option(lists['cystines'].append, 1, None, "Disulphide bond (+)")),
("-link", Option(lists['links'].append, 1, None, "Link (+)")),
("-merge", Option(lists['merges'].append, 1, None, "Merge chains: e.g. -merge A,B,C (+)")),
# ("-mixed", Option(bool, 0, False, "Allow chains of mixed type (default: False)")),
("-name", Option(str, 1, None, "Moleculetype name")),
("-p", Option(str, 1, 'None', "Output position restraints (None/All/Backbone) (default: None)")),
("-pf", Option(float, 1, 1000, "Position restraints force constant (default: 1000 kJ/mol/nm^2)")),
("-ed", Option(bool, 0, False, "Use dihedrals for extended regions rather than elastic bonds)")),
("-sep", Option(bool, 0, False, "Write separate topologies for identical chains.")),
("-ff", Option(str, 1, 'martini22', "Which forcefield to use: "+' ,'.join(n for n in forcefields))),
# Fij = Fc exp( -a (rij - lo)**p )
("-elastic", Option(bool, 0, False, "Write elastic bonds")),
("-ef", Option(float, 1, 500, "Elastic bond force constant Fc")),
("-el", Option(float, 1, 0, "Elastic bond lower cutoff: F = Fc if rij < lo")),
("-eu", Option(float, 1, 0.90, "Elastic bond upper cutoff: F = 0 if rij > up")),
("-ea", Option(float, 1, 0, "Elastic bond decay factor a")),
("-ep", Option(float, 1, 1, "Elastic bond decay power p")),
("-em", Option(float, 1, 0, "Remove elastic bonds with force constant lower than this")),
("-eb", Option(str, 1, 'BB', "Comma separated list of bead names for elastic bonds")),
# ("-hetatm", Option(bool, 0, False, "Include HETATM records from PDB file (Use with care!)")),
("-multi", Option(lists['multi'].append, 1, None, "Chain to be set up for multiscaling (+)")),
]
# Martini Quotes
martiniq = [
("Robert Benchley",
"Why don't you get out of that wet coat and into a dry martini?"),
("James Thurber",
"One martini is all right, two is two many, three is not enough"),
("Philip Larkin",
"The chromatic scale is what you use to give the effect of drinking a quinine martini and having an enema simultaneously."),
("William Emerson, Jr.",
"And when that first martini hits the liver like a silver bullet, there is a sigh of contentment that can be heard in Dubuque."),
("Alec Waugh",
"I am prepared to believe that a dry martini slightly impairs the palate, but think what it does for the soul."),
("Gerald R. Ford",
"The three-martini lunch is the epitome of American efficiency. Where else can you get an earful, a bellyful and a snootful at the same time?"),
("P. G. Wodehouse",
"He was white and shaken, like a dry martini."),
]
desc = ""
def help():
"""Print help text and list of options and end the program."""
import sys
for item in options:
if type(item) == str:
print item
for item in options:
if type(item) != str:
print "%10s %s" % (item[0], item[1].description)
print
sys.exit()