oemol.GetConfs() consuming large amount of memory even when no conformers are present #1855
Comments
I tried implementing this since it should be easy, but it's not. Simply adding a
In [32]: oemol = Molecule.from_smiles("CCO").to_openeye()
In [33]: oemol.NumConfs()
Out[33]: 1
In [34]: [*oemol.GetConfs()][0].GetCoords()
Out[34]:
{0: (0.0, 0.0, 0.0),
1: (0.0, 0.0, 0.0),
2: (0.0, 0.0, 0.0),
3: (0.0, 0.0, 0.0),
4: (0.0, 0.0, 0.0),
5: (0.0, 0.0, 0.0),
6: (0.0, 0.0, 0.0),
7: (0.0, 0.0, 0.0),
8: (0.0, 0.0, 0.0)}
In [35]: molecule = Molecule.from_smiles("O=S(=O)(N)c1c(Cl)cc2c(c1)S(=O)(=O)NCN2")
In [36]: molecule.generate_conformers(n_conformers=1)
In [37]: oemol = molecule.to_openeye()
In [38]: oemol.NumConfs()
Out[38]: 1
In [39]: [*oemol.GetConfs()][0].GetCoords()
Out[39]:
{0: (1.8719326257705688, 3.7204949855804443, 2.2212681770324707),
1: (1.2912099361419678, 4.097604274749756, 0.9475870132446289),
2: (0.3753527104854584, 5.218091011047363, 0.8554574251174927),
3: (2.534075975418091, 4.290732383728027, -0.20339979231357574),
4: (0.4765625, 2.689453125, 0.296875),
5: (-0.5654296875, 2.794921875, -0.62060546875),
6: (-1.133737325668335, 4.327646255493164, -1.178165078163147),
7: (-1.181640625, 1.642578125, -1.119140625),
8: (-0.7685546875, 0.360595703125, -0.7216796875),
9: (0.280029296875, 0.290283203125, 0.2086181640625),
10: (0.90185546875, 1.43359375, 0.71923828125),
11: (0.84521484375, -1.2734375, 0.802734375),
12: (1.9853515625, -1.6318359375, -0.0174713134765625),
13: (0.93994140625, -1.193359375, 2.24609375),
14: (-0.484130859375, -2.26953125, 0.403076171875),
15: (-0.9970703125, -2.099609375, -0.96826171875),
16: (-1.4326171875, -0.7451171875, -1.2314453125),
17: (2.4708824157714844, 5.089303016662598, -0.8458374738693237),
18: (3.4932398796081543, 4.066527366638184, 0.08694052696228027),
19: (-2.0012941360473633, 1.7379204034805298, -1.8306996822357178),
20: (1.7089277505874634, 1.3463486433029175, 1.4417718648910522),
21: (-1.2067643404006958, -2.400458812713623, 1.1223875284194946),
22: (-0.20102126896381378, -2.3644607067108154, -1.6715291738510132),
23: (-1.8237838745117188, -2.7984254360198975, -1.1371092796325684),
24: (-2.3384859561920166, -0.6081215143203735, -1.6641637086868286)}
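For what it's worth, the session above suggests that NumConfs() reports 1 whether or not conformers were generated, and that the placeholder conformer comes back with every atom at the origin. A minimal sketch of a check built on that observation, written against a plain dict in the shape GetCoords() returns so it runs without an OpenEye license. The helper name `looks_like_placeholder` and the all-zero heuristic are my assumptions, not toolkit or OpenEye API, and the check still needs the coordinates to have been fetched first, so it does not avoid the GetCoords() call itself.

```python
from typing import Mapping, Tuple

# Shape of the mapping shown in the session above: atom index -> (x, y, z)
Coords = Mapping[int, Tuple[float, float, float]]


def looks_like_placeholder(coords: Coords, tol: float = 0.0) -> bool:
    """Hypothetical helper: True if every atom sits at the origin (within tol).

    Assumption: a real conformer of a multi-atom molecule never has all atoms
    at (0, 0, 0). Single-atom and empty inputs are never flagged, because a
    lone atom at the origin is a perfectly valid geometry.
    """
    if len(coords) < 2:
        return False
    return all(abs(c) <= tol for xyz in coords.values() for c in xyz)


# Ethanol without generated conformers: nine atoms, all zeros (from Out[34]).
ethanol_coords = {i: (0.0, 0.0, 0.0) for i in range(9)}

# First few atoms of the generated conformer (from Out[39]).
generated_coords = {
    0: (1.8719326257705688, 3.7204949855804443, 2.2212681770324707),
    1: (1.2912099361419678, 4.097604274749756, 0.9475870132446289),
    4: (0.4765625, 2.689453125, 0.296875),
}

print(looks_like_placeholder(ethanol_coords))
print(looks_like_placeholder(generated_coords))
```

The `tol` parameter is there only in case the zeros are not exact in some setups; with the output shown above an exact comparison is enough.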
Okay, actually thinking about this a little more clearly, using
There's also
Hm, the courtesy conformer is annoying. This is low priority at best since it's a very moderate amount of memory even for a decent sized protein. Thanks for looking into it!
Describe the bug
Not a bug per se, but could impact on toolkit usability for large molecules -- while debugging openforcefield/openff-nagl#101 I saw that converting molecules to and from OpenEye consumes a large amount of memory that is not seen with RDKit. For a 5177 atom protein, calling Molecule.from_openeye consumes about 800 MiB. Memray attributes most of this to oeconf.GetCoords, even though no conformers are generated or attached at any point to the molecule. Would it be possible to check for conformers before calling conf.GetCoords? (It may be that this triggers the same memory-consuming process, though!)
To Reproduce
mre.py (also attached):
Requires memray installed:
The screenshot points to this line:
openff-toolkit/openff/toolkit/utils/openeye_wrapper.py
Line 1329 in 97af593
Output
Computing environment (please complete the following information):
conda list
Additional context
mre.zip
Manifest: