-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Save atom hybridization #408
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually we'll need to do a migration I think.
I.e. what if you deserialize and there's only 7 entries in atom
?
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #408 +/- ##
=======================================
Coverage 98.66% 98.66%
=======================================
Files 36 36
Lines 2097 2103 +6
=======================================
+ Hits 2069 2075 +6
Misses 28 28 ☔ View full report in Codecov by Sentry. 🚨 Try these New Features:
|
# Conflicts: # gufe/components/smallmoleculecomponent.py
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
Oh dear looks like we need someone to run the lint in a different PR first! |
Yeah I think Mike is doing that 😅 |
Yeah it looks like it was done in #421 but something might have gone wrong it looks like pre-commit is failing on main and it was to change a lot of files here, tagging @mikemhenry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A question on what happens when you don't have the property set and you try to write.
@@ -223,6 +237,7 @@ def _to_dict(self) -> dict: | |||
_ATOMCHIRAL_TO_INT[atom.GetChiralTag()], | |||
atom.GetAtomMapNum(), | |||
atom.GetPropsAsDict(includePrivate=False), | |||
_HYBRIDIZATION_TO_INT[atom.GetHybridization()], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens in this case if this isn't set?
I.e. if you read and old file (which now skips the hybridization reading), then write it out again - what does this pick up?
For backwards compatibility make sure we can create an SMC with missing hybridization info. | ||
""" | ||
phenol_dict = phenol.to_dict() | ||
new_atoms = [] | ||
for atom in phenol_dict["atoms"]: | ||
# remove the hybridization atomic info which should be at index 7 | ||
new_atoms.append(tuple([atom_info for i, atom_info in enumerate(atom) if i != 7])) | ||
phenol_dict["atoms"] = new_atoms | ||
new_phenol = SmallMoleculeComponent.from_dict(phenol_dict) | ||
# they should be different objects due to the missing hybridization info | ||
assert new_phenol != phenol | ||
# make sure the rdkit objects are different | ||
for atom_hybrid, atom_no_hybrid in zip(phenol.to_rdkit().GetAtoms(), new_phenol.to_rdkit().GetAtoms()): | ||
assert atom_hybrid.GetHybridization() != atom_no_hybrid.GetHybridization() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@IAlibay this test covers loading an old SMC with the missing info it looks like Rdkit defaults to setting the hybridization as undefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks - in that case we should let users know they've hit a bug (just so that they don't end up silently dealing with this). Could you add a warning please?
Fixes #407 by saving the atom hybridization in the
to_dict
method and uses it in thefrom_dict
method. This should now cause an error when loading molecules which were serialised using the oldto_dict
method though so we might need to think about how to support backwards compatibility with old serialised objects.