CLI: separate established families from non-established ones #94
ad98e2e to c4a0d37
@sphuber if you agree with the implementation, I'll add another commit with updates to the documentation.
As I wrote in #92, does this now also work for Pseudo Dojo? I think their metadata files have yet another format. In fact, it is not a plain JSON file but an archive. It needs to be parsed with So more and more I am starting to wonder whether we shouldn't just add optional flags
That's fair, I wasn't aware of the different metadata format for pseudo-dojo. Will adjust!
After giving it some more thought, I fully agree with the principle that we should try to make sure that when the user installs a pseudopotential family with established protocols using the CLI, the label, pseudos and recommended cutoffs represent the official ones. To adhere to this, I propose the following:
If you agree, I'll add these notes to the "Design" section of the tutorial, and tell users to open an issue in case they want to request support for a new set of "established" pseudopotentials.
[1] A pseudopotential family is considered established when it has a set of rigorously tested pseudopotentials with convergence tests which have been published or are publicly available (also see #78).
Fully agree with all of what you wrote in the last comment.
c4a0d37 to 98282a8
@sphuber will update the tests and finalize the docs once we have agreed on the implementation. I thought that having a
314101d to e627750
My first instinct was also to have My biggest concern though is how we make sure that archives and metadata passed through these flags are actually "correct". I guess that we put their md5 in the description so we could check afterwards. But should we be more restrictive and have a hash table of "correct" md5s? Then again, I am not sure if we are running into the same trap where we are just making life hard for well-meaning citizens. Also, how would the label of the family be specified when using Maybe all these problems go away by actually going with the
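The "hash table of correct md5s" idea raised above could look roughly like the sketch below. It is only an illustration of the check, not code from this PR: the helper names `md5_of_file` and `is_known_archive`, the file name and the file contents are all made up for the example, and the set of known checksums is built on the fly instead of being shipped with the package.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(filepath, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    hasher = hashlib.md5()
    with open(filepath, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            hasher.update(chunk)
    return hasher.hexdigest()


def is_known_archive(filepath, known_md5s):
    """Hypothetical check: is the file's MD5 in the set of accepted checksums?"""
    return md5_of_file(filepath) in known_md5s


with tempfile.TemporaryDirectory() as dirpath:
    # Placeholder file standing in for a downloaded pseudopotential archive.
    archive = Path(dirpath) / 'archive.tar.gz'
    archive.write_bytes(b'placeholder archive content')

    # In a real setup this set would be fixed in advance, not computed here.
    known = {md5_of_file(archive)}
    verified = is_known_archive(archive, known)

    # Any change to the file, deliberate or from a bad download, changes the digest.
    archive.write_bytes(b'tampered content')
    tampered = is_known_archive(archive, known)
```

A check like this would also catch an archive that was corrupted during download, but as noted in the discussion it only makes tampering less convenient; it cannot prevent it.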
Yeah, even with the
Yes! That must be it. I was thinking that we could simply take the label from the individual filenames since that is the case for the SSSP, but that might not necessarily be the case and is definitely not the case in general (already doesn't hold for PD). It also makes a lot of sense given the reason we put this in to have the One remaining question is then the format of the downloaded pack that is consumed by
Paranoia mode engaged: If we really want to prevent people from messing with it, we can encrypt the JSON file so it is not easy to modify the data and checksums in there. For some reason I keep coming back to this idea that we want to make sure the right things get installed, but if people really want to install an incorrect version they can always circumvent our measures and really that would be their problem. So forget this idea 😄
We could still make it more difficult for the user to defile the integrity of the SSSP and Pseudo-Dojo families by blocking the Python API methods for changing the recommended cutoffs. Or, if we want to be more lenient, at least protect the established stringencies. :) And then we could also encrypt the recommended cutoffs in the database so it's more difficult to do any kind of postgres hack. 😁
With the source code being openly accessible, this would be easily decrypted though :)
Alright, I guess we'll let them hack the postgres database at their leisure. 😉 Still, having "protected stringencies" for established pseudopotential families/libraries might be a good idea?
Maybe. I couldn't find the details in the discussion above. What would this look like in terms of interface and implementation?
I'll leave this for a different PR, still have to (re-)think about this a bit. I'll make the changes re Once this is complete I'll open an issue where we can discuss the protected stringencies idea.
Currently the `family cutoffs set` command requires that the JSON file that contains the cutoffs _only_ has the cutoff keys specified for each element. This is because the cutoffs are validated by the `RecommendedCutoffMixin.validate_cutoffs()` method, which doesn't allow for extraneous keys. However, the `.json` file could have been adapted by the user from e.g. the SSSP `metadata.json` or some other dictionary that also contains other keys. As long as the cutoff keys are defined, we shouldn't raise an error just because there are other keys present. Here we first filter the data loaded from the JSON file for the cutoffs, before passing the dictionary to the `set_cutoffs` method.
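The filtering step described in this commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in the PR: the helper name `filter_cutoffs` is hypothetical, and the key names `cutoff_wfc` and `cutoff_rho` as well as the element-to-cutoffs layout are taken from the test data in this PR.

```python
import json

# Assumed cutoff keys, taken from the test data in this PR.
CUTOFF_KEYS = ('cutoff_wfc', 'cutoff_rho')


def filter_cutoffs(data):
    """Return a copy of ``data`` that keeps only the cutoff keys for each element.

    Hypothetical helper: raises ``ValueError`` if an element lacks a cutoff key,
    but silently drops any other keys.
    """
    filtered = {}
    for element, values in data.items():
        missing = [key for key in CUTOFF_KEYS if key not in values]
        if missing:
            raise ValueError(f'element `{element}` is missing cutoff keys: {missing}')
        filtered[element] = {key: values[key] for key in CUTOFF_KEYS}
    return filtered


# A user-adapted file with an extraneous key next to the cutoffs.
raw = json.loads('{"Ar": {"cutoff_wfc": 30, "cutoff_rho": 240, "GME": "moon"}}')
cutoffs = filter_cutoffs(raw)
```

The filtered dictionary is what would then be handed to `set_cutoffs`, so the strict validation can stay in place while extra keys in the file no longer cause an error.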
One of the motivations for having dedicated and automated CLI commands to install established pseudopotential families like the SSSP or the Pseudo-dojo norm-conserving pseudopotentials is to prevent the pseudopotentials or recommended cutoffs from deviating from the official ones. However, the `family cutoffs set` command currently allows the user to change the recommended cutoffs easily through the CLI. To prevent this, we block usage of the command for `SsspFamily` and `PseudoDojoFamily` instances by adding an `exclude` input argument to the `PseudoPotentialFamilyParam`, where the entry points of the classes that should be disallowed can be provided. This is then used to adapt the option decorator for the `PSEUDO_POTENTIAL_FAMILY` input argument of `family cutoffs set`. Note that the user _can_ still install e.g. the SSSP with their own recommended cutoffs using the `install family` command as a `CutoffsPseudoPotentialFamily`. However, the `SsspFamily` class is reserved for pseudopotentials installed with the automated install command.
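The `exclude` mechanism could be pictured along these lines. The sketch is deliberately independent of the real `PseudoPotentialFamilyParam`: the class `FamilyParam`, its `convert` signature and the entry point strings are placeholders invented for illustration, and the actual parameter type loads the family from the database rather than receiving its entry point as an argument.

```python
class FamilyParam:
    """Hypothetical stand-in for a CLI parameter type with an ``exclude`` argument."""

    def __init__(self, exclude=None):
        # Entry point names of family classes that this parameter should reject.
        self.exclude = tuple(exclude or ())

    def convert(self, label, entry_point):
        """Return the label if the family's entry point is allowed, else raise."""
        if entry_point in self.exclude:
            raise ValueError(
                f'family `{label}` has entry point `{entry_point}`, '
                'which is excluded for this command'
            )
        return label


# The entry point strings below are placeholders, not necessarily the real ones.
param = FamilyParam(exclude=('pseudo.family.sssp', 'pseudo.family.pseudo_dojo'))

# A family of a non-excluded class passes through unchanged.
allowed = param.convert('my-family', 'pseudo.family.cutoffs')
```

Keeping the list of blocked classes as an argument of the parameter type means each command can decide for itself which families it refuses, instead of hard-coding the restriction in one place.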
Our workaround for installing an "established" family like a `SsspFamily` or `PseudoDojoFamily` required using the `install family` command, but this is explicitly forbidden since we do not want the user to be able to install an arbitrary set of pseudopotentials as one of these classes. This means there is currently no way to install the archive and metadata obtained with the `--download` option of the automated install commands. Here, we add the `--from-dir` option to the automated install commands to allow the user to install the downloaded pseudopotentials and set the recommended cutoffs automatically for an `SsspFamily` or `PseudoDojoFamily`.
Thanks @mbercx. Fine to get this merged today, but there are a few minor things to touch up. Also, I think it should be rebased again because it contains changes to the docs that should already be in master. If you want me to fix these things because you are busy with the tutorial, let me know; I can quickly fix it.
assert sorted(family.get_cutoff_stringencies()) == sorted(['low', 'normal'])

# Invalid cutoffs structure
filepath.write_text(json.dumps({'Ar': {'cutoff_rho': 300, 'cutoff_wfc': 300, 'GME': 'moon'}}))
I am confused by this test. I thought the whole idea of one of the commits was to allow for extraneous keys, like `GME` in this example, and not have the command raise an exception because of it, but simply ignore it. So what is going on here? Why should this command fail?
Yeah, seems the command was actually failing for a different reason (i.e. not all elements being present). This has now been fixed in #132.
Oh, wait, I'm still working on adapting the PR for the
e627750 to f178ea2
@sphuber the PR isn't super urgent: it's probably only a small benefit in speed for the AiiDAlab docker stack deployment. I haven't managed to get to your comments, but have already implemented a first draft of using the Will fix tests and update documentation once you agree with the implementation. A little demo:

$ aiida-pseudo install sssp --download-only
Info: downloading selected pseudopotentials archive... [OK]
Info: downloading selected pseudopotentials metadata... [OK]
$ aiida-pseudo install sssp --from-download SSSP_1.1_PBE_efficiency.aiida_pseudo
Info: unpacking archive and parsing pseudos... [OK]
Success: installed `SSSP/1.1/PBE/efficiency` containing 85 pseudopotentials
$ aiida-pseudo install sssp --from-download SSSP_1.1_PBE_efficiency.aiida_pseudo
Critical: SsspFamily<SSSP/1.1/PBE/efficiency> is already installed

Perfectly fine to continue working on this once I'm back from holiday. :)
I'm also leaving this weekend, so let's wrap this up when we're both back.
Wondering whether there might be a rare case that Do we need to provide a way that the files can be downloaded separately from the browser or by
I would have also thought that it wouldn't be possible to have the download return successfully while the file is actually corrupt. However, I remember another user who reported this behavior and they also suffered from bad connections. Seems this really is possible then. If we cannot make the
@mbercx Think something may have gone wrong in your rebasing. Could you have a look and fix this branch before I review it?
@sphuber I'm splitting up this PR into separate ones. Still have to open the final PR that adds the
Fixes #92
Fixes #93
Two small commits that fix the issues related to the workaround we have set up for users who cannot use the automated install because they are unable to download the files with these commands. See the commit messages for more info.
Demo that it works: