-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LHCb Open Data Curation scripts #154
base: master
Are you sure you want to change the base?
Conversation
lhcb-YYYY-open-data-curation/LICENCE
Outdated
@@ -0,0 +1,674 @@ | |||
GNU GENERAL PUBLIC LICENSE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove LINCENSE from this directory, since we have the top-level one already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, sounds good.
@@ -0,0 +1,36 @@ | |||
import os |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW the file lhcb-YYYY-open-data-curation/scripts/staging/ExistingFilesOnEosPublic.txt
should be probably excluded from git, since it is a moving object...
What we use elsewhere is:
- we put it into
inputs
folder (since it's not strictly speaking a "script" - we add it to
.gitignore
so, when you run the scripts, the file will live "locally" but won't be committed to the upstream data-curation repository
Would this be OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to host this somewhere from the ODP (maybe on eospublic itself?). If a new person starts working on the release or something changes and we don't commit the file it breaks the workflow because metadata_writer expects the file to be there. I can add some functionality to prepare files for staging even without this file but we should point to it as well.
@@ -0,0 +1,39 @@ | |||
# How to prepare the metadata for an LHCb OD release |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently the PR has:
- lhcb-YYYY-open-data-curation/README.md
- lhcb-YYYY-open-data-curation/scripts/README.md
What about creating only one README file living in lhcb-YYYY-open-data-curation
directory that would contain everything people need to know when working with scripts, and leave the scripts
directory without any README?
(That's the technique we use for other data curation campaigns.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will fix.
@@ -0,0 +1,39 @@ | |||
#!/usr/bin/env python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cosmetics: we try to standardise on using the black
code formatter. If OK with you, we could add pass these scripts via black too...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two approaches to this, one was more manual, the other one should work a bit more automatically but needs to be checked.
@@ -0,0 +1,51 @@ | |||
#!/bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cosmetics: we could pass the shell scripts via shellcheck
, there are several observations to improve some quote expansions etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will look into that.
Pushed the most up to date code for preparing OD release |
Adding LHCb metadata writer and stripping pages creation scripts. As well as the codebase with helper scripts / directories of data to write out.