Idea for version control #18
Replies: 4 comments
-
Thanks Sean, I think this is a great plan. In the changelog, when methods
change, we can also flag the commit(s) that show how the code carrying out
the computations has changed.
The plan also has the added bonus of letting us track attribution and use of
our data more cleanly using the DOI and access date/release version.
On Thu, Nov 9, 2023 at 9:30 AM Sean Rohan ***@***.***> wrote:
Following up about ideas for version control and tracking changes.
Here's one idea:
- Create a single DOI for GAP_PRODUCTS data product along with a
public metadata record. Create a data management plan and archive a
snapshot once a year (with NCEI?) to ensure it's discoverable.
- Maintain a continuous versioning and update the version after each
update, including changes that occur within a year. Name versions using a
YYYY.MM.DD-R scheme (e.g. 2023.11.09-1) to make it easier to figure out
which version of the data folks are using.
- Encourage users to cite the data product and include the accession
date and DOI in the citation. Provide a recommended citation in the
documentation.
- Create a plain text NEWS/changelog that describes changes in each
versioned release. Use descriptive titles for changes, e.g.:
November 11, 2023
GAP_PRODUCTS Version 2023.11.09-1
A brief description of the changes in this versioned release.
DATA UPDATE
- Corrected northern rock sole (species_code = 10261) length data errors from the 2017 NBS survey. Added 500 length samples from 20 hauls (vessel 162, cruise 201702) that were erroneously omitted from the database due to a data transformation error.
- Grid cells for the Gulf of Alaska stratum areas recalculated based on...
- Corrected surface temperature error from cruise 201902, vessel 94, haul 38. Temperature erroneously calculated as XX changed to YY based on ZZ.
METHODOLOGICAL CHANGE
- Method for calculating design-based abundance indices for REGION X changed from YY to ZZ.
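The YYYY.MM.DD-R naming scheme proposed above could be sketched in Python roughly as follows. This is only an illustration, not project code: it assumes that R is a counter of releases made on the same date (the thread does not define R), and the names `parse_version` and `next_version` are hypothetical.

```python
import re
from datetime import date

# Matches version strings such as "2023.11.09-1" (YYYY.MM.DD-R).
VERSION_PATTERN = re.compile(r"^(\d{4})\.(\d{2})\.(\d{2})-(\d+)$")


def parse_version(version: str) -> tuple[date, int]:
    """Split a YYYY.MM.DD-R string into (release date, release counter)."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError(f"Not a YYYY.MM.DD-R version: {version!r}")
    year, month, day, release = (int(group) for group in match.groups())
    return date(year, month, day), release


def next_version(existing: list[str], today: date) -> str:
    """Return the next version name for a release made on `today`.

    Assumption: R starts at 1 and increases by one for each additional
    release made on the same date.
    """
    parsed = [parse_version(v) for v in existing]
    same_day = [release for released_on, release in parsed if released_on == today]
    release = max(same_day, default=0) + 1
    return f"{today:%Y.%m.%d}-{release}"
```

For example, `next_version([], date(2023, 11, 9))` gives `2023.11.09-1`, and a second release on the same day would be named `2023.11.09-2`. Because `parse_version` returns a (date, counter) pair, `sorted(versions, key=parse_version)` orders releases chronologically.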
--
Lewis Barnett, PhD (he/him/his)
Research Fish Biologist
NOAA Fisheries, Alaska Fisheries Science Center
7600 Sand Point Way NE, Bldg 4
Seattle, Washington 98115
Google Voice: (206) 526-4111
-
I also think these are great ideas! There is more work to do on these points, but I wanted to let you know we are working towards this, and I have a few small updates.

I've put together the very first steps towards (idea 2) preparing the .txt files for the news page with 67c0a0e, where @zoyafuso-NOAA and I created the first .txt changelog files (these do not yet meet all of the points above), and (idea 4) updating the news.qmd page to curate these .txt files with 91580d0. I am rerunning the quarto book now and hope to have these initial changes implemented on the page soon.

Regarding ideas 1 and 3: we still need to prepare a DOI for this quarto book and DOIs for these data products, but there is the current CITATION.bib file for the data products documented in this quarto book. I think we want to hold off until the spring to make these DOIs, as that is the deadline for SSMA and others to provide their final review of the data. We are still working out the exact data archiving plan (noted in idea 1) and have a few ideas to review with OFIS.

I'll continue to work on these and report back with progress! @sean-rohan-NOAA can you send us a little more information about sharing data with NCEI, if you have it?
-
To add here, the temporary solution is to save a zipped file on the G: drive (G:\GAP_PRODUCTS_Archives) after each test production run. The zipped file contains a version of the GAP_PRODUCTS tables (as .csv files), the input data (RDS file) and reference tables (as .csv files) used to produce those tables, and the changelog text doc. Each run is labeled by the date it was run.
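As a rough illustration of that archiving step, a dated zip file could be built along these lines in Python. This is a sketch under assumptions rather than the script actually used: the function name `archive_run` and the `GAP_PRODUCTS_<date>.zip` file name pattern are hypothetical, and the caller supplies the list of files to bundle.

```python
import zipfile
from datetime import date
from pathlib import Path


def archive_run(files: list[Path], archive_dir: Path, run_date: date) -> Path:
    """Bundle the outputs of one production run into a zip labeled by run date.

    `files` would hold the table .csv files, the input RDS file, the
    reference table .csv files, and the changelog text doc.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming pattern; the thread only says runs are labeled by date.
    archive_path = archive_dir / f"GAP_PRODUCTS_{run_date:%Y-%m-%d}.zip"
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for file in files:
            # Store each file under its base name, dropping the source directory.
            zf.write(file, arcname=file.name)
    return archive_path
```

Calling `archive_run(files, Path(r"G:\GAP_PRODUCTS_Archives"), date.today())` would write one archive per run date. Files are stored by base name only, so inputs that share a file name would need distinct names before archiving.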
-
@EmilyMarkowitz-NOAA @zoyafuso-NOAA Great to see this moving forward. Nancy Roberson is probably the best person to talk to about archiving with NCEI if that's a direction you want to move in, although I was only suggesting that as a potential avenue because it's the one I'm familiar with. You've obviously done quite a bit of work with FOSS but, since I work more with environmental data products, I'm more familiar with NOSS. With NCEI, there's a process involved with preparing and submitting the data, and I'm not sure about the constraints and limitations placed on providing the data. More info about NCEI's services: https://www.ncei.noaa.gov/services
-