A step in many researchers' workflows is data cleaning: taking data from public repositories or their own lab output and preparing it for use in an analysis. Being able to track how that data was cleaned is an important part of making the research reproducible, but there currently aren't many how-tos on this process or on the importance of this step. It would be interesting to discuss including a module on data cleaning in a reproducible research workshop, or developing one that we can point to online.
One example would be a module for using OpenRefine reproducibly.
I completely agree that it's important to make this part reproducible. And for data cleaning it's particularly important to capture motivation (the why and not just the what). For example, the results may be completely reproducible, but why did you remove subject A and not subject B?
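As a rough illustration of capturing the why alongside the what, here is a minimal Python sketch that only removes a record when a written reason is supplied, and keeps a log of each removal. Everything in it is an assumption for illustration: the function name `apply_exclusions`, the record layout, and the sample subjects and reason are hypothetical and not taken from any existing workshop material.

```python
import json


def apply_exclusions(records, exclusions):
    """Drop records whose id is in `exclusions`; return (kept, log).

    `exclusions` maps a record id to the human-written reason for removal,
    so a record cannot be dropped without stating why.
    """
    kept, log = [], []
    for rec in records:
        reason = exclusions.get(rec["id"])
        if reason is None:
            kept.append(rec)
        else:
            log.append({"id": rec["id"], "action": "removed", "reason": reason})
    return kept, log


# Hypothetical data, for illustration only.
records = [
    {"id": "A", "score": 12.0},
    {"id": "B", "score": 11.5},
    {"id": "C", "score": 13.1},
]
exclusions = {"A": "withdrew consent after session 1"}

kept, log = apply_exclusions(records, exclusions)

# The log can be saved next to the cleaned data and kept under version control.
print(json.dumps(log, indent=2))
```

Keeping the exclusions in a plain mapping means the reasons live in the same place as the decisions, so a reader can see why subject A was removed and subject B was not.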