This repo contains all the R code for the C2M2 Kathmandu portal, from KoboCollect to ADS preparation.
/raw
contains data received as-is from Kobo.
/utils
contains R scripts shared by other modules.
/experimental
contains test code, intermediate outputs and so on.
/core
contains R scripts for the data pipeline, organized by survey, i.e., scripts covering the Kobo ➜ ADS flow for the workforce and business surveys.
- Base table schema design + generation of data dictionary.
- Data cleaning and base table preparation in R.
- Export base table to PG database.
- Use PG database as source for further analysis + ADS preparation code.
- Create PG Dump file for populating APIDB.
Use this space to write down observations.
- Ran into an issue with RPostgreSQL not installing on Linux. Use this command to fix:
sudo apt-get install libpq-dev
- Steps for exporting the PG database:
Inside the container:
docker exec -it <CONTAINER_ID> bash
pg_dump -U c2m2 -h localhost c2m2 > /var/tmp/c2m2_dump.sql
Outside the container:
docker cp <CONTAINER_ID>:/var/tmp/c2m2_dump.sql ./
- Wasn't able to correctly parse UTF-8 files on Windows. The issue arose when trying to import labels for survey question choices. I had to switch to RStudio Server in WSL2 (how-to here).
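The database export steps listed above can also be run in one go from the host. The following is a minimal sketch, not the repo's own tooling: `export_c2m2_dump` is a hypothetical helper name, and it assumes the `c2m2` user can connect without an interactive password prompt (for example because `PGPASSWORD` is set inside the container), since `docker exec` is called here without `-it`.

```shell
# Sketch only: export_c2m2_dump is a hypothetical helper, not part of this repo.
# Assumes the c2m2 user can connect without an interactive password prompt
# (for example because PGPASSWORD is set inside the container).
export_c2m2_dump() {
  # $1: container ID or name; $2: output file on the host (optional)
  if [ -z "${1:-}" ]; then
    echo "usage: export_c2m2_dump <CONTAINER_ID> [output_file]" >&2
    return 2
  fi
  # pg_dump runs inside the container; its stdout is redirected on the host,
  # so the /var/tmp copy and the docker cp step are not needed.
  docker exec "$1" pg_dump -U c2m2 -h localhost c2m2 > "${2:-c2m2_dump.sql}"
}
```

Usage would look like `export_c2m2_dump <CONTAINER_ID> ./c2m2_dump.sql`. If the container does ask for a password, the interactive steps above remain the safer route.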
Should we gitignore raw files?
Maybe it's a good idea to do that, but we might need to include the original XLSForm, in case someone wants to carry out their own deployment.
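One possible way to do both is to ignore everything under the raw directory except the XLSForm. The sketch below is only an illustration: `write_raw_ignore_rules` is a hypothetical helper, and it assumes the XLSForm is stored as an `.xlsx` file directly under `raw/`, which may not match the actual layout.

```shell
# Sketch only: write_raw_ignore_rules is a hypothetical helper, and it assumes
# the XLSForm is kept as an .xlsx file directly under raw/.
write_raw_ignore_rules() {
  # $1: repository root (defaults to the current directory)
  cat >> "${1:-.}/.gitignore" <<'EOF'
# Ignore raw Kobo exports, but keep the XLSForm needed to redeploy the survey
/raw/*
!/raw/*.xlsx
EOF
}
```

The rules ignore the contents of `raw/` rather than the directory itself, because git cannot re-include a file whose parent directory is excluded. Two caveats: raw exports that are themselves `.xlsx` files would also be kept by the second rule, and files that are already tracked stay tracked until they are removed from the index with `git rm --cached`.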