expose more meta-info for forms/databases (like # of total records in a form/db) #39

Open
Ryo-N7 opened this issue Dec 6, 2022 · 8 comments

Comments

@Ryo-N7
Collaborator

Ryo-N7 commented Dec 6, 2022

EDIT: after discussion with Alex and Nic, this will be more of a focus in January, especially since this is something ActivityInfo needs to implement on the back end of the API rather than on the R side of things (at least until the data is exposed through the API).

@Ryo-N7 Ryo-N7 converted this from a draft issue Dec 6, 2022
@akbertram
Member

Expose form metadata for account-level users:

  • List of forms - database tree
  • For each form:
    • Number of records (MUST HAVE)
    • Last update time (MUST HAVE)
    • Number of users who have access to this form
    • Sensitivity classification of data
    • Visibility

@nickdickinson
Collaborator

A workaround I found for now while implementing the lazy df is to request one row containing only the record id. This gives the total number of records. If we sort it by update time, then we could in theory get the last updated time too. I've exposed the totalRows metadata in queryTable as well as a few more things as part of the implementation of getRecords() (sorting, limiting, offset).

@Ryo-N7
Collaborator Author

Ryo-N7 commented Mar 14, 2023

Is there also a way to download the form schema of every form in a database without having to loop over each form?

Currently, for the 20+ projects AV has, this process can take over 10 minutes just to fetch all that metadata, before we even get to the processing and visualization parts of the code.

@nickdickinson
Collaborator

I think this is more a question for @akbertram, as I do not know whether this is possible server side.

@Ryo-N7 perhaps it would be an idea to split this over a few different R instances with the foreach package? I've successfully done this for longer GIS processing workflows. I don't know if there are rate limits or other considerations (like exponential backoff) that would need to be implemented.
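
A minimal sketch of the "several R instances plus backoff" idea. It uses base R's parallel package, swapped in for foreach only so the example needs no extra packages; the structure would be the same with foreach and %dopar%. The names with_backoff, fetch_one, fetch_all_schemas and the fetch_schema argument are hypothetical, and whether the server enforces rate limits at all is an open question in this thread.

```r
library(parallel)

# Call f(); on error, wait base_delay, then 2 * base_delay, 4 * base_delay, ...
# and try again, giving up after max_tries attempts.
with_backoff <- function(f, max_tries = 4, base_delay = 0.5) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    if (attempt == max_tries) stop(result)  # re-raise the last error
    Sys.sleep(base_delay * 2^(attempt - 1))
  }
}

# Work done on each worker: fetch a single schema, retrying on failure.
# Defined at top level so it can be shipped to the workers cheaply.
fetch_one <- function(id, fetch, retry) retry(function() fetch(id))

# Spread the per-form requests over `workers` separate R processes.
# `fetch_schema` is a placeholder for whatever call retrieves one schema.
fetch_all_schemas <- function(form_ids, fetch_schema, workers = 2) {
  cl <- makeCluster(workers)
  on.exit(stopCluster(cl), add = TRUE)
  parLapply(cl, form_ids, fetch_one, fetch = fetch_schema, retry = with_backoff)
}
```

Assuming the per-form call being looped over is getFormSchema(), usage might look like fetch_all_schemas(formIds, activityinfo::getFormSchema). Each worker is a fresh R session, so any authentication would presumably have to be repeated there.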

@nickdickinson
Collaborator

Note: the lazy dataframe reports the number of records and does retrieve the _lastEditTime column. The user can get the last edit time as follows:
getRecords(formId) %>% select(`_lastEditTime`) %>% arrange(desc(`_lastEditTime`)) %>% head(1) %>% collect() %>% pull()

Should we add this to the metadata display (with a small additional call to the server every time we display a lazy dataframe)? Or wrap in a function? Otherwise, can we close this now? @jamiewhths

Is there a more efficient way to do this? Does the server provide this metadata for the form?
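
For the "wrap in a function" option, a minimal sketch could look like the following. lastEditTime is a hypothetical name, not an existing function in the package, and the sketch assumes the lazy data frame supports the dplyr verbs used in the pipeline above. The column name needs backticks because R names cannot start with an underscore.

```r
library(dplyr)

# `records` is any dplyr-compatible table with a `_lastEditTime` column,
# for example the lazy data frame returned by getRecords(formId).
lastEditTime <- function(records) {
  records %>%
    select(`_lastEditTime`) %>%
    arrange(desc(`_lastEditTime`)) %>%
    head(1) %>%
    collect() %>%
    pull()
}
```

Usage would be lastEditTime(getRecords(formId)). This still costs one extra request per call, so it does not answer the question of whether the server can provide the value directly.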

@nickdickinson
Collaborator

@jamiewhths Can the server provide metadata on the following items mentioned by @akbertram :

  • Number of users who have access to this form
  • Sensitivity classification of data

The first can be derived on the client, but that is complicated: it effectively means reimplementing the roles system, so it is probably better handled by the server. I am not sure what the second is...

@jamiewhths
Contributor

@nickdickinson we can retrieve the user grants on a resource via https://www.activityinfo.org/support/docs/api/reference/getDatabaseUserGrantsOnResource.html. To determine the number of users who have access, you can count the objects in the returned array, as the server has already computed inherited grants etc.

@jamiewhths
Contributor

On sensitivity, this was probably an early idea we had to mark datasets with data classification levels. You can disregard it for now, as these are not implemented yet.

4 participants