`readepi()` signature and package scope #45

Bisaloo · 2023-07-31T16:36:43Z

The idea of having a common wrapper to import many different types of data with a single function call has been pioneered by the rio package and has garnered positive feedback from the community. However, this is a difficult approach to implement and in many cases, it may lead to more confusion. I suggest two key principles to make readepi as clear and as usable as possible:

provide two levels of complexity. It would be useful if users can choose if they prefer to use the generic wrapper or a specific function to import their data. This approach is for example implemented in the lightr package, which imports spectrometric data from a variety of formats: https://docs.ropensci.org/lightr/reference/index.html, and links to design discussions we had with @TimTaylor and @jpavlich.
uniformise low-level parser signatures (at the moment, the same arguments in different functions have different names and different orders) and give the general readepi() wrapper a similar "feel" no matter the data source you choose. To ensure uniformity in signatures, a good approach can be to rely as much as possible on the inheritsParams roxygen2 tag
. This provides an incentive and a reminder to use consistent argument names. To give a similar "feel" to readepi() for all data sources, a good approach is to minimize source-specific arguments. If these arguments are absolutely needed, they should probably not be part of the function signature but be passed as .... This however requires clear documentation of each of the low-level parsers, which as per CRAN policy, will probably force you to export the low-level parsers (which provide an extra argument for the previous point).

Looking at the current README, it seems it will be very difficult to reconcile the "reading from file" and "reading for server" approach in the single function while keeping a similar feel. This might suggest that you need to either have two high-level wrappers or reduce the scope of this package. I believe the "one function to read from many different formats" scope is already well covered by rio. IMO, adding a thin wrapper around rio creates more confusion and maintenance complexity for a minimal gain over what rio already proposes. I think there is value in thinking of readepi as a companion to rio. rio reads from many different formats & readepi reads from many different health information systems. This clear distinction of scope strengthen the case for a readepi package while minimizing overlap with solid, well-established alternatives.

The text was updated successfully, but these errors were encountered:

Karim-Mane · 2023-08-18T11:57:39Z

This is a good point.

We aim at providing to users a tool like devtools where if you don't dig into it, you would not realise that it is internally calling packages like roxygen2, usethis, etc. We understand that an experienced user would prefer to call directly the required functions from their source packages such as {rio}, {redcapR}, {fingertipsR}, etc.

However, we are also anticipating that most users, specially in the developing countries, do not have that much skills/experience to use several packages at the same time. We believe that such users will find it useful to have a "1 stop shop" with a function that will allow them to import data from several sources without knowing the mechanism under the hood. Also note that {readepi} can read several file types that are not covered by {rio}.

To tackle the issue of design in {readepi}, we will replace the credentials_file and file_path arguments by from. The value of this from argument can be a file path (when reading from a file) or a URL (when reading from APIs).
This way, the credentials_file argument will be provided within the ....

What do you think about this?

Bisaloo · 2023-08-31T12:02:37Z

I agree with the idea of having a "1 stop shop" but with limits. IMO, readepi is perfect being a 1 stop shop for importing data from remote health information systems or databases, wrapping redcapR, fingertipsR, etc.:

it doesn't overlap with rio (by wrapping its functionality), which is already a 1 stop shop for reading local data. It is very important for the success of our project that we follow open source etiquette.
it simplifies the function signature, providing a much more uniform and homogeneous interface to read from different sources
it clarifies the package niche. Creating a clear distinction between local and remote data is a service to the user IMO as it increases their comprehension of the data they're working with
it reduces maintenance load

I understand your point about users having limited R knowledge. However, I'm not convinced cramming everything in a single package makes things easier since the package becomes harder to use. The proper way to deal with this is through teaching.

The comparison with devtools is interesting but not completely accurate:

rio is already a wrapper with a unified interface. By including its functionality here, we're wrapping a wrapper 😵
all packages wrapped in devtools are maintained by the same team, which facilitates maintenance and dealing with breaking changes. Crucially, it removes any concerns about "stealing citations" by wrapping an external package, which can be a very sensitive issue.

Also note that {readepi} can read several file types that are not covered by {rio}.

The best course of action would be to contribute this to rio. It's always a good thing to contribute to the open-source ecosystem beyond the specific packages we're developing from scratch.

TimTaylor · 2023-09-01T08:14:02Z

The best course of action would be to contribute this to rio. It's always a good thing to contribute to the open-source ecosystem beyond the specific packages we're developing from scratch.

For reference, rio has now returned to it's original maintainer so worth tracking their plans. See their recent pinned issue and accompanying blog post.

Karim-Mane · 2023-09-03T08:09:01Z

Good point - I have removed the read_from_files-related functions and test files.

Will also check to see whether how best to contribute to {rio}.

Karim-Mane self-assigned this Aug 16, 2023

Karim-Mane added the help wanted Extra attention is needed label Aug 16, 2023

Karim-Mane closed this as completed Oct 18, 2023

Bisaloo mentioned this issue Oct 24, 2023

Harmonize low-level parsers #53

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`readepi()` signature and package scope #45

`readepi()` signature and package scope #45

Bisaloo commented Jul 31, 2023

Karim-Mane commented Aug 18, 2023 •

edited

Loading

Bisaloo commented Aug 31, 2023

TimTaylor commented Sep 1, 2023

Karim-Mane commented Sep 3, 2023

readepi() signature and package scope #45

readepi() signature and package scope #45

Comments

Bisaloo commented Jul 31, 2023

Karim-Mane commented Aug 18, 2023 • edited Loading

Bisaloo commented Aug 31, 2023

TimTaylor commented Sep 1, 2023

Karim-Mane commented Sep 3, 2023

`readepi()` signature and package scope #45

`readepi()` signature and package scope #45

Karim-Mane commented Aug 18, 2023 •

edited

Loading