Deprecate outdated/unnecessary named parameters from getCensus() to fix bug and be in line with package ethos #100

Open
hrecht opened this issue Mar 21, 2024 · 1 comment

hrecht (Owner) commented Mar 21, 2024

Several years ago, when there were far fewer Census Bureau API endpoints, I added some optional parameters to getCensus() as convenience options for some of the economic data endpoints. The package has supported arbitrary parameters (predicates, in Census-speak) since v0.6.0, released on CRAN in 2019, so these named parameters are both unnecessary and unwieldy. Also, catering specifically to certain endpoints in this function is outside the scope of the package.

The named parameters are:
c("year", "date", "period", "monthly", "category_code", "data_type_code", "naics", "pscode", "naics2012", "naics2007", "naics2002", "naics1997", "sic")

year was added to the list in v0.7.2, when some of the endpoints were swinging back and forth between time and year. They are now more consistent, and many of the timeseries APIs use lowercase time as a required predicate. YEAR is a variable name in hundreds of the non-timeseries APIs.

The vast majority of the names in this list are actually UPPERCASE predicates in the APIs, not lowercase.

Users will still be able to use the full functionality of the APIs with arbitrary parameters without having these as named optional parameters.

For example, use uppercase YEAR here instead of lowercase year. But really, the preferred syntax now is time, which is the endpoint's true predicate for filtering the timeseries.

# old
saipe_schools <- getCensus(
	name = "timeseries/poverty/saipe/schdist",
	vars = c("SD_NAME", "SAEPOV5_17V_PT", "SAEPOVRAT5_17RV_PT"),
	region = "school district (unified):*",
	regionin = "state:25",
	year = 2022)

# new and valid but not preferred
saipe_schools <- getCensus(
	name = "timeseries/poverty/saipe/schdist",
	vars = c("SD_NAME", "SAEPOV5_17V_PT", "SAEPOVRAT5_17RV_PT"),
	region = "school district (unified):*",
	regionin = "state:25",
	YEAR = 2022)

# preferred
saipe_schools <- getCensus(
	name = "timeseries/poverty/saipe/schdist",
	vars = c("SD_NAME", "SAEPOV5_17V_PT", "SAEPOVRAT5_17RV_PT"),
	region = "school district (unified):*",
	regionin = "state:25",
	time = 2022)

hrecht added this to the censusapi v1.0.0 milestone Mar 21, 2024
hrecht self-assigned this Mar 21, 2024
hrecht changed the milestone from v1.0.0 to v0.9.0 Mar 26, 2024

hrecht (Owner, Author) commented Mar 26, 2024

To check how these named predicates are used, I retrieved variable metadata for all timeseries and aggregate endpoints with listCensusMetadata(). Here's how often each one appears:

> param_vars %>% count(name, sort = T)
             name   n
1            YEAR 299
2       NAICS2012  99
3   category_code  20
4  data_type_code  20
5       NAICS2007  19
6       NAICS2002  17
7       NAICS1997  16
8             SIC  16
9           NAICS  12
10        MONTHLY   7
11           DATE   4
12         PERIOD   3
13           year   3
14  CATEGORY_CODE   1
15 DATA_TYPE_CODE   1
16         PSCODE   1

Note that in almost all cases the predicates are actually uppercase, not lowercase. getCensus() coerces all of the named parameters except data_type_code and category_code to uppercase when it constructs the request. However, the timeseries/qwi/sa, timeseries/qwi/se, and timeseries/qwi/rh endpoints use lowercase year as a predicate. (Documentation: https://www.census.gov/data/developers/data-sets/qwi.html)
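
To make the coercion concrete, here is a minimal base R sketch of the behavior described above. It is not the package's actual code: `coerce_names()` and `build_query()` are hypothetical helpers written only for illustration, and the parameter list is copied from the first comment.

```r
# Hypothetical sketch of the described behavior, NOT censusapi's real code.
named_params <- c("year", "date", "period", "monthly", "category_code",
                  "data_type_code", "naics", "pscode", "naics2012",
                  "naics2007", "naics2002", "naics1997", "sic")
keep_lowercase <- c("category_code", "data_type_code")

# Uppercase the names of the named parameters, except the two exceptions.
# Any other (arbitrary) predicate name is left exactly as supplied.
coerce_names <- function(predicates) {
  nms <- names(predicates)
  to_upper <- nms %in% setdiff(named_params, keep_lowercase)
  nms[to_upper] <- toupper(nms[to_upper])
  names(predicates) <- nms
  predicates
}

# Join predicates into a query-string fragment (no URL encoding here).
build_query <- function(predicates) {
  if (length(predicates) == 0) return("")
  paste0(names(predicates), "=", unlist(predicates), collapse = "&")
}

query <- build_query(coerce_names(list(year = 2021, quarter = 1)))
print(query)
# [1] "YEAR=2021&quarter=1"
```

In this sketch year is rewritten to YEAR while the arbitrary quarter predicate passes through untouched.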

This coercion to uppercase results in the following code failing because the required year predicate truly is lowercase here:

qwi <- getCensus(
	name = "timeseries/qwi/sa",
	vars = "Emp",
	region = "state:02",
	year = 2021,
	quarter = 1)

# Error in apiCheck(req) : 
#   The Census Bureau returned the following error message:
#  error: unknown predicate variable: 'YEAR' 
 # Your API call was:  https://api.census.gov/data/timeseries/qwi/sa?key=[KEY]&get=Emp&for=state%3A02&YEAR=2021&quarter=1

Deprecating these named parameters is now also a bug fix to avoid this error.
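
As a rough illustration of why this fixes the bug, the sketch below assumes every predicate is collected from `...` and sent exactly as the user typed it. `collect_predicates()` and `build_query()` are hypothetical names for this example, not functions in the package, and the real implementation may differ.

```r
# Hypothetical sketch, NOT censusapi's implementation: all predicates are
# gathered from `...` and their names are never modified.
collect_predicates <- function(...) {
  list(...)
}

# Join predicates into a query-string fragment (no URL encoding here).
build_query <- function(predicates) {
  if (length(predicates) == 0) return("")
  paste0(names(predicates), "=", unlist(predicates), collapse = "&")
}

# Lowercase year stays lowercase, as the QWI endpoints require.
qwi_query <- build_query(collect_predicates(year = 2021, quarter = 1))

# Uppercase YEAR stays uppercase for endpoints that expect it.
saipe_query <- build_query(collect_predicates(YEAR = 2022))

print(qwi_query)
# [1] "year=2021&quarter=1"
print(saipe_query)
# [1] "YEAR=2022"
```

With no special-casing, the user's spelling decides the case of each predicate, so both the lowercase and uppercase endpoints can be queried.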

hrecht changed the title from "Remove outdated/unnecessary arbitrary parameters from getCensus()" to "Remove outdated/unnecessary named parameters from getCensus() to fix bug and be in line with package ethos" Mar 26, 2024
hrecht added the bug label Mar 26, 2024
hrecht changed the title from "Remove outdated/unnecessary named parameters from getCensus() to fix bug and be in line with package ethos" to "Deprecate outdated/unnecessary named parameters from getCensus() to fix bug and be in line with package ethos" Mar 26, 2024