-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAS-2236 - Updates to methods that calculate the points used to create the dimensions #18
base: DAS-2208-SMAP-L3
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look, and I appreciate this PR being another more bite-sized piece of progress towards the end goal.
I think there are few themes to my comments:
- You haven't added sufficient unit tests to provide thorough coverage of your changes. In most cases, you've got some tests for new functions, but not all. The tests you've added are generally good, but often are only testing the happy path. It's pretty important to also test realistic edge-cases, which isn't really done in this PR.
- There are quite a lot of places where you are performing very similar tasks, but in some cases along a row, rather than up-and-down a column. Some of those tasks are basically the same thing (performing some operation on a 1-D slice of an array), so it would be cleaner to write a function to do something to a 1-D slice and then invoke it with the correct 1-D slice for either a row or a column.
- I think generally, there are a few places where the documentation strings for each function could do with a bit more descriptive text. This will be the number one location that future developers try to read to understand what the code is meant to do.
- General coding comments.
for coordinate in coordinate_variables_list | ||
if varinfo.get_variable(coordinate).is_latitude() | ||
for coordinate in coordinate_variables | ||
if varinfo.get_variable(coordinate) is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Same comment for the similar change in the list comprehension for longitude_coordinate_variables
)
Where do you catch any issues if there is a reference to a variable in a coordinates
metadata attribute that isn't present in the file? If it's pointing to a latitude or longitude variable that isn't present in the file, at some point that's going to cause a failure when you try to access that variable. Is that gracefully handled somewhere?
(It's likely this is already nicely supported, but it's been a while since the previous PR and I just can't remember)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separate comment - you have not added unit tests to prove this case is being handled as expected. (Needed both for latitude or longitude)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- this method is first called before prefetch. it should not fail if the variables don't exist.
- if it is present in the variable and not in the file - it will fail during prefetch. That is when it is first accessed from the file. - will check if I have a test on that
- There is a unit test for the function - will add subtests if the variables don't exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are unit tests to check for the above cases in line 205 of test_coordinate_utilities.py
https://github.com/nasa/harmony-opendap-subsetter/blob/57744a6298adf081d0bf06c9a6acd0b98cccf99c/tests/unit/test_coordinate_utilities.py#L205C1-L219C1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is present in the variable and not in the file - it will fail during prefetch. That is when it is first accessed from the file. - will check if I have a test on that
That makes sense. To test that, you'd have to mock a failed response from OPeNDAP. This might be a good place to look for an example of mocking a failed request.
There are unit tests to check for the above cases in line 205 of test_coordinate_utilities.py
That test is great, but it's not specifically for this function, it's for get_coordinate_array
. The condition is present here in get_coordinate_variables
, and so you need to write unit tests that isolate behaviour within this function (in addition to whatever already exists for other functions).
It's entirely plausible that get_coordinate_array
could be updated in some way that means those unit tests you have a link to are removed. You would then lose the coverage of those test cases for get_coordinate_variables
.
I'll check in after Owen has had a chance to review. I've reviewed my suggestions and updates made are ok. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for taking a bit to re-review. I've replied to some of the comment threads. I'll also check now to see if other comments got lost from the diff where they may have been on code that was removed or updated.
for coordinate in coordinate_variables_list | ||
if varinfo.get_variable(coordinate).is_latitude() | ||
for coordinate in coordinate_variables | ||
if varinfo.get_variable(coordinate) is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is present in the variable and not in the file - it will fail during prefetch. That is when it is first accessed from the file. - will check if I have a test on that
That makes sense. To test that, you'd have to mock a failed response from OPeNDAP. This might be a good place to look for an example of mocking a failed request.
There are unit tests to check for the above cases in line 205 of test_coordinate_utilities.py
That test is great, but it's not specifically for this function, it's for get_coordinate_array
. The condition is present here in get_coordinate_variables
, and so you need to write unit tests that isolate behaviour within this function (in addition to whatever already exists for other functions).
It's entirely plausible that get_coordinate_array
could be updated in some way that means those unit tests you have a link to are removed. You would then lose the coverage of those test cases for get_coordinate_variables
.
return max_y_spread_pts, max_x_spread_pts | ||
|
||
|
||
def get_max_x_spread_pts( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood - it is called twice: once to get the row and once to get the columns. That means calling the function get_max_x_spread_pts
is misleading as it implies you only call it for x
. I think my original request to rename the function still is applicable.
|
||
required_dimensions.update(bounds) | ||
if latitude_coordinates and longitude_coordinates: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I'm following this comment. Are you saying you are trying to guard against the possibility that you've only found either latitudes or longitudes, but not both?
Honest question: what should happen in that case?
prefetch_path = 'tests/data/SC_SPL3SMP_008_prefetch.nc4' | ||
mock_get_opendap_nc4.return_value = prefetch_path | ||
access_token = 'access' | ||
output_dir = 'tests/output' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't you just use the self.temp_dir
defined in the setUp
method for the class?
prefetch_path, required_dimensions, self.varinfo, self.logger | ||
) | ||
|
||
@patch('hoss.dimension_utilities.check_add_artificial_bounds') | ||
@patch('hoss.dimension_utilities.get_opendap_nc4') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice patching.
mock_check_add_artificial_bounds, | ||
mock_get_coordinate_variables, | ||
): | ||
"""Ensure that when a list of required variables is specified, a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence is missing its end.
prefetch_path = 'tests/data/SC_SPL3SMP_008_prefetch.nc4' | ||
mock_get_opendap_nc4.return_value = prefetch_path | ||
access_token = 'access' | ||
output_dir = 'tests/output' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment about using self.temp_dir
.
requested_variables, | ||
) | ||
mock_get_opendap_nc4.assert_called_once_with( | ||
url, set(), output_dir, self.logger, access_token, self.config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Just so I am reading this properly...
If there are no normal dimensions or coordinates that are found for required variables, we make a request to OPeNDAP that has nothing specified in the DAP4 constraint expression? If so, that's not great - because that implies there's nothing for us to prefetch, and that we shouldn't be trying to do one. Making a request to OPeNDAP without a constraint expression will mean OPeNDAP returns the entire granule.
Just to help trace the code through:
- This is showing we're calling
get_opendap_nc4
with an empty set for therequired_variables
argument (see here). - Those required variables get piped into get_constraint_expression, which for an empty set will return an empty string.
- Because the constraint expression is an empty string, the request to OPeNDAP doesn't include any request data.
(This is another reason why unit tests are great - they uncover behaviour like this!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The prefetch should only be called if a request is made that requires HOSS to determine index ranges (for spatial, temporal or dimension ranges). But this test is saying: sure you want us to do index-based subsetting, but we can't find anything that will allow us to calculate those index ranges.
To me, that feels like it should be an error.
) | ||
self.assertListEqual(actual_x_y_points, expected_x_y_points) | ||
|
||
def test_get_dimension_array_from_geo_points(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First up - thanks for adding unit tests for this new function!
I think the coverage of the two branches of conditional logic (x versus y) is good. I wonder, though, is it possible for the latitude array, longitude array and/or row indices have either only 1 value or 0 values? If so we probably need tests for those cases.
Description
Jira Issue ID
DAS-2236
Local Test Steps
http://localhost:3000/C1268452378-EEDTEST/ogc-api-coverages/1.0.0/collections/parameter_vars/coverage/rangeset?forceAsync=true&granuleId=G1268452388-EEDTEST&subset=lat(70%3A80)&subset=lon(-160%3A-150)&variable=Soil_Moisture_Retrieval_Data_AM%2Fstatic_water_body_fraction&skipPreview=true
and verify you can check the outputs in panoply
PR Acceptance Checklist
CHANGELOG.md
updated to include high level summary of PR changes.docker/service_version.txt
updated if publishing a release.