-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a few tests, refactoring #44
Conversation
…d better, add settings module for centralised settings management
I created a bug report for the shapefile issue (#48), I believe this can be dealt with in a separate PR as it doesn't seem to be related to the test setup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this - it looks good to me, and definitely beyond my python skills, so I'm going to merge it in so we can start running tests.
}, | ||
extensions={ | ||
Format.SHAPEFILE: 'shp', | ||
Format.GEOJSON: 'json', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to have this recognize both .geojson and .json as 'GeoJSON'? My previous solution was a total hack:
output_extension = {
'shapefile': '.shp',
'geojson': 'json',
It just checked for ending in 'json' so it'd grab both .json and .geojson. Tagged it in #49 - don't need more in this PR, and would love to get this one in asap to start to have a test framework.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh good point!
sources: Dict[Source, SourceSettings] | ||
extensions: Dict[Format, str] | ||
|
||
settings = SettingsSchema( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love this, makes a ton of sense. I'm learning more python just reading this PR :)
|
open_buildings/cli.py
Outdated
elif source.lower() == "overture": | ||
data_path = "s3://us-west-2.opendata.source.coop/cholmes/overture/geoparquet-country-quad-hive/*/*.parquet" | ||
hive_partitioning = True | ||
match source.lower(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i realise match case was only introduced in python 3.10 - will revert to if/else to support older versions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sounds great. As I mentioned in the other comment I don't know the tradeoffs in python well enough to reason about what version to target, but I generally lean towards making it easier for users as long as it's not onerous for devs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i think it's better to support older versions if it's not too much hassle :)
So looks like 'match' is syntax from 3.10, and our test suite does 3.8 and 3.9. I have no idea about the python ecosystem, and how important it is to support older versions (and how many people are on like 3.10 and above), so I'm probably fine to just remove those tests if 3.8 and 3.9 aren't worth the tradeoffs, but I defer to those who know Python better. |
Great - I'll try out working that way, it makes lots of sense to me.
It's all good - I think as the project matures we can get more disciplined about tighter PR's, but since we don't even have tests yet and this adds them I think it is all good.
Very reasonable - we can try to start pushing any new PR to have tests, and ticket some test making that could be good 'first issues' for people interested in joining.
Sounds good - happy to YOLO this and see what happens. |
@felix-schott - just gave you increased rights on the repo, as I'd love you reviewing PR's, as I'm sure I'll learn a lot having you look over my code. No worries if you don't want to at any point, but wanted to give you the rights to do so. |
i changed the | syntax to the older Union syntax - sorry for the mess, I'm used to developing in more recent Python versions. i had also added the shapefile test back and it turns out it seems to be a general linux problem as it also failed in the pipeline! anyway, i gotta leave now, will check back in tomorrow if there's still issues (looks like the ci/cd run needs to be approved by you). thanks for giving me increased rights! happy to review changes :) |
Awesome, thanks! I'm going to try to figure out why I need to approve every workflow. I was hoping that giving you more rights would at least let you run it. Hopefully I can find something more liberal. |
Ok, made it more liberal, to 'Only first-time contributors who recently created a GitHub account will require approval to run workflows.' so hopefully this will work by default more. It's definitely annoying for me to come to a new PR and then have to hit 'run' and wait, and clearly would be more helpful to a first time contributor to be able to see the results of CI. |
tests/test_open_buildings.py
Outdated
|
||
@pytest.mark.integration | ||
@pytest.mark.flaky(reruns=NUM_RERUNS) | ||
@pytest.mark.parametrize("format", [f for f in Format if format != Format.SHAPEFILE]) # fails for shapefile! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're getting:
[gw5] linux -- Python 3.10.13 /opt/hostedtoolcache/Python/3.10.13/x64/bin/python3
worker 'gw5' crashed while running 'tests/test_open_buildings.py::test_download_format[Format.SHAPEFILE]'
=========================== short test summary info ============================
FAILED tests/test_open_buildings.py::test_download_format[Format.SHAPEFILE]
I was going to go into the code to skip testing shapefile, but it seems to me the line above should do that, so I'm not sure what to do other than remove shapefile as an output format, but that seems silly since it works on mac and windows.
@cholmes All tests passed - I took the liberty and merged since you previously signalled approval. Hope that's okay. Probably doesn't need a new tag since it doesn't add any major features, just tests and some refactoring. The only feature would be that directories are permitted as dst. |
Definitely ok to merge, thanks! |
Hi @cholmes,
As promised, I added a few tests (#36) and refactored the code a bit. I prefer to work in containers, so I added a Dockerfile for developing as well. When working in a container, you can avoid things like the issue Darren reported (DuckDB spatial extension not installed) as you develop in a more controlled environment. If you're on linux, you can simply run the
dev-container.sh
script or make use of the VSCode container extensions.Apart from adding tests, I changed a few things:
Sorry for overloading this PR, some changes could have arguably been a separate PR. There are more tests that could/should be added, I focussed on the download function for now (might add more in the future).
Also, I updated the CI/CD pipeline - I'm not too familiar with Github actions and wasn't sure how to test, so not guaranteed that this will work (we'll find out).
One important thing: I added tests for all the different data formats. Downloads for all pass except for shapefile - at least in my environment. Not sure if this is a bug or something wrong with my setup? Has this ever worked? Hash out the
if != shapefile
bit in the test to reproduce.Thanks for reviewing these changes, let me know if anything is unclear.
Felix