Add a few tests, refactoring #44

felix-schott · 2023-10-13T17:37:54Z

As promised, I added a few tests (#36) and refactored the code a bit. I prefer to work in containers, so I added a Dockerfile for developing as well. Working in a container gives you a more controlled environment, which helps avoid problems like the one Darren reported (DuckDB spatial extension not installed). If you're on Linux, you can simply run the dev-container.sh script or use the VS Code container extensions.

Apart from adding tests, I changed a few things:

make the download function (more) stand-alone, decouple from CLI
add docstring to download function
centralised settings module
some input validation
type hints
allow dst to be a directory
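
To illustrate the last item, here is a minimal sketch of how a download function could accept either a file path or a directory as dst. The helper name `resolve_dst` and the default file stem are assumptions made for this example and are not taken from the PR.

```python
from pathlib import Path


def resolve_dst(dst: str, extension: str, default_stem: str = "buildings") -> Path:
    """Hypothetical helper: turn `dst` into a concrete output file path.

    If `dst` is an existing directory, a default file name is placed
    inside it; otherwise `dst` is treated as the output file itself.
    """
    path = Path(dst)
    if path.is_dir():
        return path / f"{default_stem}.{extension}"
    return path
```

With this approach the caller never has to distinguish the two cases: passing "." would write to a default file name in the current directory, while passing an explicit file path is left untouched.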

Sorry for overloading this PR, some changes could have arguably been a separate PR. There are more tests that could/should be added, I focussed on the download function for now (might add more in the future).

Also, I updated the CI/CD pipeline - I'm not too familiar with GitHub Actions and wasn't sure how to test, so not guaranteed that this will work (we'll find out).

One important thing: I added tests for all the different data formats. Downloads pass for every format except shapefile, at least in my environment. I'm not sure whether this is a bug or something wrong with my setup. Has this ever worked? To reproduce, comment out the "if != shapefile" condition in the test.

Thanks for reviewing these changes, let me know if anything is unclear.
Felix

felix-schott · 2023-10-14T10:20:12Z

I created a bug report for the shapefile issue (#48). I believe this can be dealt with in a separate PR, as it doesn't seem to be related to the test setup.

cholmes

Thanks for doing this - it looks good to me, and it's definitely beyond my Python skills, so I'm going to merge it in so we can start running tests.

cholmes · 2023-10-14T18:10:53Z

open_buildings/settings.py

+    },
+    extensions={
+        Format.SHAPEFILE: 'shp',
+        Format.GEOJSON: 'json',


Is there a way to have this recognize both .geojson and .json as 'GeoJSON'? My previous solution was a total hack:

output_extension = { 'shapefile': '.shp', 'geojson': 'json',

It just checked for ending in 'json' so it'd grab both .json and .geojson. Tagged it in #49 - don't need more in this PR, and would love to get this one in asap to start to have a test framework.

oh good point!
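
One possible way to approach the question above is to let each format list several accepted suffixes instead of a single string. This is only a sketch: the enum values, the `EXTENSIONS` name, and the `format_from_path` helper are assumptions for illustration, and the project's settings module may be structured differently.

```python
from enum import Enum
from pathlib import Path
from typing import Dict, Optional, Tuple


class Format(Enum):
    # Only these two members appear in the diff above; the values are assumed.
    SHAPEFILE = "shapefile"
    GEOJSON = "geojson"


# Hypothetical variant of `extensions`: the first suffix is the default
# used when writing, and every listed suffix is accepted for `dst`.
EXTENSIONS: Dict[Format, Tuple[str, ...]] = {
    Format.SHAPEFILE: ("shp",),
    Format.GEOJSON: ("json", "geojson"),
}


def format_from_path(path: str) -> Optional[Format]:
    """Return the format whose suffix list contains the file's suffix."""
    suffix = Path(path).suffix.lower().lstrip(".")
    for fmt, suffixes in EXTENSIONS.items():
        if suffix in suffixes:
            return fmt
    return None
```

Compared with checking whether the name merely ends in 'json', an explicit suffix list avoids accidental matches and keeps the accepted spellings in one place.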

cholmes · 2023-10-14T18:11:28Z

open_buildings/settings.py

+    sources: Dict[Source, SourceSettings]
+    extensions: Dict[Format, str]
+
+settings = SettingsSchema(


Love this, makes a ton of sense. I'm learning more Python just reading this PR :)

cholmes · 2023-10-14T18:14:09Z

I prefer to work in containers, so I added a Dockerfile for developing as well.

open_buildings/download_buildings.py

felix-schott · 2023-10-14T18:22:18Z

open_buildings/cli.py

-    elif source.lower() == "overture":
-        data_path = "s3://us-west-2.opendata.source.coop/cholmes/overture/geoparquet-country-quad-hive/*/*.parquet"
-        hive_partitioning = True
+    match source.lower():


I realise match/case was only introduced in Python 3.10 - will revert to if/else to support older versions.

Ok, sounds great. As I mentioned in the other comment, I don't know the tradeoffs in Python well enough to reason about what version to target, but I generally lean towards making it easier for users as long as it's not onerous for devs.

yeah, i think it's better to support older versions if it's not too much hassle :)
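
For reference, a sketch of the if/else form that also runs on Python 3.8 and 3.9. The function name `resolve_source` and the error handling are assumptions for this example; only the "overture" branch and its path come from the diff above, and the other sources are left out rather than guessed.

```python
from typing import Tuple


def resolve_source(source: str) -> Tuple[str, bool]:
    """Hypothetical helper: return (data_path, hive_partitioning)."""
    name = source.lower()
    if name == "overture":
        # Path copied from the removed lines in the diff above.
        data_path = (
            "s3://us-west-2.opendata.source.coop/cholmes/overture/"
            "geoparquet-country-quad-hive/*/*.parquet"
        )
        return data_path, True
    # Further `elif name == "...":` branches for other sources would go here.
    raise ValueError(f"Unknown source: {source!r}")
```

The `match` statement would express the same dispatch, but it is a syntax error on interpreters older than 3.10, so the module could not even be imported there.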

cholmes · 2023-10-14T18:22:43Z

So it looks like 'match' is syntax from 3.10, and our test suite runs 3.8 and 3.9. I have no idea about the Python ecosystem, how important it is to support older versions, or how many people are on 3.10 and above. I'm probably fine to just remove those tests if 3.8 and 3.9 aren't worth the tradeoffs, but I defer to those who know Python better.

cholmes · 2023-10-14T18:26:58Z

I prefer to work in containers, so I added a Dockerfile for developing as well.

Great - I'll try out working that way, it makes lots of sense to me.

Sorry for overloading this PR, some changes could have arguably been a separate PR.

It's all good - I think as the project matures we can get more disciplined about tighter PRs, but since we don't even have tests yet and this adds them, I think it is all good.

There are more tests that could/should be added, I focussed on the download function for now (might add more in the future).

Very reasonable - we can try to start pushing any new PR to have tests, and ticket some test making that could be good 'first issues' for people interested in joining.

Also, I updated the CI/CD pipeline - I'm not too familiar with GitHub Actions and wasn't sure how to test, so not guaranteed that this will work (we'll find out).

Sounds good - happy to YOLO this and see what happens.

cholmes · 2023-10-14T18:29:25Z

@felix-schott - just gave you increased rights on the repo, as I'd love to have you reviewing PRs; I'm sure I'll learn a lot from having you look over my code. No worries if you don't want to at any point, but I wanted to give you the rights to do so.

felix-schott · 2023-10-14T18:40:43Z

I changed the | syntax to the older Union syntax - sorry for the mess, I'm used to developing in more recent Python versions. I had also added the shapefile test back, and it turns out to be a general Linux problem, as it also failed in the pipeline! Anyway, I gotta leave now, will check back in tomorrow if there are still issues (looks like the CI/CD run needs to be approved by you).

thanks for giving me increased rights! happy to review changes :)
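
For readers unfamiliar with the difference, a small sketch of the older `Union` spelling mentioned above. The function `to_path` is a made-up example and is not part of the PR.

```python
from pathlib import Path
from typing import Optional, Union


def to_path(dst: Union[str, Path], suffix: Optional[str] = None) -> Path:
    """Hypothetical example: accept a str or Path and return a Path.

    On Python 3.10+ the annotation could be written `str | Path`, but on
    3.8 and 3.9 that expression raises a TypeError when the function is
    defined, because plain classes do not support the `|` operator there.
    """
    path = Path(dst)
    if suffix is not None:
        path = path.with_suffix(suffix)
    return path
```

Adding `from __future__ import annotations` is another way to keep the `|` spelling in annotations on older interpreters, although anything that evaluates the annotations at runtime would presumably still fail, so `Union` is the more conservative choice.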

cholmes · 2023-10-14T18:43:29Z

Awesome, thanks! I'm going to try to figure out why I need to approve every workflow. I was hoping that giving you more rights would at least let you run it. Hopefully I can find something more liberal.

cholmes · 2023-10-14T18:46:39Z

Ok, I made the setting more liberal: 'Only first-time contributors who recently created a GitHub account will require approval to run workflows.' Hopefully workflows will now run by default more often. It's definitely annoying for me to come to a new PR and then have to hit 'run' and wait, and it would clearly be more helpful for a first-time contributor to be able to see the CI results.

cholmes · 2023-10-14T18:53:42Z

tests/test_open_buildings.py

+
+@pytest.mark.integration
+@pytest.mark.flaky(reruns=NUM_RERUNS)
+@pytest.mark.parametrize("format", [f for f in Format if format != Format.SHAPEFILE]) # fails for shapefile!


We're getting:

[gw5] linux -- Python 3.10.13 /opt/hostedtoolcache/Python/3.10.13/x64/bin/python3
worker 'gw5' crashed while running 'tests/test_open_buildings.py::test_download_format[Format.SHAPEFILE]'
=========================== short test summary info ============================
FAILED tests/test_open_buildings.py::test_download_format[Format.SHAPEFILE]

I was going to go into the code to skip testing shapefile, but it seems to me the line above should already do that. I'm not sure what else to do other than remove shapefile as an output format, which seems silly since it works on Mac and Windows.
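
One likely explanation for why the shapefile case still runs: the comprehension in the parametrize line compares `format` rather than the loop variable `f`. Assuming the test module defines no module-level name `format`, that refers to Python's built-in function, which never equals an enum member, so nothing is filtered out. The sketch below uses a stand-in enum with assumed members to show the effect.

```python
from enum import Enum


class Format(Enum):
    # Stand-in for the project's Format enum; members and values are assumed.
    SHAPEFILE = "shapefile"
    GEOJSON = "geojson"
    GEOPARQUET = "geoparquet"


# As written in the diff: `format` is the built-in function, not the loop
# variable, so the condition is always true and every member is kept.
unfiltered = [f for f in Format if format != Format.SHAPEFILE]

# Comparing the loop variable `f` actually excludes the shapefile case.
filtered = [f for f in Format if f != Format.SHAPEFILE]
```

If this is the cause, changing `format` to `f` in the comprehension would be enough to skip shapefile without removing it as an output format.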

felix-schott · 2023-10-15T09:59:17Z

@cholmes All tests passed - I took the liberty of merging since you previously signalled approval. Hope that's okay. This probably doesn't need a new tag since it doesn't add any major features, just tests and some refactoring. The only new feature is that directories are permitted as dst.
Thanks for your time reviewing these changes :)

cholmes · 2023-10-15T14:04:34Z

Definitely ok to merge, thanks!

Felix Schott added 14 commits October 12, 2023 19:08

add dockerfile, add test dependencies

059519c

refactor code, decouple download function from cli so it can be tested better, add settings module for centralised settings management

437ad17

mount gitconfig

e487077

refactoring

7cdc20b

add tests

d3169c4

better error message

5e41bfc

gitignore

4872df1

docstring, tidying up

b22401e

add tests to pipeline

05131dc

mount .ssh

83e77cf

back to nauru

0c17347

fix tests, use new aoi: seychelles

efdb302

add test for directory

d6c5742

add xdist for parallel execution of tests

2c074b8

felix-schott marked this pull request as ready for review October 14, 2023 10:20

felix-schott mentioned this pull request Oct 14, 2023

Cannot save as Shapefile #48

Open

cholmes mentioned this pull request Oct 14, 2023

Allow both .geojson and .json as suffixes to mean output as GeoJSON #49

Open

cholmes approved these changes Oct 14, 2023

cholmes reviewed Oct 14, 2023

open_buildings/download_buildings.py Outdated

Update open_buildings/download_buildings.py

66c6343

felix-schott commented Oct 14, 2023

Felix Schott added 2 commits October 14, 2023 20:30

revert to if/else to support older python versions

c10f604

remove shapefile again as it's also failing in pipeline, use older syntax for type unions

0211646

cholmes reviewed Oct 14, 2023

fix test

0727072

felix-schott merged commit faf9388 into opengeos:main Oct 15, 2023
7 checks passed

felix-schott deleted the 36_add-tests branch October 15, 2023 10:25
