bulletproof #139

Open
wants to merge 171 commits into
base: master
171 commits
9d79555
initial loader
eads Apr 12, 2018
5323d84
initial commit
eads Apr 24, 2018
3449dc5
WIP makefile + benchmarks
eads Jun 11, 2018
0fb037a
add some git keeps
eads Jun 13, 2018
13a2933
improve makefile and add camera data loader
eads Jun 13, 2018
d674643
add sql definitions for parking and cameras, though the schema is ide…
eads Jun 13, 2018
f94b7e6
add csv cleaning processor
eads Jun 13, 2018
58490be
add dev env var file
eads Jun 13, 2018
4a04340
remove duplicate recipe and consolidate aliases
eads Jun 13, 2018
9759acc
update readme to account for consolidated aliases and some broken parts
eads Jun 13, 2018
1c690cb
add clean commands
eads Jul 6, 2018
4816597
move up download command
eads Jul 6, 2018
d6a6e3e
fix circular reference
eads Jul 6, 2018
f2eb5d7
fix typo in download rule
eads Jul 6, 2018
a47a428
download command for cameras
eads Jul 6, 2018
d6cb835
change error file from .csv to .txt, closes #8
Jul 9, 2018
774d81f
change psql table name from 'tickets' to 'parking', closes #10
Jul 9, 2018
a032a23
add skeletal file
eads Jul 11, 2018
25ed78c
factor out cleaning functions
eads Jul 11, 2018
83f0d84
simple offset-based geocoder script, plus related schema updates
eads Jul 11, 2018
9a07337
Add geocodes table
eads Jul 11, 2018
4620e37
add geocode tale maker to makefile
eads Jul 11, 2018
62950cb
error trapping, insert into unique db, and more fields in geocode ins…
eads Jul 11, 2018
3e331e1
use ranges, not offsets and limits, in geocoder.
eads Jul 12, 2018
b4b4d5a
add some views
eads Jul 13, 2018
bec3bc2
massive makefile overhaul
eads Jul 13, 2018
6adaf65
match addresses on ID when inserting, better guardrails on grabbing e…
eads Jul 13, 2018
c4c1558
keep the dumps directory
eads Jul 13, 2018
4ac288f
table definition for community area stats from CMAP
eads Jul 13, 2018
5447790
add amusingly embarassing note to readme
eads Jul 13, 2018
3f21496
Merge pull request #17 from propublica/0007-eads-geocoder
eads Jul 13, 2018
018cbc4
corrects = and :, cuts drop_db dependency on create_db, corrects refe…
Jul 16, 2018
94d7e21
adds csvkit requirement to README closes #23
Jul 16, 2018
8bd83a2
replicate katlyn's makefile fixes
eads Jul 18, 2018
aff7213
correct dependency in load_community_areas make command
eads Jul 18, 2018
f482c25
fix community area matching, closes #25
eads Jul 18, 2018
9f681fc
Merge pull request #27 from propublica/0025-community-area-matching
eads Jul 18, 2018
469fa3b
Merge branch 'master' of github.com:propublica/il-ticket-loader
Jul 19, 2018
55fb3d9
load dump file
eads Jul 20, 2018
c7d0a8d
Merge branch '0025-community-area-matching'
eads Jul 20, 2018
fabde92
ignore permissions from dump file, closes #30
eads Jul 20, 2018
f8f23bf
Merge pull request #32 from propublica/0030-pgrestore-perms
eads Jul 20, 2018
9f4498f
add db string to environment
eads Jul 20, 2018
1878e72
Merge pull request #33 from propublica/0031-dbstring
eads Jul 20, 2018
a9a5c98
refactor make top level command signatures
eads Jul 23, 2018
08d9f69
remove csvsql schema generation from makefile and turn into a note in…
eads Jul 23, 2018
e8aa465
clean up restore permissions, change filename for now
eads Jul 23, 2018
af0ff11
remove irrelevant notes file
eads Jul 23, 2018
fe65910
flip order of explanation about config
eads Jul 23, 2018
7e9aac2
reflect latest make targets
eads Jul 23, 2018
1ec3af3
handle 00s, closes #38
eads Jul 23, 2018
dc2e482
fix db url reference
eads Jul 23, 2018
07fd94d
fix bad bug with wrong types for geocoder ranges
eads Jul 23, 2018
5bd3ece
add quotes to community_area_stats table
eads Jul 23, 2018
328049a
add geom column that's been giving us the flux
eads Jul 23, 2018
46683fa
include just city sticker accuracy in geocode accuracy view
eads Jul 23, 2018
8a91b61
add analysis view that joins every geocode with its community area
eads Jul 23, 2018
76ff39c
Merge branch 'master' of github.com:propublica/il-ticket-loader
Jul 23, 2018
b75b2a1
allows psql view to be updated when view is updated
Jul 25, 2018
b41e4a6
adds median income to community area stats csv
Jul 25, 2018
0bae864
adds all years to data and salts n hashes the license plate numbers, …
Aug 21, 2018
50cb16d
corrects naming of salt.txt file for the intermediate declaration I r…
Aug 22, 2018
934ae71
Merge pull request #46 from propublica/0042-hash-plates
Aug 22, 2018
dedcd07
exports parking table as one big csv and zips file, closes #50, close…
Aug 22, 2018
ceeafe6
zip_n_ship makefile command now also exports zip to s3
Aug 22, 2018
a5ecadc
forget to add the salt creator my b
Aug 22, 2018
55d484d
adds unit_key.csv to zip file to upload to s3
Aug 22, 2018
2f64ebd
adding unit_key.csv to data/ in case Nicolas Cage needs it
Aug 22, 2018
f16b11e
Merge pull request #55 from propublica/0048-zip-n-ship-data
eads Aug 23, 2018
e98910d
Merge pull request #54 from propublica/0042-hash-plates
eads Aug 23, 2018
36e35e2
adds data dict to zip file for s3
Aug 23, 2018
19d0577
adding requirements text file
Aug 23, 2018
5d244de
Merge branch 'master' of github.com:propublica/il-ticket-loader
Aug 23, 2018
aa1b7b1
updating gitignore, closes #58
Aug 23, 2018
5c57664
updating readme to include data dictionary and python requirements, c…
Aug 23, 2018
b3d54ef
account for reason for dismissal from older data
eads Oct 12, 2018
aed0720
makefile cleanup
eads Oct 12, 2018
90d88b8
add dismissal reason to schema
eads Oct 12, 2018
2c071b7
block summary 'views' (really derived tables until hasura supports ma…
eads Oct 12, 2018
25656dc
fix view creation order
eads Oct 12, 2018
e062fb3
update readme with additional reqs, view order
eads Oct 12, 2018
b3186b3
first tests. Found one that's not passing, too
eads Oct 18, 2018
4a7440d
trying out circleci
eads Oct 19, 2018
709abcf
rename clean_csv test file for consistency
eads Oct 19, 2018
0979214
add test for clean_row function
eads Oct 19, 2018
638bae7
use a config.yml that was autogenerated during project setup
eads Oct 19, 2018
0882b44
match zero or more leading zeros from address strings
eads Oct 19, 2018
08704d0
fix requirements for circle ci
eads Oct 19, 2018
52cbe31
Merge pull request #71 from propublica/0070-leading-zeroes
eads Oct 19, 2018
7ae0e2a
Merge pull request #59 from propublica/0048-zip-n-ship-data
eads Oct 19, 2018
71e9744
derive geocodes from original table
eads Oct 22, 2018
eb9d834
try to speed up intermediate table query
eads Oct 22, 2018
d86dacd
more makefile optimization
eads Oct 23, 2018
a3b141b
make hash object once (cuts a few seconds off processing time)
eads Oct 23, 2018
83ef995
use new year column in queries
eads Oct 23, 2018
7cc4869
commit index creating to sql files
eads Oct 23, 2018
230cc5f
move test file back where it belongs
eads Oct 23, 2018
bf476ac
Merge pull request #76 from propublica/0074-dedupe-geocodes
eads Oct 23, 2018
7808e0b
use C collation
eads Oct 24, 2018
1211fcc
join through multiple addresses and lat / lngs
eads Oct 24, 2018
2152bf8
the amount of branches with this fix is embarassing
eads Oct 24, 2018
dd6c67b
Merge branch 'master' into 0078-extraction-tests
eads Oct 24, 2018
8e7611d
fix mistake in hashing
eads Oct 24, 2018
9ccfef6
clean up clean_csv file and factor date extractors into wee functions
eads Oct 24, 2018
8dba7e9
functional extraction tests
eads Oct 24, 2018
feb1f01
Merge pull request #80 from propublica/0078-extraction-tests
eads Oct 24, 2018
daaec3e
sane checks on existence of geocode tables and views
eads Oct 24, 2018
cfbbdef
introduce geocodes_normalized view
eads Oct 24, 2018
91a9a72
Merge pull request #84 from propublica/0077-partial-build-geo
eads Oct 24, 2018
8490740
Merge branch 'master' into 0075-better-indexing
eads Oct 24, 2018
171cf08
used normalized table consistently
eads Oct 24, 2018
daddd0d
include geom in normalized geocode query
eads Oct 24, 2018
577c675
Merge pull request #79 from propublica/0075-better-indexing
eads Oct 24, 2018
f2c273f
write files to dupes dir, closes #88
eads Oct 24, 2018
1ba2fbd
Merge pull request #91 from propublica/0088-fix-dupes
eads Oct 24, 2018
3bed0a4
clean, correct, and add query example to readme
eads Oct 24, 2018
553c94d
Merge pull request #94 from propublica/0089-readme-cleanup
eads Oct 24, 2018
a378d52
fix bad data dict list in readme
eads Oct 24, 2018
b73b131
add violation lookup
eads Oct 25, 2018
615be24
Merge pull request #96 from propublica/0092-violation-lookup
eads Oct 25, 2018
e0a87e6
wip
eads Oct 29, 2018
ba35118
new core DB structure for ward analysis
eads Oct 29, 2018
c494a93
Merge pull request #97 from propublica/0081-ward-analysis-work-backward
eads Oct 29, 2018
1facd58
clean up, skip clustering for now
eads Oct 29, 2018
7d6b7e9
Merge pull request #100 from propublica/0092-violation-lookup
eads Oct 29, 2018
430e29d
load metadata
eads Oct 29, 2018
d7ef46b
Merge pull request #101 from propublica/0099-ward-meta
eads Oct 29, 2018
ca552da
calculate penalties; refactor data structure
eads Nov 5, 2018
4244acc
Merge pull request #108 from propublica/0105-calculate-penalties
eads Nov 5, 2018
b6fad26
ignore mapbox
eads Nov 5, 2018
8e2ae66
restore parking
eads Nov 5, 2018
a3b3e22
Merge pull request #111 from propublica/0110-parking
eads Nov 5, 2018
892e71c
upload data to mapbox
eads Nov 7, 2018
683e2bb
refactor penalty calculator a bit
eads Nov 7, 2018
e6ad632
move hasura over here
eads Nov 7, 2018
23f714e
Merge pull request #112 from propublica/0098-hasura
eads Nov 7, 2018
4b37297
make geo layers for app
eads Nov 28, 2018
4426bf3
calculate % successfully contested
eads Nov 28, 2018
07aa9e6
add make command to populate geocode table
eads Nov 29, 2018
61bb0a4
clean up makefile
eads Nov 29, 2018
bc8e146
update db schema and indexes for geocodio
eads Nov 29, 2018
f6ce55e
add scrapy geocoder
eads Nov 29, 2018
6f17159
functional new geocoder
eads Nov 30, 2018
3118910
remove old geocoder
eads Nov 30, 2018
0e24589
start loading census data
eads Nov 30, 2018
6b4b389
wip, functional new geocoding
eads Dec 6, 2018
119885b
tweaks for app, tests
eads Dec 10, 2018
f77b8eb
test % of debt captured in wards vs city wide figures
eads Dec 10, 2018
a0846cd
use lte and gte for totals query
eads Dec 11, 2018
fa1e498
better block query
eads Dec 11, 2018
f15946c
preliminary bulletproofing -- to be pipelined once we lock down details
j2kao Dec 12, 2018
3d24b4f
lots of small updates and fixes
eads Dec 13, 2018
cabc124
Merge branch 'master' into 0127-bulletproof
eads Dec 13, 2018
dba762c
tons of small data fixes
eads Dec 17, 2018
f05a016
bump to latest metadata
eads Dec 17, 2018
49b7a46
check top 5 violation categories for 2013-2017
j2kao Dec 18, 2018
8cb830c
add top 5 and top level summaries
j2kao Dec 18, 2018
d589fd9
many data fixes for production
eads Dec 20, 2018
2749c64
fix camera loader to accomodate older data
eads Feb 12, 2019
69d3023
do the pipfile thing
eads Feb 12, 2019
7aa1029
camera data now runs through python processor and needs new columns
eads Feb 12, 2019
b80f589
remove duplicate issue date in table definition
eads Feb 12, 2019
5694556
add some exports for block club
eads Feb 12, 2019
fc58842
ignore bulletproofing stuff
eads Feb 12, 2019
27c31dc
make datastore directory
eads Feb 12, 2019
7e69e02
use exports framework to zip-n-ship
eads Feb 12, 2019
7c7087c
add makefile download alias
eads Feb 12, 2019
2452f30
update README for release
eads Feb 12, 2019
7428b11
add exports and turn off s3 uploading for now
eads Feb 12, 2019
85ac89a
add bulletproofing stuff to messy gitignore
eads Mar 4, 2019
7c21419
merge in way that should allow for merging back to master
eads Mar 4, 2019
3 changes: 3 additions & 0 deletions .gitignore
@@ -7,5 +7,8 @@ __pycache__/
il-ticket-loader/
processors/salt.txt
env/mapbox.sh
bulletproof/.ipynb_checkpoints/bulletproof_sql_notebook-checkpoint.ipynb
bulletproof/parking-geo.csv
bulletproof/parking-geo.csv.gz
bulletproof/*.csv
.ipynb_checkpoints/