This package provides tools to convert iNaturalist observation data to and from a wide variety of useful formats. This is mainly intended for use with the iNaturalist API via pyinaturalist, but also works with other data sources.
Complete project documentation can be found at pyinaturalist-convert.readthedocs.io.
Import formats:
- CSV (from either API results or the iNaturalist export tool)
- JSON (from API results)
- pyinaturalist.Observation objects
- Dataframes, Feather, Parquet, and anything else supported by pandas
- iNaturalist GBIF Archive
- iNaturalist Taxonomy Archive
- iNaturalist Open Data on Amazon
- Note: see API Recommended Practices for details on which data sources are best suited to different use cases
Export formats:
- CSV, Excel, and anything else supported by tablib
- Dataframes, Feather, Parquet, and anything else supported by pandas
- Darwin Core
- GeoJSON
- GPX
- SQLite
- SQLite + FTS5 text search for taxonomy
Install with pip:
pip install pyinaturalist-convert
Or with conda:
conda install -c conda-forge pyinaturalist-convert
To keep things modular, many format-specific dependencies are not installed by default, so you may need to install additional packages depending on which features you want. Each module's docs list any extra dependencies it needs, and the full list can be found in pyproject.toml.
To get started, it's recommended to install all optional dependencies:
pip install pyinaturalist-convert[all]
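If you prefer to install only what you need, it can help to check which optional packages are already present. The following is a minimal sketch using only the standard library; the package names in `OPTIONAL_PACKAGES` are an illustrative assumption, and the authoritative list is the one in pyproject.toml.

```python
from importlib.util import find_spec

# Assumed, illustrative selection of optional packages;
# check pyproject.toml for the actual list.
OPTIONAL_PACKAGES = ['pandas', 'tablib', 'geojson', 'gpxpy']


def check_optional_packages(packages=OPTIONAL_PACKAGES):
    """Return a mapping of top-level package name to whether it is importable."""
    return {name: find_spec(name) is not None for name in packages}


if __name__ == '__main__':
    for name, installed in check_optional_packages().items():
        print(f"{name}: {'installed' if installed else 'missing'}")
```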
Get your own observations and save to CSV:
from pyinaturalist import get_observations
from pyinaturalist_convert import *
observations = get_observations(user_id='my_username')
to_csv(observations, 'my_observations.csv')
Or any other supported format:
to_dwc(observations, 'my_observations.dwc')
to_excel(observations, 'my_observations.xlsx')
to_feather(observations, 'my_observations.feather')
to_geojson(observations, 'my_observations.geojson')
to_gpx(observations, 'my_observations.gpx')
to_hdf(observations, 'my_observations.hdf')
to_json(observations, 'my_observations.json')
to_parquet(observations, 'my_observations.parquet')
df = to_dataframe(observations)
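API results are nested JSON, while tabular formats such as CSV need flat columns. As a rough illustration of what that kind of conversion involves, here is a standard-library sketch. It is not this package's implementation: the `flatten` and `records_to_csv` helpers, the dotted column names, and the sample record are all assumptions made for the example.

```python
import csv
import io


def flatten(record, parent_key='', sep='.'):
    """Flatten nested dicts into a single level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f'{parent_key}{sep}{key}' if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def records_to_csv(records):
    """Render a list of nested dicts as CSV text."""
    rows = [flatten(record) for record in records]
    # Union of all column names, in first-seen order
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells
    return buffer.getvalue()


if __name__ == '__main__':
    # Hypothetical record, loosely shaped like an API result
    sample = [{'id': 1, 'taxon': {'name': 'Aves', 'rank': 'class'}}]
    print(records_to_csv(sample))
```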
Most file formats can be loaded via pyinaturalist_convert.read():
observations = read('my_observations.csv')
observations = read('my_observations.xlsx')
observations = read('my_observations.feather')
observations = read('my_observations.hdf')
observations = read('my_observations.json')
observations = read('my_observations.parquet')
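A single entry point that handles several formats is commonly built by dispatching on the file extension. The sketch below shows that general pattern with two standard-library formats. It is an assumption about the approach, not a description of how `read()` is implemented, and `read_records` and `READERS` are hypothetical names.

```python
import csv
import json
from pathlib import Path


def read_csv_records(path):
    """Load a CSV file as a list of dicts, one per row."""
    with open(path, newline='', encoding='utf-8') as f:
        return list(csv.DictReader(f))


def read_json_records(path):
    """Load a JSON file and return its parsed contents."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)


# Map file extensions to loader functions
READERS = {'.csv': read_csv_records, '.json': read_json_records}


def read_records(path):
    """Pick a loader based on the file extension (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f'Unsupported file format: {suffix!r}')
    return READERS[suffix](path)
```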
Download the complete research-grade observations dataset:
download_dwca_observations()
And load it into a SQLite database:
load_dwca_observations()
And do the same with the complete taxonomy dataset:
download_dwca_taxa()
load_dwca_taxa()
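Once the data is in SQLite, you can inspect it with Python's built-in sqlite3 module. The helper below is a generic sketch that lists every table in a database file along with its row count. The name `summarize_tables` is hypothetical, and the database path is left to you, since the default location used by the load functions is not covered here.

```python
import sqlite3
from contextlib import closing


def summarize_tables(db_path):
    """Return {table_name: row_count} for every table in a SQLite database."""
    with closing(sqlite3.connect(db_path)) as conn:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            )
        ]
        # Table names come from sqlite_master, so quoting them is sufficient here
        return {
            table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            for table in tables
        }
```

For example, `summarize_tables('path/to/your.db')` (a placeholder path) gives a quick check that the load step populated the tables you expect.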
Load taxonomy data into a full text search database:
load_taxon_fts_table(languages=['english', 'german'])
And get lightning-fast autocomplete results from it:
ta = TaxonAutocompleter()
ta.search('aves')
ta.search('flughund', language='german')
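For background, SQLite's FTS5 extension supports prefix queries, which is what makes this kind of autocomplete fast. The following standalone sketch shows the general technique with an in-memory database. It is not the TaxonAutocompleter implementation: the table name `taxon_fts`, the helper functions, and the sample rows are assumptions, and it requires a Python build whose SQLite includes FTS5.

```python
import sqlite3


def build_taxon_index(rows):
    """Create an in-memory FTS5 table from (name, taxon_id) pairs."""
    conn = sqlite3.connect(':memory:')
    # taxon_id is stored but not indexed for text search
    conn.execute('CREATE VIRTUAL TABLE taxon_fts USING fts5(name, taxon_id UNINDEXED)')
    conn.executemany('INSERT INTO taxon_fts (name, taxon_id) VALUES (?, ?)', rows)
    return conn


def autocomplete(conn, prefix, limit=10):
    """Return (taxon_id, name) pairs with a word starting with `prefix`."""
    # Quote the search string and append * to make it an FTS5 prefix query
    query = '"' + prefix.replace('"', '""') + '"*'
    cursor = conn.execute(
        'SELECT taxon_id, name FROM taxon_fts WHERE taxon_fts MATCH ? '
        'ORDER BY rank LIMIT ?',
        (query, limit),
    )
    return cursor.fetchall()


if __name__ == '__main__':
    # Sample rows for illustration only; the IDs are placeholders
    conn = build_taxon_index([('Aves', 1), ('Avena sativa', 2), ('Mammalia', 3)])
    print(autocomplete(conn, 'ave'))
```

The default FTS5 tokenizer is case-insensitive, so a search for `ave` matches both `Aves` and `Avena`.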
If you have any problems, suggestions, or questions about pyinaturalist-convert, you are welcome to create an issue or discussion. Also, PRs are welcome!