Data conversion tools for iNaturalist observations and taxonomy

pyinat/pyinaturalist-convert

pyinaturalist-convert

This package provides tools to convert iNaturalist observation data to and from a wide variety of useful formats. It is intended mainly for use with the iNaturalist API via pyinaturalist, but it also works with other data sources.

Complete project documentation can be found at pyinaturalist-convert.readthedocs.io.

Formats

Import

Export

  • CSV, Excel, and anything else supported by tablib
  • Dataframes, Feather, Parquet, and anything else supported by pandas
  • Darwin Core
  • GeoJSON
  • GPX
  • SQLite
  • SQLite + FTS5 text search for taxonomy

Installation

Install with pip:

pip install pyinaturalist-convert

Or with conda:

conda install -c conda-forge pyinaturalist-convert

To keep things modular, many format-specific dependencies are not installed by default, so you may need to install additional packages depending on which features you want. Each module's docs list any extra dependencies needed, and a full list can be found in pyproject.toml.

To get started, we recommend installing all optional dependencies:

pip install "pyinaturalist-convert[all]"
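
If you install only some of the extras, it can help to check which optional packages are importable before calling a format-specific function. The following is a minimal sketch using only the standard library. The `OPTIONAL_DEPS` mapping and the `missing_optional_deps` helper are hypothetical and are not part of pyinaturalist-convert; the module names are assumptions, so check pyproject.toml for the actual list.

```python
from importlib.util import find_spec

# Hypothetical mapping of features to the import names they might need.
# These names are assumptions; pyproject.toml has the authoritative list.
OPTIONAL_DEPS = {
    "dataframes": "pandas",
    "csv/excel": "tablib",
    "geojson": "geojson",
    "gpx": "gpxpy",
}


def missing_optional_deps(deps=OPTIONAL_DEPS):
    """Return the feature names whose backing module cannot be imported."""
    return sorted(name for name, module in deps.items() if find_spec(module) is None)


missing = missing_optional_deps()
print("Missing optional features:", ", ".join(missing) or "none")
```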

Usage

Export

Get your own observations and save to CSV:

from pyinaturalist import get_observations
from pyinaturalist_convert import *

observations = get_observations(user_id='my_username')
to_csv(observations, 'my_observations.csv')

Or any other supported format:

to_dwc(observations, 'my_observations.dwc')
to_excel(observations, 'my_observations.xlsx')
to_feather(observations, 'my_observations.feather')
to_geojson(observations, 'my_observations.geojson')
to_gpx(observations, 'my_observations.gpx')
to_hdf(observations, 'my_observations.hdf')
to_json(observations, 'my_observations.json')
to_parquet(observations, 'my_observations.parquet')
df = to_dataframe(observations)
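
API results are nested JSON, while CSV and the other tabular formats are flat, so some flattening has to happen along the way. The sketch below shows one common approach (dotted column names) using only the standard library. It is an illustration of the idea, not the library's implementation; `flatten`, `records_to_csv`, and the sample records are made up for this example.

```python
import csv
import io


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def records_to_csv(records):
    """Write flattened records to a CSV string; columns are the union of all keys."""
    rows = [flatten(r) for r in records]
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()


# Made-up sample data, shaped only loosely like API results
sample = [
    {"id": 1, "taxon": {"id": 3, "name": "Aves"}, "quality_grade": "research"},
    {"id": 2, "taxon": {"id": 47126, "name": "Plantae"}},
]
csv_text = records_to_csv(sample)
print(csv_text)
```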

Import

Most file formats can be loaded via pyinaturalist_convert.read():

observations = read('my_observations.csv')
observations = read('my_observations.xlsx')
observations = read('my_observations.feather')
observations = read('my_observations.hdf')
observations = read('my_observations.json')
observations = read('my_observations.parquet')
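
How `read()` selects a loader is not described here. One plausible approach for a single entry point like this is to dispatch on the file extension. The sketch below illustrates that pattern for CSV and JSON with the standard library only; `read_records` and `LOADERS` are hypothetical names, not part of pyinaturalist-convert.

```python
import csv
import json
from pathlib import Path


def _load_csv(path):
    """Load a CSV file as a list of dicts (all values are strings)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def _load_json(path):
    """Load a JSON file and return its parsed content."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Hypothetical registry: file suffix -> loader function
LOADERS = {".csv": _load_csv, ".json": _load_json}


def read_records(path):
    """Pick a loader based on the (case-insensitive) file suffix."""
    suffix = Path(path).suffix.lower()
    try:
        loader = LOADERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix!r}") from None
    return loader(path)
```

Supporting another format in this design means adding one entry to the registry, which keeps the calling code unchanged.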

Download

Download the complete research-grade observations dataset:

download_dwca_observations()

And load it into a SQLite database:

load_dwca_observations()

And do the same with the complete taxonomy dataset:

download_dwca_taxa()
load_dwca_taxa()
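
The functions above handle the full archives for you. To show the general idea of bulk-loading tabular archive data into SQLite, here is a small standard-library sketch. It is not the library's implementation: `load_csv_into_sqlite`, the table name, and the sample rows are made up, and only a few Darwin Core term names are used as column headers.

```python
import csv
import io
import sqlite3

# Made-up rows using a few Darwin Core term names as column headers
SAMPLE_CSV = """id,scientificName,decimalLatitude,decimalLongitude
1,Danaus plexippus,41.88,-87.62
2,Turdus migratorius,44.98,-93.27
"""


def load_csv_into_sqlite(csv_text, conn, table="observation"):
    """Create a table from the CSV header, bulk-insert all rows, and return the row count."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    with conn:  # commit on success, roll back on error
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


conn = sqlite3.connect(":memory:")
row_count = load_csv_into_sqlite(SAMPLE_CSV, conn)
print(row_count, "rows loaded")
```

Inserting with `executemany` inside a single transaction avoids a commit per row, which matters once the input grows to millions of records.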

Load taxonomy data into a full text search database:

load_taxon_fts_table(languages=['english', 'german'])

And get lightning-fast autocomplete results from it:

ta = TaxonAutocompleter()
ta.search('aves')
ta.search('flughund', language='german')
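
For context on what an FTS5-backed autocomplete involves, the sketch below builds a tiny FTS5 table with Python's built-in `sqlite3` module and runs prefix queries against it. This is an assumption-laden illustration, not the schema or query that `TaxonAutocompleter` uses: the table name `taxon_fts`, the `autocomplete` helper, and the sample rows (including the IDs) are made up. It requires an SQLite build with FTS5 enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# taxon_id is stored but not indexed for text search
conn.execute("CREATE VIRTUAL TABLE taxon_fts USING fts5(name, taxon_id UNINDEXED)")
conn.executemany(
    "INSERT INTO taxon_fts (name, taxon_id) VALUES (?, ?)",
    [("Aves", 101), ("Avena sativa", 102), ("Plantae", 103)],  # made-up IDs
)


def autocomplete(conn, prefix, limit=10):
    """Return (name, taxon_id) rows where some token in name starts with `prefix`."""
    # Quote the prefix as an FTS5 string, then append * to request a prefix match
    query = '"' + prefix.replace('"', '""') + '"*'
    return conn.execute(
        "SELECT name, taxon_id FROM taxon_fts WHERE taxon_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


for name, taxon_id in autocomplete(conn, "ave"):
    print(taxon_id, name)
```

Matching is case-insensitive with the default tokenizer, so `'ave'` finds both "Aves" and "Avena sativa".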

Feedback

If you have any problems, suggestions, or questions about pyinaturalist-convert, you are welcome to create an issue or discussion. Also, PRs are welcome!