This project provides scripts for retrieving Postgres mailing list archives, populating a Postgres database with them, and rendering the archives with NextJS.
- Run `git clone --recurse-submodules lanterndata/pg`
- Use `./scripts/scrape.sh` to download the mailing list archives and populate a Postgres database. Note that `DATABASE_URL` should be set in the environment.
  - Use `pgsql-lists-offline` to download the mailing list archives to a local `data` directory. Note the dependencies on `parallel` and `curl`. See more below.
  - Use `yarn scrape` to process the archives and populate a Postgres database. Note that `DATABASE_URL` should be set in the environment.
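
The prerequisites above (a `DATABASE_URL` in the environment, plus `parallel` and `curl` on the `PATH`) can be checked before anything is downloaded. The following is a minimal sketch, not part of this repository; the `preflight` function name is hypothetical.

```shell
#!/bin/sh
# Illustrative preflight check (hypothetical helper, not shipped with this repo).
# Succeeds only when DATABASE_URL is non-empty and every command named
# in the arguments can be found on PATH.
preflight() {
  pf_status=0
  if [ -z "${DATABASE_URL:-}" ]; then
    echo "DATABASE_URL is not set" >&2
    pf_status=1
  fi
  for pf_cmd in "$@"; do
    if ! command -v "$pf_cmd" >/dev/null 2>&1; then
      echo "missing dependency: $pf_cmd" >&2
      pf_status=1
    fi
  done
  return "$pf_status"
}
```

One possible use is `preflight parallel curl yarn && ./scripts/scrape.sh`, so that the scrape only starts when everything it relies on is present.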
To create a new migration:
yarn dbmate new <migration-name>
To apply pending migrations, with DATABASE_URL set in the environment:
yarn dbmate up
The `body_dense_vector` column in `messages` was added and populated using the Lantern dashboard, and is not in the migration files.
Source: https://github.com/wsdookadr/pgsql-lists-offline
Downloads mailing list archives for PostgreSQL, PostGIS and pgRouting.
The downloaded archives are in the mbox format and can be read using any mbox-aware email client (for example mutt).
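
Because mbox is a plain-text format in which each message starts with a separator line beginning with `From ` (followed by a space), a downloaded archive can also be inspected with standard tools. A minimal sketch, assuming the file is a well-formed mbox; the `count_messages` name is hypothetical, and the count is approximate if the archive contains unescaped body lines that begin with `From `.

```shell
#!/bin/sh
# Count the messages in an mbox file by counting "From " separator lines.
# $1 = path to an mbox file (any path; the tool's naming scheme is not assumed here).
count_messages() {
  grep -c '^From ' "$1"
}
```

For example, `count_messages path/to/archive.mbox` prints the number of separator lines found in that file.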
Install the dependencies:
sudo apt-get install parallel curl
To list the available mailing lists:
./pgsql-lists-offline.sh -l
To download a list (for example pgsql-general):
./pgsql-lists-offline.sh -g pgsql-general
To download a list for a particular month (given as YYYYMM):
./pgsql-lists-offline.sh -g pgsql-general -m 202403
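
To fetch several consecutive months, the per-month command can be wrapped in a loop. This is a sketch under the assumption that `-m` takes a single YYYYMM value, as in the example above; `month_range`, `download_months`, and the `DRY_RUN` variable are hypothetical helpers, not options of `pgsql-lists-offline.sh`.

```shell
#!/bin/sh
# Print every month from $1 to $2 inclusive, both given as YYYYMM.
month_range() {
  mr_year=${1%??}         # first four characters: the year
  mr_month=${1#????}      # last two characters: the month
  mr_month=${mr_month#0}  # drop a leading zero so "08" and "09" are not read as octal
  while :; do
    mr_cur=$(printf '%04d%02d' "$mr_year" "$mr_month")
    if [ "$mr_cur" -gt "$2" ]; then
      break
    fi
    echo "$mr_cur"
    mr_month=$((mr_month + 1))
    if [ "$mr_month" -gt 12 ]; then
      mr_month=1
      mr_year=$((mr_year + 1))
    fi
  done
}

# Download one list for each month in a range.
# $1 = list name, $2 = first month, $3 = last month (YYYYMM).
# With DRY_RUN set to a non-empty value, the commands are printed instead of run.
download_months() {
  for dm_month in $(month_range "$2" "$3"); do
    ${DRY_RUN:+echo} ./pgsql-lists-offline.sh -g "$1" -m "$dm_month"
  done
}
```

Running `DRY_RUN=1 download_months pgsql-general 202311 202402` prints the four commands that would be executed, which is a cheap way to confirm the range before downloading anything.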