Available here: https://erik-overdahl.github.io/huberman-lab-transcripts/
This is the data for a static site containing full transcripts of all of the episodes of the Huberman Lab Podcast. The text is pulled from the captions of the videos posted on YouTube.
Almost all of these are true transcripts, although a few of the podcast videos seem to have only autogenerated captions - those transcripts are labeled as such.
In the future, I would like to have not only these transcripts, but also full-text search of the podcast transcripts AND the comments pulled from YouTube.
Only English transcripts are provided for now. Let me know if you would like to have the Spanish versions as well.
This repo provides a command-line interface named `huberman-transcripts`:
```
Usage: cli.py [OPTIONS] COMMAND [ARGS]...

Options:
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                  Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                  Show completion for the specified shell, to
                                  copy it or customize the installation.
  --help                          Show this message and exit.

Commands:
  download  A wrapper around youtube-dl for downloading video data
  generate  Generate markdown files for Pelican.
```
```
Usage: cli.py download [OPTIONS] VIDEO_OR_PLAYLIST_IDS...

  A wrapper around youtube-dl for downloading video data

Arguments:
  VIDEO_OR_PLAYLIST_IDS...  One or more youtube video ids or playlist ids
                            [required]

Options:
  --data-dir TEXT  Directory into which to download data  [default: ./data]
  --help           Show this message and exit.
```
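For example, to fetch the data for one video into the default data directory (the id below is a placeholder, not a real episode):

```sh
python cli.py download VIDEO_ID --data-dir ./data
```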
This site uses captions pulled from the Huberman Lab Podcast videos on YouTube. Data is gathered using `huberman-transcripts`. Two files are generated for each video: a JSON file containing all of the metadata about the YouTube video, and a .vtt captions file.
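As a rough sketch of how those two files can be read back in (the file names below follow youtube-dl's default naming and are an assumption, not necessarily what this repo writes):

```python
import json
import re
from pathlib import Path

DATA_DIR = Path("./data")

def load_video(video_id: str):
    """Load the metadata JSON and parse the .vtt captions for one video."""
    # Assumed file names, modeled on youtube-dl's defaults
    meta = json.loads((DATA_DIR / f"{video_id}.info.json").read_text())

    cues = []  # (start, end, text) triples
    vtt = (DATA_DIR / f"{video_id}.en.vtt").read_text()
    # A WebVTT cue block begins with an "HH:MM:SS.mmm --> HH:MM:SS.mmm"
    # timing line; everything after it in the block is caption text.
    for block in vtt.split("\n\n"):
        m = re.search(r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})", block)
        if m:
            cues.append((m.group(1), m.group(2), block[m.end():].strip()))
    return meta, cues
```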
```
Usage: cli.py generate [OPTIONS] [VIDEO_IDS]...

  Generate markdown files for Pelican

Arguments:
  [VIDEO_IDS]...  The youtube video ids for which to generate markdown files.
                  If empty, generate files for all ids in [DATA_DIR]

Options:
  --data-dir TEXT    Directory of raw video data  [default: ./data]
  --target-dir TEXT  Directory into which to place generated markdown files
                     for static site generator  [default:
                     ./site/content/posts]
  --help             Show this message and exit.
```
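A typical invocation, regenerating the markdown for every downloaded video using the documented defaults:

```sh
python cli.py generate --data-dir ./data --target-dir ./site/content/posts
```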
The JSON and .vtt files are read into objects, which are then used to create the markdown files. Each video has "chapters", which are timestamps helpfully provided by the Huberman Lab Podcast team. Captions are matched to chapters by timestamp, and then re-aligned so that sentences do not break over chapter boundaries.
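The matching and re-alignment might look roughly like the sketch below (illustrative names and simplified data shapes, not the repo's actual code). Each cue is assigned to the last chapter that starts at or before it; a second pass then pulls any sentence fragment that spills over a boundary back into the chapter where the sentence began.

```python
import re
from bisect import bisect_right

def assign_cues_to_chapters(chapters, cues):
    """Group caption cues by chapter.

    chapters: (start_seconds, title) pairs, sorted by start time
    cues:     (start_seconds, text) pairs, sorted by start time
    Returns an ordered list of (title, text) pairs.
    """
    starts = [start for start, _ in chapters]
    buckets = [[] for _ in chapters]
    for start, text in cues:
        # index of the last chapter whose start time is <= the cue's start
        idx = max(bisect_right(starts, start) - 1, 0)
        buckets[idx].append(text)
    return [(title, " ".join(bucket)) for (_, title), bucket in zip(chapters, buckets)]

def realign_sentence_breaks(sections):
    """If a chapter's text ends mid-sentence, pull the rest of that
    sentence forward from the start of the next chapter."""
    sections = [list(pair) for pair in sections]
    for i in range(len(sections) - 1):
        text = sections[i][1].rstrip()
        if text and text[-1] not in ".!?":
            nxt = sections[i + 1][1]
            m = re.search(r"[.!?]", nxt)  # end of the straddling sentence
            if m:
                sections[i][1] = text + " " + nxt[: m.end()]
                sections[i + 1][1] = nxt[m.end():].lstrip()
    return [tuple(pair) for pair in sections]
```

Sorting chapters once and using `bisect_right` keeps the per-cue lookup logarithmic, and the re-alignment pass only ever moves text across a single boundary.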
The site itself is generated using the Pelican static site generator. Everything it needs lives in the site/ directory. The default theme is currently in use, but this is likely to change in the near future.
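Building the site locally should then be the standard Pelican workflow, something like the following (the settings file name is an assumption; pelicanconf.py is simply Pelican's default):

```sh
pelican site/content -o site/output -s site/pelicanconf.py
```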
Done so far:

- download data into a folder
- generate markdown
- add a table of contents
- mark autogenerated transcripts
- generate HTML
- write a README
- host on GitHub Pages
- get themes working
- get links working

Still to do, in no particular order:
- add an about page
- automate updating with new podcasts
- add favicon
- add search
- optimize site