YoutubeTranscriber

A Python script that searches for strings in the spoken content of YouTube videos.

It uses the Google Cloud Speech-to-Text API to generate transcripts. You can use Google Cloud Free Tier credits.
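
To illustrate the search step, here is a minimal sketch that assumes the transcript is already available as a list of (start time, text) segments. The `Segment` type and the `find_matches` function are hypothetical names used for illustration only; they are not taken from this repo, and the real script may store its transcript differently.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One transcribed chunk of audio (hypothetical structure)."""
    start_seconds: float
    text: str


def find_matches(segments: List[Segment], query: str) -> List[Segment]:
    """Return the segments whose text contains `query`, ignoring case."""
    needle = query.casefold()
    return [seg for seg in segments if needle in seg.text.casefold()]


# Made-up transcript data, for demonstration only.
transcript = [
    Segment(0.0, "Welcome to the channel"),
    Segment(12.5, "Today we talk about Docker"),
    Segment(30.0, "Docker images are built in layers"),
]

for seg in find_matches(transcript, "docker"):
    print(f"{seg.start_seconds:>6.1f}s  {seg.text}")
```

Keeping a start time next to each piece of text means a match can be reported with a position in the video rather than as a bare yes or no.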

How to use?

  1. Clone this repo.
  2. Sign in to GCP.
  3. Go to the Speech-to-Text API page, select your project, and enable the API.
  4. Click "Credentials".
  5. Click "Create Credentials".
  6. Select "Service Account Key".
  7. Under "Service Account", select "New service account".
  8. Name the service account.
  9. Select Role: "Project" -> "Owner".
  10. Finish creating the credential.
  11. Select your service account from the list.
  12. Click the "Add Key" button and select "Create New Key".
  13. Leave the "JSON" option selected.
  14. Click "Create".
  15. Save the generated API key file to the repo's main directory.
  16. Rename the file to "api-key.json", or set the GC_CREDENTIAL env variable to your JSON file name when running the Docker image.
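
As a rough sketch of step 16, the snippet below resolves the key file name from `GC_CREDENTIAL`, falling back to `api-key.json`, and checks that the file looks like a service account key. The helper names are hypothetical and the script's actual lookup logic may differ. The `type`, `client_email`, and `private_key` fields are standard in Google service account key files.

```python
import json
import os
from pathlib import Path

DEFAULT_KEY_FILE = "api-key.json"  # default name from step 16


def resolve_key_path(env=None) -> Path:
    """Pick the key file: GC_CREDENTIAL if set and non-empty, else the default."""
    env = os.environ if env is None else env
    return Path(env.get("GC_CREDENTIAL") or DEFAULT_KEY_FILE)


def looks_like_service_account_key(path: Path) -> bool:
    """Best-effort sanity check; does not contact Google or verify the key."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing/unreadable file, or content that is not valid JSON.
        return False
    required = {"type", "client_email", "private_key"}
    return (
        isinstance(data, dict)
        and required.issubset(data)
        and data["type"] == "service_account"
    )


key_path = resolve_key_path()
ok = looks_like_service_account_key(key_path)
print(f"Using key file: {key_path} (looks valid: {ok})")
```

A check like this only catches a missing or wrong file before any API call is made; it says nothing about whether the key is still active.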

This project uses parallel processing. You can set the NUM_OF_THREADS env variable to specify the number of concurrent threads when running the Docker image. If you do not set it, the program uses 8 threads.
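
The thread-count setting could be read roughly as follows. This is a sketch under assumptions: `transcribe_chunk` is a hypothetical stand-in for the real Speech-to-Text request, and how the script actually splits the audio into pieces is not shown here.

```python
import os
from concurrent.futures import ThreadPoolExecutor

DEFAULT_THREADS = 8  # default stated in this README


def thread_count(env=None) -> int:
    """Read NUM_OF_THREADS, falling back to the default on bad input."""
    env = os.environ if env is None else env
    try:
        value = int(env.get("NUM_OF_THREADS", DEFAULT_THREADS))
    except ValueError:
        return DEFAULT_THREADS
    return value if value > 0 else DEFAULT_THREADS


def transcribe_chunk(chunk: str) -> str:
    """Hypothetical placeholder for a per-chunk transcription request."""
    return chunk.upper()


def transcribe_all(chunks, workers: int):
    """Process chunks concurrently; map() returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcribe_chunk, chunks))


print(transcribe_all(["chunk one", "chunk two"], thread_count()))
```

Threads are a reasonable fit when each unit of work mostly waits on a network response, which is the case for remote API calls.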

# Build Docker image
docker build . -t calganaygun/youtube-transcriber:latest

# Run program and search or generate text
docker run -it calganaygun/youtube-transcriber:latest -v <YouTube Video ID> \
-w <Search string | getAll: prints all of video content> \
-l <Language code Example: 'tr-TR'>
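
For reference, the three flags above could be parsed along these lines. This is a hedged sketch using Python's `argparse`, not the script's actual parser: the long option names (`--video`, `--word`, `--language`), the choice to make every flag required, and the `describe` helper are assumptions made for illustration.

```python
import argparse

GET_ALL = "getAll"  # special -w value that prints the whole transcript


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the -v, -w and -l flags (long names are assumed)."""
    parser = argparse.ArgumentParser(prog="youtube-transcriber")
    parser.add_argument("-v", "--video", required=True, help="YouTube video ID")
    parser.add_argument(
        "-w", "--word", required=True,
        help=f"search string, or '{GET_ALL}' to print everything",
    )
    parser.add_argument(
        "-l", "--language", required=True, help="language code, e.g. 'tr-TR'"
    )
    return parser


def describe(argv) -> str:
    """Summarize what a given command line asks for."""
    args = build_parser().parse_args(argv)
    if args.word == GET_ALL:
        return f"print full transcript of {args.video} ({args.language})"
    return f"search {args.video} ({args.language}) for {args.word!r}"


# <VIDEO_ID> is a placeholder, not a real video ID.
print(describe(["-v", "<VIDEO_ID>", "-w", "getAll", "-l", "tr-TR"]))
```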

Examples

[asciicast recordings of example runs]