A Python script that searches for strings in the spoken content of YouTube videos.
It uses the Google Cloud Speech-to-Text API to generate transcripts. You can use Google Cloud Free Tier credits for this.
- Clone this repo.
- Sign in to GCP.
- Go to the Speech-to-Text API page, select your project, and enable the API.
- Click "Credentials".
- Click "Create Credentials".
- Select "Service Account Key".
- Under "Service Account" select "New service account".
- Name the service account.
- Select Role: "Project" -> "Owner".
- Finish creating the credential.
- Select your "Service Account" from the list.
- Click the "Add Key" button and select "Create New Key".
- Leave "JSON" option selected.
- Click "Create".
- Save the generated API key file to the repo's main directory.
- Rename the file to "api-key.json" or, when running the Docker image, set the GC_CREDENTIAL environment variable to your JSON file's name.
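
The last step above can be sketched in Python as follows. This is a minimal illustration, not the script's actual code: the function name `resolve_credential_path` and the `base_dir` parameter are hypothetical. Only the GC_CREDENTIAL variable and the "api-key.json" default come from this README.

```python
import os
from pathlib import Path

# Default key file name, as described in the setup steps.
DEFAULT_CREDENTIAL = "api-key.json"


def resolve_credential_path(env=None, base_dir="."):
    """Return the path of the service-account key file.

    Hypothetical helper: uses GC_CREDENTIAL when it is set and non-empty,
    otherwise falls back to "api-key.json" in base_dir.
    """
    env = os.environ if env is None else env
    name = env.get("GC_CREDENTIAL") or DEFAULT_CREDENTIAL
    return Path(base_dir) / name


if __name__ == "__main__":
    print(resolve_credential_path())
```

Google's client libraries read the key file path from the standard GOOGLE_APPLICATION_CREDENTIALS environment variable, so a resolved path could be handed over that way. Whether this script does so is an assumption.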
This project uses parallel processing. When running the Docker image, you can set the NUM_OF_THREADS environment variable to specify the number of concurrent threads. By default, the program uses 8 threads.
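
One way such a thread setting might be read and applied is sketched below. It assumes the work is split into independent chunks and run on a `ThreadPoolExecutor`. The names `thread_count`, `transcribe_chunk`, and `transcribe_all` are hypothetical, and `transcribe_chunk` is only a stand-in for a real Speech-to-Text request. The fallback to the default on invalid values is also an assumption.

```python
import os
from concurrent.futures import ThreadPoolExecutor

DEFAULT_THREADS = 8  # default stated in this README


def thread_count(env=None):
    """Read NUM_OF_THREADS, falling back to the default on bad input."""
    env = os.environ if env is None else env
    try:
        n = int(env.get("NUM_OF_THREADS", ""))
    except ValueError:
        return DEFAULT_THREADS
    return n if n > 0 else DEFAULT_THREADS


def transcribe_chunk(chunk):
    """Placeholder for one transcription request (hypothetical)."""
    return chunk.upper()


def transcribe_all(chunks, workers):
    """Process chunks concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcribe_chunk, chunks))


if __name__ == "__main__":
    print(transcribe_all(["part one", "part two"], thread_count()))
```

`pool.map` returns results in input order, which matters if the pieces are later joined into a single transcript.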
# Build Docker image
docker build . -t calganaygun/youtube-transcriber:latest
# Run the program to search for a string or print the full transcript
docker run -it calganaygun/youtube-transcriber:latest -v <YouTube Video ID> \
-w <Search string | getAll: prints all of the video's content> \
-l <Language code, for example 'tr-TR'>
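
The three flags above could be parsed and applied roughly as in the following sketch. It is illustrative only: the `dest` names, the `run_query` helper, the segment-based transcript, and the case-insensitive matching are all assumptions. Only the `-v`, `-w`, and `-l` flags and the special value `getAll` come from this README.

```python
import argparse


def build_parser():
    """Build a parser for the three documented flags."""
    parser = argparse.ArgumentParser(prog="youtube-transcriber")
    parser.add_argument("-v", dest="video_id", required=True,
                        help="YouTube video ID")
    parser.add_argument("-w", dest="query", required=True,
                        help="search string, or 'getAll' for everything")
    parser.add_argument("-l", dest="language", required=True,
                        help="language code, for example 'tr-TR'")
    return parser


def run_query(segments, query):
    """Return all segments for 'getAll', else the segments containing query.

    Matching is case-insensitive here, which is an assumption.
    """
    if query == "getAll":
        return list(segments)
    needle = query.casefold()
    return [s for s in segments if needle in s.casefold()]


if __name__ == "__main__":
    args = build_parser().parse_args()
    # A real run would transcribe args.video_id in args.language first;
    # this fixed list is a stand-in transcript.
    demo_segments = ["Hello everyone", "welcome to the channel"]
    for line in run_query(demo_segments, args.query):
        print(line)
```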