Install the required Python libraries using pip. Run the following command in your terminal:
pip install -r requirements.txt

- Go to the Google Cloud Console.
- Create a new project or select an existing one.
- Enable the YouTube Data API v3:
  - Navigate to APIs & Services > Library.
  - Search for YouTube Data API v3 and enable it.
- Create an API key:
  - Go to APIs & Services > Credentials.
  - Click Create Credentials and select API Key.
  - Copy the generated API key.
- Go to the OpenAI Platform.
- Sign up or log in to your account.
- Navigate to API Keys under your account settings.
- Click Create new secret key and copy the generated API key.
- Open the script file (`youtube_transcript_scraper.py`) in a text editor.
- Replace the following placeholders with your API keys:
  - `YOUTUBE_API_KEY`: Replace with your YouTube Data API key.
  - `OPENAI_API_KEY`: Replace with your OpenAI API key.
- Replace the `channel_id` variable with the target YouTube channel ID. If you can't find the channel ID, paste the channel's URL into the YouTube Channel ID Grabber website and copy the channel ID from there.
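When the channel link already has the form `.../channel/<id>`, the ID can be read straight from the URL. The helper below is a minimal sketch and not part of the script: `extract_channel_id` is a hypothetical name, and the pattern assumes the common 24-character ID that starts with `UC`. Handle-style links (such as `/@name`) do not contain the ID, so the function returns `None` for them and a lookup tool is still needed.

```python
import re
from typing import Optional

# Assumption: channel IDs are 24 characters long and start with "UC".
CHANNEL_ID_PATTERN = re.compile(r"/channel/(UC[0-9A-Za-z_-]{22})")


def extract_channel_id(url: str) -> Optional[str]:
    """Return the channel ID embedded in a /channel/ URL, or None if absent."""
    match = CHANNEL_ID_PATTERN.search(url)
    return match.group(1) if match else None
```

If the function returns a value, paste it into the `channel_id` variable; if it returns `None`, fall back to the lookup website.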
Run the script using the following command in your terminal:
python youtube_transcript_scraper.py

The script will:
- Fetch all videos from the specified YouTube channel.
- Download transcripts for each video.
- Categorize the transcripts using ChatGPT.
- Save the transcripts in folders named after their categories.
- Generate a CSV file (`video_details.csv`) containing:
  - Video Title
  - Video Link
  - Transcript File Path
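The last two steps, saving each transcript under its category folder and writing the CSV, could look roughly like the sketch below. It is an illustration of the layout described in this README, not the script's actual code: the function names are hypothetical, and the rule for turning a title into a file name (runs of non-alphanumeric characters become underscores) is an assumption.

```python
import csv
import re
from pathlib import Path


def transcript_path(base_dir, category, video_id, title):
    """Build <base_dir>/<category>/<video_id>_<sanitized_title>.txt."""
    # Assumed naming rule: collapse anything that is not a letter or digit into "_".
    safe_title = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
    return Path(base_dir) / category / f"{video_id}_{safe_title}.txt"


def save_transcript(base_dir, category, video_id, title, text):
    """Write the transcript text, creating the category folder if needed."""
    path = transcript_path(base_dir, category, video_id, title)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path


def write_video_details(csv_path, rows):
    """Write (title, link, transcript path) rows under the three-column header."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Video Title", "Video Link", "Transcript File Path"])
        writer.writerows(rows)
```

For example, `transcript_path("transcripts", "Music", "dQw4w9WgXcQ", "Never Gonna Give You Up")` yields `transcripts/Music/dQw4w9WgXcQ_Never_Gonna_Give_You_Up.txt`.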
After running the script, the output directory (transcripts/) will look like this:
transcripts/
├── Music/
│ ├── dQw4w9WgXcQ_Never_Gonna_Give_You_Up.txt
├── Race/
│ ├── 9bZkp7q19f0_Gangnam_Style.txt
├── Technology/
│ ├── abc123_AI_Revolution.txt
└── video_details.csv

The `video_details.csv` file will contain rows like this:
| Video Title | Video Link | Transcript File Path |
|---|---|---|
| Never Gonna Give You Up | https://www.youtube.com/watch?v=dQw4w9WgXcQ | transcripts/Music/dQw4w9WgXcQ_Never_Gonna_Give_You_Up.txt |
| Gangnam Style | https://www.youtube.com/watch?v=9bZkp7q19f0 | transcripts/Race/9bZkp7q19f0_Gangnam_Style.txt |
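Because each transcript path includes its category folder, the CSV can be used to summarize a run. The sketch below counts videos per category; `count_by_category` is a hypothetical helper, and it assumes the CSV header matches the column names shown above and that paths use forward slashes.

```python
import csv
from collections import Counter
from pathlib import PurePosixPath


def count_by_category(csv_path):
    """Count rows per category, taking the category from the transcript's parent folder."""
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            category = PurePosixPath(row["Transcript File Path"]).parent.name
            counts[category] += 1
    return counts
```

Calling `count_by_category("transcripts/video_details.csv")` on the two example rows would give one video each for `Music` and `Race`.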
- Empty Video List: Ensure the channel ID is correct and the channel is public.
- API Key Errors: Double-check that the API keys are correct and have sufficient quota.
- Missing Transcripts: Some videos may not have transcripts available.
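Since some videos have no transcript, a run is more robust if one failure does not stop the whole batch. The following is a minimal sketch of that pattern, not the script's actual logic: `collect_transcripts` is a hypothetical name, and `fetch_transcript` stands in for whatever function the script uses to download a transcript for a video ID.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def collect_transcripts(
    video_ids: Iterable[str],
    fetch_transcript: Callable[[str], str],
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Fetch transcripts one by one, recording failures instead of aborting."""
    transcripts: Dict[str, str] = {}
    skipped: List[Tuple[str, str]] = []
    for video_id in video_ids:
        try:
            transcripts[video_id] = fetch_transcript(video_id)
        except Exception as error:  # broad on purpose: this is only a sketch
            # Keep the reason so missing transcripts can be reviewed later.
            skipped.append((video_id, str(error)))
    return transcripts, skipped
```

Printing the `skipped` list at the end of a run shows which videos were left out and why.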