A powerful YouTube transcript extractor that retrieves full transcripts, subtitles, and captions from YouTube videos. Returns structured JSON metadata and text suitable for search, AI/ML pipelines, SEO, and content workflows.

Akash9078/youtube-transcript-extractor

YouTube Transcript Extractor Actor

Overview

Property        | Value
Actor           | akash9078/youtube-transcript-extractor
Developer       | akash9078 (Community)
Rating          | 5.00/5
Runs Succeeded  | 99.8%
Total Users     | 153
Monthly Users   | 34
Categories      | Automation, Developer Tools, Videos
Last Modified   | 2026-02-27

Features

  • 99%+ Accuracy - Extract transcripts with high precision
  • Multi-format Support - Works with videos, Shorts, and ended Live streams
  • Multi-language Support - Detects and extracts captions in multiple languages (manual or auto-generated)
  • URL Format Flexibility - Supports all YouTube URL formats:
    • youtube.com/watch?v=
    • youtu.be/
    • youtube.com/shorts/
    • youtube.com/live/
  • Cloud-based Extraction - Automatic retry logic and proxy protection
  • Clean JSON Output - Structured metadata and text for downstream processing
  • No API Quota Limits - Extracts transcripts without using YouTube Data API
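
The URL formats listed above all carry the same video ID in different places. As an illustration only, the following Python sketch shows one way a client could normalize them before submitting input. The function name `extract_video_id` is hypothetical, and this is not the Actor's own parsing logic, which the source does not describe.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the video ID from one of the URL shapes listed above, or None."""
    value = url_or_id.strip()
    if not value:
        return None
    # Assumption: a value with no "/" or "." is already a bare video ID.
    if "/" not in value and "." not in value:
        return value
    # urlparse only finds the host when a scheme is present.
    if "://" not in value:
        value = "https://" + value
    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]
    segments = [s for s in parsed.path.split("/") if s]
    if host == "youtu.be":
        # youtu.be/<id>
        return segments[0] if segments else None
    if host == "youtube.com":
        if parsed.path == "/watch":
            # youtube.com/watch?v=<id>
            return parse_qs(parsed.query).get("v", [None])[0]
        if len(segments) >= 2 and segments[0] in ("shorts", "live"):
            # youtube.com/shorts/<id> and youtube.com/live/<id>
            return segments[1]
    return None
```

Unrecognized hosts and paths return `None` rather than raising, so a batch job can skip bad rows and continue.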

Use Cases

Content & Marketing

  • Content Repurposing - Convert video content into blog posts, social captions, and newsletters
  • SEO Optimization - Extract keywords and trending topics from transcripts
  • Competitor Analysis - Analyze competitor video content and strategies

AI & Data

  • AI/ML Training Data - Build text datasets for training language models
  • Retrieval-Augmented Generation (RAG) - Feed transcript text into vector stores for AI search
  • Sentiment Analysis - Process transcript text for audience sentiment insights
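
For the RAG use case, transcript text is usually split into overlapping chunks before embedding. The sketch below is a minimal word-based chunker; the function name `chunk_words` and the default sizes are illustrative assumptions, not part of the Actor.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once the current chunk has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side; suitable values depend on the embedding model in use.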

Research & Accessibility

  • Academic Research - Analyze video content for papers and studies
  • Accessibility Compliance - Generate transcripts for ADA/WCAG requirements

Business

  • Internal Training - Turn training videos into searchable documentation
  • Lead Generation - Identify prospects discussing industry topics in videos
  • API Integration - Programmatic extraction for batch processing

Getting Started

Using Apify Console

  1. Go to akash9078/youtube-transcript-extractor
  2. Enter a YouTube video URL
  3. Click Run

Using REST API

curl -X POST "https://api.apify.com/v2/acts/akash9078~youtube-transcript-extractor/runs" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "videoUrl": "https://youtu.be/WQNgQVRG9_U",
    "language": "en"
  }'
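
The `/runs` endpoint starts a run and returns before the transcript is ready. For short jobs, the Apify API also offers a `run-sync-get-dataset-items` endpoint that waits for the run and returns the dataset items in one response. The following Python sketch uses only the standard library; the helper names `build_run_request` and `fetch_transcript_items` are hypothetical, and the 300-second timeout is an assumption rather than a documented value for this Actor.

```python
import json
import urllib.request
from typing import Any, List, Optional

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "akash9078~youtube-transcript-extractor"


def build_run_request(token: str, video_url: str,
                      language: Optional[str] = None) -> urllib.request.Request:
    """Build the POST request without sending it."""
    payload = {"videoUrl": video_url}
    if language:
        payload["language"] = language
    return urllib.request.Request(
        f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_transcript_items(token: str, video_url: str,
                           language: Optional[str] = None) -> List[Any]:
    """Run the Actor synchronously and return the parsed dataset items."""
    request = build_run_request(token, video_url, language)
    # Timeout is an assumed upper bound; adjust it to your workload.
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)
```

Calling `fetch_transcript_items("YOUR_APIFY_TOKEN", "https://youtu.be/WQNgQVRG9_U", "en")` sends the request and incurs the per-event charges for that run.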

Using JavaScript SDK

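const { ApifyClient } = require('apify-client');

// Replace with your own API token from Apify Console.
const client = new ApifyClient({
  token: 'YOUR_APIFY_TOKEN'
});

const input = {
  videoUrl: 'https://youtu.be/WQNgQVRG9_U',
  language: 'en'
};

// CommonJS has no top-level await, so run the calls inside an async function.
async function main() {
  // Start the Actor and wait for the run to finish.
  const run = await client.actor('akash9078/youtube-transcript-extractor').call(input);
  // Read the results from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  console.log(items);
}

main().catch(console.error);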

Using Python SDK

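from apify_client import ApifyClient

# Replace with your own API token from Apify Console.
client = ApifyClient('YOUR_APIFY_TOKEN')

# Named run_input to avoid shadowing Python's built-in input().
run_input = {
    'videoUrl': 'https://youtu.be/WQNgQVRG9_U',
    'language': 'en'
}

# Start the Actor and wait for the run to finish; run_input is a keyword argument.
run = client.actor('akash9078/youtube-transcript-extractor').call(run_input=run_input)
# list_items() returns a page object; the records are in its .items attribute.
dataset = client.dataset(run['defaultDatasetId']).list_items()
print(dataset.items)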

Input Schema

{
  "title": "YouTube Transcript Extractor Input",
  "type": "object",
  "schemaVersion": 1,
  "properties": {
    "videoUrl": {
      "title": "Video URL",
      "description": "YouTube video URL or video ID. Supports all URL formats: youtube.com/watch?v=, youtu.be/, youtube.com/shorts/, youtube.com/live/, etc.",
      "type": "string",
      "prefill": "https://youtu.be/WQNgQVRG9_U"
    },
    "language": {
      "title": "Preferred Language",
      "description": "Preferred language code for transcript extraction (e.g., 'en' for English, 'es' for Spanish). If not specified, will auto-detect the best available language.",
      "type": "string",
      "prefill": "en"
    }
  },
  "required": ["videoUrl"],
  "additionalProperties": false
}

Input Parameters

Parameter | Type   | Required | Description
videoUrl  | string | Yes      | YouTube video URL or video ID
language  | string | No       | Preferred language code (e.g., 'en', 'es', 'fr')
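
Because the schema sets `additionalProperties` to false and requires `videoUrl`, it can help to check input locally before starting a paid run. The sketch below mirrors those two rules in Python; `validate_input` is a hypothetical helper, and the platform's own validation remains authoritative.

```python
from typing import Any, Dict

ALLOWED_KEYS = {"videoUrl", "language"}


def validate_input(data: Dict[str, Any]) -> Dict[str, str]:
    """Return a cleaned copy of the input, or raise ValueError if it breaks the schema."""
    unknown = set(data) - ALLOWED_KEYS
    if unknown:
        # The schema declares additionalProperties: false.
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    video_url = data.get("videoUrl")
    if not isinstance(video_url, str) or not video_url.strip():
        raise ValueError("videoUrl is required and must be a non-empty string")
    cleaned = {"videoUrl": video_url.strip()}
    language = data.get("language")
    if language is not None:
        if not isinstance(language, str):
            raise ValueError("language must be a string")
        # An empty language is dropped so the Actor can auto-detect.
        if language.strip():
            cleaned["language"] = language.strip()
    return cleaned
```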

Output Schema

The actor returns structured JSON containing:

  • Transcript Text - Full transcript content
  • Detected Language - Language of the transcript
  • Video Identifiers - Video ID, title, etc.
  • Extraction Metadata - Timing and timestamps
  • Clean Searchable Text - Ready for downstream processing

Pricing

This Actor uses pay-per-event pricing. You are not charged for Apify platform usage; you pay only a fixed price for each of the following events:

Event                      | Price
Actor Start                | $0.00005 per event
Result (single in dataset) | $0.00001 per event
Transcript Extracted       | $0.01 per event
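
As a rough budgeting aid, the event prices above can be combined into a per-batch estimate. The sketch below assumes one run per video (the input accepts a single `videoUrl`) and that each successful run emits exactly one Actor Start, one Result, and one Transcript Extracted event. That event count is an assumption, not something the pricing table states; `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Prices in USD, copied from the pricing table.
PRICE_ACTOR_START = Decimal("0.00005")
PRICE_RESULT = Decimal("0.00001")
PRICE_TRANSCRIPT = Decimal("0.01")


def estimate_cost(videos: int) -> Decimal:
    """Estimate the USD cost of extracting `videos` transcripts, one run per video."""
    if videos < 0:
        raise ValueError("videos must be non-negative")
    # Assumed: each run triggers one event of each type.
    per_video = PRICE_ACTOR_START + PRICE_RESULT + PRICE_TRANSCRIPT
    return per_video * videos
```

Under these assumptions one video costs $0.01006 and 1,000 videos cost $10.06. `Decimal` is used so that the very small per-event prices are not distorted by floating-point rounding.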

Example Output

{
  "videoId": "WQNgQVRG9_U",
  "title": "Example Video Title",
  "transcript": "This is the full transcript text...",
  "language": "en",
  "extractedAt": "2026-02-28T12:00:00Z",
  "duration": 1200,
  "hasTimestamps": true
}
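
Downstream code is often easier to maintain when dataset items are parsed into a typed record. The sketch below maps the example output onto a dataclass; the class name `TranscriptItem` and the function `parse_item` are hypothetical, and treating `duration` as seconds is an assumption, since the source does not state the unit.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict


@dataclass
class TranscriptItem:
    video_id: str
    title: str
    transcript: str
    language: str
    extracted_at: datetime
    duration: int  # assumed to be seconds; the unit is not documented
    has_timestamps: bool


def parse_item(raw: Dict[str, Any]) -> TranscriptItem:
    """Convert one dataset item shaped like the example output into a TranscriptItem."""
    return TranscriptItem(
        video_id=raw["videoId"],
        title=raw["title"],
        transcript=raw["transcript"],
        language=raw["language"],
        # Older Python versions reject a trailing "Z", so convert it to an explicit UTC offset.
        extracted_at=datetime.fromisoformat(raw["extractedAt"].replace("Z", "+00:00")),
        duration=int(raw["duration"]),
        has_timestamps=bool(raw["hasTimestamps"]),
    )
```

A missing field raises `KeyError`, which surfaces schema drift early instead of passing incomplete records along.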
