This project provides subtitle scraping and download workflows across multiple providers using:
- A shared app layer in src/app/providers_app.zig
- A single user-facing binary: scrapers (CLI by default, TUI via --tui)
- A library surface exported from src/lib.zig
The runtime HTTP path uses Zig std.http.Client.
- Zig 0.15.2+ (as defined in build.zig.zon)
- Network access for provider queries/downloads
Build:

zig build

Install binaries:

zig build install
ls -l zig-out/bin/

Run all unit/integration tests configured in build.zig:

zig build test

Run CLI:

zig build run -- --list-providers
zig build run -- --provider subsource_net --query "The Matrix"

Run TUI:

zig build run-tui

Build cross-target binaries (installed into zig-out/bin):

zig build build-all-targets
zig build build-all-targets -Doptimize=ReleaseFast -Dstrip=true

build-all-targets currently emits:

- scrapers-x86_64-linux-gnu
- scrapers-aarch64-linux-gnu
- scrapers-x86_64-macos-none
- scrapers-aarch64-macos-none
- scrapers-x86_64-windows-gnu.exe

The canonical provider IDs accepted by the app layer and CLI are:
- subdl_com
- opensubtitles_com
- opensubtitles_org
- moviesubtitles_org
- moviesubtitlesrt_com
- podnapisi_net
- yifysubtitles_ch
- subtitlecat_com
- isubtitles_org
- my_subs_co
- subsource_net
- tvsubtitles_net
Only canonical provider IDs (and dotted/hyphenated site forms like subsource.net) are accepted.
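
One way to picture how dotted or hyphenated site forms could map onto canonical IDs is sketched below. This is an illustration only: `canonicalize` is a hypothetical helper, not the project's `parseProvider()`, and the rule it applies (rewrite `.` and `-` to `_`, then look up the result) is an assumption about how such forms might be accepted.

```zig
const std = @import("std");

// Canonical provider IDs as listed above.
const canonical_ids = [_][]const u8{
    "subdl_com",       "opensubtitles_com",  "opensubtitles_org",
    "moviesubtitles_org", "moviesubtitlesrt_com", "podnapisi_net",
    "yifysubtitles_ch", "subtitlecat_com",   "isubtitles_org",
    "my_subs_co",      "subsource_net",      "tvsubtitles_net",
};

/// Hypothetical helper (not the project's parseProvider): rewrites '.' and '-'
/// to '_' and returns the matching canonical ID, or null if none matches.
fn canonicalize(input: []const u8) ?[]const u8 {
    var buf: [64]u8 = undefined;
    if (input.len > buf.len) return null;
    const normalized = buf[0..input.len];
    for (input, normalized) |c, *out| {
        out.* = switch (c) {
            '.', '-' => '_',
            else => c,
        };
    }
    for (canonical_ids) |id| {
        if (std.mem.eql(u8, id, normalized)) return id;
    }
    return null;
}

pub fn main() void {
    const inputs = [_][]const u8{ "subsource_net", "subsource.net", "not-a-provider" };
    for (inputs) |input| {
        const resolved = canonicalize(input) orelse "(rejected)";
        std.debug.print("{s} -> {s}\n", .{ input, resolved });
    }
}
```

In real code, prefer `providers_app.parseProvider()`; the sketch only shows the shape of the lookup.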
Pagination is provider-specific in providers_app.
Search pagination supported:

- opensubtitles_org
- moviesubtitlesrt_com
- podnapisi_net
- isubtitles_org

Subtitles pagination supported:

- opensubtitles_org
- isubtitles_org

No pagination:

- my_subs_co
- tvsubtitles_net
- Others not listed in the pagination-supported sets above

For non-paginated providers, page 1 returns data and page >1 returns empty page results.
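
The pagination rules above can be expressed as a small lookup. The following is a minimal sketch, assuming 1-based page numbers (as in the library examples); `contains` and `pageCanHaveData` are hypothetical names and are not part of `providers_app`.

```zig
const std = @import("std");

// Providers with search pagination, as listed above.
const search_paginated = [_][]const u8{
    "opensubtitles_org", "moviesubtitlesrt_com", "podnapisi_net", "isubtitles_org",
};

// Providers with subtitles pagination, as listed above.
const subtitles_paginated = [_][]const u8{ "opensubtitles_org", "isubtitles_org" };

fn contains(set: []const []const u8, id: []const u8) bool {
    for (set) |entry| {
        if (std.mem.eql(u8, entry, id)) return true;
    }
    return false;
}

/// True when a request for `page` can return data: page 1 always can,
/// later pages only for providers in the given paginated set.
fn pageCanHaveData(set: []const []const u8, id: []const u8, page: usize) bool {
    if (page == 0) return false; // assumption: pages are 1-based
    return page == 1 or contains(set, id);
}

pub fn main() void {
    const ids = [_][]const u8{ "opensubtitles_org", "podnapisi_net", "my_subs_co" };
    for (ids) |id| {
        std.debug.print("{s}: search page 2 = {}, subtitles page 2 = {}\n", .{
            id,
            pageCanHaveData(&search_paginated, id, 2),
            pageCanHaveData(&subtitles_paginated, id, 2),
        });
    }
}
```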
Usage:
scrapers --provider <name> --query <text> [--title-index N] [--subtitle-index N] [--out-dir DIR] [--extract]
scrapers --list-providers
Options:
- --provider <name>: provider ID
- --query <text>: search query
- --title-index <N>: selected title from search results (default 0)
- --subtitle-index <N>: selected subtitle row (default: first downloadable row)
- --out-dir <DIR>: destination directory (default downloads)
- --extract: extract downloaded archive contents into out-dir (disabled by default)
- --list-providers: print all provider IDs
- --help, -h: print usage
Exit behavior:
- 0: success
- 1: runtime/provider failure (search/subtitle fetch/download)
- 2: argument/validation failure (invalid provider, missing value, index out of range)
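
The exit codes above split failures into two classes. A minimal sketch of that mapping follows; the `CliError` set and its member names are hypothetical and chosen only to mirror the failure kinds named in the list, not taken from the project's source.

```zig
const std = @import("std");

// Hypothetical error names mirroring the failure kinds listed above.
const CliError = error{
    InvalidProvider,
    MissingValue,
    IndexOutOfRange,
    SearchFailed,
    SubtitleFetchFailed,
    DownloadFailed,
};

/// Maps a failure to the documented exit code:
/// 2 for argument/validation failures, 1 for runtime/provider failures.
fn exitCodeFor(err: CliError) u8 {
    return switch (err) {
        error.InvalidProvider, error.MissingValue, error.IndexOutOfRange => 2,
        error.SearchFailed, error.SubtitleFetchFailed, error.DownloadFailed => 1,
    };
}

pub fn main() void {
    std.debug.print("invalid provider -> {d}\n", .{exitCodeFor(error.InvalidProvider)});
    std.debug.print("download failed  -> {d}\n", .{exitCodeFor(error.DownloadFailed)});
}
```

Scripts that wrap the CLI can use the same distinction: retrying makes sense for exit code 1, while exit code 2 points to a problem with the invocation itself.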
List providers:

zig build run -- --list-providers

Basic query and download (defaults to first title and first downloadable subtitle):

zig build run -- --provider subsource_net --query "The Matrix"

Query, download, and extract archive contents:

zig build run -- --provider subsource_net --query "The Matrix" --extract

Select explicit search and subtitle rows:

zig build run -- \
  --provider podnapisi_net \
  --query "Inception" \
  --title-index 1 \
  --subtitle-index 0

Download into a custom directory:

zig build run -- \
  --provider subdl_com \
  --query "Breaking Bad" \
  --out-dir /tmp/subtitles

Use the installed binary directly:

./zig-out/bin/scrapers --provider isubtitles_org --query "Interstellar"

Launch:

zig build run-tui
# or:
zig build run -- --tui

Flow:

- Select provider
- Enter query
- Select title
- Select subtitle row
- Confirm download (unless confirm is toggled off)
Global keys:
- Esc or q: back/quit depending on screen
- Ctrl+C: cancel current fetch/download
- F2: toggle download confirmation screen
- F3: toggle theme
List/navigation keys:
- j/k or arrow keys: move selection
- Enter: select
- /: filter mode
- s: cycle subtitle sort mode (subtitle list)
- [ and ]: previous/next page, only for providers with pagination support
my_subs_co and tvsubtitles_net do not expose pagination in the TUI.
src/lib.zig exports the app layer and provider modules:
const scrapers = @import("scrapers");

Common app-layer entry points:

- providers_app.providers()
- providers_app.parseProvider()
- providers_app.search()
- providers_app.searchPage()
- providers_app.fetchSubtitles()
- providers_app.fetchSubtitlesPage()
- providers_app.downloadSubtitleWithOptions()
- providers_app.downloadSubtitleWithProgressAndOptions()

const std = @import("std");
const scrapers = @import("scrapers");

pub fn main() !void {
    var gpa_state = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa_state.deinit();
    const allocator = gpa_state.allocator();

    var client: std.http.Client = .{ .allocator = allocator };
    defer client.deinit();

    // Search the provider; stop if nothing matched.
    var search = try scrapers.providers_app.search(
        allocator,
        &client,
        .subsource_net,
        "The Matrix",
    );
    defer search.deinit();
    if (search.items.len == 0) return;

    // Fetch subtitle rows for the first title.
    var subtitles = try scrapers.providers_app.fetchSubtitles(
        allocator,
        &client,
        search.items[0].ref,
    );
    defer subtitles.deinit();
    if (subtitles.items.len == 0) return;

    // Rows without a direct link cannot be downloaded.
    const selected = subtitles.items[0];
    if (selected.download_url == null) return;

    var result = try scrapers.providers_app.downloadSubtitleWithOptions(
        allocator,
        &client,
        selected,
        "downloads",
        .{ .extract_archive = false },
    );
    defer result.deinit(allocator);

    std.debug.print("saved to: {s}\n", .{result.file_path});
}

const std = @import("std");
const scrapers = @import("scrapers");

pub fn main() !void {
    var gpa_state = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa_state.deinit();
    const allocator = gpa_state.allocator();

    var client: std.http.Client = .{ .allocator = allocator };
    defer client.deinit();

    // Request the first page of search results.
    var page1 = try scrapers.providers_app.searchPage(
        allocator,
        &client,
        .opensubtitles_org,
        "The Office",
        1,
    );
    defer page1.deinit();

    // Request the next page only when the provider reports one.
    if (page1.has_next_page) {
        var page2 = try scrapers.providers_app.searchPage(
            allocator,
            &client,
            .opensubtitles_org,
            "The Office",
            2,
        );
        // page2 is used by the deferred deinit, so no explicit discard is
        // needed (Zig rejects `_ = page2;` here as a pointless discard).
        defer page2.deinit();
    }
}

const std = @import("std");
const scrapers = @import("scrapers");

// Called when the download moves to a new phase.
fn onPhase(_: ?*anyopaque, phase: scrapers.providers_app.DownloadPhase) void {
    std.debug.print("phase: {s}\n", .{@tagName(phase)});
}

// Called with the completed and total unit counts.
fn onUnits(_: ?*anyopaque, done: usize, total: usize) void {
    std.debug.print("progress: {d}/{d}\n", .{ done, total });
}

pub fn main() !void {
    var gpa_state = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa_state.deinit();
    const allocator = gpa_state.allocator();

    var client: std.http.Client = .{ .allocator = allocator };
    defer client.deinit();

    var search = try scrapers.providers_app.search(allocator, &client, .subsource_net, "The Matrix");
    defer search.deinit();
    if (search.items.len == 0) return;

    var subs = try scrapers.providers_app.fetchSubtitles(allocator, &client, search.items[0].ref);
    defer subs.deinit();
    if (subs.items.len == 0 or subs.items[0].download_url == null) return;

    const progress = scrapers.providers_app.DownloadProgress{
        .on_phase = onPhase,
        .on_units = onUnits,
    };

    var result = try scrapers.providers_app.downloadSubtitleWithProgressAndOptions(
        allocator,
        &client,
        subs.items[0],
        "downloads",
        &progress,
        .{ .extract_archive = false },
    );
    defer result.deinit(allocator);
}

- Downloaded filenames retain extensions where available.
- If needed, extension fallback is inferred from URL/content.
- For archive formats (.zip/.rar/other recognized archive patterns), the saved file is preserved as an archive artifact and surfaced via DownloadResult.archive_path.
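
To illustrate the archive rule, here is a minimal sketch of an extension check. `hasExtension` and `looksLikeArchive` are hypothetical helpers; the sketch covers only `.zip` and `.rar` and does not attempt the other archive patterns or the URL/content-based fallback the project applies.

```zig
const std = @import("std");

/// Case-insensitive check that `name` ends with `ext` (for example ".zip").
fn hasExtension(name: []const u8, ext: []const u8) bool {
    if (name.len < ext.len) return false;
    return std.ascii.eqlIgnoreCase(name[name.len - ext.len ..], ext);
}

/// Hypothetical classifier: only the two extensions named above are checked.
fn looksLikeArchive(name: []const u8) bool {
    return hasExtension(name, ".zip") or hasExtension(name, ".rar");
}

pub fn main() void {
    const names = [_][]const u8{ "The.Matrix.1999.zip", "episode.RAR", "subtitle.srt" };
    for (names) |name| {
        std.debug.print("{s}: archive = {}\n", .{ name, looksLikeArchive(name) });
    }
}
```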
Runtime/provider controls:

- SUBSOURCE_CF_CLEARANCE: optional Cloudflare clearance token for subsource.net
- SUBSOURCE_USER_AGENT: override User-Agent for subsource.net requests
- SUBDL_CF_HEADLESS: toggles headless browser behavior in Cloudflare handling paths

Live test controls:

- SCRAPERS_LIVE_PROVIDER_FILTER: provider filter for live test runs
- SCRAPERS_LIVE_PROVIDERS: alternative provider filter variable used by common test helpers
- SCRAPERS_LIVE_INCLUDE_CAPTCHA: include captcha/cloudflare providers in live runs
- SCRAPERS_LIVE_BATCH: enables batch mode behavior in live app tests

Debug flags:

- SCRAPERS_DEBUG_TIMING
- SCRAPERS_SELECTOR_DEBUG
- SCRAPERS_DEBUG_ISUB
- SCRAPERS_DEBUG_OPENSUB_ORG_DOH
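
Several of the variables above are optional. A minimal sketch of reading an optional variable with the standard library follows; `optionalEnv` is a hypothetical helper, and how the project itself reads these variables is not shown here.

```zig
const std = @import("std");

/// Returns the value of `name`, or null when the variable is unset.
/// The caller owns the returned slice and must free it.
fn optionalEnv(allocator: std.mem.Allocator, name: []const u8) !?[]u8 {
    const value = std.process.getEnvVarOwned(allocator, name) catch |err| switch (err) {
        error.EnvironmentVariableNotFound => return null,
        else => |e| return e,
    };
    return value;
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    const names = [_][]const u8{ "SUBSOURCE_CF_CLEARANCE", "SUBSOURCE_USER_AGENT" };
    for (names) |name| {
        if (try optionalEnv(allocator, name)) |value| {
            defer allocator.free(value);
            // Print only the length so a clearance token is not echoed.
            std.debug.print("{s}: set ({d} bytes)\n", .{ name, value.len });
        } else {
            std.debug.print("{s}: not set\n", .{name});
        }
    }
}
```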
Run smoke live suite (the provider wildcard is quoted so the shell does not try to expand it):

zig build test-live -Dlive=smoke "-Dlive-providers=*" -Dlive-include-captcha=false

Run extensive live suite for one provider:

zig build test-live-single -Dlive=extensive -Dlive-providers=subsource.net

Run all live providers (including captcha/cloudflare targets):

zig build test-live-all

Run parallel provider fan-out when supported:

zig build test-live -Dlive=all "-Dlive-providers=*" -Dlive-include-captcha=true -Dlive-parallel-on-all=true

Build flags you can toggle from zig build:

- -Doptimize=Debug|ReleaseSafe|ReleaseFast|ReleaseSmall
- -Dstrip=true|false
- -Dsingle-threaded=auto|on|off
- -Domit-frame-pointer=auto|on|off
- -Derror-tracing=auto|on|off
- -Dpic=auto|on|off

build-all-targets defaults when omitted:

- -Doptimize -> ReleaseFast
- -Dstrip -> true

No results:
- Validate provider and query.
- Retry with another provider to isolate provider-side issues.
- For paginated providers, test page 1 first.
Cloudflare/session errors:
- Set SUBSOURCE_CF_CLEARANCE and SUBSOURCE_USER_AGENT when required by subsource.net.
- For captcha-heavy providers, use -Dlive-include-captcha=true only when you intend to test those paths.
Download has no direct URL:
- Some rows are listing entries without direct links.
- Choose another subtitle row where download_url is present.
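
Picking a row with a direct link can be sketched as a simple scan. The `Row` struct below is a hypothetical stand-in for the library's subtitle row type; only the optional `download_url` field is taken from the text above.

```zig
const std = @import("std");

// Hypothetical stand-in for a subtitle row; only download_url matters here.
const Row = struct {
    label: []const u8,
    download_url: ?[]const u8,
};

/// Returns the index of the first row that has a direct download link,
/// or null when no row is downloadable.
fn firstDownloadable(rows: []const Row) ?usize {
    for (rows, 0..) |row, i| {
        if (row.download_url != null) return i;
    }
    return null;
}

pub fn main() void {
    const rows = [_]Row{
        .{ .label = "listing entry", .download_url = null },
        .{ .label = "English", .download_url = "https://example.invalid/sub.zip" },
    };
    if (firstDownloadable(&rows)) |index| {
        std.debug.print("use --subtitle-index {d} ({s})\n", .{ index, rows[index].label });
    } else {
        std.debug.print("no downloadable rows\n", .{});
    }
}
```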
Key paths:
- src/lib.zig: public library exports
- src/app/providers_app.zig: unified provider app layer
- src/cmd/cli.zig: CLI entrypoint
- src/cmd/tui.zig: TUI entrypoint
- src/scrapers/*.zig: provider implementations
- build.zig: build graph, binaries, and test steps