Parallelize metadata scanning #89

breenyoung · 2025-12-29T19:28:47Z

Summary

This PR introduces a new parallel metadata writer for SCAN jobs and adds rich error reporting for both SCAN and THUMBNAIL pipelines, along with UI support in the admin Job History page.

The goals are:

Speed up library scans with a proper parallel metadata pipeline.
Make it clear which files/comics failed and why.
Let admins choose whether to use parallel scanning and how many workers to run.
Surface that information directly in the UI instead of forcing people into logs.

Key Changes

🧠 New parallel metadata writer (SCAN)

Implemented a dedicated metadata writer process that:
- Receives worker results via a queue.
- Applies metadata updates in batches using a single DB session.
- Mirrors the behavior and invariants of the original synchronous scanner as closely as possible.
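
For readers who want a picture of the writer loop, here is a minimal sketch of the queue-and-batch pattern described above. It is not the code in this PR: `run_writer`, `SENTINEL`, `batch_size`, and the single-argument `apply_batch` callback are hypothetical names chosen for illustration, and the real writer presumably does its DB work inside one session where this sketch just calls the callback.

```python
import queue
from typing import Any, Callable, List

# Hypothetical marker a producer puts on the queue when no more results follow.
SENTINEL = None


def run_writer(
    results: "queue.Queue",
    apply_batch: Callable[[List[Any]], None],
    batch_size: int = 50,
) -> int:
    """Drain `results` until SENTINEL, passing items to `apply_batch` in batches.

    `results` only needs a blocking .get(), so a multiprocessing.Queue
    would work the same way. Returns the number of items written.
    """
    batch: List[Any] = []
    written = 0
    while True:
        item = results.get()
        if item is SENTINEL:
            break
        batch.append(item)
        if len(batch) >= batch_size:
            apply_batch(batch)  # assumed to write the whole batch in one go
            written += len(batch)
            batch = []
    if batch:
        # Flush whatever is left once the workers are done.
        apply_batch(batch)
        written += len(batch)
    return written
```

Keeping all writes in one consumer means workers never touch the database themselves, which is one plausible way to preserve the ordering guarantees of the synchronous scanner.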
Introduced a structured error_details list for SCAN jobs, containing per-file failures:
- file_path
- message

Example SCAN job summary:

{
  "imported": 10,
  "updated": 5,
  "errors": 2,
  "skipped": 0,
  "error_details": [
    { "file_path": "/path/to/bad.cbz", "message": "Missing ComicInfo.xml" },
    { "file_path": "/path/to/other.cbz", "message": "Failed to parse metadata" }
  ]
}
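
A summary of this shape could be accumulated roughly as follows. This is a sketch under assumptions: the `ScanSummary` class and its `record_error` method are hypothetical, and only the field names are taken from the example above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ScanSummary:
    """Hypothetical container mirroring the fields of the example summary."""

    imported: int = 0
    updated: int = 0
    errors: int = 0
    skipped: int = 0
    error_details: List[Dict[str, str]] = field(default_factory=list)

    def record_error(self, file_path: str, message: str) -> None:
        # Keep the counter and the per-file detail list in step.
        self.errors += 1
        self.error_details.append({"file_path": file_path, "message": message})


summary = ScanSummary(imported=10, updated=5)
summary.record_error("/path/to/bad.cbz", "Missing ComicInfo.xml")
summary.record_error("/path/to/other.cbz", "Failed to parse metadata")
summary_json = json.dumps(asdict(summary), indent=2)
```

Routing every failure through one method keeps `errors` equal to the length of `error_details`, as in the example.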

breenyoung · 2025-12-29T19:31:17Z

Release Notes WIP

🚀 New: Parallel Metadata Scans + Better Job Errors

This release introduces a new parallel metadata scanner and much better error visibility for both scan and thumbnail jobs.

🧠 New parallel metadata writer for SCAN jobs

Metadata scanning now uses a dedicated writer process to apply changes in batches.
This speeds up scans on larger libraries while keeping behavior aligned with the previous single-process scanner.
SCAN jobs now include detailed error_details in their summary:
- Which files failed.
- Why they failed (e.g., missing ComicInfo.xml, parse errors).

You can see these details directly in the Job History UI.

New admin settings:
- Enable parallel metadata scanning
  - Turn the parallel path on or off.
  - Leave it off if you prefer the simpler, single-process behavior.
- Parallel metadata workers
  - Set a specific number of worker processes.
  - Or leave it at 0/empty to let Parker choose a safe value based on your CPU cores.

This gives you control over performance vs. resource usage, instead of baking in a fixed parallelism strategy.
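
To illustrate how a "0/empty means automatic" setting can be resolved, here is a small sketch. The function name, the "cores minus one" heuristic, and the cap of 8 are all assumptions made for this example; the notes above do not say how Parker actually picks its safe value.

```python
import os
from typing import Optional


def resolve_worker_count(
    configured: Optional[int],
    cpu_count: Optional[int] = None,
    cap: int = 8,
) -> int:
    """Return the worker count to use for a parallel scan.

    A positive `configured` value is used as-is. Zero or None falls back to
    an automatic choice: one less than the core count, kept within 1..cap.
    The heuristic and the cap are illustrative assumptions.
    """
    if configured is not None and configured > 0:
        return configured
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, min(cores - 1, cap))
```

Leaving one core free is a common way to keep the rest of the server responsive during a scan, but the right default depends on the host.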

🖼 Thumbnail job error details

Thumbnail generation now reports per-comic errors instead of just an error count.
For each failure you’ll see:
- Comic ID.
- File path.
- A plain-text error message (e.g., image load failures, unsupported format).
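
As a rough sketch of how per-comic errors like these can be collected, the loop below catches each failure and records it instead of only incrementing a counter. The function name, the `make_thumbnail` callback, and the dictionary keys (`comic_id`, `file_path`, `message`, `processed`) are assumptions for illustration, not the names used in the actual job code.

```python
from typing import Any, Callable, Dict, Iterable, Tuple


def generate_thumbnails(
    comics: Iterable[Tuple[int, str]],
    make_thumbnail: Callable[[str], None],
) -> Dict[str, Any]:
    """Run `make_thumbnail` for each (comic_id, file_path) pair.

    A failure on one comic is recorded and the loop moves on, so a single
    bad file does not abort the whole job.
    """
    summary: Dict[str, Any] = {"processed": 0, "errors": 0, "error_details": []}
    for comic_id, file_path in comics:
        try:
            make_thumbnail(file_path)
            summary["processed"] += 1
        except Exception as exc:
            summary["errors"] += 1
            summary["error_details"].append(
                {"comic_id": comic_id, "file_path": file_path, "message": str(exc)}
            )
    return summary
```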

📊 Job History UI upgrades

The job details modal now displays error details for both SCAN and THUMBNAIL jobs.
The layout adapts based on job type:
- SCAN jobs → compact table view (easy to skim many file-level errors).
- THUMBNAIL jobs → detailed cards (more context per comic).

Overall, scans should be faster on larger libraries, and when something goes wrong, you'll be able to see exactly which files or comics failed and why.

breenyoung added 10 commits December 28, 2025 20:33

new workers for parallel metadata processing

4fba63c

new settings for parallel metadata processing

39f4263

enable parallel options for metadata scanning

26e9ceb

WIP

f96c936

WIP

489f858

remove force arg as writer doesnt need it

4f70e89

implement correct db.flush order and is_dirty flag

3fd2f87

comment out debug

ac6bc25

add detailed errors during scan

d09b78a

add detailed errors during thumbnailing

67f80a7

breenyoung added 2 commits December 31, 2025 21:32

fix missing arg to apply_batch

249208d

update last_scanned on library when done

067c0ce


2 participants
