
Copilot AI commented Jan 8, 2026

Summary

  • Implemented multi-provider metadata system with Myrient (No-Intro collection) as primary source and Crocdb as fallback
  • Addresses Crocdb's current unavailability by parsing Myrient's HTML directory listings for 20+ gaming platforms
  • Provider abstraction enables easy addition of future sources
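
As a rough illustration of the directory-listing parsing mentioned above, the sketch below pulls entries out of an "Index of ..." style page. It assumes every entry is a plain `<a href="...">` link with a relative, URL-encoded href; the markup Myrient actually serves may differ, and `ListingEntry` and `parseDirectoryListing` are hypothetical names, not the PR's code.

```typescript
// Hypothetical shape for one row of a directory listing.
interface ListingEntry {
  name: string; // decoded file or directory name
  href: string; // raw href as found in the page
  isDirectory: boolean;
}

// Extract entries from an "Index of ..." style page.
// Assumption: entries are plain <a href="..."> links with relative hrefs.
function parseDirectoryListing(html: string): ListingEntry[] {
  const entries: ListingEntry[] = [];
  const anchor = /<a\s+[^>]*href="([^"]+)"[^>]*>/gi;
  for (const match of html.matchAll(anchor)) {
    const href = match[1];
    // Skip column-sort links, parent/self links, and absolute URLs.
    if (
      href.startsWith("?") ||
      href.startsWith("/") ||
      href.startsWith("../") ||
      href === "./" ||
      /^[a-z][a-z0-9+.-]*:/i.test(href)
    ) {
      continue;
    }
    const isDirectory = href.endsWith("/");
    const raw = isDirectory ? href.slice(0, -1) : href;
    let name = raw;
    try {
      name = decodeURIComponent(raw);
    } catch {
      // Malformed percent-encoding: keep the raw href as the name.
    }
    entries.push({ name, href, isDirectory });
  }
  return entries;
}
```

A regular expression is enough for a flat, machine-generated index page; a real HTML parser would be the safer choice if the markup turns out to be less regular.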

Implementation

Provider Interface

  • IMetadataProvider standardizes platform listing, search, and entry retrieval across sources
  • MetadataService orchestrates automatic fallback chain: Myrient → Crocdb → error
  • Filename-based metadata extraction (title, region, version) from No-Intro naming conventions
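
A minimal sketch of how the provider interface and the Myrient → Crocdb → error chain could fit together. Only the names `IMetadataProvider` and `MetadataService` come from this PR; the method signatures, the `MetadataEntry` shape, and the error message are assumptions made for illustration.

```typescript
// Hypothetical entry shape; the PR's actual fields are not shown here.
interface MetadataEntry {
  id: string;
  title: string;
  platform: string;
  region?: string;
}

// Assumed method set, loosely following the issue's proposed approach.
interface IMetadataProvider {
  readonly name: string;
  listPlatforms(): Promise<string[]>;
  search(query: string, platformId?: string): Promise<MetadataEntry[]>;
}

class MetadataService {
  private readonly providers: IMetadataProvider[];

  // Providers are tried in the order given, e.g. [myrient, crocdb].
  constructor(providers: IMetadataProvider[]) {
    this.providers = providers;
  }

  async search(query: string, platformId?: string): Promise<MetadataEntry[]> {
    const failures: string[] = [];
    for (const provider of this.providers) {
      try {
        // The first provider that answers without throwing wins.
        return await provider.search(query, platformId);
      } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        failures.push(`${provider.name}: ${reason}`);
      }
    }
    // Every provider failed: surface one combined error.
    throw new Error(`All metadata providers failed (${failures.join("; ")})`);
  }
}
```

In this sketch a provider that returns an empty list counts as a success, so fallback happens only when a provider throws. Because providers are plain objects, a test can pass in stubs and never touch the network.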

API Routes (/providers/*)

  • GET /platforms - List available platforms
  • POST /entries - List games for platform with pagination
  • POST /search - Search with platform/region filters
  • POST /search-all - Query all providers and merge results
  • POST /entry - Retrieve single entry by ID
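
One way the /search-all behaviour ("query all providers and merge results") might be implemented is sketched below. The `SearchResult` shape, the `SearchFn` type, and the choice to deduplicate on a lowercased title are assumptions, not details taken from the PR.

```typescript
// Hypothetical result shape for illustration.
interface SearchResult {
  id: string;
  title: string;
  provider: string;
}

type SearchFn = (query: string) => Promise<SearchResult[]>;

// Query every provider in parallel and merge whatever comes back.
async function searchAll(query: string, providers: SearchFn[]): Promise<SearchResult[]> {
  const settled = await Promise.allSettled(providers.map((search) => search(query)));
  const merged = new Map<string, SearchResult>();
  for (const outcome of settled) {
    if (outcome.status !== "fulfilled") continue; // a failing provider is skipped
    for (const result of outcome.value) {
      const key = result.title.toLowerCase();
      // Earlier providers take precedence when titles collide.
      if (!merged.has(key)) merged.set(key, result);
    }
  }
  return [...merged.values()];
}
```

`Promise.allSettled` keeps one unavailable source from failing the whole request, which matters while Crocdb is offline.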

Platform Coverage

  • Nintendo: GB, GBC, GBA, NES, SNES, N64, GameCube, Wii, DS, 3DS
  • Sega: Master System, Genesis, Game Gear, Saturn, Dreamcast
  • Sony: PS1, PS2, PSP
  • Atari: 2600, 7800, Lynx

Caching

  • Platform lists cached 1 hour per provider
  • Reduces network load and improves response times
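
The one-hour platform-list cache could look roughly like the following in-memory TTL cache. The class name, the `getOrLoad` method, and the injectable clock are hypothetical; the PR does not show how its cache is built.

```typescript
const ONE_HOUR_MS = 60 * 60 * 1000;

// Minimal in-memory cache with a fixed time-to-live per entry.
class TtlCache<T> {
  private readonly store = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so tests can control time.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value if still fresh, otherwise load and store it.
  async getOrLoad(key: string, load: () => Promise<T>): Promise<T> {
    const hit = this.store.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.value;
    const value = await load();
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

Keying the cache by provider name, for example `cache.getOrLoad("myrient", fetchPlatforms)` with a hypothetical `fetchPlatforms` loader, would give each provider its own one-hour window.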

Testing

  • 17 unit tests for Myrient provider (parsing, search, regions, caching)
  • All 99 tests passing
  • Mock-based testing enables offline development

Checklist

  • Added a semantic version comment to this PR using /semver: patch, /semver: minor, or /semver: major. (See template for examples)
  • Confirmed workflows and automation updates (if any) have appropriate permissions.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • api.crocdb.net
    • Triggering command: /usr/local/bin/node /usr/local/bin/node --require /home/REDACTED/work/jacare/jacare/node_modules/tsx/dist/preflight.cjs --import file:///home/REDACTED/work/jacare/jacare/node_modules/tsx/dist/loader.mjs src/index.ts --global de credential.helpesh (dns block)
  • myrient.erista.me
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node /home/REDACTED/work/_temp/ghcca-node/node/bin/node --enable-source-maps /home/REDACTED/work/_temp/copilot-developer-action-main/dist/index.js (dns block)
    • Triggering command: /usr/local/bin/node /usr/local/bin/node --require /home/REDACTED/work/jacare/jacare/node_modules/tsx/dist/preflight.cjs --import file:///home/REDACTED/work/jacare/jacare/node_modules/tsx/dist/loader.mjs src/index.ts --global de credential.helpesh (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

This section details the original issue you should resolve

<issue_title>Discovery: Replace/augment crocdb data source</issue_title>
<issue_description>

Problem

crocdb is currently offline, which breaks/limits metadata lookup and/or discovery flows.

Goal

Add at least one alternative metadata/discovery source that is:

  • Reliable enough to depend on
  • Low maintenance (limited dev time)
  • Doesn’t require mirroring/hosting any datasets in this repo
  • Can be swapped/combined with other sources

Candidate sources

1) Myrient (https://myrient.erista.me/files/)

  • Appears to expose static “Index of …” directory listings (good for discovery).
  • Has an FAQ that mentions third-party downloader tooling and that URLs can change as content is reorganized.

2) Vimm’s Vault (https://vimm.net/vault)

  • Curated and friendly to browse, but automation likely needs heavier scraping / browser automation and may be more fragile.

Research to do

A) Myrient: how to locate files (discovery-only)

  • Map the /files/ taxonomy (e.g., No-Intro / Redump) and which directories best match the project’s platform model.
  • Verify what metadata is available directly in listings (filename, size, date) and how consistent it is across collections.
  • Identify how often URLs break/move and how to make the provider resilient (caching, refresh, retries).

Note: This project should avoid implementing “download ROMs” features. Keep this provider focused on indexing & matching only.

B) Third-party libraries / existing projects

  • Check if any maintained npm packages exist for Myrient/Vimm scraping (likely none; may need custom provider).
  • Review existing open-source tools that already parse Myrient listings (even if not npm) to copy patterns safely:
    • myrient-scrape (Python CLI)
    • “Myrient Search Engine” repo ideas (frontend/indexing)

C) Decide “best” for Jacare

Compare Myrient vs Vimm on:

  • Stability of URLs / HTML structure
  • Ease of parsing (static index pages vs dynamic pages)
  • Metadata quality and platform coverage
  • Rate-limiting/robots constraints and ethical access

Proposed approach

  • Implement a SourceProvider interface:
    • listPlatforms()
    • listEntries(platformId)
    • search(query, platformId?)
    • resolve(entryId) → returns canonical metadata only
  • Ship Myrient provider first (simpler index pages), keep Vimm as “experimental”.

Acceptance criteria

  • Can retrieve a list of entries for at least one platform/collection from Myrient.
  • Can match an entry to a local file using filename heuristics.
  • Provider is swappable and defaults/fallbacks remain intact if sources go down.
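
The filename-heuristic criterion could be approached along these lines. The sketch assumes the common No-Intro pattern `Title (Region[, Region]) (other tags).ext`, with the region as the first parenthesised tag; all function names are hypothetical, and real collections contain names this will not handle.

```typescript
interface ParsedName {
  title: string;
  regions: string[];
  version?: string;
}

// Split a No-Intro style filename into title, regions, and version tag.
function parseNoIntroName(filename: string): ParsedName {
  // Drop a short trailing extension such as ".zip" or ".gb".
  const base = filename.replace(/\.[A-Za-z0-9]{1,5}$/, "");
  const tags = [...base.matchAll(/\(([^)]+)\)/g)].map((m) => m[1]);
  const title = base.replace(/\s*\([^)]*\)/g, "").trim();
  // Assumption: the first parenthesised tag lists the regions.
  const regions = tags.length > 0 ? tags[0].split(",").map((r) => r.trim()) : [];
  const version = tags.find((t) => /^(Rev\s+\S+|v\d[\w.]*)$/i.test(t));
  return { title, regions, version };
}

// Case- and punctuation-insensitive form used for comparison.
function normalizeTitle(title: string): string {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// True when a provider entry and a local file share the same normalized title.
function matchesLocalFile(entryName: string, localFile: string): boolean {
  return (
    normalizeTitle(parseNoIntroName(entryName).title) ===
    normalizeTitle(parseNoIntroName(localFile).title)
  );
}
```

Matching on the normalized title alone ignores region and revision, so it can pair different releases of the same game; comparing `regions` and `version` as well would tighten the match at the cost of more misses.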
    </issue_description>

<agent_instructions>Do this discovery as thoroughly as you can</agent_instructions>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 6 commits January 8, 2026 17:24
Co-authored-by: luandev <6452989+luandev@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add alternative metadata discovery source for crocdb" to "Add Myrient as primary metadata provider with automatic fallback" Jan 8, 2026
Copilot AI requested a review from luandev January 8, 2026 17:45
luandev commented Jan 8, 2026

@copilot add an e2e test to check that the search functionality is working
