Replace intervalScheduler (Redlock) with pg-boss & decouple publication notifications#1334

Draft
hdahlheim wants to merge 6 commits into main from hd/replace-interval-scheduler

@hdahlheim
Contributor

intervalScheduler.js uses Redlock + setInterval for distributed locking. When runFunc is CPU-intensive and blocks the Node.js event loop, the lock-extension timer never fires and the Redlock TTL expires. Another process can then acquire the same lock, or the original process gets stuck trying to re-acquire the expired lock.
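
To illustrate the failure mode, here is a minimal sketch that involves neither Redlock nor the actual intervalScheduler code. `busyWait` and `extendTimer` are hypothetical stand-ins for a CPU-intensive runFunc and the lock-extension timer; the point is only that a `setInterval` callback cannot run while synchronous code holds the event loop.

```typescript
// Minimal demo: a synchronous CPU-bound loop starves setInterval callbacks.
function busyWait(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Spin synchronously; no timer or I/O callback can run in the meantime.
  }
}

let extensions = 0;

// Stand-in for a lock-extension timer that is meant to fire every 10 ms.
const extendTimer = setInterval(() => {
  extensions += 1;
}, 10);

// Stand-in for a CPU-intensive runFunc that blocks for 100 ms.
busyWait(100);

// The timer was due roughly ten times, but it never got a chance to run.
const extensionsDuringWork = extensions;
clearInterval(extendTimer);
console.log(`extensions during blocked work: ${extensionsDuringWork}`);
```

With a real lock, every skipped callback is a skipped TTL extension, so a sufficiently long blocking run lets the lock expire while the job is still working.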

Additionally, the publish mutation and the publication scheduler both call notifyPublish() inline, blocking until all push/email/web notifications complete. If the notification pipeline is slow, publishing jobs can exceed their timeout.

Since pg-boss was introduced in #864, we have had a much more reliable and debuggable queuing system for background jobs. This PR replaces intervalScheduler with pg-boss-based workers.

Changes

1. Replace intervalScheduler with pg-boss workers (5dbc8bc77)

Each intervalScheduler.init() call is replaced with a dedicated BaseWorker subclass + queue.schedule(cron). pg-boss tracks job expiry via PostgreSQL timestamps rather than in-process timers, so a blocked event loop cannot cause a missed lock extension.

  • PublicationWorker: queue scheduler:publication, every minute, stately policy
  • MembershipOwnersWorker: queue scheduler:memberships-owners, every 10 min
  • YearlyAboWinbacksWorker: queue scheduler:yearly-abo-winbacks, every 10 min
  • UpgradeWorker: queue scheduler:upgrade, every 10 min
  • ChangeoverDeactivateWorker: queue scheduler:changeover-deactivate, every 10 min
  • ReferralRewardsWorker: queue scheduler:referral-rewards, every 6 hours
  • StatsCacheWorker: queue scheduler:stats-cache, every hour

All queues use policy: 'stately' to prevent overlapping runs. The Queue class now supports queueOptions (passed to pgBoss.createQueue()) and performOptions (passed to pgBoss.work()).
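
As a rough sketch of how the option routing described above could look: `BossLike`, `WorkerSpec`, `registerWorker`, and the cron string are hypothetical names and values for illustration, not the code in this PR. `BossLike` is a simplified structural subset of the pg-boss methods mentioned here (the real option types are richer), and an in-memory stand-in replaces pg-boss so the example runs without PostgreSQL.

```typescript
// Simplified subset of the pg-boss methods named above (assumed shapes).
interface BossLike {
  createQueue(name: string, options?: Record<string, unknown>): Promise<void>;
  schedule(name: string, cron: string): Promise<void>;
  work(
    name: string,
    options: Record<string, unknown>,
    handler: (jobs: unknown[]) => Promise<void>,
  ): Promise<unknown>;
}

// Hypothetical worker description; the PR's BaseWorker may look different.
interface WorkerSpec {
  queue: string;
  cron: string;
  queueOptions?: Record<string, unknown>;
  performOptions?: Record<string, unknown>;
  perform(jobs: unknown[]): Promise<void>;
}

async function registerWorker(boss: BossLike, spec: WorkerSpec): Promise<void> {
  // queueOptions are forwarded to createQueue(); this sketch defaults to 'stately'.
  await boss.createQueue(spec.queue, { policy: "stately", ...spec.queueOptions });
  // performOptions are forwarded to work(), together with the job handler.
  await boss.work(spec.queue, spec.performOptions ?? {}, (jobs) => spec.perform(jobs));
  // The cron expression drives recurring job creation.
  await boss.schedule(spec.queue, spec.cron);
}

// In-memory stand-in that records calls instead of talking to a database.
const calls: string[] = [];
const fakeBoss: BossLike = {
  async createQueue(name, options) {
    calls.push(`createQueue ${name} ${JSON.stringify(options)}`);
  },
  async schedule(name, cron) {
    calls.push(`schedule ${name} ${cron}`);
  },
  async work(name, options) {
    calls.push(`work ${name} ${JSON.stringify(options)}`);
    return "worker-id";
  },
};

const registered = registerWorker(fakeBoss, {
  queue: "scheduler:stats-cache",
  cron: "0 * * * *", // hourly; the PR's actual cron string is not shown
  performOptions: { batchSize: 1 },
  async perform() {},
});
```

Keeping queue-level options (policy) separate from worker-level options (how jobs are fetched and handled) mirrors the queueOptions/performOptions split described above.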

intervalScheduler.js is deleted. timeScheduler (used by mail, mailchimp, and some crowdfunding schedulers) remains unchanged for now.

2. Fix scheduler queue context (63c60fbfd)

The scheduler queue in runOnce() was receiving the raw connectionContext (pgdb, redis, elastic, pubsub, logger) instead of the enriched context that includes loaders, t, and mail. Workers that need dataloaders or translations (PublicationWorker, PublicationNotificationWorker) now receive the correct context.

3. Decouple notifications & DRY up publish paths (7ebe30fea)

  • PublicationNotificationWorker — dedicated async worker for sending publication notifications via pg-boss. Uses retryLimit: 0 to prevent duplicate notifications (notifyPublish is not idempotent).
  • finalizePublication() — shared post-publish helper extracted into lib/Publication.js. Handles: dataloader flush, discussion upsert, async notification enqueue, and legacy pubsub event.
  • Both the GraphQL publish mutation (instant publishes) and the PublicationWorker (scheduled publishes) now use finalizePublication(), eliminating duplicated code.
  • The publish mutation no longer blocks on notification delivery.
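
The steps listed for finalizePublication() can be sketched roughly as follows. This is an illustration under assumptions, not the helper in lib/Publication.js: `PublishDeps`, its method names, and the payload shape are hypothetical, and any extra conditions the real helper applies before notifying are left out.

```typescript
// Hypothetical dependency bundle; the real context and helper names differ.
interface PublishDeps {
  flushLoaders(): void;
  upsertDiscussion(repoId: string): Promise<void>;
  enqueueNotification(
    payload: { repoId: string },
    options: { retryLimit: number },
  ): Promise<void>;
  publishRepoUpdate(repoId: string): Promise<void>;
}

async function finalizePublicationSketch(
  deps: PublishDeps,
  publication: { repoId: string; prepublication: boolean },
): Promise<void> {
  // 1. Drop cached loader results so later reads see the new publication.
  deps.flushLoaders();
  // 2. Discussions are only upserted for non-prepublications.
  if (!publication.prepublication) {
    await deps.upsertDiscussion(publication.repoId);
  }
  // 3. Only the enqueue is awaited; delivery happens later in a worker.
  //    retryLimit: 0 because notifyPublish is not idempotent.
  await deps.enqueueNotification({ repoId: publication.repoId }, { retryLimit: 0 });
  // 4. Legacy pubsub event.
  await deps.publishRepoUpdate(publication.repoId);
}

// Recording stand-ins so the sketch runs without a database or queue.
const steps: string[] = [];
const recordingDeps: PublishDeps = {
  flushLoaders: () => {
    steps.push("flush");
  },
  upsertDiscussion: async () => {
    steps.push("discussion");
  },
  enqueueNotification: async (_payload, options) => {
    steps.push(`enqueue retryLimit=${options.retryLimit}`);
  },
  publishRepoUpdate: async () => {
    steps.push("pubsub");
  },
};

const finalized = finalizePublicationSketch(recordingDeps, {
  repoId: "example/repo",
  prepublication: false,
});
```

Because the caller only waits for the job to be enqueued, a slow notification pipeline no longer extends the duration of the publish mutation or the scheduled publication job.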

Testing

  • Tested locally
  • Tested on love (TODO)

Deployment Migration

Since we only run one scheduler in production (which is restarted during deployment), removing the intervalScheduler carries no risk of overlapping old and new schedulers. All active intervals are terminated once the deployment starts and are replaced by the new pg-boss workers.

…workers

Replace the custom redlock + setInterval implementation in intervalScheduler
with pg-boss workers scheduled via cron. This fixes the "Zombie Lock" issue
where CPU-intensive tasks block the Node.js event loop, preventing Redlock
lock extension timers from firing and causing duplicate runs.

pg-boss tracks job expiry in PostgreSQL via timestamps, so a blocked event
loop cannot cause lock failures.

Changes:
- Enhance job-queue wrapper: pass queueOptions to createQueue() and
  performOptions to pgBoss.work()
- Create PublicationWorker (replaces PublicationScheduler interval, now 1min cron)
- Create 6 membership workers: MembershipOwners, YearlyAboWinbacks, Upgrade,
  ChangeoverDeactivate, ReferralRewards, StatsCache
- All scheduler queues use policy: 'stately' to prevent overlapping runs
- Wire up workers in server.js with env-var gated queue.schedule() calls
- Remove intervalScheduler.js and all its usages
- Keep timeScheduler unchanged (still uses Redlock for time-based jobs)

The scheduler queue in runOnce() was receiving the raw connectionContext
(pgdb, redis, elastic, pubsub, logger) instead of the enriched context
that includes loaders, t, and mail. Workers like PublicationWorker need
context.loaders for cache invalidation, and the upcoming notification
worker needs loaders + t for sending notifications.

Also export WorkerConstructor type from queue.ts.
…sh paths

Extract shared post-publish logic into finalizePublication() helper:
- Flush dataloaders
- Upsert discussion (for non-prepublication)
- Enqueue async notification job via pg-boss (non-blocking)
- Publish legacy repoUpdate pubsub event

Create PublicationNotificationWorker as a dedicated pg-boss worker for
sending publication notifications asynchronously. Uses retryLimit: 0 to
prevent duplicate notifications (notifyPublish is not idempotent).

Both the GraphQL publish mutation (instant publishes) and the
PublicationWorker (scheduled publishes) now use finalizePublication(),
eliminating duplicated code. The publish mutation no longer blocks on
notification delivery.

Add test for PublicationNotificationWorker using testcontainers:
- Verifies job processing calls notifyPublish with correct args
- Verifies failed jobs are not retried (retryLimit: 0)