Replace intervalScheduler (Redlock) with pg-boss & decouple publication notifications #1334
Draft
…workers

Replace the custom redlock + setInterval implementation in intervalScheduler with pg-boss workers scheduled via cron. This fixes the "Zombie Lock" issue where CPU-intensive tasks block the Node.js event loop, preventing Redlock lock extension timers from firing and causing duplicate runs. pg-boss tracks job expiry in PostgreSQL via timestamps, so a blocked event loop cannot cause lock failures.

Changes:
- Enhance job-queue wrapper: pass queueOptions to createQueue() and performOptions to pgBoss.work()
- Create PublicationWorker (replaces PublicationScheduler interval, now 1min cron)
- Create 6 membership workers: MembershipOwners, YearlyAboWinbacks, Upgrade, ChangeoverDeactivate, ReferralRewards, StatsCache
- All scheduler queues use policy: 'stately' to prevent overlapping runs
- Wire up workers in server.js with env-var gated queue.schedule() calls
- Remove intervalScheduler.js and all its usages
- Keep timeScheduler unchanged (still uses Redlock for time-based jobs)
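To illustrate the env-var gated scheduling described in this commit, here is a minimal sketch of how a schedule table could be filtered by environment flags before any queue.schedule() call is made. The queue names and intervals come from the PR description; the env-var names, the `'true'` convention, the exact cron strings, and the `planSchedules` helper are assumptions made for illustration and are not taken from server.js.

```javascript
// Assumed option object; 'stately' is the policy named in this PR.
const STATELY = { policy: 'stately' }

// Queue names and intervals follow the PR description.
// envFlag values and cron strings are hypothetical.
const SCHEDULER_QUEUES = [
  { name: 'scheduler:publication', cron: '* * * * *', envFlag: 'PUBLICATION_SCHEDULER' },
  { name: 'scheduler:memberships-owners', cron: '*/10 * * * *', envFlag: 'MEMBERSHIP_SCHEDULER' },
  { name: 'scheduler:yearly-abo-winbacks', cron: '*/10 * * * *', envFlag: 'MEMBERSHIP_SCHEDULER' },
  { name: 'scheduler:upgrade', cron: '*/10 * * * *', envFlag: 'MEMBERSHIP_SCHEDULER' },
  { name: 'scheduler:changeover-deactivate', cron: '*/10 * * * *', envFlag: 'MEMBERSHIP_SCHEDULER' },
  { name: 'scheduler:referral-rewards', cron: '0 */6 * * *', envFlag: 'MEMBERSHIP_SCHEDULER' },
  { name: 'scheduler:stats-cache', cron: '0 * * * *', envFlag: 'MEMBERSHIP_SCHEDULER' },
]

// Returns only the schedules whose (hypothetical) env flag is set to 'true'.
function planSchedules(env) {
  return SCHEDULER_QUEUES
    .filter(({ envFlag }) => env[envFlag] === 'true')
    .map(({ name, cron }) => ({ name, cron, queueOptions: STATELY }))
}

// Example: only the publication scheduler is enabled.
for (const { name, cron } of planSchedules({ PUBLICATION_SCHEDULER: 'true' })) {
  console.log(`would schedule ${name} with cron "${cron}"`)
}
```

Keeping the plan a pure function of the environment makes it easy to check which queues a given deployment would schedule without connecting to PostgreSQL.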
The scheduler queue in runOnce() was receiving the raw connectionContext (pgdb, redis, elastic, pubsub, logger) instead of the enriched context that includes loaders, t, and mail. Workers like PublicationWorker need context.loaders for cache invalidation, and the upcoming notification worker needs loaders + t for sending notifications. Also export WorkerConstructor type from queue.ts.
…sh paths

Extract shared post-publish logic into finalizePublication() helper:
- Flush dataloaders
- Upsert discussion (for non-prepublication)
- Enqueue async notification job via pg-boss (non-blocking)
- Publish legacy repoUpdate pubsub event

Create PublicationNotificationWorker as a dedicated pg-boss worker for sending publication notifications asynchronously. Uses retryLimit: 0 to prevent duplicate notifications (notifyPublish is not idempotent).

Both the GraphQL publish mutation (instant publishes) and the PublicationWorker (scheduled publishes) now use finalizePublication(), eliminating duplicated code. The publish mutation no longer blocks on notification delivery.

Add test for PublicationNotificationWorker using testcontainers:
- Verifies job processing calls notifyPublish with correct args
- Verifies failed jobs are not retried (retryLimit: 0)
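The four post-publish steps listed in this commit can be sketched roughly as follows. This is not the code in lib/Publication.js: the argument shape, `loaders.clearAll()`, `upsertDiscussion`, `notificationQueue.send()`, the pubsub payload, and the recording context are all hypothetical stand-ins, used only to show the ordering and the fact that notifications are enqueued rather than sent inline.

```javascript
// Sketch of a shared post-publish helper. Every collaborator on `context`
// is a hypothetical stand-in for whatever the real helper uses.
async function finalizePublication({ repoId, prepublication }, context) {
  const { loaders, upsertDiscussion, notificationQueue, pubsub } = context

  // 1. Flush dataloaders so later reads see the new publication.
  loaders.clearAll()

  // 2. Upsert the discussion, but only for non-prepublications.
  if (!prepublication) {
    await upsertDiscussion(repoId)
  }

  // 3. Enqueue the notification job; delivery happens later in a worker,
  //    so the caller does not wait for push/email/web notifications.
  await notificationQueue.send({ repoId })

  // 4. Publish the legacy repoUpdate pubsub event.
  await pubsub.publish('repoUpdate', { repoUpdate: { id: repoId } })
}

// Stub context that records which steps ran, in order.
function createRecordingContext() {
  const calls = []
  return {
    calls,
    loaders: { clearAll: () => { calls.push('flushLoaders') } },
    upsertDiscussion: async () => { calls.push('upsertDiscussion') },
    notificationQueue: { send: async () => { calls.push('enqueueNotification') } },
    pubsub: { publish: async (channel) => { calls.push(`pubsub:${channel}`) } },
  }
}

const demoContext = createRecordingContext()
finalizePublication({ repoId: 'org/repo', prepublication: false }, demoContext)
  .then(() => console.log(demoContext.calls.join(' -> ')))
```

Because the helper only enqueues a job in step 3, a slow notification pipeline would delay the worker rather than the publish mutation or the scheduled publication job.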
jstcki
approved these changes
Feb 16, 2026
intervalScheduler.js uses Redlock + setInterval for distributed locking. When runFunc is CPU-intensive and blocks the Node.js event loop, the lock-extension timer never fires and the Redlock TTL expires. Either another process then acquires the same lock, or the process gets stuck trying to acquire the expired lock.

Additionally, the publish mutation and the publication scheduler both call notifyPublish() inline, blocking until all push/email/web notifications complete. If the notification pipeline is slow, publishing jobs can exceed their timeout.

Since the introduction of pg-boss (#864), we have had a much more reliable and debuggable worker queuing system for new background jobs. This PR replaces the intervalScheduler with pg-boss-based workers.
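The failure mode is easy to reproduce without Redis. The following sketch uses no Redlock at all: a plain setInterval stands in for the lock-extension timer and a busy loop stands in for a CPU-intensive runFunc. The function names and timings are made up for illustration.

```javascript
// Simulates CPU-intensive work that never yields to the event loop.
function busyWait(ms) {
  const end = Date.now() + ms
  while (Date.now() < end) {
    // spin
  }
}

// Starts a stand-in "lock extension" timer, runs a blocking task, and
// reports how many times the timer fired while the task was running.
function simulateBlockedExtension({ extendEveryMs, taskMs }) {
  let extensionsDuringTask = 0
  const timer = setInterval(() => { extensionsDuringTask += 1 }, extendEveryMs)

  busyWait(taskMs) // timer callbacks cannot run while this loop holds the thread

  clearInterval(timer)
  return extensionsDuringTask
}

const extensions = simulateBlockedExtension({ extendEveryMs: 10, taskMs: 100 })
console.log(`lock extensions during blocking task: ${extensions}`) // prints 0
```

Although the timer is due roughly ten times during the task, it never fires, which is why a TTL that depends on in-process extension can lapse while the job is still running.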
Changes
1. Replace intervalScheduler with pg-boss workers (5dbc8bc77)

Each intervalScheduler.init() call is replaced with a dedicated BaseWorker subclass + queue.schedule(cron). pg-boss tracks job expiry via PostgreSQL timestamps, not in-process timers, so a blocked event loop cannot cause a lock to be missed.

- scheduler:publication, every minute, stately policy
- scheduler:memberships-owners, every 10 min
- scheduler:yearly-abo-winbacks, every 10 min
- scheduler:upgrade, every 10 min
- scheduler:changeover-deactivate, every 10 min
- scheduler:referral-rewards, every 6 hours
- scheduler:stats-cache, every hour

All queues use policy: 'stately' to prevent overlapping runs. The Queue class now supports queueOptions (passed to pgBoss.createQueue()) and performOptions (passed to pgBoss.work()).

intervalScheduler.js is deleted. timeScheduler (used by mail, mailchimp, and some crowdfunding schedulers) remains unchanged for now.

2. Fix scheduler queue context (63c60fbfd)

The scheduler queue in runOnce() was receiving the raw connectionContext (pgdb, redis, elastic, pubsub, logger) instead of the enriched context that includes loaders, t, and mail. Workers that need dataloaders or translations (PublicationWorker, PublicationNotificationWorker) now receive the correct context.

3. Decouple notifications & DRY up publish paths (7ebe30fea)

- PublicationNotificationWorker: dedicated async worker for sending publication notifications via pg-boss. Uses retryLimit: 0 to prevent duplicate notifications (notifyPublish is not idempotent).
- finalizePublication(): shared post-publish helper extracted into lib/Publication.js. Handles: dataloader flush, discussion upsert, async notification enqueue, and legacy pubsub event.
- The publish mutation (instant publishes) and the PublicationWorker (scheduled publishes) now use finalizePublication(), eliminating duplicated code.

Testing
Deployment Migration
Since we only run one scheduler in production (which is restarted during deployment), there is no risk that the removal of the intervalScheduler will cause issues. All active intervals will be terminated once the deployment starts and replaced by the new pg-boss workers.