
perf: optimize history sync memory and CPU usage #2333

Open

jlucaso1 wants to merge 2 commits into master from perf-history-sync

Conversation

@jlucaso1
Collaborator

reduce memory allocations and CPU time during history sync, especially for accounts with large chat histories, where memory spikes occur after scanning.

profiling a production instance (6h window, node 20) showed memory peaks up to ~8GB during history sync. the main bottlenecks were protobuf Long-to-string conversion (7.6% CPU), unnecessary buffer copies during media decryption, and redundant object allocations in history processing.

changes

1. replace Long library with native BigInt for protobuf conversion

longToString/longToNumber in WAProto previously used the long library's pure JS division loops for base-10 conversion. replaced with direct BigInt bitwise conversion ((BigInt(high >>> 0) << 32n) | BigInt(low >>> 0)), which delegates to V8's native C++ toString().
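a minimal sketch of the conversion described above — field names follow protobufjs's `{ low, high, unsigned }` Long representation; the actual helper in WAProto may differ in naming and edge-case handling:

```typescript
// Sketch: convert protobufjs-style Long bits to a decimal string via
// native BigInt instead of the `long` library's pure-JS division loop.
interface LongBits {
  low: number       // lower 32 bits (stored as a signed 32-bit int)
  high: number      // upper 32 bits (stored as a signed 32-bit int)
  unsigned: boolean
}

function longToString(value: LongBits): string {
  // `>>> 0` reinterprets each 32-bit half as unsigned before widening
  const combined = (BigInt(value.high >>> 0) << 32n) | BigInt(value.low >>> 0)
  // signed Longs wrap back into the signed 64-bit range
  const n = value.unsigned ? combined : BigInt.asIntN(64, combined)
  return n.toString() // base-10 conversion happens in V8's native code
}

console.log(longToString({ low: 0, high: 1, unsigned: true }))    // "4294967296"
console.log(longToString({ low: -1, high: -1, unsigned: false })) // "-1"
```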

2. stream-pipe zlib in downloadHistory

instead of collecting all encrypted+decrypted chunks into an array, concatenating them, then inflating — pipe the decrypted stream directly through createInflate(). avoids allocating the full compressed buffer as an intermediate step.

3. remove unnecessary { ...chat } spread in processHistoryMessage

each chat from the protobuf decode was being shallow-copied via spread before pushing to the result array. the original object is already mutated in-place and not referenced elsewhere after the function returns.
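the change is essentially this (types and names are illustrative):

```typescript
// Illustrative before/after of the processHistoryMessage change.
interface Chat { id: string; unreadCount?: number }

function collectChats(decodedChats: Chat[]): Chat[] {
  const chats: Chat[] = []
  for (const chat of decodedChats) {
    // before: chats.push({ ...chat })  — one shallow copy per chat
    // after: push the decoded object itself; it is already mutated
    // in place and nothing else references it after decoding
    chats.push(chat)
  }
  return chats
}
```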

4. skip Buffer.concat in media decryption when no remainder

in downloadEncryptedContent, the transform was always calling Buffer.concat([remainingBytes, chunk]) even when remainingBytes was empty (which is most of the time since chunks arrive AES-aligned). now skips the concat and uses the chunk directly.
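a hedged sketch of the remainder handling — the AES block size is 16 bytes, and the helper name is made up for illustration:

```typescript
const AES_BLOCK_SIZE = 16

// Sketch of the fix in the decrypt transform: only pay for
// Buffer.concat when a remainder was actually carried over from the
// previous chunk. In the common case (AES-aligned chunks, empty
// remainder) the incoming chunk is used directly, and subarray()
// returns views that share its memory — no copy at all.
function splitAligned(remaining: Buffer, chunk: Buffer): { aligned: Buffer; rest: Buffer } {
  const data = remaining.length > 0
    ? Buffer.concat([remaining, chunk]) // rare path: stitch remainder on
    : chunk                             // common path: zero-copy
  const cut = data.length - (data.length % AES_BLOCK_SIZE)
  return { aligned: data.subarray(0, cut), rest: data.subarray(cut) }
}
```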

benchmarks

| change | before | after | improvement |
| --- | --- | --- | --- |
| longToString (Long object -> string) | 237 ns/call | 52 ns/call | 4.6x faster |
| Buffer.concat in media decrypt (per chunk) | 78ms / 25k chunks | 4ms / 25k chunks | ~19x faster |
| chat object push (per chat) | 3.3 us/chat | 1.1 us/chat | 2.9x faster |
| downloadHistory inflate (50MB payload) | ~111MB RSS peak | ~56MB RSS peak | ~50% less memory |

how this impacts history sync

during a full history sync (e.g. 200 chats, 10k messages), every message goes through protobuf decode -> toJSON() which calls longToString on every Long field (timestamps, file lengths, etc). the BigInt optimization directly reduces CPU time for this hot path.

the streaming inflate and concat skip reduce memory pressure during the download+decompress+decrypt pipeline, which is where the large memory spikes originate.

@whiskeysockets-bot
Contributor

Thanks for opening this pull request and contributing to the project!

The next step is for the maintainers to review your changes. If everything looks good, it will be approved and merged into the main branch.

In the meantime, anyone in the community is encouraged to test this pull request and provide feedback.

✅ How to confirm it works

If you’ve tested this PR, please comment below with:

Tested and working ✅

This helps us speed up the review and merge process.

📦 To test this PR locally:

# NPM
npm install @whiskeysockets/baileys@WhiskeySockets/Baileys#perf-history-sync

# Yarn (v2+)
yarn add @whiskeysockets/baileys@WhiskeySockets/Baileys#perf-history-sync

# PNPM
pnpm add @whiskeysockets/baileys@WhiskeySockets/Baileys#perf-history-sync

If you encounter any issues or have feedback, feel free to comment as well.

@purpshell
Member

little risky to be messing with the proto gen stuff.

What if instead of relying on proto related constructs we have our own functions? Is that less technically feasible?

@jlucaso1
Collaborator Author

little risky to be messing with the proto gen stuff.

What if instead of relying on proto related constructs we have our own functions? Is that less technically feasible?

This will cause more breaking changes, won't it? There may be libraries and users who expect Long values to be treated as they are currently.

@jlucaso1
Collaborator Author

There are already existing tests for the protobuf Long conversion, and I've added more to ensure it works in other scenarios.

