
graphdb: fix backwards-compat for channel edge feature deserialization#10529

Merged
Roasbeef merged 1 commit into lightningnetwork:master from Roasbeef:fix-edge-features-decode-compat on Feb 3, 2026
Conversation

@Roasbeef (Member)

This commit fixes a backwards compatibility issue that prevented nodes from upgrading from v0.19.x to v0.20.x.

In v0.19.x, channel edge features were serialized as raw feature bytes without a length prefix. In v0.20.x (commit 2f2845d), the serialization changed to use Features.Encode() which adds a 2-byte big-endian length prefix before the feature bits. The deserialization code was updated to use Features.Decode() which expects this length prefix.

When v0.20.x reads a database created by v0.19.x, Decode() tries to read a length prefix that doesn't exist, causing an EOF error:

unable to decode features: EOF

The fix adds a deserializeChanEdgeFeatures() helper that detects which format is being read and decodes accordingly:

  • New format (v0.20+): First 2 bytes encode the length of the remaining bytes. Detected when uint16(bytes[0:2]) == len(bytes)-2.
  • Legacy format (pre-v0.20): Raw feature bits without length prefix. Uses DecodeBase256 with the known length.

The format detection is safe because in the legacy format, the first byte always has at least one bit set (the serialization uses minimum bytes), so the first two bytes can never encode a value equal to len-2.
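
The detection rule above can be illustrated with a stdlib-only sketch; `detectFormat` is a hypothetical stand-in for the real `deserializeChanEdgeFeatures` helper and only shows the branch decision, not the actual decoding:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// detectFormat applies the check described above: the bytes are
// treated as the new format only when the first two bytes, read as a
// big-endian uint16, equal the length of the remaining payload.
func detectFormat(featureBytes []byte) string {
	if len(featureBytes) >= 2 {
		encodedLen := binary.BigEndian.Uint16(featureBytes[:2])
		if int(encodedLen) == len(featureBytes)-2 {
			return "new"
		}
	}
	return "legacy"
}

func main() {
	// 0x0001 prefix, one payload byte: 1 == 3-2, so new format.
	fmt.Println(detectFormat([]byte{0x00, 0x01, 0x08}))
	// A single raw feature byte can never satisfy the check.
	fmt.Println(detectFormat([]byte{0x08}))
}
```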

Fixes #10528.

@gemini-code-assist

Summary of Changes

Hello @Roasbeef, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a significant backwards compatibility problem related to how channel edge features are deserialized from the database. Previously, an update to the serialization format in v0.20.x caused EOF errors when attempting to read data created by v0.19.x nodes. The solution introduces an intelligent deserialization function that can automatically identify and correctly process both the old and new feature serialization formats, thereby enabling seamless upgrades for nodes.

Highlights

  • Backwards Compatibility Fix: Addresses a critical issue preventing nodes from upgrading from v0.19.x to v0.20.x due to a change in channel edge feature serialization.
  • Dual-Format Deserialization: Introduces a new helper function, deserializeChanEdgeFeatures(), capable of correctly deserializing channel edge features from both the legacy (raw feature bits) and new (2-byte length prefixed) formats.
  • Safe Format Detection: The new deserialization logic safely detects the format by checking if the first two bytes, interpreted as a uint16 length, match the remaining byte count. This detection is robust as legacy format's first byte always has at least one bit set, preventing false positives.
  • Comprehensive Testing: A new test file (kv_store_features_test.go) has been added, including unit tests for empty features, legacy format, new format, format detection, round-trip serialization, and property-based testing to ensure the robustness of the new deserialization logic.


@gemini-code-assist (bot) left a comment

Code Review

This pull request addresses a critical backwards-compatibility issue with channel edge feature deserialization. The introduction of deserializeChanEdgeFeatures to handle both legacy and new formats is a good approach. The accompanying tests are extensive and cover many scenarios, including property-based testing.

However, I've identified a critical flaw in the format detection logic that could lead to data corruption in rare edge cases. The assumption that legacy feature vectors cannot accidentally match the new format's length prefix is incorrect for larger feature vectors. I've provided a detailed explanation and a suggested fix in a specific comment.

Once this critical issue is addressed, the PR will be a solid fix for the original problem.

Comment on lines 4743 to 4775
if len(featureBytes) >= 2 {
	encodedLen := binary.BigEndian.Uint16(featureBytes[:2])
	if int(encodedLen) == len(featureBytes)-2 {
		// New format: skip the 2-byte length prefix and decode
		// the remaining bytes as raw feature bits.
		err := features.DecodeBase256(
			bytes.NewReader(featureBytes[2:]),
			int(encodedLen),
		)
		if err != nil {
			return nil, fmt.Errorf("unable to decode "+
				"features (new format): %w", err)
		}

		return lnwire.NewFeatureVector(
			features, lnwire.Features,
		), nil
	}
}


critical

The current format detection logic int(binary.BigEndian.Uint16(featureBytes[:2])) == len(featureBytes)-2 is not safe and can lead to data corruption. A legacy-formatted feature vector can accidentally satisfy this condition, causing it to be misinterpreted as the new format.

For example, a legacy feature vector of 258 bytes where the first two bytes are 0x01, 0x00 would be incorrectly detected as the new format, because 256 == 258 - 2. This would lead to the first two bytes being stripped and the remaining 256 bytes being parsed as a feature vector, losing the information from the first two bytes. This is possible if the original feature vector had a feature bit in the range [2056, 2063] set.
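
The collision arithmetic can be checked with a small stdlib-only sketch (`collides` is a hypothetical helper written for this illustration, not code from the PR):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// collides reports whether a legacy byte slice would pass the naive
// "prefix == len-2" new-format check.
func collides(legacy []byte) bool {
	if len(legacy) < 2 {
		return false
	}
	prefix := binary.BigEndian.Uint16(legacy[:2])
	return int(prefix) == len(legacy)-2
}

func main() {
	// A 258-byte legacy vector whose first two bytes are 0x01 0x00,
	// i.e. some feature bit in the range [2056, 2063] is set.
	legacy := make([]byte, 258)
	legacy[0], legacy[1] = 0x01, 0x00

	// 0x0100 == 256 == 258-2, so the naive check misfires.
	fmt.Println(collides(legacy)) // true
}
```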

The existing test TestDeserializeChanEdgeFeaturesLegacyFormatNoCollision does not catch this because it only generates feature vectors with a maxBit up to 255, resulting in a maximum length of 32 bytes, which is too small to trigger this collision.

To fix this, we should add a check for canonical encoding. After decoding the payload (bytes after the supposed length prefix), we should re-encode it and verify that it produces the exact same byte slice. Legacy feature vectors that accidentally match the length check are very unlikely to be canonically encoded, so this will allow us to reliably distinguish between the formats.

if len(featureBytes) >= 2 {
		encodedLen := binary.BigEndian.Uint16(featureBytes[:2])
		if int(encodedLen) == len(featureBytes)-2 {
			// This looks like the new format. To be sure, we decode
			// and re-encode to check for canonical encoding, as a
			// legacy feature vector could accidentally match the
			// length check.
			payload := featureBytes[2:]
			tempFeatures := lnwire.NewRawFeatureVector()
			err := tempFeatures.DecodeBase256(
				bytes.NewReader(payload), int(encodedLen),
			)

			var checkBuf bytes.Buffer
			if err == nil {
				err = tempFeatures.EncodeBase256(&checkBuf)
			}

			// If there were no errors and the re-encoded payload
			// matches, we are confident it's the new format.
			if err == nil && bytes.Equal(checkBuf.Bytes(), payload) {
				return lnwire.NewFeatureVector(
					tempFeatures, lnwire.Features,
				), nil
			}
		}
	}

Collaborator

I think to be consistent we should do this

Member Author

done

@lightninglabs-deploy
Collaborator

🟠 PR Severity: HIGH

Automated Classification | 2 files | 451 lines changed (65 non-test)

🟠 High (1 file)
  • graph/db/kv_store.go - Network graph database operations (feature deserialization logic)
🟢 Test Files (1 file)
  • graph/db/kv_store_features_test.go - Test coverage for backwards compatibility

Analysis

This PR fixes a backwards-compatibility issue in channel edge feature deserialization within the graph database layer (graph/*). According to the severity guidelines, changes to graph/* (network graph maintenance) are classified as HIGH severity, requiring review by a knowledgeable engineer.

The PR modifies core graph database logic (60 lines in production code) to handle feature deserialization correctly. While the change is relatively focused (1 production file, 65 non-test lines), it touches critical graph database functionality that could affect channel state persistence and network graph operations if not handled correctly.

No severity bump modifiers apply:

  • File count: 1 production file (threshold: >20)
  • Lines changed: 65 non-test lines (threshold: >500)
  • Single package touched (graph/db)
  • No database migrations

To override, add a severity-override-{critical,high,medium,low} label.

@ellemouton (Collaborator) left a comment

thanks for this 🙏 I'm a bit surprised that this wasn't caught by the backwards compatibility CI job 🤔

@pinheadmz left a comment

ACK 909fa1a

Tested on 32-bit ARM / Raspberry Pi OS; confirmed lnd started up and synced. This commit solves #10528.

{
    "version":  "0.20.99-beta commit=v0.3-alpha-18056-g909fa1a59",
    "commit_hash":  "909fa1a59ee7aaf0e65b710633c7dfc0cc0ce6ca",
    "identity_pubkey":  "0355157b4260b70c7f407a720c527a84e9522cd948e7a8ad92ae00773be52488e3",
    "alias":  "🌟Star⭐️Service🌟",
    "color":  "#3399ff",
    "num_pending_channels":  0,
    "num_active_channels":  3,
    "num_inactive_channels":  3,
    "num_peers":  5,
    "block_height":  934226,
    "block_hash":  "0000000000000000000105d3650ec8ac8a25a9f93b00cdb06dad558a550ac2f4",
    "best_header_timestamp":  "1769696561",
    "synced_to_chain":  true,
    "synced_to_graph":  true,
    "testnet":  false,
    "chains":  [
        {
            "chain":  "bitcoin",
            "network":  "mainnet"
        }
    ]
}

@saubyk requested a review from ziggie1984 on January 29, 2026 at 15:17
@Roasbeef (Member Author)

@claude review this

@claude

claude bot commented Jan 29, 2026

Claude encountered an error — View job

Command failed: git fetch origin --depth=20 fix-edge-features-decode-compat

I'll analyze this and get back to you.

@ziggie1984 (Collaborator) left a comment

Looks good, pending CI. We should think about whether it is worth checking for a collision between the legacy and new formats.

@saubyk added the graph and backport-v0.20.x-branch labels on Jan 31, 2026 (the latter triggers the creation of a backport PR to the v0.20.x-branch branch).
@Roasbeef (Member Author)

Roasbeef commented Feb 2, 2026

@claude review this

@Roasbeef force-pushed the fix-edge-features-decode-compat branch from 909fa1a to c65b268 on February 2, 2026 at 21:40
@lightninglabs-deploy
Collaborator

🟠 PR Severity: HIGH

graphdb backwards-compat fix | 2 files | 81 lines changed

🟠 High (1 file)
  • graph/db/kv_store.go - Graph database channel edge feature deserialization (network graph maintenance)
🟢 Low (1 file)
  • docs/release-notes/release-notes-0.20.1.md - Release notes documentation
Test Files (excluded from severity)
  • graph/db/kv_store_features_test.go - Test coverage for feature fix

Analysis

This PR fixes a backwards-compatibility issue in the graph database layer for deserializing channel edge features. The core change is in graph/db/kv_store.go, which is part of the network graph maintenance system (HIGH tier).

Severity Determination:

  • Primary file: graph/db/kv_store.go → HIGH (graph database operations)
  • File count: 2 non-test files (no bump needed)
  • Lines changed: 81 non-test lines (no bump needed)
  • Final: HIGH

This requires review by an engineer knowledgeable in the graph database layer, as it affects how channel edge features are persisted and retrieved. The backwards-compatibility aspect is important to ensure existing channel data can be properly read.


To override, add a severity-override-{critical,high,medium,low} label.

@Roasbeef force-pushed the fix-edge-features-decode-compat branch from 1676dd6 to 56a7f45 on February 3, 2026 at 17:12
@Roasbeef merged commit d332cb0 into lightningnetwork:master on Feb 3, 2026
34 of 36 checks passed
@github-actions

github-actions bot commented Feb 3, 2026

Created backport PR for v0.20.x-branch:

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin backport-10529-to-v0.20.x-branch
git worktree add --checkout .worktree/backport-10529-to-v0.20.x-branch backport-10529-to-v0.20.x-branch
cd .worktree/backport-10529-to-v0.20.x-branch
git reset --hard HEAD^
git cherry-pick -x 56a7f45b998054ad290ecfe3ede57f274b58a5b4
git push --force-with-lease

@lightninglabs-deploy
Collaborator

🟠 PR Severity: HIGH

Backwards-compatibility fix | 3 files | 79 lines changed (excl. tests/docs)

🟠 High (1 file)
  • graph/db/kv_store.go - Channel edge feature deserialization in network graph database (graph/* category)
🟢 Low (2 files - excluded from severity)
  • docs/release-notes/release-notes-0.20.1.md - Release notes documentation
  • graph/db/kv_store_features_test.go - Test coverage for the fix

Analysis

This PR addresses a backwards-compatibility issue in the graph database's channel edge feature deserialization. The core change is in graph/db/kv_store.go (74 additions, 5 deletions), which falls under the graph/* category (network graph maintenance) and qualifies as HIGH severity.

The fix ensures proper handling of feature deserialization for channel edges, which is critical for maintaining network graph integrity but doesn't touch consensus-critical or fund-security components (which would be CRITICAL severity).

No severity bump applied:

  • Only 1 significant file changed (excluding tests)
  • 79 lines changed (excluding tests/docs) - below 500 line threshold
  • Single package scope (graph/db)

This requires review from an engineer knowledgeable in graph database operations and Lightning Network gossip protocol semantics.


To override, add a severity-override-{critical,high,medium,low} label.

ziggie1984 added a commit that referenced this pull request on Feb 4, 2026:

[v0.20.x-branch] Backport #10529: graphdb: fix backwards-compat for channel edge feature deserialization

Labels

  • backport-v0.20.x-branch (triggers the creation of a backport PR to the v0.20.x-branch branch)
  • graph


Development

Successfully merging this pull request may close these issues.

unable to start server: could not populate the graph cache: unable to decode features: EOF

6 participants