Skip to content

Conversation

@dumanshu
Copy link
Contributor

@dumanshu dumanshu commented Oct 31, 2025

Summary

Adds comprehensive log redaction support to TiCDC with three modes (OFF, MARKER, ON) to prevent sensitive data leaks in production logs while preserving operational visibility.

Issue Number: close #2918

Motivation

Production logs may inadvertently expose sensitive user data (emails, names, addresses, credit cards, etc.) creating compliance and security risks. TiCDC needs configurable redaction to balance security with debuggability.

Implementation

Three Redaction Modes

  1. OFF (false): No redaction - all data visible in logs (default, current behavior)
  2. MARKER: Wraps sensitive data with ‹› markers for easy identification
    • Example: user_email='‹alice@example.com›'
    • Preserves data for debugging while clearly marking sensitive fields
  3. ON (true): Complete redaction - sensitive data replaced with ?
    • Example: user_email='?'
    • Maximum security for production

Configuration

Server Startup (via config file):

# cdc.toml - PD/TiKV compatible boolean syntax (recommended)
[security]
redact-info-log = true   # Enables ON mode (full redaction)
# redact-info-log = false  # Disables redaction (default)
# redact-info-log = "marker"  # MARKER mode (string only)
cdc server --config=cdc.toml

Runtime Changes (no restart required):

curl -X POST http://localhost:8300/api/v2/log/redact \
  -H "Content-Type: application/json" \
  -d '{"redact_info_log":"true"}'

API Response:

{
  "previous_mode": "OFF",
  "current_mode": "ON"
}

What Gets Redacted

Redacted (contains user/business data):

  • DML operations: INSERT, UPDATE, DELETE row data
  • Table row values in replication logs
  • Error messages containing query data
  • Query parameter values in error logs

NOT Redacted (system metadata):

  • DDL query strings (contain schema/table/column names)
  • SQL templates/query structure (e.g., INSERT INTO test.t VALUES (?,?,?))
  • Table names, database names
  • Changefeed IDs, task IDs
  • Timestamps, LSNs, commit_ts
  • Error codes, stack traces
  • Performance metrics

Important Note on DDL Queries

DDL queries are intentionally NOT redacted to assist with server debugging and operational visibility. While DDL queries primarily contain schema information (table/column names), they may occasionally include sensitive data in edge cases:

  • Default values with sensitive data: ALTER TABLE users ADD COLUMN ssn VARCHAR(11) DEFAULT '123-45-6789'
  • Column/table names that reveal sensitive information in compliance-sensitive contexts

This is a documented design decision prioritizing operational debuggability. Users with strict compliance requirements should:

  1. Avoid sensitive data in DDL default values
  2. Use generic column/table naming conventions
  3. Consider additional log filtering at the infrastructure level if DDL metadata exposure is a concern

Testing

  • Unit tests for all three modes
  • Integration tests validating end-to-end behavior
  • API tests for runtime mode switching with response validation
  • Build, linter, and full test suite pass

References

  • Similar to TiDB's tidb_redact_log variable
  • PD/TiKV use boolean redact-info-log configuration
Add log redaction support with three modes to TiCDC. Configure via `[security] redact-info-log` in config file (true/false/marker) or runtime API `/api/v2/log/redact`. MARKER mode wraps sensitive values with ‹› markers; ON mode replaces them with `?`.

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Oct 31, 2025
@ti-chi-bot
Copy link

ti-chi-bot bot commented Oct 31, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign hicqu for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added contribution This PR is from a community contributor. first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Oct 31, 2025
@ti-chi-bot
Copy link

ti-chi-bot bot commented Oct 31, 2025

Hi @dumanshu. Thanks for your PR.

I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot
Copy link

ti-chi-bot bot commented Oct 31, 2025

Welcome @dumanshu!

It looks like this is your first PR to pingcap/ticdc 🎉.

I'm the bot to help you request reviewers, add labels and more, See available commands.

We want to make sure your contribution gets all the attention it needs!



Thank you, and welcome to pingcap/ticdc. 😃

@pingcap-cla-assistant
Copy link

pingcap-cla-assistant bot commented Oct 31, 2025

CLA assistant check
All committers have signed the CLA.

@ti-chi-bot ti-chi-bot bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Oct 31, 2025
@gemini-code-assist
Copy link

Summary of Changes

Hello @dumanshu, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances TiCDC's data privacy and security posture by implementing a robust log redaction system. The changes ensure that sensitive user data, such as SQL queries, row values, and keys, are no longer exposed in plain text within logs, bringing TiCDC into alignment with the redaction capabilities of the broader TiDB ecosystem. This addresses compliance requirements and reduces security risks associated with log exposure.

Highlights

  • New Redaction Package: Introduced a new pkg/redact package to provide comprehensive log redaction functionality, compatible with TiDB, TiKV, and PD redaction systems. It supports three modes: OFF (no redaction), ON (complete redaction with '?'), and MARKER (sensitive data wrapped with '‹' and '›' for post-processing).
  • Core Redaction Functions: Implemented thread-safe and performance-optimized redaction functions for various data types, including SQL() for queries, Value() for sensitive data, Key() for binary keys, Args() for SQL arguments, Any() for generic values, and Bytes() for byte slices.
  • Extensive Codebase Integration: Applied redaction throughout critical TiCDC components, including the MySQL sink, various codecs (Avro, Canal, Debezium, Open, Simple), schema store operations, event handlers, and infrastructure-related logging (e.g., Etcd paths, orchestrator metadata, upstream configurations) to prevent sensitive data leaks.
  • Build-Time Validation: Added a build-time validation script (scripts/check-redaction.sh) to automatically detect unredacted sensitive data in log statements. This script includes a hierarchical exclusion system and supports // skip-redaction: or // nolint:redact comments for intentional exceptions.
  • Documentation and Testing: Provided comprehensive documentation in pkg/redact/README.md detailing usage, configuration, performance considerations, and best practices. The new package also includes extensive unit and integration tests covering all redaction modes, concurrent access, and edge cases.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a robust log redaction mechanism to prevent sensitive data leakage, which is a significant improvement for security and compliance. The new redact package is well-designed, and its integration across the codebase is thorough. The addition of a build-time validation script is also a commendable effort to maintain this standard. I have a couple of minor suggestions for improvement in the new package and one of the modified files.

@dumanshu dumanshu force-pushed the add-log-redaction-support branch 2 times, most recently from 8bd00cb to 7f8b9df Compare November 4, 2025 00:40
@ti-chi-bot ti-chi-bot bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Nov 4, 2025
@dumanshu dumanshu force-pushed the add-log-redaction-support branch 3 times, most recently from 708cc24 to 511ceff Compare November 4, 2025 01:49
@dumanshu dumanshu marked this pull request as draft November 5, 2025 21:39
@ti-chi-bot ti-chi-bot bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 5, 2025
@dumanshu dumanshu force-pushed the add-log-redaction-support branch from df6ef07 to 34bb0b7 Compare November 6, 2025 02:11
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. do-not-merge/needs-triage-completed and removed do-not-merge/needs-triage-completed release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/needs-linked-issue labels Nov 6, 2025
@ti-chi-bot ti-chi-bot bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/needs-triage-completed and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Dec 18, 2025
@dumanshu dumanshu force-pushed the add-log-redaction-support branch 2 times, most recently from 274173a to 94a4e3b Compare December 18, 2025 05:47
@dumanshu
Copy link
Contributor Author

Overall, I think it's pretty good, and I've successfully debugged it locally. However, I think this particular situation needs further discussion.

Currently, if the on mode is enabled, and an unexpected synchronization issue occurs (such as a downstream system modifying the table structure without authorization, causing problems with upstream data writing to the downstream), with the current logging method, we won't know which table or statement is causing the problem. image

This will make troubleshooting extremely difficult. Therefore, could we consider simply not displaying the specific values, but retaining the SQL statement itself?

Thank you!

Makes sense. Your feedback is also aligned with "enables log redaction to shield user data values".

I have made the necessary changes on the PR. There are some code paths where SQL is fully formatted and would require further parsing to redact individual values. In those cases, I have redacted the full SQL in the ON mode. Those cases shouldn't be a debugging blocker though.

Here is an error example from the new code (formatted SQL available for debugging)

 [2025/12/18 05:54:13.083 +00:00] [WARN] [mysql_writer_dml.go:662] ["multiStmtExecute failed, fallback to sequence way"] [error="[CDC:ErrMySQLTxnError]MySQL txn error: Failed to execute DMLs, query info:BEGIN;DELETE FROM `log_redaction_test`.`users` WHERE `id` = ? LIMIT 1;REPLACE INTO `log_redaction_test`.`users` (`id`,`username`,`email`,`password`,`credit_card`,`ssn`,`phone`) VALUES (?,?,?,?,?,?,?);COMMIT;, args:(?, ?, ?, ?, ?, ?, ?, ?); : Error 1054 (42S22): Unknown column 'phone' in 'field list'"] [errorVerbose="[CDC:ErrMySQLTxnError]MySQL txn error: Failed to execute DMLs, query info:BEGIN;DELETE FROM `log_redaction_test`.`users` WHERE `id` = ? LIMIT 1;REPLACE INTO `log_redaction_test`.`users` (`id`,`username`,`email`,`password`,`credit_card`,`ssn`,`phone`) VALUES (?,?,?,?,?,?,?);COMMIT;, args:(?, ?, ?, ?, ?, ?, ?, ?); : Error 1054 (42S22): Unknown column 'phone' in 'field list'\ngithub.com/pingcap/errors.(*Error).GenWith..<skipped>..64.s:1693"] [sql="Total SQL Count: 2, Row Count: 1,[001] Query: DELETE FROM `log_redaction_test`.`users` WHERE `id` = ? LIMIT 1,[002] Query: REPLACE INTO `log_redaction_test`.`users` (`id`,`username`,`email`,`password`,`credit_card`,`ssn`,`phone`) VALUES (?,?,?,?,?,?,?),End"] [writerID=1]

@hongyunyan the PR is ready for your review.

@hongyunyan
Copy link
Collaborator

I see that the integration tests you added mainly use the downstream of the blackhole. In that case, you only need to add tests in the mysql_group. You don't need to add tests in other groups. The other groups are for testing scenarios where the downstream is Kafka or storage, etc.

}

log.Error("set snapshot read failed",
zap.String("query", query),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to redact these query here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all decoder.go in pkg/sink/codec, it's just a inner use for intergration tests or unit tests. I think we may don't need react values here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all decoder.go in pkg/sink/codec, it's just a inner use for intergration tests or unit tests. I think we may don't need react values here?

I was wondering - aren't the decoders in pkg/sink/codec/*/decoder.go used in production? I see them being instantiated via
pkg/sink/codec/builder.go#NewEventDecoder() and called from:

These consumer tools appear to be production code. Am I missing something?

Copy link
Contributor Author

@dumanshu dumanshu Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to redact these query here?

Here's my understanding - for the first query at line 271 (set @@tidb_snapshot={commitTs}), you're right that it only contains a timestamp, no user data. But the same query variable gets reassigned at line 289 to include WHERE clause conditions with primary key values from row data. I applied redaction to both for consistency (preventing accidental misses), but I'll remove the optional ones next.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These consumers, in my understanding, are used for internal integration tests, not for production code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hongyunyan - Please let me know if you’re comfortable with the following:

  1. Consumers usage for integration tests. This should be fine since tests can run with redaction turned off to support debugging.
  2. Since this code sits alongside production usage, applying redaction would help keep behavior consistent (minimize confusion).

@dumanshu dumanshu force-pushed the add-log-redaction-support branch from 94a4e3b to 395d4d6 Compare December 18, 2025 20:00
@dumanshu
Copy link
Contributor Author

I see that the integration tests you added mainly use the downstream of the blackhole. In that case, you only need to add tests in the mysql_group. You don't need to add tests in other groups. The other groups are for testing scenarios where the downstream is Kafka or storage, etc.

I have evolved the integration tests to cover for "kafka" sink type since it has some logs requiring redaction. It has mysql support. I have removed the other sink types. Please take a look.

@dumanshu dumanshu force-pushed the add-log-redaction-support branch from 395d4d6 to 3fe87ad Compare December 19, 2025 05:29
@dumanshu
Copy link
Contributor Author

/test all

@dumanshu dumanshu force-pushed the add-log-redaction-support branch from 3fe87ad to e62dcdf Compare December 19, 2025 05:52
@dumanshu
Copy link
Contributor Author

/test all

@dumanshu
Copy link
Contributor Author

@hongyunyan please take a look.

@dumanshu dumanshu force-pushed the add-log-redaction-support branch 2 times, most recently from 7e7a7cf to 91b9961 Compare December 19, 2025 19:15
@dumanshu dumanshu force-pushed the add-log-redaction-support branch from 91b9961 to ced71ad Compare January 6, 2026 20:14
@dumanshu
Copy link
Contributor Author

@hongyunyan - a gentle reminder..

@dumanshu dumanshu force-pushed the add-log-redaction-support branch from d4e18c4 to 4629a74 Compare January 12, 2026 18:03
… protection

Implement comprehensive log redaction framework with three operational modes:
- OFF: No redaction, values logged as-is (default for backward compatibility)
- MARKER: Values wrapped with ‹› markers for grep-ability while marking sensitive data
- ON: Full redaction, sensitive values replaced with "?"

Core changes:
- Add pkg/util/redact.go with RedactValue, RedactAny, RedactArgs, RedactKey,
  RedactBytes, RedactStrings functions
- Integrate redaction across all data paths:
  - DML event logging (row changes, column values)
  - Codec decoders (Canal, Avro, Open Protocol, Simple, Debezium)
  - SQL builders and multi-row operations
  - Kafka/storage consumers
  - Diff and merge operations
  - Credential and security logging
- Add comprehensive unit tests for all redaction functions

Security considerations:
- Column names preserved (schema info needed for debugging)
- Only user data values are redacted
- Null indicators ("IsNull") not redacted
- SQL templates (table/column names) not redacted, only bind values
Add server-side initialization of log redaction mode from configuration:
- Parse redact-info-log config option during server startup
- Initialize perrors.RedactLogEnabled based on config value
- Add unit tests for server redaction initialization
- Supports same three modes: off, marker, on
Add /api/v2/log/redaction endpoint for runtime redaction mode control:
- GET: Query current redaction mode
- POST: Update redaction mode (with security restrictions)

Security features:
- Mode can only be escalated (OFF→MARKER→ON), never downgraded
- Prevents runtime disabling of redaction once enabled
- Returns 400 error on downgrade attempts

API response format:
- GET returns: {"redact_mode": "off|marker|on"}
- POST accepts: {"redact_mode": "off|marker|on"}

Add comprehensive tests for API behavior and security restrictions.
@dumanshu dumanshu force-pushed the add-log-redaction-support branch 2 times, most recently from 3d1c77c to 32039c8 Compare January 12, 2026 23:50
@hongyunyan
Copy link
Collaborator

Overall, I think this PR is very good. I have a few minor issues I'd like to discuss:

  1. According to the comments, the SetRedactMode behavior doesn't support mode degradation. If the device is in an off state, and a set request for on and marker is executed concurrently, will it ultimately result in marker instead of on? Does this meet your expectations?

  2. Some panic operations also directly print values. Does this meet your requirements, such as log.Panic("invalid column value, please report a bug", zap.Any("col", c), zap.Error(err)) in pkg/sink/codec/open/message.go:99`?

Add integration tests validating end-to-end log redaction behavior:
- Test all three modes: OFF, MARKER, ON
- Test configuration via config file (true/false/marker values)
- Test runtime API mode switching
- Add test utilities in run_cdc_server script
- Support boolean config (true/false) aligned with PD/TiKV
@dumanshu dumanshu force-pushed the add-log-redaction-support branch from 32039c8 to 90210aa Compare January 13, 2026 15:12
@ti-chi-bot
Copy link

ti-chi-bot bot commented Jan 13, 2026

@dumanshu: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-error-log-review 90210aa link true /test pull-error-log-review

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contribution This PR is from a community contributor. do-not-merge/needs-triage-completed first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. ok-to-test Indicates a PR is ready to be tested. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add log redaction support to prevent sensitive data leaks

3 participants