[lake] Record a file path storing log offsets in lake snapshot property #2223
base: main
Pull request overview
Copilot reviewed 30 out of 30 changed files in this pull request and generated 12 comments.
...ink-common/src/main/java/org/apache/fluss/flink/tiering/committer/TieringCommitOperator.java (outdated, resolved)
...ink-common/src/main/java/org/apache/fluss/flink/tiering/committer/TieringCommitOperator.java (outdated, resolved)
fluss-server/src/test/java/org/apache/fluss/server/coordinator/TestCoordinatorGateway.java (outdated, resolved)
fluss-rpc/src/test/java/org/apache/fluss/rpc/TestingTabletGatewayService.java (outdated, resolved)
fluss-server/src/main/java/org/apache/fluss/server/coordinator/CoordinatorEventProcessor.java (outdated, resolved)
fluss-server/src/test/java/org/apache/fluss/server/tablet/TestTabletServerGateway.java (outdated, resolved)
.../src/main/java/org/apache/fluss/flink/tiering/committer/FlussTableLakeSnapshotCommitter.java (outdated, resolved)
fluss-common/src/main/java/org/apache/fluss/lake/committer/CommittedLakeSnapshot.java (resolved)
Pull request overview
Copilot reviewed 42 out of 42 changed files in this pull request and generated 3 comments.
fluss-server/src/main/java/org/apache/fluss/server/zk/data/lake/LakeTableJsonSerde.java (outdated, resolved)
.../src/main/java/org/apache/fluss/flink/tiering/committer/FlussTableLakeSnapshotCommitter.java (outdated, resolved)
fluss-server/src/main/java/org/apache/fluss/server/zk/data/lake/LakeTableHelper.java (outdated, resolved)
Pull request overview
Copilot reviewed 43 out of 43 changed files in this pull request and generated 6 comments.
...ink-common/src/main/java/org/apache/fluss/flink/tiering/committer/TieringCommitOperator.java (outdated, resolved)
fluss-server/src/main/java/org/apache/fluss/server/entity/CommitLakeTableSnapshotData.java (resolved)
fluss-server/src/main/java/org/apache/fluss/server/RpcServiceBase.java (outdated, resolved)
fluss-server/src/main/java/org/apache/fluss/server/coordinator/CoordinatorEventProcessor.java (outdated, resolved)
...common/src/test/java/org/apache/fluss/flink/tiering/committer/TieringCommitOperatorTest.java (resolved)
fluss-common/src/test/java/org/apache/fluss/utils/json/TableBucketOffsetsJsonSerdeTest.java (resolved)
@wuchong Could you please help review this PR? The PR also handles backward compatibility when v2 is used to serialize the lake table snapshot.
  optional int64 max_timestamp = 6;
}

message PbPrepareCommitLakeTableRespForTable {
Add a table_id field. Because a PrepareCommitLakeTableSnapshotRequest can carry multiple table ids, we need to distinguish which table each PbPrepareCommitLakeTableRespForTable belongs to.
I was thinking of using the array index to determine which table each PbPrepareCommitLakeTableRespForTable belongs to, but that is a weak contract. I'm fine with attaching the table id, which is a stronger contract and also aligns with CommitLakeTableSnapshotRequest.
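The difference between the two contracts can be sketched as follows. This is a minimal illustration, not the Fluss code: `TableRespSketch` and `RespLookupSketch` are hypothetical names, and the class only mirrors the `table_id` and offsets-path fields discussed in this thread rather than the generated protobuf type. Keying each response by its own table id makes the lookup independent of the order in which the server returns the entries.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-table response (assumption: not the generated protobuf class).
final class TableRespSketch {
    final long tableId;
    final String offsetsPath;

    TableRespSketch(long tableId, String offsetsPath) {
        this.tableId = tableId;
        this.offsetsPath = offsetsPath;
    }
}

final class RespLookupSketch {
    // Builds a tableId -> offsets path map from the responses, so the caller
    // never relies on the position of a response in the list.
    static Map<Long, String> pathsByTableId(List<TableRespSketch> responses) {
        Map<Long, String> result = new HashMap<>();
        for (TableRespSketch resp : responses) {
            if (result.containsKey(resp.tableId)) {
                throw new IllegalStateException(
                        "Duplicate response for table " + resp.tableId);
            }
            result.put(resp.tableId, resp.offsetsPath);
        }
        return result;
    }
}
```

With an index-based contract, a reordered or partially filled response list would silently attribute a path to the wrong table; with the id attached, the same situation either resolves correctly or fails loudly.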
fluss-server/src/main/java/org/apache/fluss/server/zk/data/lake/LakeTable.java (outdated, resolved)
fluss-server/src/main/java/org/apache/fluss/server/zk/data/lake/LakeTableJsonSerde.java (outdated, resolved)
fluss-server/src/main/java/org/apache/fluss/server/zk/data/lake/LakeTableSnapshotJsonSerde.java (outdated, resolved)
fluss-server/src/test/java/org/apache/fluss/server/zk/data/LakeTableSnapshotJsonSerdeTest.java (resolved)
fluss-server/src/main/java/org/apache/fluss/server/coordinator/CoordinatorService.java (outdated, resolved)
@wuchong Thanks for the review. The comments have been addressed.
wuchong
left a comment
Thanks @luoyuxia for the update. I only left some final comments.
coordinatorEventManager.put(
        new NotifyLakeTableOffsetEvent(
                commitLakeTableSnapshotData.getLakeTableSnapshot(),
                commitLakeTableSnapshotData
                        .getTableBucketsMaxTieredTimestamp()));
Moving the event-publishing logic into the forEach loop would cause the event to be sent multiple times, since the event is intended for all table ids, not per individual table.
coordinatorEventManager.put(
        new NotifyLakeTableOffsetEvent(
                lakeTableSnapshots,
                commitLakeTableSnapshotData
                        .getTableBucketsMaxTieredTimestamp()));
ditto
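The concern in the two comments above can be sketched as follows. This is a minimal illustration under stated assumptions: `EventQueueSketch` is a hypothetical stand-in for the coordinator event manager, and the string events stand in for `NotifyLakeTableOffsetEvent`; neither is the Fluss class. The point is only that an event covering all table ids belongs after the per-table loop, not inside it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the coordinator event queue (assumption).
final class EventQueueSketch {
    final List<String> published = new ArrayList<>();

    void put(String event) {
        published.add(event);
    }
}

final class PublishOnceSketch {
    // Problematic shape: the all-tables event is put once per table.
    static void publishInsideLoop(Map<Long, String> snapshotsByTableId, EventQueueSketch queue) {
        snapshotsByTableId.forEach(
                (tableId, snapshot) -> {
                    // per-table work would go here
                    queue.put("notify:" + snapshotsByTableId.keySet());
                });
    }

    // Intended shape: per-table work in the loop, one event afterwards.
    static void publishAfterLoop(Map<Long, String> snapshotsByTableId, EventQueueSketch queue) {
        snapshotsByTableId.forEach(
                (tableId, snapshot) -> {
                    // per-table work only, no publishing
                });
        queue.put("notify:" + snapshotsByTableId.keySet());
    }
}
```

For a request covering three tables, the first method enqueues three identical events and the second enqueues one.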
  LIST_REBALANCE_PROGRESS(1050, 0, 0, PUBLIC),
- CANCEL_REBALANCE(1051, 0, 0, PUBLIC);
+ CANCEL_REBALANCE(1051, 0, 0, PUBLIC),
+ PRE_LAKE_TABLE_SNAPSHOT(1052, 0, 0, PRIVATE);
PRE_LAKE_TABLE_SNAPSHOT -> PREPARE_LAKE_TABLE_SNAPSHOT
While PRE might imply "before," it’s ambiguous and non-idiomatic in event or phase naming. PREPARE clearly conveys the intent.
optional int32 error_code = 1;
optional string error_message = 2;
optional int64 table_id = 3;
optional string lake_table_bucket_offsets_path = 4;
Rename this to lake_table_offsets_path to align with the TableOffsets concept.
 * Prepares lake table snapshots by merging them with existing snapshots and storing them to the
 * file system.
Suggested change:
-  * Prepares lake table snapshots by merging them with existing snapshots and storing them to the
-  * file system.
+  * Prepares lake table snapshots by merging them with existing snapshots and storing them to the
+  * remote file system.
 *       completeness
 *   <li>Stores the merged snapshot to the remote file system. The stored file contains the log
 *       end offset information for each bucket in the table
 *   <li>Returns the file path where the snapshot is stored
Add an item for the second phase.
<li>Call {@link #commitLakeTableSnapshot(CommitLakeTableSnapshotRequest)} with the offset
* file path to finalize the snapshot commit to ZooKeeper in the second phase.
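The two phases described in this Javadoc thread can be sketched as follows. This is a simplified illustration, not the Fluss implementation: `TwoPhaseOffsetsSketch`, its method names, the path layout, and the `bucket=offset` file format are all assumptions, an in-memory map stands in for the remote file system, a second map stands in for ZooKeeper, and the merge with existing snapshots is omitted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class TwoPhaseOffsetsSketch {
    // Stand-in for the remote file system: path -> file content (assumption).
    final Map<String, String> remoteFiles = new HashMap<>();
    // Stand-in for ZooKeeper: tableId -> committed offsets file path (assumption).
    private final Map<Long, String> committedPathByTableId = new HashMap<>();

    // Phase 1: write the per-bucket log end offsets to a file and return its path.
    // Nothing is visible as committed after this step.
    String prepare(long tableId, long snapshotId, Map<Integer, Long> logEndOffsetByBucket) {
        StringBuilder content = new StringBuilder();
        // Sort by bucket id so the file content is deterministic.
        for (Map.Entry<Integer, Long> e : new TreeMap<>(logEndOffsetByBucket).entrySet()) {
            content.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        String path = "/lake/table-" + tableId + "/snapshot-" + snapshotId + ".offsets";
        remoteFiles.put(path, content.toString());
        return path;
    }

    // Phase 2: record the offsets file path for the table, finalizing the commit.
    void commit(long tableId, String offsetsPath) {
        if (!remoteFiles.containsKey(offsetsPath)) {
            throw new IllegalArgumentException("Unknown offsets file: " + offsetsPath);
        }
        committedPathByTableId.put(tableId, offsetsPath);
    }

    // Returns the committed offsets file path, or null if nothing is committed yet.
    String committedPath(long tableId) {
        return committedPathByTableId.get(tableId);
    }
}
```

Splitting the work this way means a failure between the two calls leaves at most an unreferenced file behind, while the committed metadata only ever points at a fully written offsets file.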
Purpose
Linked issue: close #2224
Brief change log
In TieringCommitOperator, first send a prepare-commit of the log offsets to the Fluss cluster, which writes a file that stores the log offsets.
Tests
Existing test
API and Format
Documentation