@jogrogan jogrogan commented Dec 12, 2025

This change implements the following:

  1. Adds create table support
    • This is somewhat awkward: a HoptimatorJdbcTable cannot be constructed directly, yet the table must exist in the connection schema for the Deployer to create it. I worked around this with a temporary table structure, but I don't love it.
    • It's also a little weird that Deployers often need the same properties that are already specified in the DB-specific JDBC URLs. We should add a way to pass this information through rather than specifying it in two places (see the configmap update).
  2. Adds a Venice Deployer & DeployerProvider that will get kicked off by CREATE TABLE VENICE."<my-store>".
    • Includes create, update, and delete support
    • Added several integration tests
  3. Fixes Avro default values for nullable Avro unions ["null", "SqlType"]
    • Without defaults, Venice schemas wouldn't be backwards compatible, causing updates to always fail
  4. Fixes PipelineReconciler by adding the missing failed state.
  5. Makes log lines consistent across DDL statements
  6. One main issue with SPIs in Java is that pulling in a dependency also pulls in its SPI, so if we want an internal implementation of the Venice DeployerProvider, we can't override the OSS one. I added the ability for the DeploymentService to filter out top-level SPIs in favor of a defined subclass, should one exist.

Left various TODOs for future enhancements

@jogrogan jogrogan force-pushed the jogrogan/createTable branch 5 times, most recently from 71e7c4f to 8311ecd Compare December 12, 2025 20:47
@jogrogan jogrogan requested a review from ryannedolan December 15, 2025 15:00
Collaborator

@ryannedolan ryannedolan left a comment

Really nice tests!


// TODO: Add support for populating new tables from a query as a one-time operation.
if (create.query != null) {
throw new DdlException(create, "Populating new tables is not currently supported.");

Collaborator

For us, I think CTAS would be the same as an MV, except without the job. Table triggers would work the same, and would backfill the table asynchronously. Maybe we can add REFRESH TABLE as well, which would make CTAS just a batch-only version of MV.

Collaborator Author

@jogrogan jogrogan Dec 18, 2025

Yeah, it is similar, but notably CTAS is meant as a one-off backfill, whereas an MV is meant to have a running job that keeps it in constant sync.
I guess it could be a trigger that is refreshed when executed, but then not really again?

source = new Source(temporaryTable.databaseName(), tablePath, Collections.emptyMap());
}
deployers = DeploymentService.deployers(source, connection);
logger.info("Deleting table {}", tableName);

Collaborator

Wondering how we prevent a table from getting deleted if it's used in a view. Maybe we should add a Validator to that effect at some point.

Collaborator Author

Hmm that's interesting. Validating if an existing MV is sinking to the same table intended to be deleted is actually super easy. But if this table is used as the source of an MV, that's quite a bit harder today.
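
To make the "easy" half of that concrete, here is a minimal sketch of a sink-side check. Everything in it is hypothetical: `DropTableCheckSketch`, `viewsSinkingTo`, and the view-to-sink map are illustrative names, not part of Hoptimator's Validator API. It only covers the case where an existing MV sinks to the table being dropped. The source-side case would need the view's query to be analyzed, which is the harder part mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DropTableCheckSketch {
    /**
     * Returns the names of views whose sink is the given table, sorted by view name.
     * A non-empty result would be a reason to reject the DROP TABLE.
     * The view -> sink map is an assumed input, not an existing Hoptimator structure.
     */
    static List<String> viewsSinkingTo(String table, Map<String, String> viewToSink) {
        List<String> blocking = new ArrayList<>();
        // TreeMap gives a deterministic iteration order regardless of the input map type.
        for (Map.Entry<String, String> entry : new TreeMap<>(viewToSink).entrySet()) {
            if (entry.getValue().equals(table)) {
                blocking.add(entry.getKey());
            }
        }
        return blocking;
    }
}
```

A caller could fail the delete with a DdlException-style error whenever the returned list is non-empty.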

}

/**
* Temporary table implementation used during CREATE TABLE to provide row type information

Collaborator

Smart

Collaborator Author

I still don't love this tbh. The temporary table does become sticky and will likely cause problems if we are reusing connections. Maybe we could put something in place so that if a table resolves as a temporary table, we force resolution of it?

Object defaultValue = null;
// For unions containing null, defaults are specified in a specific way
if (innerField.isUnion() && innerField.isNullable()) {
defaultValue = Schema.Field.NULL_DEFAULT_VALUE;

Collaborator

TIL!
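
For context on why the default matters: in Avro's Java API, passing a Java null as a field default means "no default", so `Schema.Field.NULL_DEFAULT_VALUE` is the sentinel used to say "the default is null". Without a default, a schema that adds a field cannot decode records written before that field existed. The sketch below is a toy model of that one resolution rule in plain Java. It does not use Avro, and the class name, method name, and the "field name to has-default" map are assumptions made for illustration.

```java
import java.util.Map;

class NullableDefaultSketch {
    /**
     * Toy model of one schema-resolution rule: a new schema can decode data
     * written with an old schema only if every field missing from the old
     * schema carries a default. Schemas are modelled as
     * field name -> "has a default"; this is not Avro's Schema API.
     */
    static boolean canReadOldData(Map<String, Boolean> newSchema, Map<String, Boolean> oldSchema) {
        for (Map.Entry<String, Boolean> field : newSchema.entrySet()) {
            boolean missingInOld = !oldSchema.containsKey(field.getKey());
            boolean hasDefault = field.getValue();
            if (missingInOld && !hasDefault) {
                return false; // nothing to fill in for records that predate this field
            }
        }
        return true;
    }
}
```

Under this model, adding a nullable column with no default makes the check fail, while the same column with a null default passes, which lines up with the update failures described in the PR summary.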


public static <T extends Deployable> Collection<Deployer> deployers(T obj, Connection connection) {
return providers().stream()
Collection<DeployerProvider> providers = providers();

Collaborator

This is interesting. If downstream plugins are extending from upstream plugins, maybe we should split our dependencies into e.g. venice-core and venice-plugin (or whatever names make sense), s.t. we can depend on venice-core and not get the SPI service file?

Collaborator Author

@jogrogan jogrogan Dec 18, 2025

Yea maybe, still some pros and cons. That'll get rid of some of the potential weirdness caused by only choosing subclasses (if there are any?) but still won't allow inheritance and overrides. Say we have this VeniceDeployer, we want to expose it internally but drop in a different ControllerClient implementation, we still wouldn't be able to extend this Deployer without pulling in the SPI (with that being said we could have a 3rd BaseClass of sorts existing in a different module). Idk both don't feel great to me.
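
For readers following along, the "prefer subclasses" filtering being discussed can be sketched roughly as below. This is not the code from the PR: `SpiFilterSketch`, `preferSubclasses`, and the provider classes are hypothetical stand-ins, and in practice the input list would presumably come from `java.util.ServiceLoader`. The idea is to drop any loaded provider whose class is a strict supertype of another loaded provider's class.

```java
import java.util.ArrayList;
import java.util.List;

class SpiFilterSketch {
    // Hypothetical stand-ins: an OSS provider, an internal subclass of it, and an unrelated provider.
    static class BaseProvider {}
    static class OssVeniceProvider extends BaseProvider {}
    static class InternalVeniceProvider extends OssVeniceProvider {}
    static class KafkaProvider extends BaseProvider {}

    /**
     * Keeps only providers that are not overridden by a more specific one.
     * A provider is considered overridden when another provider in the list
     * has a class that strictly extends (or implements) its class.
     */
    static <T> List<T> preferSubclasses(List<T> providers) {
        List<T> result = new ArrayList<>();
        for (T candidate : providers) {
            boolean overridden = false;
            for (T other : providers) {
                Class<?> candidateClass = candidate.getClass();
                Class<?> otherClass = other.getClass();
                // isAssignableFrom is true for the same class too, so exclude that case.
                if (candidateClass != otherClass && candidateClass.isAssignableFrom(otherClass)) {
                    overridden = true;
                    break;
                }
            }
            if (!overridden) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

With an OSS provider, its internal subclass, and an unrelated provider all loaded, only the internal subclass and the unrelated provider survive. As noted above, this still requires the OSS SPI to be on the classpath; it only decides which implementation wins once both are loaded.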


@Override
public void restore() {
// TODO: Restoration can be complicated

Collaborator

Maybe we can use connection.warn(msg) here.

@jogrogan jogrogan force-pushed the jogrogan/createTable branch from d05859a to 0971ac1 Compare January 6, 2026 20:16
@jogrogan jogrogan merged commit 1ea853f into main Jan 7, 2026
1 check passed
@jogrogan jogrogan deleted the jogrogan/createTable branch January 7, 2026 16:05