-
Notifications
You must be signed in to change notification settings - Fork 9
Add entity type schemas #97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Adds support for per-entity-type schemas. Schemas are definied in configuration and applied as MongoDB data validations and indexes. The /api/v1/schemas/ endpoint exposes schemas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces a schema validation framework for entity types, enabling users to define field validation rules and database indexes. The implementation includes MongoDB/DocumentDB JSON schema validation, automatic index management, and new API endpoints for querying entity types and schemas.
- Added comprehensive schema validation framework with support for field types, patterns, enums, and case rules
- Implemented automatic MongoDB index creation/deletion with retry logic for handling concurrent operations
- Added
/schemasand/entity-typesREST API endpoints to expose schema configurations
Reviewed Changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| server/server.go | Integrated schema DDL execution during server boot with retry logic for MongoDB collMod conflicts |
| schema/mongo_schema.go | Core schema validation and MongoDB index management implementation |
| schema/mongo_schema_test.go | Unit tests for BSON schema validator and index operations |
| schema/mongo_integration_test.go | Integration tests verifying schema validation and index management against MongoDB |
| schema/config.go | Schema configuration data structures and YAML mappings |
| schema/README.md | Documentation for schema validation framework usage and extension |
| config/config.go | Added Schemas field to application config |
| api/api_gomux.go | Added HTTP handlers for schemas and entity-types endpoints |
| api/get_schemas_test.go | Tests for GET /schemas endpoint |
| api/entity_types_test.go | Tests for GET /entity-types endpoint |
| entity/validate.go | Added EntityTypes() method to Validator interface |
| go.mod | Added github.com/pkg/errors dependency |
| go.sum | Updated checksums for new dependency |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "context" | ||
| "fmt" | ||
| "log" | ||
| "math/rand" |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Using math/rand for jitter in retry logic is not cryptographically secure. While this is acceptable for jitter, consider using crypto/rand or math/rand/v2 for better randomness quality.
| "math/rand" | |
| "crypto/rand" |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 12 out of 13 changed files in this pull request and generated 15 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for ; try < 5; try++ { | ||
| if err = schema.CreateOrUpdateMongoSchema(context.Background(), db, s.appCtx.Config.Schemas); err == nil { | ||
| return nil | ||
| } | ||
| // Sleep 2-4 seconds before retrying. Updates are very fast, so we don't need long waits. | ||
| jitter := time.Duration(rand.Intn(2000)) * time.Millisecond |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of math/rand without seeding could lead to predictable retry patterns. Consider using math/rand/v2 (available in Go 1.22+) which is automatically seeded, or explicitly seed the random number generator with rand.New(rand.NewSource(time.Now().UnixNano())) to ensure different jitter values across application restarts.
| for ; try < 5; try++ { | |
| if err = schema.CreateOrUpdateMongoSchema(context.Background(), db, s.appCtx.Config.Schemas); err == nil { | |
| return nil | |
| } | |
| // Sleep 2-4 seconds before retrying. Updates are very fast, so we don't need long waits. | |
| jitter := time.Duration(rand.Intn(2000)) * time.Millisecond | |
| // Seed a local random number generator for jitter | |
| rnd := rand.New(rand.NewSource(time.Now().UnixNano())) | |
| for ; try < 5; try++ { | |
| if err = schema.CreateOrUpdateMongoSchema(context.Background(), db, s.appCtx.Config.Schemas); err == nil { | |
| return nil | |
| } | |
| // Sleep 2-4 seconds before retrying. Updates are very fast, so we don't need long waits. | |
| jitter := time.Duration(rnd.Intn(2000)) * time.Millisecond |
| // CreateOrUpdateMongoSchema creates or updates the MongoDB schema for the given entity. If the schema is nil or has | ||
| // empty fields, it removes the JSON schema validation. If the schema is not nil, it ensures that the indexes in the | ||
| // schema exists, and any indexes that are not in the schema are removed. Entity Collection creation is handled by the | ||
| // index creation process. We assume that any reasonably designed schema should not solely rely on full collection scans. |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comment: "reasonaby" should be "reasonably".
| # How It Works | ||
| ## Basic Structure | ||
| ### Syntax | ||
| **Syntax:** YAML format with `validations` at the top level. |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation inconsistency: Line 8 mentions validations at the top level, but the example YAML at line 37 uses schemas at the top level. This inconsistency could confuse users about the correct configuration structure.
| **Syntax:** YAML format with `validations` at the top level. | |
| **Syntax:** YAML format with `schemas` at the top level. |
|
|
||
| Define how the validation appears in your YAML file: | ||
| ```yaml | ||
| validations: |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation inconsistency: The example at line 125 again uses validations as the top-level key, but the actual implementation in the codebase uses schemas (as seen in config.go line 125 and the first README example at line 37). This inconsistency should be corrected to match the actual implementation.
| validations: | |
| schemas: |
| func updateMongoJSONValidation(ctx context.Context, db *mongo.Database, entity string, schema Schema, global Global) error { | ||
| log.Printf("INFO: Generating bson schema validator for %s", entity) | ||
|
|
||
| // If there are no field defined, we assume the entity owner want to bypass JSON validation and only use |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grammar issue in comment: "owner want" should be "owner wants".
| // If there are no field defined, we assume the entity owner want to bypass JSON validation and only use | |
| // If there are no field defined, we assume the entity owner wants to bypass JSON validation and only use |
| // We only handle enums for strings right now. | ||
| if field.Type != "string" && field.Enum != nil { | ||
| return nil, errors.Wrapf(errEnumNotString, "field %s is of type %q", field.Name, field.Type) | ||
| } |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The enum validation logic only checks if field.Type != "string", but bool-str type is treated as a string with a hardcoded enum of ["true", "false"]. This could cause confusion if someone tries to define a custom enum for a bool-str field. Consider adding a specific check to prevent enums on bool-str fields since the enum is automatically set, or document this behavior clearly.
| } | |
| } | |
| // Prevent custom enums for "bool-str" fields, as the enum is always ["true", "false"] | |
| if field.Type == "bool-str" && field.Enum != nil { | |
| return nil, errors.New(fmt.Sprintf("Custom enums are not supported for bool-str fields (field %s)", field.Name)) | |
| } |
| cfg.Entities = map[string]schema.EntitySchema{ | ||
| entityType: api.schemas.Entities[entityType], |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential nil pointer dereference: when an entity type is valid but not present in api.schemas.Entities, accessing api.schemas.Entities[entityType] at line 1215 will return a zero-value EntitySchema with a nil Schema field. While this may be intentional to return an empty schema for known types without configured schemas, it could be clearer to explicitly handle this case or document the behavior.
| cfg.Entities = map[string]schema.EntitySchema{ | |
| entityType: api.schemas.Entities[entityType], | |
| // If the entity type is valid but not present in api.schemas.Entities, | |
| // explicitly return an empty schema for that type. | |
| entitySchema, ok := api.schemas.Entities[entityType] | |
| if !ok { | |
| // Entity type is valid but has no configured schema; return empty schema. | |
| entitySchema = schema.EntitySchema{} | |
| } | |
| cfg.Entities = map[string]schema.EntitySchema{ | |
| entityType: entitySchema, |
| return nil, errors.Wrap(err, "Failed to decode index description") | ||
| } | ||
|
|
||
| log.Printf("INFO: Current existing index for %s: %+v", coll.Name(), index) |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in log message: "exising" should be "existing".
| func updateMongoJSONValidation(ctx context.Context, db *mongo.Database, entity string, schema Schema, global Global) error { | ||
| log.Printf("INFO: Generating bson schema validator for %s", entity) | ||
|
|
||
| // If there are no field defined, we assume the entity owner want to bypass JSON validation and only use |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grammar issue in comment: "field defined" should be "fields defined".
| // If there are no field defined, we assume the entity owner want to bypass JSON validation and only use | |
| // If there are no fields defined, we assume the entity owner want to bypass JSON validation and only use |
| type Field struct { | ||
| // Name is the name of the field in the schema. | ||
| Name string `yaml:"name"` | ||
| // Type is the type of the field. This can be string, int, bool, object, or enum. |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation inconsistency in comment: The comment at line 37 says "This can be string, int, bool, object, or enum" but enum is not a valid type. Looking at the BSONSchemaValidator function, the valid types are: string, int, bool, object, datetime, int-str, and bool-str. The comment should be updated to reflect the actual supported types.
| // Type is the type of the field. This can be string, int, bool, object, or enum. | |
| // Type is the type of the field. Supported types: string, int, bool, object, datetime, int-str, bool-str. |
|
I copied the schema code from block-etre and made a few changes along the way. The main changes are:
|
Users can now define a schema for entity types, including validation rules and indexing. New API endpoints have been added to fetch a list of entity types and schemas.