Pull request overview
This PR adds comprehensive JSON Schema definitions for Searchlite, introducing validation specifications for both search requests and index configurations. The schemas use JSON Schema Draft 2020-12 and provide detailed type definitions, constraints, and documentation for all supported features.
- Defines complete search request schema with support for queries, filters, aggregations, sorting, and vector search
- Defines index schema covering text, keyword, numeric, nested, and vector field configurations
- Establishes validation rules and documentation for all API parameters and data structures
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| schemas/search.json | Comprehensive schema defining the search request API including query parameters, filters (keyword equality/in, numeric ranges, nested filters), multiple aggregation types (terms, range, histogram, stats), sorting, pagination, and optional vector search support |
| schemas/index.json | Schema defining index configuration structure with support for multiple field types (text, keyword, numeric, vector) at top-level and nested levels, including field options for storage, indexing, tokenization, and nullability |
| "type": "array", | ||
| "items": { "$ref": "#/$defs/date_range_bound" } | ||
| }, | ||
| "missing": {}, |
The `missing` property uses the empty schema (`{}`), which accepts any JSON value. This is inconsistent with other aggregations, where `missing` has a specific type constraint. Consider adding a type constraint for consistency and clearer validation.
| "type": { "const": "histogram" }, | ||
| "field": { "type": "string", "minLength": 1 }, | ||
| "interval": { "type": "number", "exclusiveMinimum": 0 }, | ||
| "offset": { "anyOf": [{ "type": "number" }, { "type": "null" }] }, |
The histogram interval uses "exclusiveMinimum": 0, which correctly requires a strictly positive value. However, the offset field on line 283 accepts any number, including negative values. Consider whether negative offsets are semantically valid for your use case; if they are not, constrain this value.
| "offset": { "anyOf": [{ "type": "number" }, { "type": "null" }] }, | |
| "offset": { "anyOf": [{ "type": "number", "minimum": 0 }, { "type": "null" }] }, |
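Whether a negative offset is meaningful depends on how buckets are keyed. The following is a minimal sketch, assuming Searchlite follows the common histogram convention in which `offset` shifts the bucket boundaries. The helper name `histogram_bucket_key` is hypothetical, and nothing in this PR confirms that Searchlite uses this formula.

```python
import math


def histogram_bucket_key(value: float, interval: float, offset: float = 0.0) -> float:
    """Lower bound of the bucket containing `value` (hypothetical helper)."""
    if interval <= 0:
        # Mirrors the schema's "exclusiveMinimum": 0 on `interval`.
        raise ValueError("interval must be strictly positive")
    # Shift by offset, snap down to a multiple of interval, then shift back.
    return math.floor((value - offset) / interval) * interval + offset


# With interval 5, an offset of -2 produces the same boundaries as an offset of 3.
print(histogram_bucket_key(7, 5, offset=-2))  # 3
print(histogram_bucket_key(7, 5, offset=3))   # 3
```

Under that assumed convention a negative offset is simply another way to write a boundary shift, so "minimum": 0 would reject well-defined input. If Searchlite keys buckets differently, the stricter constraint may still be appropriate.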
| "properties": { | ||
| "type": { "const": "stats" }, | ||
| "field": { "type": "string", "minLength": 1 }, | ||
| "missing": {} |
The `missing` property in stats_agg, extended_stats_agg, and value_count_agg uses the empty schema (`{}`) on lines 340, 350, and 360, which accepts any JSON value. This is inconsistent with other aggregations, where `missing` has a specific type constraint (e.g., number for histogram_agg on line 293, string for date_histogram_agg on line 321). Consider adding type constraints for consistency and clarity.
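To make the difference concrete, here is a minimal sketch comparing the empty schema with a typed one. It hand-rolls only the `type` keyword for a few primitive types and is not a JSON Schema validator. The names `accepts` and `TYPE_CHECKS` are hypothetical, and choosing `number` for stats_agg is an assumption, not something this PR specifies.

```python
# Checks for a handful of JSON Schema primitive types (illustrative subset only).
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # In JSON Schema, booleans are not numbers, so exclude Python's bool explicitly.
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "null": lambda v: v is None,
}


def accepts(schema: dict, value) -> bool:
    """Return True if `value` satisfies the schema's `type` keyword, if any."""
    expected = schema.get("type")
    if expected is None:
        # The empty schema places no constraint on the value.
        return True
    if expected not in TYPE_CHECKS:
        raise ValueError(f"unsupported type in this sketch: {expected}")
    return TYPE_CHECKS[expected](value)


print(accepts({}, "n/a"))                  # True
print(accepts({"type": "number"}, "n/a"))  # False
print(accepts({"type": "number"}, 0))      # True
```

With the empty schema, a request such as "missing": "n/a" on a numeric stats field passes validation and can only fail later, at execution time.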
| "properties": { | ||
| "type": { "const": "extended_stats" }, | ||
| "field": { "type": "string", "minLength": 1 }, | ||
| "missing": {} |
The `missing` property uses the empty schema (`{}`), which accepts any JSON value. This is inconsistent with other aggregations, where `missing` has a specific type constraint (e.g., number for histogram_agg, string for date_histogram_agg). Consider adding a type constraint for consistency and clarity.
| "properties": { | ||
| "type": { "const": "top_hits" }, | ||
| "size": { "type": "integer", "minimum": 0 }, | ||
| "from": { "type": "integer", "minimum": 0 }, |
The top_hits_agg requires both "type" and "size", while "from" is optional. Nothing in the schema prevents "from" from being greater than or equal to the number of available results, and JSON Schema has no standard keyword for comparing two sibling property values. Consider adding documentation or other constraints that clarify how "from" and "size" relate, to prevent invalid pagination configurations.
| "from": { "type": "integer", "minimum": 0 }, | |
| "from": { | |
| "type": "integer", | |
| "minimum": 0, | |
| "description": "Optional offset for pagination within the top hits. Implementations may require that `from` is less than the maximum possible results addressed by `size` (for example, ensuring from + size does not exceed the allowed range) to avoid invalid or empty pages." | |
| }, |
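A relationship between two fields is usually checked in application code after schema validation. The following is a minimal sketch, assuming a hypothetical upper bound `max_window` on `from + size`. Neither the helper `top_hits_window_errors` nor the default limit of 100 comes from this PR.

```python
from typing import List


def top_hits_window_errors(size: int, from_: int = 0, max_window: int = 100) -> List[str]:
    """Collect pagination problems that the schema alone cannot express."""
    errors = []
    if size < 0:
        errors.append("size must be >= 0")
    if from_ < 0:
        errors.append("from must be >= 0")
    # Cross-field rule: the requested window must fit inside the allowed range.
    if from_ + size > max_window:
        errors.append(f"from + size must not exceed {max_window}")
    return errors


print(top_hits_window_errors(size=10, from_=0))   # []
print(top_hits_window_errors(size=10, from_=95))  # ['from + size must not exceed 100']
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue in one response.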
| }, | ||
| "bmw_block_size": { | ||
| "description": "Optional block size override for BMW execution.", | ||
| "anyOf": [{ "type": "integer", "minimum": 0 }, { "type": "null" }] |
The minimum value of 0 for bmw_block_size may not be semantically valid. A block size of 0 would not be meaningful in most contexts. Consider changing this to "minimum": 1 or "exclusiveMinimum": 0 to prevent invalid configurations.
| "anyOf": [{ "type": "integer", "minimum": 0 }, { "type": "null" }] | |
| "anyOf": [{ "type": "integer", "minimum": 1 }, { "type": "null" }] |
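To illustrate why zero is problematic, here is a minimal sketch. It assumes that BMW stands for Block-Max WAND and that the block size controls how a posting list is partitioned into blocks. `split_into_blocks` is a hypothetical helper, not Searchlite code.

```python
from typing import List, Sequence


def split_into_blocks(doc_ids: Sequence[int], block_size: int) -> List[List[int]]:
    """Partition a posting list into consecutive blocks of at most `block_size` IDs."""
    if block_size < 1:
        # range() with a step of 0 raises ValueError, so reject it up front
        # with a clearer message.
        raise ValueError("block_size must be at least 1")
    return [list(doc_ids[i:i + block_size]) for i in range(0, len(doc_ids), block_size)]


print(split_into_blocks([3, 8, 15, 21, 42], 2))  # [[3, 8], [15, 21], [42]]
```

A schema-level "minimum": 1 moves this failure from query execution to request validation.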
| "properties": { | ||
| "type": { "const": "value_count" }, | ||
| "field": { "type": "string", "minLength": 1 }, | ||
| "missing": {} |
The `missing` property uses the empty schema (`{}`), which accepts any JSON value. This is inconsistent with other aggregations, where `missing` has a specific type constraint (e.g., number for histogram_agg, string for date_histogram_agg). Consider adding a type constraint for consistency and clarity.
| "min_doc_count": { | ||
| "anyOf": [{ "type": "integer", "minimum": 0 }, { "type": "null" }] | ||
| }, | ||
| "missing": {}, |
The `missing` property in terms_agg and range_agg uses the empty schema (`{}`), which accepts any JSON value. This is inconsistent with other aggregations, where `missing` has a specific type constraint. Consider adding type constraints for consistency and clearer validation.
| "field": { "type": "string", "minLength": 1 }, | ||
| "keyed": { "type": "boolean" }, | ||
| "ranges": { "type": "array", "items": { "$ref": "#/$defs/range_bound" } }, | ||
| "missing": {}, |
The `missing` property uses the empty schema (`{}`), which accepts any JSON value. This is inconsistent with other aggregations, where `missing` has a specific type constraint. Consider adding a type constraint for consistency and clearer validation.