In the JSON Schema specification, the maxLength keyword specifies the number of characters in a field, not bytes. However, in Redshift (as in most databases, I believe) the length of a VARCHAR is specified in bytes, which may introduce a mismatch for text fields that are usually written by humans, not computers.
This is generally not a problem, as analytical data typically contains ASCII text (written by computers), where the number of bytes precisely matches the number of characters.
But at the same time, I can imagine an issue:
- A user is supposed to enter his/her city name in their native language (likely with non-ASCII characters)
- A web developer constrains the input field to 32 characters
- An analyst makes the wrong assumption that `maxLength: 32` is the correct constraint
- Redshift truncates non-ASCII city names, to 16 characters if every character takes two bytes
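To illustrate the mismatch, here is a minimal Python sketch. The helper names `utf8_len` and `truncate_to_bytes` are hypothetical, and the truncation only imitates what a byte-limited column would keep; it is not Redshift's actual implementation.

```python
def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def truncate_to_bytes(text: str, max_bytes: int) -> str:
    """Cut text so its UTF-8 encoding fits in max_bytes.

    A character split by the cut is dropped entirely (errors="ignore").
    """
    return text.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")


# A made-up 32-character "city name" of Cyrillic letters (2 bytes each in UTF-8).
city = "я" * 32

print(len(city))                          # characters, what maxLength counts
print(utf8_len(city))                     # bytes, what VARCHAR(n) counts
print(len(truncate_to_bytes(city, 32)))   # characters left after a 32-byte cut
```

With two-byte characters, the 32-character value needs 64 bytes, so a 32-byte column keeps only 16 characters. Characters that take three or four bytes in UTF-8 would leave even fewer.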
This could be addressed as part of #170 (format: "unicode", which specifies that a string has absolutely no structure) or a similar custom JSON Schema extension.
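As a rough sketch of how such an extension might influence the generated column size: the `varchar_size` function below is hypothetical, and treating `format: "unicode"` as a signal for multi-byte text is my assumption, not something #170 defines. It relies only on two facts: UTF-8 uses at most 4 bytes per character, and Redshift caps VARCHAR at 65535 bytes.

```python
UTF8_MAX_BYTES_PER_CHAR = 4      # upper bound for any UTF-8 encoded character
REDSHIFT_VARCHAR_MAX = 65535     # maximum VARCHAR size in Redshift, in bytes


def varchar_size(prop: dict) -> int:
    """Pick a VARCHAR byte size for a JSON Schema string property.

    Hypothetical rule: properties marked format "unicode" get the worst-case
    byte size; everything else is assumed to be ASCII (one byte per character).
    """
    max_length = prop["maxLength"]
    if prop.get("format") == "unicode":
        return min(max_length * UTF8_MAX_BYTES_PER_CHAR, REDSHIFT_VARCHAR_MAX)
    return max_length


print(varchar_size({"type": "string", "maxLength": 32}))
print(varchar_size({"type": "string", "maxLength": 32, "format": "unicode"}))
```

Sizing for the worst case wastes nothing on disk in a variable-length column, but it does mean the declared length no longer mirrors `maxLength` one to one.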