Description
Although not part of the official ISO/IEC 14977 definition of EBNF, it is quite common for EBNF grammars to rely on extensions that add support for explicit character codes (e.g. #xN, where N is a Unicode code point in hexadecimal) and character ranges (e.g. [a-zA-Z], [#xN-#xN]). It would be valuable if instaparse supported these extensions when parsing EBNF, given their ubiquity in real-world EBNF grammars and the fact that they fix some glaring limitations in the official EBNF spec (notably the lack of Unicode support).
In terms of "picking a standard extension", I would suggest the EBNF extensions that are defined and used in the XML specification, minus the XML-specific well-formedness and validity constraints. This specific form of EBNF extension has become a de facto standard, given the ubiquity of XML parsers.
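To make the notation concrete, here is a rough Python sketch of how the character-code and range terminals described above could be mapped onto regular expressions. This is not instaparse code, and `char_term_to_regex` is a hypothetical helper name chosen for illustration. It assumes the only special syntax inside a terminal is `#xN`, and it passes other literal characters through unescaped, so it is an illustration of the idea rather than a complete translator.

```python
import re

# Matches an XML-spec-style character code such as "#x41" or "#x4E00".
_CODE_POINT = re.compile(r"#x([0-9A-Fa-f]+)")


def _to_char(match):
    # Convert the hexadecimal code point to its character and escape it
    # so that characters like "-" or "]" stay literal in the regex.
    return re.escape(chr(int(match.group(1), 16)))


def char_term_to_regex(term):
    """Translate a character terminal (hypothetical helper) to a Python regex.

    Handles "#xN", "[a-zA-Z]", "[#xN-#xN]" and the negated "[^...]" forms.
    Assumption: literal characters inside brackets need no extra escaping.
    """
    if term.startswith("[") and term.endswith("]"):
        # Replace each code point inside the class; "-" and "^" written
        # directly in the grammar keep their range/negation meaning.
        return "[" + _CODE_POINT.sub(_to_char, term[1:-1]) + "]"
    match = _CODE_POINT.fullmatch(term)
    if match:
        return _to_char(match)
    raise ValueError("unsupported character terminal: " + term)


digit = re.compile(char_term_to_regex("[#x30-#x39]"))
cjk = re.compile(char_term_to_regex("[#x4E00-#x9FFF]"))
```

With these definitions, `digit.fullmatch("7")` succeeds and `digit.fullmatch("a")` does not, while `cjk` matches characters in the U+4E00 to U+9FFF block, which is the kind of Unicode coverage the plain ISO notation has no way to express.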