Support for EBNF character code and character range extensions #237

@pmonks

Although not part of the official ISO/IEC 14977 definition of EBNF, it is quite common for EBNF grammars to rely on extensions that add explicit character codes (e.g. #xN, where N is a Unicode code point in hexadecimal) and character ranges (e.g. [a-zA-Z], [#xN-#xN]). It would be valuable for instaparse to support these extensions when parsing EBNF, given how widespread they are in real-world EBNF grammars and the fact that they address some glaring limitations of the official EBNF spec (notably the lack of Unicode support).

As for "picking a standard extension", I would suggest the EBNF extensions defined and used in the XML specification, minus the XML-specific well-formedness and validity constraints. This form of EBNF extension has become a de facto standard, given the ubiquity of XML parsers.
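
To make the request concrete, here is a rough sketch of how the two XML-style terms could be lowered to ordinary regular expressions, which is one plausible way to support them given that instaparse grammars can already contain regex terminals. It is written in Python purely for illustration. The function names `char_code_to_regex` and `char_class_to_regex` are hypothetical and are not part of instaparse, and the output targets Python's `re` syntax; a real implementation would need to emit Java regex syntax instead.

```python
import re

# Matches one XML-spec style character code such as "#x41" or "#xD7FF".
_CHAR_CODE = re.compile(r"#x([0-9A-Fa-f]+)")


def char_code_to_regex(term: str) -> str:
    """Translate a single '#xN' term into a regex matching that code point."""
    m = _CHAR_CODE.fullmatch(term)
    if m is None:
        raise ValueError(f"not a character code: {term!r}")
    return re.escape(chr(int(m.group(1), 16)))


def char_class_to_regex(term: str) -> str:
    """Translate '[a-zA-Z]', '[#xN-#xN]' or '[^...]' into a regex class.

    Assumption: a bare '-' between two items is always the range operator;
    a literal hyphen has to be written as '#x2D'.
    """
    if len(term) < 3 or not (term.startswith("[") and term.endswith("]")):
        raise ValueError(f"not a character class: {term!r}")
    body = term[1:-1]
    negated = body.startswith("^")
    if negated:
        body = body[1:]
    if not body:
        raise ValueError(f"empty character class: {term!r}")

    # Tokenise into (character, came_from_char_code) pairs.
    tokens = []
    i = 0
    while i < len(body):
        m = _CHAR_CODE.match(body, i)
        if m:
            tokens.append((chr(int(m.group(1), 16)), True))
            i = m.end()
        else:
            tokens.append((body[i], False))
            i += 1

    # Fold "lo - hi" triples into ranges; everything else is a single item.
    parts = []
    j = 0
    while j < len(tokens):
        ch = tokens[j][0]
        if j + 2 < len(tokens) and tokens[j + 1] == ("-", False):
            hi = tokens[j + 2][0]
            parts.append(f"{re.escape(ch)}-{re.escape(hi)}")
            j += 3
        else:
            parts.append(re.escape(ch))
            j += 1

    return "[" + ("^" if negated else "") + "".join(parts) + "]"


if __name__ == "__main__":
    print(char_code_to_regex("#x41"))          # A
    print(char_class_to_regex("[a-zA-Z]"))     # [a-zA-Z]
    print(char_class_to_regex("[#x41-#x5A]"))  # [A-Z]
```

The sketch covers only the character-level terms mentioned above (codes, ranges, enumerations, and negated classes); the rest of the XML specification's notation is out of scope here.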
