Proposed: support 4-6 hex digits in character literal hex form #94

@DeLaGuardo

The EDN specification currently restricts the hex form of character literals to exactly 4 hex digits (\uXXXX). This limitation prevents EDN from representing Unicode characters in the Supplementary Planes (U+10000 to U+10FFFF), which include:

  • Emoji (U+1F300–U+1F9FF)
  • Historic scripts (Linear B, Egyptian hieroglyphs, cuneiform)
  • Musical notation symbols
  • Mathematical alphanumeric symbols
  • CJK ideograph extensions
  • And many other modern and historic characters

Extending the character literal hex form to support 4-6 hex digits would allow direct representation of any valid Unicode code point (U+0000 to U+10FFFF).
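To make the proposal concrete, here is a minimal Python sketch of how a reader might accept the extended form. It is an illustration under stated assumptions, not the behavior of any existing EDN implementation: the function name `parse_hex_char` is hypothetical, the token is assumed to be already delimited by the reader, and the proposal text does not say how surrogate code points (U+D800 to U+DFFF) should be treated, so the sketch leaves them alone.

```python
import re

# Hypothetical grammar for the proposed form: a backslash, 'u',
# then 4 to 6 hex digits and nothing else.
_HEX_CHAR = re.compile(r"\\u([0-9A-Fa-f]{4,6})")

MAX_CODE_POINT = 0x10FFFF  # highest valid Unicode code point


def parse_hex_char(token: str) -> str:
    """Convert a token such as \\u0041 or \\u1F600 to a one-character string."""
    match = _HEX_CHAR.fullmatch(token)
    if match is None:
        raise ValueError(f"not a 4-6 digit hex character literal: {token!r}")
    code_point = int(match.group(1), 16)
    # Six hex digits can encode values up to 0xFFFFFF, so the range
    # still has to be checked explicitly.
    if code_point > MAX_CODE_POINT:
        raise ValueError(f"code point out of range: U+{code_point:X}")
    # Assumption: surrogate handling is unspecified in the proposal,
    # so no extra check is made here.
    return chr(code_point)
```

With this sketch, `parse_hex_char(r"\u0041")` yields `"A"` and `parse_hex_char(r"\u1F600")` yields the single character U+1F600, while `\u110000` is rejected because it lies above U+10FFFF.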

Many languages that have EDN implementations already provide escape syntax for code points above U+FFFF:

  • JavaScript: Supports \u{XXXXX} syntax (ES6+)
  • Python: Supports \UXXXXXXXX (8 hex digits)
  • Ruby: Supports \u{XXXXX}
