Error in ranges for isalpharune generation from UnicodeData.txt - CJK Ideographs #13

@mcrouse-chrome

tl;dr: isalpharune treats CJK ideographs as non-alphabetic.

CJK ideographs are specified in UnicodeData.txt (9.0.0) as a First/Last pair:

12018 4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
12019 9FD5;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;

However, the generation step (via awk) emits these two entries as singles instead of as one range. As a result, isalpharune() returns false for every rune in that range other than the two endpoints.

My best guess is that the awk script expects the two ends of a range to have similar hex codes, and these two have nothing in common (4E00 vs 9FD5).
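To illustrate what I would expect the generator to do, here is a minimal Python sketch (not the project's awk script) that pairs `<..., First>` and `<..., Last>` entries into one range by name, regardless of how different the two hex codes are. The function names `parse_alpha_ranges` and `is_alpha` are hypothetical, the input lines are assumed to be raw UnicodeData.txt records without a leading line number, and treating every `L*` general category as alphabetic is my assumption about what isalpharune is meant to cover.

```python
def parse_alpha_ranges(lines):
    """Split UnicodeData.txt records into letter ranges and letter singles.

    Assumption: a code point counts as alphabetic when its general
    category (third field) starts with "L".
    """
    ranges, singles = [], []
    range_start = None  # code point of a pending "<..., First>" record
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        code = int(fields[0], 16)
        name, category = fields[1], fields[2]
        if name.endswith(", First>"):
            # Remember the start; the matching Last record closes the range.
            range_start = code
            continue
        if name.endswith(", Last>") and range_start is not None:
            if category.startswith("L"):
                ranges.append((range_start, code))
            range_start = None
            continue
        if category.startswith("L"):
            singles.append(code)
    return ranges, singles


def is_alpha(code, ranges, singles):
    """Return True if code is a letter single or lies inside a letter range."""
    return code in singles or any(lo <= code <= hi for lo, hi in ranges)


SAMPLE = [
    "0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;",
    "9FD5;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;",
]

ranges, singles = parse_alpha_ranges(SAMPLE)
```

With this pairing, a rune in the middle of the block such as U+4E2D falls inside the single range 4E00..9FD5, which is the behavior I would expect from isalpharune().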
