Open
Description
I have documents with Unicode characters, which rowan handles correctly. But the orgize-wasm frontend has issues because it creates the data-range links without considering that the indices are UTF-8 byte offsets, while JavaScript strings are indexed by UTF-16 code units. This mismatch causes the links to get out of sync rapidly.
For instance, parse this input in the web frontend:
Hello
world
Foo 💩 Bar
Baz
- 💩
- Lorem ipsum dolor sit amet
- 💩
- consectetur adipiscing elit
- 💩
- sed do eiusmod tempor incididunt ut labore et dolore magna aliqua
In the "Syntax" tab, clicking on the range in TEXT@0..7 "Hello\r\n" properly highlights "Hello", and the same goes for "world". However, after the poo emoji, the links desynchronize from the input. The range in TEXT@34..39 "Baz\r\n" highlights the "z" and all of the newlines up to the hyphen. The list is worse, since each emoji shifts every following range by 2 additional "phantom" UTF-16 characters.
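To illustrate the mismatch, here is a minimal sketch of how the frontend could translate a UTF-8 byte offset into a UTF-16 index before slicing the input string. The names `utf8ByteLength` and `byteOffsetToUtf16Index` are hypothetical and not part of orgize or orgize-wasm. The sample string is a shortened version of the input above, so its offsets differ from the ones in the syntax tree.

```typescript
// Number of bytes a single Unicode code point occupies in UTF-8.
function utf8ByteLength(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

// Hypothetical helper: converts a UTF-8 byte offset into the matching
// UTF-16 code unit index of `text`. If the byte offset falls inside a
// multi-byte character, the index just past that character is returned.
function byteOffsetToUtf16Index(text: string, byteOffset: number): number {
  let bytes = 0;
  let index = 0;
  while (index < text.length && bytes < byteOffset) {
    const codePoint = text.codePointAt(index)!;
    bytes += utf8ByteLength(codePoint);
    // Code points above U+FFFF take two UTF-16 code units (a surrogate pair).
    index += codePoint > 0xffff ? 2 : 1;
  }
  return index;
}

// Shortened sample: in UTF-8, "Baz\r\n" occupies bytes 14..19 here.
const sample = "Foo 💩 Bar\r\nBaz\r\n";
const start = byteOffsetToUtf16Index(sample, 14);
const end = byteOffsetToUtf16Index(sample, 19);

console.log(JSON.stringify(sample.slice(14, 19)));     // "z\r\n" (byte offsets used directly)
console.log(JSON.stringify(sample.slice(start, end))); // "Baz\r\n"
```

Using the byte offsets directly reproduces the symptom described above: the selection starts two code units too late for each preceding emoji. An alternative would be to emit UTF-16 offsets from the Rust side instead. Which side should do the conversion is a design choice this sketch does not settle.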