Description
I am taking a look at named-characters.yml at @rocky's request, and it occurs to me that there are some philosophical questions that should be answered about which Unicode symbols should be included and how. It appears Mathics has been relatively conservative with respect to adding Unicode aliases, so I think this discussion is really about making additions to what you already have from here on, not really about removing existing symbols.
Broadly speaking, I propose the following heuristics:
1. Unicode symbols used by Mathematica should be used in the same way by Mathics for the sake of compatibility.
2. Unicode symbols that correspond semantically with existing mathematical symbols should be included. Example: `−` (U+2212, "Minus Sign") should be an alias for ASCII `-` even though Mathematica does not consider it so.
3. Unicode symbols outside of the Mathematical Operators Block (and the ASCII block) should be excluded unless one of the previous heuristics includes it. Example: `✕` (U+2715, "Multiplication X") can be used for `Times` but is in the Dingbats Block and is thus excluded.
4. All typographical variants of "plain"/"regular" symbols should be excluded unless included by a previous heuristic. For example, all Full Width variants, bold variants, italic variants, and so forth are excluded.
5. Unicode symbols should not be overloaded, i.e. should not be used for more than one underlying function, unless required for Mathematica compatibility. For example, `≫` (U+226B, "Much Greater-Than") is already used for `GreaterGreater` and therefore should not be an alias for `>>` for `Put`. Likewise, `≪` (U+226A, "Much Less-Than") for `Get`, `∷` (U+2237, "Proportion") for `MessageName`, etc.
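To make heuristics 3 to 5 concrete, here is a rough Python sketch of how they could be checked mechanically. It is only an illustration: the helper names and the `(symbol, operator_name)` pair format are hypothetical and do not reflect the actual layout of named-characters.yml. The typographical-variant check relies on the `<wide>` and `<font>` compatibility decomposition tags, which is an approximation of heuristic 4 rather than an exact definition.

```python
import unicodedata

# Block ranges as defined by the Unicode standard.
ASCII_BLOCK = range(0x0000, 0x0080)           # Basic Latin
MATH_OPERATORS_BLOCK = range(0x2200, 0x2300)  # Mathematical Operators


def in_allowed_block(ch):
    """Heuristic 3: is the symbol in the ASCII or Mathematical Operators block?"""
    return ord(ch) in ASCII_BLOCK or ord(ch) in MATH_OPERATORS_BLOCK


def is_typographic_variant(ch):
    """Heuristic 4 (approximation): full-width and font variants carry a
    <wide> or <font> tag in their compatibility decomposition."""
    return unicodedata.decomposition(ch).startswith(("<wide>", "<font>"))


def find_overloads(aliases):
    """Heuristic 5: return the symbols that are mapped to more than one operator.

    `aliases` is a hypothetical iterable of (symbol, operator_name) pairs.
    """
    seen = {}
    for symbol, name in aliases:
        seen.setdefault(symbol, set()).add(name)
    return {s: names for s, names in seen.items() if len(names) > 1}


# Proposed aliases used as examples in this issue (not the real file contents).
proposed = [
    ("\u226b", "GreaterGreater"),
    ("\u226b", "Put"),
    ("\u2212", "Minus"),
]
overloaded = find_overloads(proposed)
```

With these inputs, `overloaded` flags `≫` because it is attached to both `GreaterGreater` and `Put`, while `in_allowed_block("\u2715")` rejects `✕` for being outside both blocks. Heuristics 1 and 2 still need human judgment, so a check like this could only ever be a lint aid.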
The general idea is to continue to be conservative while also covering compatibility and use cases we are reasonably likely to encounter. I also argue that having these heuristics written down somewhere is helpful for future contributors, whether future us or someone else, for a variety of reasons.
These heuristics do not cover all cases worthy of discussion. Here are two cases where it's not clear whether the Unicode aliases should be included:
- `≔` (U+2254, "Colon Equals") for `SetDelayed`. This feels to me to be, while not exact, a close enough semantic correspondence to include it under heuristic 2.
- `⋙` (U+22D9, "Very Much Greater-Than") for `PutAppend`. This would be awkward to include considering `≫` (U+226B, "Much Greater-Than") cannot be an alias for `Put` (by heuristic 5).
This issue is to solicit:
- Discussion on the heuristics themselves
- Discussion on recording them somewhere
- Opinions about my two specific symbols, `≔` and `⋙`