Unicode anomaly


The Unicode Standard imposes strict rules on itself to guarantee stability.[1] Depending on how strict a rule is, a change may be prohibited or allowed. For example, the "Name" given to a code point cannot and will not change, whereas the "Script" property is more flexible under Unicode's own rules. In version 2.0, Unicode changed many code point Names from version 1. At the same time, Unicode stated that from then on a Name assigned to a code point would never change. This implies that when mistakes are published, they cannot be corrected, even if they are trivial (as happened in one instance with the spelling BRAKCET for BRACKET in a character name).
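The effect of the name-stability rule can be observed from a programming language that exposes the Unicode character database. The following is a minimal sketch, assuming Python's standard `unicodedata` module and a Python build whose bundled database includes U+FE18; it only illustrates that the misspelled name is the one actually recorded.

```python
import unicodedata

# U+FE18 is the code point cited above for the BRAKCET misspelling.
ch = "\uFE18"

# name() reports the formal character name recorded in the Unicode
# database bundled with this Python build.
formal_name = unicodedata.name(ch)
print(formal_name)

# The published misspelling is part of the formal name.
assert "BRAKCET" in formal_name

# lookup() maps a name back to its character, so the misspelled
# formal name still identifies U+FE18.
assert unicodedata.lookup(formal_name) == ch
```

Because the formal name is fixed, any software that stores or matches character names has to carry the misspelling along with it.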

Anomalies

In 2006, Unicode published a list of anomalies in character names.[2]

  • U+0818 SAMARITAN MARK DAGESH and U+0819 SAMARITAN MARK OCCLUSION: Names mixed up.
Corrected text, names swapped:
U+0818 SAMARITAN MARK OCCLUSION (HTML &#2072; · "strengthens" the consonant, for example changing /w/ to /b/) and
U+0819 SAMARITAN MARK DAGESH (HTML &#2073; · indicates consonant gemination)[3]
  • U+2118 SCRIPT CAPITAL P (HTML &#8472; · &weierp;): it is not a capital
The name says "capital", but it is a small letter. The true capital is U+1D4AB 𝒫 MATHEMATICAL SCRIPT CAPITAL P (HTML &#119979;)[4]
  • U+FE18 PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET (HTML &#65048;): BRAKCET is a misspelling of BRACKET. Since character names are fixed by policy, it cannot be changed.[5]
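The HTML values given in the list are the decimal forms of the hexadecimal code points. The following sketch shows the conversion, assuming Python's standard `html` module; `numeric_reference` is a hypothetical helper introduced here for illustration and is not part of any library.

```python
import html


def numeric_reference(label: str) -> str:
    """Turn a 'U+XXXX' label into an HTML decimal character reference.

    Hypothetical helper for illustration only.
    """
    if not label.upper().startswith("U+"):
        raise ValueError(f"expected a label like 'U+2118', got {label!r}")
    # The digits after 'U+' are hexadecimal; HTML '&#...;' uses decimal.
    return f"&#{int(label[2:], 16)};"


for label in ("U+0818", "U+0819", "U+2118", "U+1D4AB", "U+FE18"):
    ref = numeric_reference(label)
    # html.unescape() decodes the reference back to the character itself,
    # so its code point must match the original label.
    assert ord(html.unescape(ref)) == int(label[2:], 16)
    print(label, ref)

# The named entity &weierp; decodes to the same character as &#8472;.
assert html.unescape("&weierp;") == html.unescape("&#8472;")
```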

References