├── .gitignore
├── LICENSE
├── README.md
├── data
    ├── ArabicShaping.txt
    ├── Blocks.txt
    ├── CaseFolding.txt
    ├── CompositionExclusions.txt
    ├── DerivedNormalizationProps.txt
    ├── EastAsianWidth.txt
    ├── IndicPositionalCategory.txt
    ├── IndicSyllabicCategory.txt
    ├── PropertyValueAliases.txt
    ├── Scripts.txt
    ├── SpecialCasing.txt
    ├── UnicodeData.txt
    └── extracted
    │   └── DerivedNumericValues.txt
├── index.js
├── package.json
├── parser.js
├── test.js
└── update-data.sh

/.gitignore:
--------------------------------------------------------------------------------
1 | node_modules/
2 | .DS_Store
3 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2014-present Devon Govett
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # codepoints
2 |
3 | A parser for files in the Unicode database. Produces a giant array of codepoint objects for
4 | every character represented by Unicode, with many properties derived from files in the Unicode
5 | database.
6 |
7 | **BUILD SCRIPTS ONLY**: Use in production is not recommended
8 | as the parsers are not optimized for speed, the text files are huge, and the resulting array uses a
9 | huge amount of memory. To access this data in real-world applications, use modules that have
10 | precompiled the data into a compressed form:
11 |
12 | * [unicode-properties](https://github.com/devongovett/unicode-properties)
13 |
14 | ## Installation
15 |
16 | Install using npm:
17 |
18 |     npm install codepoints
19 |
20 | ## Usage
21 |
22 | Basic usage:
23 |
24 | ```js
25 | const codepoints = require('codepoints');
26 | ```
27 |
28 | The parser generates data by reading the text files contained in the
29 | [Unicode Character Database](http://unicode.org/ucd/). By default, it will use the database
30 | bundled with this package.
To use a custom version of the UCD, use `codepoints/parser` instead,
31 | which accepts an optional path to a directory containing the uncompressed UCD data:
32 |
33 | ```js
34 | const parser = require('codepoints/parser');
35 | const codepoints = parser('/path/to/UCD');
36 | ```
37 |
38 | ## Codepoint data
39 |
40 | Each element in the generated array is either `undefined` (for unassigned code
41 | points), or an object containing the following properties:
42 |
43 | * `code` - the code point index
44 | * `name` - character name
45 | * `unicode1Name` - legacy name used by Unicode 1
46 | * `category` - Unicode category
47 | * `block` - the block name this character is a part of
48 | * `script` - the script this character belongs to
49 | * `eastAsianWidth` - the East Asian width for this character
50 | * `combiningClass` - numeric combining class value
51 | * `combiningClassName` - a string name for the combining class
52 | * `bidiClass` - class for the Unicode bidirectional algorithm
53 | * `bidiMirrored` - whether the character is mirrored in the bidi algorithm
54 | * `numeric` - the numeric value for this character
55 | * `uppercase` - an array of code points mapping this character to upper case, if any
56 | * `lowercase` - an array of code points mapping this character to lower case, if any
57 | * `titlecase` - an array of code points mapping this character to title case, if any
58 | * `folded` - an array of code points mapping this character to a folded equivalent, if any
59 | * `caseConditions` - conditions used during case mapping for this character
60 | * `decomposition` - an array of code points that this character decomposes into. Used by the Unicode normalization algorithm.
61 | * `compositions` - a dictionary mapping of compositions for this character
62 | * `isCompat` - whether the decomposition is a compatibility one
63 | * `isExcluded` - whether the character is excluded from composition
64 | * `NFC_QC` - quickcheck value for NFC (0 = YES, 1 = NO, 2 = MAYBE)
65 | * `NFKC_QC` - quickcheck value for NFKC (0 = YES, 1 = NO, 2 = MAYBE)
66 | * `NFD_QC` - quickcheck value for NFD (0 = YES, 1 = NO)
67 | * `NFKD_QC` - quickcheck value for NFKD (0 = YES, 1 = NO)
68 | * `joiningType` - Arabic joining type
69 | * `joiningGroup` - Arabic joining group
70 |
71 | ## License
72 |
73 | MIT
74 |
--------------------------------------------------------------------------------
/data/ArabicShaping.txt:
--------------------------------------------------------------------------------
1 | # ArabicShaping-12.0.0.txt
2 | # Date: 2018-09-22, 23:54:00 GMT [KW, RP]
3 | # © 2018 Unicode®, Inc.
4 | # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
5 | # For terms of use, see http://www.unicode.org/terms_of_use.html
6 | #
7 | # This file is a normative contributory data file in the
8 | # Unicode Character Database.
9 | #
10 | # This file defines the Joining_Type and Joining_Group property
11 | # values for Arabic, Syriac, N'Ko, Mandaic, Manichaean,
12 | # Hanifi Rohingya, and Sogdian positional
13 | # shaping, repeating in machine readable form the information
14 | # exemplified in Tables 9-3, 9-8, 9-9, 9-10, 9-14, 9-15, 9-16, 9-19,
15 | # 9-20, 10-4, 10-5, 10-6, 10-7, 14-10, 16-16, and 19-5 of The Unicode Standard core
16 | # specification. This file also defines Joining_Type values for
17 | # Mongolian, Phags-pa, Psalter Pahlavi, and Adlam positional shaping,
18 | # which are not listed in tables in the standard.
19 | #
20 | # See Sections 9.2, 9.3, 9.5, 10.5, 10.6, 13.4, 14.3, 14.10, 16.13, 19.4, and 19.9
21 | # of The Unicode Standard core specification for more information.
22 | #
23 | # Each line contains four fields, separated by a semicolon.
24 | #
25 | # Field 0: the code point, in 4-digit hexadecimal
26 | #   form, of an Arabic, Syriac, N'Ko, Mandaic, Mongolian,
27 | #   Phags-pa, Manichaean, Psalter Pahlavi, Hanifi Rohingya, Sogdian,
28 | #   or other character.
29 | #
30 | # Field 1: gives a short schematic name for that character.
31 | #   The schematic name is descriptive of the shape, based as
32 | #   consistently as possible on a name for the skeleton and
33 | #   then the diacritic marks applied to the skeleton, if any.
34 | #   Note that this schematic name is considered a comment,
35 | #   and does not constitute a formal property value.
36 | #
37 | # Field 2: defines the joining type (property name: Joining_Type)
38 | #   R Right_Joining
39 | #   L Left_Joining
40 | #   D Dual_Joining
41 | #   C Join_Causing
42 | #   U Non_Joining
43 | #   T Transparent
44 | #
45 | # See Section 9.2, Arabic for more information on these joining types.
46 | # Note that for cursive joining scripts which are typically rendered
47 | # top-to-bottom, rather than right-to-left, Joining_Type=L conventionally
48 | # refers to bottom joining, and Joining_Type=R conventionally refers
49 | # to top joining. See Section 14.3, Phags-pa for more information on the
50 | # interpretation of joining types in vertical layout.
51 | #
52 | # Field 3: defines the joining group (property name: Joining_Group)
53 | #
54 | # The values of the joining group are based schematically on character
55 | # names. Where a schematic character name consists of two or more parts
56 | # separated by spaces, the formal Joining_Group property value, as specified in
57 | # PropertyValueAliases.txt, consists of the same name parts joined by
58 | # underscores. Hence, the entry:
59 | #
60 | #   0629; TEH MARBUTA; R; TEH MARBUTA
61 | #
62 | # corresponds to [Joining_Group = Teh_Marbuta].
63 | #
64 | # Note: The property value now designated [Joining_Group = Teh_Marbuta_Goal]
65 | # used to apply to both of the following characters
66 | # in earlier versions of the standard:
67 | #
68 | #   U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE
69 | #   U+06C3 ARABIC LETTER TEH MARBUTA GOAL
70 | #
71 | # However, it currently applies only to U+06C3, and *not* to U+06C2.
72 | # To avoid destabilizing existing Joining_Group property aliases, the
73 | # prior Joining_Group value for U+06C3 (Hamza_On_Heh_Goal) has been
74 | # retained as a property value alias, despite the fact that it
75 | # no longer applies to its namesake character, U+06C2.
76 | # See PropertyValueAliases.txt.
77 | #
78 | # When other cursive scripts are added to the Unicode Standard in the
79 | # future, the joining group value of all its letters will default to
80 | # jg=No_Joining_Group in this data file. Other, more specific
81 | # joining group values will be defined only if an explicit proposal
82 | # to define those values exactly has been approved by the UTC. This
83 | # is the convention exemplified by the N'Ko, Mandaic, Mongolian,
84 | # Phags-pa, Psalter Pahlavi, and Sogdian scripts.
85 | # Only the Arabic, Manichaean, and Syriac scripts currently have
86 | # explicit joining group values defined for all characters, including
87 | # those which have only a single character in a particular Joining_Group
88 | # class. Hanifi Rohingya has explicit Joining_Group values assigned only for
89 | # the few characters which share a particular Joining_Group class, but
90 | # assigns jg=No_Joining_Group to all the singletons.
91 | #
92 | # Note: Code points that are not explicitly listed in this file are
93 | # either of joining type T or U:
94 | #
95 | # - Those that are not explicitly listed and that are of General Category Mn, Me, or Cf
96 | #   have joining type T.
97 | # - All others not explicitly listed have joining type U.
98 | #
99 | # For an explicit listing of all characters of joining type T, see
100 | # the derived property file DerivedJoiningType.txt.
101 | #
102 | # #############################################################
103 |
104 | # Unicode; Schematic Name; Joining Type; Joining Group
105 |
106 | # Arabic Characters
107 |
108 | 0600; ARABIC NUMBER SIGN; U; No_Joining_Group
109 | 0601; ARABIC SIGN SANAH; U; No_Joining_Group
110 | 0602; ARABIC FOOTNOTE MARKER; U; No_Joining_Group
111 | 0603; ARABIC SIGN SAFHA; U; No_Joining_Group
112 | 0604; ARABIC SIGN SAMVAT; U; No_Joining_Group
113 | 0605; ARABIC NUMBER MARK ABOVE; U; No_Joining_Group
114 | 0608; ARABIC RAY; U; No_Joining_Group
115 | 060B; AFGHANI SIGN; U; No_Joining_Group
116 | 0620; DOTLESS YEH WITH SEPARATE RING BELOW; D; YEH
117 | 0621; HAMZA; U; No_Joining_Group
118 | 0622; ALEF WITH MADDA ABOVE; R; ALEF
119 | 0623; ALEF WITH HAMZA ABOVE; R; ALEF
120 | 0624; WAW WITH HAMZA ABOVE; R; WAW
121 | 0625; ALEF WITH HAMZA BELOW; R; ALEF
122 | 0626; DOTLESS YEH WITH HAMZA ABOVE; D; YEH
123 | 0627; ALEF; R; ALEF
124 | 0628; BEH; D; BEH
125 | 0629; TEH MARBUTA; R; TEH MARBUTA
126 | 062A; DOTLESS BEH WITH 2 DOTS ABOVE; D; BEH
127 | 062B; DOTLESS BEH WITH 3 DOTS ABOVE; D; BEH
128 | 062C; HAH WITH DOT BELOW; D; HAH
129 | 062D; HAH; D; HAH
130 | 062E; HAH WITH DOT ABOVE; D; HAH
131 | 062F; DAL; R; DAL
132 | 0630; DAL WITH DOT ABOVE; R; DAL
133 | 0631; REH; R; REH
134 | 0632; REH WITH DOT ABOVE; R; REH
135 | 0633; SEEN; D; SEEN
136 | 0634; SEEN WITH 3 DOTS ABOVE; D; SEEN
137 | 0635; SAD; D; SAD
138 | 0636; SAD WITH DOT ABOVE; D; SAD
139 | 0637; TAH; D; TAH
140 | 0638; TAH WITH DOT ABOVE; D; TAH
141 | 0639; AIN; D; AIN
142 | 063A; AIN WITH DOT ABOVE; D; AIN
143 | 063B; KEHEH WITH 2 DOTS ABOVE; D; GAF
144 | 063C; KEHEH WITH 3 DOTS BELOW; D; GAF
145 | 063D; FARSI YEH WITH INVERTED V ABOVE; D; FARSI YEH
146 | 063E; FARSI YEH WITH 2 DOTS ABOVE; D; FARSI YEH
147 | 063F; FARSI YEH WITH 3 DOTS ABOVE; D; FARSI YEH
148 | 0640; TATWEEL; C;
No_Joining_Group 149 | 0641; FEH; D; FEH 150 | 0642; QAF; D; QAF 151 | 0643; KAF; D; KAF 152 | 0644; LAM; D; LAM 153 | 0645; MEEM; D; MEEM 154 | 0646; NOON; D; NOON 155 | 0647; HEH; D; HEH 156 | 0648; WAW; R; WAW 157 | 0649; DOTLESS YEH; D; YEH 158 | 064A; YEH; D; YEH 159 | 066E; DOTLESS BEH; D; BEH 160 | 066F; DOTLESS QAF; D; QAF 161 | 0671; ALEF WITH WASLA ABOVE; R; ALEF 162 | 0672; ALEF WITH WAVY HAMZA ABOVE; R; ALEF 163 | 0673; ALEF WITH WAVY HAMZA BELOW; R; ALEF 164 | 0674; HIGH HAMZA; U; No_Joining_Group 165 | 0675; HIGH HAMZA ALEF; R; ALEF 166 | 0676; HIGH HAMZA WAW; R; WAW 167 | 0677; HIGH HAMZA WAW WITH DAMMA ABOVE; R; WAW 168 | 0678; HIGH HAMZA DOTLESS YEH; D; YEH 169 | 0679; DOTLESS BEH WITH TAH ABOVE; D; BEH 170 | 067A; DOTLESS BEH WITH VERTICAL 2 DOTS ABOVE; D; BEH 171 | 067B; DOTLESS BEH WITH VERTICAL 2 DOTS BELOW; D; BEH 172 | 067C; DOTLESS BEH WITH ATTACHED RING BELOW AND 2 DOTS ABOVE; D; BEH 173 | 067D; DOTLESS BEH WITH INVERTED 3 DOTS ABOVE; D; BEH 174 | 067E; DOTLESS BEH WITH 3 DOTS BELOW; D; BEH 175 | 067F; DOTLESS BEH WITH 4 DOTS ABOVE; D; BEH 176 | 0680; DOTLESS BEH WITH 4 DOTS BELOW; D; BEH 177 | 0681; HAH WITH HAMZA ABOVE; D; HAH 178 | 0682; HAH WITH VERTICAL 2 DOTS ABOVE; D; HAH 179 | 0683; HAH WITH 2 DOTS BELOW; D; HAH 180 | 0684; HAH WITH VERTICAL 2 DOTS BELOW; D; HAH 181 | 0685; HAH WITH 3 DOTS ABOVE; D; HAH 182 | 0686; HAH WITH 3 DOTS BELOW; D; HAH 183 | 0687; HAH WITH 4 DOTS BELOW; D; HAH 184 | 0688; DAL WITH TAH ABOVE; R; DAL 185 | 0689; DAL WITH ATTACHED RING BELOW; R; DAL 186 | 068A; DAL WITH DOT BELOW; R; DAL 187 | 068B; DAL WITH DOT BELOW AND TAH ABOVE; R; DAL 188 | 068C; DAL WITH 2 DOTS ABOVE; R; DAL 189 | 068D; DAL WITH 2 DOTS BELOW; R; DAL 190 | 068E; DAL WITH 3 DOTS ABOVE; R; DAL 191 | 068F; DAL WITH INVERTED 3 DOTS ABOVE; R; DAL 192 | 0690; DAL WITH 4 DOTS ABOVE; R; DAL 193 | 0691; REH WITH TAH ABOVE; R; REH 194 | 0692; REH WITH V ABOVE; R; REH 195 | 0693; REH WITH ATTACHED RING BELOW; R; REH 196 | 0694; REH WITH DOT BELOW; 
R; REH 197 | 0695; REH WITH V BELOW; R; REH 198 | 0696; REH WITH DOT BELOW AND DOT WITHIN; R; REH 199 | 0697; REH WITH 2 DOTS ABOVE; R; REH 200 | 0698; REH WITH 3 DOTS ABOVE; R; REH 201 | 0699; REH WITH 4 DOTS ABOVE; R; REH 202 | 069A; SEEN WITH DOT BELOW AND DOT ABOVE; D; SEEN 203 | 069B; SEEN WITH 3 DOTS BELOW; D; SEEN 204 | 069C; SEEN WITH 3 DOTS BELOW AND 3 DOTS ABOVE; D; SEEN 205 | 069D; SAD WITH 2 DOTS BELOW; D; SAD 206 | 069E; SAD WITH 3 DOTS ABOVE; D; SAD 207 | 069F; TAH WITH 3 DOTS ABOVE; D; TAH 208 | 06A0; AIN WITH 3 DOTS ABOVE; D; AIN 209 | 06A1; DOTLESS FEH; D; FEH 210 | 06A2; DOTLESS FEH WITH DOT BELOW; D; FEH 211 | 06A3; FEH WITH DOT BELOW; D; FEH 212 | 06A4; DOTLESS FEH WITH 3 DOTS ABOVE; D; FEH 213 | 06A5; DOTLESS FEH WITH 3 DOTS BELOW; D; FEH 214 | 06A6; DOTLESS FEH WITH 4 DOTS ABOVE; D; FEH 215 | 06A7; DOTLESS QAF WITH DOT ABOVE; D; QAF 216 | 06A8; DOTLESS QAF WITH 3 DOTS ABOVE; D; QAF 217 | 06A9; KEHEH; D; GAF 218 | 06AA; SWASH KAF; D; SWASH KAF 219 | 06AB; KEHEH WITH ATTACHED RING BELOW; D; GAF 220 | 06AC; KAF WITH DOT ABOVE; D; KAF 221 | 06AD; KAF WITH 3 DOTS ABOVE; D; KAF 222 | 06AE; KAF WITH 3 DOTS BELOW; D; KAF 223 | 06AF; GAF; D; GAF 224 | 06B0; GAF WITH ATTACHED RING BELOW; D; GAF 225 | 06B1; GAF WITH 2 DOTS ABOVE; D; GAF 226 | 06B2; GAF WITH 2 DOTS BELOW; D; GAF 227 | 06B3; GAF WITH VERTICAL 2 DOTS BELOW; D; GAF 228 | 06B4; GAF WITH 3 DOTS ABOVE; D; GAF 229 | 06B5; LAM WITH V ABOVE; D; LAM 230 | 06B6; LAM WITH DOT ABOVE; D; LAM 231 | 06B7; LAM WITH 3 DOTS ABOVE; D; LAM 232 | 06B8; LAM WITH 3 DOTS BELOW; D; LAM 233 | 06B9; NOON WITH DOT BELOW; D; NOON 234 | 06BA; DOTLESS NOON; D; NOON 235 | 06BB; DOTLESS NOON WITH TAH ABOVE; D; NOON 236 | 06BC; NOON WITH ATTACHED RING BELOW; D; NOON 237 | 06BD; NYA; D; NYA 238 | 06BE; KNOTTED HEH; D; KNOTTED HEH 239 | 06BF; HAH WITH 3 DOTS BELOW AND DOT ABOVE; D; HAH 240 | 06C0; DOTLESS TEH MARBUTA WITH HAMZA ABOVE; R; TEH MARBUTA 241 | 06C1; HEH GOAL; D; HEH GOAL 242 | 06C2; HEH GOAL WITH HAMZA ABOVE; D; 
HEH GOAL 243 | 06C3; TEH MARBUTA GOAL; R; TEH MARBUTA GOAL 244 | 06C4; WAW WITH ATTACHED RING WITHIN; R; WAW 245 | 06C5; WAW WITH BAR; R; WAW 246 | 06C6; WAW WITH V ABOVE; R; WAW 247 | 06C7; WAW WITH DAMMA ABOVE; R; WAW 248 | 06C8; WAW WITH ALEF ABOVE; R; WAW 249 | 06C9; WAW WITH INVERTED V ABOVE; R; WAW 250 | 06CA; WAW WITH 2 DOTS ABOVE; R; WAW 251 | 06CB; WAW WITH 3 DOTS ABOVE; R; WAW 252 | 06CC; FARSI YEH; D; FARSI YEH 253 | 06CD; YEH WITH TAIL; R; YEH WITH TAIL 254 | 06CE; FARSI YEH WITH V ABOVE; D; FARSI YEH 255 | 06CF; WAW WITH DOT ABOVE; R; WAW 256 | 06D0; DOTLESS YEH WITH VERTICAL 2 DOTS BELOW; D; YEH 257 | 06D1; DOTLESS YEH WITH 3 DOTS BELOW; D; YEH 258 | 06D2; YEH BARREE; R; YEH BARREE 259 | 06D3; YEH BARREE WITH HAMZA ABOVE; R; YEH BARREE 260 | 06D5; DOTLESS TEH MARBUTA; R; TEH MARBUTA 261 | 06DD; ARABIC END OF AYAH; U; No_Joining_Group 262 | 06EE; DAL WITH INVERTED V ABOVE; R; DAL 263 | 06EF; REH WITH INVERTED V ABOVE; R; REH 264 | 06FA; SEEN WITH DOT BELOW AND 3 DOTS ABOVE; D; SEEN 265 | 06FB; SAD WITH DOT BELOW AND DOT ABOVE; D; SAD 266 | 06FC; AIN WITH DOT BELOW AND DOT ABOVE; D; AIN 267 | 06FF; KNOTTED HEH WITH INVERTED V ABOVE; D; KNOTTED HEH 268 | 269 | # Syriac Characters 270 | 271 | 070F; SYRIAC ABBREVIATION MARK; T; No_Joining_Group 272 | 0710; ALAPH; R; ALAPH 273 | 0712; BETH; D; BETH 274 | 0713; GAMAL; D; GAMAL 275 | 0714; GAMAL GARSHUNI; D; GAMAL 276 | 0715; DALATH; R; DALATH RISH 277 | 0716; DOTLESS DALATH RISH; R; DALATH RISH 278 | 0717; HE; R; HE 279 | 0718; WAW; R; SYRIAC WAW 280 | 0719; ZAIN; R; ZAIN 281 | 071A; HETH; D; HETH 282 | 071B; TETH; D; TETH 283 | 071C; TETH GARSHUNI; D; TETH 284 | 071D; YUDH; D; YUDH 285 | 071E; YUDH HE; R; YUDH HE 286 | 071F; KAPH; D; KAPH 287 | 0720; LAMADH; D; LAMADH 288 | 0721; MIM; D; MIM 289 | 0722; NUN; D; NUN 290 | 0723; SEMKATH; D; SEMKATH 291 | 0724; FINAL SEMKATH; D; FINAL SEMKATH 292 | 0725; E; D; E 293 | 0726; PE; D; PE 294 | 0727; REVERSED PE; D; REVERSED PE 295 | 0728; SADHE; R; SADHE 296 | 
0729; QAPH; D; QAPH 297 | 072A; RISH; R; DALATH RISH 298 | 072B; SHIN; D; SHIN 299 | 072C; TAW; R; TAW 300 | 072D; PERSIAN BHETH; D; BETH 301 | 072E; PERSIAN GHAMAL; D; GAMAL 302 | 072F; PERSIAN DHALATH; R; DALATH RISH 303 | 074D; SOGDIAN ZHAIN; R; ZHAIN 304 | 074E; SOGDIAN KHAPH; D; KHAPH 305 | 074F; SOGDIAN FE; D; FE 306 | 307 | # Arabic Supplement Characters 308 | 309 | 0750; DOTLESS BEH WITH HORIZONTAL 3 DOTS BELOW; D; BEH 310 | 0751; BEH WITH 3 DOTS ABOVE; D; BEH 311 | 0752; DOTLESS BEH WITH INVERTED 3 DOTS BELOW; D; BEH 312 | 0753; DOTLESS BEH WITH INVERTED 3 DOTS BELOW AND 2 DOTS ABOVE; D; BEH 313 | 0754; DOTLESS BEH WITH 2 DOTS BELOW AND DOT ABOVE; D; BEH 314 | 0755; DOTLESS BEH WITH INVERTED V BELOW; D; BEH 315 | 0756; DOTLESS BEH WITH V ABOVE; D; BEH 316 | 0757; HAH WITH 2 DOTS ABOVE; D; HAH 317 | 0758; HAH WITH INVERTED 3 DOTS BELOW; D; HAH 318 | 0759; DAL WITH VERTICAL 2 DOTS BELOW AND TAH ABOVE; R; DAL 319 | 075A; DAL WITH INVERTED V BELOW; R; DAL 320 | 075B; REH WITH BAR; R; REH 321 | 075C; SEEN WITH 4 DOTS ABOVE; D; SEEN 322 | 075D; AIN WITH 2 DOTS ABOVE; D; AIN 323 | 075E; AIN WITH INVERTED 3 DOTS ABOVE; D; AIN 324 | 075F; AIN WITH VERTICAL 2 DOTS ABOVE; D; AIN 325 | 0760; DOTLESS FEH WITH 2 DOTS BELOW; D; FEH 326 | 0761; DOTLESS FEH WITH INVERTED 3 DOTS BELOW; D; FEH 327 | 0762; KEHEH WITH DOT ABOVE; D; GAF 328 | 0763; KEHEH WITH 3 DOTS ABOVE; D; GAF 329 | 0764; KEHEH WITH INVERTED 3 DOTS BELOW; D; GAF 330 | 0765; MEEM WITH DOT ABOVE; D; MEEM 331 | 0766; MEEM WITH DOT BELOW; D; MEEM 332 | 0767; NOON WITH 2 DOTS BELOW; D; NOON 333 | 0768; NOON WITH TAH ABOVE; D; NOON 334 | 0769; NOON WITH V ABOVE; D; NOON 335 | 076A; LAM WITH BAR; D; LAM 336 | 076B; REH WITH VERTICAL 2 DOTS ABOVE; R; REH 337 | 076C; REH WITH HAMZA ABOVE; R; REH 338 | 076D; SEEN WITH VERTICAL 2 DOTS ABOVE; D; SEEN 339 | 076E; HAH WITH TAH BELOW; D; HAH 340 | 076F; HAH WITH TAH AND 2 DOTS BELOW; D; HAH 341 | 0770; SEEN WITH 2 DOTS AND TAH ABOVE; D; SEEN 342 | 0771; REH WITH 2 DOTS AND 
TAH ABOVE; R; REH 343 | 0772; HAH WITH TAH ABOVE; D; HAH 344 | 0773; ALEF WITH DIGIT TWO ABOVE; R; ALEF 345 | 0774; ALEF WITH DIGIT THREE ABOVE; R; ALEF 346 | 0775; FARSI YEH WITH DIGIT TWO ABOVE; D; FARSI YEH 347 | 0776; FARSI YEH WITH DIGIT THREE ABOVE; D; FARSI YEH 348 | 0777; DOTLESS YEH WITH DIGIT FOUR BELOW; D; YEH 349 | 0778; WAW WITH DIGIT TWO ABOVE; R; WAW 350 | 0779; WAW WITH DIGIT THREE ABOVE; R; WAW 351 | 077A; BURUSHASKI YEH BARREE WITH DIGIT TWO ABOVE; D; BURUSHASKI YEH BARREE 352 | 077B; BURUSHASKI YEH BARREE WITH DIGIT THREE ABOVE; D; BURUSHASKI YEH BARREE 353 | 077C; HAH WITH DIGIT FOUR BELOW; D; HAH 354 | 077D; SEEN WITH DIGIT FOUR ABOVE; D; SEEN 355 | 077E; SEEN WITH INVERTED V ABOVE; D; SEEN 356 | 077F; KAF WITH 2 DOTS ABOVE; D; KAF 357 | 358 | # N'Ko Characters 359 | 360 | 07CA; NKO A; D; No_Joining_Group 361 | 07CB; NKO EE; D; No_Joining_Group 362 | 07CC; NKO I; D; No_Joining_Group 363 | 07CD; NKO E; D; No_Joining_Group 364 | 07CE; NKO U; D; No_Joining_Group 365 | 07CF; NKO OO; D; No_Joining_Group 366 | 07D0; NKO O; D; No_Joining_Group 367 | 07D1; NKO DAGBASINNA; D; No_Joining_Group 368 | 07D2; NKO N; D; No_Joining_Group 369 | 07D3; NKO BA; D; No_Joining_Group 370 | 07D4; NKO PA; D; No_Joining_Group 371 | 07D5; NKO TA; D; No_Joining_Group 372 | 07D6; NKO JA; D; No_Joining_Group 373 | 07D7; NKO CHA; D; No_Joining_Group 374 | 07D8; NKO DA; D; No_Joining_Group 375 | 07D9; NKO RA; D; No_Joining_Group 376 | 07DA; NKO RRA; D; No_Joining_Group 377 | 07DB; NKO SA; D; No_Joining_Group 378 | 07DC; NKO GBA; D; No_Joining_Group 379 | 07DD; NKO FA; D; No_Joining_Group 380 | 07DE; NKO KA; D; No_Joining_Group 381 | 07DF; NKO LA; D; No_Joining_Group 382 | 07E0; NKO NA WOLOSO; D; No_Joining_Group 383 | 07E1; NKO MA; D; No_Joining_Group 384 | 07E2; NKO NYA; D; No_Joining_Group 385 | 07E3; NKO NA; D; No_Joining_Group 386 | 07E4; NKO HA; D; No_Joining_Group 387 | 07E5; NKO WA; D; No_Joining_Group 388 | 07E6; NKO YA; D; No_Joining_Group 389 | 07E7; NKO NYA WOLOSO; 
D; No_Joining_Group 390 | 07E8; NKO JONA JA; D; No_Joining_Group 391 | 07E9; NKO JONA CHA; D; No_Joining_Group 392 | 07EA; NKO JONA RA; D; No_Joining_Group 393 | 07FA; NKO LAJANYALAN; C; No_Joining_Group 394 | 395 | # Mandaic Characters 396 | 397 | 0840; MANDAIC HALQA; R; No_Joining_Group 398 | 0841; MANDAIC AB; D; No_Joining_Group 399 | 0842; MANDAIC AG; D; No_Joining_Group 400 | 0843; MANDAIC AD; D; No_Joining_Group 401 | 0844; MANDAIC AH; D; No_Joining_Group 402 | 0845; MANDAIC USHENNA; D; No_Joining_Group 403 | 0846; MANDAIC AZ; R; No_Joining_Group 404 | 0847; MANDAIC IT; R; No_Joining_Group 405 | 0848; MANDAIC ATT; D; No_Joining_Group 406 | 0849; MANDAIC AKSA; R; No_Joining_Group 407 | 084A; MANDAIC AK; D; No_Joining_Group 408 | 084B; MANDAIC AL; D; No_Joining_Group 409 | 084C; MANDAIC AM; D; No_Joining_Group 410 | 084D; MANDAIC AN; D; No_Joining_Group 411 | 084E; MANDAIC AS; D; No_Joining_Group 412 | 084F; MANDAIC IN; D; No_Joining_Group 413 | 0850; MANDAIC AP; D; No_Joining_Group 414 | 0851; MANDAIC ASZ; D; No_Joining_Group 415 | 0852; MANDAIC AQ; D; No_Joining_Group 416 | 0853; MANDAIC AR; D; No_Joining_Group 417 | 0854; MANDAIC ASH; R; No_Joining_Group 418 | 0855; MANDAIC AT; D; No_Joining_Group 419 | 0856; MANDAIC DUSHENNA; U; No_Joining_Group 420 | 0857; MANDAIC KAD; U; No_Joining_Group 421 | 0858; MANDAIC AIN; U; No_Joining_Group 422 | 423 | # Syriac Supplement Characters 424 | 425 | 0860; MALAYALAM NGA; D; MALAYALAM NGA 426 | 0861; MALAYALAM JA; U; MALAYALAM JA 427 | 0862; MALAYALAM NYA; D; MALAYALAM NYA 428 | 0863; MALAYALAM TTA; D; MALAYALAM TTA 429 | 0864; MALAYALAM NNA; D; MALAYALAM NNA 430 | 0865; MALAYALAM NNNA; D; MALAYALAM NNNA 431 | 0866; MALAYALAM BHA; U; MALAYALAM BHA 432 | 0867; MALAYALAM RA; R; MALAYALAM RA 433 | 0868; MALAYALAM LLA; D; MALAYALAM LLA 434 | 0869; MALAYALAM LLLA; R; MALAYALAM LLLA 435 | 086A; MALAYALAM SSA; R; MALAYALAM SSA 436 | 437 | # Arabic Extended-A Characters 438 | 439 | 08A0; DOTLESS BEH WITH V BELOW; D; BEH 440 | 
08A1; BEH WITH HAMZA ABOVE; D; BEH 441 | 08A2; HAH WITH DOT BELOW AND 2 DOTS ABOVE; D; HAH 442 | 08A3; TAH WITH 2 DOTS ABOVE; D; TAH 443 | 08A4; DOTLESS FEH WITH DOT BELOW AND 3 DOTS ABOVE; D; FEH 444 | 08A5; QAF WITH DOT BELOW; D; QAF 445 | 08A6; LAM WITH DOUBLE BAR; D; LAM 446 | 08A7; MEEM WITH 3 DOTS ABOVE; D; MEEM 447 | 08A8; YEH WITH HAMZA ABOVE; D; YEH 448 | 08A9; YEH WITH DOT ABOVE; D; YEH 449 | 08AA; REH WITH LOOP; R; REH 450 | 08AB; WAW WITH DOT WITHIN; R; WAW 451 | 08AC; ROHINGYA YEH; R; ROHINGYA YEH 452 | 08AD; LOW ALEF; U; No_Joining_Group 453 | 08AE; DAL WITH 3 DOTS BELOW; R; DAL 454 | 08AF; SAD WITH 3 DOTS BELOW; D; SAD 455 | 08B0; KEHEH WITH STROKE BELOW; D; GAF 456 | 08B1; STRAIGHT WAW; R; STRAIGHT WAW 457 | 08B2; REH WITH DOT AND INVERTED V ABOVE; R; REH 458 | 08B3; AIN WITH 3 DOTS BELOW; D; AIN 459 | 08B4; KAF WITH DOT BELOW; D; KAF 460 | 08B6; BEH WITH MEEM ABOVE; D; BEH 461 | 08B7; DOTLESS BEH WITH 3 DOTS BELOW AND MEEM ABOVE; D; BEH 462 | 08B8; DOTLESS BEH WITH TEH ABOVE; D; BEH 463 | 08B9; REH WITH NOON ABOVE; R; REH 464 | 08BA; YEH WITH NOON ABOVE; D; YEH 465 | 08BB; AFRICAN FEH; D; AFRICAN FEH 466 | 08BC; AFRICAN QAF; D; AFRICAN QAF 467 | 08BD; AFRICAN NOON; D; AFRICAN NOON 468 | 08E2; ARABIC DISPUTED END OF AYAH; U; No_Joining_Group 469 | 470 | # Mongolian Characters 471 | 472 | 1806; MONGOLIAN TODO SOFT HYPHEN; U; No_Joining_Group 473 | 1807; MONGOLIAN SIBE SYLLABLE BOUNDARY MARKER; D; No_Joining_Group 474 | 180A; MONGOLIAN NIRUGU; C; No_Joining_Group 475 | 180E; MONGOLIAN VOWEL SEPARATOR; U; No_Joining_Group 476 | 1820; MONGOLIAN A; D; No_Joining_Group 477 | 1821; MONGOLIAN E; D; No_Joining_Group 478 | 1822; MONGOLIAN I; D; No_Joining_Group 479 | 1823; MONGOLIAN O; D; No_Joining_Group 480 | 1824; MONGOLIAN U; D; No_Joining_Group 481 | 1825; MONGOLIAN OE; D; No_Joining_Group 482 | 1826; MONGOLIAN UE; D; No_Joining_Group 483 | 1827; MONGOLIAN EE; D; No_Joining_Group 484 | 1828; MONGOLIAN NA; D; No_Joining_Group 485 | 1829; MONGOLIAN ANG; D; 
No_Joining_Group 486 | 182A; MONGOLIAN BA; D; No_Joining_Group 487 | 182B; MONGOLIAN PA; D; No_Joining_Group 488 | 182C; MONGOLIAN QA; D; No_Joining_Group 489 | 182D; MONGOLIAN GA; D; No_Joining_Group 490 | 182E; MONGOLIAN MA; D; No_Joining_Group 491 | 182F; MONGOLIAN LA; D; No_Joining_Group 492 | 1830; MONGOLIAN SA; D; No_Joining_Group 493 | 1831; MONGOLIAN SHA; D; No_Joining_Group 494 | 1832; MONGOLIAN TA; D; No_Joining_Group 495 | 1833; MONGOLIAN DA; D; No_Joining_Group 496 | 1834; MONGOLIAN CHA; D; No_Joining_Group 497 | 1835; MONGOLIAN JA; D; No_Joining_Group 498 | 1836; MONGOLIAN YA; D; No_Joining_Group 499 | 1837; MONGOLIAN RA; D; No_Joining_Group 500 | 1838; MONGOLIAN WA; D; No_Joining_Group 501 | 1839; MONGOLIAN FA; D; No_Joining_Group 502 | 183A; MONGOLIAN KA; D; No_Joining_Group 503 | 183B; MONGOLIAN KHA; D; No_Joining_Group 504 | 183C; MONGOLIAN TSA; D; No_Joining_Group 505 | 183D; MONGOLIAN ZA; D; No_Joining_Group 506 | 183E; MONGOLIAN HAA; D; No_Joining_Group 507 | 183F; MONGOLIAN ZRA; D; No_Joining_Group 508 | 1840; MONGOLIAN LHA; D; No_Joining_Group 509 | 1841; MONGOLIAN ZHI; D; No_Joining_Group 510 | 1842; MONGOLIAN CHI; D; No_Joining_Group 511 | 1843; MONGOLIAN TODO LONG VOWEL SIGN; D; No_Joining_Group 512 | 1844; MONGOLIAN TODO E; D; No_Joining_Group 513 | 1845; MONGOLIAN TODO I; D; No_Joining_Group 514 | 1846; MONGOLIAN TODO O; D; No_Joining_Group 515 | 1847; MONGOLIAN TODO U; D; No_Joining_Group 516 | 1848; MONGOLIAN TODO OE; D; No_Joining_Group 517 | 1849; MONGOLIAN TODO UE; D; No_Joining_Group 518 | 184A; MONGOLIAN TODO ANG; D; No_Joining_Group 519 | 184B; MONGOLIAN TODO BA; D; No_Joining_Group 520 | 184C; MONGOLIAN TODO PA; D; No_Joining_Group 521 | 184D; MONGOLIAN TODO QA; D; No_Joining_Group 522 | 184E; MONGOLIAN TODO GA; D; No_Joining_Group 523 | 184F; MONGOLIAN TODO MA; D; No_Joining_Group 524 | 1850; MONGOLIAN TODO TA; D; No_Joining_Group 525 | 1851; MONGOLIAN TODO DA; D; No_Joining_Group 526 | 1852; MONGOLIAN TODO CHA; D; 
No_Joining_Group 527 | 1853; MONGOLIAN TODO JA; D; No_Joining_Group 528 | 1854; MONGOLIAN TODO TSA; D; No_Joining_Group 529 | 1855; MONGOLIAN TODO YA; D; No_Joining_Group 530 | 1856; MONGOLIAN TODO WA; D; No_Joining_Group 531 | 1857; MONGOLIAN TODO KA; D; No_Joining_Group 532 | 1858; MONGOLIAN TODO GAA; D; No_Joining_Group 533 | 1859; MONGOLIAN TODO HAA; D; No_Joining_Group 534 | 185A; MONGOLIAN TODO JIA; D; No_Joining_Group 535 | 185B; MONGOLIAN TODO NIA; D; No_Joining_Group 536 | 185C; MONGOLIAN TODO DZA; D; No_Joining_Group 537 | 185D; MONGOLIAN SIBE E; D; No_Joining_Group 538 | 185E; MONGOLIAN SIBE I; D; No_Joining_Group 539 | 185F; MONGOLIAN SIBE IY; D; No_Joining_Group 540 | 1860; MONGOLIAN SIBE UE; D; No_Joining_Group 541 | 1861; MONGOLIAN SIBE U; D; No_Joining_Group 542 | 1862; MONGOLIAN SIBE ANG; D; No_Joining_Group 543 | 1863; MONGOLIAN SIBE KA; D; No_Joining_Group 544 | 1864; MONGOLIAN SIBE GA; D; No_Joining_Group 545 | 1865; MONGOLIAN SIBE HA; D; No_Joining_Group 546 | 1866; MONGOLIAN SIBE PA; D; No_Joining_Group 547 | 1867; MONGOLIAN SIBE SHA; D; No_Joining_Group 548 | 1868; MONGOLIAN SIBE TA; D; No_Joining_Group 549 | 1869; MONGOLIAN SIBE DA; D; No_Joining_Group 550 | 186A; MONGOLIAN SIBE JA; D; No_Joining_Group 551 | 186B; MONGOLIAN SIBE FA; D; No_Joining_Group 552 | 186C; MONGOLIAN SIBE GAA; D; No_Joining_Group 553 | 186D; MONGOLIAN SIBE HAA; D; No_Joining_Group 554 | 186E; MONGOLIAN SIBE TSA; D; No_Joining_Group 555 | 186F; MONGOLIAN SIBE ZA; D; No_Joining_Group 556 | 1870; MONGOLIAN SIBE RAA; D; No_Joining_Group 557 | 1871; MONGOLIAN SIBE CHA; D; No_Joining_Group 558 | 1872; MONGOLIAN SIBE ZHA; D; No_Joining_Group 559 | 1873; MONGOLIAN MANCHU I; D; No_Joining_Group 560 | 1874; MONGOLIAN MANCHU KA; D; No_Joining_Group 561 | 1875; MONGOLIAN MANCHU RA; D; No_Joining_Group 562 | 1876; MONGOLIAN MANCHU FA; D; No_Joining_Group 563 | 1877; MONGOLIAN MANCHU ZHA; D; No_Joining_Group 564 | 1878; MONGOLIAN MANCHU CHA WITH 2 DOTS; D; No_Joining_Group 565 | 
1880; MONGOLIAN ALI GALI ANUSVARA ONE; U; No_Joining_Group 566 | 1881; MONGOLIAN ALI GALI VISARGA ONE; U; No_Joining_Group 567 | 1882; MONGOLIAN ALI GALI DAMARU; U; No_Joining_Group 568 | 1883; MONGOLIAN ALI GALI UBADAMA; U; No_Joining_Group 569 | 1884; MONGOLIAN ALI GALI INVERTED UBADAMA; U; No_Joining_Group 570 | 1885; MONGOLIAN ALI GALI BALUDA; T; No_Joining_Group 571 | 1886; MONGOLIAN ALI GALI THREE BALUDA; T; No_Joining_Group 572 | 1887; MONGOLIAN ALI GALI A; D; No_Joining_Group 573 | 1888; MONGOLIAN ALI GALI I; D; No_Joining_Group 574 | 1889; MONGOLIAN ALI GALI KA; D; No_Joining_Group 575 | 188A; MONGOLIAN ALI GALI NGA; D; No_Joining_Group 576 | 188B; MONGOLIAN ALI GALI CA; D; No_Joining_Group 577 | 188C; MONGOLIAN ALI GALI TTA; D; No_Joining_Group 578 | 188D; MONGOLIAN ALI GALI TTHA; D; No_Joining_Group 579 | 188E; MONGOLIAN ALI GALI DDA; D; No_Joining_Group 580 | 188F; MONGOLIAN ALI GALI NNA; D; No_Joining_Group 581 | 1890; MONGOLIAN ALI GALI TA; D; No_Joining_Group 582 | 1891; MONGOLIAN ALI GALI DA; D; No_Joining_Group 583 | 1892; MONGOLIAN ALI GALI PA; D; No_Joining_Group 584 | 1893; MONGOLIAN ALI GALI PHA; D; No_Joining_Group 585 | 1894; MONGOLIAN ALI GALI SSA; D; No_Joining_Group 586 | 1895; MONGOLIAN ALI GALI ZHA; D; No_Joining_Group 587 | 1896; MONGOLIAN ALI GALI ZA; D; No_Joining_Group 588 | 1897; MONGOLIAN ALI GALI AH; D; No_Joining_Group 589 | 1898; MONGOLIAN TODO ALI GALI TA; D; No_Joining_Group 590 | 1899; MONGOLIAN TODO ALI GALI ZHA; D; No_Joining_Group 591 | 189A; MONGOLIAN MANCHU ALI GALI GHA; D; No_Joining_Group 592 | 189B; MONGOLIAN MANCHU ALI GALI NGA; D; No_Joining_Group 593 | 189C; MONGOLIAN MANCHU ALI GALI CA; D; No_Joining_Group 594 | 189D; MONGOLIAN MANCHU ALI GALI JHA; D; No_Joining_Group 595 | 189E; MONGOLIAN MANCHU ALI GALI TTA; D; No_Joining_Group 596 | 189F; MONGOLIAN MANCHU ALI GALI DDHA; D; No_Joining_Group 597 | 18A0; MONGOLIAN MANCHU ALI GALI TA; D; No_Joining_Group 598 | 18A1; MONGOLIAN MANCHU ALI GALI DHA; D; 
No_Joining_Group 599 | 18A2; MONGOLIAN MANCHU ALI GALI SSA; D; No_Joining_Group 600 | 18A3; MONGOLIAN MANCHU ALI GALI CYA; D; No_Joining_Group 601 | 18A4; MONGOLIAN MANCHU ALI GALI ZHA; D; No_Joining_Group 602 | 18A5; MONGOLIAN MANCHU ALI GALI ZA; D; No_Joining_Group 603 | 18A6; MONGOLIAN ALI GALI HALF U; D; No_Joining_Group 604 | 18A7; MONGOLIAN ALI GALI HALF YA; D; No_Joining_Group 605 | 18A8; MONGOLIAN MANCHU ALI GALI BHA; D; No_Joining_Group 606 | 18AA; MONGOLIAN MANCHU ALI GALI LHA; D; No_Joining_Group 607 | 608 | # Other 609 | 610 | 200C; ZERO WIDTH NON-JOINER; U; No_Joining_Group 611 | 200D; ZERO WIDTH JOINER; C; No_Joining_Group 612 | 202F; NARROW NO-BREAK SPACE; U; No_Joining_Group 613 | 2066; LEFT-TO-RIGHT ISOLATE; U; No_Joining_Group 614 | 2067; RIGHT-TO-LEFT ISOLATE; U; No_Joining_Group 615 | 2068; FIRST STRONG ISOLATE; U; No_Joining_Group 616 | 2069; POP DIRECTIONAL ISOLATE; U; No_Joining_Group 617 | 618 | # Phags-Pa Characters 619 | 620 | A840; PHAGS-PA KA; D; No_Joining_Group 621 | A841; PHAGS-PA KHA; D; No_Joining_Group 622 | A842; PHAGS-PA GA; D; No_Joining_Group 623 | A843; PHAGS-PA NGA; D; No_Joining_Group 624 | A844; PHAGS-PA CA; D; No_Joining_Group 625 | A845; PHAGS-PA CHA; D; No_Joining_Group 626 | A846; PHAGS-PA JA; D; No_Joining_Group 627 | A847; PHAGS-PA NYA; D; No_Joining_Group 628 | A848; PHAGS-PA TA; D; No_Joining_Group 629 | A849; PHAGS-PA THA; D; No_Joining_Group 630 | A84A; PHAGS-PA DA; D; No_Joining_Group 631 | A84B; PHAGS-PA NA; D; No_Joining_Group 632 | A84C; PHAGS-PA PA; D; No_Joining_Group 633 | A84D; PHAGS-PA PHA; D; No_Joining_Group 634 | A84E; PHAGS-PA BA; D; No_Joining_Group 635 | A84F; PHAGS-PA MA; D; No_Joining_Group 636 | A850; PHAGS-PA TSA; D; No_Joining_Group 637 | A851; PHAGS-PA TSHA; D; No_Joining_Group 638 | A852; PHAGS-PA DZA; D; No_Joining_Group 639 | A853; PHAGS-PA WA; D; No_Joining_Group 640 | A854; PHAGS-PA ZHA; D; No_Joining_Group 641 | A855; PHAGS-PA ZA; D; No_Joining_Group 642 | A856; PHAGS-PA SMALL A; D; 
No_Joining_Group 643 | A857; PHAGS-PA YA; D; No_Joining_Group 644 | A858; PHAGS-PA RA; D; No_Joining_Group 645 | A859; PHAGS-PA LA; D; No_Joining_Group 646 | A85A; PHAGS-PA SHA; D; No_Joining_Group 647 | A85B; PHAGS-PA SA; D; No_Joining_Group 648 | A85C; PHAGS-PA HA; D; No_Joining_Group 649 | A85D; PHAGS-PA A; D; No_Joining_Group 650 | A85E; PHAGS-PA I; D; No_Joining_Group 651 | A85F; PHAGS-PA U; D; No_Joining_Group 652 | A860; PHAGS-PA E; D; No_Joining_Group 653 | A861; PHAGS-PA O; D; No_Joining_Group 654 | A862; PHAGS-PA QA; D; No_Joining_Group 655 | A863; PHAGS-PA XA; D; No_Joining_Group 656 | A864; PHAGS-PA FA; D; No_Joining_Group 657 | A865; PHAGS-PA GGA; D; No_Joining_Group 658 | A866; PHAGS-PA EE; D; No_Joining_Group 659 | A867; PHAGS-PA SUBJOINED WA; D; No_Joining_Group 660 | A868; PHAGS-PA SUBJOINED YA; D; No_Joining_Group 661 | A869; PHAGS-PA TTA; D; No_Joining_Group 662 | A86A; PHAGS-PA TTHA; D; No_Joining_Group 663 | A86B; PHAGS-PA DDA; D; No_Joining_Group 664 | A86C; PHAGS-PA NNA; D; No_Joining_Group 665 | A86D; PHAGS-PA ALTERNATE YA; D; No_Joining_Group 666 | A86E; PHAGS-PA VOICELESS SHA; D; No_Joining_Group 667 | A86F; PHAGS-PA VOICED HA; D; No_Joining_Group 668 | A870; PHAGS-PA ASPIRATED FA; D; No_Joining_Group 669 | A871; PHAGS-PA SUBJOINED RA; D; No_Joining_Group 670 | A872; PHAGS-PA SUPERFIXED RA; L; No_Joining_Group 671 | A873; PHAGS-PA CANDRABINDU; U; No_Joining_Group 672 | 673 | # Manichaean Characters 674 | 675 | 10AC0; MANICHAEAN ALEPH; D; MANICHAEAN ALEPH 676 | 10AC1; MANICHAEAN BETH; D; MANICHAEAN BETH 677 | 10AC2; MANICHAEAN BETH WITH 2 DOTS ABOVE; D; MANICHAEAN BETH 678 | 10AC3; MANICHAEAN GIMEL; D; MANICHAEAN GIMEL 679 | 10AC4; MANICHAEAN GIMEL WITH ATTACHED RING BELOW; D; MANICHAEAN GIMEL 680 | 10AC5; MANICHAEAN DALETH; R; MANICHAEAN DALETH 681 | 10AC6; MANICHAEAN HE; U; No_Joining_Group 682 | 10AC7; MANICHAEAN WAW; R; MANICHAEAN WAW 683 | 10AC8; MANICHAEAN UD; U; No_Joining_Group 684 | 10AC9; MANICHAEAN ZAYIN; R; MANICHAEAN ZAYIN 685 
| 10ACA; MANICHAEAN ZAYIN WITH 2 DOTS ABOVE; R; MANICHAEAN ZAYIN 686 | 10ACB; MANICHAEAN JAYIN; U; No_Joining_Group 687 | 10ACC; MANICHAEAN JAYIN WITH 2 DOTS ABOVE; U; No_Joining_Group 688 | 10ACD; MANICHAEAN HETH; L; MANICHAEAN HETH 689 | 10ACE; MANICHAEAN TETH; R; MANICHAEAN TETH 690 | 10ACF; MANICHAEAN YODH; R; MANICHAEAN YODH 691 | 10AD0; MANICHAEAN KAPH; R; MANICHAEAN KAPH 692 | 10AD1; MANICHAEAN KAPH WITH DOT ABOVE; R; MANICHAEAN KAPH 693 | 10AD2; MANICHAEAN KAPH WITH 2 DOTS ABOVE; R; MANICHAEAN KAPH 694 | 10AD3; MANICHAEAN LAMEDH; D; MANICHAEAN LAMEDH 695 | 10AD4; MANICHAEAN DHAMEDH; D; MANICHAEAN DHAMEDH 696 | 10AD5; MANICHAEAN THAMEDH; D; MANICHAEAN THAMEDH 697 | 10AD6; MANICHAEAN MEM; D; MANICHAEAN MEM 698 | 10AD7; MANICHAEAN NUN; L; MANICHAEAN NUN 699 | 10AD8; MANICHAEAN SAMEKH; D; MANICHAEAN SAMEKH 700 | 10AD9; MANICHAEAN AYIN; D; MANICHAEAN AYIN 701 | 10ADA; MANICHAEAN AYIN WITH 2 DOTS ABOVE; D; MANICHAEAN AYIN 702 | 10ADB; MANICHAEAN PE; D; MANICHAEAN PE 703 | 10ADC; MANICHAEAN PE WITH DOT ABOVE; D; MANICHAEAN PE 704 | 10ADD; MANICHAEAN SADHE; R; MANICHAEAN SADHE 705 | 10ADE; MANICHAEAN QOPH; D; MANICHAEAN QOPH 706 | 10ADF; MANICHAEAN QOPH WITH DOT ABOVE; D; MANICHAEAN QOPH 707 | 10AE0; MANICHAEAN QOPH WITH 2 DOTS ABOVE; D; MANICHAEAN QOPH 708 | 10AE1; MANICHAEAN RESH; R; MANICHAEAN RESH 709 | 10AE2; MANICHAEAN SHIN; U; No_Joining_Group 710 | 10AE3; MANICHAEAN SHIN WITH 2 DOTS ABOVE; U; No_Joining_Group 711 | 10AE4; MANICHAEAN TAW; R; MANICHAEAN TAW 712 | 10AEB; MANICHAEAN ONE; D; MANICHAEAN ONE 713 | 10AEC; MANICHAEAN FIVE; D; MANICHAEAN FIVE 714 | 10AED; MANICHAEAN TEN; D; MANICHAEAN TEN 715 | 10AEE; MANICHAEAN TWENTY; D; MANICHAEAN TWENTY 716 | 10AEF; MANICHAEAN HUNDRED; R; MANICHAEAN HUNDRED 717 | 718 | # Psalter Pahlavi Characters 719 | 720 | 10B80; PSALTER PAHLAVI ALEPH; D; No_Joining_Group 721 | 10B81; PSALTER PAHLAVI BETH; R; No_Joining_Group 722 | 10B82; PSALTER PAHLAVI GIMEL; D; No_Joining_Group 723 | 10B83; PSALTER PAHLAVI DALETH; R; 
No_Joining_Group 724 | 10B84; PSALTER PAHLAVI HE; R; No_Joining_Group 725 | 10B85; PSALTER PAHLAVI WAW-AYIN-RESH; R; No_Joining_Group 726 | 10B86; PSALTER PAHLAVI ZAYIN; D; No_Joining_Group 727 | 10B87; PSALTER PAHLAVI HETH; D; No_Joining_Group 728 | 10B88; PSALTER PAHLAVI YODH; D; No_Joining_Group 729 | 10B89; PSALTER PAHLAVI KAPH; R; No_Joining_Group 730 | 10B8A; PSALTER PAHLAVI LAMEDH; D; No_Joining_Group 731 | 10B8B; PSALTER PAHLAVI MEM-QOPH; D; No_Joining_Group 732 | 10B8C; PSALTER PAHLAVI NUN; R; No_Joining_Group 733 | 10B8D; PSALTER PAHLAVI SAMEKH; D; No_Joining_Group 734 | 10B8E; PSALTER PAHLAVI PE; R; No_Joining_Group 735 | 10B8F; PSALTER PAHLAVI SADHE; R; No_Joining_Group 736 | 10B90; PSALTER PAHLAVI SHIN; D; No_Joining_Group 737 | 10B91; PSALTER PAHLAVI TAW; R; No_Joining_Group 738 | 10BA9; PSALTER PAHLAVI ONE; R; No_Joining_Group 739 | 10BAA; PSALTER PAHLAVI TWO; R; No_Joining_Group 740 | 10BAB; PSALTER PAHLAVI THREE; R; No_Joining_Group 741 | 10BAC; PSALTER PAHLAVI FOUR; R; No_Joining_Group 742 | 10BAD; PSALTER PAHLAVI TEN; D; No_Joining_Group 743 | 10BAE; PSALTER PAHLAVI TWENTY; D; No_Joining_Group 744 | 10BAF; PSALTER PAHLAVI HUNDRED; U; No_Joining_Group 745 | 746 | # Hanifi Rohingya Characters 747 | 748 | 10D00; HANIFI ROHINGYA A; L; No_Joining_Group 749 | 10D01; HANIFI ROHINGYA BA; D; No_Joining_Group 750 | 10D02; HANIFI ROHINGYA PA; D; HANIFI ROHINGYA PA 751 | 10D03; HANIFI ROHINGYA TA; D; No_Joining_Group 752 | 10D04; HANIFI ROHINGYA TTA; D; No_Joining_Group 753 | 10D05; HANIFI ROHINGYA JA; D; No_Joining_Group 754 | 10D06; HANIFI ROHINGYA CA; D; No_Joining_Group 755 | 10D07; HANIFI ROHINGYA HA; D; No_Joining_Group 756 | 10D08; HANIFI ROHINGYA KHA; D; No_Joining_Group 757 | 10D09; HANIFI ROHINGYA PA WITH DOT ABOVE; D; HANIFI ROHINGYA PA 758 | 10D0A; HANIFI ROHINGYA DA; D; No_Joining_Group 759 | 10D0B; HANIFI ROHINGYA DDA; D; No_Joining_Group 760 | 10D0C; HANIFI ROHINGYA RA; D; No_Joining_Group 761 | 10D0D; HANIFI ROHINGYA RRA; D; No_Joining_Group 
762 | 10D0E; HANIFI ROHINGYA ZA; D; No_Joining_Group 763 | 10D0F; HANIFI ROHINGYA SA; D; No_Joining_Group 764 | 10D10; HANIFI ROHINGYA SHA; D; No_Joining_Group 765 | 10D11; HANIFI ROHINGYA KA; D; No_Joining_Group 766 | 10D12; HANIFI ROHINGYA GA; D; No_Joining_Group 767 | 10D13; HANIFI ROHINGYA LA; D; No_Joining_Group 768 | 10D14; HANIFI ROHINGYA MA; D; No_Joining_Group 769 | 10D15; HANIFI ROHINGYA NA; D; No_Joining_Group 770 | 10D16; HANIFI ROHINGYA WA; D; No_Joining_Group 771 | 10D17; HANIFI ROHINGYA KINNA WA; D; No_Joining_Group 772 | 10D18; HANIFI ROHINGYA YA; D; No_Joining_Group 773 | 10D19; HANIFI ROHINGYA KINNA YA; D; HANIFI ROHINGYA KINNA YA 774 | 10D1A; HANIFI ROHINGYA NGA; D; No_Joining_Group 775 | 10D1B; HANIFI ROHINGYA NYA; D; No_Joining_Group 776 | 10D1C; HANIFI ROHINGYA PA WITH 3 DOTS ABOVE; D; HANIFI ROHINGYA PA 777 | 10D1D; HANIFI ROHINGYA VOWEL A; D; No_Joining_Group 778 | 10D1E; HANIFI ROHINGYA DOTLESS KINNA YA WITH LEFT-FACING HOOK BELOW; D; HANIFI ROHINGYA KINNA YA 779 | 10D1F; HANIFI ROHINGYA VOWEL U; D; No_Joining_Group 780 | 10D20; HANIFI ROHINGYA DOTLESS KINNA YA WITH RIGHT-FACING HOOK BELOW; D; HANIFI ROHINGYA KINNA YA 781 | 10D21; HANIFI ROHINGYA VOWEL O; D; No_Joining_Group 782 | 10D22; HANIFI ROHINGYA SAKIN; R; No_Joining_Group 783 | 10D23; HANIFI ROHINGYA DOTLESS KINNA YA WITH DOT ABOVE; D; HANIFI ROHINGYA KINNA YA 784 | 785 | # Sogdian Characters 786 | 787 | 10F30; SOGDIAN ALEPH; D; No_Joining_Group 788 | 10F31; SOGDIAN BETH; D; No_Joining_Group 789 | 10F32; SOGDIAN GIMEL; D; No_Joining_Group 790 | 10F33; SOGDIAN HE; R; No_Joining_Group 791 | 10F34; SOGDIAN WAW; D; No_Joining_Group 792 | 10F35; SOGDIAN ZAYIN; D; No_Joining_Group 793 | 10F36; SOGDIAN HETH; D; No_Joining_Group 794 | 10F37; SOGDIAN YODH; D; No_Joining_Group 795 | 10F38; SOGDIAN KAPH; D; No_Joining_Group 796 | 10F39; SOGDIAN LAMEDH; D; No_Joining_Group 797 | 10F3A; SOGDIAN MEM; D; No_Joining_Group 798 | 10F3B; SOGDIAN NUN; D; No_Joining_Group 799 | 10F3C; SOGDIAN SAMEKH; D; 
No_Joining_Group 800 | 10F3D; SOGDIAN AYIN; D; No_Joining_Group 801 | 10F3E; SOGDIAN PE; D; No_Joining_Group 802 | 10F3F; SOGDIAN SADHE; D; No_Joining_Group 803 | 10F40; SOGDIAN RESH-AYIN; D; No_Joining_Group 804 | 10F41; SOGDIAN SHIN; D; No_Joining_Group 805 | 10F42; SOGDIAN TAW; D; No_Joining_Group 806 | 10F43; SOGDIAN FETH; D; No_Joining_Group 807 | 10F44; SOGDIAN LESH; D; No_Joining_Group 808 | 10F45; SOGDIAN INDEPENDENT SHIN; U; No_Joining_Group 809 | 10F51; SOGDIAN ONE; D; No_Joining_Group 810 | 10F52; SOGDIAN TEN; D; No_Joining_Group 811 | 10F53; SOGDIAN TWENTY; D; No_Joining_Group 812 | 10F54; SOGDIAN ONE HUNDRED; R; No_Joining_Group 813 | 814 | # Kaithi Number Signs 815 | # These are prepended concatenation marks, comparable 816 | # to the number signs in the Arabic script. 817 | # Listed here for consistency in property values. 818 | 819 | 110BD; KAITHI NUMBER SIGN; U; No_Joining_Group 820 | 110CD; KAITHI NUMBER SIGN ABOVE; U; No_Joining_Group 821 | 822 | # Adlam Characters 823 | 824 | 1E900;ADLAM CAPITAL ALIF; D; No_Joining_Group 825 | 1E901;ADLAM CAPITAL DAALI; D; No_Joining_Group 826 | 1E902;ADLAM CAPITAL LAAM; D; No_Joining_Group 827 | 1E903;ADLAM CAPITAL MIIM; D; No_Joining_Group 828 | 1E904;ADLAM CAPITAL BA; D; No_Joining_Group 829 | 1E905;ADLAM CAPITAL SINNYIIYHE; D; No_Joining_Group 830 | 1E906;ADLAM CAPITAL PE; D; No_Joining_Group 831 | 1E907;ADLAM CAPITAL BHE; D; No_Joining_Group 832 | 1E908;ADLAM CAPITAL RA; D; No_Joining_Group 833 | 1E909;ADLAM CAPITAL E; D; No_Joining_Group 834 | 1E90A;ADLAM CAPITAL FA; D; No_Joining_Group 835 | 1E90B;ADLAM CAPITAL I; D; No_Joining_Group 836 | 1E90C;ADLAM CAPITAL O; D; No_Joining_Group 837 | 1E90D;ADLAM CAPITAL DHA; D; No_Joining_Group 838 | 1E90E;ADLAM CAPITAL YHE; D; No_Joining_Group 839 | 1E90F;ADLAM CAPITAL WAW; D; No_Joining_Group 840 | 1E910;ADLAM CAPITAL NUN; D; No_Joining_Group 841 | 1E911;ADLAM CAPITAL KAF; D; No_Joining_Group 842 | 1E912;ADLAM CAPITAL YA; D; No_Joining_Group 843 | 1E913;ADLAM 
CAPITAL U; D; No_Joining_Group 844 | 1E914;ADLAM CAPITAL JIIM; D; No_Joining_Group 845 | 1E915;ADLAM CAPITAL CHI; D; No_Joining_Group 846 | 1E916;ADLAM CAPITAL HA; D; No_Joining_Group 847 | 1E917;ADLAM CAPITAL QAAF; D; No_Joining_Group 848 | 1E918;ADLAM CAPITAL GA; D; No_Joining_Group 849 | 1E919;ADLAM CAPITAL NYA; D; No_Joining_Group 850 | 1E91A;ADLAM CAPITAL TU; D; No_Joining_Group 851 | 1E91B;ADLAM CAPITAL NHA; D; No_Joining_Group 852 | 1E91C;ADLAM CAPITAL VA; D; No_Joining_Group 853 | 1E91D;ADLAM CAPITAL KHA; D; No_Joining_Group 854 | 1E91E;ADLAM CAPITAL GBE; D; No_Joining_Group 855 | 1E91F;ADLAM CAPITAL ZAL; D; No_Joining_Group 856 | 1E920;ADLAM CAPITAL KPO; D; No_Joining_Group 857 | 1E921;ADLAM CAPITAL SHA; D; No_Joining_Group 858 | 1E922;ADLAM SMALL ALIF; D; No_Joining_Group 859 | 1E923;ADLAM SMALL DAALI; D; No_Joining_Group 860 | 1E924;ADLAM SMALL LAAM; D; No_Joining_Group 861 | 1E925;ADLAM SMALL MIIM; D; No_Joining_Group 862 | 1E926;ADLAM SMALL BA; D; No_Joining_Group 863 | 1E927;ADLAM SMALL SINNYIIYHE; D; No_Joining_Group 864 | 1E928;ADLAM SMALL PE; D; No_Joining_Group 865 | 1E929;ADLAM SMALL BHE; D; No_Joining_Group 866 | 1E92A;ADLAM SMALL RA; D; No_Joining_Group 867 | 1E92B;ADLAM SMALL E; D; No_Joining_Group 868 | 1E92C;ADLAM SMALL FA; D; No_Joining_Group 869 | 1E92D;ADLAM SMALL I; D; No_Joining_Group 870 | 1E92E;ADLAM SMALL O; D; No_Joining_Group 871 | 1E92F;ADLAM SMALL DHA; D; No_Joining_Group 872 | 1E930;ADLAM SMALL YHE; D; No_Joining_Group 873 | 1E931;ADLAM SMALL WAW; D; No_Joining_Group 874 | 1E932;ADLAM SMALL NUN; D; No_Joining_Group 875 | 1E933;ADLAM SMALL KAF; D; No_Joining_Group 876 | 1E934;ADLAM SMALL YA; D; No_Joining_Group 877 | 1E935;ADLAM SMALL U; D; No_Joining_Group 878 | 1E936;ADLAM SMALL JIIM; D; No_Joining_Group 879 | 1E937;ADLAM SMALL CHI; D; No_Joining_Group 880 | 1E938;ADLAM SMALL HA; D; No_Joining_Group 881 | 1E939;ADLAM SMALL QAAF; D; No_Joining_Group 882 | 1E93A;ADLAM SMALL GA; D; No_Joining_Group 883 | 1E93B;ADLAM SMALL NYA; D; 
No_Joining_Group 884 | 1E93C;ADLAM SMALL TU; D; No_Joining_Group 885 | 1E93D;ADLAM SMALL NHA; D; No_Joining_Group 886 | 1E93E;ADLAM SMALL VA; D; No_Joining_Group 887 | 1E93F;ADLAM SMALL KHA; D; No_Joining_Group 888 | 1E940;ADLAM SMALL GBE; D; No_Joining_Group 889 | 1E941;ADLAM SMALL ZAL; D; No_Joining_Group 890 | 1E942;ADLAM SMALL KPO; D; No_Joining_Group 891 | 1E943;ADLAM SMALL SHA; D; No_Joining_Group 892 | 1E94B;ADLAM NASALIZATION MARK; T; No_Joining_Group 893 | 894 | # EOF 895 | -------------------------------------------------------------------------------- /data/Blocks.txt: -------------------------------------------------------------------------------- 1 | # Blocks-12.0.0.txt 2 | # Date: 2018-07-30, 19:40:00 GMT [KW] 3 | # © 2018 Unicode®, Inc. 4 | # For terms of use, see http://www.unicode.org/terms_of_use.html 5 | # 6 | # Unicode Character Database 7 | # For documentation, see http://www.unicode.org/reports/tr44/ 8 | # 9 | # Format: 10 | # Start Code..End Code; Block Name 11 | 12 | # ================================================ 13 | 14 | # Note: When comparing block names, casing, whitespace, hyphens, 15 | # and underbars are ignored. 16 | # For example, "Latin Extended-A" and "latin extended a" are equivalent. 17 | # For more information on the comparison of property values, 18 | # see UAX #44: http://www.unicode.org/reports/tr44/ 19 | # 20 | # All block ranges start with a value where (cp MOD 16) = 0, 21 | # and end with a value where (cp MOD 16) = 15. In other words, 22 | # the last hexadecimal digit of the start of range is ...0 23 | # and the last hexadecimal digit of the end of range is ...F. 24 | # This constraint on block ranges guarantees that allocations 25 | # are done in terms of whole columns, and that code chart display 26 | # never involves splitting columns in the charts. 27 | # 28 | # All code points not explicitly listed for Block 29 | # have the value No_Block. 
30 | 31 | # Property: Block 32 | # 33 | # @missing: 0000..10FFFF; No_Block 34 | 35 | 0000..007F; Basic Latin 36 | 0080..00FF; Latin-1 Supplement 37 | 0100..017F; Latin Extended-A 38 | 0180..024F; Latin Extended-B 39 | 0250..02AF; IPA Extensions 40 | 02B0..02FF; Spacing Modifier Letters 41 | 0300..036F; Combining Diacritical Marks 42 | 0370..03FF; Greek and Coptic 43 | 0400..04FF; Cyrillic 44 | 0500..052F; Cyrillic Supplement 45 | 0530..058F; Armenian 46 | 0590..05FF; Hebrew 47 | 0600..06FF; Arabic 48 | 0700..074F; Syriac 49 | 0750..077F; Arabic Supplement 50 | 0780..07BF; Thaana 51 | 07C0..07FF; NKo 52 | 0800..083F; Samaritan 53 | 0840..085F; Mandaic 54 | 0860..086F; Syriac Supplement 55 | 08A0..08FF; Arabic Extended-A 56 | 0900..097F; Devanagari 57 | 0980..09FF; Bengali 58 | 0A00..0A7F; Gurmukhi 59 | 0A80..0AFF; Gujarati 60 | 0B00..0B7F; Oriya 61 | 0B80..0BFF; Tamil 62 | 0C00..0C7F; Telugu 63 | 0C80..0CFF; Kannada 64 | 0D00..0D7F; Malayalam 65 | 0D80..0DFF; Sinhala 66 | 0E00..0E7F; Thai 67 | 0E80..0EFF; Lao 68 | 0F00..0FFF; Tibetan 69 | 1000..109F; Myanmar 70 | 10A0..10FF; Georgian 71 | 1100..11FF; Hangul Jamo 72 | 1200..137F; Ethiopic 73 | 1380..139F; Ethiopic Supplement 74 | 13A0..13FF; Cherokee 75 | 1400..167F; Unified Canadian Aboriginal Syllabics 76 | 1680..169F; Ogham 77 | 16A0..16FF; Runic 78 | 1700..171F; Tagalog 79 | 1720..173F; Hanunoo 80 | 1740..175F; Buhid 81 | 1760..177F; Tagbanwa 82 | 1780..17FF; Khmer 83 | 1800..18AF; Mongolian 84 | 18B0..18FF; Unified Canadian Aboriginal Syllabics Extended 85 | 1900..194F; Limbu 86 | 1950..197F; Tai Le 87 | 1980..19DF; New Tai Lue 88 | 19E0..19FF; Khmer Symbols 89 | 1A00..1A1F; Buginese 90 | 1A20..1AAF; Tai Tham 91 | 1AB0..1AFF; Combining Diacritical Marks Extended 92 | 1B00..1B7F; Balinese 93 | 1B80..1BBF; Sundanese 94 | 1BC0..1BFF; Batak 95 | 1C00..1C4F; Lepcha 96 | 1C50..1C7F; Ol Chiki 97 | 1C80..1C8F; Cyrillic Extended-C 98 | 1C90..1CBF; Georgian Extended 99 | 1CC0..1CCF; Sundanese Supplement 100 | 1CD0..1CFF; 
Vedic Extensions 101 | 1D00..1D7F; Phonetic Extensions 102 | 1D80..1DBF; Phonetic Extensions Supplement 103 | 1DC0..1DFF; Combining Diacritical Marks Supplement 104 | 1E00..1EFF; Latin Extended Additional 105 | 1F00..1FFF; Greek Extended 106 | 2000..206F; General Punctuation 107 | 2070..209F; Superscripts and Subscripts 108 | 20A0..20CF; Currency Symbols 109 | 20D0..20FF; Combining Diacritical Marks for Symbols 110 | 2100..214F; Letterlike Symbols 111 | 2150..218F; Number Forms 112 | 2190..21FF; Arrows 113 | 2200..22FF; Mathematical Operators 114 | 2300..23FF; Miscellaneous Technical 115 | 2400..243F; Control Pictures 116 | 2440..245F; Optical Character Recognition 117 | 2460..24FF; Enclosed Alphanumerics 118 | 2500..257F; Box Drawing 119 | 2580..259F; Block Elements 120 | 25A0..25FF; Geometric Shapes 121 | 2600..26FF; Miscellaneous Symbols 122 | 2700..27BF; Dingbats 123 | 27C0..27EF; Miscellaneous Mathematical Symbols-A 124 | 27F0..27FF; Supplemental Arrows-A 125 | 2800..28FF; Braille Patterns 126 | 2900..297F; Supplemental Arrows-B 127 | 2980..29FF; Miscellaneous Mathematical Symbols-B 128 | 2A00..2AFF; Supplemental Mathematical Operators 129 | 2B00..2BFF; Miscellaneous Symbols and Arrows 130 | 2C00..2C5F; Glagolitic 131 | 2C60..2C7F; Latin Extended-C 132 | 2C80..2CFF; Coptic 133 | 2D00..2D2F; Georgian Supplement 134 | 2D30..2D7F; Tifinagh 135 | 2D80..2DDF; Ethiopic Extended 136 | 2DE0..2DFF; Cyrillic Extended-A 137 | 2E00..2E7F; Supplemental Punctuation 138 | 2E80..2EFF; CJK Radicals Supplement 139 | 2F00..2FDF; Kangxi Radicals 140 | 2FF0..2FFF; Ideographic Description Characters 141 | 3000..303F; CJK Symbols and Punctuation 142 | 3040..309F; Hiragana 143 | 30A0..30FF; Katakana 144 | 3100..312F; Bopomofo 145 | 3130..318F; Hangul Compatibility Jamo 146 | 3190..319F; Kanbun 147 | 31A0..31BF; Bopomofo Extended 148 | 31C0..31EF; CJK Strokes 149 | 31F0..31FF; Katakana Phonetic Extensions 150 | 3200..32FF; Enclosed CJK Letters and Months 151 | 3300..33FF; CJK 
Compatibility 152 | 3400..4DBF; CJK Unified Ideographs Extension A 153 | 4DC0..4DFF; Yijing Hexagram Symbols 154 | 4E00..9FFF; CJK Unified Ideographs 155 | A000..A48F; Yi Syllables 156 | A490..A4CF; Yi Radicals 157 | A4D0..A4FF; Lisu 158 | A500..A63F; Vai 159 | A640..A69F; Cyrillic Extended-B 160 | A6A0..A6FF; Bamum 161 | A700..A71F; Modifier Tone Letters 162 | A720..A7FF; Latin Extended-D 163 | A800..A82F; Syloti Nagri 164 | A830..A83F; Common Indic Number Forms 165 | A840..A87F; Phags-pa 166 | A880..A8DF; Saurashtra 167 | A8E0..A8FF; Devanagari Extended 168 | A900..A92F; Kayah Li 169 | A930..A95F; Rejang 170 | A960..A97F; Hangul Jamo Extended-A 171 | A980..A9DF; Javanese 172 | A9E0..A9FF; Myanmar Extended-B 173 | AA00..AA5F; Cham 174 | AA60..AA7F; Myanmar Extended-A 175 | AA80..AADF; Tai Viet 176 | AAE0..AAFF; Meetei Mayek Extensions 177 | AB00..AB2F; Ethiopic Extended-A 178 | AB30..AB6F; Latin Extended-E 179 | AB70..ABBF; Cherokee Supplement 180 | ABC0..ABFF; Meetei Mayek 181 | AC00..D7AF; Hangul Syllables 182 | D7B0..D7FF; Hangul Jamo Extended-B 183 | D800..DB7F; High Surrogates 184 | DB80..DBFF; High Private Use Surrogates 185 | DC00..DFFF; Low Surrogates 186 | E000..F8FF; Private Use Area 187 | F900..FAFF; CJK Compatibility Ideographs 188 | FB00..FB4F; Alphabetic Presentation Forms 189 | FB50..FDFF; Arabic Presentation Forms-A 190 | FE00..FE0F; Variation Selectors 191 | FE10..FE1F; Vertical Forms 192 | FE20..FE2F; Combining Half Marks 193 | FE30..FE4F; CJK Compatibility Forms 194 | FE50..FE6F; Small Form Variants 195 | FE70..FEFF; Arabic Presentation Forms-B 196 | FF00..FFEF; Halfwidth and Fullwidth Forms 197 | FFF0..FFFF; Specials 198 | 10000..1007F; Linear B Syllabary 199 | 10080..100FF; Linear B Ideograms 200 | 10100..1013F; Aegean Numbers 201 | 10140..1018F; Ancient Greek Numbers 202 | 10190..101CF; Ancient Symbols 203 | 101D0..101FF; Phaistos Disc 204 | 10280..1029F; Lycian 205 | 102A0..102DF; Carian 206 | 102E0..102FF; Coptic Epact Numbers 207 | 
10300..1032F; Old Italic 208 | 10330..1034F; Gothic 209 | 10350..1037F; Old Permic 210 | 10380..1039F; Ugaritic 211 | 103A0..103DF; Old Persian 212 | 10400..1044F; Deseret 213 | 10450..1047F; Shavian 214 | 10480..104AF; Osmanya 215 | 104B0..104FF; Osage 216 | 10500..1052F; Elbasan 217 | 10530..1056F; Caucasian Albanian 218 | 10600..1077F; Linear A 219 | 10800..1083F; Cypriot Syllabary 220 | 10840..1085F; Imperial Aramaic 221 | 10860..1087F; Palmyrene 222 | 10880..108AF; Nabataean 223 | 108E0..108FF; Hatran 224 | 10900..1091F; Phoenician 225 | 10920..1093F; Lydian 226 | 10980..1099F; Meroitic Hieroglyphs 227 | 109A0..109FF; Meroitic Cursive 228 | 10A00..10A5F; Kharoshthi 229 | 10A60..10A7F; Old South Arabian 230 | 10A80..10A9F; Old North Arabian 231 | 10AC0..10AFF; Manichaean 232 | 10B00..10B3F; Avestan 233 | 10B40..10B5F; Inscriptional Parthian 234 | 10B60..10B7F; Inscriptional Pahlavi 235 | 10B80..10BAF; Psalter Pahlavi 236 | 10C00..10C4F; Old Turkic 237 | 10C80..10CFF; Old Hungarian 238 | 10D00..10D3F; Hanifi Rohingya 239 | 10E60..10E7F; Rumi Numeral Symbols 240 | 10F00..10F2F; Old Sogdian 241 | 10F30..10F6F; Sogdian 242 | 10FE0..10FFF; Elymaic 243 | 11000..1107F; Brahmi 244 | 11080..110CF; Kaithi 245 | 110D0..110FF; Sora Sompeng 246 | 11100..1114F; Chakma 247 | 11150..1117F; Mahajani 248 | 11180..111DF; Sharada 249 | 111E0..111FF; Sinhala Archaic Numbers 250 | 11200..1124F; Khojki 251 | 11280..112AF; Multani 252 | 112B0..112FF; Khudawadi 253 | 11300..1137F; Grantha 254 | 11400..1147F; Newa 255 | 11480..114DF; Tirhuta 256 | 11580..115FF; Siddham 257 | 11600..1165F; Modi 258 | 11660..1167F; Mongolian Supplement 259 | 11680..116CF; Takri 260 | 11700..1173F; Ahom 261 | 11800..1184F; Dogra 262 | 118A0..118FF; Warang Citi 263 | 119A0..119FF; Nandinagari 264 | 11A00..11A4F; Zanabazar Square 265 | 11A50..11AAF; Soyombo 266 | 11AC0..11AFF; Pau Cin Hau 267 | 11C00..11C6F; Bhaiksuki 268 | 11C70..11CBF; Marchen 269 | 11D00..11D5F; Masaram Gondi 270 | 11D60..11DAF; Gunjala 
Gondi 271 | 11EE0..11EFF; Makasar 272 | 11FC0..11FFF; Tamil Supplement 273 | 12000..123FF; Cuneiform 274 | 12400..1247F; Cuneiform Numbers and Punctuation 275 | 12480..1254F; Early Dynastic Cuneiform 276 | 13000..1342F; Egyptian Hieroglyphs 277 | 13430..1343F; Egyptian Hieroglyph Format Controls 278 | 14400..1467F; Anatolian Hieroglyphs 279 | 16800..16A3F; Bamum Supplement 280 | 16A40..16A6F; Mro 281 | 16AD0..16AFF; Bassa Vah 282 | 16B00..16B8F; Pahawh Hmong 283 | 16E40..16E9F; Medefaidrin 284 | 16F00..16F9F; Miao 285 | 16FE0..16FFF; Ideographic Symbols and Punctuation 286 | 17000..187FF; Tangut 287 | 18800..18AFF; Tangut Components 288 | 1B000..1B0FF; Kana Supplement 289 | 1B100..1B12F; Kana Extended-A 290 | 1B130..1B16F; Small Kana Extension 291 | 1B170..1B2FF; Nushu 292 | 1BC00..1BC9F; Duployan 293 | 1BCA0..1BCAF; Shorthand Format Controls 294 | 1D000..1D0FF; Byzantine Musical Symbols 295 | 1D100..1D1FF; Musical Symbols 296 | 1D200..1D24F; Ancient Greek Musical Notation 297 | 1D2E0..1D2FF; Mayan Numerals 298 | 1D300..1D35F; Tai Xuan Jing Symbols 299 | 1D360..1D37F; Counting Rod Numerals 300 | 1D400..1D7FF; Mathematical Alphanumeric Symbols 301 | 1D800..1DAAF; Sutton SignWriting 302 | 1E000..1E02F; Glagolitic Supplement 303 | 1E100..1E14F; Nyiakeng Puachue Hmong 304 | 1E2C0..1E2FF; Wancho 305 | 1E800..1E8DF; Mende Kikakui 306 | 1E900..1E95F; Adlam 307 | 1EC70..1ECBF; Indic Siyaq Numbers 308 | 1ED00..1ED4F; Ottoman Siyaq Numbers 309 | 1EE00..1EEFF; Arabic Mathematical Alphabetic Symbols 310 | 1F000..1F02F; Mahjong Tiles 311 | 1F030..1F09F; Domino Tiles 312 | 1F0A0..1F0FF; Playing Cards 313 | 1F100..1F1FF; Enclosed Alphanumeric Supplement 314 | 1F200..1F2FF; Enclosed Ideographic Supplement 315 | 1F300..1F5FF; Miscellaneous Symbols and Pictographs 316 | 1F600..1F64F; Emoticons 317 | 1F650..1F67F; Ornamental Dingbats 318 | 1F680..1F6FF; Transport and Map Symbols 319 | 1F700..1F77F; Alchemical Symbols 320 | 1F780..1F7FF; Geometric Shapes Extended 321 | 1F800..1F8FF; 
Supplemental Arrows-C 322 | 1F900..1F9FF; Supplemental Symbols and Pictographs 323 | 1FA00..1FA6F; Chess Symbols 324 | 1FA70..1FAFF; Symbols and Pictographs Extended-A 325 | 20000..2A6DF; CJK Unified Ideographs Extension B 326 | 2A700..2B73F; CJK Unified Ideographs Extension C 327 | 2B740..2B81F; CJK Unified Ideographs Extension D 328 | 2B820..2CEAF; CJK Unified Ideographs Extension E 329 | 2CEB0..2EBEF; CJK Unified Ideographs Extension F 330 | 2F800..2FA1F; CJK Compatibility Ideographs Supplement 331 | E0000..E007F; Tags 332 | E0100..E01EF; Variation Selectors Supplement 333 | F0000..FFFFF; Supplementary Private Use Area-A 334 | 100000..10FFFF; Supplementary Private Use Area-B 335 | 336 | # EOF 337 | -------------------------------------------------------------------------------- /data/CompositionExclusions.txt: -------------------------------------------------------------------------------- 1 | # CompositionExclusions-12.0.0.txt 2 | # Date: 2018-08-03, 00:00:00 GMT [KW, LI] 3 | # © 2018 Unicode®, Inc. 4 | # For terms of use, see http://www.unicode.org/terms_of_use.html 5 | # 6 | # Unicode Character Database 7 | # For documentation, see http://www.unicode.org/reports/tr44/ 8 | # 9 | # This file lists the characters for the Composition Exclusion Table 10 | # defined in UAX #15, Unicode Normalization Forms. 11 | # 12 | # This file is a normative contributory data file in the 13 | # Unicode Character Database. 14 | # 15 | # For more information, see 16 | # http://www.unicode.org/unicode/reports/tr15/#Primary_Exclusion_List_Table 17 | # 18 | # For a full derivation of composition exclusions, see the derived property 19 | # Full_Composition_Exclusion in DerivedNormalizationProps.txt 20 | # 21 | 22 | # ================================================ 23 | # (1) Script Specifics 24 | # 25 | # This list of characters cannot be derived from the UnicodeData.txt file. 
26 | # ================================================ 27 | 28 | 0958 # DEVANAGARI LETTER QA 29 | 0959 # DEVANAGARI LETTER KHHA 30 | 095A # DEVANAGARI LETTER GHHA 31 | 095B # DEVANAGARI LETTER ZA 32 | 095C # DEVANAGARI LETTER DDDHA 33 | 095D # DEVANAGARI LETTER RHA 34 | 095E # DEVANAGARI LETTER FA 35 | 095F # DEVANAGARI LETTER YYA 36 | 09DC # BENGALI LETTER RRA 37 | 09DD # BENGALI LETTER RHA 38 | 09DF # BENGALI LETTER YYA 39 | 0A33 # GURMUKHI LETTER LLA 40 | 0A36 # GURMUKHI LETTER SHA 41 | 0A59 # GURMUKHI LETTER KHHA 42 | 0A5A # GURMUKHI LETTER GHHA 43 | 0A5B # GURMUKHI LETTER ZA 44 | 0A5E # GURMUKHI LETTER FA 45 | 0B5C # ORIYA LETTER RRA 46 | 0B5D # ORIYA LETTER RHA 47 | 0F43 # TIBETAN LETTER GHA 48 | 0F4D # TIBETAN LETTER DDHA 49 | 0F52 # TIBETAN LETTER DHA 50 | 0F57 # TIBETAN LETTER BHA 51 | 0F5C # TIBETAN LETTER DZHA 52 | 0F69 # TIBETAN LETTER KSSA 53 | 0F76 # TIBETAN VOWEL SIGN VOCALIC R 54 | 0F78 # TIBETAN VOWEL SIGN VOCALIC L 55 | 0F93 # TIBETAN SUBJOINED LETTER GHA 56 | 0F9D # TIBETAN SUBJOINED LETTER DDHA 57 | 0FA2 # TIBETAN SUBJOINED LETTER DHA 58 | 0FA7 # TIBETAN SUBJOINED LETTER BHA 59 | 0FAC # TIBETAN SUBJOINED LETTER DZHA 60 | 0FB9 # TIBETAN SUBJOINED LETTER KSSA 61 | FB1D # HEBREW LETTER YOD WITH HIRIQ 62 | FB1F # HEBREW LIGATURE YIDDISH YOD YOD PATAH 63 | FB2A # HEBREW LETTER SHIN WITH SHIN DOT 64 | FB2B # HEBREW LETTER SHIN WITH SIN DOT 65 | FB2C # HEBREW LETTER SHIN WITH DAGESH AND SHIN DOT 66 | FB2D # HEBREW LETTER SHIN WITH DAGESH AND SIN DOT 67 | FB2E # HEBREW LETTER ALEF WITH PATAH 68 | FB2F # HEBREW LETTER ALEF WITH QAMATS 69 | FB30 # HEBREW LETTER ALEF WITH MAPIQ 70 | FB31 # HEBREW LETTER BET WITH DAGESH 71 | FB32 # HEBREW LETTER GIMEL WITH DAGESH 72 | FB33 # HEBREW LETTER DALET WITH DAGESH 73 | FB34 # HEBREW LETTER HE WITH MAPIQ 74 | FB35 # HEBREW LETTER VAV WITH DAGESH 75 | FB36 # HEBREW LETTER ZAYIN WITH DAGESH 76 | FB38 # HEBREW LETTER TET WITH DAGESH 77 | FB39 # HEBREW LETTER YOD WITH DAGESH 78 | FB3A # HEBREW LETTER FINAL KAF WITH 
DAGESH 79 | FB3B # HEBREW LETTER KAF WITH DAGESH 80 | FB3C # HEBREW LETTER LAMED WITH DAGESH 81 | FB3E # HEBREW LETTER MEM WITH DAGESH 82 | FB40 # HEBREW LETTER NUN WITH DAGESH 83 | FB41 # HEBREW LETTER SAMEKH WITH DAGESH 84 | FB43 # HEBREW LETTER FINAL PE WITH DAGESH 85 | FB44 # HEBREW LETTER PE WITH DAGESH 86 | FB46 # HEBREW LETTER TSADI WITH DAGESH 87 | FB47 # HEBREW LETTER QOF WITH DAGESH 88 | FB48 # HEBREW LETTER RESH WITH DAGESH 89 | FB49 # HEBREW LETTER SHIN WITH DAGESH 90 | FB4A # HEBREW LETTER TAV WITH DAGESH 91 | FB4B # HEBREW LETTER VAV WITH HOLAM 92 | FB4C # HEBREW LETTER BET WITH RAFE 93 | FB4D # HEBREW LETTER KAF WITH RAFE 94 | FB4E # HEBREW LETTER PE WITH RAFE 95 | 96 | # Total code points: 67 97 | 98 | # ================================================ 99 | # (2) Post Composition Version precomposed characters 100 | # 101 | # These characters cannot be derived solely from the UnicodeData.txt file 102 | # in this version of Unicode. 103 | # 104 | # Note that characters added to the standard after the 105 | # Composition Version and which have canonical decomposition mappings 106 | # are not automatically added to this list of Post Composition 107 | # Version precomposed characters. 
108 | # ================================================ 109 | 110 | 2ADC # FORKING 111 | 1D15E # MUSICAL SYMBOL HALF NOTE 112 | 1D15F # MUSICAL SYMBOL QUARTER NOTE 113 | 1D160 # MUSICAL SYMBOL EIGHTH NOTE 114 | 1D161 # MUSICAL SYMBOL SIXTEENTH NOTE 115 | 1D162 # MUSICAL SYMBOL THIRTY-SECOND NOTE 116 | 1D163 # MUSICAL SYMBOL SIXTY-FOURTH NOTE 117 | 1D164 # MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE 118 | 1D1BB # MUSICAL SYMBOL MINIMA 119 | 1D1BC # MUSICAL SYMBOL MINIMA BLACK 120 | 1D1BD # MUSICAL SYMBOL SEMIMINIMA WHITE 121 | 1D1BE # MUSICAL SYMBOL SEMIMINIMA BLACK 122 | 1D1BF # MUSICAL SYMBOL FUSA WHITE 123 | 1D1C0 # MUSICAL SYMBOL FUSA BLACK 124 | 125 | # Total code points: 14 126 | 127 | # ================================================ 128 | # (3) Singleton Decompositions 129 | # 130 | # These characters can be derived from the UnicodeData.txt file 131 | # by including all canonically decomposable characters whose 132 | # canonical decomposition consists of a single character. 133 | # 134 | # These characters are simply quoted here for reference. 
135 | # See also Full_Composition_Exclusion in DerivedNormalizationProps.txt 136 | # ================================================ 137 | 138 | # 0340..0341 [2] COMBINING GRAVE TONE MARK..COMBINING ACUTE TONE MARK 139 | # 0343 COMBINING GREEK KORONIS 140 | # 0374 GREEK NUMERAL SIGN 141 | # 037E GREEK QUESTION MARK 142 | # 0387 GREEK ANO TELEIA 143 | # 1F71 GREEK SMALL LETTER ALPHA WITH OXIA 144 | # 1F73 GREEK SMALL LETTER EPSILON WITH OXIA 145 | # 1F75 GREEK SMALL LETTER ETA WITH OXIA 146 | # 1F77 GREEK SMALL LETTER IOTA WITH OXIA 147 | # 1F79 GREEK SMALL LETTER OMICRON WITH OXIA 148 | # 1F7B GREEK SMALL LETTER UPSILON WITH OXIA 149 | # 1F7D GREEK SMALL LETTER OMEGA WITH OXIA 150 | # 1FBB GREEK CAPITAL LETTER ALPHA WITH OXIA 151 | # 1FBE GREEK PROSGEGRAMMENI 152 | # 1FC9 GREEK CAPITAL LETTER EPSILON WITH OXIA 153 | # 1FCB GREEK CAPITAL LETTER ETA WITH OXIA 154 | # 1FD3 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA 155 | # 1FDB GREEK CAPITAL LETTER IOTA WITH OXIA 156 | # 1FE3 GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA 157 | # 1FEB GREEK CAPITAL LETTER UPSILON WITH OXIA 158 | # 1FEE..1FEF [2] GREEK DIALYTIKA AND OXIA..GREEK VARIA 159 | # 1FF9 GREEK CAPITAL LETTER OMICRON WITH OXIA 160 | # 1FFB GREEK CAPITAL LETTER OMEGA WITH OXIA 161 | # 1FFD GREEK OXIA 162 | # 2000..2001 [2] EN QUAD..EM QUAD 163 | # 2126 OHM SIGN 164 | # 212A..212B [2] KELVIN SIGN..ANGSTROM SIGN 165 | # 2329 LEFT-POINTING ANGLE BRACKET 166 | # 232A RIGHT-POINTING ANGLE BRACKET 167 | # F900..FA0D [270] CJK COMPATIBILITY IDEOGRAPH-F900..CJK COMPATIBILITY IDEOGRAPH-FA0D 168 | # FA10 CJK COMPATIBILITY IDEOGRAPH-FA10 169 | # FA12 CJK COMPATIBILITY IDEOGRAPH-FA12 170 | # FA15..FA1E [10] CJK COMPATIBILITY IDEOGRAPH-FA15..CJK COMPATIBILITY IDEOGRAPH-FA1E 171 | # FA20 CJK COMPATIBILITY IDEOGRAPH-FA20 172 | # FA22 CJK COMPATIBILITY IDEOGRAPH-FA22 173 | # FA25..FA26 [2] CJK COMPATIBILITY IDEOGRAPH-FA25..CJK COMPATIBILITY IDEOGRAPH-FA26 174 | # FA2A..FA6D [68] CJK COMPATIBILITY IDEOGRAPH-FA2A..CJK 
COMPATIBILITY IDEOGRAPH-FA6D 175 | # FA70..FAD9 [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILITY IDEOGRAPH-FAD9 176 | # 2F800..2FA1D [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D 177 | 178 | # Total code points: 1035 179 | 180 | # ================================================ 181 | # (4) Non-Starter Decompositions 182 | # 183 | # These characters can be derived from the UnicodeData.txt file 184 | # by including each expanding canonical decomposition 185 | # (i.e., those which canonically decompose to a sequence 186 | # of characters instead of a single character), such that: 187 | # 188 | # A. The character is not a Starter. 189 | # 190 | # OR (inclusive) 191 | # 192 | # B. The character's canonical decomposition begins 193 | # with a character that is not a Starter. 194 | # 195 | # Note that a "Starter" is any character with a zero combining class. 196 | # 197 | # These characters are simply quoted here for reference. 198 | # See also Full_Composition_Exclusion in DerivedNormalizationProps.txt 199 | # ================================================ 200 | 201 | # 0344 COMBINING GREEK DIALYTIKA TONOS 202 | # 0F73 TIBETAN VOWEL SIGN II 203 | # 0F75 TIBETAN VOWEL SIGN UU 204 | # 0F81 TIBETAN VOWEL SIGN REVERSED II 205 | 206 | # Total code points: 4 207 | 208 | # EOF 209 | -------------------------------------------------------------------------------- /data/IndicPositionalCategory.txt: -------------------------------------------------------------------------------- 1 | # IndicPositionalCategory-12.0.0.txt 2 | # Date: 2019-01-31, 02:26:00 GMT [KW, RP] 3 | # © 2019 Unicode®, Inc. 4 | # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries. 
5 | # For terms of use, see http://www.unicode.org/terms_of_use.html 6 | # 7 | # For documentation, see UAX #44: Unicode Character Database, 8 | # at http://www.unicode.org/reports/tr44/ 9 | # 10 | # This file defines the following property: 11 | # 12 | # Indic_Positional_Category enumerated property 13 | # 14 | # Scope: This property is aimed at the problem of 15 | # the specification of syllabic structure for Indic scripts. 16 | # Because dependent vowels (matras), visible viramas, and other 17 | # characters are placed in notional slots around the consonant (or 18 | # consonant cluster) core of an Indic syllable, there may be 19 | # cooccurrence constraints or other interactions. Also, it may be 20 | # desirable, in cases where more than one such character may occur in 21 | # sequence, as for example, in a top slot and a bottom slot, to 22 | # specify preferred orders for spelling. As such, this property 23 | # is designed primarily to supplement the Indic_Syllabic_Category 24 | # property. 25 | # 26 | # Note that this property is *not* intended as 27 | # a prescriptive property regarding display or font design, 28 | # for a number of reasons. Good font design requires information 29 | # that is outside the context of a character encoding standard, 30 | # and is best handled in other venues. For Indic dependent 31 | # vowels and similar characters, in particular: 32 | # 33 | # 1. Matra placement may vary somewhat based on typeface design. 34 | # 2. Matra placement, even within a single script, may vary 35 | # somewhat according to historic period or local conventions. 36 | # 3. Matra placement may be changed by explicit orthographic reform 37 | # decisions. 38 | # 4. Matras may ligate in various ways with a consonant (or even 39 | # other elements of a syllable) instead of occurring in a 40 | # discrete location. 41 | # 5. Matra display may be contextually determined. 
This is 42 | # notable, for example, in the Tamil script, where the shape 43 | # and placement of -u and -uu vowels depends strongly on 44 | # which consonant they adjoin. 45 | # 46 | # Format: 47 | # Field 0 Unicode code point value or range of code point values 48 | # Field 1 Indic_Positional_Category property value 49 | # 50 | # Field 1 is followed by a comment field, starting with the number sign '#', 51 | # which shows the General_Category property value, the Unicode character name 52 | # or names, and, in lines with ranges of code points, the code point count in 53 | # square brackets. 54 | # 55 | # The scripts assessed as containing dependent vowels or similar characters 56 | # in the structural sense used for the Indic_Positional_Category are the 57 | # following: 58 | # 59 | # Ahom, Balinese, Batak, Bengali, Bhaiksuki, Brahmi, Buginese, Buhid, 60 | # Chakma, Cham, Devanagari, Dogra, Grantha, Gujarati, Gunjala Gondi, 61 | # Gurmukhi, Hanunoo, Javanese, Kaithi, Kannada, Kharoshthi, Khmer, 62 | # Khojki, Khudawadi, Lao, Lepcha, Limbu, Makasar, Malayalam, Marchen, 63 | # Masaram Gondi, Meetei Mayek, Modi, Myanmar, Nandinagari, Newa, 64 | # New Tai Lue, Oriya, Rejang, Saurashtra, Sharada, Siddham, Sinhala, 65 | # Soyombo, Sundanese, Syloti Nagri, Tagalog, Tagbanwa, Tai Tham, Tai 66 | # Viet, Takri, Tamil, Telugu, Thai, Tibetan, Tirhuta, and Zanabazar 67 | # Square. 68 | # 69 | # All characters for all other scripts not in that list 70 | # take the default value for this property. 71 | # 72 | # See IndicSyllabicCategory.txt for a slightly more extended 73 | # list of Indic scripts, including those which do not have 74 | # positional characters. Currently, those additional 75 | # Indic scripts without positional characters are 76 | # Kayah Li, Mahajani, Multani, Phags-pa, and Tai Le. 77 | # 78 | # Notes: 79 | # 80 | # 1. 
The following characters are all assigned the positional category Right, 81 | # but may have different positions in some cases: 82 | # * U+0BC1 TAMIL VOWEL SIGN U and U+0BC2 TAMIL VOWEL SIGN UU have 83 | # contextually variable placement in Tamil. 84 | # * U+0D41 MALAYALAM VOWEL SIGN U and U+0D42 MALAYALAM VOWEL SIGN UU form 85 | # complex ligatures with consonants in older Malayalam orthography. 86 | # * U+11341 GRANTHA VOWEL SIGN U and U+11342 GRANTHA VOWEL SIGN UU have 87 | # contextually variable placement in Grantha. 88 | # * U+11440 NEWA VOWEL SIGN O and U+11441 NEWA VOWEL SIGN AU have contextually 89 | # variable placement in Newa. 90 | # 91 | # 2. The following characters are all assigned the positional category Top, 92 | # but may have different positions in some cases: 93 | # * U+1143E NEWA VOWEL SIGN E and U+1143F NEWA VOWEL SIGN AI have contextually 94 | # variable placement in Newa. 95 | # 96 | # 3. The following characters are all assigned the positional category Bottom, 97 | # but may have different positions in some cases: 98 | # * U+102F MYANMAR VOWEL SIGN U and U+1030 MYANMAR VOWEL SIGN UU have 99 | # contextually variable placement in Myanmar. 100 | # * U+1A69 TAI THAM VOWEL SIGN U and U+1A6A TAI THAM VOWEL SIGN UU have 101 | # contextually variable placement in Tai Tham. 102 | # 103 | # 4. The following character is assigned the positional category Left, but 104 | # may have different positions in different styles: 105 | # * U+119D2 NANDINAGARI VOWEL SIGN I has stylistically variable placement 106 | # in Nandinagari. 107 | 108 | 109 | # ================================================ 110 | 111 | # Property: Indic_Positional_Category 112 | # 113 | # All code points not explicitly listed for Indic_Positional_Category 114 | # have the value NA (not applicable). 
115 | # 116 | # @missing: 0000..10FFFF; NA 117 | 118 | # ------------------------------------------------ 119 | 120 | # Indic_Positional_Category=Right 121 | 122 | 0903 ; Right # Mc DEVANAGARI SIGN VISARGA 123 | 093B ; Right # Mc DEVANAGARI VOWEL SIGN OOE 124 | 093E ; Right # Mc DEVANAGARI VOWEL SIGN AA 125 | 0940 ; Right # Mc DEVANAGARI VOWEL SIGN II 126 | 0949..094C ; Right # Mc [4] DEVANAGARI VOWEL SIGN CANDRA O..DEVANAGARI VOWEL SIGN AU 127 | 094F ; Right # Mc DEVANAGARI VOWEL SIGN AW 128 | 0982..0983 ; Right # Mc [2] BENGALI SIGN ANUSVARA..BENGALI SIGN VISARGA 129 | 09BE ; Right # Mc BENGALI VOWEL SIGN AA 130 | 09C0 ; Right # Mc BENGALI VOWEL SIGN II 131 | 09D7 ; Right # Mc BENGALI AU LENGTH MARK 132 | 0A03 ; Right # Mc GURMUKHI SIGN VISARGA 133 | 0A3E ; Right # Mc GURMUKHI VOWEL SIGN AA 134 | 0A40 ; Right # Mc GURMUKHI VOWEL SIGN II 135 | 0A83 ; Right # Mc GUJARATI SIGN VISARGA 136 | 0ABE ; Right # Mc GUJARATI VOWEL SIGN AA 137 | 0AC0 ; Right # Mc GUJARATI VOWEL SIGN II 138 | 0ACB..0ACC ; Right # Mc [2] GUJARATI VOWEL SIGN O..GUJARATI VOWEL SIGN AU 139 | 0B02..0B03 ; Right # Mc [2] ORIYA SIGN ANUSVARA..ORIYA SIGN VISARGA 140 | 0B3E ; Right # Mc ORIYA VOWEL SIGN AA 141 | 0B40 ; Right # Mc ORIYA VOWEL SIGN II 142 | 0BBE..0BBF ; Right # Mc [2] TAMIL VOWEL SIGN AA..TAMIL VOWEL SIGN I 143 | 0BC1..0BC2 ; Right # Mc [2] TAMIL VOWEL SIGN U..TAMIL VOWEL SIGN UU 144 | 0BD7 ; Right # Mc TAMIL AU LENGTH MARK 145 | 0C01..0C03 ; Right # Mc [3] TELUGU SIGN CANDRABINDU..TELUGU SIGN VISARGA 146 | 0C41..0C44 ; Right # Mc [4] TELUGU VOWEL SIGN U..TELUGU VOWEL SIGN VOCALIC RR 147 | 0C82..0C83 ; Right # Mc [2] KANNADA SIGN ANUSVARA..KANNADA SIGN VISARGA 148 | 0CBE ; Right # Mc KANNADA VOWEL SIGN AA 149 | 0CC1..0CC4 ; Right # Mc [4] KANNADA VOWEL SIGN U..KANNADA VOWEL SIGN VOCALIC RR 150 | 0CD5..0CD6 ; Right # Mc [2] KANNADA LENGTH MARK..KANNADA AI LENGTH MARK 151 | 0D02..0D03 ; Right # Mc [2] MALAYALAM SIGN ANUSVARA..MALAYALAM SIGN VISARGA 152 | 0D3E..0D40 ; Right # Mc [3] 
MALAYALAM VOWEL SIGN AA..MALAYALAM VOWEL SIGN II 153 | 0D41..0D42 ; Right # Mn [2] MALAYALAM VOWEL SIGN U..MALAYALAM VOWEL SIGN UU 154 | 0D57 ; Right # Mc MALAYALAM AU LENGTH MARK 155 | 0D82..0D83 ; Right # Mc [2] SINHALA SIGN ANUSVARAYA..SINHALA SIGN VISARGAYA 156 | 0DCF..0DD1 ; Right # Mc [3] SINHALA VOWEL SIGN AELA-PILLA..SINHALA VOWEL SIGN DIGA AEDA-PILLA 157 | 0DD8 ; Right # Mc SINHALA VOWEL SIGN GAETTA-PILLA 158 | 0DDF ; Right # Mc SINHALA VOWEL SIGN GAYANUKITTA 159 | 0DF2..0DF3 ; Right # Mc [2] SINHALA VOWEL SIGN DIGA GAETTA-PILLA..SINHALA VOWEL SIGN DIGA GAYANUKITTA 160 | 0E30 ; Right # Lo THAI CHARACTER SARA A 161 | 0E32..0E33 ; Right # Lo [2] THAI CHARACTER SARA AA..THAI CHARACTER SARA AM 162 | 0E45 ; Right # Lo THAI CHARACTER LAKKHANGYAO 163 | 0EB0 ; Right # Lo LAO VOWEL SIGN A 164 | 0EB2..0EB3 ; Right # Lo [2] LAO VOWEL SIGN AA..LAO VOWEL SIGN AM 165 | 0F3E ; Right # Mc TIBETAN SIGN YAR TSHES 166 | 0F7F ; Right # Mc TIBETAN SIGN RNAM BCAD 167 | 102B..102C ; Right # Mc [2] MYANMAR VOWEL SIGN TALL AA..MYANMAR VOWEL SIGN AA 168 | 1038 ; Right # Mc MYANMAR SIGN VISARGA 169 | 103B ; Right # Mc MYANMAR CONSONANT SIGN MEDIAL YA 170 | 1056..1057 ; Right # Mc [2] MYANMAR VOWEL SIGN VOCALIC R..MYANMAR VOWEL SIGN VOCALIC RR 171 | 1062..1064 ; Right # Mc [3] MYANMAR VOWEL SIGN SGAW KAREN EU..MYANMAR TONE MARK SGAW KAREN KE PHO 172 | 1067..106D ; Right # Mc [7] MYANMAR VOWEL SIGN WESTERN PWO KAREN EU..MYANMAR SIGN WESTERN PWO KAREN TONE-5 173 | 1083 ; Right # Mc MYANMAR VOWEL SIGN SHAN AA 174 | 1087..108C ; Right # Mc [6] MYANMAR SIGN SHAN TONE-2..MYANMAR SIGN SHAN COUNCIL TONE-3 175 | 108F ; Right # Mc MYANMAR SIGN RUMAI PALAUNG TONE-5 176 | 109A..109C ; Right # Mc [3] MYANMAR SIGN KHAMTI TONE-1..MYANMAR VOWEL SIGN AITON A 177 | 17B6 ; Right # Mc KHMER VOWEL SIGN AA 178 | 17C7..17C8 ; Right # Mc [2] KHMER SIGN REAHMUK..KHMER SIGN YUUKALEAPINTU 179 | 1923..1924 ; Right # Mc [2] LIMBU VOWEL SIGN EE..LIMBU VOWEL SIGN AI 180 | 1929..192B ; Right # Mc [3] LIMBU 
SUBJOINED LETTER YA..LIMBU SUBJOINED LETTER WA 181 | 1930..1931 ; Right # Mc [2] LIMBU SMALL LETTER KA..LIMBU SMALL LETTER NGA 182 | 1933..1938 ; Right # Mc [6] LIMBU SMALL LETTER TA..LIMBU SMALL LETTER LA 183 | 19B0..19B4 ; Right # Lo [5] NEW TAI LUE VOWEL SIGN VOWEL SHORTENER..NEW TAI LUE VOWEL SIGN UU 184 | 19B8..19B9 ; Right # Lo [2] NEW TAI LUE VOWEL SIGN OA..NEW TAI LUE VOWEL SIGN UE 185 | 19BB..19C0 ; Right # Lo [6] NEW TAI LUE VOWEL SIGN AAY..NEW TAI LUE VOWEL SIGN IY 186 | 19C8..19C9 ; Right # Lo [2] NEW TAI LUE TONE MARK-1..NEW TAI LUE TONE MARK-2 187 | 1A1A ; Right # Mc BUGINESE VOWEL SIGN O 188 | 1A57 ; Right # Mc TAI THAM CONSONANT SIGN LA TANG LAI 189 | 1A61 ; Right # Mc TAI THAM VOWEL SIGN A 190 | 1A63..1A64 ; Right # Mc [2] TAI THAM VOWEL SIGN AA..TAI THAM VOWEL SIGN TALL AA 191 | 1A6D ; Right # Mc TAI THAM VOWEL SIGN OY 192 | 1B04 ; Right # Mc BALINESE SIGN BISAH 193 | 1B35 ; Right # Mc BALINESE VOWEL SIGN TEDUNG 194 | 1B44 ; Right # Mc BALINESE ADEG ADEG 195 | 1B82 ; Right # Mc SUNDANESE SIGN PANGWISAD 196 | 1BA1 ; Right # Mc SUNDANESE CONSONANT SIGN PAMINGKAL 197 | 1BA7 ; Right # Mc SUNDANESE VOWEL SIGN PANOLONG 198 | 1BAA ; Right # Mc SUNDANESE SIGN PAMAAEH 199 | 1BE7 ; Right # Mc BATAK VOWEL SIGN E 200 | 1BEA..1BEC ; Right # Mc [3] BATAK VOWEL SIGN I..BATAK VOWEL SIGN O 201 | 1BEE ; Right # Mc BATAK VOWEL SIGN U 202 | 1BF2..1BF3 ; Right # Mc [2] BATAK PANGOLAT..BATAK PANONGONAN 203 | 1C24..1C26 ; Right # Mc [3] LEPCHA SUBJOINED LETTER YA..LEPCHA VOWEL SIGN AA 204 | 1C2A..1C2B ; Right # Mc [2] LEPCHA VOWEL SIGN U..LEPCHA VOWEL SIGN UU 205 | 1CE1 ; Right # Mc VEDIC TONE ATHARVAVEDIC INDEPENDENT SVARITA 206 | 1CF7 ; Right # Mc VEDIC SIGN ATIKRAMA 207 | A823..A824 ; Right # Mc [2] SYLOTI NAGRI VOWEL SIGN A..SYLOTI NAGRI VOWEL SIGN I 208 | A827 ; Right # Mc SYLOTI NAGRI VOWEL SIGN OO 209 | A880..A881 ; Right # Mc [2] SAURASHTRA SIGN ANUSVARA..SAURASHTRA SIGN VISARGA 210 | A8B4..A8C3 ; Right # Mc [16] SAURASHTRA CONSONANT SIGN HAARU..SAURASHTRA VOWEL 
SIGN AU 211 | A952..A953 ; Right # Mc [2] REJANG CONSONANT SIGN H..REJANG VIRAMA 212 | A983 ; Right # Mc JAVANESE SIGN WIGNYAN 213 | A9B4..A9B5 ; Right # Mc [2] JAVANESE VOWEL SIGN TARUNG..JAVANESE VOWEL SIGN TOLONG 214 | A9BE ; Right # Mc JAVANESE CONSONANT SIGN PENGKAL 215 | AA33 ; Right # Mc CHAM CONSONANT SIGN YA 216 | AA4D ; Right # Mc CHAM CONSONANT SIGN FINAL H 217 | AA7B ; Right # Mc MYANMAR SIGN PAO KAREN TONE 218 | AA7D ; Right # Mc MYANMAR SIGN TAI LAING TONE-5 219 | AAB1 ; Right # Lo TAI VIET VOWEL AA 220 | AABA ; Right # Lo TAI VIET VOWEL UA 221 | AABD ; Right # Lo TAI VIET VOWEL AN 222 | AAEF ; Right # Mc MEETEI MAYEK VOWEL SIGN AAU 223 | AAF5 ; Right # Mc MEETEI MAYEK VOWEL SIGN VISARGA 224 | ABE3..ABE4 ; Right # Mc [2] MEETEI MAYEK VOWEL SIGN ONAP..MEETEI MAYEK VOWEL SIGN INAP 225 | ABE6..ABE7 ; Right # Mc [2] MEETEI MAYEK VOWEL SIGN YENAP..MEETEI MAYEK VOWEL SIGN SOUNAP 226 | ABE9..ABEA ; Right # Mc [2] MEETEI MAYEK VOWEL SIGN CHEINAP..MEETEI MAYEK VOWEL SIGN NUNG 227 | ABEC ; Right # Mc MEETEI MAYEK LUM IYEK 228 | 11000 ; Right # Mc BRAHMI SIGN CANDRABINDU 229 | 11002 ; Right # Mc BRAHMI SIGN VISARGA 230 | 11082 ; Right # Mc KAITHI SIGN VISARGA 231 | 110B0 ; Right # Mc KAITHI VOWEL SIGN AA 232 | 110B2 ; Right # Mc KAITHI VOWEL SIGN II 233 | 110B7..110B8 ; Right # Mc [2] KAITHI VOWEL SIGN O..KAITHI VOWEL SIGN AU 234 | 11145..11146 ; Right # Mc [2] CHAKMA VOWEL SIGN AA..CHAKMA VOWEL SIGN EI 235 | 11182 ; Right # Mc SHARADA SIGN VISARGA 236 | 111B3 ; Right # Mc SHARADA VOWEL SIGN AA 237 | 111B5 ; Right # Mc SHARADA VOWEL SIGN II 238 | 111C0 ; Right # Mc SHARADA SIGN VIRAMA 239 | 1122C..1122E ; Right # Mc [3] KHOJKI VOWEL SIGN AA..KHOJKI VOWEL SIGN II 240 | 11235 ; Right # Mc KHOJKI SIGN VIRAMA 241 | 112E0 ; Right # Mc KHUDAWADI VOWEL SIGN AA 242 | 112E2 ; Right # Mc KHUDAWADI VOWEL SIGN II 243 | 11302..11303 ; Right # Mc [2] GRANTHA SIGN ANUSVARA..GRANTHA SIGN VISARGA 244 | 1133E..1133F ; Right # Mc [2] GRANTHA VOWEL SIGN AA..GRANTHA VOWEL SIGN I 245 
| 11341..11344 ; Right # Mc [4] GRANTHA VOWEL SIGN U..GRANTHA VOWEL SIGN VOCALIC RR 246 | 1134D ; Right # Mc GRANTHA SIGN VIRAMA 247 | 11357 ; Right # Mc GRANTHA AU LENGTH MARK 248 | 11362..11363 ; Right # Mc [2] GRANTHA VOWEL SIGN VOCALIC L..GRANTHA VOWEL SIGN VOCALIC LL 249 | 11435 ; Right # Mc NEWA VOWEL SIGN AA 250 | 11437 ; Right # Mc NEWA VOWEL SIGN II 251 | 11440..11441 ; Right # Mc [2] NEWA VOWEL SIGN O..NEWA VOWEL SIGN AU 252 | 11445 ; Right # Mc NEWA SIGN VISARGA 253 | 114B0 ; Right # Mc TIRHUTA VOWEL SIGN AA 254 | 114B2 ; Right # Mc TIRHUTA VOWEL SIGN II 255 | 114BD ; Right # Mc TIRHUTA VOWEL SIGN SHORT O 256 | 114C1 ; Right # Mc TIRHUTA SIGN VISARGA 257 | 115AF ; Right # Mc SIDDHAM VOWEL SIGN AA 258 | 115B1 ; Right # Mc SIDDHAM VOWEL SIGN II 259 | 115BE ; Right # Mc SIDDHAM SIGN VISARGA 260 | 11630..11632 ; Right # Mc [3] MODI VOWEL SIGN AA..MODI VOWEL SIGN II 261 | 1163B..1163C ; Right # Mc [2] MODI VOWEL SIGN O..MODI VOWEL SIGN AU 262 | 1163E ; Right # Mc MODI SIGN VISARGA 263 | 116AC ; Right # Mc TAKRI SIGN VISARGA 264 | 116AF ; Right # Mc TAKRI VOWEL SIGN II 265 | 116B6 ; Right # Mc TAKRI SIGN VIRAMA 266 | 11720..11721 ; Right # Mc [2] AHOM VOWEL SIGN A..AHOM VOWEL SIGN AA 267 | 1182C ; Right # Mc DOGRA VOWEL SIGN AA 268 | 1182E ; Right # Mc DOGRA VOWEL SIGN II 269 | 11838 ; Right # Mc DOGRA SIGN VISARGA 270 | 119D1 ; Right # Mc NANDINAGARI VOWEL SIGN AA 271 | 119D3 ; Right # Mc NANDINAGARI VOWEL SIGN II 272 | 119DC..119DF ; Right # Mc [4] NANDINAGARI VOWEL SIGN O..NANDINAGARI SIGN VISARGA 273 | 11A39 ; Right # Mc ZANABAZAR SQUARE SIGN VISARGA 274 | 11A57..11A58 ; Right # Mc [2] SOYOMBO VOWEL SIGN AI..SOYOMBO VOWEL SIGN AU 275 | 11A97 ; Right # Mc SOYOMBO SIGN VISARGA 276 | 11C2F ; Right # Mc BHAIKSUKI VOWEL SIGN AA 277 | 11C3E ; Right # Mc BHAIKSUKI SIGN VISARGA 278 | 11CA9 ; Right # Mc MARCHEN SUBJOINED LETTER YA 279 | 11CB4 ; Right # Mc MARCHEN VOWEL SIGN O 280 | 11D8A..11D8E ; Right # Mc [5] GUNJALA GONDI VOWEL SIGN AA..GUNJALA GONDI VOWEL SIGN 
UU 281 | 11D93..11D94 ; Right # Mc [2] GUNJALA GONDI VOWEL SIGN OO..GUNJALA GONDI VOWEL SIGN AU 282 | 11D96 ; Right # Mc GUNJALA GONDI SIGN VISARGA 283 | 11EF6 ; Right # Mc MAKASAR VOWEL SIGN O 284 | 285 | # Indic_Positional_Category=Left 286 | 287 | 093F ; Left # Mc DEVANAGARI VOWEL SIGN I 288 | 094E ; Left # Mc DEVANAGARI VOWEL SIGN PRISHTHAMATRA E 289 | 09BF ; Left # Mc BENGALI VOWEL SIGN I 290 | 09C7..09C8 ; Left # Mc [2] BENGALI VOWEL SIGN E..BENGALI VOWEL SIGN AI 291 | 0A3F ; Left # Mc GURMUKHI VOWEL SIGN I 292 | 0ABF ; Left # Mc GUJARATI VOWEL SIGN I 293 | 0B47 ; Left # Mc ORIYA VOWEL SIGN E 294 | 0BC6..0BC8 ; Left # Mc [3] TAMIL VOWEL SIGN E..TAMIL VOWEL SIGN AI 295 | 0D46..0D48 ; Left # Mc [3] MALAYALAM VOWEL SIGN E..MALAYALAM VOWEL SIGN AI 296 | 0DD9 ; Left # Mc SINHALA VOWEL SIGN KOMBUVA 297 | 0DDB ; Left # Mc SINHALA VOWEL SIGN KOMBU DEKA 298 | 0F3F ; Left # Mc TIBETAN SIGN MAR TSHES 299 | 1031 ; Left # Mc MYANMAR VOWEL SIGN E 300 | 1084 ; Left # Mc MYANMAR VOWEL SIGN SHAN E 301 | 17C1..17C3 ; Left # Mc [3] KHMER VOWEL SIGN E..KHMER VOWEL SIGN AI 302 | 1A19 ; Left # Mc BUGINESE VOWEL SIGN E 303 | 1A55 ; Left # Mc TAI THAM CONSONANT SIGN MEDIAL RA 304 | 1A6E..1A72 ; Left # Mc [5] TAI THAM VOWEL SIGN E..TAI THAM VOWEL SIGN THAM AI 305 | 1B3E..1B3F ; Left # Mc [2] BALINESE VOWEL SIGN TALING..BALINESE VOWEL SIGN TALING REPA 306 | 1BA6 ; Left # Mc SUNDANESE VOWEL SIGN PANAELAENG 307 | 1C27..1C28 ; Left # Mc [2] LEPCHA VOWEL SIGN I..LEPCHA VOWEL SIGN O 308 | 1C34..1C35 ; Left # Mc [2] LEPCHA CONSONANT SIGN NYIN-DO..LEPCHA CONSONANT SIGN KANG 309 | A9BA..A9BB ; Left # Mc [2] JAVANESE VOWEL SIGN TALING..JAVANESE VOWEL SIGN DIRGA MURE 310 | AA2F..AA30 ; Left # Mc [2] CHAM VOWEL SIGN O..CHAM VOWEL SIGN AI 311 | AA34 ; Left # Mc CHAM CONSONANT SIGN RA 312 | AAEB ; Left # Mc MEETEI MAYEK VOWEL SIGN II 313 | AAEE ; Left # Mc MEETEI MAYEK VOWEL SIGN AU 314 | 110B1 ; Left # Mc KAITHI VOWEL SIGN I 315 | 1112C ; Left # Mc CHAKMA VOWEL SIGN E 316 | 111B4 ; Left # Mc 
SHARADA VOWEL SIGN I 317 | 112E1 ; Left # Mc KHUDAWADI VOWEL SIGN I 318 | 11347..11348 ; Left # Mc [2] GRANTHA VOWEL SIGN EE..GRANTHA VOWEL SIGN AI 319 | 11436 ; Left # Mc NEWA VOWEL SIGN I 320 | 114B1 ; Left # Mc TIRHUTA VOWEL SIGN I 321 | 114B9 ; Left # Mc TIRHUTA VOWEL SIGN E 322 | 115B0 ; Left # Mc SIDDHAM VOWEL SIGN I 323 | 115B8 ; Left # Mc SIDDHAM VOWEL SIGN E 324 | 116AE ; Left # Mc TAKRI VOWEL SIGN I 325 | 11726 ; Left # Mc AHOM VOWEL SIGN E 326 | 1182D ; Left # Mc DOGRA VOWEL SIGN I 327 | 119D2 ; Left # Mc NANDINAGARI VOWEL SIGN I 328 | 119E4 ; Left # Mc NANDINAGARI VOWEL SIGN PRISHTHAMATRA E 329 | 11CB1 ; Left # Mc MARCHEN VOWEL SIGN I 330 | 11EF5 ; Left # Mc MAKASAR VOWEL SIGN E 331 | 332 | # Indic_Positional_Category=Visual_Order_Left 333 | 334 | # These are dependent vowels that occur to the left of the consonant 335 | # letter in a syllable, but which occur in scripts using the visual order 336 | # model, instead of the logical order model. Because of the different 337 | # model, these left-side vowels occur first in the backing store (before 338 | # the consonant letter) and are not reordered during text rendering. 
339 | # 340 | # [Derivation: Logical_Order_Exception=Yes] 341 | 342 | 0E40..0E44 ; Visual_Order_Left # Lo [5] THAI CHARACTER SARA E..THAI CHARACTER SARA AI MAIMALAI 343 | 0EC0..0EC4 ; Visual_Order_Left # Lo [5] LAO VOWEL SIGN E..LAO VOWEL SIGN AI 344 | 19B5..19B7 ; Visual_Order_Left # Lo [3] NEW TAI LUE VOWEL SIGN E..NEW TAI LUE VOWEL SIGN O 345 | 19BA ; Visual_Order_Left # Lo NEW TAI LUE VOWEL SIGN AY 346 | AAB5..AAB6 ; Visual_Order_Left # Lo [2] TAI VIET VOWEL E..TAI VIET VOWEL O 347 | AAB9 ; Visual_Order_Left # Lo TAI VIET VOWEL UEA 348 | AABB..AABC ; Visual_Order_Left # Lo [2] TAI VIET VOWEL AUE..TAI VIET VOWEL AY 349 | 350 | # Indic_Positional_Category=Left_And_Right 351 | 352 | 09CB..09CC ; Left_And_Right # Mc [2] BENGALI VOWEL SIGN O..BENGALI VOWEL SIGN AU 353 | 0B4B ; Left_And_Right # Mc ORIYA VOWEL SIGN O 354 | 0BCA..0BCC ; Left_And_Right # Mc [3] TAMIL VOWEL SIGN O..TAMIL VOWEL SIGN AU 355 | 0D4A..0D4C ; Left_And_Right # Mc [3] MALAYALAM VOWEL SIGN O..MALAYALAM VOWEL SIGN AU 356 | 0DDC ; Left_And_Right # Mc SINHALA VOWEL SIGN KOMBUVA HAA AELA-PILLA 357 | 0DDE ; Left_And_Right # Mc SINHALA VOWEL SIGN KOMBUVA HAA GAYANUKITTA 358 | 17C0 ; Left_And_Right # Mc KHMER VOWEL SIGN IE 359 | 17C4..17C5 ; Left_And_Right # Mc [2] KHMER VOWEL SIGN OO..KHMER VOWEL SIGN AU 360 | 1B40..1B41 ; Left_And_Right # Mc [2] BALINESE VOWEL SIGN TALING TEDUNG..BALINESE VOWEL SIGN TALING REPA TEDUNG 361 | 1134B..1134C ; Left_And_Right # Mc [2] GRANTHA VOWEL SIGN OO..GRANTHA VOWEL SIGN AU 362 | 114BC ; Left_And_Right # Mc TIRHUTA VOWEL SIGN O 363 | 114BE ; Left_And_Right # Mc TIRHUTA VOWEL SIGN AU 364 | 115BA ; Left_And_Right # Mc SIDDHAM VOWEL SIGN O 365 | 366 | # Indic_Positional_Category=Top 367 | 368 | 0900..0902 ; Top # Mn [3] DEVANAGARI SIGN INVERTED CANDRABINDU..DEVANAGARI SIGN ANUSVARA 369 | 093A ; Top # Mn DEVANAGARI VOWEL SIGN OE 370 | 0945..0948 ; Top # Mn [4] DEVANAGARI VOWEL SIGN CANDRA E..DEVANAGARI VOWEL SIGN AI 371 | 0951 ; Top # Mn DEVANAGARI STRESS SIGN UDATTA 372 | 
0953..0955 ; Top # Mn [3] DEVANAGARI GRAVE ACCENT..DEVANAGARI VOWEL SIGN CANDRA LONG E 373 | 0981 ; Top # Mn BENGALI SIGN CANDRABINDU 374 | 09FE ; Top # Mn BENGALI SANDHI MARK 375 | 0A01..0A02 ; Top # Mn [2] GURMUKHI SIGN ADAK BINDI..GURMUKHI SIGN BINDI 376 | 0A47..0A48 ; Top # Mn [2] GURMUKHI VOWEL SIGN EE..GURMUKHI VOWEL SIGN AI 377 | 0A4B..0A4C ; Top # Mn [2] GURMUKHI VOWEL SIGN OO..GURMUKHI VOWEL SIGN AU 378 | 0A70..0A71 ; Top # Mn [2] GURMUKHI TIPPI..GURMUKHI ADDAK 379 | 0A81..0A82 ; Top # Mn [2] GUJARATI SIGN CANDRABINDU..GUJARATI SIGN ANUSVARA 380 | 0AC5 ; Top # Mn GUJARATI VOWEL SIGN CANDRA E 381 | 0AC7..0AC8 ; Top # Mn [2] GUJARATI VOWEL SIGN E..GUJARATI VOWEL SIGN AI 382 | 0AFA..0AFF ; Top # Mn [6] GUJARATI SIGN SUKUN..GUJARATI SIGN TWO-CIRCLE NUKTA ABOVE 383 | 0B01 ; Top # Mn ORIYA SIGN CANDRABINDU 384 | 0B3F ; Top # Mn ORIYA VOWEL SIGN I 385 | 0B56 ; Top # Mn ORIYA AI LENGTH MARK 386 | 0B82 ; Top # Mn TAMIL SIGN ANUSVARA 387 | 0BC0 ; Top # Mn TAMIL VOWEL SIGN II 388 | 0BCD ; Top # Mn TAMIL SIGN VIRAMA 389 | 0C00 ; Top # Mn TELUGU SIGN COMBINING CANDRABINDU ABOVE 390 | 0C04 ; Top # Mn TELUGU SIGN COMBINING ANUSVARA ABOVE 391 | 0C3E..0C40 ; Top # Mn [3] TELUGU VOWEL SIGN AA..TELUGU VOWEL SIGN II 392 | 0C46..0C47 ; Top # Mn [2] TELUGU VOWEL SIGN E..TELUGU VOWEL SIGN EE 393 | 0C4A..0C4D ; Top # Mn [4] TELUGU VOWEL SIGN O..TELUGU SIGN VIRAMA 394 | 0C55 ; Top # Mn TELUGU LENGTH MARK 395 | 0C81 ; Top # Mn KANNADA SIGN CANDRABINDU 396 | 0CBF ; Top # Mn KANNADA VOWEL SIGN I 397 | 0CC6 ; Top # Mn KANNADA VOWEL SIGN E 398 | 0CCC..0CCD ; Top # Mn [2] KANNADA VOWEL SIGN AU..KANNADA SIGN VIRAMA 399 | 0D00..0D01 ; Top # Mn [2] MALAYALAM SIGN COMBINING ANUSVARA ABOVE..MALAYALAM SIGN CANDRABINDU 400 | 0D3B..0D3C ; Top # Mn [2] MALAYALAM SIGN VERTICAL BAR VIRAMA..MALAYALAM SIGN CIRCULAR VIRAMA 401 | 0D4D ; Top # Mn MALAYALAM SIGN VIRAMA 402 | 0DCA ; Top # Mn SINHALA SIGN AL-LAKUNA 403 | 0DD2..0DD3 ; Top # Mn [2] SINHALA VOWEL SIGN KETTI IS-PILLA..SINHALA VOWEL SIGN DIGA 
IS-PILLA 404 | 0E31 ; Top # Mn THAI CHARACTER MAI HAN-AKAT 405 | 0E34..0E37 ; Top # Mn [4] THAI CHARACTER SARA I..THAI CHARACTER SARA UEE 406 | 0E47..0E4E ; Top # Mn [8] THAI CHARACTER MAITAIKHU..THAI CHARACTER YAMAKKAN 407 | 0EB1 ; Top # Mn LAO VOWEL SIGN MAI KAN 408 | 0EB4..0EB7 ; Top # Mn [4] LAO VOWEL SIGN I..LAO VOWEL SIGN YY 409 | 0EBB ; Top # Mn LAO VOWEL SIGN MAI KON 410 | 0EC8..0ECD ; Top # Mn [6] LAO TONE MAI EK..LAO NIGGAHITA 411 | 0F39 ; Top # Mn TIBETAN MARK TSA -PHRU 412 | 0F72 ; Top # Mn TIBETAN VOWEL SIGN I 413 | 0F7A..0F7E ; Top # Mn [5] TIBETAN VOWEL SIGN E..TIBETAN SIGN RJES SU NGA RO 414 | 0F80 ; Top # Mn TIBETAN VOWEL SIGN REVERSED I 415 | 0F82..0F83 ; Top # Mn [2] TIBETAN SIGN NYI ZLA NAA DA..TIBETAN SIGN SNA LDAN 416 | 0F86..0F87 ; Top # Mn [2] TIBETAN SIGN LCI RTAGS..TIBETAN SIGN YANG RTAGS 417 | 102D..102E ; Top # Mn [2] MYANMAR VOWEL SIGN I..MYANMAR VOWEL SIGN II 418 | 1032..1036 ; Top # Mn [5] MYANMAR VOWEL SIGN AI..MYANMAR SIGN ANUSVARA 419 | 103A ; Top # Mn MYANMAR SIGN ASAT 420 | 1071..1074 ; Top # Mn [4] MYANMAR VOWEL SIGN GEBA KAREN I..MYANMAR VOWEL SIGN KAYAH EE 421 | 1085..1086 ; Top # Mn [2] MYANMAR VOWEL SIGN SHAN E ABOVE..MYANMAR VOWEL SIGN SHAN FINAL Y 422 | 109D ; Top # Mn MYANMAR VOWEL SIGN AITON AI 423 | 1712 ; Top # Mn TAGALOG VOWEL SIGN I 424 | 1732 ; Top # Mn HANUNOO VOWEL SIGN I 425 | 1752 ; Top # Mn BUHID VOWEL SIGN I 426 | 1772 ; Top # Mn TAGBANWA VOWEL SIGN I 427 | 17B7..17BA ; Top # Mn [4] KHMER VOWEL SIGN I..KHMER VOWEL SIGN YY 428 | 17C6 ; Top # Mn KHMER SIGN NIKAHIT 429 | 17C9..17D1 ; Top # Mn [9] KHMER SIGN MUUSIKATOAN..KHMER SIGN VIRIAM 430 | 17D3 ; Top # Mn KHMER SIGN BATHAMASAT 431 | 17DD ; Top # Mn KHMER SIGN ATTHACAN 432 | 1920..1921 ; Top # Mn [2] LIMBU VOWEL SIGN A..LIMBU VOWEL SIGN I 433 | 1927..1928 ; Top # Mn [2] LIMBU VOWEL SIGN E..LIMBU VOWEL SIGN O 434 | 193A ; Top # Mn LIMBU SIGN KEMPHRENG 435 | 1A17 ; Top # Mn BUGINESE VOWEL SIGN I 436 | 1A1B ; Top # Mn BUGINESE VOWEL SIGN AE 437 | 1A58..1A5A ; Top 
# Mn [3] TAI THAM SIGN MAI KANG LAI..TAI THAM CONSONANT SIGN LOW PA 438 | 1A62 ; Top # Mn TAI THAM VOWEL SIGN MAI SAT 439 | 1A65..1A68 ; Top # Mn [4] TAI THAM VOWEL SIGN I..TAI THAM VOWEL SIGN UUE 440 | 1A6B ; Top # Mn TAI THAM VOWEL SIGN O 441 | 1A73..1A7C ; Top # Mn [10] TAI THAM VOWEL SIGN OA ABOVE..TAI THAM SIGN KHUEN-LUE KARAN 442 | 1B00..1B03 ; Top # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG 443 | 1B34 ; Top # Mn BALINESE SIGN REREKAN 444 | 1B36..1B37 ; Top # Mn [2] BALINESE VOWEL SIGN ULU..BALINESE VOWEL SIGN ULU SARI 445 | 1B42 ; Top # Mn BALINESE VOWEL SIGN PEPET 446 | 1B6B ; Top # Mn BALINESE MUSICAL SYMBOL COMBINING TEGEH 447 | 1B6D..1B73 ; Top # Mn [7] BALINESE MUSICAL SYMBOL COMBINING KEMPUL..BALINESE MUSICAL SYMBOL COMBINING GONG 448 | 1B80..1B81 ; Top # Mn [2] SUNDANESE SIGN PANYECEK..SUNDANESE SIGN PANGLAYAR 449 | 1BA4 ; Top # Mn SUNDANESE VOWEL SIGN PANGHULU 450 | 1BA8..1BA9 ; Top # Mn [2] SUNDANESE VOWEL SIGN PAMEPET..SUNDANESE VOWEL SIGN PANEULEUNG 451 | 1BE6 ; Top # Mn BATAK SIGN TOMPI 452 | 1BE8..1BE9 ; Top # Mn [2] BATAK VOWEL SIGN PAKPAK E..BATAK VOWEL SIGN EE 453 | 1BED ; Top # Mn BATAK VOWEL SIGN KARO O 454 | 1BEF..1BF1 ; Top # Mn [3] BATAK VOWEL SIGN U FOR SIMALUNGUN SA..BATAK CONSONANT SIGN H 455 | 1C2D..1C33 ; Top # Mn [7] LEPCHA CONSONANT SIGN K..LEPCHA CONSONANT SIGN T 456 | 1C36 ; Top # Mn LEPCHA SIGN RAN 457 | 1CD0..1CD2 ; Top # Mn [3] VEDIC TONE KARSHANA..VEDIC TONE PRENKHA 458 | 1CDA..1CDB ; Top # Mn [2] VEDIC TONE DOUBLE SVARITA..VEDIC TONE TRIPLE SVARITA 459 | 1CE0 ; Top # Mn VEDIC TONE RIGVEDIC KASHMIRI INDEPENDENT SVARITA 460 | 1CF4 ; Top # Mn VEDIC TONE CANDRA ABOVE 461 | 1DFB ; Top # Mn COMBINING DELETION MARK 462 | 20F0 ; Top # Mn COMBINING ASTERISK ABOVE 463 | A802 ; Top # Mn SYLOTI NAGRI SIGN DVISVARA 464 | A806 ; Top # Mn SYLOTI NAGRI SIGN HASANTA 465 | A80B ; Top # Mn SYLOTI NAGRI SIGN ANUSVARA 466 | A826 ; Top # Mn SYLOTI NAGRI VOWEL SIGN E 467 | A8C5 ; Top # Mn SAURASHTRA SIGN CANDRABINDU 468 | A8E0..A8F1 ; 
Top # Mn [18] COMBINING DEVANAGARI DIGIT ZERO..COMBINING DEVANAGARI SIGN AVAGRAHA 469 | A8FF ; Top # Mn DEVANAGARI VOWEL SIGN AY 470 | A94A ; Top # Mn REJANG VOWEL SIGN AI 471 | A94F..A951 ; Top # Mn [3] REJANG CONSONANT SIGN NG..REJANG CONSONANT SIGN R 472 | A980..A982 ; Top # Mn [3] JAVANESE SIGN PANYANGGA..JAVANESE SIGN LAYAR 473 | A9B3 ; Top # Mn JAVANESE SIGN CECAK TELU 474 | A9B6..A9B7 ; Top # Mn [2] JAVANESE VOWEL SIGN WULU..JAVANESE VOWEL SIGN WULU MELIK 475 | A9BC ; Top # Mn JAVANESE VOWEL SIGN PEPET 476 | A9E5 ; Top # Mn MYANMAR SIGN SHAN SAW 477 | AA29..AA2C ; Top # Mn [4] CHAM VOWEL SIGN AA..CHAM VOWEL SIGN EI 478 | AA2E ; Top # Mn CHAM VOWEL SIGN OE 479 | AA31 ; Top # Mn CHAM VOWEL SIGN AU 480 | AA43 ; Top # Mn CHAM CONSONANT SIGN FINAL NG 481 | AA4C ; Top # Mn CHAM CONSONANT SIGN FINAL M 482 | AA7C ; Top # Mn MYANMAR SIGN TAI LAING TONE-2 483 | AAB0 ; Top # Mn TAI VIET MAI KANG 484 | AAB2..AAB3 ; Top # Mn [2] TAI VIET VOWEL I..TAI VIET VOWEL UE 485 | AAB7..AAB8 ; Top # Mn [2] TAI VIET MAI KHIT..TAI VIET VOWEL IA 486 | AABE..AABF ; Top # Mn [2] TAI VIET VOWEL AM..TAI VIET TONE MAI EK 487 | AAC1 ; Top # Mn TAI VIET TONE MAI THO 488 | AAED ; Top # Mn MEETEI MAYEK VOWEL SIGN AAI 489 | ABE5 ; Top # Mn MEETEI MAYEK VOWEL SIGN ANAP 490 | 10A05 ; Top # Mn KHAROSHTHI VOWEL SIGN E 491 | 10A0F ; Top # Mn KHAROSHTHI SIGN VISARGA 492 | 10A38 ; Top # Mn KHAROSHTHI SIGN BAR ABOVE 493 | 11001 ; Top # Mn BRAHMI SIGN ANUSVARA 494 | 11038..1103B ; Top # Mn [4] BRAHMI VOWEL SIGN AA..BRAHMI VOWEL SIGN II 495 | 11042..11046 ; Top # Mn [5] BRAHMI VOWEL SIGN E..BRAHMI VIRAMA 496 | 11080..11081 ; Top # Mn [2] KAITHI SIGN CANDRABINDU..KAITHI SIGN ANUSVARA 497 | 110B5..110B6 ; Top # Mn [2] KAITHI VOWEL SIGN E..KAITHI VOWEL SIGN AI 498 | 11100..11102 ; Top # Mn [3] CHAKMA SIGN CANDRABINDU..CHAKMA SIGN VISARGA 499 | 11127..11129 ; Top # Mn [3] CHAKMA VOWEL SIGN A..CHAKMA VOWEL SIGN II 500 | 1112D ; Top # Mn CHAKMA VOWEL SIGN AI 501 | 11130 ; Top # Mn CHAKMA VOWEL SIGN OI 502 | 
11134 ; Top # Mn CHAKMA MAAYYAA 503 | 11180..11181 ; Top # Mn [2] SHARADA SIGN CANDRABINDU..SHARADA SIGN ANUSVARA 504 | 111BC..111BE ; Top # Mn [3] SHARADA VOWEL SIGN E..SHARADA VOWEL SIGN O 505 | 111CB ; Top # Mn SHARADA VOWEL MODIFIER MARK 506 | 11230..11231 ; Top # Mn [2] KHOJKI VOWEL SIGN E..KHOJKI VOWEL SIGN AI 507 | 11234 ; Top # Mn KHOJKI SIGN ANUSVARA 508 | 11236..11237 ; Top # Mn [2] KHOJKI SIGN NUKTA..KHOJKI SIGN SHADDA 509 | 1123E ; Top # Mn KHOJKI SIGN SUKUN 510 | 112DF ; Top # Mn KHUDAWADI SIGN ANUSVARA 511 | 112E5..112E8 ; Top # Mn [4] KHUDAWADI VOWEL SIGN E..KHUDAWADI VOWEL SIGN AU 512 | 11300..11301 ; Top # Mn [2] GRANTHA SIGN COMBINING ANUSVARA ABOVE..GRANTHA SIGN CANDRABINDU 513 | 11340 ; Top # Mn GRANTHA VOWEL SIGN II 514 | 11366..1136C ; Top # Mn [7] COMBINING GRANTHA DIGIT ZERO..COMBINING GRANTHA DIGIT SIX 515 | 11370..11374 ; Top # Mn [5] COMBINING GRANTHA LETTER A..COMBINING GRANTHA LETTER PA 516 | 1143E..1143F ; Top # Mn [2] NEWA VOWEL SIGN E..NEWA VOWEL SIGN AI 517 | 11443..11444 ; Top # Mn [2] NEWA SIGN CANDRABINDU..NEWA SIGN ANUSVARA 518 | 1145E ; Top # Mn NEWA SANDHI MARK 519 | 114BA ; Top # Mn TIRHUTA VOWEL SIGN SHORT E 520 | 114BF..114C0 ; Top # Mn [2] TIRHUTA SIGN CANDRABINDU..TIRHUTA SIGN ANUSVARA 521 | 115BC..115BD ; Top # Mn [2] SIDDHAM SIGN CANDRABINDU..SIDDHAM SIGN ANUSVARA 522 | 11639..1163A ; Top # Mn [2] MODI VOWEL SIGN E..MODI VOWEL SIGN AI 523 | 1163D ; Top # Mn MODI SIGN ANUSVARA 524 | 11640 ; Top # Mn MODI SIGN ARDHACANDRA 525 | 116AB ; Top # Mn TAKRI SIGN ANUSVARA 526 | 116AD ; Top # Mn TAKRI VOWEL SIGN AA 527 | 116B2..116B5 ; Top # Mn [4] TAKRI VOWEL SIGN E..TAKRI VOWEL SIGN AU 528 | 1171F ; Top # Mn AHOM CONSONANT SIGN MEDIAL LIGATING RA 529 | 11722..11723 ; Top # Mn [2] AHOM VOWEL SIGN I..AHOM VOWEL SIGN II 530 | 11727 ; Top # Mn AHOM VOWEL SIGN AW 531 | 11729..1172B ; Top # Mn [3] AHOM VOWEL SIGN AI..AHOM SIGN KILLER 532 | 11833..11837 ; Top # Mn [5] DOGRA VOWEL SIGN E..DOGRA SIGN ANUSVARA 533 | 119DA..119DB ; Top # 
Mn [2] NANDINAGARI VOWEL SIGN E..NANDINAGARI VOWEL SIGN AI 534 | 11A01 ; Top # Mn ZANABAZAR SQUARE VOWEL SIGN I 535 | 11A04..11A09 ; Top # Mn [6] ZANABAZAR SQUARE VOWEL SIGN E..ZANABAZAR SQUARE VOWEL SIGN REVERSED I 536 | 11A35..11A38 ; Top # Mn [4] ZANABAZAR SQUARE SIGN CANDRABINDU..ZANABAZAR SQUARE SIGN ANUSVARA 537 | 11A51 ; Top # Mn SOYOMBO VOWEL SIGN I 538 | 11A54..11A56 ; Top # Mn [3] SOYOMBO VOWEL SIGN E..SOYOMBO VOWEL SIGN OE 539 | 11A96 ; Top # Mn SOYOMBO SIGN ANUSVARA 540 | 11A98 ; Top # Mn SOYOMBO GEMINATION MARK 541 | 11C30..11C31 ; Top # Mn [2] BHAIKSUKI VOWEL SIGN I..BHAIKSUKI VOWEL SIGN II 542 | 11C38..11C3D ; Top # Mn [6] BHAIKSUKI VOWEL SIGN E..BHAIKSUKI SIGN ANUSVARA 543 | 11CB3 ; Top # Mn MARCHEN VOWEL SIGN E 544 | 11CB5..11CB6 ; Top # Mn [2] MARCHEN SIGN ANUSVARA..MARCHEN SIGN CANDRABINDU 545 | 11D31..11D35 ; Top # Mn [5] MASARAM GONDI VOWEL SIGN AA..MASARAM GONDI VOWEL SIGN UU 546 | 11D3A ; Top # Mn MASARAM GONDI VOWEL SIGN E 547 | 11D3C..11D3D ; Top # Mn [2] MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VOWEL SIGN O 548 | 11D3F..11D41 ; Top # Mn [3] MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI SIGN VISARGA 549 | 11D43 ; Top # Mn MASARAM GONDI SIGN CANDRA 550 | 11D90..11D91 ; Top # Mn [2] GUNJALA GONDI VOWEL SIGN EE..GUNJALA GONDI VOWEL SIGN AI 551 | 11D95 ; Top # Mn GUNJALA GONDI SIGN ANUSVARA 552 | 11EF3 ; Top # Mn MAKASAR VOWEL SIGN I 553 | 554 | # Indic_Positional_Category=Bottom 555 | 556 | 093C ; Bottom # Mn DEVANAGARI SIGN NUKTA 557 | 0941..0944 ; Bottom # Mn [4] DEVANAGARI VOWEL SIGN U..DEVANAGARI VOWEL SIGN VOCALIC RR 558 | 094D ; Bottom # Mn DEVANAGARI SIGN VIRAMA 559 | 0952 ; Bottom # Mn DEVANAGARI STRESS SIGN ANUDATTA 560 | 0956..0957 ; Bottom # Mn [2] DEVANAGARI VOWEL SIGN UE..DEVANAGARI VOWEL SIGN UUE 561 | 0962..0963 ; Bottom # Mn [2] DEVANAGARI VOWEL SIGN VOCALIC L..DEVANAGARI VOWEL SIGN VOCALIC LL 562 | 09BC ; Bottom # Mn BENGALI SIGN NUKTA 563 | 09C1..09C4 ; Bottom # Mn [4] BENGALI VOWEL SIGN U..BENGALI VOWEL SIGN VOCALIC RR 564 | 
09CD ; Bottom # Mn BENGALI SIGN VIRAMA 565 | 09E2..09E3 ; Bottom # Mn [2] BENGALI VOWEL SIGN VOCALIC L..BENGALI VOWEL SIGN VOCALIC LL 566 | 0A3C ; Bottom # Mn GURMUKHI SIGN NUKTA 567 | 0A41..0A42 ; Bottom # Mn [2] GURMUKHI VOWEL SIGN U..GURMUKHI VOWEL SIGN UU 568 | 0A4D ; Bottom # Mn GURMUKHI SIGN VIRAMA 569 | 0A51 ; Bottom # Mn GURMUKHI SIGN UDAAT 570 | 0A75 ; Bottom # Mn GURMUKHI SIGN YAKASH 571 | 0ABC ; Bottom # Mn GUJARATI SIGN NUKTA 572 | 0AC1..0AC4 ; Bottom # Mn [4] GUJARATI VOWEL SIGN U..GUJARATI VOWEL SIGN VOCALIC RR 573 | 0ACD ; Bottom # Mn GUJARATI SIGN VIRAMA 574 | 0AE2..0AE3 ; Bottom # Mn [2] GUJARATI VOWEL SIGN VOCALIC L..GUJARATI VOWEL SIGN VOCALIC LL 575 | 0B3C ; Bottom # Mn ORIYA SIGN NUKTA 576 | 0B41..0B44 ; Bottom # Mn [4] ORIYA VOWEL SIGN U..ORIYA VOWEL SIGN VOCALIC RR 577 | 0B4D ; Bottom # Mn ORIYA SIGN VIRAMA 578 | 0B62..0B63 ; Bottom # Mn [2] ORIYA VOWEL SIGN VOCALIC L..ORIYA VOWEL SIGN VOCALIC LL 579 | 0C56 ; Bottom # Mn TELUGU AI LENGTH MARK 580 | 0C62..0C63 ; Bottom # Mn [2] TELUGU VOWEL SIGN VOCALIC L..TELUGU VOWEL SIGN VOCALIC LL 581 | 0CBC ; Bottom # Mn KANNADA SIGN NUKTA 582 | 0CE2..0CE3 ; Bottom # Mn [2] KANNADA VOWEL SIGN VOCALIC L..KANNADA VOWEL SIGN VOCALIC LL 583 | 0D43..0D44 ; Bottom # Mn [2] MALAYALAM VOWEL SIGN VOCALIC R..MALAYALAM VOWEL SIGN VOCALIC RR 584 | 0D62..0D63 ; Bottom # Mn [2] MALAYALAM VOWEL SIGN VOCALIC L..MALAYALAM VOWEL SIGN VOCALIC LL 585 | 0DD4 ; Bottom # Mn SINHALA VOWEL SIGN KETTI PAA-PILLA 586 | 0DD6 ; Bottom # Mn SINHALA VOWEL SIGN DIGA PAA-PILLA 587 | 0E38..0E3A ; Bottom # Mn [3] THAI CHARACTER SARA U..THAI CHARACTER PHINTHU 588 | 0EB8..0EBA ; Bottom # Mn [3] LAO VOWEL SIGN U..LAO SIGN PALI VIRAMA 589 | 0EBC ; Bottom # Mn LAO SEMIVOWEL SIGN LO 590 | 0F18..0F19 ; Bottom # Mn [2] TIBETAN ASTROLOGICAL SIGN -KHYUD PA..TIBETAN ASTROLOGICAL SIGN SDONG TSHUGS 591 | 0F35 ; Bottom # Mn TIBETAN MARK NGAS BZUNG NYI ZLA 592 | 0F37 ; Bottom # Mn TIBETAN MARK NGAS BZUNG SGOR RTAGS 593 | 0F71 ; Bottom # Mn TIBETAN VOWEL 
SIGN AA 594 | 0F74..0F75 ; Bottom # Mn [2] TIBETAN VOWEL SIGN U..TIBETAN VOWEL SIGN UU 595 | 0F84 ; Bottom # Mn TIBETAN MARK HALANTA 596 | 0F8D..0F97 ; Bottom # Mn [11] TIBETAN SUBJOINED SIGN LCE TSA CAN..TIBETAN SUBJOINED LETTER JA 597 | 0F99..0FBC ; Bottom # Mn [36] TIBETAN SUBJOINED LETTER NYA..TIBETAN SUBJOINED LETTER FIXED-FORM RA 598 | 0FC6 ; Bottom # Mn TIBETAN SYMBOL PADMA GDAN 599 | 102F..1030 ; Bottom # Mn [2] MYANMAR VOWEL SIGN U..MYANMAR VOWEL SIGN UU 600 | 1037 ; Bottom # Mn MYANMAR SIGN DOT BELOW 601 | 103D..103E ; Bottom # Mn [2] MYANMAR CONSONANT SIGN MEDIAL WA..MYANMAR CONSONANT SIGN MEDIAL HA 602 | 1058..1059 ; Bottom # Mn [2] MYANMAR VOWEL SIGN VOCALIC L..MYANMAR VOWEL SIGN VOCALIC LL 603 | 105E..1060 ; Bottom # Mn [3] MYANMAR CONSONANT SIGN MON MEDIAL NA..MYANMAR CONSONANT SIGN MON MEDIAL LA 604 | 1082 ; Bottom # Mn MYANMAR CONSONANT SIGN SHAN MEDIAL WA 605 | 108D ; Bottom # Mn MYANMAR SIGN SHAN COUNCIL EMPHATIC TONE 606 | 1713..1714 ; Bottom # Mn [2] TAGALOG VOWEL SIGN U..TAGALOG SIGN VIRAMA 607 | 1733..1734 ; Bottom # Mn [2] HANUNOO VOWEL SIGN U..HANUNOO SIGN PAMUDPOD 608 | 1753 ; Bottom # Mn BUHID VOWEL SIGN U 609 | 1773 ; Bottom # Mn TAGBANWA VOWEL SIGN U 610 | 17BB..17BD ; Bottom # Mn [3] KHMER VOWEL SIGN U..KHMER VOWEL SIGN UA 611 | 1922 ; Bottom # Mn LIMBU VOWEL SIGN U 612 | 1932 ; Bottom # Mn LIMBU SMALL LETTER ANUSVARA 613 | 1939 ; Bottom # Mn LIMBU SIGN MUKPHRENG 614 | 193B ; Bottom # Mn LIMBU SIGN SA-I 615 | 1A18 ; Bottom # Mn BUGINESE VOWEL SIGN U 616 | 1A56 ; Bottom # Mn TAI THAM CONSONANT SIGN MEDIAL LA 617 | 1A5B..1A5E ; Bottom # Mn [4] TAI THAM CONSONANT SIGN HIGH RATHA OR LOW PA..TAI THAM CONSONANT SIGN SA 618 | 1A69..1A6A ; Bottom # Mn [2] TAI THAM VOWEL SIGN U..TAI THAM VOWEL SIGN UU 619 | 1A6C ; Bottom # Mn TAI THAM VOWEL SIGN OA BELOW 620 | 1A7F ; Bottom # Mn TAI THAM COMBINING CRYPTOGRAMMIC DOT 621 | 1B38..1B3A ; Bottom # Mn [3] BALINESE VOWEL SIGN SUKU..BALINESE VOWEL SIGN RA REPA 622 | 1B6C ; Bottom # Mn BALINESE MUSICAL 
SYMBOL COMBINING ENDEP 623 | 1BA2..1BA3 ; Bottom # Mn [2] SUNDANESE CONSONANT SIGN PANYAKRA..SUNDANESE CONSONANT SIGN PANYIKU 624 | 1BA5 ; Bottom # Mn SUNDANESE VOWEL SIGN PANYUKU 625 | 1BAC..1BAD ; Bottom # Mn [2] SUNDANESE CONSONANT SIGN PASANGAN MA..SUNDANESE CONSONANT SIGN PASANGAN WA 626 | 1C2C ; Bottom # Mn LEPCHA VOWEL SIGN E 627 | 1C37 ; Bottom # Mn LEPCHA SIGN NUKTA 628 | 1CD5..1CD9 ; Bottom # Mn [5] VEDIC TONE YAJURVEDIC AGGRAVATED INDEPENDENT SVARITA..VEDIC TONE YAJURVEDIC KATHAKA INDEPENDENT SVARITA SCHROEDER 629 | 1CDC..1CDF ; Bottom # Mn [4] VEDIC TONE KATHAKA ANUDATTA..VEDIC TONE THREE DOTS BELOW 630 | 1CED ; Bottom # Mn VEDIC SIGN TIRYAK 631 | A825 ; Bottom # Mn SYLOTI NAGRI VOWEL SIGN U 632 | A8C4 ; Bottom # Mn SAURASHTRA SIGN VIRAMA 633 | A92B..A92D ; Bottom # Mn [3] KAYAH LI TONE PLOPHU..KAYAH LI TONE CALYA PLOPHU 634 | A947..A949 ; Bottom # Mn [3] REJANG VOWEL SIGN I..REJANG VOWEL SIGN E 635 | A94B..A94E ; Bottom # Mn [4] REJANG VOWEL SIGN O..REJANG VOWEL SIGN EA 636 | A9B8..A9B9 ; Bottom # Mn [2] JAVANESE VOWEL SIGN SUKU..JAVANESE VOWEL SIGN SUKU MENDUT 637 | A9BD ; Bottom # Mn JAVANESE CONSONANT SIGN KERET 638 | AA2D ; Bottom # Mn CHAM VOWEL SIGN U 639 | AA32 ; Bottom # Mn CHAM VOWEL SIGN UE 640 | AA35..AA36 ; Bottom # Mn [2] CHAM CONSONANT SIGN LA..CHAM CONSONANT SIGN WA 641 | AAB4 ; Bottom # Mn TAI VIET VOWEL U 642 | AAEC ; Bottom # Mn MEETEI MAYEK VOWEL SIGN UU 643 | ABE8 ; Bottom # Mn MEETEI MAYEK VOWEL SIGN UNAP 644 | ABED ; Bottom # Mn MEETEI MAYEK APUN IYEK 645 | 10A02..10A03 ; Bottom # Mn [2] KHAROSHTHI VOWEL SIGN U..KHAROSHTHI VOWEL SIGN VOCALIC R 646 | 10A0C..10A0E ; Bottom # Mn [3] KHAROSHTHI VOWEL LENGTH MARK..KHAROSHTHI SIGN ANUSVARA 647 | 10A39..10A3A ; Bottom # Mn [2] KHAROSHTHI SIGN CAUDA..KHAROSHTHI SIGN DOT BELOW 648 | 1103C..11041 ; Bottom # Mn [6] BRAHMI VOWEL SIGN U..BRAHMI VOWEL SIGN VOCALIC LL 649 | 110B3..110B4 ; Bottom # Mn [2] KAITHI VOWEL SIGN U..KAITHI VOWEL SIGN UU 650 | 110B9..110BA ; Bottom # Mn [2] KAITHI SIGN 
VIRAMA..KAITHI SIGN NUKTA 651 | 1112A..1112B ; Bottom # Mn [2] CHAKMA VOWEL SIGN U..CHAKMA VOWEL SIGN UU 652 | 11131..11132 ; Bottom # Mn [2] CHAKMA O MARK..CHAKMA AU MARK 653 | 11173 ; Bottom # Mn MAHAJANI SIGN NUKTA 654 | 111B6..111BB ; Bottom # Mn [6] SHARADA VOWEL SIGN U..SHARADA VOWEL SIGN VOCALIC LL 655 | 111C9..111CA ; Bottom # Mn [2] SHARADA SANDHI MARK..SHARADA SIGN NUKTA 656 | 111CC ; Bottom # Mn SHARADA EXTRA SHORT VOWEL MARK 657 | 1122F ; Bottom # Mn KHOJKI VOWEL SIGN U 658 | 112E3..112E4 ; Bottom # Mn [2] KHUDAWADI VOWEL SIGN U..KHUDAWADI VOWEL SIGN UU 659 | 112E9..112EA ; Bottom # Mn [2] KHUDAWADI SIGN NUKTA..KHUDAWADI SIGN VIRAMA 660 | 1133B..1133C ; Bottom # Mn [2] COMBINING BINDU BELOW..GRANTHA SIGN NUKTA 661 | 11438..1143D ; Bottom # Mn [6] NEWA VOWEL SIGN U..NEWA VOWEL SIGN VOCALIC LL 662 | 11442 ; Bottom # Mn NEWA SIGN VIRAMA 663 | 11446 ; Bottom # Mn NEWA SIGN NUKTA 664 | 114B3..114B8 ; Bottom # Mn [6] TIRHUTA VOWEL SIGN U..TIRHUTA VOWEL SIGN VOCALIC LL 665 | 114C2..114C3 ; Bottom # Mn [2] TIRHUTA SIGN VIRAMA..TIRHUTA SIGN NUKTA 666 | 115B2..115B5 ; Bottom # Mn [4] SIDDHAM VOWEL SIGN U..SIDDHAM VOWEL SIGN VOCALIC RR 667 | 115BF..115C0 ; Bottom # Mn [2] SIDDHAM SIGN VIRAMA..SIDDHAM SIGN NUKTA 668 | 115DC..115DD ; Bottom # Mn [2] SIDDHAM VOWEL SIGN ALTERNATE U..SIDDHAM VOWEL SIGN ALTERNATE UU 669 | 11633..11638 ; Bottom # Mn [6] MODI VOWEL SIGN U..MODI VOWEL SIGN VOCALIC LL 670 | 1163F ; Bottom # Mn MODI SIGN VIRAMA 671 | 116B0..116B1 ; Bottom # Mn [2] TAKRI VOWEL SIGN U..TAKRI VOWEL SIGN UU 672 | 116B7 ; Bottom # Mn TAKRI SIGN NUKTA 673 | 1171D ; Bottom # Mn AHOM CONSONANT SIGN MEDIAL LA 674 | 11724..11725 ; Bottom # Mn [2] AHOM VOWEL SIGN U..AHOM VOWEL SIGN UU 675 | 11728 ; Bottom # Mn AHOM VOWEL SIGN O 676 | 1182F..11832 ; Bottom # Mn [4] DOGRA VOWEL SIGN U..DOGRA VOWEL SIGN VOCALIC RR 677 | 11839..1183A ; Bottom # Mn [2] DOGRA SIGN VIRAMA..DOGRA SIGN NUKTA 678 | 119D4..119D7 ; Bottom # Mn [4] NANDINAGARI VOWEL SIGN U..NANDINAGARI VOWEL SIGN 
VOCALIC RR 679 | 119E0 ; Bottom # Mn NANDINAGARI SIGN VIRAMA 680 | 11A02..11A03 ; Bottom # Mn [2] ZANABAZAR SQUARE VOWEL SIGN UE..ZANABAZAR SQUARE VOWEL SIGN U 681 | 11A0A ; Bottom # Mn ZANABAZAR SQUARE VOWEL LENGTH MARK 682 | 11A33..11A34 ; Bottom # Mn [2] ZANABAZAR SQUARE FINAL CONSONANT MARK..ZANABAZAR SQUARE SIGN VIRAMA 683 | 11A3B..11A3E ; Bottom # Mn [4] ZANABAZAR SQUARE CLUSTER-FINAL LETTER YA..ZANABAZAR SQUARE CLUSTER-FINAL LETTER VA 684 | 11A52..11A53 ; Bottom # Mn [2] SOYOMBO VOWEL SIGN UE..SOYOMBO VOWEL SIGN U 685 | 11A59..11A5B ; Bottom # Mn [3] SOYOMBO VOWEL SIGN VOCALIC R..SOYOMBO VOWEL LENGTH MARK 686 | 11A8A..11A95 ; Bottom # Mn [12] SOYOMBO FINAL CONSONANT SIGN G..SOYOMBO FINAL CONSONANT SIGN -A 687 | 11C32..11C36 ; Bottom # Mn [5] BHAIKSUKI VOWEL SIGN U..BHAIKSUKI VOWEL SIGN VOCALIC L 688 | 11C3F ; Bottom # Mn BHAIKSUKI SIGN VIRAMA 689 | 11C92..11CA7 ; Bottom # Mn [22] MARCHEN SUBJOINED LETTER KA..MARCHEN SUBJOINED LETTER ZA 690 | 11CAA..11CB0 ; Bottom # Mn [7] MARCHEN SUBJOINED LETTER RA..MARCHEN VOWEL SIGN AA 691 | 11CB2 ; Bottom # Mn MARCHEN VOWEL SIGN U 692 | 11D36 ; Bottom # Mn MASARAM GONDI VOWEL SIGN VOCALIC R 693 | 11D42 ; Bottom # Mn MASARAM GONDI SIGN NUKTA 694 | 11D44 ; Bottom # Mn MASARAM GONDI SIGN HALANTA 695 | 11D47 ; Bottom # Mn MASARAM GONDI RA-KARA 696 | 11EF4 ; Bottom # Mn MAKASAR VOWEL SIGN U 697 | 698 | # Indic_Positional_Category=Top_And_Bottom 699 | 700 | 0C48 ; Top_And_Bottom # Mn TELUGU VOWEL SIGN AI 701 | 0F73 ; Top_And_Bottom # Mn TIBETAN VOWEL SIGN II 702 | 0F76..0F79 ; Top_And_Bottom # Mn [4] TIBETAN VOWEL SIGN VOCALIC R..TIBETAN VOWEL SIGN VOCALIC LL 703 | 0F81 ; Top_And_Bottom # Mn TIBETAN VOWEL SIGN REVERSED II 704 | 1B3C ; Top_And_Bottom # Mn BALINESE VOWEL SIGN LA LENGA 705 | 1112E..1112F ; Top_And_Bottom # Mn [2] CHAKMA VOWEL SIGN O..CHAKMA VOWEL SIGN AU 706 | 707 | # Indic_Positional_Category=Top_And_Right 708 | 709 | 0AC9 ; Top_And_Right # Mc GUJARATI VOWEL SIGN CANDRA O 710 | 0B57 ; Top_And_Right # Mc ORIYA AU 
LENGTH MARK 711 | 0CC0 ; Top_And_Right # Mc KANNADA VOWEL SIGN II 712 | 0CC7..0CC8 ; Top_And_Right # Mc [2] KANNADA VOWEL SIGN EE..KANNADA VOWEL SIGN AI 713 | 0CCA..0CCB ; Top_And_Right # Mc [2] KANNADA VOWEL SIGN O..KANNADA VOWEL SIGN OO 714 | 1925..1926 ; Top_And_Right # Mc [2] LIMBU VOWEL SIGN OO..LIMBU VOWEL SIGN AU 715 | 1B43 ; Top_And_Right # Mc BALINESE VOWEL SIGN PEPET TEDUNG 716 | 111BF ; Top_And_Right # Mc SHARADA VOWEL SIGN AU 717 | 11232..11233 ; Top_And_Right # Mc [2] KHOJKI VOWEL SIGN O..KHOJKI VOWEL SIGN AU 718 | 719 | # Indic_Positional_Category=Top_And_Left 720 | 721 | 0B48 ; Top_And_Left # Mc ORIYA VOWEL SIGN AI 722 | 0DDA ; Top_And_Left # Mc SINHALA VOWEL SIGN DIGA KOMBUVA 723 | 17BE ; Top_And_Left # Mc KHMER VOWEL SIGN OE 724 | 1C29 ; Top_And_Left # Mc LEPCHA VOWEL SIGN OO 725 | 114BB ; Top_And_Left # Mc TIRHUTA VOWEL SIGN AI 726 | 115B9 ; Top_And_Left # Mc SIDDHAM VOWEL SIGN AI 727 | 728 | # Indic_Positional_Category=Top_And_Left_And_Right 729 | 730 | 0B4C ; Top_And_Left_And_Right # Mc ORIYA VOWEL SIGN AU 731 | 0DDD ; Top_And_Left_And_Right # Mc SINHALA VOWEL SIGN KOMBUVA HAA DIGA AELA-PILLA 732 | 17BF ; Top_And_Left_And_Right # Mc KHMER VOWEL SIGN YA 733 | 115BB ; Top_And_Left_And_Right # Mc SIDDHAM VOWEL SIGN AU 734 | 735 | # Indic_Positional_Category=Bottom_And_Right 736 | 737 | 1B3B ; Bottom_And_Right # Mc BALINESE VOWEL SIGN RA REPA TEDUNG 738 | A9C0 ; Bottom_And_Right # Mc JAVANESE PANGKON 739 | 740 | # Indic_Positional_Category=Bottom_And_Left 741 | 742 | A9BF ; Bottom_And_Left # Mc JAVANESE CONSONANT SIGN CAKRA 743 | 744 | # Indic_Positional_Category=Top_And_Bottom_And_Right 745 | 746 | 1B3D ; Top_And_Bottom_And_Right # Mc BALINESE VOWEL SIGN LA LENGA TEDUNG 747 | 748 | # Indic_Positional_Category=Overstruck 749 | 750 | 1CD4 ; Overstruck # Mn VEDIC SIGN YAJURVEDIC MIDLINE SVARITA 751 | 1CE2..1CE8 ; Overstruck # Mn [7] VEDIC SIGN VISARGA SVARITA..VEDIC SIGN VISARGA ANUDATTA WITH TAIL 752 | 10A01 ; Overstruck # Mn KHAROSHTHI VOWEL SIGN I 
753 | 10A06 ; Overstruck # Mn KHAROSHTHI VOWEL SIGN O 754 | 755 | # EOF 756 | -------------------------------------------------------------------------------- /data/PropertyValueAliases.txt: -------------------------------------------------------------------------------- 1 | # PropertyValueAliases-12.0.0.txt 2 | # Date: 2019-02-19, 05:01:57 GMT 3 | # © 2019 Unicode®, Inc. 4 | # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries. 5 | # For terms of use, see http://www.unicode.org/terms_of_use.html 6 | # 7 | # Unicode Character Database 8 | # For documentation, see http://www.unicode.org/reports/tr44/ 9 | # 10 | # This file contains aliases for property values used in the UCD. 11 | # These names can be used for XML formats of UCD data, for regular-expression 12 | # property tests, and other programmatic textual descriptions of Unicode data. 13 | # 14 | # The names may be translated in appropriate environments, and additional 15 | # aliases may be useful. 16 | # 17 | # FORMAT 18 | # 19 | # Each line describes a property value name. 20 | # This consists of three or more fields, separated by semicolons. 21 | # 22 | # First Field: The first field describes the property for which that 23 | # property value name is used. 24 | # 25 | # Second Field: The second field is the short name for the property value. 26 | # It is typically an abbreviation, but in a number of cases it is simply 27 | # a duplicate of the "long name" in the third field. 28 | # 29 | # Third Field: The third field is the long name for the property value, 30 | # typically the formal name used in documentation about the property value. 31 | # 32 | # In the case of Canonical_Combining_Class (ccc), there are 4 fields: 33 | # The second field is numeric, the third is the short name, and the fourth is the long name. 34 | # 35 | # The above are the preferred aliases. Other aliases may be listed in additional fields. 
36 | # 37 | # Loose matching should be applied to all property names and property values, with 38 | # the exception of String Property values. With loose matching of property names and 39 | # values, the case distinctions, whitespace, hyphens, and '_' are ignored. 40 | # For Numeric Property values, numeric equivalence is applied: thus "01.00" 41 | # is equivalent to "1". 42 | # 43 | # NOTE: Property value names are NOT unique across properties. For example: 44 | # 45 | # AL means Arabic Letter for the Bidi_Class property, and 46 | # AL means Above_Left for the Canonical_Combining_Class property, and 47 | # AL means Alphabetic for the Line_Break property. 48 | # 49 | # In addition, some property names may be the same as some property value names. 50 | # For example: 51 | # 52 | # sc means the Script property, and 53 | # Sc means the General_Category property value Currency_Symbol (Sc) 54 | # 55 | # The combination of property value and property name is, however, unique. 56 | # 57 | # For more information, see UAX #44, Unicode Character Database, and 58 | # UTS #18, Unicode Regular Expressions. 
59 | # ================================================ 60 | 61 | 62 | # ASCII_Hex_Digit (AHex) 63 | 64 | AHex; N ; No ; F ; False 65 | AHex; Y ; Yes ; T ; True 66 | 67 | # Age (age) 68 | 69 | age; 1.1 ; V1_1 70 | age; 2.0 ; V2_0 71 | age; 2.1 ; V2_1 72 | age; 3.0 ; V3_0 73 | age; 3.1 ; V3_1 74 | age; 3.2 ; V3_2 75 | age; 4.0 ; V4_0 76 | age; 4.1 ; V4_1 77 | age; 5.0 ; V5_0 78 | age; 5.1 ; V5_1 79 | age; 5.2 ; V5_2 80 | age; 6.0 ; V6_0 81 | age; 6.1 ; V6_1 82 | age; 6.2 ; V6_2 83 | age; 6.3 ; V6_3 84 | age; 7.0 ; V7_0 85 | age; 8.0 ; V8_0 86 | age; 9.0 ; V9_0 87 | age; 10.0 ; V10_0 88 | age; 11.0 ; V11_0 89 | age; 12.0 ; V12_0 90 | age; NA ; Unassigned 91 | 92 | # Alphabetic (Alpha) 93 | 94 | Alpha; N ; No ; F ; False 95 | Alpha; Y ; Yes ; T ; True 96 | 97 | # Bidi_Class (bc) 98 | 99 | bc ; AL ; Arabic_Letter 100 | bc ; AN ; Arabic_Number 101 | bc ; B ; Paragraph_Separator 102 | bc ; BN ; Boundary_Neutral 103 | bc ; CS ; Common_Separator 104 | bc ; EN ; European_Number 105 | bc ; ES ; European_Separator 106 | bc ; ET ; European_Terminator 107 | bc ; FSI ; First_Strong_Isolate 108 | bc ; L ; Left_To_Right 109 | bc ; LRE ; Left_To_Right_Embedding 110 | bc ; LRI ; Left_To_Right_Isolate 111 | bc ; LRO ; Left_To_Right_Override 112 | bc ; NSM ; Nonspacing_Mark 113 | bc ; ON ; Other_Neutral 114 | bc ; PDF ; Pop_Directional_Format 115 | bc ; PDI ; Pop_Directional_Isolate 116 | bc ; R ; Right_To_Left 117 | bc ; RLE ; Right_To_Left_Embedding 118 | bc ; RLI ; Right_To_Left_Isolate 119 | bc ; RLO ; Right_To_Left_Override 120 | bc ; S ; Segment_Separator 121 | bc ; WS ; White_Space 122 | 123 | # Bidi_Control (Bidi_C) 124 | 125 | Bidi_C; N ; No ; F ; False 126 | Bidi_C; Y ; Yes ; T ; True 127 | 128 | # Bidi_Mirrored (Bidi_M) 129 | 130 | Bidi_M; N ; No ; F ; False 131 | Bidi_M; Y ; Yes ; T ; True 132 | 133 | # Bidi_Mirroring_Glyph (bmg) 134 | 135 | # @missing: 0000..10FFFF; Bidi_Mirroring_Glyph; 136 | 137 | # Bidi_Paired_Bracket (bpb) 138 | 139 | # @missing: 0000..10FFFF; 
Bidi_Paired_Bracket; 140 | 141 | # Bidi_Paired_Bracket_Type (bpt) 142 | 143 | bpt; c ; Close 144 | bpt; n ; None 145 | bpt; o ; Open 146 | # @missing: 0000..10FFFF; Bidi_Paired_Bracket_Type; n 147 | 148 | # Block (blk) 149 | 150 | blk; Adlam ; Adlam 151 | blk; Aegean_Numbers ; Aegean_Numbers 152 | blk; Ahom ; Ahom 153 | blk; Alchemical ; Alchemical_Symbols 154 | blk; Alphabetic_PF ; Alphabetic_Presentation_Forms 155 | blk; Anatolian_Hieroglyphs ; Anatolian_Hieroglyphs 156 | blk; Ancient_Greek_Music ; Ancient_Greek_Musical_Notation 157 | blk; Ancient_Greek_Numbers ; Ancient_Greek_Numbers 158 | blk; Ancient_Symbols ; Ancient_Symbols 159 | blk; Arabic ; Arabic 160 | blk; Arabic_Ext_A ; Arabic_Extended_A 161 | blk; Arabic_Math ; Arabic_Mathematical_Alphabetic_Symbols 162 | blk; Arabic_PF_A ; Arabic_Presentation_Forms_A ; Arabic_Presentation_Forms-A 163 | blk; Arabic_PF_B ; Arabic_Presentation_Forms_B 164 | blk; Arabic_Sup ; Arabic_Supplement 165 | blk; Armenian ; Armenian 166 | blk; Arrows ; Arrows 167 | blk; ASCII ; Basic_Latin 168 | blk; Avestan ; Avestan 169 | blk; Balinese ; Balinese 170 | blk; Bamum ; Bamum 171 | blk; Bamum_Sup ; Bamum_Supplement 172 | blk; Bassa_Vah ; Bassa_Vah 173 | blk; Batak ; Batak 174 | blk; Bengali ; Bengali 175 | blk; Bhaiksuki ; Bhaiksuki 176 | blk; Block_Elements ; Block_Elements 177 | blk; Bopomofo ; Bopomofo 178 | blk; Bopomofo_Ext ; Bopomofo_Extended 179 | blk; Box_Drawing ; Box_Drawing 180 | blk; Brahmi ; Brahmi 181 | blk; Braille ; Braille_Patterns 182 | blk; Buginese ; Buginese 183 | blk; Buhid ; Buhid 184 | blk; Byzantine_Music ; Byzantine_Musical_Symbols 185 | blk; Carian ; Carian 186 | blk; Caucasian_Albanian ; Caucasian_Albanian 187 | blk; Chakma ; Chakma 188 | blk; Cham ; Cham 189 | blk; Cherokee ; Cherokee 190 | blk; Cherokee_Sup ; Cherokee_Supplement 191 | blk; Chess_Symbols ; Chess_Symbols 192 | blk; CJK ; CJK_Unified_Ideographs 193 | blk; CJK_Compat ; CJK_Compatibility 194 | blk; CJK_Compat_Forms ; CJK_Compatibility_Forms 
195 | blk; CJK_Compat_Ideographs ; CJK_Compatibility_Ideographs 196 | blk; CJK_Compat_Ideographs_Sup ; CJK_Compatibility_Ideographs_Supplement 197 | blk; CJK_Ext_A ; CJK_Unified_Ideographs_Extension_A 198 | blk; CJK_Ext_B ; CJK_Unified_Ideographs_Extension_B 199 | blk; CJK_Ext_C ; CJK_Unified_Ideographs_Extension_C 200 | blk; CJK_Ext_D ; CJK_Unified_Ideographs_Extension_D 201 | blk; CJK_Ext_E ; CJK_Unified_Ideographs_Extension_E 202 | blk; CJK_Ext_F ; CJK_Unified_Ideographs_Extension_F 203 | blk; CJK_Radicals_Sup ; CJK_Radicals_Supplement 204 | blk; CJK_Strokes ; CJK_Strokes 205 | blk; CJK_Symbols ; CJK_Symbols_And_Punctuation 206 | blk; Compat_Jamo ; Hangul_Compatibility_Jamo 207 | blk; Control_Pictures ; Control_Pictures 208 | blk; Coptic ; Coptic 209 | blk; Coptic_Epact_Numbers ; Coptic_Epact_Numbers 210 | blk; Counting_Rod ; Counting_Rod_Numerals 211 | blk; Cuneiform ; Cuneiform 212 | blk; Cuneiform_Numbers ; Cuneiform_Numbers_And_Punctuation 213 | blk; Currency_Symbols ; Currency_Symbols 214 | blk; Cypriot_Syllabary ; Cypriot_Syllabary 215 | blk; Cyrillic ; Cyrillic 216 | blk; Cyrillic_Ext_A ; Cyrillic_Extended_A 217 | blk; Cyrillic_Ext_B ; Cyrillic_Extended_B 218 | blk; Cyrillic_Ext_C ; Cyrillic_Extended_C 219 | blk; Cyrillic_Sup ; Cyrillic_Supplement ; Cyrillic_Supplementary 220 | blk; Deseret ; Deseret 221 | blk; Devanagari ; Devanagari 222 | blk; Devanagari_Ext ; Devanagari_Extended 223 | blk; Diacriticals ; Combining_Diacritical_Marks 224 | blk; Diacriticals_Ext ; Combining_Diacritical_Marks_Extended 225 | blk; Diacriticals_For_Symbols ; Combining_Diacritical_Marks_For_Symbols; Combining_Marks_For_Symbols 226 | blk; Diacriticals_Sup ; Combining_Diacritical_Marks_Supplement 227 | blk; Dingbats ; Dingbats 228 | blk; Dogra ; Dogra 229 | blk; Domino ; Domino_Tiles 230 | blk; Duployan ; Duployan 231 | blk; Early_Dynastic_Cuneiform ; Early_Dynastic_Cuneiform 232 | blk; Egyptian_Hieroglyph_Format_Controls; Egyptian_Hieroglyph_Format_Controls 233 | blk; 
Egyptian_Hieroglyphs ; Egyptian_Hieroglyphs 234 | blk; Elbasan ; Elbasan 235 | blk; Elymaic ; Elymaic 236 | blk; Emoticons ; Emoticons 237 | blk; Enclosed_Alphanum ; Enclosed_Alphanumerics 238 | blk; Enclosed_Alphanum_Sup ; Enclosed_Alphanumeric_Supplement 239 | blk; Enclosed_CJK ; Enclosed_CJK_Letters_And_Months 240 | blk; Enclosed_Ideographic_Sup ; Enclosed_Ideographic_Supplement 241 | blk; Ethiopic ; Ethiopic 242 | blk; Ethiopic_Ext ; Ethiopic_Extended 243 | blk; Ethiopic_Ext_A ; Ethiopic_Extended_A 244 | blk; Ethiopic_Sup ; Ethiopic_Supplement 245 | blk; Geometric_Shapes ; Geometric_Shapes 246 | blk; Geometric_Shapes_Ext ; Geometric_Shapes_Extended 247 | blk; Georgian ; Georgian 248 | blk; Georgian_Ext ; Georgian_Extended 249 | blk; Georgian_Sup ; Georgian_Supplement 250 | blk; Glagolitic ; Glagolitic 251 | blk; Glagolitic_Sup ; Glagolitic_Supplement 252 | blk; Gothic ; Gothic 253 | blk; Grantha ; Grantha 254 | blk; Greek ; Greek_And_Coptic 255 | blk; Greek_Ext ; Greek_Extended 256 | blk; Gujarati ; Gujarati 257 | blk; Gunjala_Gondi ; Gunjala_Gondi 258 | blk; Gurmukhi ; Gurmukhi 259 | blk; Half_And_Full_Forms ; Halfwidth_And_Fullwidth_Forms 260 | blk; Half_Marks ; Combining_Half_Marks 261 | blk; Hangul ; Hangul_Syllables 262 | blk; Hanifi_Rohingya ; Hanifi_Rohingya 263 | blk; Hanunoo ; Hanunoo 264 | blk; Hatran ; Hatran 265 | blk; Hebrew ; Hebrew 266 | blk; High_PU_Surrogates ; High_Private_Use_Surrogates 267 | blk; High_Surrogates ; High_Surrogates 268 | blk; Hiragana ; Hiragana 269 | blk; IDC ; Ideographic_Description_Characters 270 | blk; Ideographic_Symbols ; Ideographic_Symbols_And_Punctuation 271 | blk; Imperial_Aramaic ; Imperial_Aramaic 272 | blk; Indic_Number_Forms ; Common_Indic_Number_Forms 273 | blk; Indic_Siyaq_Numbers ; Indic_Siyaq_Numbers 274 | blk; Inscriptional_Pahlavi ; Inscriptional_Pahlavi 275 | blk; Inscriptional_Parthian ; Inscriptional_Parthian 276 | blk; IPA_Ext ; IPA_Extensions 277 | blk; Jamo ; Hangul_Jamo 278 | blk; Jamo_Ext_A ; 
Hangul_Jamo_Extended_A 279 | blk; Jamo_Ext_B ; Hangul_Jamo_Extended_B 280 | blk; Javanese ; Javanese 281 | blk; Kaithi ; Kaithi 282 | blk; Kana_Ext_A ; Kana_Extended_A 283 | blk; Kana_Sup ; Kana_Supplement 284 | blk; Kanbun ; Kanbun 285 | blk; Kangxi ; Kangxi_Radicals 286 | blk; Kannada ; Kannada 287 | blk; Katakana ; Katakana 288 | blk; Katakana_Ext ; Katakana_Phonetic_Extensions 289 | blk; Kayah_Li ; Kayah_Li 290 | blk; Kharoshthi ; Kharoshthi 291 | blk; Khmer ; Khmer 292 | blk; Khmer_Symbols ; Khmer_Symbols 293 | blk; Khojki ; Khojki 294 | blk; Khudawadi ; Khudawadi 295 | blk; Lao ; Lao 296 | blk; Latin_1_Sup ; Latin_1_Supplement ; Latin_1 297 | blk; Latin_Ext_A ; Latin_Extended_A 298 | blk; Latin_Ext_Additional ; Latin_Extended_Additional 299 | blk; Latin_Ext_B ; Latin_Extended_B 300 | blk; Latin_Ext_C ; Latin_Extended_C 301 | blk; Latin_Ext_D ; Latin_Extended_D 302 | blk; Latin_Ext_E ; Latin_Extended_E 303 | blk; Lepcha ; Lepcha 304 | blk; Letterlike_Symbols ; Letterlike_Symbols 305 | blk; Limbu ; Limbu 306 | blk; Linear_A ; Linear_A 307 | blk; Linear_B_Ideograms ; Linear_B_Ideograms 308 | blk; Linear_B_Syllabary ; Linear_B_Syllabary 309 | blk; Lisu ; Lisu 310 | blk; Low_Surrogates ; Low_Surrogates 311 | blk; Lycian ; Lycian 312 | blk; Lydian ; Lydian 313 | blk; Mahajani ; Mahajani 314 | blk; Mahjong ; Mahjong_Tiles 315 | blk; Makasar ; Makasar 316 | blk; Malayalam ; Malayalam 317 | blk; Mandaic ; Mandaic 318 | blk; Manichaean ; Manichaean 319 | blk; Marchen ; Marchen 320 | blk; Masaram_Gondi ; Masaram_Gondi 321 | blk; Math_Alphanum ; Mathematical_Alphanumeric_Symbols 322 | blk; Math_Operators ; Mathematical_Operators 323 | blk; Mayan_Numerals ; Mayan_Numerals 324 | blk; Medefaidrin ; Medefaidrin 325 | blk; Meetei_Mayek ; Meetei_Mayek 326 | blk; Meetei_Mayek_Ext ; Meetei_Mayek_Extensions 327 | blk; Mende_Kikakui ; Mende_Kikakui 328 | blk; Meroitic_Cursive ; Meroitic_Cursive 329 | blk; Meroitic_Hieroglyphs ; Meroitic_Hieroglyphs 330 | blk; Miao ; Miao 331 | 
blk; Misc_Arrows ; Miscellaneous_Symbols_And_Arrows 332 | blk; Misc_Math_Symbols_A ; Miscellaneous_Mathematical_Symbols_A 333 | blk; Misc_Math_Symbols_B ; Miscellaneous_Mathematical_Symbols_B 334 | blk; Misc_Pictographs ; Miscellaneous_Symbols_And_Pictographs 335 | blk; Misc_Symbols ; Miscellaneous_Symbols 336 | blk; Misc_Technical ; Miscellaneous_Technical 337 | blk; Modi ; Modi 338 | blk; Modifier_Letters ; Spacing_Modifier_Letters 339 | blk; Modifier_Tone_Letters ; Modifier_Tone_Letters 340 | blk; Mongolian ; Mongolian 341 | blk; Mongolian_Sup ; Mongolian_Supplement 342 | blk; Mro ; Mro 343 | blk; Multani ; Multani 344 | blk; Music ; Musical_Symbols 345 | blk; Myanmar ; Myanmar 346 | blk; Myanmar_Ext_A ; Myanmar_Extended_A 347 | blk; Myanmar_Ext_B ; Myanmar_Extended_B 348 | blk; Nabataean ; Nabataean 349 | blk; Nandinagari ; Nandinagari 350 | blk; NB ; No_Block 351 | blk; New_Tai_Lue ; New_Tai_Lue 352 | blk; Newa ; Newa 353 | blk; NKo ; NKo 354 | blk; Number_Forms ; Number_Forms 355 | blk; Nushu ; Nushu 356 | blk; Nyiakeng_Puachue_Hmong ; Nyiakeng_Puachue_Hmong 357 | blk; OCR ; Optical_Character_Recognition 358 | blk; Ogham ; Ogham 359 | blk; Ol_Chiki ; Ol_Chiki 360 | blk; Old_Hungarian ; Old_Hungarian 361 | blk; Old_Italic ; Old_Italic 362 | blk; Old_North_Arabian ; Old_North_Arabian 363 | blk; Old_Permic ; Old_Permic 364 | blk; Old_Persian ; Old_Persian 365 | blk; Old_Sogdian ; Old_Sogdian 366 | blk; Old_South_Arabian ; Old_South_Arabian 367 | blk; Old_Turkic ; Old_Turkic 368 | blk; Oriya ; Oriya 369 | blk; Ornamental_Dingbats ; Ornamental_Dingbats 370 | blk; Osage ; Osage 371 | blk; Osmanya ; Osmanya 372 | blk; Ottoman_Siyaq_Numbers ; Ottoman_Siyaq_Numbers 373 | blk; Pahawh_Hmong ; Pahawh_Hmong 374 | blk; Palmyrene ; Palmyrene 375 | blk; Pau_Cin_Hau ; Pau_Cin_Hau 376 | blk; Phags_Pa ; Phags_Pa 377 | blk; Phaistos ; Phaistos_Disc 378 | blk; Phoenician ; Phoenician 379 | blk; Phonetic_Ext ; Phonetic_Extensions 380 | blk; Phonetic_Ext_Sup ; 
Phonetic_Extensions_Supplement 381 | blk; Playing_Cards ; Playing_Cards 382 | blk; Psalter_Pahlavi ; Psalter_Pahlavi 383 | blk; PUA ; Private_Use_Area ; Private_Use 384 | blk; Punctuation ; General_Punctuation 385 | blk; Rejang ; Rejang 386 | blk; Rumi ; Rumi_Numeral_Symbols 387 | blk; Runic ; Runic 388 | blk; Samaritan ; Samaritan 389 | blk; Saurashtra ; Saurashtra 390 | blk; Sharada ; Sharada 391 | blk; Shavian ; Shavian 392 | blk; Shorthand_Format_Controls ; Shorthand_Format_Controls 393 | blk; Siddham ; Siddham 394 | blk; Sinhala ; Sinhala 395 | blk; Sinhala_Archaic_Numbers ; Sinhala_Archaic_Numbers 396 | blk; Small_Forms ; Small_Form_Variants 397 | blk; Small_Kana_Ext ; Small_Kana_Extension 398 | blk; Sogdian ; Sogdian 399 | blk; Sora_Sompeng ; Sora_Sompeng 400 | blk; Soyombo ; Soyombo 401 | blk; Specials ; Specials 402 | blk; Sundanese ; Sundanese 403 | blk; Sundanese_Sup ; Sundanese_Supplement 404 | blk; Sup_Arrows_A ; Supplemental_Arrows_A 405 | blk; Sup_Arrows_B ; Supplemental_Arrows_B 406 | blk; Sup_Arrows_C ; Supplemental_Arrows_C 407 | blk; Sup_Math_Operators ; Supplemental_Mathematical_Operators 408 | blk; Sup_PUA_A ; Supplementary_Private_Use_Area_A 409 | blk; Sup_PUA_B ; Supplementary_Private_Use_Area_B 410 | blk; Sup_Punctuation ; Supplemental_Punctuation 411 | blk; Sup_Symbols_And_Pictographs ; Supplemental_Symbols_And_Pictographs 412 | blk; Super_And_Sub ; Superscripts_And_Subscripts 413 | blk; Sutton_SignWriting ; Sutton_SignWriting 414 | blk; Syloti_Nagri ; Syloti_Nagri 415 | blk; Symbols_And_Pictographs_Ext_A ; Symbols_And_Pictographs_Extended_A 416 | blk; Syriac ; Syriac 417 | blk; Syriac_Sup ; Syriac_Supplement 418 | blk; Tagalog ; Tagalog 419 | blk; Tagbanwa ; Tagbanwa 420 | blk; Tags ; Tags 421 | blk; Tai_Le ; Tai_Le 422 | blk; Tai_Tham ; Tai_Tham 423 | blk; Tai_Viet ; Tai_Viet 424 | blk; Tai_Xuan_Jing ; Tai_Xuan_Jing_Symbols 425 | blk; Takri ; Takri 426 | blk; Tamil ; Tamil 427 | blk; Tamil_Sup ; Tamil_Supplement 428 | blk; Tangut ; Tangut 
429 | blk; Tangut_Components ; Tangut_Components 430 | blk; Telugu ; Telugu 431 | blk; Thaana ; Thaana 432 | blk; Thai ; Thai 433 | blk; Tibetan ; Tibetan 434 | blk; Tifinagh ; Tifinagh 435 | blk; Tirhuta ; Tirhuta 436 | blk; Transport_And_Map ; Transport_And_Map_Symbols 437 | blk; UCAS ; Unified_Canadian_Aboriginal_Syllabics; Canadian_Syllabics 438 | blk; UCAS_Ext ; Unified_Canadian_Aboriginal_Syllabics_Extended 439 | blk; Ugaritic ; Ugaritic 440 | blk; Vai ; Vai 441 | blk; Vedic_Ext ; Vedic_Extensions 442 | blk; Vertical_Forms ; Vertical_Forms 443 | blk; VS ; Variation_Selectors 444 | blk; VS_Sup ; Variation_Selectors_Supplement 445 | blk; Wancho ; Wancho 446 | blk; Warang_Citi ; Warang_Citi 447 | blk; Yi_Radicals ; Yi_Radicals 448 | blk; Yi_Syllables ; Yi_Syllables 449 | blk; Yijing ; Yijing_Hexagram_Symbols 450 | blk; Zanabazar_Square ; Zanabazar_Square 451 | 452 | # Canonical_Combining_Class (ccc) 453 | 454 | ccc; 0; NR ; Not_Reordered 455 | ccc; 1; OV ; Overlay 456 | ccc; 7; NK ; Nukta 457 | ccc; 8; KV ; Kana_Voicing 458 | ccc; 9; VR ; Virama 459 | ccc; 10; CCC10 ; CCC10 460 | ccc; 11; CCC11 ; CCC11 461 | ccc; 12; CCC12 ; CCC12 462 | ccc; 13; CCC13 ; CCC13 463 | ccc; 14; CCC14 ; CCC14 464 | ccc; 15; CCC15 ; CCC15 465 | ccc; 16; CCC16 ; CCC16 466 | ccc; 17; CCC17 ; CCC17 467 | ccc; 18; CCC18 ; CCC18 468 | ccc; 19; CCC19 ; CCC19 469 | ccc; 20; CCC20 ; CCC20 470 | ccc; 21; CCC21 ; CCC21 471 | ccc; 22; CCC22 ; CCC22 472 | ccc; 23; CCC23 ; CCC23 473 | ccc; 24; CCC24 ; CCC24 474 | ccc; 25; CCC25 ; CCC25 475 | ccc; 26; CCC26 ; CCC26 476 | ccc; 27; CCC27 ; CCC27 477 | ccc; 28; CCC28 ; CCC28 478 | ccc; 29; CCC29 ; CCC29 479 | ccc; 30; CCC30 ; CCC30 480 | ccc; 31; CCC31 ; CCC31 481 | ccc; 32; CCC32 ; CCC32 482 | ccc; 33; CCC33 ; CCC33 483 | ccc; 34; CCC34 ; CCC34 484 | ccc; 35; CCC35 ; CCC35 485 | ccc; 36; CCC36 ; CCC36 486 | ccc; 84; CCC84 ; CCC84 487 | ccc; 91; CCC91 ; CCC91 488 | ccc; 103; CCC103 ; CCC103 489 | ccc; 107; CCC107 ; CCC107 490 | ccc; 118; CCC118 ; 
CCC118 491 | ccc; 122; CCC122 ; CCC122 492 | ccc; 129; CCC129 ; CCC129 493 | ccc; 130; CCC130 ; CCC130 494 | ccc; 132; CCC132 ; CCC132 495 | ccc; 133; CCC133 ; CCC133 # RESERVED 496 | ccc; 200; ATBL ; Attached_Below_Left 497 | ccc; 202; ATB ; Attached_Below 498 | ccc; 214; ATA ; Attached_Above 499 | ccc; 216; ATAR ; Attached_Above_Right 500 | ccc; 218; BL ; Below_Left 501 | ccc; 220; B ; Below 502 | ccc; 222; BR ; Below_Right 503 | ccc; 224; L ; Left 504 | ccc; 226; R ; Right 505 | ccc; 228; AL ; Above_Left 506 | ccc; 230; A ; Above 507 | ccc; 232; AR ; Above_Right 508 | ccc; 233; DB ; Double_Below 509 | ccc; 234; DA ; Double_Above 510 | ccc; 240; IS ; Iota_Subscript 511 | 512 | # Case_Folding (cf) 513 | 514 | # @missing: 0000..10FFFF; Case_Folding; 515 | 516 | # Case_Ignorable (CI) 517 | 518 | CI ; N ; No ; F ; False 519 | CI ; Y ; Yes ; T ; True 520 | 521 | # Cased (Cased) 522 | 523 | Cased; N ; No ; F ; False 524 | Cased; Y ; Yes ; T ; True 525 | 526 | # Changes_When_Casefolded (CWCF) 527 | 528 | CWCF; N ; No ; F ; False 529 | CWCF; Y ; Yes ; T ; True 530 | 531 | # Changes_When_Casemapped (CWCM) 532 | 533 | CWCM; N ; No ; F ; False 534 | CWCM; Y ; Yes ; T ; True 535 | 536 | # Changes_When_Lowercased (CWL) 537 | 538 | CWL; N ; No ; F ; False 539 | CWL; Y ; Yes ; T ; True 540 | 541 | # Changes_When_NFKC_Casefolded (CWKCF) 542 | 543 | CWKCF; N ; No ; F ; False 544 | CWKCF; Y ; Yes ; T ; True 545 | 546 | # Changes_When_Titlecased (CWT) 547 | 548 | CWT; N ; No ; F ; False 549 | CWT; Y ; Yes ; T ; True 550 | 551 | # Changes_When_Uppercased (CWU) 552 | 553 | CWU; N ; No ; F ; False 554 | CWU; Y ; Yes ; T ; True 555 | 556 | # Composition_Exclusion (CE) 557 | 558 | CE ; N ; No ; F ; False 559 | CE ; Y ; Yes ; T ; True 560 | 561 | # Dash (Dash) 562 | 563 | Dash; N ; No ; F ; False 564 | Dash; Y ; Yes ; T ; True 565 | 566 | # Decomposition_Mapping (dm) 567 | 568 | # @missing: 0000..10FFFF; Decomposition_Mapping; 569 | 570 | # Decomposition_Type (dt) 571 | 572 | dt ; Can ; 
Canonical ; can 573 | dt ; Com ; Compat ; com 574 | dt ; Enc ; Circle ; enc 575 | dt ; Fin ; Final ; fin 576 | dt ; Font ; Font ; font 577 | dt ; Fra ; Fraction ; fra 578 | dt ; Init ; Initial ; init 579 | dt ; Iso ; Isolated ; iso 580 | dt ; Med ; Medial ; med 581 | dt ; Nar ; Narrow ; nar 582 | dt ; Nb ; Nobreak ; nb 583 | dt ; None ; None ; none 584 | dt ; Sml ; Small ; sml 585 | dt ; Sqr ; Square ; sqr 586 | dt ; Sub ; Sub ; sub 587 | dt ; Sup ; Super ; sup 588 | dt ; Vert ; Vertical ; vert 589 | dt ; Wide ; Wide ; wide 590 | 591 | # Default_Ignorable_Code_Point (DI) 592 | 593 | DI ; N ; No ; F ; False 594 | DI ; Y ; Yes ; T ; True 595 | 596 | # Deprecated (Dep) 597 | 598 | Dep; N ; No ; F ; False 599 | Dep; Y ; Yes ; T ; True 600 | 601 | # Diacritic (Dia) 602 | 603 | Dia; N ; No ; F ; False 604 | Dia; Y ; Yes ; T ; True 605 | 606 | # East_Asian_Width (ea) 607 | 608 | ea ; A ; Ambiguous 609 | ea ; F ; Fullwidth 610 | ea ; H ; Halfwidth 611 | ea ; N ; Neutral 612 | ea ; Na ; Narrow 613 | ea ; W ; Wide 614 | 615 | # Equivalent_Unified_Ideograph (EqUIdeo) 616 | 617 | # @missing: 0000..10FFFF; Equivalent_Unified_Ideograph; 618 | 619 | # Expands_On_NFC (XO_NFC) 620 | 621 | XO_NFC; N ; No ; F ; False 622 | XO_NFC; Y ; Yes ; T ; True 623 | 624 | # Expands_On_NFD (XO_NFD) 625 | 626 | XO_NFD; N ; No ; F ; False 627 | XO_NFD; Y ; Yes ; T ; True 628 | 629 | # Expands_On_NFKC (XO_NFKC) 630 | 631 | XO_NFKC; N ; No ; F ; False 632 | XO_NFKC; Y ; Yes ; T ; True 633 | 634 | # Expands_On_NFKD (XO_NFKD) 635 | 636 | XO_NFKD; N ; No ; F ; False 637 | XO_NFKD; Y ; Yes ; T ; True 638 | 639 | # Extender (Ext) 640 | 641 | Ext; N ; No ; F ; False 642 | Ext; Y ; Yes ; T ; True 643 | 644 | # FC_NFKC_Closure (FC_NFKC) 645 | 646 | # @missing: 0000..10FFFF; FC_NFKC_Closure; 647 | 648 | # Full_Composition_Exclusion (Comp_Ex) 649 | 650 | Comp_Ex; N ; No ; F ; False 651 | Comp_Ex; Y ; Yes ; T ; True 652 | 653 | # General_Category (gc) 654 | 655 | gc ; C ; Other # Cc | Cf | Cn | Co | Cs 656 | 
gc ; Cc ; Control ; cntrl 657 | gc ; Cf ; Format 658 | gc ; Cn ; Unassigned 659 | gc ; Co ; Private_Use 660 | gc ; Cs ; Surrogate 661 | gc ; L ; Letter # Ll | Lm | Lo | Lt | Lu 662 | gc ; LC ; Cased_Letter # Ll | Lt | Lu 663 | gc ; Ll ; Lowercase_Letter 664 | gc ; Lm ; Modifier_Letter 665 | gc ; Lo ; Other_Letter 666 | gc ; Lt ; Titlecase_Letter 667 | gc ; Lu ; Uppercase_Letter 668 | gc ; M ; Mark ; Combining_Mark # Mc | Me | Mn 669 | gc ; Mc ; Spacing_Mark 670 | gc ; Me ; Enclosing_Mark 671 | gc ; Mn ; Nonspacing_Mark 672 | gc ; N ; Number # Nd | Nl | No 673 | gc ; Nd ; Decimal_Number ; digit 674 | gc ; Nl ; Letter_Number 675 | gc ; No ; Other_Number 676 | gc ; P ; Punctuation ; punct # Pc | Pd | Pe | Pf | Pi | Po | Ps 677 | gc ; Pc ; Connector_Punctuation 678 | gc ; Pd ; Dash_Punctuation 679 | gc ; Pe ; Close_Punctuation 680 | gc ; Pf ; Final_Punctuation 681 | gc ; Pi ; Initial_Punctuation 682 | gc ; Po ; Other_Punctuation 683 | gc ; Ps ; Open_Punctuation 684 | gc ; S ; Symbol # Sc | Sk | Sm | So 685 | gc ; Sc ; Currency_Symbol 686 | gc ; Sk ; Modifier_Symbol 687 | gc ; Sm ; Math_Symbol 688 | gc ; So ; Other_Symbol 689 | gc ; Z ; Separator # Zl | Zp | Zs 690 | gc ; Zl ; Line_Separator 691 | gc ; Zp ; Paragraph_Separator 692 | gc ; Zs ; Space_Separator 693 | # @missing: 0000..10FFFF; General_Category; Unassigned 694 | 695 | # Grapheme_Base (Gr_Base) 696 | 697 | Gr_Base; N ; No ; F ; False 698 | Gr_Base; Y ; Yes ; T ; True 699 | 700 | # Grapheme_Cluster_Break (GCB) 701 | 702 | GCB; CN ; Control 703 | GCB; CR ; CR 704 | GCB; EB ; E_Base 705 | GCB; EBG ; E_Base_GAZ 706 | GCB; EM ; E_Modifier 707 | GCB; EX ; Extend 708 | GCB; GAZ ; Glue_After_Zwj 709 | GCB; L ; L 710 | GCB; LF ; LF 711 | GCB; LV ; LV 712 | GCB; LVT ; LVT 713 | GCB; PP ; Prepend 714 | GCB; RI ; Regional_Indicator 715 | GCB; SM ; SpacingMark 716 | GCB; T ; T 717 | GCB; V ; V 718 | GCB; XX ; Other 719 | GCB; ZWJ ; ZWJ 720 | 721 | # Grapheme_Extend (Gr_Ext) 722 | 723 | Gr_Ext; N ; No ; F ; False 724 | 
Gr_Ext; Y ; Yes ; T ; True 725 | 726 | # Grapheme_Link (Gr_Link) 727 | 728 | Gr_Link; N ; No ; F ; False 729 | Gr_Link; Y ; Yes ; T ; True 730 | 731 | # Hangul_Syllable_Type (hst) 732 | 733 | hst; L ; Leading_Jamo 734 | hst; LV ; LV_Syllable 735 | hst; LVT ; LVT_Syllable 736 | hst; NA ; Not_Applicable 737 | hst; T ; Trailing_Jamo 738 | hst; V ; Vowel_Jamo 739 | 740 | # Hex_Digit (Hex) 741 | 742 | Hex; N ; No ; F ; False 743 | Hex; Y ; Yes ; T ; True 744 | 745 | # Hyphen (Hyphen) 746 | 747 | Hyphen; N ; No ; F ; False 748 | Hyphen; Y ; Yes ; T ; True 749 | 750 | # IDS_Binary_Operator (IDSB) 751 | 752 | IDSB; N ; No ; F ; False 753 | IDSB; Y ; Yes ; T ; True 754 | 755 | # IDS_Trinary_Operator (IDST) 756 | 757 | IDST; N ; No ; F ; False 758 | IDST; Y ; Yes ; T ; True 759 | 760 | # ID_Continue (IDC) 761 | 762 | IDC; N ; No ; F ; False 763 | IDC; Y ; Yes ; T ; True 764 | 765 | # ID_Start (IDS) 766 | 767 | IDS; N ; No ; F ; False 768 | IDS; Y ; Yes ; T ; True 769 | 770 | # ISO_Comment (isc) 771 | 772 | # @missing: 0000..10FFFF; ISO_Comment; 773 | 774 | # Ideographic (Ideo) 775 | 776 | Ideo; N ; No ; F ; False 777 | Ideo; Y ; Yes ; T ; True 778 | 779 | # Indic_Positional_Category (InPC) 780 | 781 | InPC; Bottom ; Bottom 782 | InPC; Bottom_And_Left ; Bottom_And_Left 783 | InPC; Bottom_And_Right ; Bottom_And_Right 784 | InPC; Left ; Left 785 | InPC; Left_And_Right ; Left_And_Right 786 | InPC; NA ; NA 787 | InPC; Overstruck ; Overstruck 788 | InPC; Right ; Right 789 | InPC; Top ; Top 790 | InPC; Top_And_Bottom ; Top_And_Bottom 791 | InPC; Top_And_Bottom_And_Right ; Top_And_Bottom_And_Right 792 | InPC; Top_And_Left ; Top_And_Left 793 | InPC; Top_And_Left_And_Right ; Top_And_Left_And_Right 794 | InPC; Top_And_Right ; Top_And_Right 795 | InPC; Visual_Order_Left ; Visual_Order_Left 796 | 797 | # Indic_Syllabic_Category (InSC) 798 | 799 | InSC; Avagraha ; Avagraha 800 | InSC; Bindu ; Bindu 801 | InSC; Brahmi_Joining_Number ; Brahmi_Joining_Number 802 | InSC; Cantillation_Mark ; 
Cantillation_Mark 803 | InSC; Consonant ; Consonant 804 | InSC; Consonant_Dead ; Consonant_Dead 805 | InSC; Consonant_Final ; Consonant_Final 806 | InSC; Consonant_Head_Letter ; Consonant_Head_Letter 807 | InSC; Consonant_Initial_Postfixed ; Consonant_Initial_Postfixed 808 | InSC; Consonant_Killer ; Consonant_Killer 809 | InSC; Consonant_Medial ; Consonant_Medial 810 | InSC; Consonant_Placeholder ; Consonant_Placeholder 811 | InSC; Consonant_Preceding_Repha ; Consonant_Preceding_Repha 812 | InSC; Consonant_Prefixed ; Consonant_Prefixed 813 | InSC; Consonant_Subjoined ; Consonant_Subjoined 814 | InSC; Consonant_Succeeding_Repha ; Consonant_Succeeding_Repha 815 | InSC; Consonant_With_Stacker ; Consonant_With_Stacker 816 | InSC; Gemination_Mark ; Gemination_Mark 817 | InSC; Invisible_Stacker ; Invisible_Stacker 818 | InSC; Joiner ; Joiner 819 | InSC; Modifying_Letter ; Modifying_Letter 820 | InSC; Non_Joiner ; Non_Joiner 821 | InSC; Nukta ; Nukta 822 | InSC; Number ; Number 823 | InSC; Number_Joiner ; Number_Joiner 824 | InSC; Other ; Other 825 | InSC; Pure_Killer ; Pure_Killer 826 | InSC; Register_Shifter ; Register_Shifter 827 | InSC; Syllable_Modifier ; Syllable_Modifier 828 | InSC; Tone_Letter ; Tone_Letter 829 | InSC; Tone_Mark ; Tone_Mark 830 | InSC; Virama ; Virama 831 | InSC; Visarga ; Visarga 832 | InSC; Vowel ; Vowel 833 | InSC; Vowel_Dependent ; Vowel_Dependent 834 | InSC; Vowel_Independent ; Vowel_Independent 835 | 836 | # Jamo_Short_Name (JSN) 837 | 838 | JSN; A ; A 839 | JSN; AE ; AE 840 | JSN; B ; B 841 | JSN; BB ; BB 842 | JSN; BS ; BS 843 | JSN; C ; C 844 | JSN; D ; D 845 | JSN; DD ; DD 846 | JSN; E ; E 847 | JSN; EO ; EO 848 | JSN; EU ; EU 849 | JSN; G ; G 850 | JSN; GG ; GG 851 | JSN; GS ; GS 852 | JSN; H ; H 853 | JSN; I ; I 854 | JSN; J ; J 855 | JSN; JJ ; JJ 856 | JSN; K ; K 857 | JSN; L ; L 858 | JSN; LB ; LB 859 | JSN; LG ; LG 860 | JSN; LH ; LH 861 | JSN; LM ; LM 862 | JSN; LP ; LP 863 | JSN; LS ; LS 864 | JSN; LT ; LT 865 | JSN; M ; M 866 | 
JSN; N ; N 867 | JSN; NG ; NG 868 | JSN; NH ; NH 869 | JSN; NJ ; NJ 870 | JSN; O ; O 871 | JSN; OE ; OE 872 | JSN; P ; P 873 | JSN; R ; R 874 | JSN; S ; S 875 | JSN; SS ; SS 876 | JSN; T ; T 877 | JSN; U ; U 878 | JSN; WA ; WA 879 | JSN; WAE ; WAE 880 | JSN; WE ; WE 881 | JSN; WEO ; WEO 882 | JSN; WI ; WI 883 | JSN; YA ; YA 884 | JSN; YAE ; YAE 885 | JSN; YE ; YE 886 | JSN; YEO ; YEO 887 | JSN; YI ; YI 888 | JSN; YO ; YO 889 | JSN; YU ; YU 890 | # @missing: 0000..10FFFF; Jamo_Short_Name; 891 | 892 | # Join_Control (Join_C) 893 | 894 | Join_C; N ; No ; F ; False 895 | Join_C; Y ; Yes ; T ; True 896 | 897 | # Joining_Group (jg) 898 | 899 | jg ; African_Feh ; African_Feh 900 | jg ; African_Noon ; African_Noon 901 | jg ; African_Qaf ; African_Qaf 902 | jg ; Ain ; Ain 903 | jg ; Alaph ; Alaph 904 | jg ; Alef ; Alef 905 | jg ; Beh ; Beh 906 | jg ; Beth ; Beth 907 | jg ; Burushaski_Yeh_Barree ; Burushaski_Yeh_Barree 908 | jg ; Dal ; Dal 909 | jg ; Dalath_Rish ; Dalath_Rish 910 | jg ; E ; E 911 | jg ; Farsi_Yeh ; Farsi_Yeh 912 | jg ; Fe ; Fe 913 | jg ; Feh ; Feh 914 | jg ; Final_Semkath ; Final_Semkath 915 | jg ; Gaf ; Gaf 916 | jg ; Gamal ; Gamal 917 | jg ; Hah ; Hah 918 | jg ; Hanifi_Rohingya_Kinna_Ya ; Hanifi_Rohingya_Kinna_Ya 919 | jg ; Hanifi_Rohingya_Pa ; Hanifi_Rohingya_Pa 920 | jg ; He ; He 921 | jg ; Heh ; Heh 922 | jg ; Heh_Goal ; Heh_Goal 923 | jg ; Heth ; Heth 924 | jg ; Kaf ; Kaf 925 | jg ; Kaph ; Kaph 926 | jg ; Khaph ; Khaph 927 | jg ; Knotted_Heh ; Knotted_Heh 928 | jg ; Lam ; Lam 929 | jg ; Lamadh ; Lamadh 930 | jg ; Malayalam_Bha ; Malayalam_Bha 931 | jg ; Malayalam_Ja ; Malayalam_Ja 932 | jg ; Malayalam_Lla ; Malayalam_Lla 933 | jg ; Malayalam_Llla ; Malayalam_Llla 934 | jg ; Malayalam_Nga ; Malayalam_Nga 935 | jg ; Malayalam_Nna ; Malayalam_Nna 936 | jg ; Malayalam_Nnna ; Malayalam_Nnna 937 | jg ; Malayalam_Nya ; Malayalam_Nya 938 | jg ; Malayalam_Ra ; Malayalam_Ra 939 | jg ; Malayalam_Ssa ; Malayalam_Ssa 940 | jg ; Malayalam_Tta ; Malayalam_Tta 941 | 
jg ; Manichaean_Aleph ; Manichaean_Aleph 942 | jg ; Manichaean_Ayin ; Manichaean_Ayin 943 | jg ; Manichaean_Beth ; Manichaean_Beth 944 | jg ; Manichaean_Daleth ; Manichaean_Daleth 945 | jg ; Manichaean_Dhamedh ; Manichaean_Dhamedh 946 | jg ; Manichaean_Five ; Manichaean_Five 947 | jg ; Manichaean_Gimel ; Manichaean_Gimel 948 | jg ; Manichaean_Heth ; Manichaean_Heth 949 | jg ; Manichaean_Hundred ; Manichaean_Hundred 950 | jg ; Manichaean_Kaph ; Manichaean_Kaph 951 | jg ; Manichaean_Lamedh ; Manichaean_Lamedh 952 | jg ; Manichaean_Mem ; Manichaean_Mem 953 | jg ; Manichaean_Nun ; Manichaean_Nun 954 | jg ; Manichaean_One ; Manichaean_One 955 | jg ; Manichaean_Pe ; Manichaean_Pe 956 | jg ; Manichaean_Qoph ; Manichaean_Qoph 957 | jg ; Manichaean_Resh ; Manichaean_Resh 958 | jg ; Manichaean_Sadhe ; Manichaean_Sadhe 959 | jg ; Manichaean_Samekh ; Manichaean_Samekh 960 | jg ; Manichaean_Taw ; Manichaean_Taw 961 | jg ; Manichaean_Ten ; Manichaean_Ten 962 | jg ; Manichaean_Teth ; Manichaean_Teth 963 | jg ; Manichaean_Thamedh ; Manichaean_Thamedh 964 | jg ; Manichaean_Twenty ; Manichaean_Twenty 965 | jg ; Manichaean_Waw ; Manichaean_Waw 966 | jg ; Manichaean_Yodh ; Manichaean_Yodh 967 | jg ; Manichaean_Zayin ; Manichaean_Zayin 968 | jg ; Meem ; Meem 969 | jg ; Mim ; Mim 970 | jg ; No_Joining_Group ; No_Joining_Group 971 | jg ; Noon ; Noon 972 | jg ; Nun ; Nun 973 | jg ; Nya ; Nya 974 | jg ; Pe ; Pe 975 | jg ; Qaf ; Qaf 976 | jg ; Qaph ; Qaph 977 | jg ; Reh ; Reh 978 | jg ; Reversed_Pe ; Reversed_Pe 979 | jg ; Rohingya_Yeh ; Rohingya_Yeh 980 | jg ; Sad ; Sad 981 | jg ; Sadhe ; Sadhe 982 | jg ; Seen ; Seen 983 | jg ; Semkath ; Semkath 984 | jg ; Shin ; Shin 985 | jg ; Straight_Waw ; Straight_Waw 986 | jg ; Swash_Kaf ; Swash_Kaf 987 | jg ; Syriac_Waw ; Syriac_Waw 988 | jg ; Tah ; Tah 989 | jg ; Taw ; Taw 990 | jg ; Teh_Marbuta ; Teh_Marbuta 991 | jg ; Teh_Marbuta_Goal ; Hamza_On_Heh_Goal 992 | jg ; Teth ; Teth 993 | jg ; Waw ; Waw 994 | jg ; Yeh ; Yeh 995 | jg ; Yeh_Barree ; 
Yeh_Barree 996 | jg ; Yeh_With_Tail ; Yeh_With_Tail 997 | jg ; Yudh ; Yudh 998 | jg ; Yudh_He ; Yudh_He 999 | jg ; Zain ; Zain 1000 | jg ; Zhain ; Zhain 1001 | 1002 | # Joining_Type (jt) 1003 | 1004 | jt ; C ; Join_Causing 1005 | jt ; D ; Dual_Joining 1006 | jt ; L ; Left_Joining 1007 | jt ; R ; Right_Joining 1008 | jt ; T ; Transparent 1009 | jt ; U ; Non_Joining 1010 | 1011 | # Line_Break (lb) 1012 | 1013 | lb ; AI ; Ambiguous 1014 | lb ; AL ; Alphabetic 1015 | lb ; B2 ; Break_Both 1016 | lb ; BA ; Break_After 1017 | lb ; BB ; Break_Before 1018 | lb ; BK ; Mandatory_Break 1019 | lb ; CB ; Contingent_Break 1020 | lb ; CJ ; Conditional_Japanese_Starter 1021 | lb ; CL ; Close_Punctuation 1022 | lb ; CM ; Combining_Mark 1023 | lb ; CP ; Close_Parenthesis 1024 | lb ; CR ; Carriage_Return 1025 | lb ; EB ; E_Base 1026 | lb ; EM ; E_Modifier 1027 | lb ; EX ; Exclamation 1028 | lb ; GL ; Glue 1029 | lb ; H2 ; H2 1030 | lb ; H3 ; H3 1031 | lb ; HL ; Hebrew_Letter 1032 | lb ; HY ; Hyphen 1033 | lb ; ID ; Ideographic 1034 | lb ; IN ; Inseparable ; Inseperable 1035 | lb ; IS ; Infix_Numeric 1036 | lb ; JL ; JL 1037 | lb ; JT ; JT 1038 | lb ; JV ; JV 1039 | lb ; LF ; Line_Feed 1040 | lb ; NL ; Next_Line 1041 | lb ; NS ; Nonstarter 1042 | lb ; NU ; Numeric 1043 | lb ; OP ; Open_Punctuation 1044 | lb ; PO ; Postfix_Numeric 1045 | lb ; PR ; Prefix_Numeric 1046 | lb ; QU ; Quotation 1047 | lb ; RI ; Regional_Indicator 1048 | lb ; SA ; Complex_Context 1049 | lb ; SG ; Surrogate 1050 | lb ; SP ; Space 1051 | lb ; SY ; Break_Symbols 1052 | lb ; WJ ; Word_Joiner 1053 | lb ; XX ; Unknown 1054 | lb ; ZW ; ZWSpace 1055 | lb ; ZWJ ; ZWJ 1056 | 1057 | # Logical_Order_Exception (LOE) 1058 | 1059 | LOE; N ; No ; F ; False 1060 | LOE; Y ; Yes ; T ; True 1061 | 1062 | # Lowercase (Lower) 1063 | 1064 | Lower; N ; No ; F ; False 1065 | Lower; Y ; Yes ; T ; True 1066 | 1067 | # Lowercase_Mapping (lc) 1068 | 1069 | # @missing: 0000..10FFFF; Lowercase_Mapping; 1070 | 1071 | # Math (Math) 1072 | 
1073 | Math; N ; No ; F ; False 1074 | Math; Y ; Yes ; T ; True 1075 | 1076 | # NFC_Quick_Check (NFC_QC) 1077 | 1078 | NFC_QC; M ; Maybe 1079 | NFC_QC; N ; No 1080 | NFC_QC; Y ; Yes 1081 | 1082 | # NFD_Quick_Check (NFD_QC) 1083 | 1084 | NFD_QC; N ; No 1085 | NFD_QC; Y ; Yes 1086 | 1087 | # NFKC_Casefold (NFKC_CF) 1088 | 1089 | # @missing: 0000..10FFFF; NFKC_Casefold; 1090 | 1091 | # NFKC_Quick_Check (NFKC_QC) 1092 | 1093 | NFKC_QC; M ; Maybe 1094 | NFKC_QC; N ; No 1095 | NFKC_QC; Y ; Yes 1096 | 1097 | # NFKD_Quick_Check (NFKD_QC) 1098 | 1099 | NFKD_QC; N ; No 1100 | NFKD_QC; Y ; Yes 1101 | 1102 | # Name (na) 1103 | 1104 | # @missing: 0000..10FFFF; Name; 1105 | 1106 | # Name_Alias (Name_Alias) 1107 | 1108 | # @missing: 0000..10FFFF; Name_Alias; 1109 | 1110 | # Noncharacter_Code_Point (NChar) 1111 | 1112 | NChar; N ; No ; F ; False 1113 | NChar; Y ; Yes ; T ; True 1114 | 1115 | # Numeric_Type (nt) 1116 | 1117 | nt ; De ; Decimal 1118 | nt ; Di ; Digit 1119 | nt ; None ; None 1120 | nt ; Nu ; Numeric 1121 | 1122 | # Numeric_Value (nv) 1123 | 1124 | # @missing: 0000..10FFFF; Numeric_Value; NaN 1125 | 1126 | # Other_Alphabetic (OAlpha) 1127 | 1128 | OAlpha; N ; No ; F ; False 1129 | OAlpha; Y ; Yes ; T ; True 1130 | 1131 | # Other_Default_Ignorable_Code_Point (ODI) 1132 | 1133 | ODI; N ; No ; F ; False 1134 | ODI; Y ; Yes ; T ; True 1135 | 1136 | # Other_Grapheme_Extend (OGr_Ext) 1137 | 1138 | OGr_Ext; N ; No ; F ; False 1139 | OGr_Ext; Y ; Yes ; T ; True 1140 | 1141 | # Other_ID_Continue (OIDC) 1142 | 1143 | OIDC; N ; No ; F ; False 1144 | OIDC; Y ; Yes ; T ; True 1145 | 1146 | # Other_ID_Start (OIDS) 1147 | 1148 | OIDS; N ; No ; F ; False 1149 | OIDS; Y ; Yes ; T ; True 1150 | 1151 | # Other_Lowercase (OLower) 1152 | 1153 | OLower; N ; No ; F ; False 1154 | OLower; Y ; Yes ; T ; True 1155 | 1156 | # Other_Math (OMath) 1157 | 1158 | OMath; N ; No ; F ; False 1159 | OMath; Y ; Yes ; T ; True 1160 | 1161 | # Other_Uppercase (OUpper) 1162 | 1163 | OUpper; N ; No ; F ; 
False 1164 | OUpper; Y ; Yes ; T ; True 1165 | 1166 | # Pattern_Syntax (Pat_Syn) 1167 | 1168 | Pat_Syn; N ; No ; F ; False 1169 | Pat_Syn; Y ; Yes ; T ; True 1170 | 1171 | # Pattern_White_Space (Pat_WS) 1172 | 1173 | Pat_WS; N ; No ; F ; False 1174 | Pat_WS; Y ; Yes ; T ; True 1175 | 1176 | # Prepended_Concatenation_Mark (PCM) 1177 | 1178 | PCM; N ; No ; F ; False 1179 | PCM; Y ; Yes ; T ; True 1180 | 1181 | # Quotation_Mark (QMark) 1182 | 1183 | QMark; N ; No ; F ; False 1184 | QMark; Y ; Yes ; T ; True 1185 | 1186 | # Radical (Radical) 1187 | 1188 | Radical; N ; No ; F ; False 1189 | Radical; Y ; Yes ; T ; True 1190 | 1191 | # Regional_Indicator (RI) 1192 | 1193 | RI ; N ; No ; F ; False 1194 | RI ; Y ; Yes ; T ; True 1195 | 1196 | # Script (sc) 1197 | 1198 | sc ; Adlm ; Adlam 1199 | sc ; Aghb ; Caucasian_Albanian 1200 | sc ; Ahom ; Ahom 1201 | sc ; Arab ; Arabic 1202 | sc ; Armi ; Imperial_Aramaic 1203 | sc ; Armn ; Armenian 1204 | sc ; Avst ; Avestan 1205 | sc ; Bali ; Balinese 1206 | sc ; Bamu ; Bamum 1207 | sc ; Bass ; Bassa_Vah 1208 | sc ; Batk ; Batak 1209 | sc ; Beng ; Bengali 1210 | sc ; Bhks ; Bhaiksuki 1211 | sc ; Bopo ; Bopomofo 1212 | sc ; Brah ; Brahmi 1213 | sc ; Brai ; Braille 1214 | sc ; Bugi ; Buginese 1215 | sc ; Buhd ; Buhid 1216 | sc ; Cakm ; Chakma 1217 | sc ; Cans ; Canadian_Aboriginal 1218 | sc ; Cari ; Carian 1219 | sc ; Cham ; Cham 1220 | sc ; Cher ; Cherokee 1221 | sc ; Copt ; Coptic ; Qaac 1222 | sc ; Cprt ; Cypriot 1223 | sc ; Cyrl ; Cyrillic 1224 | sc ; Deva ; Devanagari 1225 | sc ; Dogr ; Dogra 1226 | sc ; Dsrt ; Deseret 1227 | sc ; Dupl ; Duployan 1228 | sc ; Egyp ; Egyptian_Hieroglyphs 1229 | sc ; Elba ; Elbasan 1230 | sc ; Elym ; Elymaic 1231 | sc ; Ethi ; Ethiopic 1232 | sc ; Geor ; Georgian 1233 | sc ; Glag ; Glagolitic 1234 | sc ; Gong ; Gunjala_Gondi 1235 | sc ; Gonm ; Masaram_Gondi 1236 | sc ; Goth ; Gothic 1237 | sc ; Gran ; Grantha 1238 | sc ; Grek ; Greek 1239 | sc ; Gujr ; Gujarati 1240 | sc ; Guru ; Gurmukhi 1241 | sc ; 
Hang ; Hangul 1242 | sc ; Hani ; Han 1243 | sc ; Hano ; Hanunoo 1244 | sc ; Hatr ; Hatran 1245 | sc ; Hebr ; Hebrew 1246 | sc ; Hira ; Hiragana 1247 | sc ; Hluw ; Anatolian_Hieroglyphs 1248 | sc ; Hmng ; Pahawh_Hmong 1249 | sc ; Hmnp ; Nyiakeng_Puachue_Hmong 1250 | sc ; Hrkt ; Katakana_Or_Hiragana 1251 | sc ; Hung ; Old_Hungarian 1252 | sc ; Ital ; Old_Italic 1253 | sc ; Java ; Javanese 1254 | sc ; Kali ; Kayah_Li 1255 | sc ; Kana ; Katakana 1256 | sc ; Khar ; Kharoshthi 1257 | sc ; Khmr ; Khmer 1258 | sc ; Khoj ; Khojki 1259 | sc ; Knda ; Kannada 1260 | sc ; Kthi ; Kaithi 1261 | sc ; Lana ; Tai_Tham 1262 | sc ; Laoo ; Lao 1263 | sc ; Latn ; Latin 1264 | sc ; Lepc ; Lepcha 1265 | sc ; Limb ; Limbu 1266 | sc ; Lina ; Linear_A 1267 | sc ; Linb ; Linear_B 1268 | sc ; Lisu ; Lisu 1269 | sc ; Lyci ; Lycian 1270 | sc ; Lydi ; Lydian 1271 | sc ; Mahj ; Mahajani 1272 | sc ; Maka ; Makasar 1273 | sc ; Mand ; Mandaic 1274 | sc ; Mani ; Manichaean 1275 | sc ; Marc ; Marchen 1276 | sc ; Medf ; Medefaidrin 1277 | sc ; Mend ; Mende_Kikakui 1278 | sc ; Merc ; Meroitic_Cursive 1279 | sc ; Mero ; Meroitic_Hieroglyphs 1280 | sc ; Mlym ; Malayalam 1281 | sc ; Modi ; Modi 1282 | sc ; Mong ; Mongolian 1283 | sc ; Mroo ; Mro 1284 | sc ; Mtei ; Meetei_Mayek 1285 | sc ; Mult ; Multani 1286 | sc ; Mymr ; Myanmar 1287 | sc ; Nand ; Nandinagari 1288 | sc ; Narb ; Old_North_Arabian 1289 | sc ; Nbat ; Nabataean 1290 | sc ; Newa ; Newa 1291 | sc ; Nkoo ; Nko 1292 | sc ; Nshu ; Nushu 1293 | sc ; Ogam ; Ogham 1294 | sc ; Olck ; Ol_Chiki 1295 | sc ; Orkh ; Old_Turkic 1296 | sc ; Orya ; Oriya 1297 | sc ; Osge ; Osage 1298 | sc ; Osma ; Osmanya 1299 | sc ; Palm ; Palmyrene 1300 | sc ; Pauc ; Pau_Cin_Hau 1301 | sc ; Perm ; Old_Permic 1302 | sc ; Phag ; Phags_Pa 1303 | sc ; Phli ; Inscriptional_Pahlavi 1304 | sc ; Phlp ; Psalter_Pahlavi 1305 | sc ; Phnx ; Phoenician 1306 | sc ; Plrd ; Miao 1307 | sc ; Prti ; Inscriptional_Parthian 1308 | sc ; Rjng ; Rejang 1309 | sc ; Rohg ; Hanifi_Rohingya 1310 | sc 
; Runr ; Runic 1311 | sc ; Samr ; Samaritan 1312 | sc ; Sarb ; Old_South_Arabian 1313 | sc ; Saur ; Saurashtra 1314 | sc ; Sgnw ; SignWriting 1315 | sc ; Shaw ; Shavian 1316 | sc ; Shrd ; Sharada 1317 | sc ; Sidd ; Siddham 1318 | sc ; Sind ; Khudawadi 1319 | sc ; Sinh ; Sinhala 1320 | sc ; Sogd ; Sogdian 1321 | sc ; Sogo ; Old_Sogdian 1322 | sc ; Sora ; Sora_Sompeng 1323 | sc ; Soyo ; Soyombo 1324 | sc ; Sund ; Sundanese 1325 | sc ; Sylo ; Syloti_Nagri 1326 | sc ; Syrc ; Syriac 1327 | sc ; Tagb ; Tagbanwa 1328 | sc ; Takr ; Takri 1329 | sc ; Tale ; Tai_Le 1330 | sc ; Talu ; New_Tai_Lue 1331 | sc ; Taml ; Tamil 1332 | sc ; Tang ; Tangut 1333 | sc ; Tavt ; Tai_Viet 1334 | sc ; Telu ; Telugu 1335 | sc ; Tfng ; Tifinagh 1336 | sc ; Tglg ; Tagalog 1337 | sc ; Thaa ; Thaana 1338 | sc ; Thai ; Thai 1339 | sc ; Tibt ; Tibetan 1340 | sc ; Tirh ; Tirhuta 1341 | sc ; Ugar ; Ugaritic 1342 | sc ; Vaii ; Vai 1343 | sc ; Wara ; Warang_Citi 1344 | sc ; Wcho ; Wancho 1345 | sc ; Xpeo ; Old_Persian 1346 | sc ; Xsux ; Cuneiform 1347 | sc ; Yiii ; Yi 1348 | sc ; Zanb ; Zanabazar_Square 1349 | sc ; Zinh ; Inherited ; Qaai 1350 | sc ; Zyyy ; Common 1351 | sc ; Zzzz ; Unknown 1352 | 1353 | # Script_Extensions (scx) 1354 | 1355 | # @missing: 0000..10FFFF; Script_Extensions;