├── .gitignore ├── .hgignore ├── MANIFEST.in ├── AUTHORS ├── transtab ├── Makefile ├── transtab.changes ├── REFERENCES ├── README ├── transcomp ├── transtab.repertoire ├── transtab.missing-MES-2 └── transtab ├── LICENSE ├── setup.py ├── CHANGES ├── scripts └── update_table.py ├── README ├── tests └── test_codec.py └── translitcodec └── __init__.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | MANIFEST 3 | dist/ 4 | -------------------------------------------------------------------------------- /.hgignore: -------------------------------------------------------------------------------- 1 | syntax: glob 2 | 3 | MANIFEST 4 | dist 5 | build 6 | *.py? 7 | 8 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include AUTHORS 2 | include LICENSE 3 | include CHANGES 4 | include README 5 | recursive-include tests *py 6 | recursive-include scripts *py 7 | recursive-include transtab * 8 | -------------------------------------------------------------------------------- /AUTHORS: -------------------------------------------------------------------------------- 1 | translitcodec was originally written by Jason Kirtland in 2008. 2 | 3 | Contributors are: 4 | 5 | - Jason Kirtland 6 | - Craig Dennis 7 | - Piotr Skamruk 8 | - Claude Paroz 9 | - Wojciech Banaś 10 | 11 | The translitcodec source distribution includes the 'transtab' package 12 | by Markus Kuhn . 
13 | -------------------------------------------------------------------------------- /transtab/Makefile: -------------------------------------------------------------------------------- 1 | TARGETS=transtab transtab.repertoire transtab.missing-MES-2 transtab.changes 2 | 3 | all: $(TARGETS) 4 | 5 | # transtab.utf is the file that should be edited 6 | 7 | transtab: transtab.utf 8 | format=iso ./transcomp $< >$@ 9 | format=isoutf ./transcomp transtab >transtab.utf 10 | 11 | transtab.repertoire: transtab 12 | format=utf ./transcomp transtab >$@ 13 | 14 | transtab.missing-MES-2: transtab.repertoire 15 | uniset + ../MES-2 - transtab.repertoire - 0000-007f clean table | \ 16 | format=isoutf ./transcomp - >$@ 17 | 18 | transtab.missing-TARGET1: transtab.repertoire 19 | uniset + ../../font/ucs-fonts/TARGET1 - transtab.repertoire \ 20 | - 0000-007f clean table | \ 21 | format=isoutf ./transcomp - >$@ 22 | 23 | transtab.changes: transtab.utf 24 | rlog $< >$@ 25 | 26 | distribution: $(TARGETS) 27 | ci -l transtab.utf 28 | cd .. 
; tar cvf transtab.tar \ 29 | transtab/README transtab/REFERENCES transtab/Makefile \ 30 | transtab/transcomp \ 31 | transtab/transtab.utf $(TARGETS:%=transtab/%) ; \ 32 | gzip -9f transtab.tar ; \ 33 | mv transtab.tar.gz $(HOME)/.www/download/ 34 | 35 | clean: 36 | rm -f *~ 37 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2008 Jason Kirtland 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a 4 | copy of this software and associated documentation files (the 5 | "Software"), to deal in the Software without restriction, including 6 | without limitation the rights to use, copy, modify, merge, publish, 7 | distribute, sublicense, and/or sell copies of the Software, and to 8 | permit persons to whom the Software is furnished to do so, subject to 9 | the following conditions: 10 | 11 | The above copyright notice and this permission notice shall be included 12 | in all copies or substantial portions of the Software. 13 | 14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 15 | OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 16 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 17 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 18 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 19 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 20 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
21 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import codecs 2 | from setuptools import setup 3 | 4 | 5 | lines = codecs.open('README', 'r', 'utf-8').readlines()[3:] 6 | lines.append('\n') 7 | lines.extend(codecs.open('CHANGES', 'r', 'utf-8').readlines()[1:]) 8 | desc = ''.join(lines).lstrip() 9 | 10 | import translitcodec 11 | version = translitcodec.__version__ 12 | 13 | setup(name='translitcodec', 14 | version=version, 15 | description='Unicode to 8-bit charset transliteration codec', 16 | long_description=desc, 17 | long_description_content_type='text/x-rst', 18 | author='Jason Kirtland', 19 | author_email='jek@discorporate.us', 20 | url='https://github.com/claudep/translitcodec', 21 | packages=['translitcodec'], 22 | license='MIT License', 23 | python_requires='>=3', 24 | classifiers=[ 25 | 'Development Status :: 5 - Production/Stable', 26 | 'Intended Audience :: Developers', 27 | 'License :: OSI Approved :: MIT License', 28 | 'Operating System :: OS Independent', 29 | 'Programming Language :: Python', 30 | 'Programming Language :: Python :: 3', 31 | 'Programming Language :: Python :: 3 :: Only', 32 | 'Programming Language :: Python :: Implementation :: CPython', 33 | 'Topic :: Software Development :: Libraries', 34 | 'Topic :: Utilities', 35 | ], 36 | ) 37 | -------------------------------------------------------------------------------- /CHANGES: -------------------------------------------------------------------------------- 1 | ===================== 2 | translitcodec Changes 3 | ===================== 4 | 5 | 0.7.0 6 | ----- 7 | Released on May 8, 2021 8 | 9 | - Added support for error handles 10 | - Fixed conversion of the German eszett char 11 | 12 | 0.6.0 13 | ----- 14 | Released on December 13, 2020 15 | 16 | - Add support for Python 3.9 17 | 18 | 0.5.2 19 | ----- 20 | Released on January 19, 2020 21 | 22 | - Install 
package with setuptools 23 | 24 | 0.5.1 25 | ----- 26 | Released on January 19, 2020 27 | 28 | - Add python_requires to prevent installation with Python 2 packages 29 | 30 | 0.5 31 | --- 32 | Released on January 18, 2020 33 | 34 | - Complete coverage of the Vietnamese alphabet 35 | 36 | - Removed Python 2 support 37 | 38 | 0.4 39 | --- 40 | Released on May 11, 2015 41 | 42 | - Added Python 3 compatibility 43 | 44 | 0.3 45 | --- 46 | 47 | Released on February 14, 2011 48 | 49 | - Fixes to the transtab table rebuilding tool. 50 | 51 | - Added translitcodec.__version__ 52 | 53 | 0.2 54 | --- 55 | 56 | Released on January 27, 2011 57 | 58 | - Resolves issue of "TypeError: character mapping must return integer, 59 | None or unicode" when a blank value (eg: \N{ZERO WIDTH SPACE} \u200B) 60 | was encoded. Unicode blanks are now returned. 61 | 62 | - Characters in the ASCII range are no longer included in the translation 63 | tables. 64 | 65 | 0.1 66 | --- 67 | 68 | Released on December 28, 2008 69 | 70 | - Initial packaged release. 
71 | -------------------------------------------------------------------------------- /transtab/transtab.changes: -------------------------------------------------------------------------------- 1 | 2 | RCS file: RCS/transtab.utf,v 3 | Working file: transtab.utf 4 | head: 1.8 5 | branch: 6 | locks: strict 7 | mgk25: 1.8 8 | access list: 9 | symbolic names: 10 | keyword substitution: kv 11 | total revisions: 8; selected revisions: 8 12 | description: 13 | Transliteration table in ISO/IEC TR 14652 format 14 | ---------------------------- 15 | revision 1.8 locked by: mgk25; 16 | date: 2000-10-12 11:01:28+01; author: mgk25; state: Exp; lines: +2 -0 17 | RCS id added 18 | ---------------------------- 19 | revision 1.7 20 | date: 2000-10-12 09:38:41+01; author: mgk25; state: Exp; lines: +4 -4 21 | added ae->a 22 | ---------------------------- 23 | revision 1.6 24 | date: 2000-10-10 09:13:10+01; author: mgk25; state: Exp; lines: +20 -20 25 | Byrial Jensen added transliterations for 26 | Esperanto, such that C, G, H, J, S with circumflex are presented 27 | by the base character followed by an H. 
28 | ---------------------------- 29 | revision 1.5 30 | date: 2000-10-09 11:33:26+01; author: mgk25; state: Exp; lines: +2 -2 31 | *** empty log message *** 32 | ---------------------------- 33 | revision 1.4 34 | date: 2000-10-09 11:23:38+01; author: mgk25; state: Exp; lines: +541 -232 35 | *** empty log message *** 36 | ---------------------------- 37 | revision 1.3 38 | date: 2000-10-09 00:35:12+01; author: mgk25; state: Exp; lines: +372 -0 39 | *** empty log message *** 40 | ---------------------------- 41 | revision 1.2 42 | date: 2000-10-08 23:43:06+01; author: mgk25; state: Exp; lines: +88 -4 43 | *** empty log message *** 44 | ---------------------------- 45 | revision 1.1 46 | date: 2000-10-08 23:19:22+01; author: mgk25; state: Exp; 47 | Initial revision 48 | ============================================================================= 49 | -------------------------------------------------------------------------------- /transtab/REFERENCES: -------------------------------------------------------------------------------- 1 | 2 | Some Literature References on Transliteration and Transcription 3 | --------------------------------------------------------------- 4 | 5 | Markus Kuhn -- 2000-10-12 6 | 7 | 8 | Arabic 9 | 10 | ISO 233:1984 Documentation -- Transliteration of Arabic characters 11 | into Latin characters 12 | 13 | ISO 233-2:1993 Information and documentation -- Transliteration of 14 | Arabic characters into Latin characters -- Part 2: Arabic language 15 | -- Simplified transliteration 16 | 17 | ISO 233-3:1999 Information and documentation -- Transliteration of 18 | Arabic characters into Latin characters -- Part 3: Persian language 19 | -- Simplified transliteration (available in English only) 20 | 21 | Armenian 22 | 23 | ISO 9985:1996 Information and documentation -- Transliteration of 24 | Armenian characters into Latin characters 25 | 26 | Esperanto 27 | 28 | L.L. 
Zamenhof: Fundamento de Esperanto, 1905 29 | 30 | http://www.esperanto.net/veb/faq-15.html 31 | 32 | Georgian 33 | 34 | ISO 9984:1996 Information and documentation -- Transliteration of 35 | Georgian characters into Latin characters 36 | 37 | Hebrew 38 | 39 | ISO 259:1984 Documentation -- Transliteration of Hebrew characters 40 | into Latin characters 41 | 42 | ISO 259-2:1994 Information and documentation -- Transliteration of 43 | Hebrew characters into Latin characters -- Part 2: Simplified 44 | transliteration 45 | 46 | International Phonetic Alphabet 47 | 48 | http://www.hpl.hp.com/personal/Evan_Kirshenbaum/IPA/faq.html 49 | 50 | Korean 51 | 52 | ISO/TR 11941:1996 Information and documentation -- Transliteration 53 | of Korean script into Latin characters 54 | 55 | Russian 56 | 57 | ISO 9:1995 Information and documentation -- Transliteration of 58 | Cyrillic characters into Latin characters -- Slavic and non-Slavic 59 | languages 60 | 61 | Thai 62 | 63 | ISO 11940:1998 Information and documentation -- Transliteration of 64 | Thai 65 | 66 | -------------------------------------------------------------------------------- /scripts/update_table.py: -------------------------------------------------------------------------------- 1 | """ 2 | Updates translitcodec/__init__.py with translation table information 3 | built from the 'transtab' database. 4 | 5 | :copyright: the translitcodec authors and developers, see AUTHORS. 6 | :license: MIT, see LICENSE for more details. 
7 | """ 8 | import csv 9 | import os 10 | import sys 11 | 12 | 13 | csv.register_dialect('transtab', delimiter=';') 14 | 15 | 16 | def read_table(path='transtab/transtab'): 17 | long, short, single = {}, {}, {} 18 | 19 | with open(path) as fh: 20 | for line in fh.readlines(): 21 | if not line.startswith('<'): 22 | continue 23 | from_spec, raw_to = line.strip().split(' ', 1) 24 | from_ord = int(from_spec[2:-1], 16) 25 | if from_ord <= 128: 26 | continue 27 | 28 | raw = next(csv.reader([raw_to], 'transtab')) 29 | long_char = _unpack_uchrs(raw[0]) 30 | if len(raw) < 2: 31 | short_char = long_char 32 | else: 33 | short_char = _unpack_uchrs(raw[1]) 34 | 35 | long[from_ord] = long_char 36 | short[from_ord] = short_char 37 | if len(short_char) == 1: 38 | single[from_ord] = short_char 39 | return long, short, single 40 | 41 | 42 | def _unpack_uchrs(packed): 43 | chunks = packed.replace(''): 56 | bucket = old 57 | 58 | with open(path, 'w') as fh: 59 | fh.writelines(preamble) 60 | fh.write("\n") 61 | _dump_dict(fh, 'long_table', long) 62 | _dump_dict(fh, 'short_table', short) 63 | _dump_dict(fh, 'single_table', single) 64 | fh.write("\n") 65 | fh.writelines(postamble) 66 | 67 | 68 | def _dump_dict(fh, name, data): 69 | fh.write("%s = {\n" % name) 70 | for pair in sorted(data.items()): 71 | fh.write(" %r: %r,\n" % pair) 72 | fh.write("}\n\n") 73 | 74 | if __name__ == '__main__': 75 | if not (os.path.exists('translitcodec') and os.path.exists('transtab')): 76 | print("Can not find translitcodec/ and transtab/ directories.") 77 | sys.exit(-1) 78 | tables = read_table() 79 | update_inclusion(*tables) 80 | print("Updated.") 81 | -------------------------------------------------------------------------------- /README: -------------------------------------------------------------------------------- 1 | Unicode to 8-bit charset transliteration codec. 
2 | 3 | This package contains codecs for transliterating ISO 10646 texts into 4 | best-effort representations using smaller coded character sets (ASCII, 5 | ISO 8859, etc.). The translation tables used by the codecs are from 6 | the ``transtab`` collection by Markus Kuhn. 7 | 8 | Three types of transliterating codecs are provided: 9 | 10 | "long", using as many characters as needed to make a natural 11 | replacement. For example, \u00e4 LATIN SMALL LETTER A WITH 12 | DIAERESIS ``ä`` will be replaced with ``ae``. 13 | 14 | "short", using the minimum number of characters to make a 15 | replacement. For example, \u00e4 LATIN SMALL LETTER A WITH 16 | DIAERESIS ``ä`` will be replaced with ``a``. 17 | 18 | "one", only performing single character replacements. Characters 19 | that cannot be transliterated with a single character are passed 20 | through unchanged. For example, \u2639 WHITE FROWNING FACE ``☹`` 21 | will be passed through unchanged. 22 | 23 | Using the codecs is simple:: 24 | 25 | >>> import translitcodec 26 | >>> import codecs 27 | >>> codecs.encode('fácil € ☺', 'translit/long') 28 | 'facil EUR :-)' 29 | >>> codecs.encode('fácil € ☺', 'translit/short') 30 | 'facil E :-)' 31 | 32 | The codecs return Unicode strings by default. To receive a bytestring back, 33 | either chain the output of encode() to another codec, or append the 34 | name of the desired byte encoding to the codec name:: 35 | 36 | >>> codecs.encode('fácil € ☺', 'translit/one').encode('ascii', 'replace') 37 | b'facil E ?' 38 | >>> 'fácil € ☺'.encode('translit/one/ascii', 'replace') 39 | b'facil E ?' 40 | 41 | The package also supplies a 'transliterate' codec, an alias for 42 | 'translit/long'. 43 | 44 | Another way to use the library is through an error handler. 
45 | The following error handlers are available: 46 | 47 | * 'strict/translit/long', 'strict/translit/short', 'strict/translit/one' - similar to 'strict' 48 | * 'ignore/translit/long', 'ignore/translit/short', 'ignore/translit/one' - similar to 'ignore' 49 | * 'replace/translit/long', 'replace/translit/short', 'replace/translit/one' - similar to 'replace' 50 | 51 | These error handlers work like Python's built-in ones, 52 | except that transliteration is attempted first:: 53 | 54 | >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'replace/translit/long').decode('ISO-8859-2') 55 | 'Zażółć gęślą jaźń EUR :-)?!@#' 56 | >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'replace/translit/short').decode('ISO-8859-2') 57 | 'Zażółć gęślą jaźń E :-)?!@#' 58 | >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'replace/translit/one').decode('ISO-8859-2') 59 | 'Zażółć gęślą jaźń E ??!@#' 60 | >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'ignore/translit/long').decode('ISO-8859-2') 61 | 'Zażółć gęślą jaźń EUR :-)!@#' 62 | >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'ignore/translit/short').decode('ISO-8859-2') 63 | 'Zażółć gęślą jaźń E :-)!@#' 64 | >>> codecs.encode('Zażółć gęślą jaźń € ☺另!@#', 'ISO-8859-2', 'ignore/translit/one').decode('ISO-8859-2') 65 | 'Zażółć gęślą jaźń E !@#' 66 | -------------------------------------------------------------------------------- /tests/test_codec.py: -------------------------------------------------------------------------------- 1 | """Very basic codec tests. 2 | 3 | :copyright: the translitcodec authors and developers, see AUTHORS. 4 | :license: MIT, see LICENSE for more details. 
5 | 6 | """ 7 | import codecs 8 | import translitcodec 9 | from unittest import TestCase 10 | 11 | 12 | class CodecTests(TestCase): 13 | data = '£ ☹ wøóf méåw' 14 | 15 | def test_default(self): 16 | assert codecs.encode(self.data, 'transliterate') == 'GBP :-( woof meaaw' 17 | 18 | def test_translit_long(self): 19 | assert codecs.encode(self.data, 'translit/long') == 'GBP :-( woof meaaw' 20 | 21 | def test_translit_short(self): 22 | assert codecs.encode(self.data, 'translit/short') == 'GBP :-( woof meaw' 23 | 24 | def test_translit_one(self): 25 | assert codecs.encode(self.data, 'translit/one') == '\u00a3 \u2639 woof meaw' 26 | 27 | def test_translit_long_ascii(self): 28 | assert self.data.encode('translit/long/ascii') == b'GBP :-( woof meaaw' 29 | 30 | def test_translit_short_ascii(self): 31 | assert self.data.encode('translit/short/ascii') == b'GBP :-( woof meaw' 32 | 33 | def test_translit_one_ascii(self): 34 | try: 35 | codecs.encode(self.data, 'translit/one/ascii') 36 | assert False 37 | except UnicodeEncodeError: 38 | assert True 39 | 40 | assert codecs.encode(self.data, 'translit/one/ascii', 'replace') == b'? ? 
woof meaw' 41 | 42 | def test_ascii_level_characters_remain(self): 43 | assert codecs.encode("'", 'translit/long') == "'" 44 | 45 | def test_zero_width_space(self): 46 | try: 47 | char = codecs.encode('\u200b', 'translit/long') 48 | assert char == '' 49 | except TypeError: 50 | assert False 51 | 52 | 53 | class AlphabetTests(TestCase): 54 | def test_vietnamese(self): 55 | alphabet_upper = 'AĂÂBCDĐEÊGHIKLMNOÔƠPQRSTUƯVXY' 56 | alphabet_lower = 'aăâbcdđeêghiklmnoôơpqrstuưvxy' 57 | self.assertEqual( 58 | codecs.encode(alphabet_upper, 'transliterate'), 59 | 'AAABCDDEEGHIKLMNOOOPQRSTUUVXY' 60 | ) 61 | self.assertEqual( 62 | codecs.encode(alphabet_lower, 'transliterate'), 63 | 'aaabcddeeghiklmnooopqrstuuvxy' 64 | ) 65 | 66 | 67 | class ErrorHandlersTests(TestCase): 68 | data = 'Zażółć gęślą jaźń € ☺另!@#' 69 | page = 'ISO-8859-2' 70 | 71 | def _process(self, error_handler_name): 72 | return codecs.encode(self.data, self.page, error_handler_name).decode(self.page) 73 | 74 | def test_replace_long(self): 75 | assert self._process('replace/translit/long') == 'Zażółć gęślą jaźń EUR :-)?!@#' 76 | 77 | def test_replace_short(self): 78 | assert self._process('replace/translit/short') == 'Zażółć gęślą jaźń E :-)?!@#' 79 | 80 | def test_replace_one(self): 81 | assert self._process('replace/translit/one') == 'Zażółć gęślą jaźń E ??!@#' 82 | 83 | def test_ignore_long(self): 84 | assert self._process('ignore/translit/long') == 'Zażółć gęślą jaźń EUR :-)!@#' 85 | 86 | def test_ignore_short(self): 87 | assert self._process('ignore/translit/short') == 'Zażółć gęślą jaźń E :-)!@#' 88 | 89 | def test_ignore_one(self): 90 | assert self._process('ignore/translit/one') == 'Zażółć gęślą jaźń E !@#' 91 | 92 | def test_strict_long(self): 93 | with self.assertRaises(UnicodeEncodeError): 94 | self._process('strict/translit/long') 95 | 96 | def test_strict_short(self): 97 | with self.assertRaises(UnicodeEncodeError): 98 | self._process('strict/translit/short') 99 | 100 | def test_strict_one(self): 
101 | with self.assertRaises(UnicodeEncodeError): 102 | self._process('strict/translit/one') 103 | -------------------------------------------------------------------------------- /transtab/README: -------------------------------------------------------------------------------- 1 | 2 | Unicode to 8-bit charset transliteration table 3 | ---------------------------------------------- 4 | 5 | Markus Kuhn -- 2000-10-09 6 | 7 | 8 | This package contains a table for transliterating ISO 10646 texts into 9 | best-effort representations using smaller coded character sets (ASCII, 10 | ISO 8859, etc.). It is primarily intended for inclusion into the GNU C 11 | library, but might be of use for other applications as well. The table 12 | is freely available to anyone. 13 | 14 | Files: 15 | 16 | transtab 17 | 18 | This is the table in the format suggested in ISO/IEC TR 14652 19 | 20 | 21 | transtab.utf 22 | 23 | Same as transtab, but with added comments that show the strings 24 | encoded in UTF-8. This is the file that should be edited to make 25 | changes. The makefile will build the others from this one. 26 | 27 | transtab.repertoire 28 | 29 | List of characters covered by transtab suitable for feeding into 30 | uniset. Also contains the UTF-8 strings as comments. 31 | 32 | transtab.missing-MES-2 33 | 34 | List of characters in CEN MES-2 minus those in transtab.repertoire. 35 | Intended to help getting an overview of what is and what is not 36 | covered. Transtab does not aim to cover MES-2 completely. It aims 37 | to provide transliterations only for those characters where they are 38 | feasible. 39 | 40 | transcomp 41 | 42 | Perl script to reformat and merge transliteration tables 43 | 44 | 45 | The transliteration table contains a list of substitution strings 46 | for each member of the covered Unicode subset. 
Applications are 47 | expected to use this list as follows: 48 | 49 | - Remove all substitution strings that contain Unicode characters 50 | that are not available in the destination character set. 51 | 52 | - Remove all substitution strings that are longer (or shorter) than 53 | required by the application (in particular, some applications 54 | might need substitution strings that are exactly one character 55 | long). 56 | 57 | - Of the remaining substitution strings, pick the first one 58 | in the list. 59 | 60 | - If no substitution string remains for a Unicode character, 61 | use a default character such as, for instance, "?". 62 | 63 | Applications are not required or supposed to recursively substitute 64 | Unicode characters found in substitution strings. 65 | 66 | The substitution strings make no use of combining characters, that is 67 | the output will be ISO 10646 Level 1. The input strings should preferably 68 | be normalized into decomposed form first. 69 | 70 | The substitution strings in this table aim to be visually or 71 | semantically equivalent to the characters they replace. Ideally, they 72 | should correspond to the fallback notation that people naturally use 73 | in email or on typewriters to substitute unavailable characters. They 74 | are not intended as unique mnemonics for characters (such as for 75 | example those in RFC 1345). 76 | 77 | If you use transliteration in C library locales, please make sure that 78 | the X/Open functions wcwidth() and wcswidth() accurately predict how 79 | many character cell positions the cursor will advance, even when 80 | transliteration is used. This is essential to allow applications to 81 | perform correct terminal screen layout even when multi-character 82 | transliterations are used. 
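The selection procedure listed above (drop candidates the destination character set cannot represent, drop candidates of the wrong length, take the first survivor, otherwise fall back to a default) can be sketched in Python. This is a minimal illustration, not code from transtab or translitcodec: the function name `pick_substitution` and its parameters are hypothetical, and only a maximum length is enforced, not a minimum.

```python
def pick_substitution(candidates, target_encoding="ascii", max_len=None, default="?"):
    """Return the first candidate usable in the target charset (hypothetical helper).

    candidates: substitution strings in table order, e.g. ["ae", "a"].
    target_encoding: any Python codec name standing in for the destination set.
    max_len: optional upper bound on the substitution length.
    default: returned when no candidate survives the filters.
    """
    for candidate in candidates:
        try:
            # Skip candidates that use characters missing from the destination set.
            candidate.encode(target_encoding)
        except UnicodeEncodeError:
            continue
        # Skip candidates longer than the application allows.
        if max_len is not None and len(candidate) > max_len:
            continue
        # The first remaining candidate in list order wins.
        return candidate
    return default


# Candidate lists copied from transtab.repertoire entries.
print(pick_substitution(["ae", "a"]))             # U+00E4, no length limit
print(pick_substitution(["ae", "a"], max_len=1))  # U+00E4, single character only
print(pick_substitution(["μ", "u"]))              # U+00B5, ASCII destination
```

Because the filters run in table order, the ordering of the substitution strings in the table decides which fallback is preferred; a richer destination set such as ISO 8859-7 would keep the Greek 'μ' for U+00B5 instead of falling through to 'u'.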
83 | 84 | The latest version of this package is available from 85 | 86 | http://www.cl.cam.ac.uk/~mgk25/download/transtab.tar.gz 87 | 88 | Please send comments and patches (preferably diff -u on transtab.utf) to 89 | 90 | Markus.Kuhn@cl.cam.ac.uk 91 | 92 | Acknowledgements: Some parts of this table were inspired and recycled 93 | from the def7_uni.tbl file in lynx-2.8.4. 94 | 95 | Enjoy ... 96 | 97 | Markus 98 | 99 | -- 100 | Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK 101 | Email: mkuhn at acm.org, WWW: 102 | -------------------------------------------------------------------------------- /transtab/transcomp: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | 3 | sub utf8 ($) { 4 | my $c = shift(@_); 5 | 6 | if ($c < 0x80) { 7 | return sprintf("%c", $c); 8 | } elsif ($c < 0x800) { 9 | return sprintf("%c%c", 0xc0 | ($c >> 6), 0x80 | ($c & 0x3f)); 10 | } elsif ($c < 0x10000) { 11 | return sprintf("%c%c%c", 12 | 0xe0 | ($c >> 12), 13 | 0x80 | (($c >> 6) & 0x3f), 14 | 0x80 | ( $c & 0x3f)); 15 | } elsif ($c < 0x200000) { 16 | return sprintf("%c%c%c%c", 17 | 0xf0 | ($c >> 18), 18 | 0x80 | (($c >> 12) & 0x3f), 19 | 0x80 | (($c >> 6) & 0x3f), 20 | 0x80 | ( $c & 0x3f)); 21 | } elsif ($c < 0x4000000) { 22 | return sprintf("%c%c%c%c%c", 23 | 0xf8 | ($c >> 24), 24 | 0x80 | (($c >> 18) & 0x3f), 25 | 0x80 | (($c >> 12) & 0x3f), 26 | 0x80 | (($c >> 6) & 0x3f), 27 | 0x80 | ( $c & 0x3f)); 28 | 29 | } elsif ($c < 0x80000000) { 30 | return sprintf("%c%c%c%c%c%c", 31 | 0xfe | ($c >> 30), 32 | 0x80 | (($c >> 24) & 0x3f), 33 | 0x80 | (($c >> 18) & 0x3f), 34 | 0x80 | (($c >> 12) & 0x3f), 35 | 0x80 | (($c >> 6) & 0x3f), 36 | 0x80 | ( $c & 0x3f)); 37 | } else { 38 | return utf8(0xfffd); 39 | } 40 | } 41 | 42 | sub append_translit { 43 | my ($ucs, $t) = @_; 44 | 45 | $ucs =~ /^[0-9A-F]{4}$/ || die("ERROR: append_translit('$ucs','$t')\n"); 46 | $t =~ /^([0-9A-F]{4})*$/ || die("ERROR: append_translit('$ucs','$t')\n"); 
47 | #print STDERR "append_translit('$ucs','$t')\n"; 48 | if (!defined($trans{$ucs})) { 49 | $trans{$ucs} = []; 50 | } 51 | push(@{$trans{$ucs}}, $t); 52 | } 53 | 54 | $unicodedata = "UnicodeData.txt"; 55 | $datadir = "$ENV{HOME}/local/lib/ucs"; 56 | 57 | # read list of all Unicode names 58 | if (!open(UDATA, $unicodedata) && !open(UDATA, "$datadir/$unicodedata")) { 59 | die ("Can't open Unicode database '$unicodedata':\n$!\n\n" . 60 | "Please make sure that you have downloaded the file\n" . 61 | "ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt\n"); 62 | } 63 | while (<UDATA>) { 64 | if (/^([0-9,A-F]{4,6});([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);([^;]*)$/) { 65 | $name{$1} = $2; 66 | } else { 67 | die("Syntax error in line '$_' in file '$unicodedata'"); 68 | } 69 | } 70 | close(UDATA); 71 | 72 | while (<>) { 73 | next if /^\s*[\%\#]/; 74 | next if /^\s*$/; 75 | if (/^([0-9a-fA-F]{4})\s*(\#.*)?$/) { 76 | # uniset table format 77 | $ucs = $1; 78 | $ucs =~ tr/a-f/A-F/; 79 | if (!$trans{$ucs}) { 80 | append_translit($ucs, ""); 81 | } 82 | } elsif (/^\s*<U([0-9a-fA-F]{4})>\s+(.*?)(\%.*)?$/) { 83 | # ISO/IEC TR 14652 format 84 | $ucs = $1; 85 | $ucs =~ tr/a-f/A-F/; 86 | $_ = $2; 87 | while (1) { 88 | if (/^<U([0-9a-fA-F]{4})>;?(.*)$/) { 89 | $t = $1; 90 | $t =~ tr/a-f/A-F/; 91 | $_=$2; 92 | append_translit($ucs, $t); 93 | } elsif (/^\"((?:<U[0-9a-fA-F]{4}>)*)\";?(.*)$/) { 94 | $t = $1; 95 | $_ = $2; 96 | $t =~ tr/a-f/A-F/; 97 | $t =~ s/<U//g; 98 | $t =~ s/>//g; 99 | append_translit($ucs, $t); 100 | } elsif (/^\"([^<\"]+)\";?(.*)$/) { 101 | $_ = $2; 102 | $t = ""; 103 | for ($i = 0; $i < length($1); $i++) { 104 | $t .= sprintf("%04X", ord(substr($1,$i,1))); 105 | } 106 | append_translit($ucs, $t); 107 | } elsif (/^\s*\%/ || /^\s*$/) { 108 | last; 109 | } else { 110 | die("parsing problem: '$_'\n"); 111 | } 112 | } 113 | } elsif (/^U\+([0-9a-fA-F]{4}):(.*)$/ || 114 | /^U\+([0-9a-fA-F]{4})\s*\"(.*)\"\s*(\#.*)?$/) { 115 | # Lynx format 116 | $ucs = $1; 117 | $ucs =~ tr/a-f/A-F/; 118 | $t = 
""; 119 | for ($i = 0; $i < length($2); $i++) { 120 | $t .= sprintf("%04X", ord(substr($2,$i,1))); 121 | } 122 | append_translit($ucs, $t); 123 | } elsif (/0x([0-9a-fA-F]{2})\s*(.*?)\s*(\#.*)?$/) { 124 | # Lynx format 125 | $t = hex($1); 126 | $_=$2; 127 | while ($_) { 128 | if (/^U\+([0-9a-fA-F]{4})-U\+([0-9a-fA-F]{4})\s*(.*)$/) { 129 | $ucs1=$1; 130 | $ucs2=$2; 131 | $_=$3; 132 | for ($ucs=hex($ucs1); $ucs <= hex($ucs2); $ucs++) { 133 | append_translit(sprintf("%04X", $ucs), 134 | sprintf("%04X", $t)); 135 | } 136 | } elsif (/^U\+([0-9a-fA-F]{4})\s*(.*)$/) { 137 | $_=$2; 138 | append_translit(sprintf("%04X", hex($1)), 139 | sprintf("%04X", $t)); 140 | } else { 141 | print STDERR "Can't handle suffix: '$_'\n"; 142 | last; 143 | } 144 | } 145 | } else { 146 | print STDERR "Can't handle: $_"; 147 | } 148 | } 149 | 150 | $ENV{format} = iso if !$ENV{format}; 151 | 152 | if ($ENV{format} =~ /^iso/) { 153 | # output in ISO/IEC DTR 14652 format 154 | print "% \$Id: \$\n\n"; 155 | for $ucs (sort(keys(%trans))) { 156 | print "% $name{$ucs}\n"; 157 | if ($ENV{format} eq isoutf) { 158 | print "% " . utf8(hex($ucs)) . " -> "; 159 | @l = @{$trans{$ucs}}; 160 | while ($t = shift @l) { 161 | print "'"; 162 | while ($t =~ /^(....)/) { 163 | $t = $'; 164 | print utf8(hex($1)); 165 | } 166 | print "'"; 167 | print ", " if @l; 168 | } 169 | print "\n"; 170 | } 171 | print "<U$ucs> "; 172 | @l = @{$trans{$ucs}}; 173 | while (defined($t = shift @l)) { 174 | if (length($t) == 4) { 175 | print "<U$t>"; 176 | } else { 177 | $t =~ s/(....)/<U$1>/g; 178 | print "\"$t\""; 179 | } 180 | print ";" if @l; 181 | } 182 | print "\n"; 183 | } 184 | } 185 | 186 | if ($ENV{format} eq utf) { 187 | for $ucs (sort(keys(%trans))) { 188 | print "U+$ucs # " . utf8(hex($ucs)) . 
" -> "; 189 | @l = @{$trans{$ucs}}; 190 | while ($t = shift @l) { 191 | print "'"; 192 | while ($t =~ /^(....)/) { 193 | $t = $'; 194 | print utf8(hex($1)); 195 | } 196 | print "'"; 197 | print ", " if @l; 198 | } 199 | print "\n"; 200 | } 201 | } 202 | -------------------------------------------------------------------------------- /transtab/transtab.repertoire: -------------------------------------------------------------------------------- 1 | U+0027 # ' -> '’' 2 | U+0060 # ` -> '‛', '‘' 3 | U+00A0 #   -> ' ' 4 | U+00A1 # ¡ -> '!' 5 | U+00A2 # ¢ -> 'c' 6 | U+00A3 # £ -> 'GBP' 7 | U+00A5 # ¥ -> 'Y' 8 | U+00A6 # ¦ -> '|' 9 | U+00A7 # § -> 'S' 10 | U+00A8 # ¨ -> '"' 11 | U+00A9 # © -> '(c)', 'c' 12 | U+00AA # ª -> 'a' 13 | U+00AB # « -> '<<' 14 | U+00AC # ¬ -> '-' 15 | U+00AD # ­ -> '-' 16 | U+00AE # ® -> '(R)' 17 | U+00AF # ¯ -> '-' 18 | U+00B0 # ° -> ' ' 19 | U+00B1 # ± -> '+/-' 20 | U+00B2 # ² -> '^2', '2' 21 | U+00B3 # ³ -> '^3', '3' 22 | U+00B4 # ´ -> ''' 23 | U+00B5 # µ -> 'μ', 'u' 24 | U+00B6 # ¶ -> 'P' 25 | U+00B7 # · -> '.' 26 | U+00B8 # ¸ -> ',' 27 | U+00B9 # ¹ -> '^1', '1' 28 | U+00BA # º -> 'o' 29 | U+00BB # » -> '>>' 30 | U+00BC # ¼ -> ' 1/4' 31 | U+00BD # ½ -> ' 1/2' 32 | U+00BE # ¾ -> ' 3/4' 33 | U+00BF # ¿ -> '?' 
34 | U+00C0 # À -> 'A' 35 | U+00C1 # Á -> 'A' 36 | U+00C2 #  -> 'A' 37 | U+00C3 # à -> 'A' 38 | U+00C4 # Ä -> 'Ae', 'A' 39 | U+00C5 # Å -> 'Aa', 'A' 40 | U+00C6 # Æ -> 'AE', 'A' 41 | U+00C7 # Ç -> 'C' 42 | U+00C8 # È -> 'E' 43 | U+00C9 # É -> 'E' 44 | U+00CA # Ê -> 'E' 45 | U+00CB # Ë -> 'E' 46 | U+00CC # Ì -> 'I' 47 | U+00CD # Í -> 'I' 48 | U+00CE # Î -> 'I' 49 | U+00CF # Ï -> 'I' 50 | U+00D0 # Ð -> 'D' 51 | U+00D1 # Ñ -> 'N' 52 | U+00D2 # Ò -> 'O' 53 | U+00D3 # Ó -> 'O' 54 | U+00D4 # Ô -> 'O' 55 | U+00D5 # Õ -> 'O' 56 | U+00D6 # Ö -> 'Oe', 'O' 57 | U+00D7 # × -> 'x' 58 | U+00D8 # Ø -> 'O' 59 | U+00D9 # Ù -> 'U' 60 | U+00DA # Ú -> 'U' 61 | U+00DB # Û -> 'U' 62 | U+00DC # Ü -> 'Ue', 'U' 63 | U+00DD # Ý -> 'Y' 64 | U+00DE # Þ -> 'Th' 65 | U+00DF # ß -> 'ss', 'β' 66 | U+00E0 # à -> 'a' 67 | U+00E1 # á -> 'a' 68 | U+00E2 # â -> 'a' 69 | U+00E3 # ã -> 'a' 70 | U+00E4 # ä -> 'ae', 'a' 71 | U+00E5 # å -> 'aa', 'a' 72 | U+00E6 # æ -> 'ae', 'a' 73 | U+00E7 # ç -> 'c' 74 | U+00E8 # è -> 'e' 75 | U+00E9 # é -> 'e' 76 | U+00EA # ê -> 'e' 77 | U+00EB # ë -> 'e' 78 | U+00EC # ì -> 'i' 79 | U+00ED # í -> 'i' 80 | U+00EE # î -> 'i' 81 | U+00EF # ï -> 'i' 82 | U+00F0 # ð -> 'd' 83 | U+00F1 # ñ -> 'n' 84 | U+00F2 # ò -> 'o' 85 | U+00F3 # ó -> 'o' 86 | U+00F4 # ô -> 'o' 87 | U+00F5 # õ -> 'o' 88 | U+00F6 # ö -> 'oe', 'o' 89 | U+00F7 # ÷ -> ':' 90 | U+00F8 # ø -> 'o' 91 | U+00F9 # ù -> 'u' 92 | U+00FA # ú -> 'u' 93 | U+00FB # û -> 'u' 94 | U+00FC # ü -> 'ue', 'u' 95 | U+00FD # ý -> 'y' 96 | U+00FE # þ -> 'th' 97 | U+00FF # ÿ -> 'y' 98 | U+0100 # Ā -> 'A' 99 | U+0101 # ā -> 'a' 100 | U+0102 # Ă -> 'A' 101 | U+0103 # ă -> 'a' 102 | U+0104 # Ą -> 'A' 103 | U+0105 # ą -> 'a' 104 | U+0106 # Ć -> 'C' 105 | U+0107 # ć -> 'c' 106 | U+0108 # Ĉ -> 'Ch', 'C' 107 | U+0109 # ĉ -> 'ch', 'c' 108 | U+010A # Ċ -> 'C' 109 | U+010B # ċ -> 'c' 110 | U+010C # Č -> 'C' 111 | U+010D # č -> 'c' 112 | U+010E # Ď -> 'D' 113 | U+010F # ď -> 'd' 114 | U+0110 # Đ -> 'D' 115 | U+0111 # đ -> 'd' 116 | U+0112 # Ē 
-> 'E' 117 | U+0113 # ē -> 'e' 118 | U+0114 # Ĕ -> 'E' 119 | U+0115 # ĕ -> 'e' 120 | U+0116 # Ė -> 'E' 121 | U+0117 # ė -> 'e' 122 | U+0118 # Ę -> 'E' 123 | U+0119 # ę -> 'e' 124 | U+011A # Ě -> 'E' 125 | U+011B # ě -> 'e' 126 | U+011C # Ĝ -> 'Gh', 'G' 127 | U+011D # ĝ -> 'gh', 'g' 128 | U+011E # Ğ -> 'G' 129 | U+011F # ğ -> 'g' 130 | U+0120 # Ġ -> 'G' 131 | U+0121 # ġ -> 'g' 132 | U+0122 # Ģ -> 'G' 133 | U+0123 # ģ -> 'g' 134 | U+0124 # Ĥ -> 'Hh', 'H' 135 | U+0125 # ĥ -> 'hh', 'h' 136 | U+0126 # Ħ -> 'H' 137 | U+0127 # ħ -> 'h' 138 | U+0128 # Ĩ -> 'I' 139 | U+0129 # ĩ -> 'i' 140 | U+012A # Ī -> 'I' 141 | U+012B # ī -> 'i' 142 | U+012C # Ĭ -> 'I' 143 | U+012D # ĭ -> 'i' 144 | U+012E # Į -> 'I' 145 | U+012F # į -> 'i' 146 | U+0130 # İ -> 'I' 147 | U+0131 # ı -> 'i' 148 | U+0132 # IJ -> 'IJ' 149 | U+0133 # ij -> 'ij' 150 | U+0134 # Ĵ -> 'Jh', 'J' 151 | U+0135 # ĵ -> 'jh', 'j' 152 | U+0136 # Ķ -> 'K' 153 | U+0137 # ķ -> 'k' 154 | U+0138 # ĸ -> 'k' 155 | U+0139 # Ĺ -> 'L' 156 | U+013A # ĺ -> 'l' 157 | U+013B # Ļ -> 'L' 158 | U+013C # ļ -> 'l' 159 | U+013D # Ľ -> 'L' 160 | U+013E # ľ -> 'l' 161 | U+013F # Ŀ -> 'L·', 'L.', 'L' 162 | U+0140 # ŀ -> 'l·', 'l.', 'l' 163 | U+0141 # Ł -> 'L' 164 | U+0142 # ł -> 'l' 165 | U+0143 # Ń -> 'N' 166 | U+0144 # ń -> 'n' 167 | U+0145 # Ņ -> 'N' 168 | U+0146 # ņ -> 'n' 169 | U+0147 # Ň -> 'N' 170 | U+0148 # ň -> 'n' 171 | U+0149 # ʼn -> ''n' 172 | U+014A # Ŋ -> 'NG', 'N' 173 | U+014B # ŋ -> 'ng', 'n' 174 | U+014C # Ō -> 'O' 175 | U+014D # ō -> 'o' 176 | U+014E # Ŏ -> 'O' 177 | U+014F # ŏ -> 'o' 178 | U+0150 # Ő -> 'O' 179 | U+0151 # ő -> 'o' 180 | U+0152 # Œ -> 'OE' 181 | U+0153 # œ -> 'oe' 182 | U+0154 # Ŕ -> 'R' 183 | U+0155 # ŕ -> 'r' 184 | U+0156 # Ŗ -> 'R' 185 | U+0157 # ŗ -> 'r' 186 | U+0158 # Ř -> 'R' 187 | U+0159 # ř -> 'r' 188 | U+015A # Ś -> 'S' 189 | U+015B # ś -> 's' 190 | U+015C # Ŝ -> 'Sh', 'S' 191 | U+015D # ŝ -> 'sh', 's' 192 | U+015E # Ş -> 'S' 193 | U+015F # ş -> 's' 194 | U+0160 # Š -> 'S' 195 | U+0161 # š -> 's' 196 
| U+0162 # Ţ -> 'T' 197 | U+0163 # ţ -> 't' 198 | U+0164 # Ť -> 'T' 199 | U+0165 # ť -> 't' 200 | U+0166 # Ŧ -> 'T' 201 | U+0167 # ŧ -> 't' 202 | U+0168 # Ũ -> 'U' 203 | U+0169 # ũ -> 'u' 204 | U+016A # Ū -> 'U' 205 | U+016B # ū -> 'u' 206 | U+016C # Ŭ -> 'U' 207 | U+016D # ŭ -> 'u' 208 | U+016E # Ů -> 'U' 209 | U+016F # ů -> 'u' 210 | U+0170 # Ű -> 'U' 211 | U+0171 # ű -> 'u' 212 | U+0172 # Ų -> 'U' 213 | U+0173 # ų -> 'u' 214 | U+0174 # Ŵ -> 'W' 215 | U+0175 # ŵ -> 'w' 216 | U+0176 # Ŷ -> 'Y' 217 | U+0177 # ŷ -> 'y' 218 | U+0178 # Ÿ -> 'Y' 219 | U+0179 # Ź -> 'Z' 220 | U+017A # ź -> 'z' 221 | U+017B # Ż -> 'Z' 222 | U+017C # ż -> 'z' 223 | U+017D # Ž -> 'Z' 224 | U+017E # ž -> 'z' 225 | U+017F # ſ -> 's' 226 | U+0192 # ƒ -> 'f' 227 | U+01A0 # Ơ -> 'O' 228 | U+01A1 # ơ -> 'o' 229 | U+01AF # Ư -> 'U' 230 | U+01B0 # ư -> 'u' 231 | U+0218 # Ș -> 'Ş', 'S' 232 | U+0219 # ș -> 'ş', 's' 233 | U+021A # Ț -> 'Ţ', 'T' 234 | U+021B # ț -> 'ţ', 't' 235 | U+02B9 # ʹ -> '′', ''' 236 | U+02BB # ʻ -> '‘' 237 | U+02BC # ʼ -> '’', ''' 238 | U+02BD # ʽ -> '‛' 239 | U+02C6 # ˆ -> '^' 240 | U+02C8 # ˈ -> ''' 241 | U+02C9 # ˉ -> '¯' 242 | U+02CC # ˌ -> ',' 243 | U+02D0 # ː -> ':' 244 | U+02DA # ˚ -> '°' 245 | U+02DC # ˜ -> '~' 246 | U+02DD # ˝ -> '"' 247 | U+0374 # ʹ -> ''' 248 | U+0375 # ͵ -> ',' 249 | U+037E # ; -> ';' 250 | U+1E02 # Ḃ -> 'B' 251 | U+1E03 # ḃ -> 'b' 252 | U+1E0A # Ḋ -> 'D' 253 | U+1E0B # ḋ -> 'd' 254 | U+1E1E # Ḟ -> 'F' 255 | U+1E1F # ḟ -> 'f' 256 | U+1E40 # Ṁ -> 'M' 257 | U+1E41 # ṁ -> 'm' 258 | U+1E56 # Ṗ -> 'P' 259 | U+1E57 # ṗ -> 'p' 260 | U+1E60 # Ṡ -> 'S' 261 | U+1E61 # ṡ -> 's' 262 | U+1E6A # Ṫ -> 'T' 263 | U+1E6B # ṫ -> 't' 264 | U+1E80 # Ẁ -> 'W' 265 | U+1E81 # ẁ -> 'w' 266 | U+1E82 # Ẃ -> 'W' 267 | U+1E83 # ẃ -> 'w' 268 | U+1E84 # Ẅ -> 'W' 269 | U+1E85 # ẅ -> 'w' 270 | U+1EEE # Ữ -> 'U' 271 | U+1EEF # ữ -> 'u' 272 | U+1EF2 # Ỳ -> 'Y' 273 | U+1EF3 # ỳ -> 'y' 274 | U+2000 #   -> ' ' 275 | U+2001 #   -> ' ' 276 | U+2002 #   -> ' ' 277 | U+2003 #   -> ' ' 278 | 
U+2004 #   -> ' ' 279 | U+2005 #   -> ' ' 280 | U+2006 #   -> ' ' 281 | U+2007 #   -> ' ' 282 | U+2008 #   -> ' ' 283 | U+2009 #   -> ' ' 284 | U+200A #   -> 285 | U+200B # ​ -> 286 | U+200C # ‌ -> 287 | U+200D # ‍ -> 288 | U+200E # ‎ -> 289 | U+200F # ‏ -> 290 | U+2010 # ‐ -> '-' 291 | U+2011 # ‑ -> '-' 292 | U+2012 # ‒ -> '-' 293 | U+2013 # – -> '-' 294 | U+2014 # — -> '--' 295 | U+2015 # ― -> '--' 296 | U+2016 # ‖ -> '||' 297 | U+2017 # ‗ -> '_' 298 | U+2018 # ‘ -> ''' 299 | U+2019 # ’ -> ''' 300 | U+201A # ‚ -> ''' 301 | U+201B # ‛ -> ''' 302 | U+201C # “ -> '"' 303 | U+201D # ” -> '"' 304 | U+201E # „ -> '"' 305 | U+201F # ‟ -> '"' 306 | U+2020 # † -> '+' 307 | U+2021 # ‡ -> '++' 308 | U+2022 # • -> 'o' 309 | U+2023 # ‣ -> '>' 310 | U+2024 # ․ -> '.' 311 | U+2025 # ‥ -> '..' 312 | U+2026 # … -> '...' 313 | U+2027 # ‧ -> '-' 314 | U+202A # ‪ -> 315 | U+202B # ‫ -> 316 | U+202C # ‬ -> 317 | U+202D # ‭ -> 318 | U+202E # ‮ -> 319 | U+202F #   -> ' ' 320 | U+2030 # ‰ -> ' 0/00' 321 | U+2032 # ′ -> ''' 322 | U+2033 # ″ -> '"' 323 | U+2034 # ‴ -> ''''' 324 | U+2035 # ‵ -> '`' 325 | U+2036 # ‶ -> '``' 326 | U+2037 # ‷ -> '```' 327 | U+2039 # ‹ -> '<' 328 | U+203A # › -> '>' 329 | U+203C # ‼ -> '!!' 330 | U+203E # ‾ -> '-' 331 | U+2043 # ⁃ -> '-' 332 | U+2044 # ⁄ -> '/' 333 | U+2048 # ⁈ -> '?!' 334 | U+2049 # ⁉ -> '!?' 
335 | U+204A # ⁊ -> '7' 336 | U+2070 # ⁰ -> '^0', '0' 337 | U+2074 # ⁴ -> '^4', '4' 338 | U+2075 # ⁵ -> '^5', '5' 339 | U+2076 # ⁶ -> '^6', '6' 340 | U+2077 # ⁷ -> '^7', '7' 341 | U+2078 # ⁸ -> '^8', '8' 342 | U+2079 # ⁹ -> '^9', '9' 343 | U+207A # ⁺ -> '^+', '+' 344 | U+207B # ⁻ -> '^-', '-' 345 | U+207C # ⁼ -> '^=', '=' 346 | U+207D # ⁽ -> '^(', '(' 347 | U+207E # ⁾ -> '^)', ')' 348 | U+207F # ⁿ -> '^n', 'n' 349 | U+2080 # ₀ -> '_0', '0' 350 | U+2081 # ₁ -> '_1', '1' 351 | U+2082 # ₂ -> '_2', '2' 352 | U+2083 # ₃ -> '_3', '3' 353 | U+2084 # ₄ -> '_4', '4' 354 | U+2085 # ₅ -> '_5', '5' 355 | U+2086 # ₆ -> '_6', '6' 356 | U+2087 # ₇ -> '_7', '7' 357 | U+2088 # ₈ -> '_8', '8' 358 | U+2089 # ₉ -> '_9', '9' 359 | U+208A # ₊ -> '_+', '+' 360 | U+208B # ₋ -> '_-', '-' 361 | U+208C # ₌ -> '_=', '=' 362 | U+208D # ₍ -> '_(', '(' 363 | U+208E # ₎ -> '_)', ')' 364 | U+20AC # € -> 'EUR', 'E' 365 | U+2100 # ℀ -> 'a/c' 366 | U+2101 # ℁ -> 'a/s' 367 | U+2103 # ℃ -> '°C', 'C' 368 | U+2105 # ℅ -> 'c/o' 369 | U+2106 # ℆ -> 'c/u' 370 | U+2109 # ℉ -> '°F', 'F' 371 | U+2113 # ℓ -> 'l' 372 | U+2116 # № -> 'Nº', 'No' 373 | U+2117 # ℗ -> '(P)' 374 | U+2120 # ℠ -> '[SM]' 375 | U+2121 # ℡ -> 'TEL' 376 | U+2122 # ™ -> '[TM]' 377 | U+2126 # Ω -> 'Ω', 'ohm', 'O' 378 | U+212A # K -> 'K' 379 | U+212B # Å -> 'Å' 380 | U+212E # ℮ -> 'e' 381 | U+2153 # ⅓ -> ' 1/3' 382 | U+2154 # ⅔ -> ' 2/3' 383 | U+2155 # ⅕ -> ' 1/5' 384 | U+2156 # ⅖ -> ' 2/5' 385 | U+2157 # ⅗ -> ' 3/5' 386 | U+2158 # ⅘ -> ' 4/5' 387 | U+2159 # ⅙ -> ' 1/6' 388 | U+215A # ⅚ -> ' 5/6' 389 | U+215B # ⅛ -> ' 1/8' 390 | U+215C # ⅜ -> ' 3/8' 391 | U+215D # ⅝ -> ' 5/8' 392 | U+215E # ⅞ -> ' 7/8' 393 | U+215F # ⅟ -> ' 1/' 394 | U+2160 # Ⅰ -> 'I' 395 | U+2161 # Ⅱ -> 'II' 396 | U+2162 # Ⅲ -> 'III' 397 | U+2163 # Ⅳ -> 'IV' 398 | U+2164 # Ⅴ -> 'V' 399 | U+2165 # Ⅵ -> 'VI' 400 | U+2166 # Ⅶ -> 'VII' 401 | U+2167 # Ⅷ -> 'VIII' 402 | U+2168 # Ⅸ -> 'IX' 403 | U+2169 # Ⅹ -> 'X' 404 | U+216A # Ⅺ -> 'XI' 405 | U+216B # Ⅻ -> 'XII' 406 | U+216C # Ⅼ -> 
'L' 407 | U+216D # Ⅽ -> 'C' 408 | U+216E # Ⅾ -> 'D' 409 | U+216F # Ⅿ -> 'M' 410 | U+2170 # ⅰ -> 'i' 411 | U+2171 # ⅱ -> 'ii' 412 | U+2172 # ⅲ -> 'iii' 413 | U+2173 # ⅳ -> 'iv' 414 | U+2174 # ⅴ -> 'v' 415 | U+2175 # ⅵ -> 'vi' 416 | U+2176 # ⅶ -> 'vii' 417 | U+2177 # ⅷ -> 'viii' 418 | U+2178 # ⅸ -> 'ix' 419 | U+2179 # ⅹ -> 'x' 420 | U+217A # ⅺ -> 'xi' 421 | U+217B # ⅻ -> 'xii' 422 | U+217C # ⅼ -> 'l' 423 | U+217D # ⅽ -> 'c' 424 | U+217E # ⅾ -> 'd' 425 | U+217F # ⅿ -> 'm' 426 | U+2190 # ← -> '<-' 427 | U+2191 # ↑ -> '^' 428 | U+2192 # → -> '->' 429 | U+2193 # ↓ -> 'v' 430 | U+2194 # ↔ -> '<->' 431 | U+21D0 # ⇐ -> '<=' 432 | U+21D2 # ⇒ -> '=>' 433 | U+21D4 # ⇔ -> '<=>' 434 | U+2212 # − -> '–', '-' 435 | U+2215 # ∕ -> '/' 436 | U+2216 # ∖ -> '\' 437 | U+2217 # ∗ -> '*' 438 | U+2218 # ∘ -> 'o' 439 | U+2219 # ∙ -> '·' 440 | U+221E # ∞ -> 'inf' 441 | U+2223 # ∣ -> '|' 442 | U+2225 # ∥ -> '||' 443 | U+2236 # ∶ -> ':' 444 | U+223C # ∼ -> '~' 445 | U+2260 # ≠ -> '/=' 446 | U+2261 # ≡ -> '=' 447 | U+2264 # ≤ -> '<=' 448 | U+2265 # ≥ -> '>=' 449 | U+226A # ≪ -> '<<' 450 | U+226B # ≫ -> '>>' 451 | U+2295 # ⊕ -> '(+)' 452 | U+2296 # ⊖ -> '(-)' 453 | U+2297 # ⊗ -> '(x)' 454 | U+2298 # ⊘ -> '(/)' 455 | U+22A2 # ⊢ -> '|-' 456 | U+22A3 # ⊣ -> '-|' 457 | U+22A6 # ⊦ -> '|-' 458 | U+22A7 # ⊧ -> '|=' 459 | U+22A8 # ⊨ -> '|=' 460 | U+22A9 # ⊩ -> '||-' 461 | U+22C5 # ⋅ -> '·' 462 | U+22C6 # ⋆ -> '*' 463 | U+22D5 # ⋕ -> '#' 464 | U+22D8 # ⋘ -> '<<<' 465 | U+22D9 # ⋙ -> '>>>' 466 | U+22EF # ⋯ -> '...' 
467 | U+2329 # 〈 -> '<' 468 | U+232A # 〉 -> '>' 469 | U+2400 # ␀ -> 'NUL' 470 | U+2401 # ␁ -> 'SOH' 471 | U+2402 # ␂ -> 'STX' 472 | U+2403 # ␃ -> 'ETX' 473 | U+2404 # ␄ -> 'EOT' 474 | U+2405 # ␅ -> 'ENQ' 475 | U+2406 # ␆ -> 'ACK' 476 | U+2407 # ␇ -> 'BEL' 477 | U+2408 # ␈ -> 'BS' 478 | U+2409 # ␉ -> 'HT' 479 | U+240A # ␊ -> 'LF' 480 | U+240B # ␋ -> 'VT' 481 | U+240C # ␌ -> 'FF' 482 | U+240D # ␍ -> 'CR' 483 | U+240E # ␎ -> 'SO' 484 | U+240F # ␏ -> 'SI' 485 | U+2410 # ␐ -> 'DLE' 486 | U+2411 # ␑ -> 'DC1' 487 | U+2412 # ␒ -> 'DC2' 488 | U+2413 # ␓ -> 'DC3' 489 | U+2414 # ␔ -> 'DC4' 490 | U+2415 # ␕ -> 'NAK' 491 | U+2416 # ␖ -> 'SYN' 492 | U+2417 # ␗ -> 'ETB' 493 | U+2418 # ␘ -> 'CAN' 494 | U+2419 # ␙ -> 'EM' 495 | U+241A # ␚ -> 'SUB' 496 | U+241B # ␛ -> 'ESC' 497 | U+241C # ␜ -> 'FS' 498 | U+241D # ␝ -> 'GS' 499 | U+241E # ␞ -> 'RS' 500 | U+241F # ␟ -> 'US' 501 | U+2420 # ␠ -> 'SP' 502 | U+2421 # ␡ -> 'DEL' 503 | U+2423 # ␣ -> '_' 504 | U+2424 # ␤ -> 'NL' 505 | U+2425 # ␥ -> '///' 506 | U+2426 # ␦ -> '?' 
507 | U+2460 # ① -> '(1)', '1' 508 | U+2461 # ② -> '(2)', '2' 509 | U+2462 # ③ -> '(3)', '3' 510 | U+2463 # ④ -> '(4)', '4' 511 | U+2464 # ⑤ -> '(5)', '5' 512 | U+2465 # ⑥ -> '(6)', '6' 513 | U+2466 # ⑦ -> '(7)', '7' 514 | U+2467 # ⑧ -> '(8)', '8' 515 | U+2468 # ⑨ -> '(9)', '9' 516 | U+2469 # ⑩ -> '(10)' 517 | U+246A # ⑪ -> '(11)' 518 | U+246B # ⑫ -> '(12)' 519 | U+246C # ⑬ -> '(13)' 520 | U+246D # ⑭ -> '(14)' 521 | U+246E # ⑮ -> '(15)' 522 | U+246F # ⑯ -> '(16)' 523 | U+2470 # ⑰ -> '(17)' 524 | U+2471 # ⑱ -> '(18)' 525 | U+2472 # ⑲ -> '(19)' 526 | U+2473 # ⑳ -> '(20)' 527 | U+2474 # ⑴ -> '(1)', '1' 528 | U+2475 # ⑵ -> '(2)', '2' 529 | U+2476 # ⑶ -> '(3)', '3' 530 | U+2477 # ⑷ -> '(4)', '4' 531 | U+2478 # ⑸ -> '(5)', '5' 532 | U+2479 # ⑹ -> '(6)', '6' 533 | U+247A # ⑺ -> '(7)', '7' 534 | U+247B # ⑻ -> '(8)', '8' 535 | U+247C # ⑼ -> '(9)', '9' 536 | U+247D # ⑽ -> '(10)' 537 | U+247E # ⑾ -> '(11)' 538 | U+247F # ⑿ -> '(12)' 539 | U+2480 # ⒀ -> '(13)' 540 | U+2481 # ⒁ -> '(14)' 541 | U+2482 # ⒂ -> '(15)' 542 | U+2483 # ⒃ -> '(16)' 543 | U+2484 # ⒄ -> '(17)' 544 | U+2485 # ⒅ -> '(18)' 545 | U+2486 # ⒆ -> '(19)' 546 | U+2487 # ⒇ -> '(20)' 547 | U+2488 # ⒈ -> '1.', '1' 548 | U+2489 # ⒉ -> '2.', '2' 549 | U+248A # ⒊ -> '3.', '3' 550 | U+248B # ⒋ -> '4.', '4' 551 | U+248C # ⒌ -> '5.', '5' 552 | U+248D # ⒍ -> '6.', '6' 553 | U+248E # ⒎ -> '7.', '7' 554 | U+248F # ⒏ -> '8.', '8' 555 | U+2490 # ⒐ -> '9.', '9' 556 | U+2491 # ⒑ -> '10.' 557 | U+2492 # ⒒ -> '11.' 558 | U+2493 # ⒓ -> '12.' 559 | U+2494 # ⒔ -> '13.' 560 | U+2495 # ⒕ -> '14.' 561 | U+2496 # ⒖ -> '15.' 562 | U+2497 # ⒗ -> '16.' 563 | U+2498 # ⒘ -> '17.' 564 | U+2499 # ⒙ -> '18.' 565 | U+249A # ⒚ -> '19.' 566 | U+249B # ⒛ -> '20.' 
567 | U+249C # ⒜ -> '(a)', 'a' 568 | U+249D # ⒝ -> '(b)', 'b' 569 | U+249E # ⒞ -> '(c)', 'c' 570 | U+249F # ⒟ -> '(d)', 'd' 571 | U+24A0 # ⒠ -> '(e)', 'e' 572 | U+24A1 # ⒡ -> '(f)', 'f' 573 | U+24A2 # ⒢ -> '(g)', 'g' 574 | U+24A3 # ⒣ -> '(h)', 'h' 575 | U+24A4 # ⒤ -> '(i)', 'i' 576 | U+24A5 # ⒥ -> '(j)', 'j' 577 | U+24A6 # ⒦ -> '(k)', 'k' 578 | U+24A7 # ⒧ -> '(l)', 'l' 579 | U+24A8 # ⒨ -> '(m)', 'm' 580 | U+24A9 # ⒩ -> '(n)', 'n' 581 | U+24AA # ⒪ -> '(o)', 'o' 582 | U+24AB # ⒫ -> '(p)', 'p' 583 | U+24AC # ⒬ -> '(q)', 'q' 584 | U+24AD # ⒭ -> '(r)', 'r' 585 | U+24AE # ⒮ -> '(s)', 's' 586 | U+24AF # ⒯ -> '(t)', 't' 587 | U+24B0 # ⒰ -> '(u)', 'u' 588 | U+24B1 # ⒱ -> '(v)', 'v' 589 | U+24B2 # ⒲ -> '(w)', 'w' 590 | U+24B3 # ⒳ -> '(x)', 'x' 591 | U+24B4 # ⒴ -> '(y)', 'y' 592 | U+24B5 # ⒵ -> '(z)', 'z' 593 | U+24B6 # Ⓐ -> '(A)', 'A' 594 | U+24B7 # Ⓑ -> '(B)', 'B' 595 | U+24B8 # Ⓒ -> '(C)', 'C' 596 | U+24B9 # Ⓓ -> '(D)', 'D' 597 | U+24BA # Ⓔ -> '(E)', 'E' 598 | U+24BB # Ⓕ -> '(F)', 'F' 599 | U+24BC # Ⓖ -> '(G)', 'G' 600 | U+24BD # Ⓗ -> '(H)', 'H' 601 | U+24BE # Ⓘ -> '(I)', 'I' 602 | U+24BF # Ⓙ -> '(J)', 'J' 603 | U+24C0 # Ⓚ -> '(K)', 'K' 604 | U+24C1 # Ⓛ -> '(L)', 'L' 605 | U+24C2 # Ⓜ -> '(M)', 'M' 606 | U+24C3 # Ⓝ -> '(N)', 'N' 607 | U+24C4 # Ⓞ -> '(O)', 'O' 608 | U+24C5 # Ⓟ -> '(P)', 'P' 609 | U+24C6 # Ⓠ -> '(Q)', 'Q' 610 | U+24C7 # Ⓡ -> '(R)', 'R' 611 | U+24C8 # Ⓢ -> '(S)', 'S' 612 | U+24C9 # Ⓣ -> '(T)', 'T' 613 | U+24CA # Ⓤ -> '(U)', 'U' 614 | U+24CB # Ⓥ -> '(V)', 'V' 615 | U+24CC # Ⓦ -> '(W)', 'W' 616 | U+24CD # Ⓧ -> '(X)', 'X' 617 | U+24CE # Ⓨ -> '(Y)', 'Y' 618 | U+24CF # Ⓩ -> '(Z)', 'Z' 619 | U+24D0 # ⓐ -> '(a)', 'a' 620 | U+24D1 # ⓑ -> '(b)', 'b' 621 | U+24D2 # ⓒ -> '(c)', 'c' 622 | U+24D3 # ⓓ -> '(d)', 'd' 623 | U+24D4 # ⓔ -> '(e)', 'e' 624 | U+24D5 # ⓕ -> '(f)', 'f' 625 | U+24D6 # ⓖ -> '(g)', 'g' 626 | U+24D7 # ⓗ -> '(h)', 'h' 627 | U+24D8 # ⓘ -> '(i)', 'i' 628 | U+24D9 # ⓙ -> '(j)', 'j' 629 | U+24DA # ⓚ -> '(k)', 'k' 630 | U+24DB # ⓛ -> '(l)', 'l' 631 | U+24DC # 
ⓜ -> '(m)', 'm' 632 | U+24DD # ⓝ -> '(n)', 'n' 633 | U+24DE # ⓞ -> '(o)', 'o' 634 | U+24DF # ⓟ -> '(p)', 'p' 635 | U+24E0 # ⓠ -> '(q)', 'q' 636 | U+24E1 # ⓡ -> '(r)', 'r' 637 | U+24E2 # ⓢ -> '(s)', 's' 638 | U+24E3 # ⓣ -> '(t)', 't' 639 | U+24E4 # ⓤ -> '(u)', 'u' 640 | U+24E5 # ⓥ -> '(v)', 'v' 641 | U+24E6 # ⓦ -> '(w)', 'w' 642 | U+24E7 # ⓧ -> '(x)', 'x' 643 | U+24E8 # ⓨ -> '(y)', 'y' 644 | U+24E9 # ⓩ -> '(z)', 'z' 645 | U+24EA # ⓪ -> '(0)', '0' 646 | U+2500 # ─ -> '-' 647 | U+2501 # ━ -> '=' 648 | U+2502 # │ -> '|' 649 | U+2503 # ┃ -> '|' 650 | U+2504 # ┄ -> '-' 651 | U+2505 # ┅ -> '=' 652 | U+2506 # ┆ -> '|' 653 | U+2507 # ┇ -> '|' 654 | U+2508 # ┈ -> '-' 655 | U+2509 # ┉ -> '=' 656 | U+250A # ┊ -> '|' 657 | U+250B # ┋ -> '|' 658 | U+250C # ┌ -> '+' 659 | U+250D # ┍ -> '+' 660 | U+250E # ┎ -> '+' 661 | U+250F # ┏ -> '+' 662 | U+2510 # ┐ -> '+' 663 | U+2511 # ┑ -> '+' 664 | U+2512 # ┒ -> '+' 665 | U+2513 # ┓ -> '+' 666 | U+2514 # └ -> '+' 667 | U+2515 # ┕ -> '+' 668 | U+2516 # ┖ -> '+' 669 | U+2517 # ┗ -> '+' 670 | U+2518 # ┘ -> '+' 671 | U+2519 # ┙ -> '+' 672 | U+251A # ┚ -> '+' 673 | U+251B # ┛ -> '+' 674 | U+251C # ├ -> '+' 675 | U+251D # ┝ -> '+' 676 | U+251E # ┞ -> '+' 677 | U+251F # ┟ -> '+' 678 | U+2520 # ┠ -> '+' 679 | U+2521 # ┡ -> '+' 680 | U+2522 # ┢ -> '+' 681 | U+2523 # ┣ -> '+' 682 | U+2524 # ┤ -> '+' 683 | U+2525 # ┥ -> '+' 684 | U+2526 # ┦ -> '+' 685 | U+2527 # ┧ -> '+' 686 | U+2528 # ┨ -> '+' 687 | U+2529 # ┩ -> '+' 688 | U+252A # ┪ -> '+' 689 | U+252B # ┫ -> '+' 690 | U+252C # ┬ -> '+' 691 | U+252D # ┭ -> '+' 692 | U+252E # ┮ -> '+' 693 | U+252F # ┯ -> '+' 694 | U+2530 # ┰ -> '+' 695 | U+2531 # ┱ -> '+' 696 | U+2532 # ┲ -> '+' 697 | U+2533 # ┳ -> '+' 698 | U+2534 # ┴ -> '+' 699 | U+2535 # ┵ -> '+' 700 | U+2536 # ┶ -> '+' 701 | U+2537 # ┷ -> '+' 702 | U+2538 # ┸ -> '+' 703 | U+2539 # ┹ -> '+' 704 | U+253A # ┺ -> '+' 705 | U+253B # ┻ -> '+' 706 | U+253C # ┼ -> '+' 707 | U+253D # ┽ -> '+' 708 | U+253E # ┾ -> '+' 709 | U+253F # ┿ -> '+' 710 | U+2540 
# ╀ -> '+' 711 | U+2541 # ╁ -> '+' 712 | U+2542 # ╂ -> '+' 713 | U+2543 # ╃ -> '+' 714 | U+2544 # ╄ -> '+' 715 | U+2545 # ╅ -> '+' 716 | U+2546 # ╆ -> '+' 717 | U+2547 # ╇ -> '+' 718 | U+2548 # ╈ -> '+' 719 | U+2549 # ╉ -> '+' 720 | U+254A # ╊ -> '+' 721 | U+254B # ╋ -> '+' 722 | U+254C # ╌ -> '-' 723 | U+254D # ╍ -> '=' 724 | U+254E # ╎ -> '|' 725 | U+254F # ╏ -> '|' 726 | U+2550 # ═ -> '=' 727 | U+2551 # ║ -> '|' 728 | U+2552 # ╒ -> '+' 729 | U+2553 # ╓ -> '+' 730 | U+2554 # ╔ -> '+' 731 | U+2555 # ╕ -> '+' 732 | U+2556 # ╖ -> '+' 733 | U+2557 # ╗ -> '+' 734 | U+2558 # ╘ -> '+' 735 | U+2559 # ╙ -> '+' 736 | U+255A # ╚ -> '+' 737 | U+255B # ╛ -> '+' 738 | U+255C # ╜ -> '+' 739 | U+255D # ╝ -> '+' 740 | U+255E # ╞ -> '+' 741 | U+255F # ╟ -> '+' 742 | U+2560 # ╠ -> '+' 743 | U+2561 # ╡ -> '+' 744 | U+2562 # ╢ -> '+' 745 | U+2563 # ╣ -> '+' 746 | U+2564 # ╤ -> '+' 747 | U+2565 # ╥ -> '+' 748 | U+2566 # ╦ -> '+' 749 | U+2567 # ╧ -> '+' 750 | U+2568 # ╨ -> '+' 751 | U+2569 # ╩ -> '+' 752 | U+256A # ╪ -> '+' 753 | U+256B # ╫ -> '+' 754 | U+256C # ╬ -> '+' 755 | U+256D # ╭ -> '+' 756 | U+256E # ╮ -> '+' 757 | U+256F # ╯ -> '+' 758 | U+2570 # ╰ -> '+' 759 | U+2571 # ╱ -> '/' 760 | U+2572 # ╲ -> '\' 761 | U+2573 # ╳ -> 'X' 762 | U+257C # ╼ -> '-' 763 | U+257D # ╽ -> '|' 764 | U+257E # ╾ -> '-' 765 | U+257F # ╿ -> '|' 766 | U+25CB # ○ -> 'o' 767 | U+25E6 # ◦ -> 'o' 768 | U+2605 # ★ -> '*' 769 | U+2606 # ☆ -> '*' 770 | U+2612 # ☒ -> 'X' 771 | U+2613 # ☓ -> 'X' 772 | U+2639 # ☹ -> ':-(' 773 | U+263A # ☺ -> ':-)' 774 | U+263B # ☻ -> '(-:' 775 | U+266D # ♭ -> 'b' 776 | U+266F # ♯ -> '#' 777 | U+2701 # ✁ -> '%<' 778 | U+2702 # ✂ -> '%<' 779 | U+2703 # ✃ -> '%<' 780 | U+2704 # ✄ -> '%<' 781 | U+270C # ✌ -> 'V' 782 | U+2713 # ✓ -> '√' 783 | U+2714 # ✔ -> '√' 784 | U+2715 # ✕ -> 'x' 785 | U+2716 # ✖ -> 'x' 786 | U+2717 # ✗ -> 'X' 787 | U+2718 # ✘ -> 'X' 788 | U+2719 # ✙ -> '+' 789 | U+271A # ✚ -> '+' 790 | U+271B # ✛ -> '+' 791 | U+271C # ✜ -> '+' 792 | U+271D # ✝ -> '+' 793 | 
U+271E # ✞ -> '+' 794 | U+271F # ✟ -> '+' 795 | U+2720 # ✠ -> '+' 796 | U+2721 # ✡ -> '*' 797 | U+2722 # ✢ -> '+' 798 | U+2723 # ✣ -> '+' 799 | U+2724 # ✤ -> '+' 800 | U+2725 # ✥ -> '+' 801 | U+2726 # ✦ -> '+' 802 | U+2727 # ✧ -> '+' 803 | U+2729 # ✩ -> '*' 804 | U+272A # ✪ -> '*' 805 | U+272B # ✫ -> '*' 806 | U+272C # ✬ -> '*' 807 | U+272D # ✭ -> '*' 808 | U+272E # ✮ -> '*' 809 | U+272F # ✯ -> '*' 810 | U+2730 # ✰ -> '*' 811 | U+2731 # ✱ -> '*' 812 | U+2732 # ✲ -> '*' 813 | U+2733 # ✳ -> '*' 814 | U+2734 # ✴ -> '*' 815 | U+2735 # ✵ -> '*' 816 | U+2736 # ✶ -> '*' 817 | U+2737 # ✷ -> '*' 818 | U+2738 # ✸ -> '*' 819 | U+2739 # ✹ -> '*' 820 | U+273A # ✺ -> '*' 821 | U+273B # ✻ -> '*' 822 | U+273C # ✼ -> '*' 823 | U+273D # ✽ -> '*' 824 | U+273E # ✾ -> '*' 825 | U+273F # ✿ -> '*' 826 | U+2740 # ❀ -> '*' 827 | U+2741 # ❁ -> '*' 828 | U+2742 # ❂ -> '*' 829 | U+2743 # ❃ -> '*' 830 | U+2744 # ❄ -> '*' 831 | U+2745 # ❅ -> '*' 832 | U+2746 # ❆ -> '*' 833 | U+2747 # ❇ -> '*' 834 | U+2748 # ❈ -> '*' 835 | U+2749 # ❉ -> '*' 836 | U+274A # ❊ -> '*' 837 | U+274B # ❋ -> '*' 838 | U+FB00 # ff -> 'ff' 839 | U+FB01 # fi -> 'fi' 840 | U+FB02 # fl -> 'fl' 841 | U+FB03 # ffi -> 'ffi' 842 | U+FB04 # ffl -> 'ffl' 843 | U+FB05 # ſt -> 'ſt', 'st' 844 | U+FB06 # st -> 'st' 845 | U+FEFF #  -> 846 | U+FFFD # � -> '?' 
847 | -------------------------------------------------------------------------------- /transtab/transtab.missing-MES-2: -------------------------------------------------------------------------------- 1 | % $Id: $ 2 | 3 | % CURRENCY SIGN 4 | % ¤ -> 5 | "" 6 | % LATIN CAPITAL LETTER SCHWA 7 | % Ə -> 8 | "" 9 | % LATIN CAPITAL LETTER EZH 10 | % Ʒ -> 11 | "" 12 | % LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON 13 | % Ǟ -> 14 | "" 15 | % LATIN SMALL LETTER A WITH DIAERESIS AND MACRON 16 | % ǟ -> 17 | "" 18 | % LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON 19 | % Ǡ -> 20 | "" 21 | % LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON 22 | % ǡ -> 23 | "" 24 | % LATIN CAPITAL LETTER AE WITH MACRON 25 | % Ǣ -> 26 | "" 27 | % LATIN SMALL LETTER AE WITH MACRON 28 | % ǣ -> 29 | "" 30 | % LATIN CAPITAL LETTER G WITH STROKE 31 | % Ǥ -> 32 | "" 33 | % LATIN SMALL LETTER G WITH STROKE 34 | % ǥ -> 35 | "" 36 | % LATIN CAPITAL LETTER G WITH CARON 37 | % Ǧ -> 38 | "" 39 | % LATIN SMALL LETTER G WITH CARON 40 | % ǧ -> 41 | "" 42 | % LATIN CAPITAL LETTER K WITH CARON 43 | % Ǩ -> 44 | "" 45 | % LATIN SMALL LETTER K WITH CARON 46 | % ǩ -> 47 | "" 48 | % LATIN CAPITAL LETTER O WITH OGONEK 49 | % Ǫ -> 50 | "" 51 | % LATIN SMALL LETTER O WITH OGONEK 52 | % ǫ -> 53 | "" 54 | % LATIN CAPITAL LETTER O WITH OGONEK AND MACRON 55 | % Ǭ -> 56 | "" 57 | % LATIN SMALL LETTER O WITH OGONEK AND MACRON 58 | % ǭ -> 59 | "" 60 | % LATIN CAPITAL LETTER EZH WITH CARON 61 | % Ǯ -> 62 | "" 63 | % LATIN SMALL LETTER EZH WITH CARON 64 | % ǯ -> 65 | "" 66 | % LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE 67 | % Ǻ -> 68 | "" 69 | % LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE 70 | % ǻ -> 71 | "" 72 | % LATIN CAPITAL LETTER AE WITH ACUTE 73 | % Ǽ -> 74 | "" 75 | % LATIN SMALL LETTER AE WITH ACUTE 76 | % ǽ -> 77 | "" 78 | % LATIN CAPITAL LETTER O WITH STROKE AND ACUTE 79 | % Ǿ -> 80 | "" 81 | % LATIN SMALL LETTER O WITH STROKE AND ACUTE 82 | % ǿ -> 83 | "" 84 | % LATIN CAPITAL LETTER H WITH CARON 85 | % Ȟ 
-> 86 | "" 87 | % LATIN SMALL LETTER H WITH CARON 88 | % ȟ -> 89 | "" 90 | % LATIN SMALL LETTER SCHWA 91 | % ə -> 92 | "" 93 | % LATIN SMALL LETTER R WITH LONG LEG 94 | % ɼ -> 95 | "" 96 | % LATIN SMALL LETTER EZH 97 | % ʒ -> 98 | "" 99 | % CARON 100 | % ˇ -> 101 | "" 102 | % BREVE 103 | % ˘ -> 104 | "" 105 | % DOT ABOVE 106 | % ˙ -> 107 | "" 108 | % OGONEK 109 | % ˛ -> 110 | "" 111 | % MODIFIER LETTER DOUBLE APOSTROPHE 112 | % ˮ -> 113 | "" 114 | % GREEK YPOGEGRAMMENI 115 | % ͺ -> 116 | "" 117 | % GREEK TONOS 118 | % ΄ -> 119 | "" 120 | % GREEK DIALYTIKA TONOS 121 | % ΅ -> 122 | "" 123 | % GREEK CAPITAL LETTER ALPHA WITH TONOS 124 | % Ά -> 125 | "" 126 | % GREEK ANO TELEIA 127 | % · -> 128 | "" 129 | % GREEK CAPITAL LETTER EPSILON WITH TONOS 130 | % Έ -> 131 | "" 132 | % GREEK CAPITAL LETTER ETA WITH TONOS 133 | % Ή -> 134 | "" 135 | % GREEK CAPITAL LETTER IOTA WITH TONOS 136 | % Ί -> 137 | "" 138 | % GREEK CAPITAL LETTER OMICRON WITH TONOS 139 | % Ό -> 140 | "" 141 | % GREEK CAPITAL LETTER UPSILON WITH TONOS 142 | % Ύ -> 143 | "" 144 | % GREEK CAPITAL LETTER OMEGA WITH TONOS 145 | % Ώ -> 146 | "" 147 | % GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS 148 | % ΐ -> 149 | "" 150 | % GREEK CAPITAL LETTER ALPHA 151 | % Α -> 152 | "" 153 | % GREEK CAPITAL LETTER BETA 154 | % Β -> 155 | "" 156 | % GREEK CAPITAL LETTER GAMMA 157 | % Γ -> 158 | "" 159 | % GREEK CAPITAL LETTER DELTA 160 | % Δ -> 161 | "" 162 | % GREEK CAPITAL LETTER EPSILON 163 | % Ε -> 164 | "" 165 | % GREEK CAPITAL LETTER ZETA 166 | % Ζ -> 167 | "" 168 | % GREEK CAPITAL LETTER ETA 169 | % Η -> 170 | "" 171 | % GREEK CAPITAL LETTER THETA 172 | % Θ -> 173 | "" 174 | % GREEK CAPITAL LETTER IOTA 175 | % Ι -> 176 | "" 177 | % GREEK CAPITAL LETTER KAPPA 178 | % Κ -> 179 | "" 180 | % GREEK CAPITAL LETTER LAMDA 181 | % Λ -> 182 | "" 183 | % GREEK CAPITAL LETTER MU 184 | % Μ -> 185 | "" 186 | % GREEK CAPITAL LETTER NU 187 | % Ν -> 188 | "" 189 | % GREEK CAPITAL LETTER XI 190 | % Ξ -> 191 | "" 192 | % GREEK 
CAPITAL LETTER OMICRON 193 | % Ο -> 194 | "" 195 | % GREEK CAPITAL LETTER PI 196 | % Π -> 197 | "" 198 | % GREEK CAPITAL LETTER RHO 199 | % Ρ -> 200 | "" 201 | % GREEK CAPITAL LETTER SIGMA 202 | % Σ -> 203 | "" 204 | % GREEK CAPITAL LETTER TAU 205 | % Τ -> 206 | "" 207 | % GREEK CAPITAL LETTER UPSILON 208 | % Υ -> 209 | "" 210 | % GREEK CAPITAL LETTER PHI 211 | % Φ -> 212 | "" 213 | % GREEK CAPITAL LETTER CHI 214 | % Χ -> 215 | "" 216 | % GREEK CAPITAL LETTER PSI 217 | % Ψ -> 218 | "" 219 | % GREEK CAPITAL LETTER OMEGA 220 | % Ω -> 221 | "" 222 | % GREEK CAPITAL LETTER IOTA WITH DIALYTIKA 223 | % Ϊ -> 224 | "" 225 | % GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA 226 | % Ϋ -> 227 | "" 228 | % GREEK SMALL LETTER ALPHA WITH TONOS 229 | % ά -> 230 | "" 231 | % GREEK SMALL LETTER EPSILON WITH TONOS 232 | % έ -> 233 | "" 234 | % GREEK SMALL LETTER ETA WITH TONOS 235 | % ή -> 236 | "" 237 | % GREEK SMALL LETTER IOTA WITH TONOS 238 | % ί -> 239 | "" 240 | % GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS 241 | % ΰ -> 242 | "" 243 | % GREEK SMALL LETTER ALPHA 244 | % α -> 245 | "" 246 | % GREEK SMALL LETTER BETA 247 | % β -> 248 | "" 249 | % GREEK SMALL LETTER GAMMA 250 | % γ -> 251 | "" 252 | % GREEK SMALL LETTER DELTA 253 | % δ -> 254 | "" 255 | % GREEK SMALL LETTER EPSILON 256 | % ε -> 257 | "" 258 | % GREEK SMALL LETTER ZETA 259 | % ζ -> 260 | "" 261 | % GREEK SMALL LETTER ETA 262 | % η -> 263 | "" 264 | % GREEK SMALL LETTER THETA 265 | % θ -> 266 | "" 267 | % GREEK SMALL LETTER IOTA 268 | % ι -> 269 | "" 270 | % GREEK SMALL LETTER KAPPA 271 | % κ -> 272 | "" 273 | % GREEK SMALL LETTER LAMDA 274 | % λ -> 275 | "" 276 | % GREEK SMALL LETTER MU 277 | % μ -> 278 | "" 279 | % GREEK SMALL LETTER NU 280 | % ν -> 281 | "" 282 | % GREEK SMALL LETTER XI 283 | % ξ -> 284 | "" 285 | % GREEK SMALL LETTER OMICRON 286 | % ο -> 287 | "" 288 | % GREEK SMALL LETTER PI 289 | % π -> 290 | "" 291 | % GREEK SMALL LETTER RHO 292 | % ρ -> 293 | "" 294 | % GREEK SMALL LETTER FINAL SIGMA 295 
| % ς -> 296 | "" 297 | % GREEK SMALL LETTER SIGMA 298 | % σ -> 299 | "" 300 | % GREEK SMALL LETTER TAU 301 | % τ -> 302 | "" 303 | % GREEK SMALL LETTER UPSILON 304 | % υ -> 305 | "" 306 | % GREEK SMALL LETTER PHI 307 | % φ -> 308 | "" 309 | % GREEK SMALL LETTER CHI 310 | % χ -> 311 | "" 312 | % GREEK SMALL LETTER PSI 313 | % ψ -> 314 | "" 315 | % GREEK SMALL LETTER OMEGA 316 | % ω -> 317 | "" 318 | % GREEK SMALL LETTER IOTA WITH DIALYTIKA 319 | % ϊ -> 320 | "" 321 | % GREEK SMALL LETTER UPSILON WITH DIALYTIKA 322 | % ϋ -> 323 | "" 324 | % GREEK SMALL LETTER OMICRON WITH TONOS 325 | % ό -> 326 | "" 327 | % GREEK SMALL LETTER UPSILON WITH TONOS 328 | % ύ -> 329 | "" 330 | % GREEK SMALL LETTER OMEGA WITH TONOS 331 | % ώ -> 332 | "" 333 | % GREEK KAI SYMBOL 334 | % ϗ -> 335 | "" 336 | % GREEK LETTER STIGMA 337 | % Ϛ -> 338 | "" 339 | % GREEK SMALL LETTER STIGMA 340 | % ϛ -> 341 | "" 342 | % GREEK LETTER DIGAMMA 343 | % Ϝ -> 344 | "" 345 | % GREEK SMALL LETTER DIGAMMA 346 | % ϝ -> 347 | "" 348 | % GREEK LETTER KOPPA 349 | % Ϟ -> 350 | "" 351 | % GREEK SMALL LETTER KOPPA 352 | % ϟ -> 353 | "" 354 | % GREEK LETTER SAMPI 355 | % Ϡ -> 356 | "" 357 | % GREEK SMALL LETTER SAMPI 358 | % ϡ -> 359 | "" 360 | % CYRILLIC CAPITAL LETTER IE WITH GRAVE 361 | % Ѐ -> 362 | "" 363 | % CYRILLIC CAPITAL LETTER IO 364 | % Ё -> 365 | "" 366 | % CYRILLIC CAPITAL LETTER DJE 367 | % Ђ -> 368 | "" 369 | % CYRILLIC CAPITAL LETTER GJE 370 | % Ѓ -> 371 | "" 372 | % CYRILLIC CAPITAL LETTER UKRAINIAN IE 373 | % Є -> 374 | "" 375 | % CYRILLIC CAPITAL LETTER DZE 376 | % Ѕ -> 377 | "" 378 | % CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I 379 | % І -> 380 | "" 381 | % CYRILLIC CAPITAL LETTER YI 382 | % Ї -> 383 | "" 384 | % CYRILLIC CAPITAL LETTER JE 385 | % Ј -> 386 | "" 387 | % CYRILLIC CAPITAL LETTER LJE 388 | % Љ -> 389 | "" 390 | % CYRILLIC CAPITAL LETTER NJE 391 | % Њ -> 392 | "" 393 | % CYRILLIC CAPITAL LETTER TSHE 394 | % Ћ -> 395 | "" 396 | % CYRILLIC CAPITAL LETTER KJE 397 | % Ќ -> 398 | 
"" 399 | % CYRILLIC CAPITAL LETTER I WITH GRAVE 400 | % Ѝ -> 401 | "" 402 | % CYRILLIC CAPITAL LETTER SHORT U 403 | % Ў -> 404 | "" 405 | % CYRILLIC CAPITAL LETTER DZHE 406 | % Џ -> 407 | "" 408 | % CYRILLIC CAPITAL LETTER A 409 | % А -> 410 | "" 411 | % CYRILLIC CAPITAL LETTER BE 412 | % Б -> 413 | "" 414 | % CYRILLIC CAPITAL LETTER VE 415 | % В -> 416 | "" 417 | % CYRILLIC CAPITAL LETTER GHE 418 | % Г -> 419 | "" 420 | % CYRILLIC CAPITAL LETTER DE 421 | % Д -> 422 | "" 423 | % CYRILLIC CAPITAL LETTER IE 424 | % Е -> 425 | "" 426 | % CYRILLIC CAPITAL LETTER ZHE 427 | % Ж -> 428 | "" 429 | % CYRILLIC CAPITAL LETTER ZE 430 | % З -> 431 | "" 432 | % CYRILLIC CAPITAL LETTER I 433 | % И -> 434 | "" 435 | % CYRILLIC CAPITAL LETTER SHORT I 436 | % Й -> 437 | "" 438 | % CYRILLIC CAPITAL LETTER KA 439 | % К -> 440 | "" 441 | % CYRILLIC CAPITAL LETTER EL 442 | % Л -> 443 | "" 444 | % CYRILLIC CAPITAL LETTER EM 445 | % М -> 446 | "" 447 | % CYRILLIC CAPITAL LETTER EN 448 | % Н -> 449 | "" 450 | % CYRILLIC CAPITAL LETTER O 451 | % О -> 452 | "" 453 | % CYRILLIC CAPITAL LETTER PE 454 | % П -> 455 | "" 456 | % CYRILLIC CAPITAL LETTER ER 457 | % Р -> 458 | "" 459 | % CYRILLIC CAPITAL LETTER ES 460 | % С -> 461 | "" 462 | % CYRILLIC CAPITAL LETTER TE 463 | % Т -> 464 | "" 465 | % CYRILLIC CAPITAL LETTER U 466 | % У -> 467 | "" 468 | % CYRILLIC CAPITAL LETTER EF 469 | % Ф -> 470 | "" 471 | % CYRILLIC CAPITAL LETTER HA 472 | % Х -> 473 | "" 474 | % CYRILLIC CAPITAL LETTER TSE 475 | % Ц -> 476 | "" 477 | % CYRILLIC CAPITAL LETTER CHE 478 | % Ч -> 479 | "" 480 | % CYRILLIC CAPITAL LETTER SHA 481 | % Ш -> 482 | "" 483 | % CYRILLIC CAPITAL LETTER SHCHA 484 | % Щ -> 485 | "" 486 | % CYRILLIC CAPITAL LETTER HARD SIGN 487 | % Ъ -> 488 | "" 489 | % CYRILLIC CAPITAL LETTER YERU 490 | % Ы -> 491 | "" 492 | % CYRILLIC CAPITAL LETTER SOFT SIGN 493 | % Ь -> 494 | "" 495 | % CYRILLIC CAPITAL LETTER E 496 | % Э -> 497 | "" 498 | % CYRILLIC CAPITAL LETTER YU 499 | % Ю -> 500 | "" 501 | % CYRILLIC 
CAPITAL LETTER YA 502 | % Я -> 503 | "" 504 | % CYRILLIC SMALL LETTER A 505 | % а -> 506 | "" 507 | % CYRILLIC SMALL LETTER BE 508 | % б -> 509 | "" 510 | % CYRILLIC SMALL LETTER VE 511 | % в -> 512 | "" 513 | % CYRILLIC SMALL LETTER GHE 514 | % г -> 515 | "" 516 | % CYRILLIC SMALL LETTER DE 517 | % д -> 518 | "" 519 | % CYRILLIC SMALL LETTER IE 520 | % е -> 521 | "" 522 | % CYRILLIC SMALL LETTER ZHE 523 | % ж -> 524 | "" 525 | % CYRILLIC SMALL LETTER ZE 526 | % з -> 527 | "" 528 | % CYRILLIC SMALL LETTER I 529 | % и -> 530 | "" 531 | % CYRILLIC SMALL LETTER SHORT I 532 | % й -> 533 | "" 534 | % CYRILLIC SMALL LETTER KA 535 | % к -> 536 | "" 537 | % CYRILLIC SMALL LETTER EL 538 | % л -> 539 | "" 540 | % CYRILLIC SMALL LETTER EM 541 | % м -> 542 | "" 543 | % CYRILLIC SMALL LETTER EN 544 | % н -> 545 | "" 546 | % CYRILLIC SMALL LETTER O 547 | % о -> 548 | "" 549 | % CYRILLIC SMALL LETTER PE 550 | % п -> 551 | "" 552 | % CYRILLIC SMALL LETTER ER 553 | % р -> 554 | "" 555 | % CYRILLIC SMALL LETTER ES 556 | % с -> 557 | "" 558 | % CYRILLIC SMALL LETTER TE 559 | % т -> 560 | "" 561 | % CYRILLIC SMALL LETTER U 562 | % у -> 563 | "" 564 | % CYRILLIC SMALL LETTER EF 565 | % ф -> 566 | "" 567 | % CYRILLIC SMALL LETTER HA 568 | % х -> 569 | "" 570 | % CYRILLIC SMALL LETTER TSE 571 | % ц -> 572 | "" 573 | % CYRILLIC SMALL LETTER CHE 574 | % ч -> 575 | "" 576 | % CYRILLIC SMALL LETTER SHA 577 | % ш -> 578 | "" 579 | % CYRILLIC SMALL LETTER SHCHA 580 | % щ -> 581 | "" 582 | % CYRILLIC SMALL LETTER HARD SIGN 583 | % ъ -> 584 | "" 585 | % CYRILLIC SMALL LETTER YERU 586 | % ы -> 587 | "" 588 | % CYRILLIC SMALL LETTER SOFT SIGN 589 | % ь -> 590 | "" 591 | % CYRILLIC SMALL LETTER E 592 | % э -> 593 | "" 594 | % CYRILLIC SMALL LETTER YU 595 | % ю -> 596 | "" 597 | % CYRILLIC SMALL LETTER YA 598 | % я -> 599 | "" 600 | % CYRILLIC SMALL LETTER IE WITH GRAVE 601 | % ѐ -> 602 | "" 603 | % CYRILLIC SMALL LETTER IO 604 | % ё -> 605 | "" 606 | % CYRILLIC SMALL LETTER DJE 607 | % ђ -> 608 | 
"" 609 | % CYRILLIC SMALL LETTER GJE 610 | % ѓ -> 611 | "" 612 | % CYRILLIC SMALL LETTER UKRAINIAN IE 613 | % є -> 614 | "" 615 | % CYRILLIC SMALL LETTER DZE 616 | % ѕ -> 617 | "" 618 | % CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I 619 | % і -> 620 | "" 621 | % CYRILLIC SMALL LETTER YI 622 | % ї -> 623 | "" 624 | % CYRILLIC SMALL LETTER JE 625 | % ј -> 626 | "" 627 | % CYRILLIC SMALL LETTER LJE 628 | % љ -> 629 | "" 630 | % CYRILLIC SMALL LETTER NJE 631 | % њ -> 632 | "" 633 | % CYRILLIC SMALL LETTER TSHE 634 | % ћ -> 635 | "" 636 | % CYRILLIC SMALL LETTER KJE 637 | % ќ -> 638 | "" 639 | % CYRILLIC SMALL LETTER I WITH GRAVE 640 | % ѝ -> 641 | "" 642 | % CYRILLIC SMALL LETTER SHORT U 643 | % ў -> 644 | "" 645 | % CYRILLIC SMALL LETTER DZHE 646 | % џ -> 647 | "" 648 | % CYRILLIC CAPITAL LETTER GHE WITH UPTURN 649 | % Ґ -> 650 | "" 651 | % CYRILLIC SMALL LETTER GHE WITH UPTURN 652 | % ґ -> 653 | "" 654 | % CYRILLIC CAPITAL LETTER GHE WITH STROKE 655 | % Ғ -> 656 | "" 657 | % CYRILLIC SMALL LETTER GHE WITH STROKE 658 | % ғ -> 659 | "" 660 | % CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK 661 | % Ҕ -> 662 | "" 663 | % CYRILLIC SMALL LETTER GHE WITH MIDDLE HOOK 664 | % ҕ -> 665 | "" 666 | % CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER 667 | % Җ -> 668 | "" 669 | % CYRILLIC SMALL LETTER ZHE WITH DESCENDER 670 | % җ -> 671 | "" 672 | % CYRILLIC CAPITAL LETTER ZE WITH DESCENDER 673 | % Ҙ -> 674 | "" 675 | % CYRILLIC SMALL LETTER ZE WITH DESCENDER 676 | % ҙ -> 677 | "" 678 | % CYRILLIC CAPITAL LETTER KA WITH DESCENDER 679 | % Қ -> 680 | "" 681 | % CYRILLIC SMALL LETTER KA WITH DESCENDER 682 | % қ -> 683 | "" 684 | % CYRILLIC CAPITAL LETTER KA WITH VERTICAL STROKE 685 | % Ҝ -> 686 | "" 687 | % CYRILLIC SMALL LETTER KA WITH VERTICAL STROKE 688 | % ҝ -> 689 | "" 690 | % CYRILLIC CAPITAL LETTER KA WITH STROKE 691 | % Ҟ -> 692 | "" 693 | % CYRILLIC SMALL LETTER KA WITH STROKE 694 | % ҟ -> 695 | "" 696 | % CYRILLIC CAPITAL LETTER BASHKIR KA 697 | % Ҡ -> 698 | "" 699 | % CYRILLIC 
SMALL LETTER BASHKIR KA 700 | % ҡ -> 701 | "" 702 | % CYRILLIC CAPITAL LETTER EN WITH DESCENDER 703 | % Ң -> 704 | "" 705 | % CYRILLIC SMALL LETTER EN WITH DESCENDER 706 | % ң -> 707 | "" 708 | % CYRILLIC CAPITAL LIGATURE EN GHE 709 | % Ҥ -> 710 | "" 711 | % CYRILLIC SMALL LIGATURE EN GHE 712 | % ҥ -> 713 | "" 714 | % CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK 715 | % Ҧ -> 716 | "" 717 | % CYRILLIC SMALL LETTER PE WITH MIDDLE HOOK 718 | % ҧ -> 719 | "" 720 | % CYRILLIC CAPITAL LETTER ABKHASIAN HA 721 | % Ҩ -> 722 | "" 723 | % CYRILLIC SMALL LETTER ABKHASIAN HA 724 | % ҩ -> 725 | "" 726 | % CYRILLIC CAPITAL LETTER ES WITH DESCENDER 727 | % Ҫ -> 728 | "" 729 | % CYRILLIC SMALL LETTER ES WITH DESCENDER 730 | % ҫ -> 731 | "" 732 | % CYRILLIC CAPITAL LETTER TE WITH DESCENDER 733 | % Ҭ -> 734 | "" 735 | % CYRILLIC SMALL LETTER TE WITH DESCENDER 736 | % ҭ -> 737 | "" 738 | % CYRILLIC CAPITAL LETTER STRAIGHT U 739 | % Ү -> 740 | "" 741 | % CYRILLIC SMALL LETTER STRAIGHT U 742 | % ү -> 743 | "" 744 | % CYRILLIC CAPITAL LETTER STRAIGHT U WITH STROKE 745 | % Ұ -> 746 | "" 747 | % CYRILLIC SMALL LETTER STRAIGHT U WITH STROKE 748 | % ұ -> 749 | "" 750 | % CYRILLIC CAPITAL LETTER HA WITH DESCENDER 751 | % Ҳ -> 752 | "" 753 | % CYRILLIC SMALL LETTER HA WITH DESCENDER 754 | % ҳ -> 755 | "" 756 | % CYRILLIC CAPITAL LIGATURE TE TSE 757 | % Ҵ -> 758 | "" 759 | % CYRILLIC SMALL LIGATURE TE TSE 760 | % ҵ -> 761 | "" 762 | % CYRILLIC CAPITAL LETTER CHE WITH DESCENDER 763 | % Ҷ -> 764 | "" 765 | % CYRILLIC SMALL LETTER CHE WITH DESCENDER 766 | % ҷ -> 767 | "" 768 | % CYRILLIC CAPITAL LETTER CHE WITH VERTICAL STROKE 769 | % Ҹ -> 770 | "" 771 | % CYRILLIC SMALL LETTER CHE WITH VERTICAL STROKE 772 | % ҹ -> 773 | "" 774 | % CYRILLIC CAPITAL LETTER SHHA 775 | % Һ -> 776 | "" 777 | % CYRILLIC SMALL LETTER SHHA 778 | % һ -> 779 | "" 780 | % CYRILLIC CAPITAL LETTER ABKHASIAN CHE 781 | % Ҽ -> 782 | "" 783 | % CYRILLIC SMALL LETTER ABKHASIAN CHE 784 | % ҽ -> 785 | "" 786 | % CYRILLIC CAPITAL 
LETTER ABKHASIAN CHE WITH DESCENDER 787 | % Ҿ -> 788 | "" 789 | % CYRILLIC SMALL LETTER ABKHASIAN CHE WITH DESCENDER 790 | % ҿ -> 791 | "" 792 | % CYRILLIC LETTER PALOCHKA 793 | % Ӏ -> 794 | "" 795 | % CYRILLIC CAPITAL LETTER ZHE WITH BREVE 796 | % Ӂ -> 797 | "" 798 | % CYRILLIC SMALL LETTER ZHE WITH BREVE 799 | % ӂ -> 800 | "" 801 | % CYRILLIC CAPITAL LETTER KA WITH HOOK 802 | % Ӄ -> 803 | "" 804 | % CYRILLIC SMALL LETTER KA WITH HOOK 805 | % ӄ -> 806 | "" 807 | % CYRILLIC CAPITAL LETTER EN WITH HOOK 808 | % Ӈ -> 809 | "" 810 | % CYRILLIC SMALL LETTER EN WITH HOOK 811 | % ӈ -> 812 | "" 813 | % CYRILLIC CAPITAL LETTER KHAKASSIAN CHE 814 | % Ӌ -> 815 | "" 816 | % CYRILLIC SMALL LETTER KHAKASSIAN CHE 817 | % ӌ -> 818 | "" 819 | % CYRILLIC CAPITAL LETTER A WITH BREVE 820 | % Ӑ -> 821 | "" 822 | % CYRILLIC SMALL LETTER A WITH BREVE 823 | % ӑ -> 824 | "" 825 | % CYRILLIC CAPITAL LETTER A WITH DIAERESIS 826 | % Ӓ -> 827 | "" 828 | % CYRILLIC SMALL LETTER A WITH DIAERESIS 829 | % ӓ -> 830 | "" 831 | % CYRILLIC CAPITAL LIGATURE A IE 832 | % Ӕ -> 833 | "" 834 | % CYRILLIC SMALL LIGATURE A IE 835 | % ӕ -> 836 | "" 837 | % CYRILLIC CAPITAL LETTER IE WITH BREVE 838 | % Ӗ -> 839 | "" 840 | % CYRILLIC SMALL LETTER IE WITH BREVE 841 | % ӗ -> 842 | "" 843 | % CYRILLIC CAPITAL LETTER SCHWA 844 | % Ә -> 845 | "" 846 | % CYRILLIC SMALL LETTER SCHWA 847 | % ә -> 848 | "" 849 | % CYRILLIC CAPITAL LETTER SCHWA WITH DIAERESIS 850 | % Ӛ -> 851 | "" 852 | % CYRILLIC SMALL LETTER SCHWA WITH DIAERESIS 853 | % ӛ -> 854 | "" 855 | % CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS 856 | % Ӝ -> 857 | "" 858 | % CYRILLIC SMALL LETTER ZHE WITH DIAERESIS 859 | % ӝ -> 860 | "" 861 | % CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS 862 | % Ӟ -> 863 | "" 864 | % CYRILLIC SMALL LETTER ZE WITH DIAERESIS 865 | % ӟ -> 866 | "" 867 | % CYRILLIC CAPITAL LETTER ABKHASIAN DZE 868 | % Ӡ -> 869 | "" 870 | % CYRILLIC SMALL LETTER ABKHASIAN DZE 871 | % ӡ -> 872 | "" 873 | % CYRILLIC CAPITAL LETTER I WITH MACRON 874 | % Ӣ 
-> 875 | "" 876 | % CYRILLIC SMALL LETTER I WITH MACRON 877 | % ӣ -> 878 | "" 879 | % CYRILLIC CAPITAL LETTER I WITH DIAERESIS 880 | % Ӥ -> 881 | "" 882 | % CYRILLIC SMALL LETTER I WITH DIAERESIS 883 | % ӥ -> 884 | "" 885 | % CYRILLIC CAPITAL LETTER O WITH DIAERESIS 886 | % Ӧ -> 887 | "" 888 | % CYRILLIC SMALL LETTER O WITH DIAERESIS 889 | % ӧ -> 890 | "" 891 | % CYRILLIC CAPITAL LETTER BARRED O 892 | % Ө -> 893 | "" 894 | % CYRILLIC SMALL LETTER BARRED O 895 | % ө -> 896 | "" 897 | % CYRILLIC CAPITAL LETTER BARRED O WITH DIAERESIS 898 | % Ӫ -> 899 | "" 900 | % CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS 901 | % ӫ -> 902 | "" 903 | % CYRILLIC CAPITAL LETTER U WITH MACRON 904 | % Ӯ -> 905 | "" 906 | % CYRILLIC SMALL LETTER U WITH MACRON 907 | % ӯ -> 908 | "" 909 | % CYRILLIC CAPITAL LETTER U WITH DIAERESIS 910 | % Ӱ -> 911 | "" 912 | % CYRILLIC SMALL LETTER U WITH DIAERESIS 913 | % ӱ -> 914 | "" 915 | % CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE 916 | % Ӳ -> 917 | "" 918 | % CYRILLIC SMALL LETTER U WITH DOUBLE ACUTE 919 | % ӳ -> 920 | "" 921 | % CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS 922 | % Ӵ -> 923 | "" 924 | % CYRILLIC SMALL LETTER CHE WITH DIAERESIS 925 | % ӵ -> 926 | "" 927 | % CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS 928 | % Ӹ -> 929 | "" 930 | % CYRILLIC SMALL LETTER YERU WITH DIAERESIS 931 | % ӹ -> 932 | "" 933 | % LATIN SMALL LETTER LONG S WITH DOT ABOVE 934 | % ẛ -> 935 | "" 936 | % GREEK SMALL LETTER ALPHA WITH PSILI 937 | % ἀ -> 938 | "" 939 | % GREEK SMALL LETTER ALPHA WITH DASIA 940 | % ἁ -> 941 | "" 942 | % GREEK SMALL LETTER ALPHA WITH PSILI AND VARIA 943 | % ἂ -> 944 | "" 945 | % GREEK SMALL LETTER ALPHA WITH DASIA AND VARIA 946 | % ἃ -> 947 | "" 948 | % GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA 949 | % ἄ -> 950 | "" 951 | % GREEK SMALL LETTER ALPHA WITH DASIA AND OXIA 952 | % ἅ -> 953 | "" 954 | % GREEK SMALL LETTER ALPHA WITH PSILI AND PERISPOMENI 955 | % ἆ -> 956 | "" 957 | % GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI 958 | 
% ἇ -> 959 | "" 960 | % GREEK CAPITAL LETTER ALPHA WITH PSILI 961 | % Ἀ -> 962 | "" 963 | % GREEK CAPITAL LETTER ALPHA WITH DASIA 964 | % Ἁ -> 965 | "" 966 | % GREEK CAPITAL LETTER ALPHA WITH PSILI AND VARIA 967 | % Ἂ -> 968 | "" 969 | % GREEK CAPITAL LETTER ALPHA WITH DASIA AND VARIA 970 | % Ἃ -> 971 | "" 972 | % GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA 973 | % Ἄ -> 974 | "" 975 | % GREEK CAPITAL LETTER ALPHA WITH DASIA AND OXIA 976 | % Ἅ -> 977 | "" 978 | % GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI 979 | % Ἆ -> 980 | "" 981 | % GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI 982 | % Ἇ -> 983 | "" 984 | % GREEK SMALL LETTER EPSILON WITH PSILI 985 | % ἐ -> 986 | "" 987 | % GREEK SMALL LETTER EPSILON WITH DASIA 988 | % ἑ -> 989 | "" 990 | % GREEK SMALL LETTER EPSILON WITH PSILI AND VARIA 991 | % ἒ -> 992 | "" 993 | % GREEK SMALL LETTER EPSILON WITH DASIA AND VARIA 994 | % ἓ -> 995 | "" 996 | % GREEK SMALL LETTER EPSILON WITH PSILI AND OXIA 997 | % ἔ -> 998 | "" 999 | % GREEK SMALL LETTER EPSILON WITH DASIA AND OXIA 1000 | % ἕ -> 1001 | "" 1002 | % GREEK CAPITAL LETTER EPSILON WITH PSILI 1003 | % Ἐ -> 1004 | "" 1005 | % GREEK CAPITAL LETTER EPSILON WITH DASIA 1006 | % Ἑ -> 1007 | "" 1008 | % GREEK CAPITAL LETTER EPSILON WITH PSILI AND VARIA 1009 | % Ἒ -> 1010 | "" 1011 | % GREEK CAPITAL LETTER EPSILON WITH DASIA AND VARIA 1012 | % Ἓ -> 1013 | "" 1014 | % GREEK CAPITAL LETTER EPSILON WITH PSILI AND OXIA 1015 | % Ἔ -> 1016 | "" 1017 | % GREEK CAPITAL LETTER EPSILON WITH DASIA AND OXIA 1018 | % Ἕ -> 1019 | "" 1020 | % GREEK SMALL LETTER ETA WITH PSILI 1021 | % ἠ -> 1022 | "" 1023 | % GREEK SMALL LETTER ETA WITH DASIA 1024 | % ἡ -> 1025 | "" 1026 | % GREEK SMALL LETTER ETA WITH PSILI AND VARIA 1027 | % ἢ -> 1028 | "" 1029 | % GREEK SMALL LETTER ETA WITH DASIA AND VARIA 1030 | % ἣ -> 1031 | "" 1032 | % GREEK SMALL LETTER ETA WITH PSILI AND OXIA 1033 | % ἤ -> 1034 | "" 1035 | % GREEK SMALL LETTER ETA WITH DASIA AND OXIA 1036 | % ἥ -> 1037 | "" 1038 
| % GREEK SMALL LETTER ETA WITH PSILI AND PERISPOMENI 1039 | % ἦ -> 1040 | "" 1041 | % GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI 1042 | % ἧ -> 1043 | "" 1044 | % GREEK CAPITAL LETTER ETA WITH PSILI 1045 | % Ἠ -> 1046 | "" 1047 | % GREEK CAPITAL LETTER ETA WITH DASIA 1048 | % Ἡ -> 1049 | "" 1050 | % GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA 1051 | % Ἢ -> 1052 | "" 1053 | % GREEK CAPITAL LETTER ETA WITH DASIA AND VARIA 1054 | % Ἣ -> 1055 | "" 1056 | % GREEK CAPITAL LETTER ETA WITH PSILI AND OXIA 1057 | % Ἤ -> 1058 | "" 1059 | % GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA 1060 | % Ἥ -> 1061 | "" 1062 | % GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI 1063 | % Ἦ -> 1064 | "" 1065 | % GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI 1066 | % Ἧ -> 1067 | "" 1068 | % GREEK SMALL LETTER IOTA WITH PSILI 1069 | % ἰ -> 1070 | "" 1071 | % GREEK SMALL LETTER IOTA WITH DASIA 1072 | % ἱ -> 1073 | "" 1074 | % GREEK SMALL LETTER IOTA WITH PSILI AND VARIA 1075 | % ἲ -> 1076 | "" 1077 | % GREEK SMALL LETTER IOTA WITH DASIA AND VARIA 1078 | % ἳ -> 1079 | "" 1080 | % GREEK SMALL LETTER IOTA WITH PSILI AND OXIA 1081 | % ἴ -> 1082 | "" 1083 | % GREEK SMALL LETTER IOTA WITH DASIA AND OXIA 1084 | % ἵ -> 1085 | "" 1086 | % GREEK SMALL LETTER IOTA WITH PSILI AND PERISPOMENI 1087 | % ἶ -> 1088 | "" 1089 | % GREEK SMALL LETTER IOTA WITH DASIA AND PERISPOMENI 1090 | % ἷ -> 1091 | "" 1092 | % GREEK CAPITAL LETTER IOTA WITH PSILI 1093 | % Ἰ -> 1094 | "" 1095 | % GREEK CAPITAL LETTER IOTA WITH DASIA 1096 | % Ἱ -> 1097 | "" 1098 | % GREEK CAPITAL LETTER IOTA WITH PSILI AND VARIA 1099 | % Ἲ -> 1100 | "" 1101 | % GREEK CAPITAL LETTER IOTA WITH DASIA AND VARIA 1102 | % Ἳ -> 1103 | "" 1104 | % GREEK CAPITAL LETTER IOTA WITH PSILI AND OXIA 1105 | % Ἴ -> 1106 | "" 1107 | % GREEK CAPITAL LETTER IOTA WITH DASIA AND OXIA 1108 | % Ἵ -> 1109 | "" 1110 | % GREEK CAPITAL LETTER IOTA WITH PSILI AND PERISPOMENI 1111 | % Ἶ -> 1112 | "" 1113 | % GREEK CAPITAL LETTER IOTA WITH DASIA AND 
PERISPOMENI 1114 | % Ἷ -> 1115 | "" 1116 | % GREEK SMALL LETTER OMICRON WITH PSILI 1117 | % ὀ -> 1118 | "" 1119 | % GREEK SMALL LETTER OMICRON WITH DASIA 1120 | % ὁ -> 1121 | "" 1122 | % GREEK SMALL LETTER OMICRON WITH PSILI AND VARIA 1123 | % ὂ -> 1124 | "" 1125 | % GREEK SMALL LETTER OMICRON WITH DASIA AND VARIA 1126 | % ὃ -> 1127 | "" 1128 | % GREEK SMALL LETTER OMICRON WITH PSILI AND OXIA 1129 | % ὄ -> 1130 | "" 1131 | % GREEK SMALL LETTER OMICRON WITH DASIA AND OXIA 1132 | % ὅ -> 1133 | "" 1134 | % GREEK CAPITAL LETTER OMICRON WITH PSILI 1135 | % Ὀ -> 1136 | "" 1137 | % GREEK CAPITAL LETTER OMICRON WITH DASIA 1138 | % Ὁ -> 1139 | "" 1140 | % GREEK CAPITAL LETTER OMICRON WITH PSILI AND VARIA 1141 | % Ὂ -> 1142 | "" 1143 | % GREEK CAPITAL LETTER OMICRON WITH DASIA AND VARIA 1144 | % Ὃ -> 1145 | "" 1146 | % GREEK CAPITAL LETTER OMICRON WITH PSILI AND OXIA 1147 | % Ὄ -> 1148 | "" 1149 | % GREEK CAPITAL LETTER OMICRON WITH DASIA AND OXIA 1150 | % Ὅ -> 1151 | "" 1152 | % GREEK SMALL LETTER UPSILON WITH PSILI 1153 | % ὐ -> 1154 | "" 1155 | % GREEK SMALL LETTER UPSILON WITH DASIA 1156 | % ὑ -> 1157 | "" 1158 | % GREEK SMALL LETTER UPSILON WITH PSILI AND VARIA 1159 | % ὒ -> 1160 | "" 1161 | % GREEK SMALL LETTER UPSILON WITH DASIA AND VARIA 1162 | % ὓ -> 1163 | "" 1164 | % GREEK SMALL LETTER UPSILON WITH PSILI AND OXIA 1165 | % ὔ -> 1166 | "" 1167 | % GREEK SMALL LETTER UPSILON WITH DASIA AND OXIA 1168 | % ὕ -> 1169 | "" 1170 | % GREEK SMALL LETTER UPSILON WITH PSILI AND PERISPOMENI 1171 | % ὖ -> 1172 | "" 1173 | % GREEK SMALL LETTER UPSILON WITH DASIA AND PERISPOMENI 1174 | % ὗ -> 1175 | "" 1176 | % GREEK CAPITAL LETTER UPSILON WITH DASIA 1177 | % Ὑ -> 1178 | "" 1179 | % GREEK CAPITAL LETTER UPSILON WITH DASIA AND VARIA 1180 | % Ὓ -> 1181 | "" 1182 | % GREEK CAPITAL LETTER UPSILON WITH DASIA AND OXIA 1183 | % Ὕ -> 1184 | "" 1185 | % GREEK CAPITAL LETTER UPSILON WITH DASIA AND PERISPOMENI 1186 | % Ὗ -> 1187 | "" 1188 | % GREEK SMALL LETTER OMEGA WITH PSILI 1189 | % ὠ -> 
1190 | "" 1191 | % GREEK SMALL LETTER OMEGA WITH DASIA 1192 | % ὡ -> 1193 | "" 1194 | % GREEK SMALL LETTER OMEGA WITH PSILI AND VARIA 1195 | % ὢ -> 1196 | "" 1197 | % GREEK SMALL LETTER OMEGA WITH DASIA AND VARIA 1198 | % ὣ -> 1199 | "" 1200 | % GREEK SMALL LETTER OMEGA WITH PSILI AND OXIA 1201 | % ὤ -> 1202 | "" 1203 | % GREEK SMALL LETTER OMEGA WITH DASIA AND OXIA 1204 | % ὥ -> 1205 | "" 1206 | % GREEK SMALL LETTER OMEGA WITH PSILI AND PERISPOMENI 1207 | % ὦ -> 1208 | "" 1209 | % GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI 1210 | % ὧ -> 1211 | "" 1212 | % GREEK CAPITAL LETTER OMEGA WITH PSILI 1213 | % Ὠ -> 1214 | "" 1215 | % GREEK CAPITAL LETTER OMEGA WITH DASIA 1216 | % Ὡ -> 1217 | "" 1218 | % GREEK CAPITAL LETTER OMEGA WITH PSILI AND VARIA 1219 | % Ὢ -> 1220 | "" 1221 | % GREEK CAPITAL LETTER OMEGA WITH DASIA AND VARIA 1222 | % Ὣ -> 1223 | "" 1224 | % GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA 1225 | % Ὤ -> 1226 | "" 1227 | % GREEK CAPITAL LETTER OMEGA WITH DASIA AND OXIA 1228 | % Ὥ -> 1229 | "" 1230 | % GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI 1231 | % Ὦ -> 1232 | "" 1233 | % GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI 1234 | % Ὧ -> 1235 | "" 1236 | % GREEK SMALL LETTER ALPHA WITH VARIA 1237 | % ὰ -> 1238 | "" 1239 | % GREEK SMALL LETTER ALPHA WITH OXIA 1240 | % ά -> 1241 | "" 1242 | % GREEK SMALL LETTER EPSILON WITH VARIA 1243 | % ὲ -> 1244 | "" 1245 | % GREEK SMALL LETTER EPSILON WITH OXIA 1246 | % έ -> 1247 | "" 1248 | % GREEK SMALL LETTER ETA WITH VARIA 1249 | % ὴ -> 1250 | "" 1251 | % GREEK SMALL LETTER ETA WITH OXIA 1252 | % ή -> 1253 | "" 1254 | % GREEK SMALL LETTER IOTA WITH VARIA 1255 | % ὶ -> 1256 | "" 1257 | % GREEK SMALL LETTER IOTA WITH OXIA 1258 | % ί -> 1259 | "" 1260 | % GREEK SMALL LETTER OMICRON WITH VARIA 1261 | % ὸ -> 1262 | "" 1263 | % GREEK SMALL LETTER OMICRON WITH OXIA 1264 | % ό -> 1265 | "" 1266 | % GREEK SMALL LETTER UPSILON WITH VARIA 1267 | % ὺ -> 1268 | "" 1269 | % GREEK SMALL LETTER UPSILON WITH 
OXIA 1270 | % ύ -> 1271 | "" 1272 | % GREEK SMALL LETTER OMEGA WITH VARIA 1273 | % ὼ -> 1274 | "" 1275 | % GREEK SMALL LETTER OMEGA WITH OXIA 1276 | % ώ -> 1277 | "" 1278 | % GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI 1279 | % ᾀ -> 1280 | "" 1281 | % GREEK SMALL LETTER ALPHA WITH DASIA AND YPOGEGRAMMENI 1282 | % ᾁ -> 1283 | "" 1284 | % GREEK SMALL LETTER ALPHA WITH PSILI AND VARIA AND YPOGEGRAMMENI 1285 | % ᾂ -> 1286 | "" 1287 | % GREEK SMALL LETTER ALPHA WITH DASIA AND VARIA AND YPOGEGRAMMENI 1288 | % ᾃ -> 1289 | "" 1290 | % GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA AND YPOGEGRAMMENI 1291 | % ᾄ -> 1292 | "" 1293 | % GREEK SMALL LETTER ALPHA WITH DASIA AND OXIA AND YPOGEGRAMMENI 1294 | % ᾅ -> 1295 | "" 1296 | % GREEK SMALL LETTER ALPHA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI 1297 | % ᾆ -> 1298 | "" 1299 | % GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI 1300 | % ᾇ -> 1301 | "" 1302 | % GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI 1303 | % ᾈ -> 1304 | "" 1305 | % GREEK CAPITAL LETTER ALPHA WITH DASIA AND PROSGEGRAMMENI 1306 | % ᾉ -> 1307 | "" 1308 | % GREEK CAPITAL LETTER ALPHA WITH PSILI AND VARIA AND PROSGEGRAMMENI 1309 | % ᾊ -> 1310 | "" 1311 | % GREEK CAPITAL LETTER ALPHA WITH DASIA AND VARIA AND PROSGEGRAMMENI 1312 | % ᾋ -> 1313 | "" 1314 | % GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA AND PROSGEGRAMMENI 1315 | % ᾌ -> 1316 | "" 1317 | % GREEK CAPITAL LETTER ALPHA WITH DASIA AND OXIA AND PROSGEGRAMMENI 1318 | % ᾍ -> 1319 | "" 1320 | % GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI 1321 | % ᾎ -> 1322 | "" 1323 | % GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI 1324 | % ᾏ -> 1325 | "" 1326 | % GREEK SMALL LETTER ETA WITH PSILI AND YPOGEGRAMMENI 1327 | % ᾐ -> 1328 | "" 1329 | % GREEK SMALL LETTER ETA WITH DASIA AND YPOGEGRAMMENI 1330 | % ᾑ -> 1331 | "" 1332 | % GREEK SMALL LETTER ETA WITH PSILI AND VARIA AND YPOGEGRAMMENI 1333 | % ᾒ -> 1334 | "" 1335 | % 
GREEK SMALL LETTER ETA WITH DASIA AND VARIA AND YPOGEGRAMMENI 1336 | % ᾓ -> 1337 | "" 1338 | % GREEK SMALL LETTER ETA WITH PSILI AND OXIA AND YPOGEGRAMMENI 1339 | % ᾔ -> 1340 | "" 1341 | % GREEK SMALL LETTER ETA WITH DASIA AND OXIA AND YPOGEGRAMMENI 1342 | % ᾕ -> 1343 | "" 1344 | % GREEK SMALL LETTER ETA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI 1345 | % ᾖ -> 1346 | "" 1347 | % GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI 1348 | % ᾗ -> 1349 | "" 1350 | % GREEK CAPITAL LETTER ETA WITH PSILI AND PROSGEGRAMMENI 1351 | % ᾘ -> 1352 | "" 1353 | % GREEK CAPITAL LETTER ETA WITH DASIA AND PROSGEGRAMMENI 1354 | % ᾙ -> 1355 | "" 1356 | % GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA AND PROSGEGRAMMENI 1357 | % ᾚ -> 1358 | "" 1359 | % GREEK CAPITAL LETTER ETA WITH DASIA AND VARIA AND PROSGEGRAMMENI 1360 | % ᾛ -> 1361 | "" 1362 | % GREEK CAPITAL LETTER ETA WITH PSILI AND OXIA AND PROSGEGRAMMENI 1363 | % ᾜ -> 1364 | "" 1365 | % GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA AND PROSGEGRAMMENI 1366 | % ᾝ -> 1367 | "" 1368 | % GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI 1369 | % ᾞ -> 1370 | "" 1371 | % GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI 1372 | % ᾟ -> 1373 | "" 1374 | % GREEK SMALL LETTER OMEGA WITH PSILI AND YPOGEGRAMMENI 1375 | % ᾠ -> 1376 | "" 1377 | % GREEK SMALL LETTER OMEGA WITH DASIA AND YPOGEGRAMMENI 1378 | % ᾡ -> 1379 | "" 1380 | % GREEK SMALL LETTER OMEGA WITH PSILI AND VARIA AND YPOGEGRAMMENI 1381 | % ᾢ -> 1382 | "" 1383 | % GREEK SMALL LETTER OMEGA WITH DASIA AND VARIA AND YPOGEGRAMMENI 1384 | % ᾣ -> 1385 | "" 1386 | % GREEK SMALL LETTER OMEGA WITH PSILI AND OXIA AND YPOGEGRAMMENI 1387 | % ᾤ -> 1388 | "" 1389 | % GREEK SMALL LETTER OMEGA WITH DASIA AND OXIA AND YPOGEGRAMMENI 1390 | % ᾥ -> 1391 | "" 1392 | % GREEK SMALL LETTER OMEGA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI 1393 | % ᾦ -> 1394 | "" 1395 | % GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI 1396 | 
% ᾧ -> 1397 | "" 1398 | % GREEK CAPITAL LETTER OMEGA WITH PSILI AND PROSGEGRAMMENI 1399 | % ᾨ -> 1400 | "" 1401 | % GREEK CAPITAL LETTER OMEGA WITH DASIA AND PROSGEGRAMMENI 1402 | % ᾩ -> 1403 | "" 1404 | % GREEK CAPITAL LETTER OMEGA WITH PSILI AND VARIA AND PROSGEGRAMMENI 1405 | % ᾪ -> 1406 | "" 1407 | % GREEK CAPITAL LETTER OMEGA WITH DASIA AND VARIA AND PROSGEGRAMMENI 1408 | % ᾫ -> 1409 | "" 1410 | % GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA AND PROSGEGRAMMENI 1411 | % ᾬ -> 1412 | "" 1413 | % GREEK CAPITAL LETTER OMEGA WITH DASIA AND OXIA AND PROSGEGRAMMENI 1414 | % ᾭ -> 1415 | "" 1416 | % GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI 1417 | % ᾮ -> 1418 | "" 1419 | % GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI 1420 | % ᾯ -> 1421 | "" 1422 | % GREEK SMALL LETTER ALPHA WITH VRACHY 1423 | % ᾰ -> 1424 | "" 1425 | % GREEK SMALL LETTER ALPHA WITH MACRON 1426 | % ᾱ -> 1427 | "" 1428 | % GREEK SMALL LETTER ALPHA WITH VARIA AND YPOGEGRAMMENI 1429 | % ᾲ -> 1430 | "" 1431 | % GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI 1432 | % ᾳ -> 1433 | "" 1434 | % GREEK SMALL LETTER ALPHA WITH OXIA AND YPOGEGRAMMENI 1435 | % ᾴ -> 1436 | "" 1437 | % GREEK SMALL LETTER ALPHA WITH PERISPOMENI 1438 | % ᾶ -> 1439 | "" 1440 | % GREEK SMALL LETTER ALPHA WITH PERISPOMENI AND YPOGEGRAMMENI 1441 | % ᾷ -> 1442 | "" 1443 | % GREEK CAPITAL LETTER ALPHA WITH VRACHY 1444 | % Ᾰ -> 1445 | "" 1446 | % GREEK CAPITAL LETTER ALPHA WITH MACRON 1447 | % Ᾱ -> 1448 | "" 1449 | % GREEK CAPITAL LETTER ALPHA WITH VARIA 1450 | % Ὰ -> 1451 | "" 1452 | % GREEK CAPITAL LETTER ALPHA WITH OXIA 1453 | % Ά -> 1454 | "" 1455 | % GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI 1456 | % ᾼ -> 1457 | "" 1458 | % GREEK KORONIS 1459 | % ᾽ -> 1460 | "" 1461 | % GREEK PROSGEGRAMMENI 1462 | % ι -> 1463 | "" 1464 | % GREEK PSILI 1465 | % ᾿ -> 1466 | "" 1467 | % GREEK PERISPOMENI 1468 | % ῀ -> 1469 | "" 1470 | % GREEK DIALYTIKA AND PERISPOMENI 1471 | % ῁ -> 1472 | "" 1473 | 
% GREEK SMALL LETTER ETA WITH VARIA AND YPOGEGRAMMENI 1474 | % ῂ -> 1475 | "" 1476 | % GREEK SMALL LETTER ETA WITH YPOGEGRAMMENI 1477 | % ῃ -> 1478 | "" 1479 | % GREEK SMALL LETTER ETA WITH OXIA AND YPOGEGRAMMENI 1480 | % ῄ -> 1481 | "" 1482 | % GREEK SMALL LETTER ETA WITH PERISPOMENI 1483 | % ῆ -> 1484 | "" 1485 | % GREEK SMALL LETTER ETA WITH PERISPOMENI AND YPOGEGRAMMENI 1486 | % ῇ -> 1487 | "" 1488 | % GREEK CAPITAL LETTER EPSILON WITH VARIA 1489 | % Ὲ -> 1490 | "" 1491 | % GREEK CAPITAL LETTER EPSILON WITH OXIA 1492 | % Έ -> 1493 | "" 1494 | % GREEK CAPITAL LETTER ETA WITH VARIA 1495 | % Ὴ -> 1496 | "" 1497 | % GREEK CAPITAL LETTER ETA WITH OXIA 1498 | % Ή -> 1499 | "" 1500 | % GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI 1501 | % ῌ -> 1502 | "" 1503 | % GREEK PSILI AND VARIA 1504 | % ῍ -> 1505 | "" 1506 | % GREEK PSILI AND OXIA 1507 | % ῎ -> 1508 | "" 1509 | % GREEK PSILI AND PERISPOMENI 1510 | % ῏ -> 1511 | "" 1512 | % GREEK SMALL LETTER IOTA WITH VRACHY 1513 | % ῐ -> 1514 | "" 1515 | % GREEK SMALL LETTER IOTA WITH MACRON 1516 | % ῑ -> 1517 | "" 1518 | % GREEK SMALL LETTER IOTA WITH DIALYTIKA AND VARIA 1519 | % ῒ -> 1520 | "" 1521 | % GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA 1522 | % ΐ -> 1523 | "" 1524 | % GREEK SMALL LETTER IOTA WITH PERISPOMENI 1525 | % ῖ -> 1526 | "" 1527 | % GREEK SMALL LETTER IOTA WITH DIALYTIKA AND PERISPOMENI 1528 | % ῗ -> 1529 | "" 1530 | % GREEK CAPITAL LETTER IOTA WITH VRACHY 1531 | % Ῐ -> 1532 | "" 1533 | % GREEK CAPITAL LETTER IOTA WITH MACRON 1534 | % Ῑ -> 1535 | "" 1536 | % GREEK CAPITAL LETTER IOTA WITH VARIA 1537 | % Ὶ -> 1538 | "" 1539 | % GREEK CAPITAL LETTER IOTA WITH OXIA 1540 | % Ί -> 1541 | "" 1542 | % GREEK DASIA AND VARIA 1543 | % ῝ -> 1544 | "" 1545 | % GREEK DASIA AND OXIA 1546 | % ῞ -> 1547 | "" 1548 | % GREEK DASIA AND PERISPOMENI 1549 | % ῟ -> 1550 | "" 1551 | % GREEK SMALL LETTER UPSILON WITH VRACHY 1552 | % ῠ -> 1553 | "" 1554 | % GREEK SMALL LETTER UPSILON WITH MACRON 1555 | % ῡ -> 1556 | "" 1557 | % 
GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND VARIA 1558 | % ῢ -> 1559 | "" 1560 | % GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA 1561 | % ΰ -> 1562 | "" 1563 | % GREEK SMALL LETTER RHO WITH PSILI 1564 | % ῤ -> 1565 | "" 1566 | % GREEK SMALL LETTER RHO WITH DASIA 1567 | % ῥ -> 1568 | "" 1569 | % GREEK SMALL LETTER UPSILON WITH PERISPOMENI 1570 | % ῦ -> 1571 | "" 1572 | % GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND PERISPOMENI 1573 | % ῧ -> 1574 | "" 1575 | % GREEK CAPITAL LETTER UPSILON WITH VRACHY 1576 | % Ῠ -> 1577 | "" 1578 | % GREEK CAPITAL LETTER UPSILON WITH MACRON 1579 | % Ῡ -> 1580 | "" 1581 | % GREEK CAPITAL LETTER UPSILON WITH VARIA 1582 | % Ὺ -> 1583 | "" 1584 | % GREEK CAPITAL LETTER UPSILON WITH OXIA 1585 | % Ύ -> 1586 | "" 1587 | % GREEK CAPITAL LETTER RHO WITH DASIA 1588 | % Ῥ -> 1589 | "" 1590 | % GREEK DIALYTIKA AND VARIA 1591 | % ῭ -> 1592 | "" 1593 | % GREEK DIALYTIKA AND OXIA 1594 | % ΅ -> 1595 | "" 1596 | % GREEK VARIA 1597 | % ` -> 1598 | "" 1599 | % GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGRAMMENI 1600 | % ῲ -> 1601 | "" 1602 | % GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI 1603 | % ῳ -> 1604 | "" 1605 | % GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI 1606 | % ῴ -> 1607 | "" 1608 | % GREEK SMALL LETTER OMEGA WITH PERISPOMENI 1609 | % ῶ -> 1610 | "" 1611 | % GREEK SMALL LETTER OMEGA WITH PERISPOMENI AND YPOGEGRAMMENI 1612 | % ῷ -> 1613 | "" 1614 | % GREEK CAPITAL LETTER OMICRON WITH VARIA 1615 | % Ὸ -> 1616 | "" 1617 | % GREEK CAPITAL LETTER OMICRON WITH OXIA 1618 | % Ό -> 1619 | "" 1620 | % GREEK CAPITAL LETTER OMEGA WITH VARIA 1621 | % Ὼ -> 1622 | "" 1623 | % GREEK CAPITAL LETTER OMEGA WITH OXIA 1624 | % Ώ -> 1625 | "" 1626 | % GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI 1627 | % ῼ -> 1628 | "" 1629 | % GREEK OXIA 1630 | % ´ -> 1631 | "" 1632 | % GREEK DASIA 1633 | % ῾ -> 1634 | "" 1635 | % FRENCH FRANC SIGN 1636 | % ₣ -> 1637 | "" 1638 | % LIRA SIGN 1639 | % ₤ -> 1640 | "" 1641 | % PESETA SIGN 1642 | % ₧ -> 1643 | "" 
1644 | % DRACHMA SIGN 1645 | % ₯ -> 1646 | "" 1647 | % UP DOWN ARROW 1648 | % ↕ -> 1649 | "" 1650 | % UP DOWN ARROW WITH BASE 1651 | % ↨ -> 1652 | "" 1653 | % FOR ALL 1654 | % ∀ -> 1655 | "" 1656 | % PARTIAL DIFFERENTIAL 1657 | % ∂ -> 1658 | "" 1659 | % THERE EXISTS 1660 | % ∃ -> 1661 | "" 1662 | % INCREMENT 1663 | % ∆ -> 1664 | "" 1665 | % ELEMENT OF 1666 | % ∈ -> 1667 | "" 1668 | % NOT AN ELEMENT OF 1669 | % ∉ -> 1670 | "" 1671 | % N-ARY PRODUCT 1672 | % ∏ -> 1673 | "" 1674 | % N-ARY SUMMATION 1675 | % ∑ -> 1676 | "" 1677 | % SQUARE ROOT 1678 | % √ -> 1679 | "" 1680 | % RIGHT ANGLE 1681 | % ∟ -> 1682 | "" 1683 | % LOGICAL AND 1684 | % ∧ -> 1685 | "" 1686 | % LOGICAL OR 1687 | % ∨ -> 1688 | "" 1689 | % INTERSECTION 1690 | % ∩ -> 1691 | "" 1692 | % UNION 1693 | % ∪ -> 1694 | "" 1695 | % INTEGRAL 1696 | % ∫ -> 1697 | "" 1698 | % ALMOST EQUAL TO 1699 | % ≈ -> 1700 | "" 1701 | % ESTIMATES 1702 | % ≙ -> 1703 | "" 1704 | % SUBSET OF 1705 | % ⊂ -> 1706 | "" 1707 | % SUPERSET OF 1708 | % ⊃ -> 1709 | "" 1710 | % HOUSE 1711 | % ⌂ -> 1712 | "" 1713 | % REVERSED NOT SIGN 1714 | % ⌐ -> 1715 | "" 1716 | % TOP HALF INTEGRAL 1717 | % ⌠ -> 1718 | "" 1719 | % BOTTOM HALF INTEGRAL 1720 | % ⌡ -> 1721 | "" 1722 | % UPPER HALF BLOCK 1723 | % ▀ -> 1724 | "" 1725 | % LOWER HALF BLOCK 1726 | % ▄ -> 1727 | "" 1728 | % FULL BLOCK 1729 | % █ -> 1730 | "" 1731 | % LEFT HALF BLOCK 1732 | % ▌ -> 1733 | "" 1734 | % RIGHT HALF BLOCK 1735 | % ▐ -> 1736 | "" 1737 | % LIGHT SHADE 1738 | % ░ -> 1739 | "" 1740 | % MEDIUM SHADE 1741 | % ▒ -> 1742 | "" 1743 | % DARK SHADE 1744 | % ▓ -> 1745 | "" 1746 | % BLACK SQUARE 1747 | % ■ -> 1748 | "" 1749 | % BLACK RECTANGLE 1750 | % ▬ -> 1751 | "" 1752 | % BLACK UP-POINTING TRIANGLE 1753 | % ▲ -> 1754 | "" 1755 | % BLACK RIGHT-POINTING POINTER 1756 | % ► -> 1757 | "" 1758 | % BLACK DOWN-POINTING TRIANGLE 1759 | % ▼ -> 1760 | "" 1761 | % BLACK LEFT-POINTING POINTER 1762 | % ◄ -> 1763 | "" 1764 | % LOZENGE 1765 | % ◊ -> 1766 | "" 1767 | % INVERSE BULLET 1768 | % ◘ 
->
""
% INVERSE WHITE CIRCLE
% ◙ ->
""
% WHITE SUN WITH RAYS
% ☼ ->
""
% FEMALE SIGN
% ♀ ->
""
% MALE SIGN
% ♂ ->
""
% BLACK SPADE SUIT
% ♠ ->
""
% BLACK CLUB SUIT
% ♣ ->
""
% BLACK HEART SUIT
% ♥ ->
""
% BLACK DIAMOND SUIT
% ♦ ->
""
% EIGHTH NOTE
% ♪ ->
""
% BEAMED EIGHTH NOTES
% ♫ ->
""

--------------------------------------------------------------------------------
/translitcodec/__init__.py:
--------------------------------------------------------------------------------
"""Unicode to 8-bit charset transliteration codec.

This package contains codecs for transliterating ISO 10646 texts into
best-effort representations using smaller coded character sets (ASCII,
ISO 8859, etc.). The translation tables used by the codecs are from
the ``transtab`` collection by Markus Kuhn.

:copyright: the translitcodec authors and developers, see AUTHORS.
:license: MIT, see LICENSE for more details.

"""
import codecs
import sys
import unicodedata


__version_info__ = (0, 7, 0)
__version__ = '.'.join(str(_) for _ in __version_info__)


def long_encode(input, errors='strict'):
    """Transliterate to 8 bit using as many letters as needed.

    For example, \u00e4 LATIN SMALL LETTER A WITH DIAERESIS ``ä`` will
    be replaced with ``ae``.

    """
    if not isinstance(input, str):
        input = str(input, sys.getdefaultencoding(), errors)
    length = len(input)
    input = unicodedata.normalize('NFKC', input)
    return input.translate(long_table), length


def short_encode(input, errors='strict'):
    """Transliterate to 8 bit using as few letters as possible.

    For example, \u00e4 LATIN SMALL LETTER A WITH DIAERESIS ``ä`` will
    be replaced with ``a``.

    """
    if not isinstance(input, str):
        input = str(input, sys.getdefaultencoding(), errors)
    length = len(input)
    input = unicodedata.normalize('NFKC', input)
    return input.translate(short_table), length


def single_encode(input, errors='strict'):
    """Transliterate to 8 bit using only single letter replacements.

    For example, \u2639 WHITE FROWNING FACE ``☹`` will be passed
    through unchanged.

    """
    if not isinstance(input, str):
        input = str(input, sys.getdefaultencoding(), errors)
    length = len(input)
    input = unicodedata.normalize('NFKC', input)
    return input.translate(single_table), length


def _error_handle_base(exc, table, unknown_char_cb):
    if isinstance(exc, UnicodeEncodeError):
        char = unicodedata.normalize('NFKC', exc.object[exc.start:exc.end])[0]
        new_char = char.translate(table)
        if char == new_char:
            new_char = unknown_char_cb(char, new_char, exc)
        return new_char, exc.start + 1
    else:
        raise exc


def replace_long(exc):
    """Error handler for transliterate to 8 bit using as many letters as needed.

    For example, \u00e4 LATIN SMALL LETTER A WITH DIAERESIS ``ä`` will
    be replaced with ``ae``.

    If the character is not replaced, then the '?' character is returned.
    """
    return _error_handle_base(exc, long_table, lambda c, n, e: '?')


def replace_short(exc):
    """Error handler for transliterate to 8 bit using as few letters as possible.

    For example, \u00e4 LATIN SMALL LETTER A WITH DIAERESIS ``ä`` will
    be replaced with ``a``.

    If the character is not replaced, then the '?' character is returned.
    """
    return _error_handle_base(exc, short_table, lambda c, n, e: '?')


def replace_single(exc):
    """Error handler for transliterate to 8 bit using only single letter replacements.

    For example, \u2639 WHITE FROWNING FACE ``☹`` will be passed
    through unchanged.

    If the character is not replaced, then the '?' character is returned.
    """
    return _error_handle_base(exc, single_table, lambda c, n, e: '?')


def ignore_long(exc):
    """Error handler for transliterate to 8 bit using as many letters as needed.

    For example, \u00e4 LATIN SMALL LETTER A WITH DIAERESIS ``ä`` will
    be replaced with ``ae``.

    If the character is not replaced, then it will be skipped.
    """
    return _error_handle_base(exc, long_table, lambda c, n, e: '')


def ignore_short(exc):
    """Error handler for transliterate to 8 bit using as few letters as possible.

    For example, \u00e4 LATIN SMALL LETTER A WITH DIAERESIS ``ä`` will
    be replaced with ``a``.

    If the character is not replaced, then it will be skipped.
    """
    return _error_handle_base(exc, short_table, lambda c, n, e: '')


def ignore_single(exc):
    """Error handler for transliterate to 8 bit using only single letter replacements.

    For example, \u2639 WHITE FROWNING FACE ``☹`` will be passed
    through unchanged.

    If the character is not replaced, then it will be skipped.
    """
    return _error_handle_base(exc, single_table, lambda c, n, e: '')


def re_reaise(c, n, e):
    raise e


def strict_long(exc):
    """Error handler for transliterate to 8 bit using as many letters as needed.

    For example, \u00e4 LATIN SMALL LETTER A WITH DIAERESIS ``ä`` will
    be replaced with ``ae``.

    If the character is not replaced, then an exception is thrown.
    """
    return _error_handle_base(exc, long_table, re_reaise)


def strict_short(exc):
    """Error handler for transliterate to 8 bit using as few letters as possible.

    For example, \u00e4 LATIN SMALL LETTER A WITH DIAERESIS ``ä`` will
    be replaced with ``a``.

    If the character is not replaced, then an exception is thrown.
    """
    return _error_handle_base(exc, short_table, re_reaise)


def strict_single(exc):
    """Error handler for transliterate to 8 bit using only single letter replacements.

    For example, \u2639 WHITE FROWNING FACE ``☹`` will be passed
    through unchanged.

    If the character is not replaced, then an exception is thrown.
    """
    return _error_handle_base(exc, single_table, re_reaise)


def no_decode(input, errors='strict'):
    raise TypeError("transliterating codec does not support decode.")


def _double_encoding_factory(encoder, byte_encoder, byte_encoding):
    """Send the transliterated output to another codec."""
    def dbl_encode(input, errors='strict'):
        uni, length = encoder(input, errors)
        return byte_encoder(uni, errors)[0], length
    dbl_encode.__name__ = '%s_%s' % (encoder.__name__, byte_encoding)
    return dbl_encode


def trans_search(encoding):
    """Lookup transliterating codecs."""
    if encoding == 'transliterate':
        return codecs.CodecInfo(long_encode, no_decode)

    # translit/long/utf8
    # translit/one
    # translit/short/ascii

    delim = '/'
    if sys.version_info > (3, 9):
        delim = '_'

    if encoding.startswith('translit' + delim):
        parts = encoding.split(delim)
        if parts[1] == 'long':
            encoder = long_encode
        elif parts[1] == 'short':
            encoder = short_encode
        elif parts[1] == 'one':
            encoder = single_encode
        else:
            return None

        if len(parts) == 2:
            pass
        elif len(parts) == 3:
            byte_enc = parts[2]
            byte_encoder = codecs.lookup(byte_enc).encode
            encoder = _double_encoding_factory(encoder, byte_encoder, byte_enc)
        else:
            return None
        return codecs.CodecInfo(encoder, no_decode)
    return None

codecs.register(trans_search)

codecs.register_error('replace/translit/long', replace_long)
codecs.register_error('replace/translit/short', replace_short)
codecs.register_error('replace/translit/one', replace_single)

codecs.register_error('ignore/translit/long', ignore_long)
codecs.register_error('ignore/translit/short', ignore_short)
codecs.register_error('ignore/translit/one', ignore_single)

codecs.register_error('strict/translit/long', strict_long)
codecs.register_error('strict/translit/short', strict_short)
codecs.register_error('strict/translit/one', strict_single)

### Code below is generated by update_table.py; do not edit.
240 | ### > 241 | 242 | long_table = { 243 | 160: ' ', 244 | 161: '!', 245 | 162: 'c', 246 | 163: 'GBP', 247 | 165: 'Y', 248 | 166: '|', 249 | 167: 'S', 250 | 168: '"', 251 | 169: '(c)', 252 | 170: 'a', 253 | 171: '<<', 254 | 172: '-', 255 | 173: '-', 256 | 174: '(R)', 257 | 175: '-', 258 | 176: ' ', 259 | 177: '+/-', 260 | 178: '^2', 261 | 179: '^3', 262 | 180: "'", 263 | 181: 'μ', 264 | 182: 'P', 265 | 183: '.', 266 | 184: ',', 267 | 185: '^1', 268 | 186: 'o', 269 | 187: '>>', 270 | 188: ' 1/4', 271 | 189: ' 1/2', 272 | 190: ' 3/4', 273 | 191: '?', 274 | 192: 'A', 275 | 193: 'A', 276 | 194: 'A', 277 | 195: 'A', 278 | 196: 'Ae', 279 | 197: 'Aa', 280 | 198: 'AE', 281 | 199: 'C', 282 | 200: 'E', 283 | 201: 'E', 284 | 202: 'E', 285 | 203: 'E', 286 | 204: 'I', 287 | 205: 'I', 288 | 206: 'I', 289 | 207: 'I', 290 | 208: 'D', 291 | 209: 'N', 292 | 210: 'O', 293 | 211: 'O', 294 | 212: 'O', 295 | 213: 'O', 296 | 214: 'Oe', 297 | 215: 'x', 298 | 216: 'O', 299 | 217: 'U', 300 | 218: 'U', 301 | 219: 'U', 302 | 220: 'Ue', 303 | 221: 'Y', 304 | 222: 'Th', 305 | 223: 'ss', 306 | 224: 'a', 307 | 225: 'a', 308 | 226: 'a', 309 | 227: 'a', 310 | 228: 'ae', 311 | 229: 'aa', 312 | 230: 'ae', 313 | 231: 'c', 314 | 232: 'e', 315 | 233: 'e', 316 | 234: 'e', 317 | 235: 'e', 318 | 236: 'i', 319 | 237: 'i', 320 | 238: 'i', 321 | 239: 'i', 322 | 240: 'd', 323 | 241: 'n', 324 | 242: 'o', 325 | 243: 'o', 326 | 244: 'o', 327 | 245: 'o', 328 | 246: 'oe', 329 | 247: ':', 330 | 248: 'o', 331 | 249: 'u', 332 | 250: 'u', 333 | 251: 'u', 334 | 252: 'ue', 335 | 253: 'y', 336 | 254: 'th', 337 | 255: 'y', 338 | 256: 'A', 339 | 257: 'a', 340 | 258: 'A', 341 | 259: 'a', 342 | 260: 'A', 343 | 261: 'a', 344 | 262: 'C', 345 | 263: 'c', 346 | 264: 'Ch', 347 | 265: 'ch', 348 | 266: 'C', 349 | 267: 'c', 350 | 268: 'C', 351 | 269: 'c', 352 | 270: 'D', 353 | 271: 'd', 354 | 272: 'D', 355 | 273: 'd', 356 | 274: 'E', 357 | 275: 'e', 358 | 276: 'E', 359 | 277: 'e', 360 | 278: 'E', 361 | 279: 'e', 362 | 280: 'E', 363 
| 281: 'e', 364 | 282: 'E', 365 | 283: 'e', 366 | 284: 'Gh', 367 | 285: 'gh', 368 | 286: 'G', 369 | 287: 'g', 370 | 288: 'G', 371 | 289: 'g', 372 | 290: 'G', 373 | 291: 'g', 374 | 292: 'Hh', 375 | 293: 'hh', 376 | 294: 'H', 377 | 295: 'h', 378 | 296: 'I', 379 | 297: 'i', 380 | 298: 'I', 381 | 299: 'i', 382 | 300: 'I', 383 | 301: 'i', 384 | 302: 'I', 385 | 303: 'i', 386 | 304: 'I', 387 | 305: 'i', 388 | 306: 'IJ', 389 | 307: 'ij', 390 | 308: 'Jh', 391 | 309: 'jh', 392 | 310: 'K', 393 | 311: 'k', 394 | 312: 'k', 395 | 313: 'L', 396 | 314: 'l', 397 | 315: 'L', 398 | 316: 'l', 399 | 317: 'L', 400 | 318: 'l', 401 | 319: 'L·', 402 | 320: 'l·', 403 | 321: 'L', 404 | 322: 'l', 405 | 323: 'N', 406 | 324: 'n', 407 | 325: 'N', 408 | 326: 'n', 409 | 327: 'N', 410 | 328: 'n', 411 | 329: "'n", 412 | 330: 'NG', 413 | 331: 'ng', 414 | 332: 'O', 415 | 333: 'o', 416 | 334: 'O', 417 | 335: 'o', 418 | 336: 'O', 419 | 337: 'o', 420 | 338: 'OE', 421 | 339: 'oe', 422 | 340: 'R', 423 | 341: 'r', 424 | 342: 'R', 425 | 343: 'r', 426 | 344: 'R', 427 | 345: 'r', 428 | 346: 'S', 429 | 347: 's', 430 | 348: 'Sh', 431 | 349: 'sh', 432 | 350: 'S', 433 | 351: 's', 434 | 352: 'S', 435 | 353: 's', 436 | 354: 'T', 437 | 355: 't', 438 | 356: 'T', 439 | 357: 't', 440 | 358: 'T', 441 | 359: 't', 442 | 360: 'U', 443 | 361: 'u', 444 | 362: 'U', 445 | 363: 'u', 446 | 364: 'U', 447 | 365: 'u', 448 | 366: 'U', 449 | 367: 'u', 450 | 368: 'U', 451 | 369: 'u', 452 | 370: 'U', 453 | 371: 'u', 454 | 372: 'W', 455 | 373: 'w', 456 | 374: 'Y', 457 | 375: 'y', 458 | 376: 'Y', 459 | 377: 'Z', 460 | 378: 'z', 461 | 379: 'Z', 462 | 380: 'z', 463 | 381: 'Z', 464 | 382: 'z', 465 | 383: 's', 466 | 402: 'f', 467 | 416: 'O', 468 | 417: 'o', 469 | 431: 'U', 470 | 432: 'u', 471 | 536: 'Ş', 472 | 537: 'ş', 473 | 538: 'Ţ', 474 | 539: 'ţ', 475 | 697: '′', 476 | 699: '‘', 477 | 700: '’', 478 | 701: '‛', 479 | 710: '^', 480 | 712: "'", 481 | 713: '¯', 482 | 716: ',', 483 | 720: ':', 484 | 730: '°', 485 | 732: '~', 486 | 733: '"', 
487 | 884: "'", 488 | 885: ',', 489 | 894: ';', 490 | 7682: 'B', 491 | 7683: 'b', 492 | 7690: 'D', 493 | 7691: 'd', 494 | 7710: 'F', 495 | 7711: 'f', 496 | 7744: 'M', 497 | 7745: 'm', 498 | 7766: 'P', 499 | 7767: 'p', 500 | 7776: 'S', 501 | 7777: 's', 502 | 7786: 'T', 503 | 7787: 't', 504 | 7808: 'W', 505 | 7809: 'w', 506 | 7810: 'W', 507 | 7811: 'w', 508 | 7812: 'W', 509 | 7813: 'w', 510 | 7918: 'U', 511 | 7919: 'u', 512 | 7922: 'Y', 513 | 7923: 'y', 514 | 8192: ' ', 515 | 8193: ' ', 516 | 8194: ' ', 517 | 8195: ' ', 518 | 8196: ' ', 519 | 8197: ' ', 520 | 8198: ' ', 521 | 8199: ' ', 522 | 8200: ' ', 523 | 8201: ' ', 524 | 8202: '', 525 | 8203: '', 526 | 8204: '', 527 | 8205: '', 528 | 8206: '', 529 | 8207: '', 530 | 8208: '-', 531 | 8209: '-', 532 | 8210: '-', 533 | 8211: '-', 534 | 8212: '--', 535 | 8213: '--', 536 | 8214: '||', 537 | 8215: '_', 538 | 8216: "'", 539 | 8217: "'", 540 | 8218: "'", 541 | 8219: "'", 542 | 8220: '"', 543 | 8221: '"', 544 | 8222: '"', 545 | 8223: '"', 546 | 8224: '+', 547 | 8225: '++', 548 | 8226: 'o', 549 | 8227: '>', 550 | 8228: '.', 551 | 8229: '..', 552 | 8230: '...', 553 | 8231: '-', 554 | 8234: '', 555 | 8235: '', 556 | 8236: '', 557 | 8237: '', 558 | 8238: '', 559 | 8239: ' ', 560 | 8240: ' 0/00', 561 | 8242: "'", 562 | 8243: '"', 563 | 8244: "'''", 564 | 8245: '`', 565 | 8246: '``', 566 | 8247: '```', 567 | 8249: '<', 568 | 8250: '>', 569 | 8252: '!!', 570 | 8254: '-', 571 | 8259: '-', 572 | 8260: '/', 573 | 8264: '?!', 574 | 8265: '!?', 575 | 8266: '7', 576 | 8304: '^0', 577 | 8308: '^4', 578 | 8309: '^5', 579 | 8310: '^6', 580 | 8311: '^7', 581 | 8312: '^8', 582 | 8313: '^9', 583 | 8314: '^+', 584 | 8315: '^-', 585 | 8316: '^=', 586 | 8317: '^(', 587 | 8318: '^)', 588 | 8319: '^n', 589 | 8320: '_0', 590 | 8321: '_1', 591 | 8322: '_2', 592 | 8323: '_3', 593 | 8324: '_4', 594 | 8325: '_5', 595 | 8326: '_6', 596 | 8327: '_7', 597 | 8328: '_8', 598 | 8329: '_9', 599 | 8330: '_+', 600 | 8331: '_-', 601 | 8332: '_=', 602 | 8333: 
'_(', 603 | 8334: '_)', 604 | 8364: 'EUR', 605 | 8448: 'a/c', 606 | 8449: 'a/s', 607 | 8451: '°C', 608 | 8453: 'c/o', 609 | 8454: 'c/u', 610 | 8457: '°F', 611 | 8467: 'l', 612 | 8470: 'Nº', 613 | 8471: '(P)', 614 | 8480: '[SM]', 615 | 8481: 'TEL', 616 | 8482: '[TM]', 617 | 8486: 'Ω', 618 | 8490: 'K', 619 | 8491: 'Å', 620 | 8494: 'e', 621 | 8531: ' 1/3', 622 | 8532: ' 2/3', 623 | 8533: ' 1/5', 624 | 8534: ' 2/5', 625 | 8535: ' 3/5', 626 | 8536: ' 4/5', 627 | 8537: ' 1/6', 628 | 8538: ' 5/6', 629 | 8539: ' 1/8', 630 | 8540: ' 3/8', 631 | 8541: ' 5/8', 632 | 8542: ' 7/8', 633 | 8543: ' 1/', 634 | 8544: 'I', 635 | 8545: 'II', 636 | 8546: 'III', 637 | 8547: 'IV', 638 | 8548: 'V', 639 | 8549: 'VI', 640 | 8550: 'VII', 641 | 8551: 'VIII', 642 | 8552: 'IX', 643 | 8553: 'X', 644 | 8554: 'XI', 645 | 8555: 'XII', 646 | 8556: 'L', 647 | 8557: 'C', 648 | 8558: 'D', 649 | 8559: 'M', 650 | 8560: 'i', 651 | 8561: 'ii', 652 | 8562: 'iii', 653 | 8563: 'iv', 654 | 8564: 'v', 655 | 8565: 'vi', 656 | 8566: 'vii', 657 | 8567: 'viii', 658 | 8568: 'ix', 659 | 8569: 'x', 660 | 8570: 'xi', 661 | 8571: 'xii', 662 | 8572: 'l', 663 | 8573: 'c', 664 | 8574: 'd', 665 | 8575: 'm', 666 | 8592: '<-', 667 | 8593: '^', 668 | 8594: '->', 669 | 8595: 'v', 670 | 8596: '<->', 671 | 8656: '<=', 672 | 8658: '=>', 673 | 8660: '<=>', 674 | 8722: '–', 675 | 8725: '/', 676 | 8726: '\\', 677 | 8727: '*', 678 | 8728: 'o', 679 | 8729: '·', 680 | 8734: 'inf', 681 | 8739: '|', 682 | 8741: '||', 683 | 8758: ':', 684 | 8764: '~', 685 | 8800: '/=', 686 | 8801: '=', 687 | 8804: '<=', 688 | 8805: '>=', 689 | 8810: '<<', 690 | 8811: '>>', 691 | 8853: '(+)', 692 | 8854: '(-)', 693 | 8855: '(x)', 694 | 8856: '(/)', 695 | 8866: '|-', 696 | 8867: '-|', 697 | 8870: '|-', 698 | 8871: '|=', 699 | 8872: '|=', 700 | 8873: '||-', 701 | 8901: '·', 702 | 8902: '*', 703 | 8917: '#', 704 | 8920: '<<<', 705 | 8921: '>>>', 706 | 8943: '...', 707 | 9001: '<', 708 | 9002: '>', 709 | 9216: 'NUL', 710 | 9217: 'SOH', 711 | 9218: 'STX', 712 | 
9219: 'ETX', 713 | 9220: 'EOT', 714 | 9221: 'ENQ', 715 | 9222: 'ACK', 716 | 9223: 'BEL', 717 | 9224: 'BS', 718 | 9225: 'HT', 719 | 9226: 'LF', 720 | 9227: 'VT', 721 | 9228: 'FF', 722 | 9229: 'CR', 723 | 9230: 'SO', 724 | 9231: 'SI', 725 | 9232: 'DLE', 726 | 9233: 'DC1', 727 | 9234: 'DC2', 728 | 9235: 'DC3', 729 | 9236: 'DC4', 730 | 9237: 'NAK', 731 | 9238: 'SYN', 732 | 9239: 'ETB', 733 | 9240: 'CAN', 734 | 9241: 'EM', 735 | 9242: 'SUB', 736 | 9243: 'ESC', 737 | 9244: 'FS', 738 | 9245: 'GS', 739 | 9246: 'RS', 740 | 9247: 'US', 741 | 9248: 'SP', 742 | 9249: 'DEL', 743 | 9251: '_', 744 | 9252: 'NL', 745 | 9253: '///', 746 | 9254: '?', 747 | 9312: '(1)', 748 | 9313: '(2)', 749 | 9314: '(3)', 750 | 9315: '(4)', 751 | 9316: '(5)', 752 | 9317: '(6)', 753 | 9318: '(7)', 754 | 9319: '(8)', 755 | 9320: '(9)', 756 | 9321: '(10)', 757 | 9322: '(11)', 758 | 9323: '(12)', 759 | 9324: '(13)', 760 | 9325: '(14)', 761 | 9326: '(15)', 762 | 9327: '(16)', 763 | 9328: '(17)', 764 | 9329: '(18)', 765 | 9330: '(19)', 766 | 9331: '(20)', 767 | 9332: '(1)', 768 | 9333: '(2)', 769 | 9334: '(3)', 770 | 9335: '(4)', 771 | 9336: '(5)', 772 | 9337: '(6)', 773 | 9338: '(7)', 774 | 9339: '(8)', 775 | 9340: '(9)', 776 | 9341: '(10)', 777 | 9342: '(11)', 778 | 9343: '(12)', 779 | 9344: '(13)', 780 | 9345: '(14)', 781 | 9346: '(15)', 782 | 9347: '(16)', 783 | 9348: '(17)', 784 | 9349: '(18)', 785 | 9350: '(19)', 786 | 9351: '(20)', 787 | 9352: '1.', 788 | 9353: '2.', 789 | 9354: '3.', 790 | 9355: '4.', 791 | 9356: '5.', 792 | 9357: '6.', 793 | 9358: '7.', 794 | 9359: '8.', 795 | 9360: '9.', 796 | 9361: '10.', 797 | 9362: '11.', 798 | 9363: '12.', 799 | 9364: '13.', 800 | 9365: '14.', 801 | 9366: '15.', 802 | 9367: '16.', 803 | 9368: '17.', 804 | 9369: '18.', 805 | 9370: '19.', 806 | 9371: '20.', 807 | 9372: '(a)', 808 | 9373: '(b)', 809 | 9374: '(c)', 810 | 9375: '(d)', 811 | 9376: '(e)', 812 | 9377: '(f)', 813 | 9378: '(g)', 814 | 9379: '(h)', 815 | 9380: '(i)', 816 | 9381: '(j)', 817 | 9382: 
'(k)', 818 | 9383: '(l)', 819 | 9384: '(m)', 820 | 9385: '(n)', 821 | 9386: '(o)', 822 | 9387: '(p)', 823 | 9388: '(q)', 824 | 9389: '(r)', 825 | 9390: '(s)', 826 | 9391: '(t)', 827 | 9392: '(u)', 828 | 9393: '(v)', 829 | 9394: '(w)', 830 | 9395: '(x)', 831 | 9396: '(y)', 832 | 9397: '(z)', 833 | 9398: '(A)', 834 | 9399: '(B)', 835 | 9400: '(C)', 836 | 9401: '(D)', 837 | 9402: '(E)', 838 | 9403: '(F)', 839 | 9404: '(G)', 840 | 9405: '(H)', 841 | 9406: '(I)', 842 | 9407: '(J)', 843 | 9408: '(K)', 844 | 9409: '(L)', 845 | 9410: '(M)', 846 | 9411: '(N)', 847 | 9412: '(O)', 848 | 9413: '(P)', 849 | 9414: '(Q)', 850 | 9415: '(R)', 851 | 9416: '(S)', 852 | 9417: '(T)', 853 | 9418: '(U)', 854 | 9419: '(V)', 855 | 9420: '(W)', 856 | 9421: '(X)', 857 | 9422: '(Y)', 858 | 9423: '(Z)', 859 | 9424: '(a)', 860 | 9425: '(b)', 861 | 9426: '(c)', 862 | 9427: '(d)', 863 | 9428: '(e)', 864 | 9429: '(f)', 865 | 9430: '(g)', 866 | 9431: '(h)', 867 | 9432: '(i)', 868 | 9433: '(j)', 869 | 9434: '(k)', 870 | 9435: '(l)', 871 | 9436: '(m)', 872 | 9437: '(n)', 873 | 9438: '(o)', 874 | 9439: '(p)', 875 | 9440: '(q)', 876 | 9441: '(r)', 877 | 9442: '(s)', 878 | 9443: '(t)', 879 | 9444: '(u)', 880 | 9445: '(v)', 881 | 9446: '(w)', 882 | 9447: '(x)', 883 | 9448: '(y)', 884 | 9449: '(z)', 885 | 9450: '(0)', 886 | 9472: '-', 887 | 9473: '=', 888 | 9474: '|', 889 | 9475: '|', 890 | 9476: '-', 891 | 9477: '=', 892 | 9478: '|', 893 | 9479: '|', 894 | 9480: '-', 895 | 9481: '=', 896 | 9482: '|', 897 | 9483: '|', 898 | 9484: '+', 899 | 9485: '+', 900 | 9486: '+', 901 | 9487: '+', 902 | 9488: '+', 903 | 9489: '+', 904 | 9490: '+', 905 | 9491: '+', 906 | 9492: '+', 907 | 9493: '+', 908 | 9494: '+', 909 | 9495: '+', 910 | 9496: '+', 911 | 9497: '+', 912 | 9498: '+', 913 | 9499: '+', 914 | 9500: '+', 915 | 9501: '+', 916 | 9502: '+', 917 | 9503: '+', 918 | 9504: '+', 919 | 9505: '+', 920 | 9506: '+', 921 | 9507: '+', 922 | 9508: '+', 923 | 9509: '+', 924 | 9510: '+', 925 | 9511: '+', 926 | 9512: '+', 927 
| 9513: '+', 928 | 9514: '+', 929 | 9515: '+', 930 | 9516: '+', 931 | 9517: '+', 932 | 9518: '+', 933 | 9519: '+', 934 | 9520: '+', 935 | 9521: '+', 936 | 9522: '+', 937 | 9523: '+', 938 | 9524: '+', 939 | 9525: '+', 940 | 9526: '+', 941 | 9527: '+', 942 | 9528: '+', 943 | 9529: '+', 944 | 9530: '+', 945 | 9531: '+', 946 | 9532: '+', 947 | 9533: '+', 948 | 9534: '+', 949 | 9535: '+', 950 | 9536: '+', 951 | 9537: '+', 952 | 9538: '+', 953 | 9539: '+', 954 | 9540: '+', 955 | 9541: '+', 956 | 9542: '+', 957 | 9543: '+', 958 | 9544: '+', 959 | 9545: '+', 960 | 9546: '+', 961 | 9547: '+', 962 | 9548: '-', 963 | 9549: '=', 964 | 9550: '|', 965 | 9551: '|', 966 | 9552: '=', 967 | 9553: '|', 968 | 9554: '+', 969 | 9555: '+', 970 | 9556: '+', 971 | 9557: '+', 972 | 9558: '+', 973 | 9559: '+', 974 | 9560: '+', 975 | 9561: '+', 976 | 9562: '+', 977 | 9563: '+', 978 | 9564: '+', 979 | 9565: '+', 980 | 9566: '+', 981 | 9567: '+', 982 | 9568: '+', 983 | 9569: '+', 984 | 9570: '+', 985 | 9571: '+', 986 | 9572: '+', 987 | 9573: '+', 988 | 9574: '+', 989 | 9575: '+', 990 | 9576: '+', 991 | 9577: '+', 992 | 9578: '+', 993 | 9579: '+', 994 | 9580: '+', 995 | 9581: '+', 996 | 9582: '+', 997 | 9583: '+', 998 | 9584: '+', 999 | 9585: '/', 1000 | 9586: '\\', 1001 | 9587: 'X', 1002 | 9596: '-', 1003 | 9597: '|', 1004 | 9598: '-', 1005 | 9599: '|', 1006 | 9675: 'o', 1007 | 9702: 'o', 1008 | 9733: '*', 1009 | 9734: '*', 1010 | 9746: 'X', 1011 | 9747: 'X', 1012 | 9785: ':-(', 1013 | 9786: ':-)', 1014 | 9787: '(-:', 1015 | 9837: 'b', 1016 | 9839: '#', 1017 | 9985: '%<', 1018 | 9986: '%<', 1019 | 9987: '%<', 1020 | 9988: '%<', 1021 | 9996: 'V', 1022 | 10003: '√', 1023 | 10004: '√', 1024 | 10005: 'x', 1025 | 10006: 'x', 1026 | 10007: 'X', 1027 | 10008: 'X', 1028 | 10009: '+', 1029 | 10010: '+', 1030 | 10011: '+', 1031 | 10012: '+', 1032 | 10013: '+', 1033 | 10014: '+', 1034 | 10015: '+', 1035 | 10016: '+', 1036 | 10017: '*', 1037 | 10018: '+', 1038 | 10019: '+', 1039 | 10020: '+', 1040 | 10021: 
'+', 1041 | 10022: '+', 1042 | 10023: '+', 1043 | 10025: '*', 1044 | 10026: '*', 1045 | 10027: '*', 1046 | 10028: '*', 1047 | 10029: '*', 1048 | 10030: '*', 1049 | 10031: '*', 1050 | 10032: '*', 1051 | 10033: '*', 1052 | 10034: '*', 1053 | 10035: '*', 1054 | 10036: '*', 1055 | 10037: '*', 1056 | 10038: '*', 1057 | 10039: '*', 1058 | 10040: '*', 1059 | 10041: '*', 1060 | 10042: '*', 1061 | 10043: '*', 1062 | 10044: '*', 1063 | 10045: '*', 1064 | 10046: '*', 1065 | 10047: '*', 1066 | 10048: '*', 1067 | 10049: '*', 1068 | 10050: '*', 1069 | 10051: '*', 1070 | 10052: '*', 1071 | 10053: '*', 1072 | 10054: '*', 1073 | 10055: '*', 1074 | 10056: '*', 1075 | 10057: '*', 1076 | 10058: '*', 1077 | 10059: '*', 1078 | 64256: 'ff', 1079 | 64257: 'fi', 1080 | 64258: 'fl', 1081 | 64259: 'ffi', 1082 | 64260: 'ffl', 1083 | 64261: 'ſt', 1084 | 64262: 'st', 1085 | 65279: '', 1086 | 65533: '?', 1087 | } 1088 | 1089 | short_table = { 1090 | 160: ' ', 1091 | 161: '!', 1092 | 162: 'c', 1093 | 163: 'GBP', 1094 | 165: 'Y', 1095 | 166: '|', 1096 | 167: 'S', 1097 | 168: '"', 1098 | 169: 'c', 1099 | 170: 'a', 1100 | 171: '<<', 1101 | 172: '-', 1102 | 173: '-', 1103 | 174: '(R)', 1104 | 175: '-', 1105 | 176: ' ', 1106 | 177: '+/-', 1107 | 178: '2', 1108 | 179: '3', 1109 | 180: "'", 1110 | 181: 'u', 1111 | 182: 'P', 1112 | 183: '.', 1113 | 184: ',', 1114 | 185: '1', 1115 | 186: 'o', 1116 | 187: '>>', 1117 | 188: ' 1/4', 1118 | 189: ' 1/2', 1119 | 190: ' 3/4', 1120 | 191: '?', 1121 | 192: 'A', 1122 | 193: 'A', 1123 | 194: 'A', 1124 | 195: 'A', 1125 | 196: 'A', 1126 | 197: 'A', 1127 | 198: 'A', 1128 | 199: 'C', 1129 | 200: 'E', 1130 | 201: 'E', 1131 | 202: 'E', 1132 | 203: 'E', 1133 | 204: 'I', 1134 | 205: 'I', 1135 | 206: 'I', 1136 | 207: 'I', 1137 | 208: 'D', 1138 | 209: 'N', 1139 | 210: 'O', 1140 | 211: 'O', 1141 | 212: 'O', 1142 | 213: 'O', 1143 | 214: 'O', 1144 | 215: 'x', 1145 | 216: 'O', 1146 | 217: 'U', 1147 | 218: 'U', 1148 | 219: 'U', 1149 | 220: 'U', 1150 | 221: 'Y', 1151 | 222: 'Th', 
1152 | 223: 'ss', 1153 | 224: 'a', 1154 | 225: 'a', 1155 | 226: 'a', 1156 | 227: 'a', 1157 | 228: 'a', 1158 | 229: 'a', 1159 | 230: 'a', 1160 | 231: 'c', 1161 | 232: 'e', 1162 | 233: 'e', 1163 | 234: 'e', 1164 | 235: 'e', 1165 | 236: 'i', 1166 | 237: 'i', 1167 | 238: 'i', 1168 | 239: 'i', 1169 | 240: 'd', 1170 | 241: 'n', 1171 | 242: 'o', 1172 | 243: 'o', 1173 | 244: 'o', 1174 | 245: 'o', 1175 | 246: 'o', 1176 | 247: ':', 1177 | 248: 'o', 1178 | 249: 'u', 1179 | 250: 'u', 1180 | 251: 'u', 1181 | 252: 'u', 1182 | 253: 'y', 1183 | 254: 'th', 1184 | 255: 'y', 1185 | 256: 'A', 1186 | 257: 'a', 1187 | 258: 'A', 1188 | 259: 'a', 1189 | 260: 'A', 1190 | 261: 'a', 1191 | 262: 'C', 1192 | 263: 'c', 1193 | 264: 'C', 1194 | 265: 'c', 1195 | 266: 'C', 1196 | 267: 'c', 1197 | 268: 'C', 1198 | 269: 'c', 1199 | 270: 'D', 1200 | 271: 'd', 1201 | 272: 'D', 1202 | 273: 'd', 1203 | 274: 'E', 1204 | 275: 'e', 1205 | 276: 'E', 1206 | 277: 'e', 1207 | 278: 'E', 1208 | 279: 'e', 1209 | 280: 'E', 1210 | 281: 'e', 1211 | 282: 'E', 1212 | 283: 'e', 1213 | 284: 'G', 1214 | 285: 'g', 1215 | 286: 'G', 1216 | 287: 'g', 1217 | 288: 'G', 1218 | 289: 'g', 1219 | 290: 'G', 1220 | 291: 'g', 1221 | 292: 'H', 1222 | 293: 'h', 1223 | 294: 'H', 1224 | 295: 'h', 1225 | 296: 'I', 1226 | 297: 'i', 1227 | 298: 'I', 1228 | 299: 'i', 1229 | 300: 'I', 1230 | 301: 'i', 1231 | 302: 'I', 1232 | 303: 'i', 1233 | 304: 'I', 1234 | 305: 'i', 1235 | 306: 'IJ', 1236 | 307: 'ij', 1237 | 308: 'J', 1238 | 309: 'j', 1239 | 310: 'K', 1240 | 311: 'k', 1241 | 312: 'k', 1242 | 313: 'L', 1243 | 314: 'l', 1244 | 315: 'L', 1245 | 316: 'l', 1246 | 317: 'L', 1247 | 318: 'l', 1248 | 319: 'L.', 1249 | 320: 'l.', 1250 | 321: 'L', 1251 | 322: 'l', 1252 | 323: 'N', 1253 | 324: 'n', 1254 | 325: 'N', 1255 | 326: 'n', 1256 | 327: 'N', 1257 | 328: 'n', 1258 | 329: "'n", 1259 | 330: 'N', 1260 | 331: 'n', 1261 | 332: 'O', 1262 | 333: 'o', 1263 | 334: 'O', 1264 | 335: 'o', 1265 | 336: 'O', 1266 | 337: 'o', 1267 | 338: 'OE', 1268 | 339: 'oe', 
1269 | 340: 'R', 1270 | 341: 'r', 1271 | 342: 'R', 1272 | 343: 'r', 1273 | 344: 'R', 1274 | 345: 'r', 1275 | 346: 'S', 1276 | 347: 's', 1277 | 348: 'S', 1278 | 349: 's', 1279 | 350: 'S', 1280 | 351: 's', 1281 | 352: 'S', 1282 | 353: 's', 1283 | 354: 'T', 1284 | 355: 't', 1285 | 356: 'T', 1286 | 357: 't', 1287 | 358: 'T', 1288 | 359: 't', 1289 | 360: 'U', 1290 | 361: 'u', 1291 | 362: 'U', 1292 | 363: 'u', 1293 | 364: 'U', 1294 | 365: 'u', 1295 | 366: 'U', 1296 | 367: 'u', 1297 | 368: 'U', 1298 | 369: 'u', 1299 | 370: 'U', 1300 | 371: 'u', 1301 | 372: 'W', 1302 | 373: 'w', 1303 | 374: 'Y', 1304 | 375: 'y', 1305 | 376: 'Y', 1306 | 377: 'Z', 1307 | 378: 'z', 1308 | 379: 'Z', 1309 | 380: 'z', 1310 | 381: 'Z', 1311 | 382: 'z', 1312 | 383: 's', 1313 | 402: 'f', 1314 | 416: 'O', 1315 | 417: 'o', 1316 | 431: 'U', 1317 | 432: 'u', 1318 | 536: 'S', 1319 | 537: 's', 1320 | 538: 'T', 1321 | 539: 't', 1322 | 697: "'", 1323 | 699: '‘', 1324 | 700: "'", 1325 | 701: '‛', 1326 | 710: '^', 1327 | 712: "'", 1328 | 713: '¯', 1329 | 716: ',', 1330 | 720: ':', 1331 | 730: '°', 1332 | 732: '~', 1333 | 733: '"', 1334 | 884: "'", 1335 | 885: ',', 1336 | 894: ';', 1337 | 7682: 'B', 1338 | 7683: 'b', 1339 | 7690: 'D', 1340 | 7691: 'd', 1341 | 7710: 'F', 1342 | 7711: 'f', 1343 | 7744: 'M', 1344 | 7745: 'm', 1345 | 7766: 'P', 1346 | 7767: 'p', 1347 | 7776: 'S', 1348 | 7777: 's', 1349 | 7786: 'T', 1350 | 7787: 't', 1351 | 7808: 'W', 1352 | 7809: 'w', 1353 | 7810: 'W', 1354 | 7811: 'w', 1355 | 7812: 'W', 1356 | 7813: 'w', 1357 | 7918: 'U', 1358 | 7919: 'u', 1359 | 7922: 'Y', 1360 | 7923: 'y', 1361 | 8192: ' ', 1362 | 8193: ' ', 1363 | 8194: ' ', 1364 | 8195: ' ', 1365 | 8196: ' ', 1366 | 8197: ' ', 1367 | 8198: ' ', 1368 | 8199: ' ', 1369 | 8200: ' ', 1370 | 8201: ' ', 1371 | 8202: '', 1372 | 8203: '', 1373 | 8204: '', 1374 | 8205: '', 1375 | 8206: '', 1376 | 8207: '', 1377 | 8208: '-', 1378 | 8209: '-', 1379 | 8210: '-', 1380 | 8211: '-', 1381 | 8212: '--', 1382 | 8213: '--', 1383 | 8214: '||', 
1384 | 8215: '_', 1385 | 8216: "'", 1386 | 8217: "'", 1387 | 8218: "'", 1388 | 8219: "'", 1389 | 8220: '"', 1390 | 8221: '"', 1391 | 8222: '"', 1392 | 8223: '"', 1393 | 8224: '+', 1394 | 8225: '++', 1395 | 8226: 'o', 1396 | 8227: '>', 1397 | 8228: '.', 1398 | 8229: '..', 1399 | 8230: '...', 1400 | 8231: '-', 1401 | 8234: '', 1402 | 8235: '', 1403 | 8236: '', 1404 | 8237: '', 1405 | 8238: '', 1406 | 8239: ' ', 1407 | 8240: ' 0/00', 1408 | 8242: "'", 1409 | 8243: '"', 1410 | 8244: "'''", 1411 | 8245: '`', 1412 | 8246: '``', 1413 | 8247: '```', 1414 | 8249: '<', 1415 | 8250: '>', 1416 | 8252: '!!', 1417 | 8254: '-', 1418 | 8259: '-', 1419 | 8260: '/', 1420 | 8264: '?!', 1421 | 8265: '!?', 1422 | 8266: '7', 1423 | 8304: '0', 1424 | 8308: '4', 1425 | 8309: '5', 1426 | 8310: '6', 1427 | 8311: '7', 1428 | 8312: '8', 1429 | 8313: '9', 1430 | 8314: '+', 1431 | 8315: '-', 1432 | 8316: '=', 1433 | 8317: '(', 1434 | 8318: ')', 1435 | 8319: 'n', 1436 | 8320: '0', 1437 | 8321: '1', 1438 | 8322: '2', 1439 | 8323: '3', 1440 | 8324: '4', 1441 | 8325: '5', 1442 | 8326: '6', 1443 | 8327: '7', 1444 | 8328: '8', 1445 | 8329: '9', 1446 | 8330: '+', 1447 | 8331: '-', 1448 | 8332: '=', 1449 | 8333: '(', 1450 | 8334: ')', 1451 | 8364: 'E', 1452 | 8448: 'a/c', 1453 | 8449: 'a/s', 1454 | 8451: 'C', 1455 | 8453: 'c/o', 1456 | 8454: 'c/u', 1457 | 8457: 'F', 1458 | 8467: 'l', 1459 | 8470: 'No', 1460 | 8471: '(P)', 1461 | 8480: '[SM]', 1462 | 8481: 'TEL', 1463 | 8482: '[TM]', 1464 | 8486: 'ohm', 1465 | 8490: 'K', 1466 | 8491: 'Å', 1467 | 8494: 'e', 1468 | 8531: ' 1/3', 1469 | 8532: ' 2/3', 1470 | 8533: ' 1/5', 1471 | 8534: ' 2/5', 1472 | 8535: ' 3/5', 1473 | 8536: ' 4/5', 1474 | 8537: ' 1/6', 1475 | 8538: ' 5/6', 1476 | 8539: ' 1/8', 1477 | 8540: ' 3/8', 1478 | 8541: ' 5/8', 1479 | 8542: ' 7/8', 1480 | 8543: ' 1/', 1481 | 8544: 'I', 1482 | 8545: 'II', 1483 | 8546: 'III', 1484 | 8547: 'IV', 1485 | 8548: 'V', 1486 | 8549: 'VI', 1487 | 8550: 'VII', 1488 | 8551: 'VIII', 1489 | 8552: 'IX', 1490 | 
8553: 'X', 1491 | 8554: 'XI', 1492 | 8555: 'XII', 1493 | 8556: 'L', 1494 | 8557: 'C', 1495 | 8558: 'D', 1496 | 8559: 'M', 1497 | 8560: 'i', 1498 | 8561: 'ii', 1499 | 8562: 'iii', 1500 | 8563: 'iv', 1501 | 8564: 'v', 1502 | 8565: 'vi', 1503 | 8566: 'vii', 1504 | 8567: 'viii', 1505 | 8568: 'ix', 1506 | 8569: 'x', 1507 | 8570: 'xi', 1508 | 8571: 'xii', 1509 | 8572: 'l', 1510 | 8573: 'c', 1511 | 8574: 'd', 1512 | 8575: 'm', 1513 | 8592: '<-', 1514 | 8593: '^', 1515 | 8594: '->', 1516 | 8595: 'v', 1517 | 8596: '<->', 1518 | 8656: '<=', 1519 | 8658: '=>', 1520 | 8660: '<=>', 1521 | 8722: '-', 1522 | 8725: '/', 1523 | 8726: '\\', 1524 | 8727: '*', 1525 | 8728: 'o', 1526 | 8729: '·', 1527 | 8734: 'inf', 1528 | 8739: '|', 1529 | 8741: '||', 1530 | 8758: ':', 1531 | 8764: '~', 1532 | 8800: '/=', 1533 | 8801: '=', 1534 | 8804: '<=', 1535 | 8805: '>=', 1536 | 8810: '<<', 1537 | 8811: '>>', 1538 | 8853: '(+)', 1539 | 8854: '(-)', 1540 | 8855: '(x)', 1541 | 8856: '(/)', 1542 | 8866: '|-', 1543 | 8867: '-|', 1544 | 8870: '|-', 1545 | 8871: '|=', 1546 | 8872: '|=', 1547 | 8873: '||-', 1548 | 8901: '·', 1549 | 8902: '*', 1550 | 8917: '#', 1551 | 8920: '<<<', 1552 | 8921: '>>>', 1553 | 8943: '...', 1554 | 9001: '<', 1555 | 9002: '>', 1556 | 9216: 'NUL', 1557 | 9217: 'SOH', 1558 | 9218: 'STX', 1559 | 9219: 'ETX', 1560 | 9220: 'EOT', 1561 | 9221: 'ENQ', 1562 | 9222: 'ACK', 1563 | 9223: 'BEL', 1564 | 9224: 'BS', 1565 | 9225: 'HT', 1566 | 9226: 'LF', 1567 | 9227: 'VT', 1568 | 9228: 'FF', 1569 | 9229: 'CR', 1570 | 9230: 'SO', 1571 | 9231: 'SI', 1572 | 9232: 'DLE', 1573 | 9233: 'DC1', 1574 | 9234: 'DC2', 1575 | 9235: 'DC3', 1576 | 9236: 'DC4', 1577 | 9237: 'NAK', 1578 | 9238: 'SYN', 1579 | 9239: 'ETB', 1580 | 9240: 'CAN', 1581 | 9241: 'EM', 1582 | 9242: 'SUB', 1583 | 9243: 'ESC', 1584 | 9244: 'FS', 1585 | 9245: 'GS', 1586 | 9246: 'RS', 1587 | 9247: 'US', 1588 | 9248: 'SP', 1589 | 9249: 'DEL', 1590 | 9251: '_', 1591 | 9252: 'NL', 1592 | 9253: '///', 1593 | 9254: '?', 1594 | 9312: '1', 1595 
| 9313: '2', 1596 | 9314: '3', 1597 | 9315: '4', 1598 | 9316: '5', 1599 | 9317: '6', 1600 | 9318: '7', 1601 | 9319: '8', 1602 | 9320: '9', 1603 | 9321: '(10)', 1604 | 9322: '(11)', 1605 | 9323: '(12)', 1606 | 9324: '(13)', 1607 | 9325: '(14)', 1608 | 9326: '(15)', 1609 | 9327: '(16)', 1610 | 9328: '(17)', 1611 | 9329: '(18)', 1612 | 9330: '(19)', 1613 | 9331: '(20)', 1614 | 9332: '1', 1615 | 9333: '2', 1616 | 9334: '3', 1617 | 9335: '4', 1618 | 9336: '5', 1619 | 9337: '6', 1620 | 9338: '7', 1621 | 9339: '8', 1622 | 9340: '9', 1623 | 9341: '(10)', 1624 | 9342: '(11)', 1625 | 9343: '(12)', 1626 | 9344: '(13)', 1627 | 9345: '(14)', 1628 | 9346: '(15)', 1629 | 9347: '(16)', 1630 | 9348: '(17)', 1631 | 9349: '(18)', 1632 | 9350: '(19)', 1633 | 9351: '(20)', 1634 | 9352: '1', 1635 | 9353: '2', 1636 | 9354: '3', 1637 | 9355: '4', 1638 | 9356: '5', 1639 | 9357: '6', 1640 | 9358: '7', 1641 | 9359: '8', 1642 | 9360: '9', 1643 | 9361: '10.', 1644 | 9362: '11.', 1645 | 9363: '12.', 1646 | 9364: '13.', 1647 | 9365: '14.', 1648 | 9366: '15.', 1649 | 9367: '16.', 1650 | 9368: '17.', 1651 | 9369: '18.', 1652 | 9370: '19.', 1653 | 9371: '20.', 1654 | 9372: 'a', 1655 | 9373: 'b', 1656 | 9374: 'c', 1657 | 9375: 'd', 1658 | 9376: 'e', 1659 | 9377: 'f', 1660 | 9378: 'g', 1661 | 9379: 'h', 1662 | 9380: 'i', 1663 | 9381: 'j', 1664 | 9382: 'k', 1665 | 9383: 'l', 1666 | 9384: 'm', 1667 | 9385: 'n', 1668 | 9386: 'o', 1669 | 9387: 'p', 1670 | 9388: 'q', 1671 | 9389: 'r', 1672 | 9390: 's', 1673 | 9391: 't', 1674 | 9392: 'u', 1675 | 9393: 'v', 1676 | 9394: 'w', 1677 | 9395: 'x', 1678 | 9396: 'y', 1679 | 9397: 'z', 1680 | 9398: 'A', 1681 | 9399: 'B', 1682 | 9400: 'C', 1683 | 9401: 'D', 1684 | 9402: 'E', 1685 | 9403: 'F', 1686 | 9404: 'G', 1687 | 9405: 'H', 1688 | 9406: 'I', 1689 | 9407: 'J', 1690 | 9408: 'K', 1691 | 9409: 'L', 1692 | 9410: 'M', 1693 | 9411: 'N', 1694 | 9412: 'O', 1695 | 9413: 'P', 1696 | 9414: 'Q', 1697 | 9415: 'R', 1698 | 9416: 'S', 1699 | 9417: 'T', 1700 | 9418: 'U', 1701 | 
9419: 'V', 1702 | 9420: 'W', 1703 | 9421: 'X', 1704 | 9422: 'Y', 1705 | 9423: 'Z', 1706 | 9424: 'a', 1707 | 9425: 'b', 1708 | 9426: 'c', 1709 | 9427: 'd', 1710 | 9428: 'e', 1711 | 9429: 'f', 1712 | 9430: 'g', 1713 | 9431: 'h', 1714 | 9432: 'i', 1715 | 9433: 'j', 1716 | 9434: 'k', 1717 | 9435: 'l', 1718 | 9436: 'm', 1719 | 9437: 'n', 1720 | 9438: 'o', 1721 | 9439: 'p', 1722 | 9440: 'q', 1723 | 9441: 'r', 1724 | 9442: 's', 1725 | 9443: 't', 1726 | 9444: 'u', 1727 | 9445: 'v', 1728 | 9446: 'w', 1729 | 9447: 'x', 1730 | 9448: 'y', 1731 | 9449: 'z', 1732 | 9450: '0', 1733 | 9472: '-', 1734 | 9473: '=', 1735 | 9474: '|', 1736 | 9475: '|', 1737 | 9476: '-', 1738 | 9477: '=', 1739 | 9478: '|', 1740 | 9479: '|', 1741 | 9480: '-', 1742 | 9481: '=', 1743 | 9482: '|', 1744 | 9483: '|', 1745 | 9484: '+', 1746 | 9485: '+', 1747 | 9486: '+', 1748 | 9487: '+', 1749 | 9488: '+', 1750 | 9489: '+', 1751 | 9490: '+', 1752 | 9491: '+', 1753 | 9492: '+', 1754 | 9493: '+', 1755 | 9494: '+', 1756 | 9495: '+', 1757 | 9496: '+', 1758 | 9497: '+', 1759 | 9498: '+', 1760 | 9499: '+', 1761 | 9500: '+', 1762 | 9501: '+', 1763 | 9502: '+', 1764 | 9503: '+', 1765 | 9504: '+', 1766 | 9505: '+', 1767 | 9506: '+', 1768 | 9507: '+', 1769 | 9508: '+', 1770 | 9509: '+', 1771 | 9510: '+', 1772 | 9511: '+', 1773 | 9512: '+', 1774 | 9513: '+', 1775 | 9514: '+', 1776 | 9515: '+', 1777 | 9516: '+', 1778 | 9517: '+', 1779 | 9518: '+', 1780 | 9519: '+', 1781 | 9520: '+', 1782 | 9521: '+', 1783 | 9522: '+', 1784 | 9523: '+', 1785 | 9524: '+', 1786 | 9525: '+', 1787 | 9526: '+', 1788 | 9527: '+', 1789 | 9528: '+', 1790 | 9529: '+', 1791 | 9530: '+', 1792 | 9531: '+', 1793 | 9532: '+', 1794 | 9533: '+', 1795 | 9534: '+', 1796 | 9535: '+', 1797 | 9536: '+', 1798 | 9537: '+', 1799 | 9538: '+', 1800 | 9539: '+', 1801 | 9540: '+', 1802 | 9541: '+', 1803 | 9542: '+', 1804 | 9543: '+', 1805 | 9544: '+', 1806 | 9545: '+', 1807 | 9546: '+', 1808 | 9547: '+', 1809 | 9548: '-', 1810 | 9549: '=', 1811 | 9550: '|', 1812 | 
9551: '|', 1813 | 9552: '=', 1814 | 9553: '|', 1815 | 9554: '+', 1816 | 9555: '+', 1817 | 9556: '+', 1818 | 9557: '+', 1819 | 9558: '+', 1820 | 9559: '+', 1821 | 9560: '+', 1822 | 9561: '+', 1823 | 9562: '+', 1824 | 9563: '+', 1825 | 9564: '+', 1826 | 9565: '+', 1827 | 9566: '+', 1828 | 9567: '+', 1829 | 9568: '+', 1830 | 9569: '+', 1831 | 9570: '+', 1832 | 9571: '+', 1833 | 9572: '+', 1834 | 9573: '+', 1835 | 9574: '+', 1836 | 9575: '+', 1837 | 9576: '+', 1838 | 9577: '+', 1839 | 9578: '+', 1840 | 9579: '+', 1841 | 9580: '+', 1842 | 9581: '+', 1843 | 9582: '+', 1844 | 9583: '+', 1845 | 9584: '+', 1846 | 9585: '/', 1847 | 9586: '\\', 1848 | 9587: 'X', 1849 | 9596: '-', 1850 | 9597: '|', 1851 | 9598: '-', 1852 | 9599: '|', 1853 | 9675: 'o', 1854 | 9702: 'o', 1855 | 9733: '*', 1856 | 9734: '*', 1857 | 9746: 'X', 1858 | 9747: 'X', 1859 | 9785: ':-(', 1860 | 9786: ':-)', 1861 | 9787: '(-:', 1862 | 9837: 'b', 1863 | 9839: '#', 1864 | 9985: '%<', 1865 | 9986: '%<', 1866 | 9987: '%<', 1867 | 9988: '%<', 1868 | 9996: 'V', 1869 | 10003: '√', 1870 | 10004: '√', 1871 | 10005: 'x', 1872 | 10006: 'x', 1873 | 10007: 'X', 1874 | 10008: 'X', 1875 | 10009: '+', 1876 | 10010: '+', 1877 | 10011: '+', 1878 | 10012: '+', 1879 | 10013: '+', 1880 | 10014: '+', 1881 | 10015: '+', 1882 | 10016: '+', 1883 | 10017: '*', 1884 | 10018: '+', 1885 | 10019: '+', 1886 | 10020: '+', 1887 | 10021: '+', 1888 | 10022: '+', 1889 | 10023: '+', 1890 | 10025: '*', 1891 | 10026: '*', 1892 | 10027: '*', 1893 | 10028: '*', 1894 | 10029: '*', 1895 | 10030: '*', 1896 | 10031: '*', 1897 | 10032: '*', 1898 | 10033: '*', 1899 | 10034: '*', 1900 | 10035: '*', 1901 | 10036: '*', 1902 | 10037: '*', 1903 | 10038: '*', 1904 | 10039: '*', 1905 | 10040: '*', 1906 | 10041: '*', 1907 | 10042: '*', 1908 | 10043: '*', 1909 | 10044: '*', 1910 | 10045: '*', 1911 | 10046: '*', 1912 | 10047: '*', 1913 | 10048: '*', 1914 | 10049: '*', 1915 | 10050: '*', 1916 | 10051: '*', 1917 | 10052: '*', 1918 | 10053: '*', 1919 | 10054: '*', 
1920 | 10055: '*', 1921 | 10056: '*', 1922 | 10057: '*', 1923 | 10058: '*', 1924 | 10059: '*', 1925 | 64256: 'ff', 1926 | 64257: 'fi', 1927 | 64258: 'fl', 1928 | 64259: 'ffi', 1929 | 64260: 'ffl', 1930 | 64261: 'st', 1931 | 64262: 'st', 1932 | 65279: '', 1933 | 65533: '?', 1934 | } 1935 | 1936 | single_table = { 1937 | 160: ' ', 1938 | 161: '!', 1939 | 162: 'c', 1940 | 165: 'Y', 1941 | 166: '|', 1942 | 167: 'S', 1943 | 168: '"', 1944 | 169: 'c', 1945 | 170: 'a', 1946 | 172: '-', 1947 | 173: '-', 1948 | 175: '-', 1949 | 176: ' ', 1950 | 178: '2', 1951 | 179: '3', 1952 | 180: "'", 1953 | 181: 'u', 1954 | 182: 'P', 1955 | 183: '.', 1956 | 184: ',', 1957 | 185: '1', 1958 | 186: 'o', 1959 | 191: '?', 1960 | 192: 'A', 1961 | 193: 'A', 1962 | 194: 'A', 1963 | 195: 'A', 1964 | 196: 'A', 1965 | 197: 'A', 1966 | 198: 'A', 1967 | 199: 'C', 1968 | 200: 'E', 1969 | 201: 'E', 1970 | 202: 'E', 1971 | 203: 'E', 1972 | 204: 'I', 1973 | 205: 'I', 1974 | 206: 'I', 1975 | 207: 'I', 1976 | 208: 'D', 1977 | 209: 'N', 1978 | 210: 'O', 1979 | 211: 'O', 1980 | 212: 'O', 1981 | 213: 'O', 1982 | 214: 'O', 1983 | 215: 'x', 1984 | 216: 'O', 1985 | 217: 'U', 1986 | 218: 'U', 1987 | 219: 'U', 1988 | 220: 'U', 1989 | 221: 'Y', 1990 | 223: 's', 1991 | 224: 'a', 1992 | 225: 'a', 1993 | 226: 'a', 1994 | 227: 'a', 1995 | 228: 'a', 1996 | 229: 'a', 1997 | 230: 'a', 1998 | 231: 'c', 1999 | 232: 'e', 2000 | 233: 'e', 2001 | 234: 'e', 2002 | 235: 'e', 2003 | 236: 'i', 2004 | 237: 'i', 2005 | 238: 'i', 2006 | 239: 'i', 2007 | 240: 'd', 2008 | 241: 'n', 2009 | 242: 'o', 2010 | 243: 'o', 2011 | 244: 'o', 2012 | 245: 'o', 2013 | 246: 'o', 2014 | 247: ':', 2015 | 248: 'o', 2016 | 249: 'u', 2017 | 250: 'u', 2018 | 251: 'u', 2019 | 252: 'u', 2020 | 253: 'y', 2021 | 255: 'y', 2022 | 256: 'A', 2023 | 257: 'a', 2024 | 258: 'A', 2025 | 259: 'a', 2026 | 260: 'A', 2027 | 261: 'a', 2028 | 262: 'C', 2029 | 263: 'c', 2030 | 264: 'C', 2031 | 265: 'c', 2032 | 266: 'C', 2033 | 267: 'c', 2034 | 268: 'C', 2035 | 269: 'c', 
2036 | 270: 'D', 2037 | 271: 'd', 2038 | 272: 'D', 2039 | 273: 'd', 2040 | 274: 'E', 2041 | 275: 'e', 2042 | 276: 'E', 2043 | 277: 'e', 2044 | 278: 'E', 2045 | 279: 'e', 2046 | 280: 'E', 2047 | 281: 'e', 2048 | 282: 'E', 2049 | 283: 'e', 2050 | 284: 'G', 2051 | 285: 'g', 2052 | 286: 'G', 2053 | 287: 'g', 2054 | 288: 'G', 2055 | 289: 'g', 2056 | 290: 'G', 2057 | 291: 'g', 2058 | 292: 'H', 2059 | 293: 'h', 2060 | 294: 'H', 2061 | 295: 'h', 2062 | 296: 'I', 2063 | 297: 'i', 2064 | 298: 'I', 2065 | 299: 'i', 2066 | 300: 'I', 2067 | 301: 'i', 2068 | 302: 'I', 2069 | 303: 'i', 2070 | 304: 'I', 2071 | 305: 'i', 2072 | 308: 'J', 2073 | 309: 'j', 2074 | 310: 'K', 2075 | 311: 'k', 2076 | 312: 'k', 2077 | 313: 'L', 2078 | 314: 'l', 2079 | 315: 'L', 2080 | 316: 'l', 2081 | 317: 'L', 2082 | 318: 'l', 2083 | 321: 'L', 2084 | 322: 'l', 2085 | 323: 'N', 2086 | 324: 'n', 2087 | 325: 'N', 2088 | 326: 'n', 2089 | 327: 'N', 2090 | 328: 'n', 2091 | 330: 'N', 2092 | 331: 'n', 2093 | 332: 'O', 2094 | 333: 'o', 2095 | 334: 'O', 2096 | 335: 'o', 2097 | 336: 'O', 2098 | 337: 'o', 2099 | 340: 'R', 2100 | 341: 'r', 2101 | 342: 'R', 2102 | 343: 'r', 2103 | 344: 'R', 2104 | 345: 'r', 2105 | 346: 'S', 2106 | 347: 's', 2107 | 348: 'S', 2108 | 349: 's', 2109 | 350: 'S', 2110 | 351: 's', 2111 | 352: 'S', 2112 | 353: 's', 2113 | 354: 'T', 2114 | 355: 't', 2115 | 356: 'T', 2116 | 357: 't', 2117 | 358: 'T', 2118 | 359: 't', 2119 | 360: 'U', 2120 | 361: 'u', 2121 | 362: 'U', 2122 | 363: 'u', 2123 | 364: 'U', 2124 | 365: 'u', 2125 | 366: 'U', 2126 | 367: 'u', 2127 | 368: 'U', 2128 | 369: 'u', 2129 | 370: 'U', 2130 | 371: 'u', 2131 | 372: 'W', 2132 | 373: 'w', 2133 | 374: 'Y', 2134 | 375: 'y', 2135 | 376: 'Y', 2136 | 377: 'Z', 2137 | 378: 'z', 2138 | 379: 'Z', 2139 | 380: 'z', 2140 | 381: 'Z', 2141 | 382: 'z', 2142 | 383: 's', 2143 | 402: 'f', 2144 | 416: 'O', 2145 | 417: 'o', 2146 | 431: 'U', 2147 | 432: 'u', 2148 | 536: 'S', 2149 | 537: 's', 2150 | 538: 'T', 2151 | 539: 't', 2152 | 697: "'", 2153 | 
699: '‘', 2154 | 700: "'", 2155 | 701: '‛', 2156 | 710: '^', 2157 | 712: "'", 2158 | 713: '¯', 2159 | 716: ',', 2160 | 720: ':', 2161 | 730: '°', 2162 | 732: '~', 2163 | 733: '"', 2164 | 884: "'", 2165 | 885: ',', 2166 | 894: ';', 2167 | 7682: 'B', 2168 | 7683: 'b', 2169 | 7690: 'D', 2170 | 7691: 'd', 2171 | 7710: 'F', 2172 | 7711: 'f', 2173 | 7744: 'M', 2174 | 7745: 'm', 2175 | 7766: 'P', 2176 | 7767: 'p', 2177 | 7776: 'S', 2178 | 7777: 's', 2179 | 7786: 'T', 2180 | 7787: 't', 2181 | 7808: 'W', 2182 | 7809: 'w', 2183 | 7810: 'W', 2184 | 7811: 'w', 2185 | 7812: 'W', 2186 | 7813: 'w', 2187 | 7918: 'U', 2188 | 7919: 'u', 2189 | 7922: 'Y', 2190 | 7923: 'y', 2191 | 8192: ' ', 2192 | 8194: ' ', 2193 | 8196: ' ', 2194 | 8197: ' ', 2195 | 8198: ' ', 2196 | 8199: ' ', 2197 | 8200: ' ', 2198 | 8201: ' ', 2199 | 8208: '-', 2200 | 8209: '-', 2201 | 8210: '-', 2202 | 8211: '-', 2203 | 8215: '_', 2204 | 8216: "'", 2205 | 8217: "'", 2206 | 8218: "'", 2207 | 8219: "'", 2208 | 8220: '"', 2209 | 8221: '"', 2210 | 8222: '"', 2211 | 8223: '"', 2212 | 8224: '+', 2213 | 8226: 'o', 2214 | 8227: '>', 2215 | 8228: '.', 2216 | 8231: '-', 2217 | 8239: ' ', 2218 | 8242: "'", 2219 | 8243: '"', 2220 | 8245: '`', 2221 | 8249: '<', 2222 | 8250: '>', 2223 | 8254: '-', 2224 | 8259: '-', 2225 | 8260: '/', 2226 | 8266: '7', 2227 | 8304: '0', 2228 | 8308: '4', 2229 | 8309: '5', 2230 | 8310: '6', 2231 | 8311: '7', 2232 | 8312: '8', 2233 | 8313: '9', 2234 | 8314: '+', 2235 | 8315: '-', 2236 | 8316: '=', 2237 | 8317: '(', 2238 | 8318: ')', 2239 | 8319: 'n', 2240 | 8320: '0', 2241 | 8321: '1', 2242 | 8322: '2', 2243 | 8323: '3', 2244 | 8324: '4', 2245 | 8325: '5', 2246 | 8326: '6', 2247 | 8327: '7', 2248 | 8328: '8', 2249 | 8329: '9', 2250 | 8330: '+', 2251 | 8331: '-', 2252 | 8332: '=', 2253 | 8333: '(', 2254 | 8334: ')', 2255 | 8364: 'E', 2256 | 8451: 'C', 2257 | 8457: 'F', 2258 | 8467: 'l', 2259 | 8490: 'K', 2260 | 8491: 'Å', 2261 | 8494: 'e', 2262 | 8544: 'I', 2263 | 8548: 'V', 2264 | 8553: 'X', 2265 
| 8556: 'L', 2266 | 8557: 'C', 2267 | 8558: 'D', 2268 | 8559: 'M', 2269 | 8560: 'i', 2270 | 8564: 'v', 2271 | 8569: 'x', 2272 | 8572: 'l', 2273 | 8573: 'c', 2274 | 8574: 'd', 2275 | 8575: 'm', 2276 | 8593: '^', 2277 | 8595: 'v', 2278 | 8722: '-', 2279 | 8725: '/', 2280 | 8726: '\\', 2281 | 8727: '*', 2282 | 8728: 'o', 2283 | 8729: '·', 2284 | 8739: '|', 2285 | 8758: ':', 2286 | 8764: '~', 2287 | 8801: '=', 2288 | 8901: '·', 2289 | 8902: '*', 2290 | 8917: '#', 2291 | 9001: '<', 2292 | 9002: '>', 2293 | 9251: '_', 2294 | 9254: '?', 2295 | 9312: '1', 2296 | 9313: '2', 2297 | 9314: '3', 2298 | 9315: '4', 2299 | 9316: '5', 2300 | 9317: '6', 2301 | 9318: '7', 2302 | 9319: '8', 2303 | 9320: '9', 2304 | 9332: '1', 2305 | 9333: '2', 2306 | 9334: '3', 2307 | 9335: '4', 2308 | 9336: '5', 2309 | 9337: '6', 2310 | 9338: '7', 2311 | 9339: '8', 2312 | 9340: '9', 2313 | 9352: '1', 2314 | 9353: '2', 2315 | 9354: '3', 2316 | 9355: '4', 2317 | 9356: '5', 2318 | 9357: '6', 2319 | 9358: '7', 2320 | 9359: '8', 2321 | 9360: '9', 2322 | 9372: 'a', 2323 | 9373: 'b', 2324 | 9374: 'c', 2325 | 9375: 'd', 2326 | 9376: 'e', 2327 | 9377: 'f', 2328 | 9378: 'g', 2329 | 9379: 'h', 2330 | 9380: 'i', 2331 | 9381: 'j', 2332 | 9382: 'k', 2333 | 9383: 'l', 2334 | 9384: 'm', 2335 | 9385: 'n', 2336 | 9386: 'o', 2337 | 9387: 'p', 2338 | 9388: 'q', 2339 | 9389: 'r', 2340 | 9390: 's', 2341 | 9391: 't', 2342 | 9392: 'u', 2343 | 9393: 'v', 2344 | 9394: 'w', 2345 | 9395: 'x', 2346 | 9396: 'y', 2347 | 9397: 'z', 2348 | 9398: 'A', 2349 | 9399: 'B', 2350 | 9400: 'C', 2351 | 9401: 'D', 2352 | 9402: 'E', 2353 | 9403: 'F', 2354 | 9404: 'G', 2355 | 9405: 'H', 2356 | 9406: 'I', 2357 | 9407: 'J', 2358 | 9408: 'K', 2359 | 9409: 'L', 2360 | 9410: 'M', 2361 | 9411: 'N', 2362 | 9412: 'O', 2363 | 9413: 'P', 2364 | 9414: 'Q', 2365 | 9415: 'R', 2366 | 9416: 'S', 2367 | 9417: 'T', 2368 | 9418: 'U', 2369 | 9419: 'V', 2370 | 9420: 'W', 2371 | 9421: 'X', 2372 | 9422: 'Y', 2373 | 9423: 'Z', 2374 | 9424: 'a', 2375 | 9425: 'b', 2376 
| 9426: 'c', 2377 | 9427: 'd', 2378 | 9428: 'e', 2379 | 9429: 'f', 2380 | 9430: 'g', 2381 | 9431: 'h', 2382 | 9432: 'i', 2383 | 9433: 'j', 2384 | 9434: 'k', 2385 | 9435: 'l', 2386 | 9436: 'm', 2387 | 9437: 'n', 2388 | 9438: 'o', 2389 | 9439: 'p', 2390 | 9440: 'q', 2391 | 9441: 'r', 2392 | 9442: 's', 2393 | 9443: 't', 2394 | 9444: 'u', 2395 | 9445: 'v', 2396 | 9446: 'w', 2397 | 9447: 'x', 2398 | 9448: 'y', 2399 | 9449: 'z', 2400 | 9450: '0', 2401 | 9472: '-', 2402 | 9473: '=', 2403 | 9474: '|', 2404 | 9475: '|', 2405 | 9476: '-', 2406 | 9477: '=', 2407 | 9478: '|', 2408 | 9479: '|', 2409 | 9480: '-', 2410 | 9481: '=', 2411 | 9482: '|', 2412 | 9483: '|', 2413 | 9484: '+', 2414 | 9485: '+', 2415 | 9486: '+', 2416 | 9487: '+', 2417 | 9488: '+', 2418 | 9489: '+', 2419 | 9490: '+', 2420 | 9491: '+', 2421 | 9492: '+', 2422 | 9493: '+', 2423 | 9494: '+', 2424 | 9495: '+', 2425 | 9496: '+', 2426 | 9497: '+', 2427 | 9498: '+', 2428 | 9499: '+', 2429 | 9500: '+', 2430 | 9501: '+', 2431 | 9502: '+', 2432 | 9503: '+', 2433 | 9504: '+', 2434 | 9505: '+', 2435 | 9506: '+', 2436 | 9507: '+', 2437 | 9508: '+', 2438 | 9509: '+', 2439 | 9510: '+', 2440 | 9511: '+', 2441 | 9512: '+', 2442 | 9513: '+', 2443 | 9514: '+', 2444 | 9515: '+', 2445 | 9516: '+', 2446 | 9517: '+', 2447 | 9518: '+', 2448 | 9519: '+', 2449 | 9520: '+', 2450 | 9521: '+', 2451 | 9522: '+', 2452 | 9523: '+', 2453 | 9524: '+', 2454 | 9525: '+', 2455 | 9526: '+', 2456 | 9527: '+', 2457 | 9528: '+', 2458 | 9529: '+', 2459 | 9530: '+', 2460 | 9531: '+', 2461 | 9532: '+', 2462 | 9533: '+', 2463 | 9534: '+', 2464 | 9535: '+', 2465 | 9536: '+', 2466 | 9537: '+', 2467 | 9538: '+', 2468 | 9539: '+', 2469 | 9540: '+', 2470 | 9541: '+', 2471 | 9542: '+', 2472 | 9543: '+', 2473 | 9544: '+', 2474 | 9545: '+', 2475 | 9546: '+', 2476 | 9547: '+', 2477 | 9548: '-', 2478 | 9549: '=', 2479 | 9550: '|', 2480 | 9551: '|', 2481 | 9552: '=', 2482 | 9553: '|', 2483 | 9554: '+', 2484 | 9555: '+', 2485 | 9556: '+', 2486 | 9557: '+', 2487 | 
9558: '+', 2488 | 9559: '+', 2489 | 9560: '+', 2490 | 9561: '+', 2491 | 9562: '+', 2492 | 9563: '+', 2493 | 9564: '+', 2494 | 9565: '+', 2495 | 9566: '+', 2496 | 9567: '+', 2497 | 9568: '+', 2498 | 9569: '+', 2499 | 9570: '+', 2500 | 9571: '+', 2501 | 9572: '+', 2502 | 9573: '+', 2503 | 9574: '+', 2504 | 9575: '+', 2505 | 9576: '+', 2506 | 9577: '+', 2507 | 9578: '+', 2508 | 9579: '+', 2509 | 9580: '+', 2510 | 9581: '+', 2511 | 9582: '+', 2512 | 9583: '+', 2513 | 9584: '+', 2514 | 9585: '/', 2515 | 9586: '\\', 2516 | 9587: 'X', 2517 | 9596: '-', 2518 | 9597: '|', 2519 | 9598: '-', 2520 | 9599: '|', 2521 | 9675: 'o', 2522 | 9702: 'o', 2523 | 9733: '*', 2524 | 9734: '*', 2525 | 9746: 'X', 2526 | 9747: 'X', 2527 | 9837: 'b', 2528 | 9839: '#', 2529 | 9996: 'V', 2530 | 10003: '√', 2531 | 10004: '√', 2532 | 10005: 'x', 2533 | 10006: 'x', 2534 | 10007: 'X', 2535 | 10008: 'X', 2536 | 10009: '+', 2537 | 10010: '+', 2538 | 10011: '+', 2539 | 10012: '+', 2540 | 10013: '+', 2541 | 10014: '+', 2542 | 10015: '+', 2543 | 10016: '+', 2544 | 10017: '*', 2545 | 10018: '+', 2546 | 10019: '+', 2547 | 10020: '+', 2548 | 10021: '+', 2549 | 10022: '+', 2550 | 10023: '+', 2551 | 10025: '*', 2552 | 10026: '*', 2553 | 10027: '*', 2554 | 10028: '*', 2555 | 10029: '*', 2556 | 10030: '*', 2557 | 10031: '*', 2558 | 10032: '*', 2559 | 10033: '*', 2560 | 10034: '*', 2561 | 10035: '*', 2562 | 10036: '*', 2563 | 10037: '*', 2564 | 10038: '*', 2565 | 10039: '*', 2566 | 10040: '*', 2567 | 10041: '*', 2568 | 10042: '*', 2569 | 10043: '*', 2570 | 10044: '*', 2571 | 10045: '*', 2572 | 10046: '*', 2573 | 10047: '*', 2574 | 10048: '*', 2575 | 10049: '*', 2576 | 10050: '*', 2577 | 10051: '*', 2578 | 10052: '*', 2579 | 10053: '*', 2580 | 10054: '*', 2581 | 10055: '*', 2582 | 10056: '*', 2583 | 10057: '*', 2584 | 10058: '*', 2585 | 10059: '*', 2586 | 65533: '?', 2587 | } 2588 | 2589 | 2590 | ### < 2591 | -------------------------------------------------------------------------------- /transtab/transtab: 
-------------------------------------------------------------------------------- 1 | % $Id: $ 2 | 3 | % APOSTROPHE 4 | 5 | % GRAVE ACCENT 6 | ; 7 | % NO-BREAK SPACE 8 | 9 | % INVERTED EXCLAMATION MARK 10 | 11 | % CENT SIGN 12 | 13 | % POUND SIGN 14 | "" 15 | % YEN SIGN 16 | 17 | % BROKEN BAR 18 | 19 | % SECTION SIGN 20 | 21 | % DIAERESIS 22 | 23 | % COPYRIGHT SIGN 24 | ""; 25 | % FEMININE ORDINAL INDICATOR 26 | 27 | % LEFT-POINTING DOUBLE ANGLE QUOTATION MARK 28 | "" 29 | % NOT SIGN 30 | 31 | % SOFT HYPHEN 32 | 33 | % REGISTERED SIGN 34 | "" 35 | % MACRON 36 | 37 | % DEGREE SIGN 38 | 39 | % PLUS-MINUS SIGN 40 | "" 41 | % SUPERSCRIPT TWO 42 | ""; 43 | % SUPERSCRIPT THREE 44 | ""; 45 | % ACUTE ACCENT 46 | 47 | % MICRO SIGN 48 | ; 49 | % PILCROW SIGN 50 | 51 | % MIDDLE DOT 52 | 53 | % CEDILLA 54 | 55 | % SUPERSCRIPT ONE 56 | ""; 57 | % MASCULINE ORDINAL INDICATOR 58 | 59 | % RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK 60 | "" 61 | % VULGAR FRACTION ONE QUARTER 62 | "" 63 | % VULGAR FRACTION ONE HALF 64 | "" 65 | % VULGAR FRACTION THREE QUARTERS 66 | "" 67 | % INVERTED QUESTION MARK 68 | 69 | % LATIN CAPITAL LETTER A WITH GRAVE 70 | 71 | % LATIN CAPITAL LETTER A WITH ACUTE 72 | 73 | % LATIN CAPITAL LETTER A WITH CIRCUMFLEX 74 | 75 | % LATIN CAPITAL LETTER A WITH TILDE 76 | 77 | % LATIN CAPITAL LETTER A WITH DIAERESIS 78 | ""; 79 | % LATIN CAPITAL LETTER A WITH RING ABOVE 80 | ""; 81 | % LATIN CAPITAL LETTER AE 82 | ""; 83 | % LATIN CAPITAL LETTER C WITH CEDILLA 84 | 85 | % LATIN CAPITAL LETTER E WITH GRAVE 86 | 87 | % LATIN CAPITAL LETTER E WITH ACUTE 88 | 89 | % LATIN CAPITAL LETTER E WITH CIRCUMFLEX 90 | 91 | % LATIN CAPITAL LETTER E WITH DIAERESIS 92 | 93 | % LATIN CAPITAL LETTER I WITH GRAVE 94 | 95 | % LATIN CAPITAL LETTER I WITH ACUTE 96 | 97 | % LATIN CAPITAL LETTER I WITH CIRCUMFLEX 98 | 99 | % LATIN CAPITAL LETTER I WITH DIAERESIS 100 | 101 | % LATIN CAPITAL LETTER ETH 102 | 103 | % LATIN CAPITAL LETTER N WITH TILDE 104 | 105 | % LATIN CAPITAL LETTER O WITH 
GRAVE 106 | 107 | % LATIN CAPITAL LETTER O WITH ACUTE 108 | 109 | % LATIN CAPITAL LETTER O WITH CIRCUMFLEX 110 | 111 | % LATIN CAPITAL LETTER O WITH TILDE 112 | 113 | % LATIN CAPITAL LETTER O WITH DIAERESIS 114 | ""; 115 | % MULTIPLICATION SIGN 116 | 117 | % LATIN CAPITAL LETTER O WITH STROKE 118 | 119 | % LATIN CAPITAL LETTER U WITH GRAVE 120 | 121 | % LATIN CAPITAL LETTER U WITH ACUTE 122 | 123 | % LATIN CAPITAL LETTER U WITH CIRCUMFLEX 124 | 125 | % LATIN CAPITAL LETTER U WITH DIAERESIS 126 | ""; 127 | % LATIN CAPITAL LETTER Y WITH ACUTE 128 | 129 | % LATIN CAPITAL LETTER THORN 130 | "" 131 | % LATIN SMALL LETTER SHARP S 132 | ""; 133 | % LATIN SMALL LETTER A WITH GRAVE 134 | 135 | % LATIN SMALL LETTER A WITH ACUTE 136 | 137 | % LATIN SMALL LETTER A WITH CIRCUMFLEX 138 | 139 | % LATIN SMALL LETTER A WITH TILDE 140 | 141 | % LATIN SMALL LETTER A WITH DIAERESIS 142 | ""; 143 | % LATIN SMALL LETTER A WITH RING ABOVE 144 | ""; 145 | % LATIN SMALL LETTER AE 146 | ""; 147 | % LATIN SMALL LETTER C WITH CEDILLA 148 | 149 | % LATIN SMALL LETTER E WITH GRAVE 150 | 151 | % LATIN SMALL LETTER E WITH ACUTE 152 | 153 | % LATIN SMALL LETTER E WITH CIRCUMFLEX 154 | 155 | % LATIN SMALL LETTER E WITH DIAERESIS 156 | 157 | % LATIN SMALL LETTER I WITH GRAVE 158 | 159 | % LATIN SMALL LETTER I WITH ACUTE 160 | 161 | % LATIN SMALL LETTER I WITH CIRCUMFLEX 162 | 163 | % LATIN SMALL LETTER I WITH DIAERESIS 164 | 165 | % LATIN SMALL LETTER ETH 166 | 167 | % LATIN SMALL LETTER N WITH TILDE 168 | 169 | % LATIN SMALL LETTER O WITH GRAVE 170 | 171 | % LATIN SMALL LETTER O WITH ACUTE 172 | 173 | % LATIN SMALL LETTER O WITH CIRCUMFLEX 174 | 175 | % LATIN SMALL LETTER O WITH TILDE 176 | 177 | % LATIN SMALL LETTER O WITH DIAERESIS 178 | ""; 179 | % DIVISION SIGN 180 | 181 | % LATIN SMALL LETTER O WITH STROKE 182 | 183 | % LATIN SMALL LETTER U WITH GRAVE 184 | 185 | % LATIN SMALL LETTER U WITH ACUTE 186 | 187 | % LATIN SMALL LETTER U WITH CIRCUMFLEX 188 | 189 | % LATIN SMALL LETTER U WITH 
DIAERESIS 190 | ""; 191 | % LATIN SMALL LETTER Y WITH ACUTE 192 | 193 | % LATIN SMALL LETTER THORN 194 | "" 195 | % LATIN SMALL LETTER Y WITH DIAERESIS 196 | 197 | % LATIN CAPITAL LETTER A WITH MACRON 198 | 199 | % LATIN SMALL LETTER A WITH MACRON 200 | 201 | % LATIN CAPITAL LETTER A WITH BREVE 202 | 203 | % LATIN SMALL LETTER A WITH BREVE 204 | 205 | % LATIN CAPITAL LETTER A WITH OGONEK 206 | 207 | % LATIN SMALL LETTER A WITH OGONEK 208 | 209 | % LATIN CAPITAL LETTER C WITH ACUTE 210 | 211 | % LATIN SMALL LETTER C WITH ACUTE 212 | 213 | % LATIN CAPITAL LETTER C WITH CIRCUMFLEX 214 | ""; 215 | % LATIN SMALL LETTER C WITH CIRCUMFLEX 216 | ""; 217 | % LATIN CAPITAL LETTER C WITH DOT ABOVE 218 | 219 | % LATIN SMALL LETTER C WITH DOT ABOVE 220 | 221 | % LATIN CAPITAL LETTER C WITH CARON 222 | 223 | % LATIN SMALL LETTER C WITH CARON 224 | 225 | % LATIN CAPITAL LETTER D WITH CARON 226 | 227 | % LATIN SMALL LETTER D WITH CARON 228 | 229 | % LATIN CAPITAL LETTER D WITH STROKE 230 | 231 | % LATIN SMALL LETTER D WITH STROKE 232 | 233 | % LATIN CAPITAL LETTER E WITH MACRON 234 | 235 | % LATIN SMALL LETTER E WITH MACRON 236 | 237 | % LATIN CAPITAL LETTER E WITH BREVE 238 | 239 | % LATIN SMALL LETTER E WITH BREVE 240 | 241 | % LATIN CAPITAL LETTER E WITH DOT ABOVE 242 | 243 | % LATIN SMALL LETTER E WITH DOT ABOVE 244 | 245 | % LATIN CAPITAL LETTER E WITH OGONEK 246 | 247 | % LATIN SMALL LETTER E WITH OGONEK 248 | 249 | % LATIN CAPITAL LETTER E WITH CARON 250 | 251 | % LATIN SMALL LETTER E WITH CARON 252 | 253 | % LATIN CAPITAL LETTER G WITH CIRCUMFLEX 254 | ""; 255 | % LATIN SMALL LETTER G WITH CIRCUMFLEX 256 | ""; 257 | % LATIN CAPITAL LETTER G WITH BREVE 258 | 259 | % LATIN SMALL LETTER G WITH BREVE 260 | 261 | % LATIN CAPITAL LETTER G WITH DOT ABOVE 262 | 263 | % LATIN SMALL LETTER G WITH DOT ABOVE 264 | 265 | % LATIN CAPITAL LETTER G WITH CEDILLA 266 | 267 | % LATIN SMALL LETTER G WITH CEDILLA 268 | 269 | % LATIN CAPITAL LETTER H WITH CIRCUMFLEX 270 | ""; 271 | % LATIN 
SMALL LETTER H WITH CIRCUMFLEX 272 | ""; 273 | % LATIN CAPITAL LETTER H WITH STROKE 274 | 275 | % LATIN SMALL LETTER H WITH STROKE 276 | 277 | % LATIN CAPITAL LETTER I WITH TILDE 278 | 279 | % LATIN SMALL LETTER I WITH TILDE 280 | 281 | % LATIN CAPITAL LETTER I WITH MACRON 282 | 283 | % LATIN SMALL LETTER I WITH MACRON 284 | 285 | % LATIN CAPITAL LETTER I WITH BREVE 286 | 287 | % LATIN SMALL LETTER I WITH BREVE 288 | 289 | % LATIN CAPITAL LETTER I WITH OGONEK 290 | 291 | % LATIN SMALL LETTER I WITH OGONEK 292 | 293 | % LATIN CAPITAL LETTER I WITH DOT ABOVE 294 | 295 | % LATIN SMALL LETTER DOTLESS I 296 | 297 | % LATIN CAPITAL LIGATURE IJ 298 | "" 299 | % LATIN SMALL LIGATURE IJ 300 | "" 301 | % LATIN CAPITAL LETTER J WITH CIRCUMFLEX 302 | ""; 303 | % LATIN SMALL LETTER J WITH CIRCUMFLEX 304 | ""; 305 | % LATIN CAPITAL LETTER K WITH CEDILLA 306 | 307 | % LATIN SMALL LETTER K WITH CEDILLA 308 | 309 | % LATIN SMALL LETTER KRA 310 | 311 | % LATIN CAPITAL LETTER L WITH ACUTE 312 | 313 | % LATIN SMALL LETTER L WITH ACUTE 314 | 315 | % LATIN CAPITAL LETTER L WITH CEDILLA 316 | 317 | % LATIN SMALL LETTER L WITH CEDILLA 318 | 319 | % LATIN CAPITAL LETTER L WITH CARON 320 | 321 | % LATIN SMALL LETTER L WITH CARON 322 | 323 | % LATIN CAPITAL LETTER L WITH MIDDLE DOT 324 | "";""; 325 | % LATIN SMALL LETTER L WITH MIDDLE DOT 326 | "";""; 327 | % LATIN CAPITAL LETTER L WITH STROKE 328 | 329 | % LATIN SMALL LETTER L WITH STROKE 330 | 331 | % LATIN CAPITAL LETTER N WITH ACUTE 332 | 333 | % LATIN SMALL LETTER N WITH ACUTE 334 | 335 | % LATIN CAPITAL LETTER N WITH CEDILLA 336 | 337 | % LATIN SMALL LETTER N WITH CEDILLA 338 | 339 | % LATIN CAPITAL LETTER N WITH CARON 340 | 341 | % LATIN SMALL LETTER N WITH CARON 342 | 343 | % LATIN SMALL LETTER N PRECEDED BY APOSTROPHE 344 | "" 345 | % LATIN CAPITAL LETTER ENG 346 | ""; 347 | % LATIN SMALL LETTER ENG 348 | ""; 349 | % LATIN CAPITAL LETTER O WITH MACRON 350 | 351 | % LATIN SMALL LETTER O WITH MACRON 352 | 353 | % LATIN CAPITAL LETTER 
O WITH BREVE 354 | 355 | % LATIN SMALL LETTER O WITH BREVE 356 | 357 | % LATIN CAPITAL LETTER O WITH DOUBLE ACUTE 358 | 359 | % LATIN SMALL LETTER O WITH DOUBLE ACUTE 360 | 361 | % LATIN CAPITAL LIGATURE OE 362 | "" 363 | % LATIN SMALL LIGATURE OE 364 | "" 365 | % LATIN CAPITAL LETTER R WITH ACUTE 366 | 367 | % LATIN SMALL LETTER R WITH ACUTE 368 | 369 | % LATIN CAPITAL LETTER R WITH CEDILLA 370 | 371 | % LATIN SMALL LETTER R WITH CEDILLA 372 | 373 | % LATIN CAPITAL LETTER R WITH CARON 374 | 375 | % LATIN SMALL LETTER R WITH CARON 376 | 377 | % LATIN CAPITAL LETTER S WITH ACUTE 378 | 379 | % LATIN SMALL LETTER S WITH ACUTE 380 | 381 | % LATIN CAPITAL LETTER S WITH CIRCUMFLEX 382 | ""; 383 | % LATIN SMALL LETTER S WITH CIRCUMFLEX 384 | ""; 385 | % LATIN CAPITAL LETTER S WITH CEDILLA 386 | 387 | % LATIN SMALL LETTER S WITH CEDILLA 388 | 389 | % LATIN CAPITAL LETTER S WITH CARON 390 | 391 | % LATIN SMALL LETTER S WITH CARON 392 | 393 | % LATIN CAPITAL LETTER T WITH CEDILLA 394 | 395 | % LATIN SMALL LETTER T WITH CEDILLA 396 | 397 | % LATIN CAPITAL LETTER T WITH CARON 398 | 399 | % LATIN SMALL LETTER T WITH CARON 400 | 401 | % LATIN CAPITAL LETTER T WITH STROKE 402 | 403 | % LATIN SMALL LETTER T WITH STROKE 404 | 405 | % LATIN CAPITAL LETTER U WITH TILDE 406 | 407 | % LATIN SMALL LETTER U WITH TILDE 408 | 409 | % LATIN CAPITAL LETTER U WITH MACRON 410 | 411 | % LATIN SMALL LETTER U WITH MACRON 412 | 413 | % LATIN CAPITAL LETTER U WITH BREVE 414 | 415 | % LATIN SMALL LETTER U WITH BREVE 416 | 417 | % LATIN CAPITAL LETTER U WITH RING ABOVE 418 | 419 | % LATIN SMALL LETTER U WITH RING ABOVE 420 | 421 | % LATIN CAPITAL LETTER U WITH DOUBLE ACUTE 422 | 423 | % LATIN SMALL LETTER U WITH DOUBLE ACUTE 424 | 425 | % LATIN CAPITAL LETTER U WITH OGONEK 426 | 427 | % LATIN SMALL LETTER U WITH OGONEK 428 | 429 | % LATIN CAPITAL LETTER W WITH CIRCUMFLEX 430 | 431 | % LATIN SMALL LETTER W WITH CIRCUMFLEX 432 | 433 | % LATIN CAPITAL LETTER Y WITH CIRCUMFLEX 434 | 435 | % LATIN SMALL 
LETTER Y WITH CIRCUMFLEX 436 | 437 | % LATIN CAPITAL LETTER Y WITH DIAERESIS 438 | 439 | % LATIN CAPITAL LETTER Z WITH ACUTE 440 | 441 | % LATIN SMALL LETTER Z WITH ACUTE 442 | 443 | % LATIN CAPITAL LETTER Z WITH DOT ABOVE 444 | 445 | % LATIN SMALL LETTER Z WITH DOT ABOVE 446 | 447 | % LATIN CAPITAL LETTER Z WITH CARON 448 | 449 | % LATIN SMALL LETTER Z WITH CARON 450 | 451 | % LATIN SMALL LETTER LONG S 452 | 453 | % LATIN SMALL LETTER F WITH HOOK 454 | 455 | % LATIN CAPITAL LETTER O WITH HORN 456 | 457 | % LATIN SMALL LETTER O WITH HORN 458 | 459 | % LATIN CAPITAL LETTER U WITH HORN 460 | 461 | % LATIN SMALL LETTER U WITH HORN 462 | 463 | % LATIN CAPITAL LETTER S WITH COMMA BELOW 464 | ; 465 | % LATIN SMALL LETTER S WITH COMMA BELOW 466 | ; 467 | % LATIN CAPITAL LETTER T WITH COMMA BELOW 468 | ; 469 | % LATIN SMALL LETTER T WITH COMMA BELOW 470 | ; 471 | % MODIFIER LETTER PRIME 472 | ; 473 | % MODIFIER LETTER TURNED COMMA 474 | 475 | % MODIFIER LETTER APOSTROPHE 476 | ; 477 | % MODIFIER LETTER REVERSED COMMA 478 | 479 | % MODIFIER LETTER CIRCUMFLEX ACCENT 480 | 481 | % MODIFIER LETTER VERTICAL LINE 482 | 483 | % MODIFIER LETTER MACRON 484 | 485 | % MODIFIER LETTER LOW VERTICAL LINE 486 | 487 | % MODIFIER LETTER TRIANGULAR COLON 488 | 489 | % RING ABOVE 490 | 491 | % SMALL TILDE 492 | 493 | % DOUBLE ACUTE ACCENT 494 | 495 | % GREEK NUMERAL SIGN 496 | 497 | % GREEK LOWER NUMERAL SIGN 498 | 499 | % GREEK QUESTION MARK 500 | 501 | % LATIN CAPITAL LETTER B WITH DOT ABOVE 502 | 503 | % LATIN SMALL LETTER B WITH DOT ABOVE 504 | 505 | % LATIN CAPITAL LETTER D WITH DOT ABOVE 506 | 507 | % LATIN SMALL LETTER D WITH DOT ABOVE 508 | 509 | % LATIN CAPITAL LETTER F WITH DOT ABOVE 510 | 511 | % LATIN SMALL LETTER F WITH DOT ABOVE 512 | 513 | % LATIN CAPITAL LETTER M WITH DOT ABOVE 514 | 515 | % LATIN SMALL LETTER M WITH DOT ABOVE 516 | 517 | % LATIN CAPITAL LETTER P WITH DOT ABOVE 518 | 519 | % LATIN SMALL LETTER P WITH DOT ABOVE 520 | 521 | % LATIN CAPITAL LETTER S WITH DOT 
ABOVE 522 | 523 | % LATIN SMALL LETTER S WITH DOT ABOVE 524 | 525 | % LATIN CAPITAL LETTER T WITH DOT ABOVE 526 | 527 | % LATIN SMALL LETTER T WITH DOT ABOVE 528 | 529 | % LATIN CAPITAL LETTER W WITH GRAVE 530 | 531 | % LATIN SMALL LETTER W WITH GRAVE 532 | 533 | % LATIN CAPITAL LETTER W WITH ACUTE 534 | 535 | % LATIN SMALL LETTER W WITH ACUTE 536 | 537 | % LATIN CAPITAL LETTER W WITH DIAERESIS 538 | 539 | % LATIN SMALL LETTER W WITH DIAERESIS 540 | 541 | % LATIN CAPITAL LETTER U WITH HORN AND TILDE 542 | 543 | % LATIN SMALL LETTER U WITH HORN AND TILDE 544 | 545 | % LATIN CAPITAL LETTER Y WITH GRAVE 546 | 547 | % LATIN SMALL LETTER Y WITH GRAVE 548 | 549 | % EN QUAD 550 | 551 | % EM QUAD 552 | "" 553 | % EN SPACE 554 | 555 | % EM SPACE 556 | "" 557 | % THREE-PER-EM SPACE 558 | 559 | % FOUR-PER-EM SPACE 560 | 561 | % SIX-PER-EM SPACE 562 | 563 | % FIGURE SPACE 564 | 565 | % PUNCTUATION SPACE 566 | 567 | % THIN SPACE 568 | 569 | % HAIR SPACE 570 | "" 571 | % ZERO WIDTH SPACE 572 | "" 573 | % ZERO WIDTH NON-JOINER 574 | "" 575 | % ZERO WIDTH JOINER 576 | "" 577 | % LEFT-TO-RIGHT MARK 578 | "" 579 | % RIGHT-TO-LEFT MARK 580 | "" 581 | % HYPHEN 582 | 583 | % NON-BREAKING HYPHEN 584 | 585 | % FIGURE DASH 586 | 587 | % EN DASH 588 | 589 | % EM DASH 590 | "" 591 | % HORIZONTAL BAR 592 | "" 593 | % DOUBLE VERTICAL LINE 594 | "" 595 | % DOUBLE LOW LINE 596 | 597 | % LEFT SINGLE QUOTATION MARK 598 | 599 | % RIGHT SINGLE QUOTATION MARK 600 | 601 | % SINGLE LOW-9 QUOTATION MARK 602 | 603 | % SINGLE HIGH-REVERSED-9 QUOTATION MARK 604 | 605 | % LEFT DOUBLE QUOTATION MARK 606 | 607 | % RIGHT DOUBLE QUOTATION MARK 608 | 609 | % DOUBLE LOW-9 QUOTATION MARK 610 | 611 | % DOUBLE HIGH-REVERSED-9 QUOTATION MARK 612 | 613 | % DAGGER 614 | 615 | % DOUBLE DAGGER 616 | "" 617 | % BULLET 618 | 619 | % TRIANGULAR BULLET 620 | 621 | % ONE DOT LEADER 622 | 623 | % TWO DOT LEADER 624 | "" 625 | % HORIZONTAL ELLIPSIS 626 | "" 627 | % HYPHENATION POINT 628 | 629 | % LEFT-TO-RIGHT EMBEDDING 630 | 
"" 631 | % RIGHT-TO-LEFT EMBEDDING 632 | "" 633 | % POP DIRECTIONAL FORMATTING 634 | "" 635 | % LEFT-TO-RIGHT OVERRIDE 636 | "" 637 | % RIGHT-TO-LEFT OVERRIDE 638 | "" 639 | % NARROW NO-BREAK SPACE 640 | 641 | % PER MILLE SIGN 642 | "" 643 | % PRIME 644 | 645 | % DOUBLE PRIME 646 | 647 | % TRIPLE PRIME 648 | "" 649 | % REVERSED PRIME 650 | 651 | % REVERSED DOUBLE PRIME 652 | "" 653 | % REVERSED TRIPLE PRIME 654 | "" 655 | % SINGLE LEFT-POINTING ANGLE QUOTATION MARK 656 | 657 | % SINGLE RIGHT-POINTING ANGLE QUOTATION MARK 658 | 659 | % DOUBLE EXCLAMATION MARK 660 | "" 661 | % OVERLINE 662 | 663 | % HYPHEN BULLET 664 | 665 | % FRACTION SLASH 666 | 667 | % QUESTION EXCLAMATION MARK 668 | "" 669 | % EXCLAMATION QUESTION MARK 670 | "" 671 | % TIRONIAN SIGN ET 672 | 673 | % SUPERSCRIPT ZERO 674 | ""; 675 | % SUPERSCRIPT FOUR 676 | ""; 677 | % SUPERSCRIPT FIVE 678 | ""; 679 | % SUPERSCRIPT SIX 680 | ""; 681 | % SUPERSCRIPT SEVEN 682 | ""; 683 | % SUPERSCRIPT EIGHT 684 | ""; 685 | % SUPERSCRIPT NINE 686 | ""; 687 | % SUPERSCRIPT PLUS SIGN 688 | ""; 689 | % SUPERSCRIPT MINUS 690 | ""; 691 | % SUPERSCRIPT EQUALS SIGN 692 | ""; 693 | % SUPERSCRIPT LEFT PARENTHESIS 694 | ""; 695 | % SUPERSCRIPT RIGHT PARENTHESIS 696 | ""; 697 | % SUPERSCRIPT LATIN SMALL LETTER N 698 | ""; 699 | % SUBSCRIPT ZERO 700 | ""; 701 | % SUBSCRIPT ONE 702 | ""; 703 | % SUBSCRIPT TWO 704 | ""; 705 | % SUBSCRIPT THREE 706 | ""; 707 | % SUBSCRIPT FOUR 708 | ""; 709 | % SUBSCRIPT FIVE 710 | ""; 711 | % SUBSCRIPT SIX 712 | ""; 713 | % SUBSCRIPT SEVEN 714 | ""; 715 | % SUBSCRIPT EIGHT 716 | ""; 717 | % SUBSCRIPT NINE 718 | ""; 719 | % SUBSCRIPT PLUS SIGN 720 | ""; 721 | % SUBSCRIPT MINUS 722 | ""; 723 | % SUBSCRIPT EQUALS SIGN 724 | ""; 725 | % SUBSCRIPT LEFT PARENTHESIS 726 | ""; 727 | % SUBSCRIPT RIGHT PARENTHESIS 728 | ""; 729 | % EURO SIGN 730 | ""; 731 | % ACCOUNT OF 732 | "" 733 | % ADDRESSED TO THE SUBJECT 734 | "" 735 | % DEGREE CELSIUS 736 | ""; 737 | % CARE OF 738 | "" 739 | % CADA UNA 740 | "" 741 
| % DEGREE FAHRENHEIT 742 | ""; 743 | % SCRIPT SMALL L 744 | 745 | % NUMERO SIGN 746 | "";"" 747 | % SOUND RECORDING COPYRIGHT 748 | "" 749 | % SERVICE MARK 750 | "" 751 | % TELEPHONE SIGN 752 | "" 753 | % TRADE MARK SIGN 754 | "" 755 | % OHM SIGN 756 | ;""; 757 | % KELVIN SIGN 758 | 759 | % ANGSTROM SIGN 760 | 761 | % ESTIMATED SYMBOL 762 | 763 | % VULGAR FRACTION ONE THIRD 764 | "" 765 | % VULGAR FRACTION TWO THIRDS 766 | "" 767 | % VULGAR FRACTION ONE FIFTH 768 | "" 769 | % VULGAR FRACTION TWO FIFTHS 770 | "" 771 | % VULGAR FRACTION THREE FIFTHS 772 | "" 773 | % VULGAR FRACTION FOUR FIFTHS 774 | "" 775 | % VULGAR FRACTION ONE SIXTH 776 | "" 777 | % VULGAR FRACTION FIVE SIXTHS 778 | "" 779 | % VULGAR FRACTION ONE EIGHTH 780 | "" 781 | % VULGAR FRACTION THREE EIGHTHS 782 | "" 783 | % VULGAR FRACTION FIVE EIGHTHS 784 | "" 785 | % VULGAR FRACTION SEVEN EIGHTHS 786 | "" 787 | % FRACTION NUMERATOR ONE 788 | "" 789 | % ROMAN NUMERAL ONE 790 | 791 | % ROMAN NUMERAL TWO 792 | "" 793 | % ROMAN NUMERAL THREE 794 | "" 795 | % ROMAN NUMERAL FOUR 796 | "" 797 | % ROMAN NUMERAL FIVE 798 | 799 | % ROMAN NUMERAL SIX 800 | "" 801 | % ROMAN NUMERAL SEVEN 802 | "" 803 | % ROMAN NUMERAL EIGHT 804 | "" 805 | % ROMAN NUMERAL NINE 806 | "" 807 | % ROMAN NUMERAL TEN 808 | 809 | % ROMAN NUMERAL ELEVEN 810 | "" 811 | % ROMAN NUMERAL TWELVE 812 | "" 813 | % ROMAN NUMERAL FIFTY 814 | 815 | % ROMAN NUMERAL ONE HUNDRED 816 | 817 | % ROMAN NUMERAL FIVE HUNDRED 818 | 819 | % ROMAN NUMERAL ONE THOUSAND 820 | 821 | % SMALL ROMAN NUMERAL ONE 822 | 823 | % SMALL ROMAN NUMERAL TWO 824 | "" 825 | % SMALL ROMAN NUMERAL THREE 826 | "" 827 | % SMALL ROMAN NUMERAL FOUR 828 | "" 829 | % SMALL ROMAN NUMERAL FIVE 830 | 831 | % SMALL ROMAN NUMERAL SIX 832 | "" 833 | % SMALL ROMAN NUMERAL SEVEN 834 | "" 835 | % SMALL ROMAN NUMERAL EIGHT 836 | "" 837 | % SMALL ROMAN NUMERAL NINE 838 | "" 839 | % SMALL ROMAN NUMERAL TEN 840 | 841 | % SMALL ROMAN NUMERAL ELEVEN 842 | "" 843 | % SMALL ROMAN NUMERAL TWELVE 844 | 
"" 845 | % SMALL ROMAN NUMERAL FIFTY 846 | 847 | % SMALL ROMAN NUMERAL ONE HUNDRED 848 | 849 | % SMALL ROMAN NUMERAL FIVE HUNDRED 850 | 851 | % SMALL ROMAN NUMERAL ONE THOUSAND 852 | 853 | % LEFTWARDS ARROW 854 | "" 855 | % UPWARDS ARROW 856 | 857 | % RIGHTWARDS ARROW 858 | "" 859 | % DOWNWARDS ARROW 860 | 861 | % LEFT RIGHT ARROW 862 | "" 863 | % LEFTWARDS DOUBLE ARROW 864 | "" 865 | % RIGHTWARDS DOUBLE ARROW 866 | "" 867 | % LEFT RIGHT DOUBLE ARROW 868 | "" 869 | % MINUS SIGN 870 | ; 871 | % DIVISION SLASH 872 | 873 | % SET MINUS 874 | 875 | % ASTERISK OPERATOR 876 | 877 | % RING OPERATOR 878 | 879 | % BULLET OPERATOR 880 | 881 | % INFINITY 882 | "" 883 | % DIVIDES 884 | 885 | % PARALLEL TO 886 | "" 887 | % RATIO 888 | 889 | % TILDE OPERATOR 890 | 891 | % NOT EQUAL TO 892 | "" 893 | % IDENTICAL TO 894 | 895 | % LESS-THAN OR EQUAL TO 896 | "" 897 | % GREATER-THAN OR EQUAL TO 898 | "" 899 | % MUCH LESS-THAN 900 | "" 901 | % MUCH GREATER-THAN 902 | "" 903 | % CIRCLED PLUS 904 | "" 905 | % CIRCLED MINUS 906 | "" 907 | % CIRCLED TIMES 908 | "" 909 | % CIRCLED DIVISION SLASH 910 | "" 911 | % RIGHT TACK 912 | "" 913 | % LEFT TACK 914 | "" 915 | % ASSERTION 916 | "" 917 | % MODELS 918 | "" 919 | % TRUE 920 | "" 921 | % FORCES 922 | "" 923 | % DOT OPERATOR 924 | 925 | % STAR OPERATOR 926 | 927 | % EQUAL AND PARALLEL TO 928 | 929 | % VERY MUCH LESS-THAN 930 | "" 931 | % VERY MUCH GREATER-THAN 932 | "" 933 | % MIDLINE HORIZONTAL ELLIPSIS 934 | "" 935 | % LEFT-POINTING ANGLE BRACKET 936 | 937 | % RIGHT-POINTING ANGLE BRACKET 938 | 939 | % SYMBOL FOR NULL 940 | "" 941 | % SYMBOL FOR START OF HEADING 942 | "" 943 | % SYMBOL FOR START OF TEXT 944 | "" 945 | % SYMBOL FOR END OF TEXT 946 | "" 947 | % SYMBOL FOR END OF TRANSMISSION 948 | "" 949 | % SYMBOL FOR ENQUIRY 950 | "" 951 | % SYMBOL FOR ACKNOWLEDGE 952 | "" 953 | % SYMBOL FOR BELL 954 | "" 955 | % SYMBOL FOR BACKSPACE 956 | "" 957 | % SYMBOL FOR HORIZONTAL TABULATION 958 | "" 959 | % SYMBOL FOR LINE FEED 960 | "" 961 | % 
SYMBOL FOR VERTICAL TABULATION 962 | "" 963 | % SYMBOL FOR FORM FEED 964 | "" 965 | % SYMBOL FOR CARRIAGE RETURN 966 | "" 967 | % SYMBOL FOR SHIFT OUT 968 | "" 969 | % SYMBOL FOR SHIFT IN 970 | "" 971 | % SYMBOL FOR DATA LINK ESCAPE 972 | "" 973 | % SYMBOL FOR DEVICE CONTROL ONE 974 | "" 975 | % SYMBOL FOR DEVICE CONTROL TWO 976 | "" 977 | % SYMBOL FOR DEVICE CONTROL THREE 978 | "" 979 | % SYMBOL FOR DEVICE CONTROL FOUR 980 | "" 981 | % SYMBOL FOR NEGATIVE ACKNOWLEDGE 982 | "" 983 | % SYMBOL FOR SYNCHRONOUS IDLE 984 | "" 985 | % SYMBOL FOR END OF TRANSMISSION BLOCK 986 | "" 987 | % SYMBOL FOR CANCEL 988 | "" 989 | % SYMBOL FOR END OF MEDIUM 990 | "" 991 | % SYMBOL FOR SUBSTITUTE 992 | "" 993 | % SYMBOL FOR ESCAPE 994 | "" 995 | % SYMBOL FOR FILE SEPARATOR 996 | "" 997 | % SYMBOL FOR GROUP SEPARATOR 998 | "" 999 | % SYMBOL FOR RECORD SEPARATOR 1000 | "" 1001 | % SYMBOL FOR UNIT SEPARATOR 1002 | "" 1003 | % SYMBOL FOR SPACE 1004 | "" 1005 | % SYMBOL FOR DELETE 1006 | "" 1007 | % OPEN BOX 1008 | 1009 | % SYMBOL FOR NEWLINE 1010 | "" 1011 | % SYMBOL FOR DELETE FORM TWO 1012 | "" 1013 | % SYMBOL FOR SUBSTITUTE FORM TWO 1014 | 1015 | % CIRCLED DIGIT ONE 1016 | ""; 1017 | % CIRCLED DIGIT TWO 1018 | ""; 1019 | % CIRCLED DIGIT THREE 1020 | ""; 1021 | % CIRCLED DIGIT FOUR 1022 | ""; 1023 | % CIRCLED DIGIT FIVE 1024 | ""; 1025 | % CIRCLED DIGIT SIX 1026 | ""; 1027 | % CIRCLED DIGIT SEVEN 1028 | ""; 1029 | % CIRCLED DIGIT EIGHT 1030 | ""; 1031 | % CIRCLED DIGIT NINE 1032 | ""; 1033 | % CIRCLED NUMBER TEN 1034 | "" 1035 | % CIRCLED NUMBER ELEVEN 1036 | "" 1037 | % CIRCLED NUMBER TWELVE 1038 | "" 1039 | % CIRCLED NUMBER THIRTEEN 1040 | "" 1041 | % CIRCLED NUMBER FOURTEEN 1042 | "" 1043 | % CIRCLED NUMBER FIFTEEN 1044 | "" 1045 | % CIRCLED NUMBER SIXTEEN 1046 | "" 1047 | % CIRCLED NUMBER SEVENTEEN 1048 | "" 1049 | % CIRCLED NUMBER EIGHTEEN 1050 | "" 1051 | % CIRCLED NUMBER NINETEEN 1052 | "" 1053 | % CIRCLED NUMBER TWENTY 1054 | "" 1055 | % PARENTHESIZED DIGIT ONE 1056 | ""; 1057 
| % PARENTHESIZED DIGIT TWO 1058 | ""; 1059 | % PARENTHESIZED DIGIT THREE 1060 | ""; 1061 | % PARENTHESIZED DIGIT FOUR 1062 | ""; 1063 | % PARENTHESIZED DIGIT FIVE 1064 | ""; 1065 | % PARENTHESIZED DIGIT SIX 1066 | ""; 1067 | % PARENTHESIZED DIGIT SEVEN 1068 | ""; 1069 | % PARENTHESIZED DIGIT EIGHT 1070 | ""; 1071 | % PARENTHESIZED DIGIT NINE 1072 | ""; 1073 | % PARENTHESIZED NUMBER TEN 1074 | "" 1075 | % PARENTHESIZED NUMBER ELEVEN 1076 | "" 1077 | % PARENTHESIZED NUMBER TWELVE 1078 | "" 1079 | % PARENTHESIZED NUMBER THIRTEEN 1080 | "" 1081 | % PARENTHESIZED NUMBER FOURTEEN 1082 | "" 1083 | % PARENTHESIZED NUMBER FIFTEEN 1084 | "" 1085 | % PARENTHESIZED NUMBER SIXTEEN 1086 | "" 1087 | % PARENTHESIZED NUMBER SEVENTEEN 1088 | "" 1089 | % PARENTHESIZED NUMBER EIGHTEEN 1090 | "" 1091 | % PARENTHESIZED NUMBER NINETEEN 1092 | "" 1093 | % PARENTHESIZED NUMBER TWENTY 1094 | "" 1095 | % DIGIT ONE FULL STOP 1096 | ""; 1097 | % DIGIT TWO FULL STOP 1098 | ""; 1099 | % DIGIT THREE FULL STOP 1100 | ""; 1101 | % DIGIT FOUR FULL STOP 1102 | ""; 1103 | % DIGIT FIVE FULL STOP 1104 | ""; 1105 | % DIGIT SIX FULL STOP 1106 | ""; 1107 | % DIGIT SEVEN FULL STOP 1108 | ""; 1109 | % DIGIT EIGHT FULL STOP 1110 | ""; 1111 | % DIGIT NINE FULL STOP 1112 | ""; 1113 | % NUMBER TEN FULL STOP 1114 | "" 1115 | % NUMBER ELEVEN FULL STOP 1116 | "" 1117 | % NUMBER TWELVE FULL STOP 1118 | "" 1119 | % NUMBER THIRTEEN FULL STOP 1120 | "" 1121 | % NUMBER FOURTEEN FULL STOP 1122 | "" 1123 | % NUMBER FIFTEEN FULL STOP 1124 | "" 1125 | % NUMBER SIXTEEN FULL STOP 1126 | "" 1127 | % NUMBER SEVENTEEN FULL STOP 1128 | "" 1129 | % NUMBER EIGHTEEN FULL STOP 1130 | "" 1131 | % NUMBER NINETEEN FULL STOP 1132 | "" 1133 | % NUMBER TWENTY FULL STOP 1134 | "" 1135 | % PARENTHESIZED LATIN SMALL LETTER A 1136 | ""; 1137 | % PARENTHESIZED LATIN SMALL LETTER B 1138 | ""; 1139 | % PARENTHESIZED LATIN SMALL LETTER C 1140 | ""; 1141 | % PARENTHESIZED LATIN SMALL LETTER D 1142 | ""; 1143 | % PARENTHESIZED LATIN SMALL LETTER E 
1144 | ""; 1145 | % PARENTHESIZED LATIN SMALL LETTER F 1146 | ""; 1147 | % PARENTHESIZED LATIN SMALL LETTER G 1148 | ""; 1149 | % PARENTHESIZED LATIN SMALL LETTER H 1150 | ""; 1151 | % PARENTHESIZED LATIN SMALL LETTER I 1152 | ""; 1153 | % PARENTHESIZED LATIN SMALL LETTER J 1154 | ""; 1155 | % PARENTHESIZED LATIN SMALL LETTER K 1156 | ""; 1157 | % PARENTHESIZED LATIN SMALL LETTER L 1158 | ""; 1159 | % PARENTHESIZED LATIN SMALL LETTER M 1160 | ""; 1161 | % PARENTHESIZED LATIN SMALL LETTER N 1162 | ""; 1163 | % PARENTHESIZED LATIN SMALL LETTER O 1164 | ""; 1165 | % PARENTHESIZED LATIN SMALL LETTER P 1166 | ""; 1167 | % PARENTHESIZED LATIN SMALL LETTER Q 1168 | ""; 1169 | % PARENTHESIZED LATIN SMALL LETTER R 1170 | ""; 1171 | % PARENTHESIZED LATIN SMALL LETTER S 1172 | ""; 1173 | % PARENTHESIZED LATIN SMALL LETTER T 1174 | ""; 1175 | % PARENTHESIZED LATIN SMALL LETTER U 1176 | ""; 1177 | % PARENTHESIZED LATIN SMALL LETTER V 1178 | ""; 1179 | % PARENTHESIZED LATIN SMALL LETTER W 1180 | ""; 1181 | % PARENTHESIZED LATIN SMALL LETTER X 1182 | ""; 1183 | % PARENTHESIZED LATIN SMALL LETTER Y 1184 | ""; 1185 | % PARENTHESIZED LATIN SMALL LETTER Z 1186 | ""; 1187 | % CIRCLED LATIN CAPITAL LETTER A 1188 | ""; 1189 | % CIRCLED LATIN CAPITAL LETTER B 1190 | ""; 1191 | % CIRCLED LATIN CAPITAL LETTER C 1192 | ""; 1193 | % CIRCLED LATIN CAPITAL LETTER D 1194 | ""; 1195 | % CIRCLED LATIN CAPITAL LETTER E 1196 | ""; 1197 | % CIRCLED LATIN CAPITAL LETTER F 1198 | ""; 1199 | % CIRCLED LATIN CAPITAL LETTER G 1200 | ""; 1201 | % CIRCLED LATIN CAPITAL LETTER H 1202 | ""; 1203 | % CIRCLED LATIN CAPITAL LETTER I 1204 | ""; 1205 | % CIRCLED LATIN CAPITAL LETTER J 1206 | ""; 1207 | % CIRCLED LATIN CAPITAL LETTER K 1208 | ""; 1209 | % CIRCLED LATIN CAPITAL LETTER L 1210 | ""; 1211 | % CIRCLED LATIN CAPITAL LETTER M 1212 | ""; 1213 | % CIRCLED LATIN CAPITAL LETTER N 1214 | ""; 1215 | % CIRCLED LATIN CAPITAL LETTER O 1216 | ""; 1217 | % CIRCLED LATIN CAPITAL LETTER P 1218 | ""; 1219 | % CIRCLED 
LATIN CAPITAL LETTER Q 1220 | ""; 1221 | % CIRCLED LATIN CAPITAL LETTER R 1222 | ""; 1223 | % CIRCLED LATIN CAPITAL LETTER S 1224 | ""; 1225 | % CIRCLED LATIN CAPITAL LETTER T 1226 | ""; 1227 | % CIRCLED LATIN CAPITAL LETTER U 1228 | ""; 1229 | % CIRCLED LATIN CAPITAL LETTER V 1230 | ""; 1231 | % CIRCLED LATIN CAPITAL LETTER W 1232 | ""; 1233 | % CIRCLED LATIN CAPITAL LETTER X 1234 | ""; 1235 | % CIRCLED LATIN CAPITAL LETTER Y 1236 | ""; 1237 | % CIRCLED LATIN CAPITAL LETTER Z 1238 | ""; 1239 | % CIRCLED LATIN SMALL LETTER A 1240 | ""; 1241 | % CIRCLED LATIN SMALL LETTER B 1242 | ""; 1243 | % CIRCLED LATIN SMALL LETTER C 1244 | ""; 1245 | % CIRCLED LATIN SMALL LETTER D 1246 | ""; 1247 | % CIRCLED LATIN SMALL LETTER E 1248 | ""; 1249 | % CIRCLED LATIN SMALL LETTER F 1250 | ""; 1251 | % CIRCLED LATIN SMALL LETTER G 1252 | ""; 1253 | % CIRCLED LATIN SMALL LETTER H 1254 | ""; 1255 | % CIRCLED LATIN SMALL LETTER I 1256 | ""; 1257 | % CIRCLED LATIN SMALL LETTER J 1258 | ""; 1259 | % CIRCLED LATIN SMALL LETTER K 1260 | ""; 1261 | % CIRCLED LATIN SMALL LETTER L 1262 | ""; 1263 | % CIRCLED LATIN SMALL LETTER M 1264 | ""; 1265 | % CIRCLED LATIN SMALL LETTER N 1266 | ""; 1267 | % CIRCLED LATIN SMALL LETTER O 1268 | ""; 1269 | % CIRCLED LATIN SMALL LETTER P 1270 | ""; 1271 | % CIRCLED LATIN SMALL LETTER Q 1272 | ""; 1273 | % CIRCLED LATIN SMALL LETTER R 1274 | ""; 1275 | % CIRCLED LATIN SMALL LETTER S 1276 | ""; 1277 | % CIRCLED LATIN SMALL LETTER T 1278 | ""; 1279 | % CIRCLED LATIN SMALL LETTER U 1280 | ""; 1281 | % CIRCLED LATIN SMALL LETTER V 1282 | ""; 1283 | % CIRCLED LATIN SMALL LETTER W 1284 | ""; 1285 | % CIRCLED LATIN SMALL LETTER X 1286 | ""; 1287 | % CIRCLED LATIN SMALL LETTER Y 1288 | ""; 1289 | % CIRCLED LATIN SMALL LETTER Z 1290 | ""; 1291 | % CIRCLED DIGIT ZERO 1292 | ""; 1293 | % BOX DRAWINGS LIGHT HORIZONTAL 1294 | 1295 | % BOX DRAWINGS HEAVY HORIZONTAL 1296 | 1297 | % BOX DRAWINGS LIGHT VERTICAL 1298 | 1299 | % BOX DRAWINGS HEAVY VERTICAL 1300 | 1301 | % BOX 
DRAWINGS LIGHT TRIPLE DASH HORIZONTAL 1302 | 1303 | % BOX DRAWINGS HEAVY TRIPLE DASH HORIZONTAL 1304 | 1305 | % BOX DRAWINGS LIGHT TRIPLE DASH VERTICAL 1306 | 1307 | % BOX DRAWINGS HEAVY TRIPLE DASH VERTICAL 1308 | 1309 | % BOX DRAWINGS LIGHT QUADRUPLE DASH HORIZONTAL 1310 | 1311 | % BOX DRAWINGS HEAVY QUADRUPLE DASH HORIZONTAL 1312 | 1313 | % BOX DRAWINGS LIGHT QUADRUPLE DASH VERTICAL 1314 | 1315 | % BOX DRAWINGS HEAVY QUADRUPLE DASH VERTICAL 1316 | 1317 | % BOX DRAWINGS LIGHT DOWN AND RIGHT 1318 | 1319 | % BOX DRAWINGS DOWN LIGHT AND RIGHT HEAVY 1320 | 1321 | % BOX DRAWINGS DOWN HEAVY AND RIGHT LIGHT 1322 | 1323 | % BOX DRAWINGS HEAVY DOWN AND RIGHT 1324 | 1325 | % BOX DRAWINGS LIGHT DOWN AND LEFT 1326 | 1327 | % BOX DRAWINGS DOWN LIGHT AND LEFT HEAVY 1328 | 1329 | % BOX DRAWINGS DOWN HEAVY AND LEFT LIGHT 1330 | 1331 | % BOX DRAWINGS HEAVY DOWN AND LEFT 1332 | 1333 | % BOX DRAWINGS LIGHT UP AND RIGHT 1334 | 1335 | % BOX DRAWINGS UP LIGHT AND RIGHT HEAVY 1336 | 1337 | % BOX DRAWINGS UP HEAVY AND RIGHT LIGHT 1338 | 1339 | % BOX DRAWINGS HEAVY UP AND RIGHT 1340 | 1341 | % BOX DRAWINGS LIGHT UP AND LEFT 1342 | 1343 | % BOX DRAWINGS UP LIGHT AND LEFT HEAVY 1344 | 1345 | % BOX DRAWINGS UP HEAVY AND LEFT LIGHT 1346 | 1347 | % BOX DRAWINGS HEAVY UP AND LEFT 1348 | 1349 | % BOX DRAWINGS LIGHT VERTICAL AND RIGHT 1350 | 1351 | % BOX DRAWINGS VERTICAL LIGHT AND RIGHT HEAVY 1352 | 1353 | % BOX DRAWINGS UP HEAVY AND RIGHT DOWN LIGHT 1354 | 1355 | % BOX DRAWINGS DOWN HEAVY AND RIGHT UP LIGHT 1356 | 1357 | % BOX DRAWINGS VERTICAL HEAVY AND RIGHT LIGHT 1358 | 1359 | % BOX DRAWINGS DOWN LIGHT AND RIGHT UP HEAVY 1360 | 1361 | % BOX DRAWINGS UP LIGHT AND RIGHT DOWN HEAVY 1362 | 1363 | % BOX DRAWINGS HEAVY VERTICAL AND RIGHT 1364 | 1365 | % BOX DRAWINGS LIGHT VERTICAL AND LEFT 1366 | 1367 | % BOX DRAWINGS VERTICAL LIGHT AND LEFT HEAVY 1368 | 1369 | % BOX DRAWINGS UP HEAVY AND LEFT DOWN LIGHT 1370 | 1371 | % BOX DRAWINGS DOWN HEAVY AND LEFT UP LIGHT 1372 | 1373 | % BOX DRAWINGS 
VERTICAL HEAVY AND LEFT LIGHT 1374 | 1375 | % BOX DRAWINGS DOWN LIGHT AND LEFT UP HEAVY 1376 | 1377 | % BOX DRAWINGS UP LIGHT AND LEFT DOWN HEAVY 1378 | 1379 | % BOX DRAWINGS HEAVY VERTICAL AND LEFT 1380 | 1381 | % BOX DRAWINGS LIGHT DOWN AND HORIZONTAL 1382 | 1383 | % BOX DRAWINGS LEFT HEAVY AND RIGHT DOWN LIGHT 1384 | 1385 | % BOX DRAWINGS RIGHT HEAVY AND LEFT DOWN LIGHT 1386 | 1387 | % BOX DRAWINGS DOWN LIGHT AND HORIZONTAL HEAVY 1388 | 1389 | % BOX DRAWINGS DOWN HEAVY AND HORIZONTAL LIGHT 1390 | 1391 | % BOX DRAWINGS RIGHT LIGHT AND LEFT DOWN HEAVY 1392 | 1393 | % BOX DRAWINGS LEFT LIGHT AND RIGHT DOWN HEAVY 1394 | 1395 | % BOX DRAWINGS HEAVY DOWN AND HORIZONTAL 1396 | 1397 | % BOX DRAWINGS LIGHT UP AND HORIZONTAL 1398 | 1399 | % BOX DRAWINGS LEFT HEAVY AND RIGHT UP LIGHT 1400 | 1401 | % BOX DRAWINGS RIGHT HEAVY AND LEFT UP LIGHT 1402 | 1403 | % BOX DRAWINGS UP LIGHT AND HORIZONTAL HEAVY 1404 | 1405 | % BOX DRAWINGS UP HEAVY AND HORIZONTAL LIGHT 1406 | 1407 | % BOX DRAWINGS RIGHT LIGHT AND LEFT UP HEAVY 1408 | 1409 | % BOX DRAWINGS LEFT LIGHT AND RIGHT UP HEAVY 1410 | 1411 | % BOX DRAWINGS HEAVY UP AND HORIZONTAL 1412 | 1413 | % BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL 1414 | 1415 | % BOX DRAWINGS LEFT HEAVY AND RIGHT VERTICAL LIGHT 1416 | 1417 | % BOX DRAWINGS RIGHT HEAVY AND LEFT VERTICAL LIGHT 1418 | 1419 | % BOX DRAWINGS VERTICAL LIGHT AND HORIZONTAL HEAVY 1420 | 1421 | % BOX DRAWINGS UP HEAVY AND DOWN HORIZONTAL LIGHT 1422 | 1423 | % BOX DRAWINGS DOWN HEAVY AND UP HORIZONTAL LIGHT 1424 | 1425 | % BOX DRAWINGS VERTICAL HEAVY AND HORIZONTAL LIGHT 1426 | 1427 | % BOX DRAWINGS LEFT UP HEAVY AND RIGHT DOWN LIGHT 1428 | 1429 | % BOX DRAWINGS RIGHT UP HEAVY AND LEFT DOWN LIGHT 1430 | 1431 | % BOX DRAWINGS LEFT DOWN HEAVY AND RIGHT UP LIGHT 1432 | 1433 | % BOX DRAWINGS RIGHT DOWN HEAVY AND LEFT UP LIGHT 1434 | 1435 | % BOX DRAWINGS DOWN LIGHT AND UP HORIZONTAL HEAVY 1436 | 1437 | % BOX DRAWINGS UP LIGHT AND DOWN HORIZONTAL HEAVY 1438 | 1439 | % BOX DRAWINGS 
RIGHT LIGHT AND LEFT VERTICAL HEAVY 1440 | 1441 | % BOX DRAWINGS LEFT LIGHT AND RIGHT VERTICAL HEAVY 1442 | 1443 | % BOX DRAWINGS HEAVY VERTICAL AND HORIZONTAL 1444 | 1445 | % BOX DRAWINGS LIGHT DOUBLE DASH HORIZONTAL 1446 | 1447 | % BOX DRAWINGS HEAVY DOUBLE DASH HORIZONTAL 1448 | 1449 | % BOX DRAWINGS LIGHT DOUBLE DASH VERTICAL 1450 | 1451 | % BOX DRAWINGS HEAVY DOUBLE DASH VERTICAL 1452 | 1453 | % BOX DRAWINGS DOUBLE HORIZONTAL 1454 | 1455 | % BOX DRAWINGS DOUBLE VERTICAL 1456 | 1457 | % BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE 1458 | 1459 | % BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE 1460 | 1461 | % BOX DRAWINGS DOUBLE DOWN AND RIGHT 1462 | 1463 | % BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE 1464 | 1465 | % BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE 1466 | 1467 | % BOX DRAWINGS DOUBLE DOWN AND LEFT 1468 | 1469 | % BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE 1470 | 1471 | % BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE 1472 | 1473 | % BOX DRAWINGS DOUBLE UP AND RIGHT 1474 | 1475 | % BOX DRAWINGS UP SINGLE AND LEFT DOUBLE 1476 | 1477 | % BOX DRAWINGS UP DOUBLE AND LEFT SINGLE 1478 | 1479 | % BOX DRAWINGS DOUBLE UP AND LEFT 1480 | 1481 | % BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE 1482 | 1483 | % BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE 1484 | 1485 | % BOX DRAWINGS DOUBLE VERTICAL AND RIGHT 1486 | 1487 | % BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE 1488 | 1489 | % BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE 1490 | 1491 | % BOX DRAWINGS DOUBLE VERTICAL AND LEFT 1492 | 1493 | % BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE 1494 | 1495 | % BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE 1496 | 1497 | % BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL 1498 | 1499 | % BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE 1500 | 1501 | % BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE 1502 | 1503 | % BOX DRAWINGS DOUBLE UP AND HORIZONTAL 1504 | 1505 | % BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE 1506 | 1507 | % BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE 1508 | 1509 | % BOX 
DRAWINGS DOUBLE VERTICAL AND HORIZONTAL 1510 | 1511 | % BOX DRAWINGS LIGHT ARC DOWN AND RIGHT 1512 | 1513 | % BOX DRAWINGS LIGHT ARC DOWN AND LEFT 1514 | 1515 | % BOX DRAWINGS LIGHT ARC UP AND LEFT 1516 | 1517 | % BOX DRAWINGS LIGHT ARC UP AND RIGHT 1518 | 1519 | % BOX DRAWINGS LIGHT DIAGONAL UPPER RIGHT TO LOWER LEFT 1520 | 1521 | % BOX DRAWINGS LIGHT DIAGONAL UPPER LEFT TO LOWER RIGHT 1522 | 1523 | % BOX DRAWINGS LIGHT DIAGONAL CROSS 1524 | 1525 | % BOX DRAWINGS LIGHT LEFT AND HEAVY RIGHT 1526 | 1527 | % BOX DRAWINGS LIGHT UP AND HEAVY DOWN 1528 | 1529 | % BOX DRAWINGS HEAVY LEFT AND LIGHT RIGHT 1530 | 1531 | % BOX DRAWINGS HEAVY UP AND LIGHT DOWN 1532 | 1533 | % WHITE CIRCLE 1534 | 1535 | % WHITE BULLET 1536 | 1537 | % BLACK STAR 1538 | 1539 | % WHITE STAR 1540 | 1541 | % BALLOT BOX WITH X 1542 | 1543 | % SALTIRE 1544 | 1545 | % WHITE FROWNING FACE 1546 | "" 1547 | % WHITE SMILING FACE 1548 | "" 1549 | % BLACK SMILING FACE 1550 | "" 1551 | % MUSIC FLAT SIGN 1552 | 1553 | % MUSIC SHARP SIGN 1554 | 1555 | % UPPER BLADE SCISSORS 1556 | "" 1557 | % BLACK SCISSORS 1558 | "" 1559 | % LOWER BLADE SCISSORS 1560 | "" 1561 | % WHITE SCISSORS 1562 | "" 1563 | % VICTORY HAND 1564 | 1565 | % CHECK MARK 1566 | 1567 | % HEAVY CHECK MARK 1568 | 1569 | % MULTIPLICATION X 1570 | 1571 | % HEAVY MULTIPLICATION X 1572 | 1573 | % BALLOT X 1574 | 1575 | % HEAVY BALLOT X 1576 | 1577 | % OUTLINED GREEK CROSS 1578 | 1579 | % HEAVY GREEK CROSS 1580 | 1581 | % OPEN CENTRE CROSS 1582 | 1583 | % HEAVY OPEN CENTRE CROSS 1584 | 1585 | % LATIN CROSS 1586 | 1587 | % SHADOWED WHITE LATIN CROSS 1588 | 1589 | % OUTLINED LATIN CROSS 1590 | 1591 | % MALTESE CROSS 1592 | 1593 | % STAR OF DAVID 1594 | 1595 | % FOUR TEARDROP-SPOKED ASTERISK 1596 | 1597 | % FOUR BALLOON-SPOKED ASTERISK 1598 | 1599 | % HEAVY FOUR BALLOON-SPOKED ASTERISK 1600 | 1601 | % FOUR CLUB-SPOKED ASTERISK 1602 | 1603 | % BLACK FOUR POINTED STAR 1604 | 1605 | % WHITE FOUR POINTED STAR 1606 | 1607 | % STRESS OUTLINED WHITE STAR 1608 | 
1609 | % CIRCLED WHITE STAR 1610 | 1611 | % OPEN CENTRE BLACK STAR 1612 | 1613 | % BLACK CENTRE WHITE STAR 1614 | 1615 | % OUTLINED BLACK STAR 1616 | 1617 | % HEAVY OUTLINED BLACK STAR 1618 | 1619 | % PINWHEEL STAR 1620 | 1621 | % SHADOWED WHITE STAR 1622 | 1623 | % HEAVY ASTERISK 1624 | 1625 | % OPEN CENTRE ASTERISK 1626 | 1627 | % EIGHT SPOKED ASTERISK 1628 | 1629 | % EIGHT POINTED BLACK STAR 1630 | 1631 | % EIGHT POINTED PINWHEEL STAR 1632 | 1633 | % SIX POINTED BLACK STAR 1634 | 1635 | % EIGHT POINTED RECTILINEAR BLACK STAR 1636 | 1637 | % HEAVY EIGHT POINTED RECTILINEAR BLACK STAR 1638 | 1639 | % TWELVE POINTED BLACK STAR 1640 | 1641 | % SIXTEEN POINTED ASTERISK 1642 | 1643 | % TEARDROP-SPOKED ASTERISK 1644 | 1645 | % OPEN CENTRE TEARDROP-SPOKED ASTERISK 1646 | 1647 | % HEAVY TEARDROP-SPOKED ASTERISK 1648 | 1649 | % SIX PETALLED BLACK AND WHITE FLORETTE 1650 | 1651 | % BLACK FLORETTE 1652 | 1653 | % WHITE FLORETTE 1654 | 1655 | % EIGHT PETALLED OUTLINED BLACK FLORETTE 1656 | 1657 | % CIRCLED OPEN CENTRE EIGHT POINTED STAR 1658 | 1659 | % HEAVY TEARDROP-SPOKED PINWHEEL ASTERISK 1660 | 1661 | % SNOWFLAKE 1662 | 1663 | % TIGHT TRIFOLIATE SNOWFLAKE 1664 | 1665 | % HEAVY CHEVRON SNOWFLAKE 1666 | 1667 | % SPARKLE 1668 | 1669 | % HEAVY SPARKLE 1670 | 1671 | % BALLOON-SPOKED ASTERISK 1672 | 1673 | % EIGHT TEARDROP-SPOKED PROPELLER ASTERISK 1674 | 1675 | % HEAVY EIGHT TEARDROP-SPOKED PROPELLER ASTERISK 1676 | 1677 | % LATIN SMALL LIGATURE FF 1678 | "" 1679 | % LATIN SMALL LIGATURE FI 1680 | "" 1681 | % LATIN SMALL LIGATURE FL 1682 | "" 1683 | % LATIN SMALL LIGATURE FFI 1684 | "" 1685 | % LATIN SMALL LIGATURE FFL 1686 | "" 1687 | % LATIN SMALL LIGATURE LONG S T 1688 | "";"" 1689 | % LATIN SMALL LIGATURE ST 1690 | "" 1691 | % ZERO WIDTH NO-BREAK SPACE 1692 | "" 1693 | % REPLACEMENT CHARACTER 1694 | 1695 | --------------------------------------------------------------------------------