├── LICENSE ├── README.md └── myparser.py /LICENSE: -------------------------------------------------------------------------------- 1 | GNU LESSER GENERAL PUBLIC LICENSE 2 | Version 2.1, February 1999 3 | 4 | Copyright (C) 1991, 1999 Free Software Foundation, Inc. 5 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA 6 | Everyone is permitted to copy and distribute verbatim copies 7 | of this license document, but changing it is not allowed. 8 | 9 | (This is the first released version of the Lesser GPL. It also counts 10 | as the successor of the GNU Library Public License, version 2, hence 11 | the version number 2.1.) 12 | 13 | Preamble 14 | 15 | The licenses for most software are designed to take away your 16 | freedom to share and change it. By contrast, the GNU General Public 17 | Licenses are intended to guarantee your freedom to share and change 18 | free software--to make sure the software is free for all its users. 19 | 20 | This license, the Lesser General Public License, applies to some 21 | specially designated software packages--typically libraries--of the 22 | Free Software Foundation and other authors who decide to use it. You 23 | can use it too, but we suggest you first think carefully about whether 24 | this license or the ordinary General Public License is the better 25 | strategy to use in any particular case, based on the explanations below. 26 | 27 | When we speak of free software, we are referring to freedom of use, 28 | not price. Our General Public Licenses are designed to make sure that 29 | you have the freedom to distribute copies of free software (and charge 30 | for this service if you wish); that you receive source code or can get 31 | it if you want it; that you can change the software and use pieces of 32 | it in new free programs; and that you are informed that you can do 33 | these things. 
34 | 35 | To protect your rights, we need to make restrictions that forbid 36 | distributors to deny you these rights or to ask you to surrender these 37 | rights. These restrictions translate to certain responsibilities for 38 | you if you distribute copies of the library or if you modify it. 39 | 40 | For example, if you distribute copies of the library, whether gratis 41 | or for a fee, you must give the recipients all the rights that we gave 42 | you. You must make sure that they, too, receive or can get the source 43 | code. If you link other code with the library, you must provide 44 | complete object files to the recipients, so that they can relink them 45 | with the library after making changes to the library and recompiling 46 | it. And you must show them these terms so they know their rights. 47 | 48 | We protect your rights with a two-step method: (1) we copyright the 49 | library, and (2) we offer you this license, which gives you legal 50 | permission to copy, distribute and/or modify the library. 51 | 52 | To protect each distributor, we want to make it very clear that 53 | there is no warranty for the free library. Also, if the library is 54 | modified by someone else and passed on, the recipients should know 55 | that what they have is not the original version, so that the original 56 | author's reputation will not be affected by problems that might be 57 | introduced by others. 58 | 59 | Finally, software patents pose a constant threat to the existence of 60 | any free program. We wish to make sure that a company cannot 61 | effectively restrict the users of a free program by obtaining a 62 | restrictive license from a patent holder. Therefore, we insist that 63 | any patent license obtained for a version of the library must be 64 | consistent with the full freedom of use specified in this license. 65 | 66 | Most GNU software, including some libraries, is covered by the 67 | ordinary GNU General Public License. 
This license, the GNU Lesser 68 | General Public License, applies to certain designated libraries, and 69 | is quite different from the ordinary General Public License. We use 70 | this license for certain libraries in order to permit linking those 71 | libraries into non-free programs. 72 | 73 | When a program is linked with a library, whether statically or using 74 | a shared library, the combination of the two is legally speaking a 75 | combined work, a derivative of the original library. The ordinary 76 | General Public License therefore permits such linking only if the 77 | entire combination fits its criteria of freedom. The Lesser General 78 | Public License permits more lax criteria for linking other code with 79 | the library. 80 | 81 | We call this license the "Lesser" General Public License because it 82 | does Less to protect the user's freedom than the ordinary General 83 | Public License. It also provides other free software developers Less 84 | of an advantage over competing non-free programs. These disadvantages 85 | are the reason we use the ordinary General Public License for many 86 | libraries. However, the Lesser license provides advantages in certain 87 | special circumstances. 88 | 89 | For example, on rare occasions, there may be a special need to 90 | encourage the widest possible use of a certain library, so that it becomes 91 | a de-facto standard. To achieve this, non-free programs must be 92 | allowed to use the library. A more frequent case is that a free 93 | library does the same job as widely used non-free libraries. In this 94 | case, there is little to gain by limiting the free library to free 95 | software only, so we use the Lesser General Public License. 96 | 97 | In other cases, permission to use a particular library in non-free 98 | programs enables a greater number of people to use a large body of 99 | free software. 
For example, permission to use the GNU C Library in 100 | non-free programs enables many more people to use the whole GNU 101 | operating system, as well as its variant, the GNU/Linux operating 102 | system. 103 | 104 | Although the Lesser General Public License is Less protective of the 105 | users' freedom, it does ensure that the user of a program that is 106 | linked with the Library has the freedom and the wherewithal to run 107 | that program using a modified version of the Library. 108 | 109 | The precise terms and conditions for copying, distribution and 110 | modification follow. Pay close attention to the difference between a 111 | "work based on the library" and a "work that uses the library". The 112 | former contains code derived from the library, whereas the latter must 113 | be combined with the library in order to run. 114 | 115 | GNU LESSER GENERAL PUBLIC LICENSE 116 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 117 | 118 | 0. This License Agreement applies to any software library or other 119 | program which contains a notice placed by the copyright holder or 120 | other authorized party saying it may be distributed under the terms of 121 | this Lesser General Public License (also called "this License"). 122 | Each licensee is addressed as "you". 123 | 124 | A "library" means a collection of software functions and/or data 125 | prepared so as to be conveniently linked with application programs 126 | (which use some of those functions and data) to form executables. 127 | 128 | The "Library", below, refers to any such software library or work 129 | which has been distributed under these terms. A "work based on the 130 | Library" means either the Library or any derivative work under 131 | copyright law: that is to say, a work containing the Library or a 132 | portion of it, either verbatim or with modifications and/or translated 133 | straightforwardly into another language. 
(Hereinafter, translation is 134 | included without limitation in the term "modification".) 135 | 136 | "Source code" for a work means the preferred form of the work for 137 | making modifications to it. For a library, complete source code means 138 | all the source code for all modules it contains, plus any associated 139 | interface definition files, plus the scripts used to control compilation 140 | and installation of the library. 141 | 142 | Activities other than copying, distribution and modification are not 143 | covered by this License; they are outside its scope. The act of 144 | running a program using the Library is not restricted, and output from 145 | such a program is covered only if its contents constitute a work based 146 | on the Library (independent of the use of the Library in a tool for 147 | writing it). Whether that is true depends on what the Library does 148 | and what the program that uses the Library does. 149 | 150 | 1. You may copy and distribute verbatim copies of the Library's 151 | complete source code as you receive it, in any medium, provided that 152 | you conspicuously and appropriately publish on each copy an 153 | appropriate copyright notice and disclaimer of warranty; keep intact 154 | all the notices that refer to this License and to the absence of any 155 | warranty; and distribute a copy of this License along with the 156 | Library. 157 | 158 | You may charge a fee for the physical act of transferring a copy, 159 | and you may at your option offer warranty protection in exchange for a 160 | fee. 161 | 162 | 2. You may modify your copy or copies of the Library or any portion 163 | of it, thus forming a work based on the Library, and copy and 164 | distribute such modifications or work under the terms of Section 1 165 | above, provided that you also meet all of these conditions: 166 | 167 | a) The modified work must itself be a software library. 
168 | 169 | b) You must cause the files modified to carry prominent notices 170 | stating that you changed the files and the date of any change. 171 | 172 | c) You must cause the whole of the work to be licensed at no 173 | charge to all third parties under the terms of this License. 174 | 175 | d) If a facility in the modified Library refers to a function or a 176 | table of data to be supplied by an application program that uses 177 | the facility, other than as an argument passed when the facility 178 | is invoked, then you must make a good faith effort to ensure that, 179 | in the event an application does not supply such function or 180 | table, the facility still operates, and performs whatever part of 181 | its purpose remains meaningful. 182 | 183 | (For example, a function in a library to compute square roots has 184 | a purpose that is entirely well-defined independent of the 185 | application. Therefore, Subsection 2d requires that any 186 | application-supplied function or table used by this function must 187 | be optional: if the application does not supply it, the square 188 | root function must still compute square roots.) 189 | 190 | These requirements apply to the modified work as a whole. If 191 | identifiable sections of that work are not derived from the Library, 192 | and can be reasonably considered independent and separate works in 193 | themselves, then this License, and its terms, do not apply to those 194 | sections when you distribute them as separate works. But when you 195 | distribute the same sections as part of a whole which is a work based 196 | on the Library, the distribution of the whole must be on the terms of 197 | this License, whose permissions for other licensees extend to the 198 | entire whole, and thus to each and every part regardless of who wrote 199 | it. 
200 | 201 | Thus, it is not the intent of this section to claim rights or contest 202 | your rights to work written entirely by you; rather, the intent is to 203 | exercise the right to control the distribution of derivative or 204 | collective works based on the Library. 205 | 206 | In addition, mere aggregation of another work not based on the Library 207 | with the Library (or with a work based on the Library) on a volume of 208 | a storage or distribution medium does not bring the other work under 209 | the scope of this License. 210 | 211 | 3. You may opt to apply the terms of the ordinary GNU General Public 212 | License instead of this License to a given copy of the Library. To do 213 | this, you must alter all the notices that refer to this License, so 214 | that they refer to the ordinary GNU General Public License, version 2, 215 | instead of to this License. (If a newer version than version 2 of the 216 | ordinary GNU General Public License has appeared, then you can specify 217 | that version instead if you wish.) Do not make any other change in 218 | these notices. 219 | 220 | Once this change is made in a given copy, it is irreversible for 221 | that copy, so the ordinary GNU General Public License applies to all 222 | subsequent copies and derivative works made from that copy. 223 | 224 | This option is useful when you wish to copy part of the code of 225 | the Library into a program that is not a library. 226 | 227 | 4. You may copy and distribute the Library (or a portion or 228 | derivative of it, under Section 2) in object code or executable form 229 | under the terms of Sections 1 and 2 above provided that you accompany 230 | it with the complete corresponding machine-readable source code, which 231 | must be distributed under the terms of Sections 1 and 2 above on a 232 | medium customarily used for software interchange. 
233 | 234 | If distribution of object code is made by offering access to copy 235 | from a designated place, then offering equivalent access to copy the 236 | source code from the same place satisfies the requirement to 237 | distribute the source code, even though third parties are not 238 | compelled to copy the source along with the object code. 239 | 240 | 5. A program that contains no derivative of any portion of the 241 | Library, but is designed to work with the Library by being compiled or 242 | linked with it, is called a "work that uses the Library". Such a 243 | work, in isolation, is not a derivative work of the Library, and 244 | therefore falls outside the scope of this License. 245 | 246 | However, linking a "work that uses the Library" with the Library 247 | creates an executable that is a derivative of the Library (because it 248 | contains portions of the Library), rather than a "work that uses the 249 | library". The executable is therefore covered by this License. 250 | Section 6 states terms for distribution of such executables. 251 | 252 | When a "work that uses the Library" uses material from a header file 253 | that is part of the Library, the object code for the work may be a 254 | derivative work of the Library even though the source code is not. 255 | Whether this is true is especially significant if the work can be 256 | linked without the Library, or if the work is itself a library. The 257 | threshold for this to be true is not precisely defined by law. 258 | 259 | If such an object file uses only numerical parameters, data 260 | structure layouts and accessors, and small macros and small inline 261 | functions (ten lines or less in length), then the use of the object 262 | file is unrestricted, regardless of whether it is legally a derivative 263 | work. (Executables containing this object code plus portions of the 264 | Library will still fall under Section 6.) 
265 | 266 | Otherwise, if the work is a derivative of the Library, you may 267 | distribute the object code for the work under the terms of Section 6. 268 | Any executables containing that work also fall under Section 6, 269 | whether or not they are linked directly with the Library itself. 270 | 271 | 6. As an exception to the Sections above, you may also combine or 272 | link a "work that uses the Library" with the Library to produce a 273 | work containing portions of the Library, and distribute that work 274 | under terms of your choice, provided that the terms permit 275 | modification of the work for the customer's own use and reverse 276 | engineering for debugging such modifications. 277 | 278 | You must give prominent notice with each copy of the work that the 279 | Library is used in it and that the Library and its use are covered by 280 | this License. You must supply a copy of this License. If the work 281 | during execution displays copyright notices, you must include the 282 | copyright notice for the Library among them, as well as a reference 283 | directing the user to the copy of this License. Also, you must do one 284 | of these things: 285 | 286 | a) Accompany the work with the complete corresponding 287 | machine-readable source code for the Library including whatever 288 | changes were used in the work (which must be distributed under 289 | Sections 1 and 2 above); and, if the work is an executable linked 290 | with the Library, with the complete machine-readable "work that 291 | uses the Library", as object code and/or source code, so that the 292 | user can modify the Library and then relink to produce a modified 293 | executable containing the modified Library. (It is understood 294 | that the user who changes the contents of definitions files in the 295 | Library will not necessarily be able to recompile the application 296 | to use the modified definitions.) 
297 | 298 | b) Use a suitable shared library mechanism for linking with the 299 | Library. A suitable mechanism is one that (1) uses at run time a 300 | copy of the library already present on the user's computer system, 301 | rather than copying library functions into the executable, and (2) 302 | will operate properly with a modified version of the library, if 303 | the user installs one, as long as the modified version is 304 | interface-compatible with the version that the work was made with. 305 | 306 | c) Accompany the work with a written offer, valid for at 307 | least three years, to give the same user the materials 308 | specified in Subsection 6a, above, for a charge no more 309 | than the cost of performing this distribution. 310 | 311 | d) If distribution of the work is made by offering access to copy 312 | from a designated place, offer equivalent access to copy the above 313 | specified materials from the same place. 314 | 315 | e) Verify that the user has already received a copy of these 316 | materials or that you have already sent this user a copy. 317 | 318 | For an executable, the required form of the "work that uses the 319 | Library" must include any data and utility programs needed for 320 | reproducing the executable from it. However, as a special exception, 321 | the materials to be distributed need not include anything that is 322 | normally distributed (in either source or binary form) with the major 323 | components (compiler, kernel, and so on) of the operating system on 324 | which the executable runs, unless that component itself accompanies 325 | the executable. 326 | 327 | It may happen that this requirement contradicts the license 328 | restrictions of other proprietary libraries that do not normally 329 | accompany the operating system. Such a contradiction means you cannot 330 | use both them and the Library together in an executable that you 331 | distribute. 332 | 333 | 7. 
You may place library facilities that are a work based on the 334 | Library side-by-side in a single library together with other library 335 | facilities not covered by this License, and distribute such a combined 336 | library, provided that the separate distribution of the work based on 337 | the Library and of the other library facilities is otherwise 338 | permitted, and provided that you do these two things: 339 | 340 | a) Accompany the combined library with a copy of the same work 341 | based on the Library, uncombined with any other library 342 | facilities. This must be distributed under the terms of the 343 | Sections above. 344 | 345 | b) Give prominent notice with the combined library of the fact 346 | that part of it is a work based on the Library, and explaining 347 | where to find the accompanying uncombined form of the same work. 348 | 349 | 8. You may not copy, modify, sublicense, link with, or distribute 350 | the Library except as expressly provided under this License. Any 351 | attempt otherwise to copy, modify, sublicense, link with, or 352 | distribute the Library is void, and will automatically terminate your 353 | rights under this License. However, parties who have received copies, 354 | or rights, from you under this License will not have their licenses 355 | terminated so long as such parties remain in full compliance. 356 | 357 | 9. You are not required to accept this License, since you have not 358 | signed it. However, nothing else grants you permission to modify or 359 | distribute the Library or its derivative works. These actions are 360 | prohibited by law if you do not accept this License. Therefore, by 361 | modifying or distributing the Library (or any work based on the 362 | Library), you indicate your acceptance of this License to do so, and 363 | all its terms and conditions for copying, distributing or modifying 364 | the Library or works based on it. 365 | 366 | 10. 
Each time you redistribute the Library (or any work based on the 367 | Library), the recipient automatically receives a license from the 368 | original licensor to copy, distribute, link with or modify the Library 369 | subject to these terms and conditions. You may not impose any further 370 | restrictions on the recipients' exercise of the rights granted herein. 371 | You are not responsible for enforcing compliance by third parties with 372 | this License. 373 | 374 | 11. If, as a consequence of a court judgment or allegation of patent 375 | infringement or for any other reason (not limited to patent issues), 376 | conditions are imposed on you (whether by court order, agreement or 377 | otherwise) that contradict the conditions of this License, they do not 378 | excuse you from the conditions of this License. If you cannot 379 | distribute so as to satisfy simultaneously your obligations under this 380 | License and any other pertinent obligations, then as a consequence you 381 | may not distribute the Library at all. For example, if a patent 382 | license would not permit royalty-free redistribution of the Library by 383 | all those who receive copies directly or indirectly through you, then 384 | the only way you could satisfy both it and this License would be to 385 | refrain entirely from distribution of the Library. 386 | 387 | If any portion of this section is held invalid or unenforceable under any 388 | particular circumstance, the balance of the section is intended to apply, 389 | and the section as a whole is intended to apply in other circumstances. 390 | 391 | It is not the purpose of this section to induce you to infringe any 392 | patents or other property right claims or to contest validity of any 393 | such claims; this section has the sole purpose of protecting the 394 | integrity of the free software distribution system which is 395 | implemented by public license practices. 
Many people have made 396 | generous contributions to the wide range of software distributed 397 | through that system in reliance on consistent application of that 398 | system; it is up to the author/donor to decide if he or she is willing 399 | to distribute software through any other system and a licensee cannot 400 | impose that choice. 401 | 402 | This section is intended to make thoroughly clear what is believed to 403 | be a consequence of the rest of this License. 404 | 405 | 12. If the distribution and/or use of the Library is restricted in 406 | certain countries either by patents or by copyrighted interfaces, the 407 | original copyright holder who places the Library under this License may add 408 | an explicit geographical distribution limitation excluding those countries, 409 | so that distribution is permitted only in or among countries not thus 410 | excluded. In such case, this License incorporates the limitation as if 411 | written in the body of this License. 412 | 413 | 13. The Free Software Foundation may publish revised and/or new 414 | versions of the Lesser General Public License from time to time. 415 | Such new versions will be similar in spirit to the present version, 416 | but may differ in detail to address new problems or concerns. 417 | 418 | Each version is given a distinguishing version number. If the Library 419 | specifies a version number of this License which applies to it and 420 | "any later version", you have the option of following the terms and 421 | conditions either of that version or of any later version published by 422 | the Free Software Foundation. If the Library does not specify a 423 | license version number, you may choose any version ever published by 424 | the Free Software Foundation. 425 | 426 | 14. If you wish to incorporate parts of the Library into other free 427 | programs whose distribution conditions are incompatible with these, 428 | write to the author to ask for permission. 
For software which is 429 | copyrighted by the Free Software Foundation, write to the Free 430 | Software Foundation; we sometimes make exceptions for this. Our 431 | decision will be guided by the two goals of preserving the free status 432 | of all derivatives of our free software and of promoting the sharing 433 | and reuse of software generally. 434 | 435 | NO WARRANTY 436 | 437 | 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO 438 | WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 439 | EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR 440 | OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY 441 | KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE 442 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 443 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE 444 | LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME 445 | THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 446 | 447 | 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN 448 | WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY 449 | AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU 450 | FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR 451 | CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE 452 | LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING 453 | RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A 454 | FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF 455 | SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH 456 | DAMAGES. 
457 | 458 | END OF TERMS AND CONDITIONS 459 | 460 | How to Apply These Terms to Your New Libraries 461 | 462 | If you develop a new library, and you want it to be of the greatest 463 | possible use to the public, we recommend making it free software that 464 | everyone can redistribute and change. You can do so by permitting 465 | redistribution under these terms (or, alternatively, under the terms of the 466 | ordinary General Public License). 467 | 468 | To apply these terms, attach the following notices to the library. It is 469 | safest to attach them to the start of each source file to most effectively 470 | convey the exclusion of warranty; and each file should have at least the 471 | "copyright" line and a pointer to where the full notice is found. 472 | 473 | {description} 474 | Copyright (C) {year} {fullname} 475 | 476 | This library is free software; you can redistribute it and/or 477 | modify it under the terms of the GNU Lesser General Public 478 | License as published by the Free Software Foundation; either 479 | version 2.1 of the License, or (at your option) any later version. 480 | 481 | This library is distributed in the hope that it will be useful, 482 | but WITHOUT ANY WARRANTY; without even the implied warranty of 483 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU 484 | Lesser General Public License for more details. 485 | 486 | You should have received a copy of the GNU Lesser General Public 487 | License along with this library; if not, write to the Free Software 488 | Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 489 | USA 490 | 491 | Also add information on how to contact you by electronic and paper mail. 492 | 493 | You should also get your employer (if you work as a programmer) or your 494 | school, if any, to sign a "copyright disclaimer" for the library, if 495 | necessary. 
Here is a sample; alter the names: 496 | 497 | Yoyodyne, Inc., hereby disclaims all copyright interest in the 498 | library `Frob' (a library for tweaking knobs) written by James Random 499 | Hacker. 500 | 501 | {signature of Ty Coon}, 1 April 1990 502 | Ty Coon, President of Vice 503 | 504 | That's all there is to it! 505 | 506 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # MyanmarParser-Py
2 | 
3 | This is the Python version of https://github.com/thanlwinsoft/MyanmarParser
4 | 
5 | ## Usage
6 | 
7 | ```python
8 | 
9 | from myparser import MyParser
10 | 
11 | m = MyParser()
12 | 
13 | text = u'နေကောင်းရဲ့လား'
14 | offset = 0
15 | 
16 | while offset < len(text):
17 |     breaktype, next_offset = m.get_next_syllable(text, len(text), offset) # parse the next syllable
18 |     print(text[offset:next_offset]) # extract the syllable using its start and end offsets
19 |     offset = next_offset
20 | 
21 | # prints
22 | # နေ
23 | # ကောင်း
24 | # ရဲ့
25 | # လား
26 | ```
27 | 
--------------------------------------------------------------------------------
/myparser.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | 
4 | # MyanmarParser-Py
5 | # https://github.com/thantthet/MyanmarParser-Py
6 | # Copyright (C) 2015 Thant
7 | 
8 | # This library is free software; you can redistribute it and/or
9 | # modify it under the terms of the GNU Lesser General Public
10 | # License as published by the Free Software Foundation; either
11 | # version 2.1 of the License, or (at your option) any later version.
12 | 
13 | # This library is distributed in the hope that it will be useful,
14 | # but WITHOUT ANY WARRANTY; without even the implied warranty of
15 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
16 | # Lesser General Public License for more details.
17 | 
18 | # You should have received a copy of the GNU Lesser General Public
19 | # License along with this library; if not, write to the Free Software
20 | # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
21 | # USA
22 | 
23 | import types
24 | 
25 | class MyParser:
26 |     MY_SYLLABLE_UNKNOWN = 0
27 |     MY_SYLLABLE_CONSONANT = 1
28 |     MY_SYLLABLE_MEDIAL = 2
29 |     MY_SYLLABLE_VOWEL = 3
30 |     MY_SYLLABLE_TONE = 4
31 |     MY_SYLLABLE_1039 = 5
32 |     MY_SYLLABLE_103A = 6
33 |     MY_SYLLABLE_NUMBER = 7
34 |     MY_SYLLABLE_SECTION = 8
35 | 
36 |     CHAR_PART = ()
37 | 
38 | 
39 |     MY_PAIR_ILLEGAL = 0 # illegal sequence
40 |     MY_PAIR_NO_BREAK = 1 # no break
41 |     MY_PAIR_SYL_BREAK = 2 # syllable break
42 |     MY_PAIR_WORD_BREAK = 3 # word break
43 |     MY_PAIR_PUNCTUATION = 4 # punctuation break
44 |     MY_PAIR_CONTEXT = 5 # needs further context analysis
45 |     MY_PAIR_EOL = 6 # end of line
46 | 
47 |     LANG_MY = 0 # Myanmar
48 | 
49 |     MM_MAX_CONTEXT_LENGTH = 4
50 | 
51 |     def __init__(self):
52 |         self.CHAR_PART = (
53 |             self.MY_SYLLABLE_CONSONANT,#1000;MYANMAR LETTER KA;Lo;0;L;;;;;N;;;;;
54 |             self.MY_SYLLABLE_CONSONANT,#1001;MYANMAR LETTER KHA;Lo;0;L;;;;;N;;;;;
55 |             self.MY_SYLLABLE_CONSONANT,#1002;MYANMAR LETTER GA;Lo;0;L;;;;;N;;;;;
56 |             self.MY_SYLLABLE_CONSONANT,#1003;MYANMAR LETTER GHA;Lo;0;L;;;;;N;;;;;
57 |             self.MY_SYLLABLE_CONSONANT,#1004;MYANMAR LETTER NGA;Lo;0;L;;;;;N;;;;;
58 |             self.MY_SYLLABLE_CONSONANT,#1005;MYANMAR LETTER CA;Lo;0;L;;;;;N;;;;;
59 |             self.MY_SYLLABLE_CONSONANT,#1006;MYANMAR LETTER CHA;Lo;0;L;;;;;N;;;;;
60 |             self.MY_SYLLABLE_CONSONANT,#1007;MYANMAR LETTER JA;Lo;0;L;;;;;N;;;;;
61 |             self.MY_SYLLABLE_CONSONANT,#1008;MYANMAR LETTER JHA;Lo;0;L;;;;;N;;;;;
62 |             self.MY_SYLLABLE_CONSONANT,#1009;MYANMAR LETTER NYA;Lo;0;L;;;;;N;;;;;
63 |             self.MY_SYLLABLE_CONSONANT,#100A;MYANMAR LETTER NNYA;Lo;0;L;;;;;N;;;;;
64 |             self.MY_SYLLABLE_CONSONANT,#100B;MYANMAR LETTER TTA;Lo;0;L;;;;;N;;;;;
65 |             self.MY_SYLLABLE_CONSONANT,#100C;MYANMAR LETTER TTHA;Lo;0;L;;;;;N;;;;;
66 |             self.MY_SYLLABLE_CONSONANT,#100D;MYANMAR LETTER DDA;Lo;0;L;;;;;N;;;;;
67 |             self.MY_SYLLABLE_CONSONANT,#100E;MYANMAR LETTER DDHA;Lo;0;L;;;;;N;;;;;
68 |             self.MY_SYLLABLE_CONSONANT,#100F;MYANMAR LETTER NNA;Lo;0;L;;;;;N;;;;;
69 |             self.MY_SYLLABLE_CONSONANT,#1010;MYANMAR LETTER TA;Lo;0;L;;;;;N;;;;;
70 |             self.MY_SYLLABLE_CONSONANT,#1011;MYANMAR LETTER THA;Lo;0;L;;;;;N;;;;;
71 |             self.MY_SYLLABLE_CONSONANT,#1012;MYANMAR LETTER DA;Lo;0;L;;;;;N;;;;;
72 |             self.MY_SYLLABLE_CONSONANT,#1013;MYANMAR LETTER DHA;Lo;0;L;;;;;N;;;;;
73 |             self.MY_SYLLABLE_CONSONANT,#1014;MYANMAR LETTER NA;Lo;0;L;;;;;N;;;;;
74 |             self.MY_SYLLABLE_CONSONANT,#1015;MYANMAR LETTER PA;Lo;0;L;;;;;N;;;;;
75 |             self.MY_SYLLABLE_CONSONANT,#1016;MYANMAR LETTER PHA;Lo;0;L;;;;;N;;;;;
76 |             self.MY_SYLLABLE_CONSONANT,#1017;MYANMAR LETTER BA;Lo;0;L;;;;;N;;;;;
77 |             self.MY_SYLLABLE_CONSONANT,#1018;MYANMAR LETTER BHA;Lo;0;L;;;;;N;;;;;
78 |             self.MY_SYLLABLE_CONSONANT,#1019;MYANMAR LETTER MA;Lo;0;L;;;;;N;;;;;
79 |             self.MY_SYLLABLE_CONSONANT,#101A;MYANMAR LETTER YA;Lo;0;L;;;;;N;;;;;
80 |             self.MY_SYLLABLE_CONSONANT,#101B;MYANMAR LETTER RA;Lo;0;L;;;;;N;;;;;
81 |             self.MY_SYLLABLE_CONSONANT,#101C;MYANMAR LETTER LA;Lo;0;L;;;;;N;;;;;
82 |             self.MY_SYLLABLE_CONSONANT,#101D;MYANMAR LETTER WA;Lo;0;L;;;;;N;;;;;
83 |             self.MY_SYLLABLE_CONSONANT,#101E;MYANMAR LETTER SA;Lo;0;L;;;;;N;;;;;
84 |             self.MY_SYLLABLE_CONSONANT,#101F;MYANMAR LETTER HA;Lo;0;L;;;;;N;;;;;
85 |             self.MY_SYLLABLE_CONSONANT,#1020;MYANMAR LETTER LLA;Lo;0;L;;;;;N;;;;;
86 |             self.MY_SYLLABLE_CONSONANT,#1021;MYANMAR LETTER A;Lo;0;L;;;;;N;;;;;
87 |             self.MY_SYLLABLE_CONSONANT,#1022;MYANMAR LETTER SHAN A;Lo;0;L;;;;;N;;;;;
88 |             self.MY_SYLLABLE_CONSONANT,#1023;MYANMAR LETTER I;Lo;0;L;;;;;N;;;;;
89 |             self.MY_SYLLABLE_CONSONANT,#1024;MYANMAR LETTER II;Lo;0;L;;;;;N;;;;;
90 |             self.MY_SYLLABLE_CONSONANT,#1025;MYANMAR LETTER U;Lo;0;L;;;;;N;;;;;
91 |             self.MY_SYLLABLE_CONSONANT,#1026;MYANMAR LETTER UU;Lo;0;L;1025 102E;;;;N;;;;;
92 |             self.MY_SYLLABLE_CONSONANT,#1027;MYANMAR LETTER E;Lo;0;L;;;;;N;;;;;
93 |             self.MY_SYLLABLE_CONSONANT,#1028;MYANMAR LETTER MON E;Lo;0;L;;;;;N;;;;;
94 |             self.MY_SYLLABLE_CONSONANT,#1029;MYANMAR LETTER O;Lo;0;L;;;;;N;;;;;
95 |             self.MY_SYLLABLE_CONSONANT,#102A;MYANMAR LETTER AU;Lo;0;L;;;;;N;;;;;
96 |             self.MY_SYLLABLE_VOWEL,#102B;MYANMAR VOWEL SIGN TALL AA;Mc;0;L;;;;;N;;;;;
97 |             self.MY_SYLLABLE_VOWEL,#102C;MYANMAR VOWEL SIGN AA;Mc;0;L;;;;;N;;;;;
98 |             self.MY_SYLLABLE_VOWEL,#102D;MYANMAR VOWEL SIGN I;Mn;0;NSM;;;;;N;;;;;
99 |             self.MY_SYLLABLE_VOWEL,#102E;MYANMAR VOWEL SIGN II;Mn;0;NSM;;;;;N;;;;;
100 |             self.MY_SYLLABLE_VOWEL,#102F;MYANMAR VOWEL SIGN U;Mn;0;NSM;;;;;N;;;;;
101 |             self.MY_SYLLABLE_VOWEL,#1030;MYANMAR VOWEL SIGN UU;Mn;0;NSM;;;;;N;;;;;
102 |             self.MY_SYLLABLE_VOWEL,#1031;MYANMAR VOWEL SIGN E;Mc;0;L;;;;;N;;;;;
103 |             self.MY_SYLLABLE_VOWEL,#1032;MYANMAR VOWEL SIGN AI;Mn;0;NSM;;;;;N;;;;;
104 |             self.MY_SYLLABLE_VOWEL,#1033;MYANMAR VOWEL SIGN MON II;Mn;0;NSM;;;;;N;;;;;
105 |             self.MY_SYLLABLE_VOWEL,#1034;MYANMAR VOWEL SIGN MON O;Mn;0;NSM;;;;;N;;;;;
106 |             self.MY_SYLLABLE_VOWEL,#1035;MYANMAR VOWEL SIGN E ABOVE;Mn;0;NSM;;;;;N;;;;;
107 |             self.MY_SYLLABLE_VOWEL,#1036;MYANMAR SIGN ANUSVARA;Mn;0;NSM;;;;;N;;;;;
108 |             self.MY_SYLLABLE_TONE,#1037;MYANMAR SIGN DOT BELOW;Mn;7;NSM;;;;;N;;;;;
109 |             self.MY_SYLLABLE_TONE,#1038;MYANMAR SIGN VISARGA;Mc;0;L;;;;;N;;;;;
110 |             self.MY_SYLLABLE_1039,#1039;MYANMAR SIGN VIRAMA;Mn;9;NSM;;;;;N;;;;;
111 |             self.MY_SYLLABLE_103A,#103A;MYANMAR SIGN ASAT;Mn;9;NSM;;;;;N;;;;;
112 |             self.MY_SYLLABLE_MEDIAL,#103B;MYANMAR CONSONANT SIGN MEDIAL YA;Mc;0;L;;;;;N;;;;;
113 |             self.MY_SYLLABLE_MEDIAL,#103C;MYANMAR CONSONANT SIGN MEDIAL RA;Mc;0;L;;;;;N;;;;;
114 |             self.MY_SYLLABLE_MEDIAL,#103D;MYANMAR CONSONANT SIGN MEDIAL WA;Mn;0;NSM;;;;;N;;;;;
115 |             self.MY_SYLLABLE_MEDIAL,#103E;MYANMAR CONSONANT SIGN MEDIAL HA;Mn;0;NSM;;;;;N;;;;;
116 |             self.MY_SYLLABLE_CONSONANT,#103F;MYANMAR LETTER GREAT SA;Lo;0;L;;;;;N;;;;;
117 |             self.MY_SYLLABLE_NUMBER,#1040;MYANMAR DIGIT ZERO;Nd;0;L;;0;0;0;N;;;;;
118 | 
self.MY_SYLLABLE_NUMBER,#1041;MYANMAR DIGIT ONE;Nd;0;L;;1;1;1;N;;;;; 119 | self.MY_SYLLABLE_NUMBER,#1042;MYANMAR DIGIT TWO;Nd;0;L;;2;2;2;N;;;;; 120 | self.MY_SYLLABLE_NUMBER,#1043;MYANMAR DIGIT THREE;Nd;0;L;;3;3;3;N;;;;; 121 | self.MY_SYLLABLE_NUMBER,#1044;MYANMAR DIGIT FOUR;Nd;0;L;;4;4;4;N;;;;; 122 | self.MY_SYLLABLE_NUMBER,#1045;MYANMAR DIGIT FIVE;Nd;0;L;;5;5;5;N;;;;; 123 | self.MY_SYLLABLE_NUMBER,#1046;MYANMAR DIGIT SIX;Nd;0;L;;6;6;6;N;;;;; 124 | self.MY_SYLLABLE_NUMBER,#1047;MYANMAR DIGIT SEVEN;Nd;0;L;;7;7;7;N;;;;; 125 | self.MY_SYLLABLE_NUMBER,#1048;MYANMAR DIGIT EIGHT;Nd;0;L;;8;8;8;N;;;;; 126 | self.MY_SYLLABLE_NUMBER,#1049;MYANMAR DIGIT NINE;Nd;0;L;;9;9;9;N;;;;; 127 | self.MY_SYLLABLE_SECTION,#104A;MYANMAR SIGN LITTLE SECTION;Po;0;L;;;;;N;;;;; 128 | self.MY_SYLLABLE_SECTION,#104B;MYANMAR SIGN SECTION;Po;0;L;;;;;N;;;;; 129 | self.MY_SYLLABLE_CONSONANT,#104C;MYANMAR SYMBOL LOCATIVE;Po;0;L;;;;;N;;;;; 130 | self.MY_SYLLABLE_CONSONANT,#104D;MYANMAR SYMBOL COMPLETED;Po;0;L;;;;;N;;;;; 131 | self.MY_SYLLABLE_CONSONANT,#104E;MYANMAR SYMBOL AFOREMENTIONED;Po;0;L;;;;;N;;;;; 132 | self.MY_SYLLABLE_CONSONANT,#104F;MYANMAR SYMBOL GENITIVE;Po;0;L;;;;;N;;;;; 133 | self.MY_SYLLABLE_CONSONANT,#1050;MYANMAR LETTER SHA;Lo;0;L;;;;;N;;;;; 134 | self.MY_SYLLABLE_CONSONANT,#1051;MYANMAR LETTER SSA;Lo;0;L;;;;;N;;;;; 135 | self.MY_SYLLABLE_CONSONANT,#1052;MYANMAR LETTER VOCALIC R;Lo;0;L;;;;;N;;;;; 136 | self.MY_SYLLABLE_CONSONANT,#1053;MYANMAR LETTER VOCALIC RR;Lo;0;L;;;;;N;;;;; 137 | self.MY_SYLLABLE_CONSONANT,#1054;MYANMAR LETTER VOCALIC L;Lo;0;L;;;;;N;;;;; 138 | self.MY_SYLLABLE_CONSONANT,#1055;MYANMAR LETTER VOCALIC LL;Lo;0;L;;;;;N;;;;; 139 | self.MY_SYLLABLE_VOWEL,#1056;MYANMAR VOWEL SIGN VOCALIC R;Mc;0;L;;;;;N;;;;; 140 | self.MY_SYLLABLE_VOWEL,#1057;MYANMAR VOWEL SIGN VOCALIC RR;Mc;0;L;;;;;N;;;;; 141 | self.MY_SYLLABLE_VOWEL,#1058;MYANMAR VOWEL SIGN VOCALIC L;Mn;0;NSM;;;;;N;;;;; 142 | self.MY_SYLLABLE_VOWEL,#1059;MYANMAR VOWEL SIGN VOCALIC LL;Mn;0;NSM;;;;;N;;;;; 143 | 
self.MY_SYLLABLE_CONSONANT,#105A;MYANMAR LETTER MON NGA;Lo;0;L;;;;;N;;;;; 144 | self.MY_SYLLABLE_CONSONANT,#105B;MYANMAR LETTER MON JHA;Lo;0;L;;;;;N;;;;; 145 | self.MY_SYLLABLE_CONSONANT,#105C;MYANMAR LETTER MON BBA;Lo;0;L;;;;;N;;;;; 146 | self.MY_SYLLABLE_CONSONANT,#105D;MYANMAR LETTER MON BBE;Lo;0;L;;;;;N;;;;; 147 | self.MY_SYLLABLE_MEDIAL,#105E;MYANMAR CONSONANT SIGN MON MEDIAL NA;Mn;0;NSM;;;;;N;;;;; 148 | self.MY_SYLLABLE_MEDIAL,#105F;MYANMAR CONSONANT SIGN MON MEDIAL MA;Mn;0;NSM;;;;;N;;;;; 149 | self.MY_SYLLABLE_MEDIAL,#1060;MYANMAR CONSONANT SIGN MON MEDIAL LA;Mn;0;NSM;;;;;N;;;;; 150 | self.MY_SYLLABLE_CONSONANT,#1061;MYANMAR LETTER SGAW KAREN SHA;Lo;0;L;;;;;N;;;;; 151 | self.MY_SYLLABLE_VOWEL,#1062;MYANMAR VOWEL SIGN SGAW KAREN EU;Mc;0;L;;;;;N;;;;; 152 | self.MY_SYLLABLE_VOWEL,#1063;MYANMAR TONE MARK SGAW KAREN HATHI;Mc;0;L;;;;;N;;;;; 153 | self.MY_SYLLABLE_VOWEL,#1064;MYANMAR TONE MARK SGAW KAREN KE PHO;Mc;0;L;;;;;N;;;;; 154 | self.MY_SYLLABLE_CONSONANT,#1065;MYANMAR LETTER WESTERN PWO KAREN THA;Lo;0;L;;;;;N;;;;; 155 | self.MY_SYLLABLE_CONSONANT,#1066;MYANMAR LETTER WESTERN PWO KAREN PWA;Lo;0;L;;;;;N;;;;; 156 | self.MY_SYLLABLE_VOWEL,#1067;MYANMAR VOWEL SIGN WESTERN PWO KAREN EU;Mc;0;L;;;;;N;;;;; 157 | self.MY_SYLLABLE_VOWEL,#1068;MYANMAR VOWEL SIGN WESTERN PWO KAREN UE;Mc;0;L;;;;;N;;;;; 158 | self.MY_SYLLABLE_TONE,#1069;MYANMAR SIGN WESTERN PWO KAREN TONE-1;Mc;0;L;;;;;N;;;;; 159 | self.MY_SYLLABLE_TONE,#106A;MYANMAR SIGN WESTERN PWO KAREN TONE-2;Mc;0;L;;;;;N;;;;; 160 | self.MY_SYLLABLE_TONE,#106B;MYANMAR SIGN WESTERN PWO KAREN TONE-3;Mc;0;L;;;;;N;;;;; 161 | self.MY_SYLLABLE_TONE,#106C;MYANMAR SIGN WESTERN PWO KAREN TONE-4;Mc;0;L;;;;;N;;;;; 162 | self.MY_SYLLABLE_TONE,#106D;MYANMAR SIGN WESTERN PWO KAREN TONE-5;Mc;0;L;;;;;N;;;;; 163 | self.MY_SYLLABLE_CONSONANT,#106E;MYANMAR LETTER EASTERN PWO KAREN NNA;Lo;0;L;;;;;N;;;;; 164 | self.MY_SYLLABLE_CONSONANT,#106F;MYANMAR LETTER EASTERN PWO KAREN YWA;Lo;0;L;;;;;N;;;;; 165 | 
self.MY_SYLLABLE_CONSONANT,#1070;MYANMAR LETTER EASTERN PWO KAREN GHWA;Lo;0;L;;;;;N;;;;; 166 | self.MY_SYLLABLE_VOWEL,#1071;MYANMAR VOWEL SIGN GEBA KAREN I;Mn;0;NSM;;;;;N;;;;; 167 | self.MY_SYLLABLE_VOWEL,#1072;MYANMAR VOWEL SIGN KAYAH OE;Mn;0;NSM;;;;;N;;;;; 168 | self.MY_SYLLABLE_VOWEL,#1073;MYANMAR VOWEL SIGN KAYAH U;Mn;0;NSM;;;;;N;;;;; 169 | self.MY_SYLLABLE_VOWEL,#1074;MYANMAR VOWEL SIGN KAYAH EE;Mn;0;NSM;;;;;N;;;;; 170 | self.MY_SYLLABLE_CONSONANT,#1075;MYANMAR LETTER SHAN KA;Lo;0;L;;;;;N;;;;; 171 | self.MY_SYLLABLE_CONSONANT,#1076;MYANMAR LETTER SHAN KHA;Lo;0;L;;;;;N;;;;; 172 | self.MY_SYLLABLE_CONSONANT,#1077;MYANMAR LETTER SHAN GA;Lo;0;L;;;;;N;;;;; 173 | self.MY_SYLLABLE_CONSONANT,#1078;MYANMAR LETTER SHAN CA;Lo;0;L;;;;;N;;;;; 174 | self.MY_SYLLABLE_CONSONANT,#1079;MYANMAR LETTER SHAN ZA;Lo;0;L;;;;;N;;;;; 175 | self.MY_SYLLABLE_CONSONANT,#107A;MYANMAR LETTER SHAN NYA;Lo;0;L;;;;;N;;;;; 176 | self.MY_SYLLABLE_CONSONANT,#107B;MYANMAR LETTER SHAN DA;Lo;0;L;;;;;N;;;;; 177 | self.MY_SYLLABLE_CONSONANT,#107C;MYANMAR LETTER SHAN NA;Lo;0;L;;;;;N;;;;; 178 | self.MY_SYLLABLE_CONSONANT,#107D;MYANMAR LETTER SHAN PHA;Lo;0;L;;;;;N;;;;; 179 | self.MY_SYLLABLE_CONSONANT,#107E;MYANMAR LETTER SHAN FA;Lo;0;L;;;;;N;;;;; 180 | self.MY_SYLLABLE_CONSONANT,#107F;MYANMAR LETTER SHAN BA;Lo;0;L;;;;;N;;;;; 181 | self.MY_SYLLABLE_CONSONANT,#1080;MYANMAR LETTER SHAN THA;Lo;0;L;;;;;N;;;;; 182 | self.MY_SYLLABLE_CONSONANT,#1081;MYANMAR LETTER SHAN HA;Lo;0;L;;;;;N;;;;; 183 | self.MY_SYLLABLE_MEDIAL,#1082;MYANMAR CONSONANT SIGN SHAN MEDIAL WA;Mn;0;NSM;;;;;N;;;;; 184 | self.MY_SYLLABLE_VOWEL,#1083;MYANMAR VOWEL SIGN SHAN AA;Mc;0;L;;;;;N;;;;; 185 | self.MY_SYLLABLE_VOWEL,#1084;MYANMAR VOWEL SIGN SHAN E;Mc;0;L;;;;;N;;;;; 186 | self.MY_SYLLABLE_VOWEL,#1085;MYANMAR VOWEL SIGN SHAN E ABOVE;Mn;0;NSM;;;;;N;;;;; 187 | self.MY_SYLLABLE_VOWEL,#1086;MYANMAR VOWEL SIGN SHAN FINAL Y;Mn;0;NSM;;;;;N;;;;; 188 | self.MY_SYLLABLE_TONE,#1087;MYANMAR SIGN SHAN TONE-2;Mc;0;L;;;;;N;;;;; 189 | 
self.MY_SYLLABLE_TONE,#1088;MYANMAR SIGN SHAN TONE-3;Mc;0;L;;;;;N;;;;; 190 | self.MY_SYLLABLE_TONE,#1089;MYANMAR SIGN SHAN TONE-5;Mc;0;L;;;;;N;;;;; 191 | self.MY_SYLLABLE_TONE,#108A;MYANMAR SIGN SHAN TONE-6;Mc;0;L;;;;;N;;;;; 192 | self.MY_SYLLABLE_TONE,#108B;MYANMAR SIGN SHAN COUNCIL TONE-2;Mc;0;L;;;;;N;;;;; 193 | self.MY_SYLLABLE_TONE,#108C;MYANMAR SIGN SHAN COUNCIL TONE-3;Mc;0;L;;;;;N;;;;; 194 | self.MY_SYLLABLE_TONE,#108D;MYANMAR SIGN SHAN COUNCIL EMPHATIC TONE;Mn;220;NSM;;;;;N;;;;; 195 | self.MY_SYLLABLE_CONSONANT,#108E;MYANMAR LETTER RUMAI PALAUNG FA;Lo;0;L;;;;;N;;;;; 196 | self.MY_SYLLABLE_TONE,#108F;MYANMAR SIGN RUMAI PALAUNG TONE-5;Mc;0;L;;;;;N;;;;; 197 | self.MY_SYLLABLE_NUMBER,#1090;MYANMAR SHAN DIGIT ZERO;Nd;0;L;;0;0;0;N;;;;; 198 | self.MY_SYLLABLE_NUMBER,#1091;MYANMAR SHAN DIGIT ONE;Nd;0;L;;1;1;1;N;;;;; 199 | self.MY_SYLLABLE_NUMBER,#1092;MYANMAR SHAN DIGIT TWO;Nd;0;L;;2;2;2;N;;;;; 200 | self.MY_SYLLABLE_NUMBER,#1093;MYANMAR SHAN DIGIT THREE;Nd;0;L;;3;3;3;N;;;;; 201 | self.MY_SYLLABLE_NUMBER,#1094;MYANMAR SHAN DIGIT FOUR;Nd;0;L;;4;4;4;N;;;;; 202 | self.MY_SYLLABLE_NUMBER,#1095;MYANMAR SHAN DIGIT FIVE;Nd;0;L;;5;5;5;N;;;;; 203 | self.MY_SYLLABLE_NUMBER,#1096;MYANMAR SHAN DIGIT SIX;Nd;0;L;;6;6;6;N;;;;; 204 | self.MY_SYLLABLE_NUMBER,#1097;MYANMAR SHAN DIGIT SEVEN;Nd;0;L;;7;7;7;N;;;;; 205 | self.MY_SYLLABLE_NUMBER,#1098;MYANMAR SHAN DIGIT EIGHT;Nd;0;L;;8;8;8;N;;;;; 206 | self.MY_SYLLABLE_NUMBER,#1099;MYANMAR SHAN DIGIT NINE;Nd;0;L;;9;9;9;N;;;;; 207 | self.MY_SYLLABLE_TONE,#109A 208 | self.MY_SYLLABLE_TONE,#109B 209 | self.MY_SYLLABLE_TONE,#109C 210 | self.MY_SYLLABLE_TONE,#109D?? 
211 | self.MY_SYLLABLE_CONSONANT,#109E;MYANMAR SYMBOL SHAN ONE;So;0;L;;;;;N;;;;; 212 | self.MY_SYLLABLE_CONSONANT,#109F;MYANMAR SYMBOL SHAN EXCLAMATION;So;0;L;;;;;N;;;;; 213 | ) 214 | 215 | def get_char(self, stringOrChar): 216 | # Accept either a one-character string or an integer code point 217 | # and always return the integer code point. 218 | 219 | if isinstance(stringOrChar, int): 220 | char = stringOrChar 221 | else: 222 | char = ord(stringOrChar) 223 | return char 224 | 225 | def get_char_class(self, string): 226 | identifiedClass = self.MY_SYLLABLE_UNKNOWN 227 | char = self.get_char(string) 228 | 229 | if 0x1000 > char or char > 0x109F: 230 | if 0xAA60 <= char < 0xAA7C: 231 | if char == 0xAA70: 232 | return self.MY_SYLLABLE_TONE 233 | elif char == 0xAA7B: 234 | return self.MY_SYLLABLE_TONE 235 | return self.MY_SYLLABLE_CONSONANT 236 | return self.MY_SYLLABLE_UNKNOWN 237 | 238 | identifiedClass = self.CHAR_PART[char - 0x1000] 239 | return identifiedClass 240 | 241 | def get_break_status(self, before, after): 242 | 243 | # first char = row, second char = column 244 | # 0=illegal, 1=no break, 2=syllable break, 3=word break, 4=punctuation, 5=needs context 245 | 246 | BKSTATUS = ( 247 | # - C M V T 39 3A N S 248 | ( 1, 3, 1, 1, 1, 1, 1, 1, 1 ),#- 249 | ( 3, 5, 1, 1, 1, 1, 1, 2, 4 ),#C 250 | ( 3, 5, 1, 1, 1, 0, 1, 2, 4 ),#M 251 | ( 3, 5, 0, 1, 1, 0, 1, 2, 4 ),#V 252 | ( 3, 2, 0, 1, 1, 0, 1, 2, 4 ),#T 253 | ( 3, 1, 0, 0, 0, 0, 0, 0, 0 ),#1039 254 | ( 3, 2, 1, 1, 1, 1, 0, 2, 4 ),#103A 255 | ( 3, 2, 1, 1, 1, 0, 0, 1, 4 ),#N 256 | ( 3, 2, 0, 0, 0, 0, 0, 2, 0 ) #S 257 | ) 258 | 259 | firstClass = self.get_char_class(before) 260 | secondClass = self.get_char_class(after) 261 | 262 | # print firstClass, secondClass 263 | 264 | return BKSTATUS[firstClass][secondClass] 265 | 266 | def evaluate_context(self, contextText, offset, langHint): 267 | text = contextText[offset:] 268 | 269 | length = len(text) 270 | if length < 4: 271 | # pad to 4 characters so the text[3] lookahead below cannot run off the end 272 | text += u" " * (4 - length) 273 | 274 | if text[0] == u'\u1021' and 
langHint == self.LANG_MY: 275 | return self.MY_PAIR_NO_BREAK 276 | if text[1] == u'\u002d': 277 | return self.MY_PAIR_NO_BREAK 278 | if text[1] == u'\u103F': 279 | return self.MY_PAIR_NO_BREAK 280 | if text[2] == u'\u1037' and text[3] == u'\u103A': 281 | return self.MY_PAIR_NO_BREAK 282 | if text[2] == u'\u1039': 283 | return self.MY_PAIR_NO_BREAK 284 | elif text[2] == u'\u103A' and langHint == self.LANG_MY: 285 | # Karen (and also some loan words in Myanmar) can have a starting 103A 286 | return self.MY_PAIR_NO_BREAK 287 | else: 288 | return self.MY_PAIR_SYL_BREAK 289 | 290 | def get_next_syllable(self, text, length, offset): 291 | # print "get_next_syllable",text[offset:] 292 | breakType = self.MY_PAIR_NO_BREAK 293 | i = offset 294 | foundCluster = False 295 | if offset >= length: 296 | return (breakType, length) 297 | while i + 1 < length: 298 | breakStatus = self.get_break_status(text[i], text[i+1]) 299 | # print "break:",text[i],text[i+1],"status:",breakStatus 300 | if breakStatus == self.MY_PAIR_NO_BREAK: 301 | pass 302 | elif breakStatus in (self.MY_PAIR_SYL_BREAK, self.MY_PAIR_WORD_BREAK, self.MY_PAIR_PUNCTUATION, self.MY_PAIR_ILLEGAL): 303 | breakType = breakStatus 304 | foundCluster = True 305 | elif breakStatus == self.MY_PAIR_CONTEXT: 306 | breakType = self.evaluate_context(text, i, self.LANG_MY) 307 | # print "evl:",text,"type:",breakType 308 | if breakType != self.MY_PAIR_NO_BREAK: 309 | foundCluster = True 310 | else: 311 | print("Unexpected status %d" % breakStatus) 312 | if foundCluster: 313 | break 314 | i += 1 315 | if i + 1 == len(text): 316 | breakType = self.MY_PAIR_EOL 317 | return (breakType, i + 1) 318 | 319 | def is_myanmar_char(self, string): 320 | char = self.get_char(string) 321 | if 0x1000 <= char <= 0x109f or 0xaa60 <= char <= 0xaa7f: 322 | return True 323 | return False 324 | 325 | def is_not_myanmar(self, string): 326 | char = self.get_char(string) 327 | charClass = 
self.get_char_class(char) 328 | if charClass == MMC_OT or charClass == MMC_RQ or charClass == MMC_LQ or charClass == MMC_SP: # NOTE: the MMC_* constants are not defined in this module, so this raises NameError until they are added 329 | return True 330 | return False 331 | 332 | def is_neutral(self, string): 333 | char = self.get_char(string) 334 | charClass = self.get_char_class(char) 335 | if charClass == MMC_WJ or charClass == MMC_RQ or charClass == MMC_LQ or charClass == MMC_SP or charClass == MMC_NJ: # NOTE: MMC_* constants are undefined here as well 336 | return True 337 | return False 338 | --------------------------------------------------------------------------------
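The table-driven segmentation spread across `get_break_status`, `evaluate_context` and `get_next_syllable` can be condensed into a short standalone sketch. The version below is an illustration in Python 3, not part of `myparser.py`: the names `char_class`, `resolve_context` and `syllables` are hypothetical, the classifier is assumed to cover only the core range U+1000 to U+104B (everything else is treated as unknown), and the language hint is assumed to be fixed to Myanmar.

```python
# Character classes; the values index rows and columns of BREAK_TABLE
# in the same order as the MY_SYLLABLE_* constants.
UNKNOWN, CONSONANT, MEDIAL, VOWEL, TONE, VIRAMA, ASAT, NUMBER, SECTION = range(9)

# Pair results, in the same order as the first six MY_PAIR_* constants.
ILLEGAL, NO_BREAK, SYL_BREAK, WORD_BREAK, PUNCTUATION, CONTEXT = range(6)

# Row = class of the first character, column = class of the second one.
BREAK_TABLE = (
    # -  C  M  V  T 39 3A  N  S
    (1, 3, 1, 1, 1, 1, 1, 1, 1),  # unknown
    (3, 5, 1, 1, 1, 1, 1, 2, 4),  # consonant
    (3, 5, 1, 1, 1, 0, 1, 2, 4),  # medial
    (3, 5, 0, 1, 1, 0, 1, 2, 4),  # vowel
    (3, 2, 0, 1, 1, 0, 1, 2, 4),  # tone
    (3, 1, 0, 0, 0, 0, 0, 0, 0),  # U+1039
    (3, 2, 1, 1, 1, 1, 0, 2, 4),  # U+103A
    (3, 2, 1, 1, 1, 0, 0, 1, 4),  # number
    (3, 2, 0, 0, 0, 0, 0, 2, 0),  # section
)


def char_class(ch):
    """Classify one character (assumption: only U+1000..U+104B is handled)."""
    cp = ord(ch)
    if 0x1000 <= cp <= 0x102A or cp == 0x103F:
        return CONSONANT
    if 0x102B <= cp <= 0x1036:
        return VOWEL
    if cp in (0x1037, 0x1038):
        return TONE
    if cp == 0x1039:
        return VIRAMA
    if cp == 0x103A:
        return ASAT
    if 0x103B <= cp <= 0x103E:
        return MEDIAL
    if 0x1040 <= cp <= 0x1049:
        return NUMBER
    if cp in (0x104A, 0x104B):
        return SECTION
    return UNKNOWN


def resolve_context(text, i):
    """Decide a CONTEXT pair by looking at up to four characters from i."""
    window = text[i:i + 4].ljust(4)  # pad with spaces so window[3] always exists
    if window[0] == "\u1021" or window[1] in ("\u002d", "\u103f"):
        return NO_BREAK
    if window[2] == "\u1037" and window[3] == "\u103a":
        return NO_BREAK
    if window[2] in ("\u1039", "\u103a"):
        return NO_BREAK
    return SYL_BREAK


def syllables(text):
    """Yield the pieces of text, cutting wherever a pair is not NO_BREAK."""
    start = 0
    for i in range(len(text) - 1):
        status = BREAK_TABLE[char_class(text[i])][char_class(text[i + 1])]
        if status == CONTEXT:
            status = resolve_context(text, i)
        if status != NO_BREAK:
            yield text[start:i + 1]
            start = i + 1
    if start < len(text):
        yield text[start:]


if __name__ == "__main__":
    for piece in syllables("နေကောင်းရဲ့လား"):
        print(piece)
```

Run on the README's sample string, this sketch should yield the same four syllables listed there. Wrapping the loop in a generator avoids passing `length` and `offset` around by hand, at the cost of dropping the break type that `get_next_syllable` returns alongside each offset.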