├── .gitignore
├── doc
    ├── ComfyUI_temp_dgtgr_00001_.png
    ├── ComfyUI_temp_rhsxy_00001_.png
    └── base_workflow.json
├── f5_model
    ├── __init__.py
    ├── backbones
    │   ├── README.md
    │   ├── mmdit.py
    │   ├── dit.py
    │   └── unett.py
    ├── dataset.py
    ├── cfm.py
    ├── ecapa_tdnn.py
    ├── trainer.py
    ├── modules.py
    └── utils.py
├── requirements.txt
├── zh_normalization
    ├── README.md
    ├── __init__.py
    ├── quantifier.py
    ├── phonecode.py
    ├── constants.py
    ├── chronology.py
    ├── text_normlization.py
    ├── num.py
    └── char_convert.py
├── LICENSE
├── README.md
├── __init__.py
└── data
    └── Emilia_ZH_EN_pinyin
        └── vocab.txt


/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__


--------------------------------------------------------------------------------
/doc/ComfyUI_temp_dgtgr_00001_.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AIFSH/F5-TTS-ComfyUI/HEAD/doc/ComfyUI_temp_dgtgr_00001_.png


--------------------------------------------------------------------------------
/doc/ComfyUI_temp_rhsxy_00001_.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AIFSH/F5-TTS-ComfyUI/HEAD/doc/ComfyUI_temp_rhsxy_00001_.png


--------------------------------------------------------------------------------
/f5_model/__init__.py:
--------------------------------------------------------------------------------
1 | from .cfm import CFM
2 | 
3 | from .backbones.unett import UNetT
4 | from .backbones.dit import DiT
5 | from .backbones.mmdit import MMDiT
6 | 
7 | from .trainer import Trainer
8 | 


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
 1 | accelerate>=0.33.0
 2 | datasets
 3 | einops>=0.8.0
 4 | einx>=0.3.0
 5 | ema_pytorch>=0.5.2
 6 | faster_whisper
 7 | funasr
 8 | jieba
 9 | jiwer
10 | librosa
11 | matplotlib
12 | pypinyin
13 | safetensors
14 | # torch>=2.0
15 | # torchaudio>=2.3.0
16 | torchdiffeq
17 | tqdm>=4.65.0
18 | transformers
19 | vocos
20 | wandb
21 | x_transformers>=1.31.14
22 | zhconv
23 | zhon
24 | cached_path
25 | pydub
26 | soundfile
27 | LangSegment
28 | numpy==1.26.4
29 | 


--------------------------------------------------------------------------------
/f5_model/backbones/README.md:
--------------------------------------------------------------------------------
 1 | ## Backbones quick introduction
 2 | 
 3 | 
 4 | ### unett.py
 5 | - flat unet transformer
 6 | - structure same as in e2-tts & voicebox paper except using rotary pos emb
 7 | - update: allow possible abs pos emb & convnextv2 blocks for embedded text before concat
 8 | 
 9 | ### dit.py
10 | - adaln-zero dit
11 | - embedded timestep as condition
12 | - noised_input + masked_cond + embedded_text concatenated, then a linear input projection
13 | - possible abs pos emb & convnextv2 blocks for embedded text before concat
14 | - possible long skip connection (first layer to last layer)
15 | 
16 | ### mmdit.py
17 | - sd3 structure
18 | - timestep as condition
19 | - left stream: text is embedded, then an abs pos emb is applied
20 | - right stream: masked_cond & noised_input concatenated, with the same conv pos emb as unett
21 | 


--------------------------------------------------------------------------------
/zh_normalization/README.md:
--------------------------------------------------------------------------------
 1 | ## Supported NSW (Non-Standard-Word) Normalization
 2 | 
 3 | |NSW type|raw|normalized|
 4 | |:--|:-|:-|
 5 | |serial number|电影中梁朝伟扮演的陈永仁的编号27149|电影中梁朝伟扮演的陈永仁的编号二七一四九|
 6 | |cardinal|这块黄金重达324.75克<br>我们班的最高总分为583分|这块黄金重达三百二十四点七五克<br>我们班的最高总分为五百八十三分|
 7 | |numeric range |12\~23<br>-1.5\~2|十二到二十三<br>负一点五到二|
 8 | |date|她出生于86年8月18日，她弟弟出生于1995年3月1日|她出生于八六年八月十八日， 她弟弟出生于一九九五年三月一日|
 9 | |time|等会请在12:05请通知我|等会请在十二点零五分请通知我|
10 | |temperature|今天的最低气温达到-10°C|今天的最低气温达到零下十度|
11 | |fraction|现场有7/12的观众投出了赞成票|现场有十二分之七的观众投出了赞成票|
12 | |percentage|明天有62％的概率降雨|明天有百分之六十二的概率降雨|
13 | |money|随便来几个价格12块5，34.5元，20.1万|随便来几个价格十二块五，三十四点五元，二十点一万|
14 | |telephone|这是固话0421-33441122<br>这是手机+86 18544139121|这是固话零四二一三三四四一一二二<br>这是手机八六一八五四四一三九一二一|
15 | ## References
16 | [Pull request #658 of DeepSpeech](https://github.com/PaddlePaddle/DeepSpeech/pull/658/files)
17 | 
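The "serial number" row above reads the number one digit at a time rather than as a cardinal. A minimal sketch of that idea follows; `read_digits` and `DIGIT_NAMES` are hypothetical names made up for this illustration and are not the package's own API.

```python
# Hypothetical helper (not part of zh_normalization): verbalize a digit
# string one digit at a time, as in the "serial number" row of the table.
DIGIT_NAMES = {
    "0": "零", "1": "一", "2": "二", "3": "三", "4": "四",
    "5": "五", "6": "六", "7": "七", "8": "八", "9": "九",
}


def read_digits(digits: str) -> str:
    """Map every ASCII digit to its Chinese reading and join the results."""
    return "".join(DIGIT_NAMES[d] for d in digits)


print(read_digits("27149"))
```

Cardinals such as 324.75 need place values (hundreds, tens, decimal point) and are therefore handled differently from this digit-wise reading.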


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2024 AIFSH
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # F5-TTS-ComfyUI
 2 | A custom node for [F5-TTS](https://github.com/SWivid/F5-TTS). You can find the [workflow here](./doc/base_workflow.json).
 3 | 
 4 | ## Weights
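 5 | Weights are downloaded from Hugging Face automatically. Users in mainland China can instead download the archive manually, extract it, and place the `F5-TTS` folder under `ComfyUI/models/AIFSH`: [download link](https://pan.quark.cn/s/e3a3e4281ada)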
 6 | 
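 7 | ## Tutorials
 8 | - [Demo video](https://www.bilibili.com/video/BV1Tjm5YLEsX)
 9 | - [One-click package containing four nodes (F5-TTS, FireRedTTS, JoyHallo, hallo2), continuously updated; one subscription includes 31 days of free updates](https://b23.tv/Zm3kPNP)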
10 | ## Disclaimer
11 | We do not hold any responsibility for any illegal usage of the codebase. Please refer to your local laws about DMCA and other related laws.
12 | ## Example
13 | 
14 | | gen_text | ref_audio | out_audio | audio_img |
15 | | -- | -- | -- | -- |
16 | |`你好，我是太乙真人！欢迎来四川找我玩`| <video src="https://github.com/user-attachments/assets/6758239e-9215-4301-ba06-ac9dad06c306" /> | <video src="https://github.com/user-attachments/assets/2f08ad54-0728-4542-84d3-6e8588b6ef3d" /> | ![](./doc/ComfyUI_temp_dgtgr_00001_.png) |
17 | |`有的人叫我自然，也有的人尊称我为自然母亲`| <video src="https://github.com/user-attachments/assets/89fde537-abba-4959-9e8f-03230d76014a" /> | <video src="https://github.com/user-attachments/assets/c4058295-1db1-4009-af7d-4c84339eae59" /> | ![](./doc/ComfyUI_temp_rhsxy_00001_.png) |
18 | 


--------------------------------------------------------------------------------
/zh_normalization/__init__.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | import re  # used by replace_punctuation; imported explicitly instead of relying on the star import
15 | from .text_normlization import *
16 | rep_map = {
17 |     "：": ",",
18 |     "；": ",",
19 |     "，": ",",
20 |     "。": ".",
21 |     "！": "!",
22 |     "？": "?",
23 |     "\n": ".",
24 |     "·": ",",
25 |     "、": ",",
26 |     "...": "…",
27 |     "$": ".",
28 |     "/": ",",
29 |     "—": "-",
30 |     "~": "…",
31 |     "～": "…",
32 | }
33 | 
34 | 
35 | def replace_punctuation(text):
36 |     text = text.replace("嗯", "恩").replace("呣", "母")
37 |     pattern = re.compile("|".join(re.escape(p) for p in rep_map.keys()))
38 | 
39 |     replaced_text = pattern.sub(lambda x: rep_map[x.group()], text)
40 |     punctuation = ["!", "?", "…", ",", "."]
41 |     replaced_text = re.sub(
42 |         r"[^\u4e00-\u9fa5" + "".join(punctuation) + r"]+", "", replaced_text
43 |     )
44 | 
45 |     return replaced_text
46 | 
47 | 
48 | def text_normalize(text):
49 |     # https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/paddlespeech/t2s/frontend/zh_normalization
50 |     tx = TextNormalizer()
51 |     sentences = tx.normalize(text)
52 |     dest_text = ""
53 |     for sentence in sentences:
54 |         dest_text += replace_punctuation(sentence)
55 |     return dest_text
56 | 
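The two passes in `replace_punctuation` (map full-width punctuation to ASCII, then drop everything that is neither a CJK character nor kept punctuation) can be tried in isolation. The sketch below is a self-contained approximation: `REP_MAP` is only a subset of the `rep_map` listed above, and the names `REP_MAP`, `KEEP`, and `replace_punctuation_sketch` are assumptions made for this example.

```python
import re

# Illustrative subset of rep_map; the real table has more entries.
REP_MAP = {"，": ",", "。": ".", "！": "!", "？": "?", "～": "…"}
KEEP = "!?…,."  # punctuation that survives the second pass


def replace_punctuation_sketch(text: str) -> str:
    # Pass 1: replace each mapped symbol with its ASCII counterpart.
    pattern = re.compile("|".join(re.escape(p) for p in REP_MAP))
    text = pattern.sub(lambda m: REP_MAP[m.group()], text)
    # Pass 2: delete runs of characters outside U+4E00..U+9FA5 and KEEP.
    return re.sub(r"[^\u4e00-\u9fa5" + re.escape(KEEP) + r"]+", "", text)


print(replace_punctuation_sketch("Hello，世界！OK？"))
```

Because the second pass removes Latin letters and digits, digits should already have been verbalized by the normalizer before this step runs, which is the order `text_normalize` uses.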


--------------------------------------------------------------------------------
/zh_normalization/quantifier.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | import re
15 | 
16 | from .num import num2str
17 | 
18 | # Temperature expression; in a temperature the minus sign is read differently
19 | # -3°C -> 零下三度 ("three degrees below zero")
20 | RE_TEMPERATURE = re.compile(r'(-?)(\d+(\.\d+)?)(°C|℃|度|摄氏度)')
21 | measure_dict = {
22 |     "cm2": "平方厘米",
23 |     "cm²": "平方厘米",
24 |     "cm3": "立方厘米",
25 |     "cm³": "立方厘米",
26 |     "cm": "厘米",
27 |     "db": "分贝",
28 |     "ds": "毫秒",
29 |     "kg": "千克",
30 |     "km": "千米",
31 |     "m2": "平方米",
32 |     "m²": "平方米",
33 |     "m³": "立方米",
34 |     "m3": "立方米",
35 |     "ml": "毫升",
36 |     "mm": "毫米",  # must come before "m", otherwise "mm" is rewritten as "米米"
37 |     "m": "米",
38 |     "s": "秒"
39 | }
40 | 
41 | 
42 | def replace_temperature(match) -> str:
43 |     """
44 |     Args:
45 |         match (re.Match)
46 |     Returns:
47 |         str
48 |     """
49 |     sign = match.group(1)
50 |     temperature = match.group(2)
51 |     unit = match.group(4)  # group(3) is the optional decimal part; group(4) is the unit
52 |     sign: str = "零下" if sign else ""
53 |     temperature: str = num2str(temperature)
54 |     unit: str = "摄氏度" if unit == "摄氏度" else "度"
55 |     result = f"{sign}{temperature}{unit}"
56 |     return result
57 | 
58 | 
59 | def replace_measure(sentence) -> str:
60 |     for q_notation in measure_dict:
61 |         if q_notation in sentence:
62 |             sentence = sentence.replace(q_notation, measure_dict[q_notation])
63 |     return sentence
64 | 
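Capture groups are numbered by their opening parenthesis, so in `RE_TEMPERATURE` the nested decimal part takes group 3 and the unit lands in group 4. The short check below copies the pattern from the file above and prints the groups for two sample strings chosen for this illustration.

```python
import re

# Pattern copied from quantifier.py.
RE_TEMPERATURE = re.compile(r'(-?)(\d+(\.\d+)?)(°C|℃|度|摄氏度)')

for sample in ("-10°C", "36.5摄氏度"):
    match = RE_TEMPERATURE.search(sample)
    # groups(): (sign, number, optional decimal part, unit)
    print(sample, match.groups())
```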


--------------------------------------------------------------------------------
/zh_normalization/phonecode.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | import re
15 | 
16 | from .num import verbalize_digit
17 | 
18 | # Normalize landline / mobile phone numbers
19 | # Mobile
20 | # http://www.jihaoba.com/news/show/13680
21 | # China Mobile: 139, 138, 137, 136, 135, 134, 159, 158, 157, 150, 151, 152, 188, 187, 182, 183, 184, 178, 198
22 | # China Unicom: 130, 131, 132, 156, 155, 186, 185, 176
23 | # China Telecom: 133, 153, 189, 180, 181, 177
24 | RE_MOBILE_PHONE = re.compile(
25 |     r"(?<!\d)((\+?86 ?)?1([38]\d|5[0-35-9]|7[678]|9[89])\d{8})(?!\d)")
26 | RE_TELEPHONE = re.compile(
27 |     r"(?<!\d)((0(10|2[1-3]|[3-9]\d{2})-?)?[1-9]\d{6,7})(?!\d)")
28 | 
29 | # Nationwide unified numbers, starting with 400
30 | RE_NATIONAL_UNIFORM_NUMBER = re.compile(r"(400)(-)?\d{3}(-)?\d{4}")
31 | 
32 | 
33 | def phone2str(phone_string: str, mobile=True) -> str:
34 |     if mobile:
35 |         sp_parts = phone_string.strip('+').split()
36 |         result = '，'.join(
37 |             [verbalize_digit(part, alt_one=True) for part in sp_parts])
38 |         return result
39 |     else:
40 |         sil_parts = phone_string.split('-')
41 |         result = '，'.join(
42 |             [verbalize_digit(part, alt_one=True) for part in sil_parts])
43 |         return result
44 | 
45 | 
46 | def replace_phone(match) -> str:
47 |     """
48 |     Args:
49 |         match (re.Match)
50 |     Returns:
51 |         str
52 |     """
53 |     return phone2str(match.group(0), mobile=False)
54 | 
55 | 
56 | def replace_mobile(match) -> str:
57 |     """
58 |     Args:
59 |         match (re.Match)
60 |     Returns:
61 |         str
62 |     """
63 |     return phone2str(match.group(0))
64 | 
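To see what `RE_MOBILE_PHONE` accepts and how `phone2str` splits a match before verbalizing it, the sketch below copies the pattern and reproduces only the `strip('+').split()` step. `split_mobile` is a hypothetical helper for this example; the digit verbalization done by `verbalize_digit` is deliberately left out.

```python
import re

# Pattern copied from phonecode.py.
RE_MOBILE_PHONE = re.compile(
    r"(?<!\d)((\+?86 ?)?1([38]\d|5[0-35-9]|7[678]|9[89])\d{8})(?!\d)")


def split_mobile(text: str):
    """Return the parts phone2str would verbalize, or None if nothing matches."""
    match = RE_MOBILE_PHONE.search(text)
    if match is None:
        return None
    # Same splitting as the mobile branch of phone2str.
    return match.group(0).strip('+').split()


print(split_mobile("这是手机+86 18544139121"))
print(split_mobile("12345678901"))  # prefix 12x is not in the pattern
```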


--------------------------------------------------------------------------------
/zh_normalization/constants.py:
--------------------------------------------------------------------------------
 1 | # Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | import re
15 | import string
16 | 
17 | from pypinyin.constants import SUPPORT_UCS4
18 | 
19 | # Full-width / half-width conversion
20 | # ASCII letters: full-width -> half-width mapping (num: 52)
21 | F2H_ASCII_LETTERS = {
22 |     ord(char) + 65248: ord(char)
23 |     for char in string.ascii_letters
24 | }
25 | 
26 | # ASCII letters: half-width -> full-width mapping
27 | H2F_ASCII_LETTERS = {value: key for key, value in F2H_ASCII_LETTERS.items()}
28 | 
29 | # Digits: full-width -> half-width mapping (num: 10)
30 | F2H_DIGITS = {ord(char) + 65248: ord(char) for char in string.digits}
31 | # Digits: half-width -> full-width mapping
32 | H2F_DIGITS = {value: key for key, value in F2H_DIGITS.items()}
33 | 
34 | # Punctuation: full-width -> half-width mapping (num: 32)
35 | F2H_PUNCTUATIONS = {ord(char) + 65248: ord(char) for char in string.punctuation}
36 | # Punctuation: half-width -> full-width mapping
37 | H2F_PUNCTUATIONS = {value: key for key, value in F2H_PUNCTUATIONS.items()}
38 | 
39 | # Space (num: 1)
40 | F2H_SPACE = {'\u3000': ' '}
41 | H2F_SPACE = {' ': '\u3000'}
42 | 
43 | # Strings that are not "Chinese characters with pinyin"; usable for NSW extraction
44 | if SUPPORT_UCS4:
45 |     RE_NSW = re.compile(r'(?:[^'
46 |                         r'\u3007'  # 〇
47 |                         r'\u3400-\u4dbf'  # CJK Extension A: [3400-4DBF]
48 |                         r'\u4e00-\u9fff'  # CJK Unified (basic): [4E00-9FFF]
49 |                         r'\uf900-\ufaff'  # CJK Compatibility: [F900-FAFF]
50 |                         r'\U00020000-\U0002A6DF'  # CJK Extension B: [20000-2A6DF]
51 |                         r'\U0002A703-\U0002B73F'  # CJK Extension C: [2A700-2B73F]
52 |                         r'\U0002B740-\U0002B81D'  # CJK Extension D: [2B740-2B81D]
53 |                         r'\U0002F80A-\U0002FA1F'  # CJK Compatibility Supplement: [2F800-2FA1F]
54 |                         r'])+')
55 | else:
56 |     RE_NSW = re.compile(  # pragma: no cover
57 |         r'(?:[^'
58 |         r'\u3007'  # 〇
59 |         r'\u3400-\u4dbf'  # CJK Extension A: [3400-4DBF]
60 |         r'\u4e00-\u9fff'  # CJK Unified (basic): [4E00-9FFF]
61 |         r'\uf900-\ufaff'  # CJK Compatibility: [F900-FAFF]
62 |         r'])+')
63 | 
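The full-width tables above rely on the fixed offset 65248 (0xFEE0) between a full-width character and its ASCII counterpart, and they are keyed by code point because `str.translate` looks characters up by ordinal. A minimal sketch, assuming a single merged table is acceptable; the names `OFFSET`, `F2H`, and `to_half_width` are made up for this example, and the space entry is keyed by ordinal here so that `translate` can find it.

```python
import string

OFFSET = 65248  # 0xFEE0: distance from an ASCII character to its full-width form

# One merged full-width -> half-width table, keyed by code point.
F2H = {
    ord(ch) + OFFSET: ord(ch)
    for ch in string.ascii_letters + string.digits + string.punctuation
}
F2H[0x3000] = ord(" ")  # ideographic space -> ASCII space


def to_half_width(text: str) -> str:
    """Convert full-width letters, digits, punctuation and spaces to ASCII."""
    return text.translate(F2H)


print(to_half_width("ＡＢＣ１２３！\u3000ok"))
```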


--------------------------------------------------------------------------------
/zh_normalization/chronology.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     http://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | import re
 15 | 
 16 | from .num import DIGITS
 17 | from .num import num2str
 18 | from .num import verbalize_cardinal
 19 | from .num import verbalize_digit
 20 | 
 21 | 
 22 | def _time_num2str(num_string: str) -> str:
 23 |     """A special case for verbalizing number in time."""
 24 |     result = num2str(num_string.lstrip('0'))
 25 |     if num_string.startswith('0'):
 26 |         result = DIGITS['0'] + result
 27 |     return result
 28 | 
 29 | 
 30 | # Time-of-day expression
 31 | RE_TIME = re.compile(r'([0-1]?[0-9]|2[0-3])'
 32 |                      r':([0-5][0-9])'
 33 |                      r'(:([0-5][0-9]))?')
 34 | 
 35 | # Time range, e.g. 8:30-12:30
 36 | RE_TIME_RANGE = re.compile(r'([0-1]?[0-9]|2[0-3])'
 37 |                            r':([0-5][0-9])'
 38 |                            r'(:([0-5][0-9]))?'
 39 |                            r'(~|-)'
 40 |                            r'([0-1]?[0-9]|2[0-3])'
 41 |                            r':([0-5][0-9])'
 42 |                            r'(:([0-5][0-9]))?')
 43 | 
 44 | 
 45 | def replace_time(match) -> str:
 46 |     """
 47 |     Args:
 48 |         match (re.Match)
 49 |     Returns:
 50 |         str
 51 |     """
 52 | 
 53 |     is_range = len(match.groups()) > 5
 54 | 
 55 |     hour = match.group(1)
 56 |     minute = match.group(2)
 57 |     second = match.group(4)
 58 | 
 59 |     if is_range:
 60 |         hour_2 = match.group(6)
 61 |         minute_2 = match.group(7)
 62 |         second_2 = match.group(9)
 63 | 
 64 |     result = f"{num2str(hour)}点"
 65 |     if minute.lstrip('0'):
 66 |         if int(minute) == 30:
 67 |             result += "半"
 68 |         else:
 69 |             result += f"{_time_num2str(minute)}分"
 70 |     if second and second.lstrip('0'):
 71 |         result += f"{_time_num2str(second)}秒"
 72 | 
 73 |     if is_range:
 74 |         result += "至"
 75 |         result += f"{num2str(hour_2)}点"
 76 |         if minute_2.lstrip('0'):
 77 |             if int(minute_2) == 30:
 78 |                 result += "半"
 79 |             else:
 80 |                 result += f"{_time_num2str(minute_2)}分"
 81 |         if second_2 and second_2.lstrip('0'):
 82 |             result += f"{_time_num2str(second_2)}秒"
 83 | 
 84 |     return result
 85 | 
 86 | 
 87 | RE_DATE = re.compile(r'(\d{4}|\d{2})年'
 88 |                      r'((0?[1-9]|1[0-2])月)?'
 89 |                      r'(((0?[1-9])|((1|2)[0-9])|30|31)([日号]))?')
 90 | 
 91 | 
 92 | def replace_date(match) -> str:
 93 |     """
 94 |     Args:
 95 |         match (re.Match)
 96 |     Returns:
 97 |         str
 98 |     """
 99 |     year = match.group(1)
100 |     month = match.group(3)
101 |     day = match.group(5)
102 |     result = ""
103 |     if year:
104 |         result += f"{verbalize_digit(year)}年"
105 |     if month:
106 |         result += f"{verbalize_cardinal(month)}月"
107 |     if day:
108 |         result += f"{verbalize_cardinal(day)}{match.group(9)}"
109 |     return result
110 | 
111 | 
112 | # Dates in YYYY/MM/DD or YYYY-MM-DD form, separated by / or -
113 | RE_DATE2 = re.compile(
114 |     r'(\d{4})([- /.])(0[1-9]|1[012])\2(0[1-9]|[12][0-9]|3[01])')
115 | 
116 | 
117 | def replace_date2(match) -> str:
118 |     """
119 |     Args:
120 |         match (re.Match)
121 |     Returns:
122 |         str
123 |     """
124 |     year = match.group(1)
125 |     month = match.group(3)
126 |     day = match.group(4)
127 |     result = ""
128 |     if year:
129 |         result += f"{verbalize_digit(year)}年"
130 |     if month:
131 |         result += f"{verbalize_cardinal(month)}月"
132 |     if day:
133 |         result += f"{verbalize_cardinal(day)}日"
134 |     return result
135 | 
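The single-time branch of `replace_time` can be exercised without the rest of the package. The sketch below copies `RE_TIME` and swaps in a small hypothetical `two_digit` helper for `num2str`; it only covers 0 to 59, which is enough for hours, minutes, and seconds. `speak_time` is likewise an illustrative name, and the range branch is omitted.

```python
import re

# Pattern copied from chronology.py.
RE_TIME = re.compile(r'([0-1]?[0-9]|2[0-3])'
                     r':([0-5][0-9])'
                     r'(:([0-5][0-9]))?')

DIGITS = "零一二三四五六七八九"


def two_digit(n: int) -> str:
    """Hypothetical stand-in for num2str, valid for 0..59 only."""
    tens, ones = divmod(n, 10)
    if tens == 0:
        return DIGITS[ones]
    head = "十" if tens == 1 else DIGITS[tens] + "十"
    return head + (DIGITS[ones] if ones else "")


def speak_part(part: str, unit: str) -> str:
    # A leading zero is read aloud, e.g. "05" -> 零五.
    prefix = "零" if part.startswith("0") else ""
    return prefix + two_digit(int(part)) + unit


def speak_time(match) -> str:
    hour, minute, second = match.group(1), match.group(2), match.group(4)
    result = two_digit(int(hour)) + "点"
    if int(minute) == 30:
        result += "半"
    elif int(minute):
        result += speak_part(minute, "分")
    if second and int(second):
        result += speak_part(second, "秒")
    return result


print(RE_TIME.sub(speak_time, "12:05"))
```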


--------------------------------------------------------------------------------
/f5_model/backbones/mmdit.py:
--------------------------------------------------------------------------------
  1 | """
  2 | ein notation:
  3 | b - batch
  4 | n - sequence
  5 | nt - text sequence
  6 | nw - raw wave length
  7 | d - dimension
  8 | """
  9 | 
 10 | from __future__ import annotations
 11 | 
 12 | import torch
 13 | from torch import nn
 14 | 
 15 | from einops import repeat
 16 | 
 17 | from x_transformers.x_transformers import RotaryEmbedding
 18 | 
 19 | from ..modules import (
 20 |     TimestepEmbedding,
 21 |     ConvPositionEmbedding,
 22 |     MMDiTBlock,
 23 |     AdaLayerNormZero_Final,
 24 |     precompute_freqs_cis, get_pos_embed_indices,
 25 | )
 26 | 
 27 | 
 28 | # text embedding
 29 | 
 30 | class TextEmbedding(nn.Module):
 31 |     def __init__(self, out_dim, text_num_embeds):
 32 |         super().__init__()
 33 |         self.text_embed = nn.Embedding(text_num_embeds + 1, out_dim)  # will use 0 as filler token
 34 | 
 35 |         self.precompute_max_pos = 1024
 36 |         self.register_buffer("freqs_cis", precompute_freqs_cis(out_dim, self.precompute_max_pos), persistent=False)
 37 | 
 38 |     def forward(self, text: int['b nt'], drop_text = False) -> int['b nt d']:
 39 |         text = text + 1
 40 |         if drop_text:
 41 |             text = torch.zeros_like(text)
 42 |         text = self.text_embed(text)
 43 | 
 44 |         # sinus pos emb
 45 |         batch_start = torch.zeros((text.shape[0],), dtype=torch.long)
 46 |         batch_text_len = text.shape[1]
 47 |         pos_idx = get_pos_embed_indices(batch_start, batch_text_len, max_pos=self.precompute_max_pos)
 48 |         text_pos_embed = self.freqs_cis[pos_idx]
 49 | 
 50 |         text = text + text_pos_embed
 51 | 
 52 |         return text
 53 | 
 54 | 
 55 | # noised input & masked cond audio embedding
 56 | 
 57 | class AudioEmbedding(nn.Module):
 58 |     def __init__(self, in_dim, out_dim):
 59 |         super().__init__()
 60 |         self.linear = nn.Linear(2 * in_dim, out_dim)
 61 |         self.conv_pos_embed = ConvPositionEmbedding(out_dim)
 62 | 
 63 |     def forward(self, x: float['b n d'], cond: float['b n d'], drop_audio_cond = False):
 64 |         if drop_audio_cond:
 65 |             cond = torch.zeros_like(cond)
 66 |         x = torch.cat((x, cond), dim = -1)
 67 |         x = self.linear(x)
 68 |         x = self.conv_pos_embed(x) + x
 69 |         return x
 70 |     
 71 | 
 72 | # Transformer backbone using MM-DiT blocks
 73 | 
 74 | class MMDiT(nn.Module):
 75 |     def __init__(self, *, 
 76 |                  dim, depth = 8, heads = 8, dim_head = 64, dropout = 0.1, ff_mult = 4,
 77 |                  text_num_embeds = 256, mel_dim = 100,
 78 |     ):
 79 |         super().__init__()
 80 | 
 81 |         self.time_embed = TimestepEmbedding(dim)
 82 |         self.text_embed = TextEmbedding(dim, text_num_embeds)
 83 |         self.audio_embed = AudioEmbedding(mel_dim, dim)
 84 | 
 85 |         self.rotary_embed = RotaryEmbedding(dim_head)
 86 | 
 87 |         self.dim = dim
 88 |         self.depth = depth
 89 |         
 90 |         self.transformer_blocks = nn.ModuleList(
 91 |             [
 92 |                 MMDiTBlock(
 93 |                     dim = dim,
 94 |                     heads = heads,
 95 |                     dim_head = dim_head,
 96 |                     dropout = dropout,
 97 |                     ff_mult = ff_mult,
 98 |                     context_pre_only = i == depth - 1,
 99 |                 )
100 |                 for i in range(depth)
101 |             ]
102 |         )
103 |         self.norm_out = AdaLayerNormZero_Final(dim)  # final modulation
104 |         self.proj_out = nn.Linear(dim, mel_dim)
105 | 
106 |     def forward(
107 |         self,
108 |         x: float['b n d'],     # noised input audio
109 |         cond: float['b n d'],  # masked cond audio
110 |         text: int['b nt'],     # text
111 |         time: float['b'] | float[''],  # time step
112 |         drop_audio_cond,  # cfg for cond audio
113 |         drop_text,        # cfg for text
114 |         mask: bool['b n'] | None = None,
115 |     ):
116 |         batch = x.shape[0]
117 |         if time.ndim == 0:
118 |             time = repeat(time, ' -> b', b = batch)
119 | 
120 |         # t: conditioning (time), c: context (text + masked cond audio), x: noised input audio
121 |         t = self.time_embed(time)
122 |         c = self.text_embed(text, drop_text = drop_text)
123 |         x = self.audio_embed(x, cond, drop_audio_cond = drop_audio_cond)
124 | 
125 |         seq_len = x.shape[1]
126 |         text_len = text.shape[1]
127 |         rope_audio = self.rotary_embed.forward_from_seq_len(seq_len)
128 |         rope_text = self.rotary_embed.forward_from_seq_len(text_len)
129 |         
130 |         for block in self.transformer_blocks:
131 |             c, x = block(x, c, t, mask = mask, rope = rope_audio, c_rope = rope_text)
132 | 
133 |         x = self.norm_out(x, t)
134 |         output = self.proj_out(x)
135 | 
136 |         return output
137 | 
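`TextEmbedding` reserves index 0 as a filler token, which is why the embedding table has `text_num_embeds + 1` rows and every token id is shifted by one. The pure-Python sketch below mirrors only that index handling; it assumes padded positions arrive as -1 (as the comment in dit.py states), and `prepare_text_ids` and `FILLER` are hypothetical names for this example.

```python
FILLER = 0  # index reserved for padding / dropped text


def prepare_text_ids(token_ids, drop_text=False):
    """List-based stand-in for the index handling in TextEmbedding.forward."""
    # Shift by one so that -1 padding lands on the filler index 0.
    shifted = [t + 1 for t in token_ids]
    if drop_text:
        # Same effect as torch.zeros_like(text): every position becomes filler.
        shifted = [FILLER] * len(shifted)
    return shifted


print(prepare_text_ids([4, 0, 7, -1, -1]))
print(prepare_text_ids([4, 0, 7, -1, -1], drop_text=True))
```

Dropping the text this way keeps the sequence length unchanged, so the text-free pass used for classifier-free guidance sees the same shapes as the conditioned pass.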


--------------------------------------------------------------------------------
/doc/base_workflow.json:
--------------------------------------------------------------------------------
  1 | {
  2 |   "last_node_id": 6,
  3 |   "last_link_id": 8,
  4 |   "nodes": [
  5 |     {
  6 |       "id": 2,
  7 |       "type": "PromptTextNode",
  8 |       "pos": {
  9 |         "0": 35,
 10 |         "1": 62
 11 |       },
 12 |       "size": {
 13 |         "0": 400,
 14 |         "1": 200
 15 |       },
 16 |       "flags": {},
 17 |       "order": 0,
 18 |       "mode": 0,
 19 |       "inputs": [],
 20 |       "outputs": [
 21 |         {
 22 |           "name": "TEXT",
 23 |           "type": "TEXT",
 24 |           "links": [
 25 |             5
 26 |           ],
 27 |           "slot_index": 0
 28 |         }
 29 |       ],
 30 |       "properties": {
 31 |         "Node name for S&R": "PromptTextNode"
 32 |       },
 33 |       "widgets_values": [
 34 |         "你好，我是顶针！欢迎来四川找我玩"
 35 |       ]
 36 |     },
 37 |     {
 38 |       "id": 3,
 39 |       "type": "LoadAudio",
 40 |       "pos": {
 41 |         "0": 58,
 42 |         "1": 345
 43 |       },
 44 |       "size": {
 45 |         "0": 315,
 46 |         "1": 124
 47 |       },
 48 |       "flags": {},
 49 |       "order": 1,
 50 |       "mode": 0,
 51 |       "inputs": [],
 52 |       "outputs": [
 53 |         {
 54 |           "name": "AUDIO",
 55 |           "type": "AUDIO",
 56 |           "links": [
 57 |             6
 58 |           ],
 59 |           "slot_index": 0
 60 |         }
 61 |       ],
 62 |       "properties": {
 63 |         "Node name for S&R": "LoadAudio"
 64 |       },
 65 |       "widgets_values": [
 66 |         "dingzhen_0.wav",
 67 |         null,
 68 |         ""
 69 |       ]
 70 |     },
 71 |     {
 72 |       "id": 6,
 73 |       "type": "F5TTSNode",
 74 |       "pos": {
 75 |         "0": 491,
 76 |         "1": 191
 77 |       },
 78 |       "size": {
 79 |         "0": 412.60003662109375,
 80 |         "1": 256.20001220703125
 81 |       },
 82 |       "flags": {},
 83 |       "order": 2,
 84 |       "mode": 0,
 85 |       "inputs": [
 86 |         {
 87 |           "name": "gen_text",
 88 |           "type": "TEXT",
 89 |           "link": 5
 90 |         },
 91 |         {
 92 |           "name": "ref_audio",
 93 |           "type": "AUDIO",
 94 |           "link": 6
 95 |         },
 96 |         {
 97 |           "name": "ref_text",
 98 |           "type": "TEXT",
 99 |           "link": null,
100 |           "shape": 7
101 |         }
102 |       ],
103 |       "outputs": [
104 |         {
105 |           "name": "AUDIO",
106 |           "type": "AUDIO",
107 |           "links": [
108 |             7
109 |           ],
110 |           "slot_index": 0
111 |         },
112 |         {
113 |           "name": "IMAGE",
114 |           "type": "IMAGE",
115 |           "links": [
116 |             8
117 |           ],
118 |           "slot_index": 1
119 |         }
120 |       ],
121 |       "properties": {
122 |         "Node name for S&R": "F5TTSNode"
123 |       },
124 |       "widgets_values": [
125 |         "F5-TTS",
126 |         1,
127 |         true,
128 |         "but,however,nevertheless,yet,still,therefore,thus,hence,consequently,moreover,furthermore,additionally,meanwhile,alternatively,otherwise,namely,specifically,for example,such as,in fact,indeed,notably,in contrast,on the other hand,conversely,in conclusion,to summarize,finally"
129 |       ]
130 |     },
131 |     {
132 |       "id": 5,
133 |       "type": "PreviewImage",
134 |       "pos": {
135 |         "0": 995,
136 |         "1": 318
137 |       },
138 |       "size": {
139 |         "0": 210,
140 |         "1": 246
141 |       },
142 |       "flags": {},
143 |       "order": 4,
144 |       "mode": 0,
145 |       "inputs": [
146 |         {
147 |           "name": "images",
148 |           "type": "IMAGE",
149 |           "link": 8
150 |         }
151 |       ],
152 |       "outputs": [],
153 |       "properties": {
154 |         "Node name for S&R": "PreviewImage"
155 |       },
156 |       "widgets_values": []
157 |     },
158 |     {
159 |       "id": 4,
160 |       "type": "PreviewAudio",
161 |       "pos": {
162 |         "0": 965,
163 |         "1": 68
164 |       },
165 |       "size": {
166 |         "0": 315,
167 |         "1": 76
168 |       },
169 |       "flags": {},
170 |       "order": 3,
171 |       "mode": 0,
172 |       "inputs": [
173 |         {
174 |           "name": "audio",
175 |           "type": "AUDIO",
176 |           "link": 7
177 |         }
178 |       ],
179 |       "outputs": [],
180 |       "properties": {
181 |         "Node name for S&R": "PreviewAudio"
182 |       },
183 |       "widgets_values": [
184 |         null
185 |       ]
186 |     }
187 |   ],
188 |   "links": [
189 |     [
190 |       5,
191 |       2,
192 |       0,
193 |       6,
194 |       0,
195 |       "TEXT"
196 |     ],
197 |     [
198 |       6,
199 |       3,
200 |       0,
201 |       6,
202 |       1,
203 |       "AUDIO"
204 |     ],
205 |     [
206 |       7,
207 |       6,
208 |       0,
209 |       4,
210 |       0,
211 |       "AUDIO"
212 |     ],
213 |     [
214 |       8,
215 |       6,
216 |       1,
217 |       5,
218 |       0,
219 |       "IMAGE"
220 |     ]
221 |   ],
222 |   "groups": [],
223 |   "config": {},
224 |   "extra": {
225 |     "ds": {
226 |       "scale": 1,
227 |       "offset": [
228 |         0,
229 |         0
230 |       ]
231 |     }
232 |   },
233 |   "version": 0.4
234 | }
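Each entry in the `links` array above appears to follow the ComfyUI layout `[link_id, source_node, source_slot, target_node, target_slot, type]`. That reading is an assumption, though it is consistent with the `PreviewAudio` node (id 4) listing link 7 as its `audio` input. A minimal sketch that decodes the four links; the `Link` tuple and `decode_links` helper are hypothetical names for illustration, not part of ComfyUI:

```python
from typing import List, NamedTuple


class Link(NamedTuple):
    """One workflow edge; field order is the assumed ComfyUI link layout."""
    link_id: int
    source_node: int
    source_slot: int
    target_node: int
    target_slot: int
    data_type: str


# The four link entries copied from base_workflow.json.
RAW_LINKS = [
    [5, 2, 0, 6, 0, "TEXT"],
    [6, 3, 0, 6, 1, "AUDIO"],
    [7, 6, 0, 4, 0, "AUDIO"],
    [8, 6, 1, 5, 0, "IMAGE"],
]


def decode_links(raw_links: List[list]) -> List[Link]:
    """Turn each positional link array into a named tuple."""
    return [Link(*entry) for entry in raw_links]


links = decode_links(RAW_LINKS)
for link in links:
    print(
        f"node {link.source_node}[{link.source_slot}] -> "
        f"node {link.target_node}[{link.target_slot}] ({link.data_type})"
    )
```

Read this way, node 6 takes a TEXT and an AUDIO input and feeds its two outputs to the audio and image preview nodes.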


--------------------------------------------------------------------------------
/f5_model/backbones/dit.py:
--------------------------------------------------------------------------------
  1 | """
  2 | ein notation:
  3 | b - batch
  4 | n - sequence
  5 | nt - text sequence
  6 | nw - raw wave length
  7 | d - dimension
  8 | """
  9 | 
 10 | from __future__ import annotations
 11 | 
 12 | import torch
 13 | from torch import nn
 14 | import torch.nn.functional as F
 15 | 
 16 | from einops import repeat
 17 | 
 18 | from x_transformers.x_transformers import RotaryEmbedding
 19 | 
 20 | from ..modules import (
 21 |     TimestepEmbedding,
 22 |     ConvNeXtV2Block,
 23 |     ConvPositionEmbedding,
 24 |     DiTBlock,
 25 |     AdaLayerNormZero_Final,
 26 |     precompute_freqs_cis, get_pos_embed_indices,
 27 | )
 28 | 
 29 | 
 30 | # Text embedding
 31 | 
 32 | class TextEmbedding(nn.Module):
 33 |     def __init__(self, text_num_embeds, text_dim, conv_layers = 0, conv_mult = 2):
 34 |         super().__init__()
 35 |         self.text_embed = nn.Embedding(text_num_embeds + 1, text_dim)  # use 0 as filler token
 36 | 
 37 |         if conv_layers > 0:
 38 |             self.extra_modeling = True
 39 |             self.precompute_max_pos = 4096  # ~44s of 24khz audio
 40 |             self.register_buffer("freqs_cis", precompute_freqs_cis(text_dim, self.precompute_max_pos), persistent=False)
 41 |             self.text_blocks = nn.Sequential(*[ConvNeXtV2Block(text_dim, text_dim * conv_mult) for _ in range(conv_layers)])
 42 |         else:
 43 |             self.extra_modeling = False
 44 | 
 45 |     def forward(self, text: int['b nt'], seq_len, drop_text = False):
 46 |         text = text + 1  # use 0 as filler token. preprocess of batch pad -1, see list_str_to_idx()
 47 |         text = text[:, :seq_len]  # curtail if character tokens are more than the mel spec tokens
 48 |         batch, text_len = text.shape[0], text.shape[1]  # measured after truncation so the pad below is never negative
 49 |         text = F.pad(text, (0, seq_len - text_len), value = 0)
 50 | 
 51 |         if drop_text:  # cfg for text
 52 |             text = torch.zeros_like(text)
 53 | 
 54 |         text = self.text_embed(text) # b n -> b n d
 55 | 
 56 |         # possible extra modeling
 57 |         if self.extra_modeling:
 58 |             # sinusoidal position embedding
 59 |             batch_start = torch.zeros((batch,), dtype=torch.long)
 60 |             pos_idx = get_pos_embed_indices(batch_start, seq_len, max_pos=self.precompute_max_pos)
 61 |             text_pos_embed = self.freqs_cis[pos_idx]
 62 |             text = text + text_pos_embed
 63 | 
 64 |             # convnextv2 blocks
 65 |             text = self.text_blocks(text)
 66 | 
 67 |         return text
 68 | 
 69 | 
 70 | # noised input audio and context mixing embedding
 71 | 
 72 | class InputEmbedding(nn.Module):
 73 |     def __init__(self, mel_dim, text_dim, out_dim):
 74 |         super().__init__()
 75 |         self.proj = nn.Linear(mel_dim * 2 + text_dim, out_dim)
 76 |         self.conv_pos_embed = ConvPositionEmbedding(dim = out_dim)
 77 | 
 78 |     def forward(self, x: float['b n d'], cond: float['b n d'], text_embed: float['b n d'], drop_audio_cond = False):
 79 |         if drop_audio_cond:  # cfg for cond audio
 80 |             cond = torch.zeros_like(cond)
 81 | 
 82 |         x = self.proj(torch.cat((x, cond, text_embed), dim = -1))
 83 |         x = self.conv_pos_embed(x) + x
 84 |         return x
 85 |     
 86 | 
 87 | # Transformer backbone using DiT blocks
 88 | 
 89 | class DiT(nn.Module):
 90 |     def __init__(self, *, 
 91 |                  dim, depth = 8, heads = 8, dim_head = 64, dropout = 0.1, ff_mult = 4,
 92 |                  mel_dim = 100, text_num_embeds = 256, text_dim = None, conv_layers = 0,
 93 |                  long_skip_connection = False,
 94 |     ):
 95 |         super().__init__()
 96 | 
 97 |         self.time_embed = TimestepEmbedding(dim)
 98 |         if text_dim is None:
 99 |             text_dim = mel_dim
100 |         self.text_embed = TextEmbedding(text_num_embeds, text_dim, conv_layers = conv_layers)
101 |         self.input_embed = InputEmbedding(mel_dim, text_dim, dim)
102 | 
103 |         self.rotary_embed = RotaryEmbedding(dim_head)
104 | 
105 |         self.dim = dim
106 |         self.depth = depth
107 |         
108 |         self.transformer_blocks = nn.ModuleList(
109 |             [
110 |                 DiTBlock(
111 |                     dim = dim,
112 |                     heads = heads,
113 |                     dim_head = dim_head,
114 |                     ff_mult = ff_mult,
115 |                     dropout = dropout
116 |                 )
117 |                 for _ in range(depth)
118 |             ]
119 |         )
120 |         self.long_skip_connection = nn.Linear(dim * 2, dim, bias = False) if long_skip_connection else None
121 |         
122 |         self.norm_out = AdaLayerNormZero_Final(dim)  # final modulation
123 |         self.proj_out = nn.Linear(dim, mel_dim)
124 | 
125 |     def forward(
126 |         self,
127 |         x: float['b n d'],     # noised input audio
128 |         cond: float['b n d'],  # masked cond audio
129 |         text: int['b nt'],     # text
130 |         time: float['b'] | float[''],  # time step
131 |         drop_audio_cond,  # cfg for cond audio
132 |         drop_text,        # cfg for text
133 |         mask: bool['b n'] | None = None,
134 |     ):
135 |         batch, seq_len = x.shape[0], x.shape[1]
136 |         if time.ndim == 0:
137 |             time = repeat(time, ' -> b', b = batch)
138 |         
139 |         # t: conditioning time, c: context (text + masked cond audio), x: noised input audio
140 |         t = self.time_embed(time)
141 |         text_embed = self.text_embed(text, seq_len, drop_text = drop_text)
142 |         x = self.input_embed(x, cond, text_embed, drop_audio_cond = drop_audio_cond)
143 |         
144 |         rope = self.rotary_embed.forward_from_seq_len(seq_len)
145 | 
146 |         if self.long_skip_connection is not None:
147 |             residual = x
148 | 
149 |         for block in self.transformer_blocks:
150 |             x = block(x, t, mask = mask, rope = rope)
151 | 
152 |         if self.long_skip_connection is not None:
153 |             x = self.long_skip_connection(torch.cat((x, residual), dim = -1))
154 | 
155 |         x = self.norm_out(x, t)
156 |         output = self.proj_out(x)
157 | 
158 |         return output
159 | 
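The token alignment done in `TextEmbedding.forward` can be illustrated without tensors. The sketch below is a plain-Python stand-in, not the module itself; `align_text_tokens` is a hypothetical helper name. It shifts ids by one so that the `-1` batch padding becomes the filler id `0`, truncates to the mel length, right-pads with `0`, and zeroes everything when text is dropped for classifier-free guidance:

```python
from typing import List


def align_text_tokens(tokens: List[int], seq_len: int, drop_text: bool = False) -> List[int]:
    """Mimic the shift / truncate / pad steps on a single token sequence."""
    shifted = [t + 1 for t in tokens]      # -1 padding becomes filler id 0
    shifted = shifted[:seq_len]            # never longer than the mel sequence
    padded = shifted + [0] * (seq_len - len(shifted))  # right-pad with filler
    if drop_text:                          # cfg: replace all text with filler
        padded = [0] * seq_len
    return padded


print(align_text_tokens([3, 7, -1], seq_len=5))
print(align_text_tokens([3, 7, 9], seq_len=2))
```

The pad length is computed from the already truncated sequence, so it is never negative.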


--------------------------------------------------------------------------------
/zh_normalization/text_normlization.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     http://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | import re
 15 | from typing import List
 16 | 
 17 | from .char_convert import tranditional_to_simplified
 18 | from .chronology import RE_DATE
 19 | from .chronology import RE_DATE2
 20 | from .chronology import RE_TIME
 21 | from .chronology import RE_TIME_RANGE
 22 | from .chronology import replace_date
 23 | from .chronology import replace_date2
 24 | from .chronology import replace_time
 25 | from .constants import F2H_ASCII_LETTERS
 26 | from .constants import F2H_DIGITS
 27 | from .constants import F2H_SPACE
 28 | from .num import RE_DECIMAL_NUM
 29 | from .num import RE_DEFAULT_NUM
 30 | from .num import RE_FRAC
 31 | from .num import RE_INTEGER
 32 | from .num import RE_NUMBER
 33 | from .num import RE_PERCENTAGE
 34 | from .num import RE_POSITIVE_QUANTIFIERS
 35 | from .num import RE_RANGE
 36 | from .num import RE_TO_RANGE
 37 | from .num import RE_ASMD
 38 | from .num import replace_default_num
 39 | from .num import replace_frac
 40 | from .num import replace_negative_num
 41 | from .num import replace_number
 42 | from .num import replace_percentage
 43 | from .num import replace_positive_quantifier
 44 | from .num import replace_range
 45 | from .num import replace_to_range
 46 | from .num import replace_asmd
 47 | from .phonecode import RE_MOBILE_PHONE
 48 | from .phonecode import RE_NATIONAL_UNIFORM_NUMBER
 49 | from .phonecode import RE_TELEPHONE
 50 | from .phonecode import replace_mobile
 51 | from .phonecode import replace_phone
 52 | from .quantifier import RE_TEMPERATURE
 53 | from .quantifier import replace_measure
 54 | from .quantifier import replace_temperature
 55 | 
 56 | 
 57 | class TextNormalizer():
 58 |     def __init__(self):
 59 |         self.SENTENCE_SPLITOR = re.compile(r'([：、，；。？！,;?!][”’]?)')
 60 | 
 61 |     def _split(self, text: str, lang="zh") -> List[str]:
 62 |         """Split long text into sentences with sentence-splitting punctuations.
 63 |         Args:
 64 |             text (str): The input text.
 65 |         Returns:
 66 |             List[str]: Sentences.
 67 |         """
 68 |         # Only for pure Chinese here
 69 |         if lang == "zh":
 70 |             text = text.replace(" ", "")
 71 |             # filter out special characters
 72 |             text = re.sub(r'[——《》【】<>{}()（）#&@“”^_|\\]', '', text)
 73 |         text = self.SENTENCE_SPLITOR.sub(r'\1\n', text)
 74 |         text = text.strip()
 75 |         sentences = [sentence.strip() for sentence in re.split(r'\n+', text)]
 76 |         return sentences
 77 | 
 78 |     def _post_replace(self, sentence: str) -> str:
 79 |         sentence = sentence.replace('/', '每')
 80 |         # sentence = sentence.replace('~', '至')
 81 |         # sentence = sentence.replace('～', '至')
 82 |         sentence = sentence.replace('①', '一')
 83 |         sentence = sentence.replace('②', '二')
 84 |         sentence = sentence.replace('③', '三')
 85 |         sentence = sentence.replace('④', '四')
 86 |         sentence = sentence.replace('⑤', '五')
 87 |         sentence = sentence.replace('⑥', '六')
 88 |         sentence = sentence.replace('⑦', '七')
 89 |         sentence = sentence.replace('⑧', '八')
 90 |         sentence = sentence.replace('⑨', '九')
 91 |         sentence = sentence.replace('⑩', '十')
 92 |         sentence = sentence.replace('α', '阿尔法')
 93 |         sentence = sentence.replace('β', '贝塔')
 94 |         sentence = sentence.replace('γ', '伽玛').replace('Γ', '伽玛')
 95 |         sentence = sentence.replace('δ', '德尔塔').replace('Δ', '德尔塔')
 96 |         sentence = sentence.replace('ε', '艾普西龙')
 97 |         sentence = sentence.replace('ζ', '捷塔')
 98 |         sentence = sentence.replace('η', '依塔')
 99 |         sentence = sentence.replace('θ', '西塔').replace('Θ', '西塔')
100 |         sentence = sentence.replace('ι', '艾欧塔')
101 |         sentence = sentence.replace('κ', '喀帕')
102 |         sentence = sentence.replace('λ', '拉姆达').replace('Λ', '拉姆达')
103 |         sentence = sentence.replace('μ', '缪')
104 |         sentence = sentence.replace('ν', '拗')
105 |         sentence = sentence.replace('ξ', '克西').replace('Ξ', '克西')
106 |         sentence = sentence.replace('ο', '欧米克伦')
107 |         sentence = sentence.replace('π', '派').replace('Π', '派')
108 |         sentence = sentence.replace('ρ', '肉')
109 |         sentence = sentence.replace('ς', '西格玛').replace('Σ', '西格玛').replace(
110 |             'σ', '西格玛')
111 |         sentence = sentence.replace('τ', '套')
112 |         sentence = sentence.replace('υ', '宇普西龙')
113 |         sentence = sentence.replace('φ', '服艾').replace('Φ', '服艾')
114 |         sentence = sentence.replace('χ', '器')
115 |         sentence = sentence.replace('ψ', '普赛').replace('Ψ', '普赛')
116 |         sentence = sentence.replace('ω', '欧米伽').replace('Ω', '欧米伽')
117 |         # filter special characters again; unlike the filter in _split (line 72), this pattern also removes "-" and "="
118 |         sentence = re.sub(r'[-——《》【】<=>{}()（）#&@“”^_|\\]', '', sentence)
119 |         return sentence
120 | 
121 |     def normalize_sentence(self, sentence: str) -> str:
122 |         # basic character conversions
123 |         sentence = tranditional_to_simplified(sentence)
124 |         sentence = sentence.translate(F2H_ASCII_LETTERS).translate(
125 |             F2H_DIGITS).translate(F2H_SPACE)
126 | 
127 |         # number related NSW verbalization
128 |         sentence = RE_DATE.sub(replace_date, sentence)
129 |         sentence = RE_DATE2.sub(replace_date2, sentence)
130 | 
131 |         # range first
132 |         sentence = RE_TIME_RANGE.sub(replace_time, sentence)
133 |         sentence = RE_TIME.sub(replace_time, sentence)
134 | 
135 |         # handle the tilde (~) by replacing it with 至 ("to")
136 |         sentence = RE_TO_RANGE.sub(replace_to_range, sentence)
137 |         sentence = RE_TEMPERATURE.sub(replace_temperature, sentence)
138 |         sentence = replace_measure(sentence)
139 |         sentence = RE_FRAC.sub(replace_frac, sentence)
140 |         sentence = RE_PERCENTAGE.sub(replace_percentage, sentence)
141 |         sentence = RE_MOBILE_PHONE.sub(replace_mobile, sentence)
142 | 
143 |         sentence = RE_TELEPHONE.sub(replace_phone, sentence)
144 |         sentence = RE_NATIONAL_UNIFORM_NUMBER.sub(replace_phone, sentence)
145 | 
146 |         sentence = RE_RANGE.sub(replace_range, sentence)
147 | 
148 |         # handle addition, subtraction, multiplication, and division
149 |         while RE_ASMD.search(sentence):
150 |             sentence = RE_ASMD.sub(replace_asmd, sentence)
151 | 
152 |         sentence = RE_INTEGER.sub(replace_negative_num, sentence)
153 |         sentence = RE_DECIMAL_NUM.sub(replace_number, sentence)
154 |         sentence = RE_POSITIVE_QUANTIFIERS.sub(replace_positive_quantifier,
155 |                                                sentence)
156 |         sentence = RE_DEFAULT_NUM.sub(replace_default_num, sentence)
157 |         sentence = RE_NUMBER.sub(replace_number, sentence)
158 |         sentence = self._post_replace(sentence)
159 | 
160 |         return sentence
161 | 
162 |     def normalize(self, text: str) -> List[str]:
163 |         sentences = self._split(text)
164 |         sentences = [self.normalize_sentence(sent) for sent in sentences]
165 |         return sentences
166 | 
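The splitting step in `TextNormalizer._split` can be tried in isolation. The sketch below reuses the same punctuation pattern but omits the special-character filter, so it is a simplified stand-in rather than the class itself; `split_sentences` is a hypothetical name:

```python
import re
from typing import List

# Same punctuation set as SENTENCE_SPLITOR: a splitting mark, optionally
# followed by a closing quote, is kept at the end of its sentence.
SENTENCE_SPLITTER = re.compile(r'([：、，；。？！,;?!][”’]?)')


def split_sentences(text: str) -> List[str]:
    """Insert a newline after each splitting mark, then split on newlines."""
    text = text.replace(" ", "")                 # pure-Chinese assumption
    text = SENTENCE_SPLITTER.sub(r'\1\n', text)  # keep the mark, add a break
    return [s.strip() for s in re.split(r'\n+', text.strip())]


print(split_sentences("你好，世界。"))
```

Because the punctuation is captured and re-emitted before the newline, each sentence keeps its trailing mark.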


--------------------------------------------------------------------------------
/f5_model/backbones/unett.py:
--------------------------------------------------------------------------------
  1 | """
  2 | ein notation:
  3 | b - batch
  4 | n - sequence
  5 | nt - text sequence
  6 | nw - raw wave length
  7 | d - dimension
  8 | """
  9 | 
 10 | from __future__ import annotations
 11 | from typing import Literal
 12 | 
 13 | import torch
 14 | from torch import nn
 15 | import torch.nn.functional as F
 16 | 
 17 | from einops import repeat, pack, unpack
 18 | 
 19 | from x_transformers import RMSNorm
 20 | from x_transformers.x_transformers import RotaryEmbedding
 21 | 
 22 | from ..modules import (
 23 |     TimestepEmbedding,
 24 |     ConvNeXtV2Block,
 25 |     ConvPositionEmbedding,
 26 |     Attention,
 27 |     AttnProcessor,
 28 |     FeedForward,
 29 |     precompute_freqs_cis, get_pos_embed_indices,
 30 | )
 31 | 
 32 | 
 33 | # Text embedding
 34 | 
 35 | class TextEmbedding(nn.Module):
 36 |     def __init__(self, text_num_embeds, text_dim, conv_layers = 0, conv_mult = 2):
 37 |         super().__init__()
 38 |         self.text_embed = nn.Embedding(text_num_embeds + 1, text_dim)  # use 0 as filler token
 39 | 
 40 |         if conv_layers > 0:
 41 |             self.extra_modeling = True
 42 |             self.precompute_max_pos = 4096  # ~44s of 24khz audio
 43 |             self.register_buffer("freqs_cis", precompute_freqs_cis(text_dim, self.precompute_max_pos), persistent=False)
 44 |             self.text_blocks = nn.Sequential(*[ConvNeXtV2Block(text_dim, text_dim * conv_mult) for _ in range(conv_layers)])
 45 |         else:
 46 |             self.extra_modeling = False
 47 | 
 48 |     def forward(self, text: int['b nt'], seq_len, drop_text = False):
 49 |         text = text + 1  # use 0 as filler token. preprocess of batch pad -1, see list_str_to_idx()
 50 |         text = text[:, :seq_len]  # curtail if character tokens are more than the mel spec tokens
 51 |         batch, text_len = text.shape[0], text.shape[1]  # measured after truncation so the pad below is never negative
 52 |         text = F.pad(text, (0, seq_len - text_len), value = 0)
 53 | 
 54 |         if drop_text:  # cfg for text
 55 |             text = torch.zeros_like(text)
 56 | 
 57 |         text = self.text_embed(text) # b n -> b n d
 58 | 
 59 |         # possible extra modeling
 60 |         if self.extra_modeling:
 61 |             # sinusoidal position embedding
 62 |             batch_start = torch.zeros((batch,), dtype=torch.long)
 63 |             pos_idx = get_pos_embed_indices(batch_start, seq_len, max_pos=self.precompute_max_pos)
 64 |             text_pos_embed = self.freqs_cis[pos_idx]
 65 |             text = text + text_pos_embed
 66 | 
 67 |             # convnextv2 blocks
 68 |             text = self.text_blocks(text)
 69 | 
 70 |         return text
 71 | 
 72 | 
 73 | # noised input audio and context mixing embedding
 74 | 
 75 | class InputEmbedding(nn.Module):
 76 |     def __init__(self, mel_dim, text_dim, out_dim):
 77 |         super().__init__()
 78 |         self.proj = nn.Linear(mel_dim * 2 + text_dim, out_dim)
 79 |         self.conv_pos_embed = ConvPositionEmbedding(dim = out_dim)
 80 | 
 81 |     def forward(self, x: float['b n d'], cond: float['b n d'], text_embed: float['b n d'], drop_audio_cond = False):
 82 |         if drop_audio_cond:  # cfg for cond audio
 83 |             cond = torch.zeros_like(cond)
 84 | 
 85 |         x = self.proj(torch.cat((x, cond, text_embed), dim = -1))
 86 |         x = self.conv_pos_embed(x) + x
 87 |         return x
 88 | 
 89 | 
 90 | # Flat UNet Transformer backbone
 91 | 
 92 | class UNetT(nn.Module):
 93 |     def __init__(self, *,
 94 |                  dim, depth = 8, heads = 8, dim_head = 64, dropout = 0.1, ff_mult = 4,
 95 |                  mel_dim = 100, text_num_embeds = 256, text_dim = None, conv_layers = 0,
 96 |                  skip_connect_type: Literal['add', 'concat', 'none'] = 'concat',
 97 |     ):
 98 |         super().__init__()
 99 |         assert depth % 2 == 0, "UNet-Transformer's depth should be even."
100 | 
101 |         self.time_embed = TimestepEmbedding(dim)
102 |         if text_dim is None:
103 |             text_dim = mel_dim
104 |         self.text_embed = TextEmbedding(text_num_embeds, text_dim, conv_layers = conv_layers)
105 |         self.input_embed = InputEmbedding(mel_dim, text_dim, dim)
106 | 
107 |         self.rotary_embed = RotaryEmbedding(dim_head)
108 | 
109 |         # transformer layers & skip connections
110 | 
111 |         self.dim = dim
112 |         self.skip_connect_type = skip_connect_type
113 |         needs_skip_proj = skip_connect_type == 'concat'
114 | 
115 |         self.depth = depth
116 |         self.layers = nn.ModuleList([])
117 | 
118 |         for idx in range(depth):
119 |             is_later_half = idx >= (depth // 2)
120 | 
121 |             attn_norm = RMSNorm(dim)
122 |             attn = Attention(
123 |                 processor = AttnProcessor(),
124 |                 dim = dim,
125 |                 heads = heads,
126 |                 dim_head = dim_head,
127 |                 dropout = dropout,
128 |                 )
129 | 
130 |             ff_norm = RMSNorm(dim)
131 |             ff = FeedForward(dim = dim, mult = ff_mult, dropout = dropout, approximate = "tanh")
132 | 
133 |             skip_proj = nn.Linear(dim * 2, dim, bias = False) if needs_skip_proj and is_later_half else None
134 | 
135 |             self.layers.append(nn.ModuleList([
136 |                 skip_proj,
137 |                 attn_norm,
138 |                 attn,
139 |                 ff_norm,
140 |                 ff,
141 |             ]))
142 | 
143 |         self.norm_out = RMSNorm(dim)
144 |         self.proj_out = nn.Linear(dim, mel_dim)
145 | 
146 |     def forward(
147 |         self,
148 |         x: float['b n d'],     # noised input audio
149 |         cond: float['b n d'],  # masked cond audio
150 |         text: int['b nt'],     # text
151 |         time: float['b'] | float[''],  # time step
152 |         drop_audio_cond,  # cfg for cond audio
153 |         drop_text,        # cfg for text
154 |         mask: bool['b n'] | None = None,
155 |     ):
156 |         batch, seq_len = x.shape[0], x.shape[1]
157 |         if time.ndim == 0:
158 |             time = repeat(time, ' -> b', b = batch)
159 |         
160 |         # t: conditioning time, c: context (text + masked cond audio), x: noised input audio
161 |         t = self.time_embed(time)
162 |         text_embed = self.text_embed(text, seq_len, drop_text = drop_text)
163 |         x = self.input_embed(x, cond, text_embed, drop_audio_cond = drop_audio_cond)
164 | 
165 |         # prefix time t to input x, [b n d] -> [b n+1 d]
166 |         x, ps = pack((t, x), 'b * d')
167 |         if mask is not None:
168 |             mask = F.pad(mask, (1, 0), value=1)
169 |         
170 |         rope = self.rotary_embed.forward_from_seq_len(seq_len + 1)
171 | 
172 |         # flat unet transformer
173 |         skip_connect_type = self.skip_connect_type
174 |         skips = []
175 |         for idx, (maybe_skip_proj, attn_norm, attn, ff_norm, ff) in enumerate(self.layers):
176 |             layer = idx + 1
177 | 
178 |             # skip connection logic
179 |             is_first_half = layer <= (self.depth // 2)
180 |             is_later_half = not is_first_half
181 | 
182 |             if is_first_half:
183 |                 skips.append(x)
184 | 
185 |             if is_later_half:
186 |                 skip = skips.pop()
187 |                 if skip_connect_type == 'concat':
188 |                     x = torch.cat((x, skip), dim = -1)
189 |                     x = maybe_skip_proj(x)
190 |                 elif skip_connect_type == 'add':
191 |                     x = x + skip
192 | 
193 |             # attention and feedforward blocks
194 |             x = attn(attn_norm(x), rope = rope, mask = mask) + x
195 |             x = ff(ff_norm(x)) + x
196 | 
197 |         assert len(skips) == 0
198 | 
199 |         _, x = unpack(self.norm_out(x), ps, 'b * d')
200 | 
201 |         return self.proj_out(x)
202 | 
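The skip-connection bookkeeping in `UNetT.forward` is a stack: layers in the first half push their input, and layers in the second half pop the most recent one. The sketch below only tracks layer numbers, not tensors, to show which layers end up paired; `skip_pairs` is a hypothetical helper name:

```python
from typing import List, Tuple


def skip_pairs(depth: int) -> List[Tuple[int, int]]:
    """Return (later_layer, earlier_layer) pairs for a flat U-Net of `depth` layers."""
    assert depth % 2 == 0, "depth should be even"
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for layer in range(1, depth + 1):
        if layer <= depth // 2:
            stack.append(layer)                # first half: save this layer's input
        else:
            pairs.append((layer, stack.pop())) # second half: consume the latest skip
    assert not stack                           # every saved skip is used exactly once
    return pairs


print(skip_pairs(8))
```

The last-in, first-out order pairs the innermost layers with each other and the outermost layers with each other, which is what gives the U shape.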


--------------------------------------------------------------------------------
/zh_normalization/num.py:
--------------------------------------------------------------------------------
  1 | # Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     http://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | """
 15 | Rules to verbalize numbers into Chinese characters.
 16 | https://zh.wikipedia.org/wiki/中文数字#現代中文
 17 | """
 18 | import re
 19 | from collections import OrderedDict
 20 | from typing import List
 21 | 
 22 | DIGITS = {str(i): tran for i, tran in enumerate('零一二三四五六七八九')}
 23 | UNITS = OrderedDict({
 24 |     1: '十',
 25 |     2: '百',
 26 |     3: '千',
 27 |     4: '万',
 28 |     8: '亿',
 29 | })
 30 | 
 31 | COM_QUANTIFIERS = '(封|艘|把|目|套|段|人|所|朵|匹|张|座|回|场|尾|条|个|首|阙|阵|网|炮|顶|丘|棵|只|支|袭|辆|挑|担|颗|壳|窠|曲|墙|群|腔|砣|座|客|贯|扎|捆|刀|令|打|手|罗|坡|山|岭|江|溪|钟|队|单|双|对|出|口|头|脚|板|跳|枝|件|贴|针|线|管|名|位|身|堂|课|本|页|家|户|层|丝|毫|厘|分|钱|两|斤|担|铢|石|钧|锱|忽|(千|毫|微)克|毫|厘|(公)分|分|寸|尺|丈|里|寻|常|铺|程|(千|分|厘|毫|微)米|米|撮|勺|合|升|斗|石|盘|碗|碟|叠|桶|笼|盆|盒|杯|钟|斛|锅|簋|篮|盘|桶|罐|瓶|壶|卮|盏|箩|箱|煲|啖|袋|钵|年|月|日|季|刻|时|周|天|秒|分|小时|旬|纪|岁|世|更|夜|春|夏|秋|冬|代|伏|辈|丸|泡|粒|颗|幢|堆|条|根|支|道|面|片|张|颗|块|元|(亿|千万|百万|万|千|百)|(亿|千万|百万|万|千|百|美|)元|(亿|千万|百万|万|千|百|十|)吨|(亿|千万|百万|万|千|百|)块|角|毛|分)'
 32 | 
 33 | # fraction expression
 34 | RE_FRAC = re.compile(r'(-?)(\d+)/(\d+)')
 35 | 
 36 | 
 37 | def replace_frac(match) -> str:
 38 |     """
 39 |     Args:
 40 |         match (re.Match)
 41 |     Returns:
 42 |         str
 43 |     """
 44 |     sign = match.group(1)
 45 |     numerator = match.group(2)
 46 |     denominator = match.group(3)
 47 |     sign: str = "负" if sign else ""
 48 |     numerator: str = num2str(numerator)
 49 |     denominator: str = num2str(denominator)
 50 |     result = f"{sign}{denominator}分之{numerator}"
 51 |     return result
 52 | 
 53 | 
 54 | # percentage expression
 55 | RE_PERCENTAGE = re.compile(r'(-?)(\d+(\.\d+)?)%')
 56 | 
 57 | 
 58 | def replace_percentage(match) -> str:
 59 |     """
 60 |     Args:
 61 |         match (re.Match)
 62 |     Returns:
 63 |         str
 64 |     """
 65 |     sign = match.group(1)
 66 |     percent = match.group(2)
 67 |     sign: str = "负" if sign else ""
 68 |     percent: str = num2str(percent)
 69 |     result = f"{sign}百分之{percent}"
 70 |     return result
 71 | 
 72 | 
 73 | # integer expression
 74 | # integer with a negative sign, e.g. -10
 75 | RE_INTEGER = re.compile(r'(-)' r'(\d+)')
 76 | 
 77 | 
 78 | def replace_negative_num(match) -> str:
 79 |     """
 80 |     Args:
 81 |         match (re.Match)
 82 |     Returns:
 83 |         str
 84 |     """
 85 |     sign = match.group(1)
 86 |     number = match.group(2)
 87 |     sign: str = "负" if sign else ""
 88 |     number: str = num2str(number)
 89 |     result = f"{sign}{number}"
 90 |     return result
 91 | 
 92 | 
 93 | # serial number: unsigned integer read digit by digit
 94 | # e.g. 00078
 95 | RE_DEFAULT_NUM = re.compile(r'\d{3}\d*')
 96 | 
 97 | 
 98 | def replace_default_num(match):
 99 |     """
100 |     Args:
101 |         match (re.Match)
102 |     Returns:
103 |         str
104 |     """
105 |     number = match.group(0)
106 |     return verbalize_digit(number, alt_one=True)
107 | 
108 | 
109 | # addition, subtraction, multiplication, and division
110 | RE_ASMD = re.compile(
111 |     r'((-?)((\d+)(\.\d+)?)|(\.(\d+)))([\+\-\×÷=])((-?)((\d+)(\.\d+)?)|(\.(\d+)))')
112 | asmd_map = {
113 |     '+': '加',
114 |     '-': '减',
115 |     '×': '乘',
116 |     '÷': '除',
117 |     '=': '等于'
118 | }
119 | 
120 | 
121 | def replace_asmd(match) -> str:
122 |     """
123 |     Args:
124 |         match (re.Match)
125 |     Returns:
126 |         str
127 |     """
128 |     result = match.group(1) + asmd_map[match.group(8)] + match.group(9)
129 |     return result
130 | 
131 | 
132 | # number expression
133 | # pure decimal
134 | RE_DECIMAL_NUM = re.compile(r'(-?)((\d+)(\.\d+))' r'|(\.(\d+))')
135 | # positive integer + quantifier
136 | RE_POSITIVE_QUANTIFIERS = re.compile(r"(\d+)([多余几\+])?" + COM_QUANTIFIERS)
137 | RE_NUMBER = re.compile(r'(-?)((\d+)(\.\d+)?)' r'|(\.(\d+))')
138 | 
139 | 
140 | def replace_positive_quantifier(match) -> str:
141 |     """
142 |     Args:
143 |         match (re.Match)
144 |     Returns:
145 |         str
146 |     """
147 |     number = match.group(1)
148 |     match_2 = match.group(2)
149 |     if match_2 == "+":
150 |         match_2 = "多"
151 |     match_2: str = match_2 if match_2 else ""
152 |     quantifiers: str = match.group(3)
153 |     number: str = num2str(number)
154 |     result = f"{number}{match_2}{quantifiers}"
155 |     return result
156 | 
157 | 
158 | def replace_number(match) -> str:
159 |     """
160 |     Args:
161 |         match (re.Match)
162 |     Returns:
163 |         str
164 |     """
165 |     sign = match.group(1)
166 |     number = match.group(2)
167 |     pure_decimal = match.group(5)
168 |     if pure_decimal:
169 |         result = num2str(pure_decimal)
170 |     else:
171 |         sign: str = "负" if sign else ""
172 |         number: str = num2str(number)
173 |         result = f"{sign}{number}"
174 |     return result
175 | 
176 | 
177 | # range expression
178 | # the two number groups, match.group(1) and match.group(6), are copied from RE_NUMBER
179 | 
180 | RE_RANGE = re.compile(
181 |     r"""
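182 |     (?<![\d\+\-\×÷=])      # negative lookbehind: no digit or operator may precede the range
183 |     ((-?)((\d+)(\.\d+)?))  # start of the range: negative or positive number (integer or decimal)
184 |     [-~]                   # range separator
185 |     ((-?)((\d+)(\.\d+)?))  # end of the range: negative or positive number (integer or decimal)
186 |     (?![\d\+\-\×÷=])       # negative lookahead: no digit or operator may follow the range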
187 |     """, re.VERBOSE)
188 | 
189 | 
190 | def replace_range(match) -> str:
191 |     """
192 |     Args:
193 |         match (re.Match)
194 |     Returns:
195 |         str
196 |     """
197 |     first, second = match.group(1), match.group(6)
198 |     first = RE_NUMBER.sub(replace_number, first)
199 |     second = RE_NUMBER.sub(replace_number, second)
200 |     result = f"{first}到{second}"
201 |     return result
202 | 
203 | 
204 | # "~" read as 至 ("to") between two quantities with units
205 | RE_TO_RANGE = re.compile(
206 |     r'((-?)((\d+)(\.\d+)?)|(\.(\d+)))(%|°C|℃|度|摄氏度|cm2|cm²|cm3|cm³|cm|db|ds|kg|km|m2|m²|m³|m3|ml|m|mm|s)[~]((-?)((\d+)(\.\d+)?)|(\.(\d+)))(%|°C|℃|度|摄氏度|cm2|cm²|cm3|cm³|cm|db|ds|kg|km|m2|m²|m³|m3|ml|m|mm|s)')
207 | 
208 | def replace_to_range(match) -> str:
209 |     """
210 |     Args:
211 |         match (re.Match)
212 |     Returns:
213 |         str
214 |     """
215 |     result = match.group(0).replace('~', '至')
216 |     return result
217 | 
218 | 
219 | def _get_value(value_string: str, use_zero: bool=True) -> List[str]:
220 |     stripped = value_string.lstrip('0')
221 |     if len(stripped) == 0:
222 |         return []
223 |     elif len(stripped) == 1:
224 |         if use_zero and len(stripped) < len(value_string):
225 |             return [DIGITS['0'], DIGITS[stripped]]
226 |         else:
227 |             return [DIGITS[stripped]]
228 |     else:
229 |         largest_unit = next(
230 |             power for power in reversed(UNITS.keys()) if power < len(stripped))
231 |         first_part = value_string[:-largest_unit]
232 |         second_part = value_string[-largest_unit:]
233 |         return _get_value(first_part) + [UNITS[largest_unit]] + _get_value(
234 |             second_part)
235 | 
236 | 
237 | def verbalize_cardinal(value_string: str) -> str:
238 |     if not value_string:
239 |         return ''
240 | 
241 |     # 000 -> '零' , 0 -> '零'
242 |     value_string = value_string.lstrip('0')
243 |     if len(value_string) == 0:
244 |         return DIGITS['0']
245 | 
246 |     result_symbols = _get_value(value_string)
247 |     # verbalized number starting with '一十*' is abbreviated as `十*`
248 |     if len(result_symbols) >= 2 and result_symbols[0] == DIGITS[
249 |             '1'] and result_symbols[1] == UNITS[1]:
250 |         result_symbols = result_symbols[1:]
251 |     return ''.join(result_symbols)
252 | 
253 | 
254 | def verbalize_digit(value_string: str, alt_one=False) -> str:
255 |     result_symbols = [DIGITS[digit] for digit in value_string]
256 |     result = ''.join(result_symbols)
257 |     if alt_one:
258 |         result = result.replace("一", "幺")
259 |     return result
260 | 
261 | 
262 | def num2str(value_string: str) -> str:
263 |     integer_decimal = value_string.split('.')
264 |     if len(integer_decimal) == 1:
265 |         integer = integer_decimal[0]
266 |         decimal = ''
267 |     elif len(integer_decimal) == 2:
268 |         integer, decimal = integer_decimal
269 |     else:
270 |         raise ValueError(
271 |             f"The value string: '{value_string}' has more than one point in it."
272 |         )
273 | 
274 |     result = verbalize_cardinal(integer)
275 | 
276 |     decimal = decimal.rstrip('0')
277 |     if decimal:
278 |         # '.22' is verbalized as '零点二二'
279 |         # '3.20' is verbalized as '三点二'
280 |         result = result if result else "零"
281 |         result += '点' + verbalize_digit(decimal)
282 |     return result
283 | 
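
The decimal branch of `num2str` can be illustrated in isolation. The sketch below is a minimal, self-contained approximation rather than the module's own code: the `DIGITS` table is an assumption (the real constant is defined earlier in this file, outside this excerpt), and `verbalize_fraction` is a hypothetical helper that only mirrors the trailing-zero stripping and digit-by-digit reading shown above.

```python
# Assumed digit table; the real DIGITS constant lives earlier in num.py.
DIGITS = {str(i): ch for i, ch in enumerate("零一二三四五六七八九")}


def verbalize_digit(value_string: str, alt_one: bool = False) -> str:
    """Read a digit string one digit at a time."""
    result = "".join(DIGITS[digit] for digit in value_string)
    # "幺" is the alternative reading of "一" used when reciting digit strings.
    return result.replace("一", "幺") if alt_one else result


def verbalize_fraction(decimal: str) -> str:
    """Hypothetical helper mirroring the decimal handling in num2str."""
    decimal = decimal.rstrip("0")  # trailing zeros are not read aloud
    return "点" + verbalize_digit(decimal) if decimal else ""


print(verbalize_digit("110", alt_one=True))
print(verbalize_fraction("20"))
```

An all-zero fraction such as `"00"` yields an empty string, which is why `num2str` only appends the `点` part when digits remain after stripping.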


--------------------------------------------------------------------------------
/f5_model/dataset.py:
--------------------------------------------------------------------------------
  1 | import json
  2 | import random
  3 | from tqdm import tqdm
  4 | 
  5 | import torch
  6 | import torch.nn.functional as F
  7 | from torch.utils.data import Dataset, Sampler
  8 | import torchaudio
  9 | from datasets import load_dataset, load_from_disk
 10 | from datasets import Dataset as Dataset_
 11 | 
 12 | from einops import rearrange
 13 | 
 14 | from .modules import MelSpec
 15 | 
 16 | 
 17 | class HFDataset(Dataset):
 18 |     def __init__(
 19 |         self,
 20 |         hf_dataset: Dataset,
 21 |         target_sample_rate = 24_000,
 22 |         n_mel_channels = 100,
 23 |         hop_length = 256,
 24 |     ):
 25 |         self.data = hf_dataset
 26 |         self.target_sample_rate = target_sample_rate
 27 |         self.hop_length = hop_length
 28 |         self.mel_spectrogram = MelSpec(target_sample_rate=target_sample_rate, n_mel_channels=n_mel_channels, hop_length=hop_length)
 29 |         
 30 |     def get_frame_len(self, index):
 31 |         row = self.data[index]
 32 |         audio = row['audio']['array']
 33 |         sample_rate = row['audio']['sampling_rate']
 34 |         return audio.shape[-1] / sample_rate * self.target_sample_rate / self.hop_length
 35 | 
 36 |     def __len__(self):
 37 |         return len(self.data)
 38 |     
 39 |     def __getitem__(self, index):
 40 |         row = self.data[index]
 41 |         audio = row['audio']['array']
 42 | 
 43 |         # logger.info(f"Audio shape: {audio.shape}")
 44 | 
 45 |         sample_rate = row['audio']['sampling_rate']
 46 |         duration = audio.shape[-1] / sample_rate
 47 | 
 48 |         if duration > 30 or duration < 0.3:
 49 |             return self.__getitem__((index + 1) % len(self.data))
 50 |         
 51 |         audio_tensor = torch.from_numpy(audio).float()
 52 |         
 53 |         if sample_rate != self.target_sample_rate:
 54 |             resampler = torchaudio.transforms.Resample(sample_rate, self.target_sample_rate)
 55 |             audio_tensor = resampler(audio_tensor)
 56 |         
 57 |         audio_tensor = rearrange(audio_tensor, 't -> 1 t')
 58 |         
 59 |         mel_spec = self.mel_spectrogram(audio_tensor)
 60 |         
 61 |         mel_spec = rearrange(mel_spec, '1 d t -> d t')
 62 |         
 63 |         text = row['text']
 64 |         
 65 |         return dict(
 66 |             mel_spec = mel_spec,
 67 |             text = text,
 68 |         )
 69 | 
 70 | 
 71 | class CustomDataset(Dataset):
 72 |     def __init__(
 73 |         self,
 74 |         custom_dataset: Dataset,
 75 |         durations = None,
 76 |         target_sample_rate = 24_000,
 77 |         hop_length = 256,
 78 |         n_mel_channels = 100,
 79 |         preprocessed_mel = False,
 80 |     ):
 81 |         self.data = custom_dataset
 82 |         self.durations = durations
 83 |         self.target_sample_rate = target_sample_rate
 84 |         self.hop_length = hop_length
 85 |         self.preprocessed_mel = preprocessed_mel
 86 |         if not preprocessed_mel:
 87 |             self.mel_spectrogram = MelSpec(target_sample_rate=target_sample_rate, hop_length=hop_length, n_mel_channels=n_mel_channels)
 88 | 
 89 |     def get_frame_len(self, index):
 90 |         if self.durations is not None:  # Please make sure the separately provided durations are correct, otherwise 99.99% OOM
 91 |             return self.durations[index] * self.target_sample_rate / self.hop_length
 92 |         return self.data[index]["duration"] * self.target_sample_rate / self.hop_length
 93 |     
 94 |     def __len__(self):
 95 |         return len(self.data)
 96 |     
 97 |     def __getitem__(self, index):
 98 |         row = self.data[index]
 99 |         audio_path = row["audio_path"]
100 |         text = row["text"]
101 |         duration = row["duration"]
102 | 
103 |         if self.preprocessed_mel:
104 |             mel_spec = torch.tensor(row["mel_spec"])
105 | 
106 |         else:
107 |             # skip clips outside 0.3-30 s before paying the cost of loading the audio
108 |             if duration > 30 or duration < 0.3:
109 |                 return self.__getitem__((index + 1) % len(self.data))
110 |             audio, source_sample_rate = torchaudio.load(audio_path)
111 |             
112 |             if source_sample_rate != self.target_sample_rate:
113 |                 resampler = torchaudio.transforms.Resample(source_sample_rate, self.target_sample_rate)
114 |                 audio = resampler(audio)
115 |             
116 |             mel_spec = self.mel_spectrogram(audio)
117 |             mel_spec = rearrange(mel_spec, '1 d t -> d t')
118 |         
119 |         return dict(
120 |             mel_spec = mel_spec,
121 |             text = text,
122 |         )
123 |     
124 | 
125 | # Dynamic Batch Sampler
126 | 
127 | class DynamicBatchSampler(Sampler[list[int]]):
128 |     """ Extension of Sampler that will do the following:
129 |         1.  Change the batch size (essentially number of sequences)
130 |             in a batch to ensure that the total number of frames are less
131 |             than a certain threshold.
132 |         2.  Make sure the padding efficiency in the batch is high.
133 |     """
134 | 
135 |     def __init__(self, sampler: Sampler[int], frames_threshold: int, max_samples=0, random_seed=None, drop_last: bool = False):
136 |         self.sampler = sampler
137 |         self.frames_threshold = frames_threshold
138 |         self.max_samples = max_samples
139 | 
140 |         indices, batches = [], []
141 |         data_source = self.sampler.data_source
142 |         
143 |         for idx in tqdm(self.sampler, desc=f"Sorting with sampler... if slow, check whether dataset is provided with duration"):
144 |             indices.append((idx, data_source.get_frame_len(idx)))
145 |         indices.sort(key=lambda elem : elem[1])
146 | 
147 |         batch = []
148 |         batch_frames = 0
149 |         for idx, frame_len in tqdm(indices, desc=f"Creating dynamic batches with {frames_threshold} audio frames per gpu"):
150 |             if batch_frames + frame_len <= self.frames_threshold and (max_samples == 0 or len(batch) < max_samples):
151 |                 batch.append(idx)
152 |                 batch_frames += frame_len
153 |             else:
154 |                 if len(batch) > 0:
155 |                     batches.append(batch)
156 |                 if frame_len <= self.frames_threshold:
157 |                     batch = [idx]
158 |                     batch_frames = frame_len
159 |                 else:
160 |                     batch = []
161 |                     batch_frames = 0
162 | 
163 |         if not drop_last and len(batch) > 0:
164 |             batches.append(batch)
165 | 
166 |         del indices
167 | 
168 |         # To get different batches between epochs, set a seed and log it in the checkpoint,
169 |         # because in multi-GPU training the per-GPU batches stay fixed across epochs while the combined minibatch differs;
170 |         # e.g. for epoch n, use (random_seed + n).
171 |         random.seed(random_seed)
172 |         random.shuffle(batches)
173 | 
174 |         self.batches = batches
175 | 
176 |     def __iter__(self):
177 |         return iter(self.batches)
178 | 
179 |     def __len__(self):
180 |         return len(self.batches)
181 | 
182 | 
183 | # Load dataset
184 | 
185 | def load_dataset(
186 |         dataset_name: str,
187 |         tokenizer: str,
188 |         dataset_type: str = "CustomDataset", 
189 |         audio_type: str = "raw", 
190 |         mel_spec_kwargs: dict = dict()
191 |         ) -> CustomDataset | HFDataset:
192 |     
193 |     print("Loading dataset ...")
194 | 
195 |     if dataset_type == "CustomDataset":
196 |         if audio_type == "raw":
197 |             try:
198 |                 train_dataset = load_from_disk(f"data/{dataset_name}_{tokenizer}/raw")
199 |             except Exception:
200 |                 train_dataset = Dataset_.from_file(f"data/{dataset_name}_{tokenizer}/raw.arrow")
201 |             preprocessed_mel = False
202 |         elif audio_type == "mel":
203 |             train_dataset = Dataset_.from_file(f"data/{dataset_name}_{tokenizer}/mel.arrow")
204 |             preprocessed_mel = True
205 |         with open(f"data/{dataset_name}_{tokenizer}/duration.json", 'r', encoding='utf-8') as f:
206 |             data_dict = json.load(f)
207 |         durations = data_dict["duration"]
208 |         train_dataset = CustomDataset(train_dataset, durations=durations, preprocessed_mel=preprocessed_mel, **mel_spec_kwargs)
209 | 
210 |     elif dataset_type == "HFDataset":
211 |         print("Manually adjust the Hugging Face dataset path (and the loading script) to your needs; dataset formats differ.")
212 |         from datasets import load_dataset as hf_load_dataset  # this function shadows the module-level datasets.load_dataset import
213 |         pre, post = dataset_name.split("_")
214 |         train_dataset = HFDataset(hf_load_dataset(f"{pre}/{pre}", split=f"train.{post}", cache_dir="./data"))
215 | 
216 |     return train_dataset
217 | 
218 | 
219 | # collation
220 | 
221 | def collate_fn(batch):
222 |     mel_specs = [item['mel_spec'].squeeze(0) for item in batch]
223 |     mel_lengths = torch.LongTensor([spec.shape[-1] for spec in mel_specs])
224 |     max_mel_length = mel_lengths.amax()
225 | 
226 |     padded_mel_specs = []
227 |     for spec in mel_specs:  # TODO. maybe records mask for attention here
228 |         padding = (0, max_mel_length - spec.size(-1))
229 |         padded_spec = F.pad(spec, padding, value = 0)
230 |         padded_mel_specs.append(padded_spec)
231 |     
232 |     mel_specs = torch.stack(padded_mel_specs)
233 | 
234 |     text = [item['text'] for item in batch]
235 |     text_lengths = torch.LongTensor([len(item) for item in text])
236 | 
237 |     return dict(
238 |         mel = mel_specs,
239 |         mel_lengths = mel_lengths,
240 |         text = text,
241 |         text_lengths = text_lengths,
242 |     )
243 | 
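
The greedy packing performed in `DynamicBatchSampler.__init__` can be sketched without PyTorch. This is a minimal illustration under stated assumptions: `frame_len` and `make_batches` are hypothetical names, the defaults (24 kHz, hop length 256) are taken from the dataset classes above, and the final shuffle and `drop_last` handling are omitted.

```python
def frame_len(duration_s, sample_rate=24_000, hop_length=256):
    """Approximate number of mel frames for a clip of the given duration."""
    return duration_s * sample_rate / hop_length


def make_batches(frame_lens, frames_threshold, max_samples=0):
    """Greedily pack length-sorted samples into batches under a frame budget."""
    # Sorting by length keeps similarly sized clips together, reducing padding.
    indices = sorted(enumerate(frame_lens), key=lambda elem: elem[1])
    batches, batch, batch_frames = [], [], 0
    for idx, n_frames in indices:
        fits = batch_frames + n_frames <= frames_threshold
        has_room = max_samples == 0 or len(batch) < max_samples
        if fits and has_room:
            batch.append(idx)
            batch_frames += n_frames
        else:
            if batch:
                batches.append(batch)
            if n_frames <= frames_threshold:
                batch, batch_frames = [idx], n_frames
            else:
                # A single clip over the budget is dropped entirely.
                batch, batch_frames = [], 0
    if batch:
        batches.append(batch)
    return batches


print(frame_len(2.0))
print(make_batches([50, 10, 40, 200, 30], frames_threshold=100))
```

Note that a clip longer than `frames_threshold` never appears in any batch, which matches the sampler above and is worth keeping in mind when choosing the threshold.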


--------------------------------------------------------------------------------
/f5_model/cfm.py:
--------------------------------------------------------------------------------
  1 | """
  2 | ein notation:
  3 | b - batch
  4 | n - sequence
  5 | nt - text sequence
  6 | nw - raw wave length
  7 | d - dimension
  8 | """
  9 | 
 10 | from __future__ import annotations
 11 | from typing import Callable
 12 | from random import random
 13 | 
 14 | import torch
 15 | from torch import nn
 16 | import torch.nn.functional as F
 17 | from torch.nn.utils.rnn import pad_sequence
 18 | 
 19 | from torchdiffeq import odeint
 20 | 
 21 | from einops import rearrange
 22 | 
 23 | from .modules import MelSpec
 24 | 
 25 | from .utils import (
 26 |     default, exists, 
 27 |     list_str_to_idx, list_str_to_tensor, 
 28 |     lens_to_mask, mask_from_frac_lengths,
 29 | ) 
 30 | 
 31 | 
 32 | class CFM(nn.Module):
 33 |     def __init__(
 34 |         self,
 35 |         transformer: nn.Module,
 36 |         sigma = 0.,
 37 |         odeint_kwargs: dict = dict(
 38 |             # atol = 1e-5,
 39 |             # rtol = 1e-5,
 40 |             method = 'euler'  # 'midpoint'
 41 |         ),
 42 |         audio_drop_prob = 0.3,
 43 |         cond_drop_prob = 0.2,
 44 |         num_channels = None,
 45 |         mel_spec_module: nn.Module | None = None,
 46 |         mel_spec_kwargs: dict = dict(),
 47 |         frac_lengths_mask: tuple[float, float] = (0.7, 1.),
 48 |         vocab_char_map: dict[str, int] | None = None
 49 |     ):
 50 |         super().__init__()
 51 | 
 52 |         self.frac_lengths_mask = frac_lengths_mask
 53 | 
 54 |         # mel spec
 55 |         self.mel_spec = default(mel_spec_module, MelSpec(**mel_spec_kwargs))
 56 |         num_channels = default(num_channels, self.mel_spec.n_mel_channels)
 57 |         self.num_channels = num_channels
 58 | 
 59 |         # classifier-free guidance
 60 |         self.audio_drop_prob = audio_drop_prob
 61 |         self.cond_drop_prob = cond_drop_prob
 62 | 
 63 |         # transformer
 64 |         self.transformer = transformer
 65 |         dim = transformer.dim
 66 |         self.dim = dim
 67 | 
 68 |         # conditional flow related
 69 |         self.sigma = sigma
 70 | 
 71 |         # sampling related
 72 |         self.odeint_kwargs = odeint_kwargs
 73 | 
 74 |         # vocab map for tokenization
 75 |         self.vocab_char_map = vocab_char_map
 76 | 
 77 |     @property
 78 |     def device(self):
 79 |         return next(self.parameters()).device
 80 | 
 81 |     @torch.no_grad()
 82 |     def sample(
 83 |         self,
 84 |         cond: float['b n d'] | float['b nw'],
 85 |         text: int['b nt'] | list[str],
 86 |         duration: int | int['b'],
 87 |         *,
 88 |         lens: int['b'] | None = None,
 89 |         steps = 32,
 90 |         cfg_strength = 1., 
 91 |         sway_sampling_coef = None,
 92 |         seed: int | None = None,
 93 |         max_duration = 4096, 
 94 |         vocoder: Callable[[float['b d n']], float['b nw']] | None = None,
 95 |         no_ref_audio = False,
 96 |         duplicate_test = False,
 97 |         t_inter = 0.1,
 98 |         edit_mask = None,
 99 |     ):
100 |         self.eval()
101 | 
102 |         # raw wave
103 | 
104 |         if cond.ndim == 2:
105 |             cond = self.mel_spec(cond)
106 |             cond = rearrange(cond, 'b d n -> b n d')
107 |             assert cond.shape[-1] == self.num_channels
108 | 
109 |         batch, cond_seq_len, device = *cond.shape[:2], cond.device
110 |         if not exists(lens):
111 |             lens = torch.full((batch,), cond_seq_len, device = device, dtype = torch.long)
112 | 
113 |         # text
114 | 
115 |         if isinstance(text, list):
116 |             if exists(self.vocab_char_map):
117 |                 text = list_str_to_idx(text, self.vocab_char_map).to(device)
118 |             else:
119 |                 text = list_str_to_tensor(text).to(device)
120 |             assert text.shape[0] == batch
121 | 
122 |         if exists(text):
123 |             text_lens = (text != -1).sum(dim = -1)
124 |             lens = torch.maximum(text_lens, lens) # make sure lengths are at least those of the text characters
125 | 
126 |         # duration
127 | 
128 |         cond_mask = lens_to_mask(lens)
129 |         if edit_mask is not None:
130 |             cond_mask = cond_mask & edit_mask
131 | 
132 |         if isinstance(duration, int):
133 |             duration = torch.full((batch,), duration, device = device, dtype = torch.long)
134 | 
135 |         duration = torch.maximum(lens + 1, duration) # just add one token so something is generated
136 |         duration = duration.clamp(max = max_duration)
137 |         max_duration = duration.amax()
138 |         
139 |         # duplicate test corner for inner time step observation
140 |         if duplicate_test:
141 |             test_cond = F.pad(cond, (0, 0, cond_seq_len, max_duration - 2*cond_seq_len), value = 0.)
142 |             
143 |         cond = F.pad(cond, (0, 0, 0, max_duration - cond_seq_len), value = 0.)
144 |         cond_mask = F.pad(cond_mask, (0, max_duration - cond_mask.shape[-1]), value = False)
145 |         cond_mask = rearrange(cond_mask, '... -> ... 1')
146 |         step_cond = torch.where(cond_mask, cond, torch.zeros_like(cond))  # allow direct control (cut cond audio) with lens passed in
147 | 
148 |         if batch > 1:
149 |             mask = lens_to_mask(duration)
150 |         else:  # save memory and speed up, as single inference need no mask currently
151 |             mask = None
152 | 
153 |         # test for no ref audio
154 |         if no_ref_audio:
155 |             cond = torch.zeros_like(cond)
156 | 
157 |         # neural ode
158 | 
159 |         def fn(t, x):
160 |             # at each step, conditioning is fixed
161 |             # step_cond = torch.where(cond_mask, cond, torch.zeros_like(cond))
162 | 
163 |             # predict flow
164 |             pred = self.transformer(x = x, cond = step_cond, text = text, time = t, mask = mask, drop_audio_cond = False, drop_text = False)
165 |             if cfg_strength < 1e-5:
166 |                 return pred
167 |             
168 |             null_pred = self.transformer(x = x, cond = step_cond, text = text, time = t, mask = mask, drop_audio_cond = True, drop_text = True)
169 |             return pred + (pred - null_pred) * cfg_strength
170 | 
171 |         # noise input
172 |         # reseed per sample so batched inference gives the same result for any batch size, including single inference
173 |         # some difference may remain, possibly due to convolutional layers
174 |         y0 = []
175 |         for dur in duration:
176 |             if exists(seed):
177 |                 torch.manual_seed(seed)
178 |             y0.append(torch.randn(dur, self.num_channels, device = self.device))
179 |         y0 = pad_sequence(y0, padding_value = 0, batch_first = True)
180 | 
181 |         t_start = 0
182 | 
183 |         # duplicate test corner for inner time step observation
184 |         if duplicate_test:
185 |             t_start = t_inter
186 |             y0 = (1 - t_start) * y0 + t_start * test_cond
187 |             steps = int(steps * (1 - t_start))
188 | 
189 |         t = torch.linspace(t_start, 1, steps, device = self.device)
190 |         if sway_sampling_coef is not None:
191 |             t = t + sway_sampling_coef * (torch.cos(torch.pi / 2 * t) - 1 + t)
192 | 
193 |         trajectory = odeint(fn, y0, t, **self.odeint_kwargs)
194 |         
195 |         sampled = trajectory[-1]
196 |         out = sampled
197 |         out = torch.where(cond_mask, cond, out)
198 | 
199 |         if exists(vocoder):
200 |             out = rearrange(out, 'b n d -> b d n')
201 |             out = vocoder(out)
202 | 
203 |         return out, trajectory
204 | 
205 |     def forward(
206 |         self,
207 |         inp: float['b n d'] | float['b nw'], # mel or raw wave
208 |         text: int['b nt'] | list[str],
209 |         *,
210 |         lens: int['b'] | None = None,
211 |         noise_scheduler: str | None = None,
212 |     ):
213 |         # handle raw wave
214 |         if inp.ndim == 2:
215 |             inp = self.mel_spec(inp)
216 |             inp = rearrange(inp, 'b d n -> b n d')
217 |             assert inp.shape[-1] == self.num_channels
218 | 
219 |         batch, seq_len, dtype, device, σ1 = *inp.shape[:2], inp.dtype, self.device, self.sigma
220 | 
221 |         # handle text as string
222 |         if isinstance(text, list):
223 |             if exists(self.vocab_char_map):
224 |                 text = list_str_to_idx(text, self.vocab_char_map).to(device)
225 |             else:
226 |                 text = list_str_to_tensor(text).to(device)
227 |             assert text.shape[0] == batch
228 | 
229 |         # lens and mask
230 |         if not exists(lens):
231 |             lens = torch.full((batch,), seq_len, device = device)
232 |         
233 |         mask = lens_to_mask(lens, length = seq_len)  # useless here, as collate_fn will pad to max length in batch
234 | 
235 |         # get a random span to mask out for training conditionally
236 |         frac_lengths = torch.zeros((batch,), device = self.device).float().uniform_(*self.frac_lengths_mask)
237 |         rand_span_mask = mask_from_frac_lengths(lens, frac_lengths)
238 | 
239 |         if exists(mask):
240 |             rand_span_mask &= mask
241 | 
242 |         # mel is x1
243 |         x1 = inp
244 | 
245 |         # x0 is gaussian noise
246 |         x0 = torch.randn_like(x1)
247 | 
248 |         # time step
249 |         time = torch.rand((batch,), dtype = dtype, device = self.device)
250 |         # TODO. noise_scheduler
251 | 
252 |         # sample xt (φ_t(x) in the paper)
253 |         t = rearrange(time, 'b -> b 1 1')
254 |         φ = (1 - t) * x0 + t * x1
255 |         flow = x1 - x0
256 | 
257 |         # only predict what is within the random mask span for infilling
258 |         cond = torch.where(
259 |             rand_span_mask[..., None],
260 |             torch.zeros_like(x1), x1
261 |         )
262 | 
263 |         # transformer and cfg training with a drop rate
264 |         drop_audio_cond = random() < self.audio_drop_prob  # p_drop in voicebox paper
265 |         if random() < self.cond_drop_prob:  # p_uncond in voicebox paper
266 |             drop_audio_cond = True
267 |             drop_text = True
268 |         else:
269 |             drop_text = False
270 |             
271 |         # to rigorously mask out padding, record the mask in collate_fn in dataset.py and pass it in here
272 |         # adding a mask uses more memory, so the batch sampler threshold must also be scaled down for long sequences
273 |         pred = self.transformer(x = φ, cond = cond, text = text, time = time, drop_audio_cond = drop_audio_cond, drop_text = drop_text)
274 | 
275 |         # flow matching loss
276 |         loss = F.mse_loss(pred, flow, reduction = 'none')
277 |         loss = loss[rand_span_mask]
278 | 
279 |         return loss.mean(), cond, pred
280 | 
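
Two formulas in `CFM.sample` are easy to check numerically: the sway-sampling time warp applied to the ODE time grid, and the classifier-free guidance combination inside `fn`. The sketch below reproduces both with plain Python lists instead of tensors; `sway_schedule` and `cfg_combine` are hypothetical names introduced only for illustration, and the grid assumes `steps >= 2`.

```python
import math


def sway_schedule(steps, coef=None, t_start=0.0):
    """Evenly spaced times in [t_start, 1], optionally warped by sway sampling.

    Assumes steps >= 2, mirroring torch.linspace(t_start, 1, steps).
    """
    ts = [t_start + (1 - t_start) * i / (steps - 1) for i in range(steps)]
    if coef is not None:
        # Same warp as in sample(): t + s * (cos(pi/2 * t) - 1 + t)
        ts = [t + coef * (math.cos(math.pi / 2 * t) - 1 + t) for t in ts]
    return ts


def cfg_combine(pred, null_pred, cfg_strength):
    """Guidance as written in sample(): pred + (pred - null_pred) * strength."""
    return [p + (p - n) * cfg_strength for p, n in zip(pred, null_pred)]


for t in sway_schedule(5, coef=-1.0):
    print(round(t, 4))
print(cfg_combine([1.0, 2.0], [0.5, 2.0], cfg_strength=2.0))
```

With a negative coefficient the endpoints stay at 0 and 1 while the interior points move toward 0, so more of the ODE steps are spent early in the trajectory.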


--------------------------------------------------------------------------------
/f5_model/ecapa_tdnn.py:
--------------------------------------------------------------------------------
  1 | # just for speaker similarity evaluation, third-party code
  2 | 
  3 | # From https://github.com/microsoft/UniSpeech/blob/main/downstreams/speaker_verification/models/
  4 | # part of the code is borrowed from https://github.com/lawlict/ECAPA-TDNN
  5 | 
  6 | import os
  7 | import torch
  8 | import torch.nn as nn
  9 | import torch.nn.functional as F
 10 | 
 11 | 
 12 | ''' Res2Conv1d + BatchNorm1d + ReLU
 13 | '''
 14 | 
 15 | class Res2Conv1dReluBn(nn.Module):
 16 |     '''
 17 |     in_channels == out_channels == channels
 18 |     '''
 19 | 
 20 |     def __init__(self, channels, kernel_size=1, stride=1, padding=0, dilation=1, bias=True, scale=4):
 21 |         super().__init__()
 22 |         assert channels % scale == 0, "{} % {} != 0".format(channels, scale)
 23 |         self.scale = scale
 24 |         self.width = channels // scale
 25 |         self.nums = scale if scale == 1 else scale - 1
 26 | 
 27 |         self.convs = []
 28 |         self.bns = []
 29 |         for i in range(self.nums):
 30 |             self.convs.append(nn.Conv1d(self.width, self.width, kernel_size, stride, padding, dilation, bias=bias))
 31 |             self.bns.append(nn.BatchNorm1d(self.width))
 32 |         self.convs = nn.ModuleList(self.convs)
 33 |         self.bns = nn.ModuleList(self.bns)
 34 | 
 35 |     def forward(self, x):
 36 |         out = []
 37 |         spx = torch.split(x, self.width, 1)
 38 |         for i in range(self.nums):
 39 |             if i == 0:
 40 |                 sp = spx[i]
 41 |             else:
 42 |                 sp = sp + spx[i]
 43 |             # Order: conv -> relu -> bn
 44 |             sp = self.convs[i](sp)
 45 |             sp = self.bns[i](F.relu(sp))
 46 |             out.append(sp)
 47 |         if self.scale != 1:
 48 |             out.append(spx[self.nums])
 49 |         out = torch.cat(out, dim=1)
 50 | 
 51 |         return out
 52 | 
 53 | 
 54 | ''' Conv1d + BatchNorm1d + ReLU
 55 | '''
 56 | 
 57 | class Conv1dReluBn(nn.Module):
 58 |     def __init__(self, in_channels, out_channels, kernel_size=1, stride=1, padding=0, dilation=1, bias=True):
 59 |         super().__init__()
 60 |         self.conv = nn.Conv1d(in_channels, out_channels, kernel_size, stride, padding, dilation, bias=bias)
 61 |         self.bn = nn.BatchNorm1d(out_channels)
 62 | 
 63 |     def forward(self, x):
 64 |         return self.bn(F.relu(self.conv(x)))
 65 | 
 66 | 
 67 | ''' The SE connection of 1D case.
 68 | '''
 69 | 
 70 | class SE_Connect(nn.Module):
 71 |     def __init__(self, channels, se_bottleneck_dim=128):
 72 |         super().__init__()
 73 |         self.linear1 = nn.Linear(channels, se_bottleneck_dim)
 74 |         self.linear2 = nn.Linear(se_bottleneck_dim, channels)
 75 | 
 76 |     def forward(self, x):
 77 |         out = x.mean(dim=2)
 78 |         out = F.relu(self.linear1(out))
 79 |         out = torch.sigmoid(self.linear2(out))
 80 |         out = x * out.unsqueeze(2)
 81 | 
 82 |         return out
 83 | 
 84 | 
 85 | ''' SE-Res2Block of the ECAPA-TDNN architecture.
 86 | '''
 87 | 
 88 | # def SE_Res2Block(channels, kernel_size, stride, padding, dilation, scale):
 89 | #     return nn.Sequential(
 90 | #         Conv1dReluBn(channels, 512, kernel_size=1, stride=1, padding=0),
 91 | #         Res2Conv1dReluBn(512, kernel_size, stride, padding, dilation, scale=scale),
 92 | #         Conv1dReluBn(512, channels, kernel_size=1, stride=1, padding=0),
 93 | #         SE_Connect(channels)
 94 | #     )
 95 | 
 96 | class SE_Res2Block(nn.Module):
 97 |     def __init__(self, in_channels, out_channels, kernel_size, stride, padding, dilation, scale, se_bottleneck_dim):
 98 |         super().__init__()
 99 |         self.Conv1dReluBn1 = Conv1dReluBn(in_channels, out_channels, kernel_size=1, stride=1, padding=0)
100 |         self.Res2Conv1dReluBn = Res2Conv1dReluBn(out_channels, kernel_size, stride, padding, dilation, scale=scale)
101 |         self.Conv1dReluBn2 = Conv1dReluBn(out_channels, out_channels, kernel_size=1, stride=1, padding=0)
102 |         self.SE_Connect = SE_Connect(out_channels, se_bottleneck_dim)
103 | 
104 |         self.shortcut = None
105 |         if in_channels != out_channels:
106 |             self.shortcut = nn.Conv1d(
107 |                 in_channels=in_channels,
108 |                 out_channels=out_channels,
109 |                 kernel_size=1,
110 |             )
111 | 
112 |     def forward(self, x):
113 |         residual = x
114 |         if self.shortcut is not None:
115 |             residual = self.shortcut(x)
116 | 
117 |         x = self.Conv1dReluBn1(x)
118 |         x = self.Res2Conv1dReluBn(x)
119 |         x = self.Conv1dReluBn2(x)
120 |         x = self.SE_Connect(x)
121 | 
122 |         return x + residual
123 | 
124 | 
125 | ''' Attentive weighted mean and standard deviation pooling.
126 | '''
127 | 
128 | class AttentiveStatsPool(nn.Module):
129 |     def __init__(self, in_dim, attention_channels=128, global_context_att=False):
130 |         super().__init__()
131 |         self.global_context_att = global_context_att
132 | 
133 |         # Use Conv1d with stride == 1 rather than Linear, then we don't need to transpose inputs.
134 |         if global_context_att:
135 |             self.linear1 = nn.Conv1d(in_dim * 3, attention_channels, kernel_size=1)  # equals W and b in the paper
136 |         else:
137 |             self.linear1 = nn.Conv1d(in_dim, attention_channels, kernel_size=1)  # equals W and b in the paper
138 |         self.linear2 = nn.Conv1d(attention_channels, in_dim, kernel_size=1)  # equals V and k in the paper
139 | 
140 |     def forward(self, x):
141 | 
142 |         if self.global_context_att:
143 |             context_mean = torch.mean(x, dim=-1, keepdim=True).expand_as(x)
144 |             context_std = torch.sqrt(torch.var(x, dim=-1, keepdim=True) + 1e-10).expand_as(x)
145 |             x_in = torch.cat((x, context_mean, context_std), dim=1)
146 |         else:
147 |             x_in = x
148 | 
149 |         # DON'T use ReLU here! In experiments, I find ReLU hard to converge.
150 |         alpha = torch.tanh(self.linear1(x_in))
151 |         # alpha = F.relu(self.linear1(x_in))
152 |         alpha = torch.softmax(self.linear2(alpha), dim=2)
153 |         mean = torch.sum(alpha * x, dim=2)
154 |         residuals = torch.sum(alpha * (x ** 2), dim=2) - mean ** 2
155 |         std = torch.sqrt(residuals.clamp(min=1e-9))
156 |         return torch.cat([mean, std], dim=1)
157 | 
158 | 
159 | class ECAPA_TDNN(nn.Module):
160 |     def __init__(self, feat_dim=80, channels=512, emb_dim=192, global_context_att=False,
161 |                  feat_type='wavlm_large', sr=16000, feature_selection="hidden_states", update_extract=False, config_path=None):
162 |         super().__init__()
163 | 
164 |         self.feat_type = feat_type
165 |         self.feature_selection = feature_selection
166 |         self.update_extract = update_extract
167 |         self.sr = sr
168 |         
169 |         torch.hub._validate_not_a_forked_repo=lambda a,b,c: True
170 |         try:
171 |             local_s3prl_path = os.path.expanduser("~/.cache/torch/hub/s3prl_s3prl_main")
172 |             self.feature_extract = torch.hub.load(local_s3prl_path, feat_type, source='local', config_path=config_path)
173 |         except Exception:
174 |             self.feature_extract = torch.hub.load('s3prl/s3prl', feat_type)
175 | 
176 |         if len(self.feature_extract.model.encoder.layers) == 24 and hasattr(self.feature_extract.model.encoder.layers[23].self_attn, "fp32_attention"):
177 |             self.feature_extract.model.encoder.layers[23].self_attn.fp32_attention = False
178 |         if len(self.feature_extract.model.encoder.layers) == 24 and hasattr(self.feature_extract.model.encoder.layers[11].self_attn, "fp32_attention"):
179 |             self.feature_extract.model.encoder.layers[11].self_attn.fp32_attention = False
180 | 
181 |         self.feat_num = self.get_feat_num()
182 |         self.feature_weight = nn.Parameter(torch.zeros(self.feat_num))
183 | 
184 |         if feat_type != 'fbank' and feat_type != 'mfcc':
185 |             freeze_list = ['final_proj', 'label_embs_concat', 'mask_emb', 'project_q', 'quantizer']
186 |             for name, param in self.feature_extract.named_parameters():
187 |                 for freeze_val in freeze_list:
188 |                     if freeze_val in name:
189 |                         param.requires_grad = False
190 |                         break
191 | 
192 |         if not self.update_extract:
193 |             for param in self.feature_extract.parameters():
194 |                 param.requires_grad = False
195 | 
196 |         self.instance_norm = nn.InstanceNorm1d(feat_dim)
197 |         # self.channels = [channels] * 4 + [channels * 3]
198 |         self.channels = [channels] * 4 + [1536]
199 | 
200 |         self.layer1 = Conv1dReluBn(feat_dim, self.channels[0], kernel_size=5, padding=2)
201 |         self.layer2 = SE_Res2Block(self.channels[0], self.channels[1], kernel_size=3, stride=1, padding=2, dilation=2, scale=8, se_bottleneck_dim=128)
202 |         self.layer3 = SE_Res2Block(self.channels[1], self.channels[2], kernel_size=3, stride=1, padding=3, dilation=3, scale=8, se_bottleneck_dim=128)
203 |         self.layer4 = SE_Res2Block(self.channels[2], self.channels[3], kernel_size=3, stride=1, padding=4, dilation=4, scale=8, se_bottleneck_dim=128)
204 | 
205 |         # self.conv = nn.Conv1d(self.channels[-1], self.channels[-1], kernel_size=1)
206 |         cat_channels = channels * 3
207 |         self.conv = nn.Conv1d(cat_channels, self.channels[-1], kernel_size=1)
208 |         self.pooling = AttentiveStatsPool(self.channels[-1], attention_channels=128, global_context_att=global_context_att)
209 |         self.bn = nn.BatchNorm1d(self.channels[-1] * 2)
210 |         self.linear = nn.Linear(self.channels[-1] * 2, emb_dim)
211 | 
212 | 
213 |     def get_feat_num(self):
214 |         self.feature_extract.eval()
215 |         wav = [torch.randn(self.sr).to(next(self.feature_extract.parameters()).device)]
216 |         with torch.no_grad():
217 |             features = self.feature_extract(wav)
218 |         select_feature = features[self.feature_selection]
219 |         if isinstance(select_feature, (list, tuple)):
220 |             return len(select_feature)
221 |         else:
222 |             return 1
223 | 
224 |     def get_feat(self, x):
225 |         if self.update_extract:
226 |             x = self.feature_extract([sample for sample in x])
227 |         else:
228 |             with torch.no_grad():
229 |                 if self.feat_type == 'fbank' or self.feat_type == 'mfcc':
230 |                     x = self.feature_extract(x) + 1e-6  # B x feat_dim x time_len
231 |                 else:
232 |                     x = self.feature_extract([sample for sample in x])
233 | 
234 |         if self.feat_type == 'fbank':
235 |             x = x.log()
236 | 
237 |         if self.feat_type != "fbank" and self.feat_type != "mfcc":
238 |             x = x[self.feature_selection]
239 |             if isinstance(x, (list, tuple)):
240 |                 x = torch.stack(x, dim=0)
241 |             else:
242 |                 x = x.unsqueeze(0)
243 |             norm_weights = F.softmax(self.feature_weight, dim=-1).unsqueeze(-1).unsqueeze(-1).unsqueeze(-1)
244 |             x = (norm_weights * x).sum(dim=0)
245 |             x = torch.transpose(x, 1, 2) + 1e-6
246 | 
247 |         x = self.instance_norm(x)
248 |         return x
249 | 
250 |     def forward(self, x):
251 |         x = self.get_feat(x)
252 | 
253 |         out1 = self.layer1(x)
254 |         out2 = self.layer2(out1)
255 |         out3 = self.layer3(out2)
256 |         out4 = self.layer4(out3)
257 | 
258 |         out = torch.cat([out2, out3, out4], dim=1)
259 |         out = F.relu(self.conv(out))
260 |         out = self.bn(self.pooling(out))
261 |         out = self.linear(out)
262 | 
263 |         return out
264 | 
265 | 
266 | def ECAPA_TDNN_SMALL(feat_dim, emb_dim=256, feat_type='wavlm_large', sr=16000, feature_selection="hidden_states", update_extract=False, config_path=None):
267 |     return ECAPA_TDNN(feat_dim=feat_dim, channels=512, emb_dim=emb_dim,
268 |                       feat_type=feat_type, sr=sr, feature_selection=feature_selection, update_extract=update_extract, config_path=config_path)
269 | 
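
Note on `get_feat` above: for non-fbank/mfcc features it softmax-normalizes `self.feature_weight` and takes a weighted sum over the stacked hidden-state layers. The following is a minimal pure-Python sketch of that aggregation, assuming plain lists in place of tensors. The helper names `softmax` and `weighted_layer_sum` are hypothetical and do not exist in the repository.

```python
import math


def softmax(weights):
    """Numerically stable softmax over a flat list of raw scores."""
    m = max(weights)
    exps = [math.exp(w - m) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_layer_sum(layers, weights):
    """Mix L per-layer feature vectors into one vector.

    layers:  list of L feature vectors of equal length (stand-in for stacked hidden states)
    weights: list of L raw scores (stand-in for a learnable per-layer weight)
    """
    norm = softmax(weights)
    dim = len(layers[0])
    return [
        sum(norm[l] * layers[l][i] for l in range(len(layers)))
        for i in range(dim)
    ]


# Three "layers" with two features each; equal raw scores give a plain average.
layers = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mixed = weighted_layer_sum(layers, [0.0, 0.0, 0.0])
```

With equal raw scores every layer contributes the same share. A strongly peaked score vector makes the output approach a single layer.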


--------------------------------------------------------------------------------
/f5_model/trainer.py:
--------------------------------------------------------------------------------
  1 | from __future__ import annotations
  2 | 
  3 | import os
  4 | import gc
  5 | from tqdm import tqdm
  6 | import wandb
  7 | 
  8 | import torch
  9 | from torch.optim import AdamW
 10 | from torch.utils.data import DataLoader, Dataset, SequentialSampler
 11 | from torch.optim.lr_scheduler import LinearLR, SequentialLR
 12 | 
 13 | from einops import rearrange
 14 | 
 15 | from accelerate import Accelerator
 16 | from accelerate.utils import DistributedDataParallelKwargs
 17 | 
 18 | from ema_pytorch import EMA
 19 | 
 20 | from .cfm import CFM
 21 | from .utils import exists, default
 22 | from .dataset import DynamicBatchSampler, collate_fn
 23 | 
 24 | 
 25 | # trainer
 26 | 
 27 | class Trainer:
 28 |     def __init__(
 29 |         self,
 30 |         model: CFM,
 31 |         epochs,
 32 |         learning_rate,
 33 |         num_warmup_updates = 20000,
 34 |         save_per_updates = 1000, 
 35 |         checkpoint_path = None,
 36 |         batch_size = 32, 
 37 |         batch_size_type: str = "sample",
 38 |         max_samples = 32,
 39 |         grad_accumulation_steps = 1,
 40 |         max_grad_norm = 1.0,
 41 |         noise_scheduler: str | None = None,
 42 |         duration_predictor: torch.nn.Module | None = None,
 43 |         wandb_project = "test_e2-tts",
 44 |         wandb_run_name = "test_run",
 45 |         wandb_resume_id: str = None,
 46 |         last_per_steps = None,
 47 |         accelerate_kwargs: dict = dict(),
 48 |         ema_kwargs: dict = dict()
 49 |     ):
 50 |         
 51 |         ddp_kwargs = DistributedDataParallelKwargs(find_unused_parameters = True)
 52 | 
 53 |         self.accelerator = Accelerator(
 54 |             log_with = "wandb",
 55 |             kwargs_handlers = [ddp_kwargs],
 56 |             gradient_accumulation_steps = grad_accumulation_steps,
 57 |             **accelerate_kwargs
 58 |         )
 59 |         
 60 |         if exists(wandb_resume_id):
 61 |             init_kwargs={"wandb": {"resume": "allow", "name": wandb_run_name, 'id': wandb_resume_id}}
 62 |         else:
 63 |             init_kwargs={"wandb": {"resume": "allow", "name": wandb_run_name}}
 64 |         self.accelerator.init_trackers(
 65 |             project_name = wandb_project, 
 66 |             init_kwargs=init_kwargs,
 67 |             config={"epochs": epochs,
 68 |                     "learning_rate": learning_rate,
 69 |                     "num_warmup_updates": num_warmup_updates, 
 70 |                     "batch_size": batch_size,
 71 |                     "batch_size_type": batch_size_type,
 72 |                     "max_samples": max_samples,
 73 |                     "grad_accumulation_steps": grad_accumulation_steps,
 74 |                     "max_grad_norm": max_grad_norm,
 75 |                     "gpus": self.accelerator.num_processes,
 76 |                     "noise_scheduler": noise_scheduler}
 77 |             )
 78 | 
 79 |         self.model = model
 80 | 
 81 |         if self.is_main:
 82 |             self.ema_model = EMA(
 83 |                 model,
 84 |                 include_online_model = False,
 85 |                 **ema_kwargs
 86 |             )
 87 | 
 88 |             self.ema_model.to(self.accelerator.device)
 89 | 
 90 |         self.epochs = epochs
 91 |         self.num_warmup_updates = num_warmup_updates
 92 |         self.save_per_updates = save_per_updates
 93 |         self.last_per_steps = default(last_per_steps, save_per_updates * grad_accumulation_steps)
 94 |         self.checkpoint_path = default(checkpoint_path, 'ckpts/test_e2-tts')
 95 | 
 96 |         self.batch_size = batch_size
 97 |         self.batch_size_type = batch_size_type
 98 |         self.max_samples = max_samples
 99 |         self.grad_accumulation_steps = grad_accumulation_steps
100 |         self.max_grad_norm = max_grad_norm
101 | 
102 |         self.noise_scheduler = noise_scheduler
103 | 
104 |         self.duration_predictor = duration_predictor
105 | 
106 |         self.optimizer = AdamW(model.parameters(), lr=learning_rate)
107 |         self.model, self.optimizer = self.accelerator.prepare(
108 |             self.model, self.optimizer
109 |         )
110 | 
111 |     @property
112 |     def is_main(self):
113 |         return self.accelerator.is_main_process
114 | 
115 |     def save_checkpoint(self, step, last=False):
116 |         self.accelerator.wait_for_everyone()
117 |         if self.is_main:
118 |             checkpoint = dict(
119 |                 model_state_dict = self.accelerator.unwrap_model(self.model).state_dict(),
120 |                 optimizer_state_dict = self.accelerator.unwrap_model(self.optimizer).state_dict(),
121 |                 ema_model_state_dict = self.ema_model.state_dict(),
122 |                 scheduler_state_dict = self.scheduler.state_dict(),
123 |                 step = step
124 |             )
125 |             if not os.path.exists(self.checkpoint_path):
126 |                 os.makedirs(self.checkpoint_path)
127 |             if last:
128 |                 self.accelerator.save(checkpoint, f"{self.checkpoint_path}/model_last.pt")
129 |                 print(f"Saved last checkpoint at step {step}")
130 |             else:
131 |                 self.accelerator.save(checkpoint, f"{self.checkpoint_path}/model_{step}.pt")
132 | 
133 |     def load_checkpoint(self):
134 |         if not exists(self.checkpoint_path) or not os.path.exists(self.checkpoint_path) or not os.listdir(self.checkpoint_path):
135 |             return 0
136 |         
137 |         self.accelerator.wait_for_everyone()
138 |         if "model_last.pt" in os.listdir(self.checkpoint_path):
139 |             latest_checkpoint = "model_last.pt"
140 |         else:
141 |             latest_checkpoint = sorted([f for f in os.listdir(self.checkpoint_path) if f.endswith('.pt')], key=lambda x: int(''.join(filter(str.isdigit, x))))[-1]
142 |         # checkpoint = torch.load(f"{self.checkpoint_path}/{latest_checkpoint}", map_location=self.accelerator.device)  # rather use accelerator.load_state ಥ_ಥ
143 |         checkpoint = torch.load(f"{self.checkpoint_path}/{latest_checkpoint}", map_location="cpu")
144 | 
145 |         if self.is_main:
146 |             self.ema_model.load_state_dict(checkpoint['ema_model_state_dict'])
147 | 
148 |         if 'step' in checkpoint:
149 |             self.accelerator.unwrap_model(self.model).load_state_dict(checkpoint['model_state_dict'])
150 |             self.accelerator.unwrap_model(self.optimizer).load_state_dict(checkpoint['optimizer_state_dict'])
151 |             if self.scheduler:
152 |                 self.scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
153 |             step = checkpoint['step']
154 |         else:
155 |             checkpoint['model_state_dict'] = {k.replace("ema_model.", ""): v for k, v in checkpoint['ema_model_state_dict'].items() if k not in ["initted", "step"]}
156 |             self.accelerator.unwrap_model(self.model).load_state_dict(checkpoint['model_state_dict'])
157 |             step = 0
158 | 
159 |         del checkpoint; gc.collect()
160 |         return step
161 | 
162 |     def train(self, train_dataset: Dataset, num_workers=16, resumable_with_seed: int = None):
163 |         
164 |         if exists(resumable_with_seed):
165 |             generator = torch.Generator()
166 |             generator.manual_seed(resumable_with_seed)
167 |         else: 
168 |             generator = None
169 | 
170 |         if self.batch_size_type == "sample":
171 |             train_dataloader = DataLoader(train_dataset, collate_fn=collate_fn, num_workers=num_workers, pin_memory=True, persistent_workers=True,
172 |                                           batch_size=self.batch_size, shuffle=True, generator=generator)
173 |         elif self.batch_size_type == "frame":
174 |             self.accelerator.even_batches = False
175 |             sampler = SequentialSampler(train_dataset)
176 |             batch_sampler = DynamicBatchSampler(sampler, self.batch_size, max_samples=self.max_samples, random_seed=resumable_with_seed, drop_last=False)
177 |             train_dataloader = DataLoader(train_dataset, collate_fn=collate_fn, num_workers=num_workers, pin_memory=True, persistent_workers=True,
178 |                                           batch_sampler=batch_sampler)
179 |         else:
180 |             raise ValueError(f"batch_size_type must be either 'sample' or 'frame', but received {self.batch_size_type}")
181 |         
182 |         # accelerator.prepare() dispatches batches across devices,
183 |         # so the dataloader length computed beforehand must account for the number of devices
184 |         warmup_steps = self.num_warmup_updates * self.accelerator.num_processes  # keep the warmup length fixed when using accelerate multi-GPU DDP;
185 |                                                                                  # otherwise, with the default split_batches=False, warmup steps would change with num_processes
186 |         total_steps = len(train_dataloader) * self.epochs / self.grad_accumulation_steps
187 |         decay_steps = total_steps - warmup_steps
188 |         warmup_scheduler = LinearLR(self.optimizer, start_factor=1e-8, end_factor=1.0, total_iters=warmup_steps)
189 |         decay_scheduler = LinearLR(self.optimizer, start_factor=1.0, end_factor=1e-8, total_iters=decay_steps)
190 |         self.scheduler = SequentialLR(self.optimizer, 
191 |                                       schedulers=[warmup_scheduler, decay_scheduler],
192 |                                       milestones=[warmup_steps])
193 |         train_dataloader, self.scheduler = self.accelerator.prepare(train_dataloader, self.scheduler)  # actual steps = 1 gpu steps / gpus
194 |         start_step = self.load_checkpoint()
195 |         global_step = start_step
196 | 
197 |         if exists(resumable_with_seed):
198 |             orig_epoch_step = len(train_dataloader)
199 |             skipped_epoch = int(start_step // orig_epoch_step)
200 |             skipped_batch = start_step % orig_epoch_step
201 |             skipped_dataloader = self.accelerator.skip_first_batches(train_dataloader, num_batches=skipped_batch)
202 |         else:
203 |             skipped_epoch = 0
204 | 
205 |         for epoch in range(skipped_epoch, self.epochs):
206 |             self.model.train()
207 |             if exists(resumable_with_seed) and epoch == skipped_epoch:
208 |                 progress_bar = tqdm(skipped_dataloader, desc=f"Epoch {epoch+1}/{self.epochs}", unit="step", disable=not self.accelerator.is_local_main_process, 
209 |                                     initial=skipped_batch, total=orig_epoch_step)
210 |             else:
211 |                 progress_bar = tqdm(train_dataloader, desc=f"Epoch {epoch+1}/{self.epochs}", unit="step", disable=not self.accelerator.is_local_main_process)
212 | 
213 |             for batch in progress_bar:
214 |                 with self.accelerator.accumulate(self.model):
215 |                     text_inputs = batch['text']
216 |                     mel_spec = rearrange(batch['mel'], 'b d n -> b n d')
217 |                     mel_lengths = batch["mel_lengths"]
218 | 
219 |                     # TODO. add duration predictor training
220 |                     if self.duration_predictor is not None and self.accelerator.is_local_main_process:
221 |                         dur_loss = self.duration_predictor(mel_spec, lens=batch.get('durations'))
222 |                         self.accelerator.log({"duration loss": dur_loss.item()}, step=global_step)
223 | 
224 |                     loss, cond, pred = self.model(mel_spec, text=text_inputs, lens=mel_lengths, noise_scheduler=self.noise_scheduler)
225 |                     self.accelerator.backward(loss)
226 | 
227 |                     if self.max_grad_norm > 0 and self.accelerator.sync_gradients:
228 |                         self.accelerator.clip_grad_norm_(self.model.parameters(), self.max_grad_norm)
229 | 
230 |                     self.optimizer.step()
231 |                     self.scheduler.step()
232 |                     self.optimizer.zero_grad()
233 | 
234 |                 if self.is_main:
235 |                     self.ema_model.update()
236 | 
237 |                 global_step += 1
238 | 
239 |                 if self.accelerator.is_local_main_process:
240 |                     self.accelerator.log({"loss": loss.item(), "lr": self.scheduler.get_last_lr()[0]}, step=global_step)
241 |                 
242 |                 progress_bar.set_postfix(step=str(global_step), loss=loss.item())
243 |                 
244 |                 if global_step % (self.save_per_updates * self.grad_accumulation_steps) == 0:
245 |                     self.save_checkpoint(global_step)
246 |                 
247 |                 if global_step % self.last_per_steps == 0:
248 |                     self.save_checkpoint(global_step, last=True)
249 |         
250 |         self.accelerator.end_training()
251 | 
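
Note on the schedule built in `Trainer.train` above: it chains a linear warmup (factor 1e-8 to 1.0 over `warmup_steps`) with a linear decay (1.0 back to 1e-8 over `decay_steps`). The sketch below approximates the resulting learning-rate multiplier in plain Python, without PyTorch. The function names `linear_factor` and `lr_factor` are hypothetical, and this illustrates the shape of the schedule rather than reimplementing `LinearLR` or `SequentialLR`.

```python
def linear_factor(step, start, end, total_iters):
    """Linearly interpolate from start to end, holding at end after total_iters."""
    progress = min(step, total_iters) / total_iters
    return start + (end - start) * progress


def lr_factor(step, warmup_steps, decay_steps, floor=1e-8):
    """Multiplier applied to the base learning rate at a given scheduler step."""
    if step < warmup_steps:
        # Warmup phase: ramp from the floor up to the full learning rate.
        return linear_factor(step, floor, 1.0, warmup_steps)
    # Decay phase: the step counter restarts at the milestone.
    return linear_factor(step - warmup_steps, 1.0, floor, decay_steps)


# Example: 100 warmup steps out of 1000 total steps.
warmup_steps, total_steps = 100, 1000
decay_steps = total_steps - warmup_steps
factors = [lr_factor(s, warmup_steps, decay_steps) for s in (0, 50, 100, 550, 1000)]
```

Multiplying the base `learning_rate` by this factor gives the effective rate at each step. The peak is reached exactly at the milestone, `warmup_steps`.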


--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
  1 | import os,sys
  2 | import os.path as osp
  3 | now_dir = osp.dirname(osp.abspath(__file__))
  4 | sys.path.append(now_dir)
  5 | from f5_model import CFM, UNetT, DiT
  6 | from f5_model.utils import (
  7 |     load_checkpoint,
  8 |     get_tokenizer,
  9 |     convert_char_to_pinyin,
 10 |     save_spectrogram,
 11 | )
 12 | from LangSegment import LangSegment
 13 | from zh_normalization import text_normalize
 14 | import re
 15 | import torch
 16 | import tempfile
 17 | import shutil
 18 | import torchaudio
 19 | from transformers import pipeline
 20 | from pydub import AudioSegment,silence
 21 | import folder_paths
 22 | from tqdm import tqdm
 23 | from einops import rearrange
 24 | from vocos import Vocos
 25 | import numpy as np
 26 | from PIL import Image
 27 | import soundfile as sf
 28 | from comfy.utils import ProgressBar
 29 | from huggingface_hub import  snapshot_download
 30 | aifsh_dir = osp.join(folder_paths.models_dir,"AIFSH")
 31 | pretrained_dir = osp.join(aifsh_dir,"F5-TTS")
 32 | openai_dir = osp.join(aifsh_dir,"whisper-large-v3-turbo")
 33 | LangSegment.setfilters(["zh", "en"])
 34 | SPLIT_WORDS = [
 35 |     "but", "however", "nevertheless", "yet", "still",
 36 |     "therefore", "thus", "hence", "consequently",
 37 |     "moreover", "furthermore", "additionally",
 38 |     "meanwhile", "alternatively", "otherwise",
 39 |     "namely", "specifically", "for example", "such as",
 40 |     "in fact", "indeed", "notably",
 41 |     "in contrast", "on the other hand", "conversely",
 42 |     "in conclusion", "to summarize", "finally"
 43 | ]
 44 | device = (
 45 |     "cuda"
 46 |     if torch.cuda.is_available()
 47 |     else "mps" if torch.backends.mps.is_available() else "cpu"
 48 | )
 49 | tmp_dir = osp.join(now_dir, "tmp")
 50 | 
 51 | target_sample_rate = 24000
 52 | n_mel_channels = 100
 53 | target_rms = 0.1
 54 | hop_length = 256
 55 | nfe_step = 32  # 16, 32
 56 | cfg_strength = 2.0
 57 | ode_method = "euler"
 58 | sway_sampling_coef = -1.0
 59 | 
 60 | class F5TTSNode:
 61 |     def __init__(self):
 62 |         if not osp.exists(osp.join(pretrained_dir,"F5TTS_Base/model_1200000.safetensors")):
 63 |             snapshot_download(repo_id="SWivid/F5-TTS",
 64 |                             local_dir=pretrained_dir,
 65 |                             allow_patterns=["*.safetensors","*.json"])
 66 |         if not osp.exists(osp.join(openai_dir,"model.safetensors")):
 67 |             snapshot_download(repo_id="openai/whisper-large-v3-turbo",
 68 |                               local_dir=openai_dir)
 69 |         
 70 |     @classmethod
 71 |     def INPUT_TYPES(s):
 72 |         return {
 73 |             "required":{
 74 |                 "gen_text":("TEXT",),
 75 |                 "ref_audio":("AUDIO",),
 76 |                 "model_choice":(["F5-TTS", "E2-TTS"],),
 77 |                 "speed":("FLOAT",{
 78 |                     "default":1.0,
 79 |                     "min":0.5,
 80 |                     "max":2.0,
 81 |                     "step":0.05,
 82 |                     "round":0.001,
 83 |                     "display":"slider"
 84 |                 }),
 85 |                 "remove_silence":("BOOLEAN",{
 86 |                     "default": True,
 87 |                     "tooltip":"The model tends to produce silences, especially on longer audio. We can manually remove silences if needed. Note that this is an experimental feature and may produce strange results. This will also increase generation time."
 88 |                 }),
 89 |                 "split_words":("STRING",{
 90 |                     "default":",".join(SPLIT_WORDS),
 91 |                     "multiline": True,
 92 |                     "dynamicPrompts": True,
 93 |                     "tooltip":"Enter custom words to split on, separated by commas. Leave blank to use default list.",
 94 |                 })
 95 |             },
 96 |             "optional":{
 97 |                 "ref_text":("TEXT",)
 98 |             }
 99 |         }
100 |     
101 |     RETURN_TYPES = ("AUDIO","IMAGE",)
102 |     #RETURN_NAMES = ("image_output_name",)
103 | 
104 |     FUNCTION = "gen_audio"
105 | 
106 |     #OUTPUT_NODE = False
107 | 
108 |     CATEGORY = "AIFSH_F5-TTS"
109 | 
110 |     def gen_audio(self,gen_text,ref_audio,model_choice,speed,
111 |                   remove_silence,split_words,ref_text=None):
112 |         os.makedirs(tmp_dir,exist_ok=True)
113 |         if not split_words.strip():
114 |             custom_words = [word.strip() for word in split_words.split(',')]
115 |             global SPLIT_WORDS
116 |             SPLIT_WORDS = custom_words
117 |         
118 |         print(gen_text)
119 |         
120 |         print("Converting audio...")
121 |         with tempfile.NamedTemporaryFile(delete=False, suffix=".wav",dir=tmp_dir) as f:
122 |             ref_audio_orig = osp.join(tmp_dir,"tmp_ref_audio.wav")
123 |             waveform = ref_audio["waveform"].squeeze(0)
124 | 
125 |             torchaudio.save(ref_audio_orig,waveform,ref_audio["sample_rate"])
126 |             aseg = AudioSegment.from_file(ref_audio_orig)
127 |             # os.remove(ref_audio_orig)
128 | 
129 |             non_silent_segs = silence.split_on_silence(aseg, min_silence_len=1000, silence_thresh=-50, keep_silence=500)
130 |             non_silent_wave = AudioSegment.silent(duration=0)
131 |             for non_silent_seg in non_silent_segs:
132 |                 non_silent_wave += non_silent_seg
133 |             aseg = non_silent_wave
134 | 
135 |             audio_duration = len(aseg)
136 |             if audio_duration > 15000:
137 |                 print("Audio is over 15s, clipping to only first 15s.")
138 |                 aseg = aseg[:15000]
139 |             aseg.export(f.name, format="wav")
140 |             ref_audio = f.name
141 |         
142 |         if ref_text is None:
143 |             print("No reference text provided, transcribing reference audio...")
144 |             pipe = pipeline(
145 |                 "automatic-speech-recognition",
146 |                 model=openai_dir,
147 |                 torch_dtype=torch.float16,
148 |                 device=device,
149 |             )
150 |             ref_text = pipe(
151 |                 ref_audio,
152 |                 chunk_length_s=30,
153 |                 batch_size=128,
154 |                 generate_kwargs={"task": "transcribe"},
155 |                 return_timestamps=False,
156 |             )["text"].strip()
157 |             print("Finished transcription")
158 |         else:
159 |             print("Using custom reference text...")
160 |         
161 |         # Split the input text into batches
162 |         if len(ref_text.encode('utf-8')) == len(ref_text) and len(gen_text.encode('utf-8')) == len(gen_text):
163 |             max_chars = 400-len(ref_text.encode('utf-8'))
164 |         else:
165 |             max_chars = 300-len(ref_text.encode('utf-8'))
166 |         gen_text_batches = split_text_into_batches(gen_text, max_chars=max_chars, split_words=SPLIT_WORDS)
167 |         print('ref_text', ref_text)
168 |         gen_text_batches = text_list_normalize(gen_text_batches)
169 |         for i, gen_text in enumerate(gen_text_batches):
170 |             print(f'gen_text {i}', gen_text)
171 |         print(f"Generating audio using {model_choice} in {len(gen_text_batches)} batches")
172 |         (target_sr, waveform), img_path= infer_batch(ref_audio, ref_text, gen_text_batches, model_choice, remove_silence,speed)
173 |         # print(waveform.shape)
174 |         res_audio = {
175 |             "waveform": torch.from_numpy(waveform).unsqueeze(0).unsqueeze(0),
176 |             "sample_rate": target_sr
177 |         }
178 |         res_img = torch.from_numpy(np.array(Image.open(img_path))/255.0).unsqueeze(0)
179 |         # print(res_img.shape)
180 |         shutil.rmtree(tmp_dir)
181 |         return (res_audio, res_img,)
182 | 
183 | NODE_CLASS_MAPPINGS = {
184 |     "F5TTSNode": F5TTSNode
185 | }
186 | 
187 | def text_list_normalize(texts):
188 |     text_list = []
189 |     for text in texts:
190 |         for tmp in LangSegment.getTexts(text):
191 |             normalize = text_normalize(tmp.get("text"))
192 |             if normalize != "" and tmp.get("lang") == "en" and normalize not in ["."]:
193 |                 if len(text_list) > 0:
194 |                     text_list[-1] += normalize
195 |                 else:
196 |                     text_list.append(normalize)
197 |             elif tmp.get("lang") == "zh":
198 |                 text_list.append(normalize)
199 |             else:
200 |                 if len(text_list) > 0:
201 |                     text_list[-1] += tmp.get("text")
202 |                 else:
203 |                     text_list.append(tmp.get("text"))
204 |     return text_list
205 | 
206 | def load_model(exp_name, model_cls, model_cfg, ckpt_step):
207 | 
208 |     ckpt_path = osp.join(pretrained_dir,f"{exp_name}/model_{ckpt_step}.safetensors")
209 |     # ckpt_path = f"ckpts/{exp_name}/model_{ckpt_step}.pt"  # .pt | .safetensors
210 |     vocab_char_map, vocab_size = get_tokenizer("Emilia_ZH_EN", "pinyin")
211 |     model = CFM(
212 |         transformer=model_cls(
213 |             **model_cfg, text_num_embeds=vocab_size, mel_dim=n_mel_channels
214 |         ),
215 |         mel_spec_kwargs=dict(
216 |             target_sample_rate=target_sample_rate,
217 |             n_mel_channels=n_mel_channels,
218 |             hop_length=hop_length,
219 |         ),
220 |         odeint_kwargs=dict(
221 |             method=ode_method,
222 |         ),
223 |         vocab_char_map=vocab_char_map,
224 |     ).to(device)
225 | 
226 |     model = load_checkpoint(model, ckpt_path, device, use_ema = True)
227 | 
228 |     return model
229 | 
230 | 
231 | 
232 | def infer_batch(ref_audio, ref_text, gen_text_batches, exp_name, remove_silence, speed):
233 |     
234 |     if exp_name == "F5-TTS":
235 |         F5TTS_model_cfg = dict(
236 |             dim=1024, depth=22, heads=16, ff_mult=2, text_dim=512, conv_layers=4
237 |         )
238 |         ema_model = load_model(
239 |             "F5TTS_Base", DiT, F5TTS_model_cfg, 1200000
240 |         )
241 |     elif exp_name == "E2-TTS":
242 |         E2TTS_model_cfg = dict(dim=1024, depth=24, heads=16, ff_mult=4)
243 |         ema_model = load_model(
244 |             "E2TTS_Base", UNetT, E2TTS_model_cfg, 1200000
245 |         )
246 | 
247 |     audio, sr = torchaudio.load(ref_audio)
248 |     if audio.shape[0] > 1:
249 |         audio = torch.mean(audio, dim=0, keepdim=True)
250 | 
251 |     rms = torch.sqrt(torch.mean(torch.square(audio)))
252 |     if rms < target_rms:
253 |         audio = audio * target_rms / rms
254 |     if sr != target_sample_rate:
255 |         resampler = torchaudio.transforms.Resample(sr, target_sample_rate)
256 |         audio = resampler(audio)
257 |     audio = audio.to(device)
258 | 
259 |     generated_waves = []
260 |     spectrograms = []
261 |     comfybar = ProgressBar(len(gen_text_batches))
262 |     for i, gen_text in enumerate(tqdm(gen_text_batches)):
263 |         # Prepare the text
264 |         if len(ref_text[-1].encode('utf-8')) == 1:
265 |             ref_text = ref_text + " "
266 |         text_list = [ref_text + gen_text]
267 |         final_text_list = convert_char_to_pinyin(text_list)
268 | 
269 |         # Calculate duration
270 |         ref_audio_len = audio.shape[-1] // hop_length
271 |         zh_pause_punc = r"[。，、；：？！]"
272 |         ref_text_len = len(ref_text.encode('utf-8')) + 3 * len(re.findall(zh_pause_punc, ref_text))
273 |         gen_text_len = len(gen_text.encode('utf-8')) + 3 * len(re.findall(zh_pause_punc, gen_text))
274 |         duration = ref_audio_len + int(ref_audio_len / ref_text_len * gen_text_len / speed)
275 | 
276 |         # inference
277 |         with torch.inference_mode():
278 |             generated, _ = ema_model.sample(
279 |                 cond=audio,
280 |                 text=final_text_list,
281 |                 duration=duration,
282 |                 steps=nfe_step,
283 |                 cfg_strength=cfg_strength,
284 |                 sway_sampling_coef=sway_sampling_coef,
285 |             )
286 | 
287 |         generated = generated[:, ref_audio_len:, :]
288 |         generated_mel_spec = rearrange(generated, "1 n d -> 1 d n")
289 |         
290 |         vocos = Vocos.from_pretrained("charactr/vocos-mel-24khz")
291 |         generated_wave = vocos.decode(generated_mel_spec.cpu())
292 |         if rms < target_rms:
293 |             generated_wave = generated_wave * rms / target_rms
294 | 
295 |         # wav -> numpy
296 |         generated_wave = generated_wave.squeeze().cpu().numpy()
297 |         
298 |         generated_waves.append(generated_wave)
299 |         spectrograms.append(generated_mel_spec[0].cpu().numpy())
300 |         comfybar.update(1)
301 | 
302 |     # Combine all generated waves
303 |     final_wave = np.concatenate(generated_waves)
304 | 
305 |     # Remove silence
306 |     if remove_silence:
307 |         with tempfile.NamedTemporaryFile(delete=False, suffix=".wav",dir=tmp_dir) as f:
308 |             sf.write(f.name, final_wave, target_sample_rate)
309 |             aseg = AudioSegment.from_file(f.name)
310 |             non_silent_segs = silence.split_on_silence(aseg, min_silence_len=1000, silence_thresh=-50, keep_silence=500)
311 |             non_silent_wave = AudioSegment.silent(duration=0)
312 |             for non_silent_seg in non_silent_segs:
313 |                 non_silent_wave += non_silent_seg
314 |             aseg = non_silent_wave
315 |             aseg.export(f.name, format="wav")
316 |             final_wave, _ = torchaudio.load(f.name)
317 |         final_wave = final_wave.squeeze().cpu().numpy()
318 | 
319 |     # Create a combined spectrogram
320 |     combined_spectrogram = np.concatenate(spectrograms, axis=1)
321 | 
322 |     with tempfile.NamedTemporaryFile(suffix=".png", delete=False,dir=tmp_dir) as tmp_spectrogram:
323 |         spectrogram_path = tmp_spectrogram.name
324 |         save_spectrogram(combined_spectrogram, spectrogram_path)
325 | 
326 |     return (target_sample_rate, final_wave), spectrogram_path
327 | 
328 | 
329 | def split_text_into_batches(text, max_chars=200, split_words=SPLIT_WORDS):
330 |     if len(text.encode('utf-8')) <= max_chars:
331 |         return [text]
332 |     if text[-1] not in ['。', '.', '!', '！', '?', '？']:
333 |         text += '.'
334 |         
335 |     sentences = re.split('([。.!?！？])', text)
336 |     sentences = [''.join(i) for i in zip(sentences[0::2], sentences[1::2])]
337 |     
338 |     batches = []
339 |     current_batch = ""
340 |     
341 |     def split_by_words(text):
342 |         words = text.split()
343 |         current_word_part = ""
344 |         word_batches = []
345 |         for word in words:
346 |             if len(current_word_part.encode('utf-8')) + len(word.encode('utf-8')) + 1 <= max_chars:
347 |                 current_word_part += word + ' '
348 |             else:
349 |                 if current_word_part:
350 |                     # Try to find a suitable split word
351 |                     for split_word in split_words:
352 |                         split_index = current_word_part.rfind(' ' + split_word + ' ')
353 |                         if split_index != -1:
354 |                             word_batches.append(current_word_part[:split_index].strip())
355 |                             current_word_part = current_word_part[split_index:].strip() + ' '
356 |                             break
357 |                     else:
358 |                         # If no suitable split word found, just append the current part
359 |                         word_batches.append(current_word_part.strip())
360 |                         current_word_part = ""
361 |                 current_word_part += word + ' '
362 |         if current_word_part:
363 |             word_batches.append(current_word_part.strip())
364 |         return word_batches
365 | 
366 |     for sentence in sentences:
367 |         if len(current_batch.encode('utf-8')) + len(sentence.encode('utf-8')) <= max_chars:
368 |             current_batch += sentence
369 |         else:
370 |             # If adding this sentence would exceed the limit
371 |             if current_batch:
372 |                 batches.append(current_batch)
373 |                 current_batch = ""
374 |             
375 |             # If the sentence itself is longer than max_chars, split it
376 |             if len(sentence.encode('utf-8')) > max_chars:
377 |                 # First, try to split by colon
378 |                 colon_parts = sentence.split(':')
379 |                 if len(colon_parts) > 1:
380 |                     for part in colon_parts:
381 |                         if len(part.encode('utf-8')) <= max_chars:
382 |                             batches.append(part)
383 |                         else:
384 |                             # If colon part is still too long, split by comma
385 |                             comma_parts = re.split('[,，]', part)
386 |                             if len(comma_parts) > 1:
387 |                                 current_comma_part = ""
388 |                                 for comma_part in comma_parts:
389 |                                     if len(current_comma_part.encode('utf-8')) + len(comma_part.encode('utf-8')) <= max_chars:
390 |                                         current_comma_part += comma_part + ','
391 |                                     else:
392 |                                         if current_comma_part:
393 |                                             batches.append(current_comma_part.rstrip(','))
394 |                                         current_comma_part = comma_part + ','
395 |                                 if current_comma_part:
396 |                                     batches.append(current_comma_part.rstrip(','))
397 |                             else:
398 |                                 # If no comma, split by words
399 |                                 batches.extend(split_by_words(part))
400 |                 else:
401 |                     # If no colon, split by comma
402 |                     comma_parts = re.split('[,，]', sentence)
403 |                     if len(comma_parts) > 1:
404 |                         current_comma_part = ""
405 |                         for comma_part in comma_parts:
406 |                             if len(current_comma_part.encode('utf-8')) + len(comma_part.encode('utf-8')) <= max_chars:
407 |                                 current_comma_part += comma_part + ','
408 |                             else:
409 |                                 if current_comma_part:
410 |                                     batches.append(current_comma_part.rstrip(','))
411 |                                 current_comma_part = comma_part + ','
412 |                         if current_comma_part:
413 |                             batches.append(current_comma_part.rstrip(','))
414 |                     else:
415 |                         # If no comma, split by words
416 |                         batches.extend(split_by_words(sentence))
417 |             else:
418 |                 current_batch = sentence
419 |     
420 |     if current_batch:
421 |         batches.append(current_batch)
422 |     
423 |     return batches
424 | 
425 | 
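
Note on the batching logic above: every length check uses `len(text.encode('utf-8'))`, so `max_chars` behaves as a UTF-8 byte budget rather than a character count, and a typical CJK character costs 3 bytes. The sketch below illustrates this with a simplified greedy packer. The names `utf8_len` and `greedy_pack` are hypothetical helpers introduced only for illustration; they are not part of this repository, and the sketch leaves out the colon, comma, and word fallbacks that the original applies to over-long sentences.

```python
def utf8_len(text: str) -> int:
    """Length in UTF-8 bytes, the measure the splitter compares with max_chars."""
    return len(text.encode("utf-8"))


def greedy_pack(sentences, max_bytes):
    """Greedily concatenate sentences while the running batch fits in max_bytes.

    Simplified assumption: an over-long single sentence is kept as its own
    batch here, whereas the original code goes on to split it further.
    """
    batches, current = [], ""
    for sentence in sentences:
        if utf8_len(current) + utf8_len(sentence) <= max_bytes:
            current += sentence
        else:
            if current:
                batches.append(current)
            current = sentence
    if current:
        batches.append(current)
    return batches


print(utf8_len("hello"))  # 5 ASCII characters -> 5 bytes
print(utf8_len("你好"))    # 2 CJK characters -> 6 bytes
print(greedy_pack(["Hi. ", "你好。", "Bye."], max_bytes=10))
```

With a 10-byte budget, the 9-byte Chinese sentence cannot share a batch with either 4-byte English neighbour, so each sentence ends up in its own batch.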


--------------------------------------------------------------------------------
/f5_model/modules.py:
--------------------------------------------------------------------------------
  1 | """
  2 | ein notation:
  3 | b - batch
  4 | n - sequence
  5 | nt - text sequence
  6 | nw - raw wave length
  7 | d - dimension
  8 | """
  9 | 
 10 | from __future__ import annotations
 11 | from typing import Optional
 12 | import math
 13 | 
 14 | import torch
 15 | from torch import nn
 16 | import torch.nn.functional as F
 17 | import torchaudio
 18 | 
 19 | from einops import rearrange
 20 | from x_transformers.x_transformers import apply_rotary_pos_emb
 21 | 
 22 | 
 23 | # raw wav to mel spec
 24 | 
 25 | class MelSpec(nn.Module):
 26 |     def __init__(
 27 |         self,
 28 |         filter_length = 1024,
 29 |         hop_length = 256,
 30 |         win_length = 1024,
 31 |         n_mel_channels = 100,
 32 |         target_sample_rate = 24_000,
 33 |         normalize = False,
 34 |         power = 1,
 35 |         norm = None,
 36 |         center = True,
 37 |     ):
 38 |         super().__init__()
 39 |         self.n_mel_channels = n_mel_channels
 40 | 
 41 |         self.mel_stft = torchaudio.transforms.MelSpectrogram(
 42 |             sample_rate = target_sample_rate,
 43 |             n_fft = filter_length,
 44 |             win_length = win_length,
 45 |             hop_length = hop_length,
 46 |             n_mels = n_mel_channels,
 47 |             power = power,
 48 |             center = center,
 49 |             normalized = normalize,
 50 |             norm = norm,
 51 |         )
 52 | 
 53 |         self.register_buffer('dummy', torch.tensor(0), persistent = False)
 54 | 
 55 |     def forward(self, inp):
 56 |         if len(inp.shape) == 3:
 57 |             inp = rearrange(inp, 'b 1 nw -> b nw')
 58 | 
 59 |         assert len(inp.shape) == 2
 60 | 
 61 |         if self.dummy.device != inp.device:
 62 |             self.to(inp.device)
 63 | 
 64 |         mel = self.mel_stft(inp)
 65 |         mel = mel.clamp(min = 1e-5).log()
 66 |         return mel
 67 |     
 68 | 
 69 | # sinusoidal position embedding
 70 | 
 71 | class SinusPositionEmbedding(nn.Module):
 72 |     def __init__(self, dim):
 73 |         super().__init__()
 74 |         self.dim = dim
 75 | 
 76 |     def forward(self, x, scale=1000):
 77 |         device = x.device
 78 |         half_dim = self.dim // 2
 79 |         emb = math.log(10000) / (half_dim - 1)
 80 |         emb = torch.exp(torch.arange(half_dim, device=device).float() * -emb)
 81 |         emb = scale * x.unsqueeze(1) * emb.unsqueeze(0)
 82 |         emb = torch.cat((emb.sin(), emb.cos()), dim=-1)
 83 |         return emb
 84 | 
 85 | 
 86 | # convolutional position embedding
 87 | 
 88 | class ConvPositionEmbedding(nn.Module):
 89 |     def __init__(self, dim, kernel_size = 31, groups = 16):
 90 |         super().__init__()
 91 |         assert kernel_size % 2 != 0
 92 |         self.conv1d = nn.Sequential(
 93 |             nn.Conv1d(dim, dim, kernel_size, groups = groups, padding = kernel_size // 2),
 94 |             nn.Mish(),
 95 |             nn.Conv1d(dim, dim, kernel_size, groups = groups, padding = kernel_size // 2),
 96 |             nn.Mish(),
 97 |         )
 98 | 
 99 |     def forward(self, x: float['b n d'], mask: bool['b n'] | None  = None):
100 |         if mask is not None:
101 |             mask = mask[..., None]
102 |             x = x.masked_fill(~mask, 0.)
103 | 
104 |         x = rearrange(x, 'b n d -> b d n')
105 |         x = self.conv1d(x)
106 |         out = rearrange(x, 'b d n -> b n d')
107 | 
108 |         if mask is not None:
109 |             out = out.masked_fill(~mask, 0.)
110 | 
111 |         return out
112 | 
113 | 
114 | # rotary positional embedding related
115 | 
116 | def precompute_freqs_cis(dim: int, end: int, theta: float = 10000.0, theta_rescale_factor=1.):
117 |     # proposed by reddit user bloc97, to rescale rotary embeddings to longer sequence length without fine-tuning
118 |     # has some connection to NTK literature
119 |     # https://www.reddit.com/r/LocalLLaMA/comments/14lz7j5/ntkaware_scaled_rope_allows_llama_models_to_have/
120 |     # https://github.com/lucidrains/rotary-embedding-torch/blob/main/rotary_embedding_torch/rotary_embedding_torch.py
121 |     theta *= theta_rescale_factor ** (dim / (dim - 2))
122 |     freqs = 1.0 / (theta ** (torch.arange(0, dim, 2)[: (dim // 2)].float() / dim))
123 |     t = torch.arange(end, device=freqs.device)  # type: ignore
124 |     freqs = torch.outer(t, freqs).float()  # type: ignore
125 |     freqs_cos = torch.cos(freqs)  # real part
126 |     freqs_sin = torch.sin(freqs)  # imaginary part
127 |     return torch.cat([freqs_cos, freqs_sin], dim=-1)
128 | 
129 | def get_pos_embed_indices(start, length, max_pos, scale=1.):
130 |     # length = length if isinstance(length, int) else length.max()
131 |     scale = scale * torch.ones_like(start, dtype=torch.float32)  # in case scale is a scalar
132 |     pos = start.unsqueeze(1) + (
133 |             torch.arange(length, device=start.device, dtype=torch.float32).unsqueeze(0) *
134 |             scale.unsqueeze(1)).long()
135 |     # clamp positions to max_pos - 1 so overly long sequences do not index out of range
136 |     pos = torch.where(pos < max_pos, pos, max_pos - 1)
137 |     return pos
138 | 
139 | 
140 | # Global Response Normalization layer (Instance Normalization ?)
141 | 
142 | class GRN(nn.Module):
143 |     def __init__(self, dim):
144 |         super().__init__()
145 |         self.gamma = nn.Parameter(torch.zeros(1, 1, dim))
146 |         self.beta = nn.Parameter(torch.zeros(1, 1, dim))
147 | 
148 |     def forward(self, x):
149 |         Gx = torch.norm(x, p=2, dim=1, keepdim=True)
150 |         Nx = Gx / (Gx.mean(dim=-1, keepdim=True) + 1e-6)
151 |         return self.gamma * (x * Nx) + self.beta + x
152 | 
153 | 
154 | # ConvNeXt-V2 Block https://github.com/facebookresearch/ConvNeXt-V2/blob/main/models/convnextv2.py
155 | # ref: https://github.com/bfs18/e2_tts/blob/main/rfwave/modules.py#L108
156 | 
157 | class ConvNeXtV2Block(nn.Module):
158 |     def __init__(
159 |         self,
160 |         dim: int,
161 |         intermediate_dim: int,
162 |         dilation: int = 1,
163 |     ):
164 |         super().__init__()
165 |         padding = (dilation * (7 - 1)) // 2
166 |         self.dwconv = nn.Conv1d(dim, dim, kernel_size=7, padding=padding, groups=dim, dilation=dilation)  # depthwise conv
167 |         self.norm = nn.LayerNorm(dim, eps=1e-6)
168 |         self.pwconv1 = nn.Linear(dim, intermediate_dim)  # pointwise/1x1 convs, implemented with linear layers
169 |         self.act = nn.GELU()
170 |         self.grn = GRN(intermediate_dim)
171 |         self.pwconv2 = nn.Linear(intermediate_dim, dim)
172 | 
173 |     def forward(self, x: torch.Tensor) -> torch.Tensor:
174 |         residual = x
175 |         x = x.transpose(1, 2)  # b n d -> b d n
176 |         x = self.dwconv(x)
177 |         x = x.transpose(1, 2)  # b d n -> b n d
178 |         x = self.norm(x)
179 |         x = self.pwconv1(x)
180 |         x = self.act(x)
181 |         x = self.grn(x)
182 |         x = self.pwconv2(x)
183 |         return residual + x
184 | 
185 | 
186 | # AdaLayerNormZero
187 | # return with modulated x for attn input, and params for later mlp modulation
188 | 
189 | class AdaLayerNormZero(nn.Module):
190 |     def __init__(self, dim):
191 |         super().__init__()
192 | 
193 |         self.silu = nn.SiLU()
194 |         self.linear = nn.Linear(dim, dim * 6)
195 | 
196 |         self.norm = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)
197 | 
198 |     def forward(self, x, emb = None):
199 |         emb = self.linear(self.silu(emb))
200 |         shift_msa, scale_msa, gate_msa, shift_mlp, scale_mlp, gate_mlp = torch.chunk(emb, 6, dim=1)
201 | 
202 |         x = self.norm(x) * (1 + scale_msa[:, None]) + shift_msa[:, None]
203 |         return x, gate_msa, shift_mlp, scale_mlp, gate_mlp
204 | 
205 | 
206 | # AdaLayerNormZero for final layer
207 | # returns only the modulated x for attn input, since there is no further mlp modulation
208 | 
209 | class AdaLayerNormZero_Final(nn.Module):
210 |     def __init__(self, dim):
211 |         super().__init__()
212 | 
213 |         self.silu = nn.SiLU()
214 |         self.linear = nn.Linear(dim, dim * 2)
215 | 
216 |         self.norm = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)
217 | 
218 |     def forward(self, x, emb):
219 |         emb = self.linear(self.silu(emb))
220 |         scale, shift = torch.chunk(emb, 2, dim=1)
221 | 
222 |         x = self.norm(x) * (1 + scale)[:, None, :] + shift[:, None, :]
223 |         return x
224 | 
225 | 
226 | # FeedForward
227 | 
228 | class FeedForward(nn.Module):
229 |     def __init__(self, dim, dim_out = None, mult = 4, dropout = 0., approximate: str = 'none'):
230 |         super().__init__()
231 |         inner_dim = int(dim * mult)
232 |         dim_out = dim_out if dim_out is not None else dim
233 | 
234 |         activation = nn.GELU(approximate=approximate)
235 |         project_in = nn.Sequential(
236 |             nn.Linear(dim, inner_dim),
237 |             activation
238 |         )
239 |         self.ff = nn.Sequential(
240 |             project_in,
241 |             nn.Dropout(dropout),
242 |             nn.Linear(inner_dim, dim_out)
243 |         )
244 | 
245 |     def forward(self, x):
246 |         return self.ff(x)
247 | 
248 | 
249 | # Attention with possible joint part
250 | # modified from diffusers/src/diffusers/models/attention_processor.py
251 | 
252 | class Attention(nn.Module):
253 |     def __init__(
254 |         self,
255 |         processor: JointAttnProcessor | AttnProcessor,
256 |         dim: int,
257 |         heads: int = 8,
258 |         dim_head: int = 64,
259 |         dropout: float = 0.0,
260 |         context_dim: Optional[int] = None, # if not None -> joint attention
261 |         context_pre_only = None,
262 |     ):
263 |         super().__init__()
264 | 
265 |         if not hasattr(F, "scaled_dot_product_attention"):
266 |             raise ImportError("Attention requires PyTorch 2.0, to use it, please upgrade PyTorch to 2.0.")
267 | 
268 |         self.processor = processor
269 | 
270 |         self.dim = dim
271 |         self.heads = heads
272 |         self.inner_dim = dim_head * heads
273 |         self.dropout = dropout
274 | 
275 |         self.context_dim = context_dim
276 |         self.context_pre_only = context_pre_only
277 | 
278 |         self.to_q = nn.Linear(dim, self.inner_dim)
279 |         self.to_k = nn.Linear(dim, self.inner_dim)
280 |         self.to_v = nn.Linear(dim, self.inner_dim)
281 | 
282 |         if self.context_dim is not None:
283 |             self.to_k_c = nn.Linear(context_dim, self.inner_dim)
284 |             self.to_v_c = nn.Linear(context_dim, self.inner_dim)
285 |             if self.context_pre_only is not None:
286 |                 self.to_q_c = nn.Linear(context_dim, self.inner_dim)
287 | 
288 |         self.to_out = nn.ModuleList([])
289 |         self.to_out.append(nn.Linear(self.inner_dim, dim))
290 |         self.to_out.append(nn.Dropout(dropout))
291 | 
292 |         if self.context_pre_only is not None and not self.context_pre_only:
293 |             self.to_out_c = nn.Linear(self.inner_dim, dim)
294 | 
295 |     def forward(
296 |         self,
297 |         x: float['b n d'], # noised input x
298 |         c: float['b n d'] = None,  # context c
299 |         mask: bool['b n'] | None = None,
300 |         rope = None,  # rotary position embedding for x
301 |         c_rope = None,  # rotary position embedding for c
302 |     ) -> torch.Tensor:
303 |         if c is not None:
304 |             return self.processor(self, x, c = c, mask = mask, rope = rope, c_rope = c_rope)
305 |         else:
306 |             return self.processor(self, x, mask = mask, rope = rope)
307 | 
308 | 
309 | # Attention processor
310 | 
311 | class AttnProcessor:
312 |     def __init__(self):
313 |         pass
314 | 
315 |     def __call__(
316 |         self,
317 |         attn: Attention,
318 |         x: float['b n d'], # noised input x
319 |         mask: bool['b n'] | None = None,
320 |         rope = None,  # rotary position embedding
321 |     ) -> torch.FloatTensor:
322 | 
323 |         batch_size = x.shape[0]
324 | 
325 |         # `sample` projections.
326 |         query = attn.to_q(x)
327 |         key = attn.to_k(x)
328 |         value = attn.to_v(x)
329 | 
330 |         # apply rotary position embedding
331 |         if rope is not None:
332 |             freqs, xpos_scale = rope
333 |             q_xpos_scale, k_xpos_scale = (xpos_scale, xpos_scale ** -1.) if xpos_scale is not None else (1., 1.)
334 | 
335 |             query = apply_rotary_pos_emb(query, freqs, q_xpos_scale)
336 |             key = apply_rotary_pos_emb(key, freqs, k_xpos_scale)
337 | 
338 |         # attention
339 |         inner_dim = key.shape[-1]
340 |         head_dim = inner_dim // attn.heads
341 |         query = query.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)
342 |         key = key.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)
343 |         value = value.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)
344 | 
345 |         # mask. e.g. inference got a batch with different target durations, mask out the padding
346 |         if mask is not None:
347 |             attn_mask = mask
348 |             attn_mask = rearrange(attn_mask, 'b n -> b 1 1 n')
349 |             attn_mask = attn_mask.expand(batch_size, attn.heads, query.shape[-2], key.shape[-2])
350 |         else:
351 |             attn_mask = None
352 | 
353 |         x = F.scaled_dot_product_attention(query, key, value, attn_mask=attn_mask, dropout_p=0.0, is_causal=False)
354 |         x = x.transpose(1, 2).reshape(batch_size, -1, attn.heads * head_dim)
355 |         x = x.to(query.dtype)
356 | 
357 |         # linear proj
358 |         x = attn.to_out[0](x)
359 |         # dropout
360 |         x = attn.to_out[1](x)
361 | 
362 |         if mask is not None:
363 |             mask = rearrange(mask, 'b n -> b n 1')
364 |             x = x.masked_fill(~mask, 0.)
365 | 
366 |         return x
367 |     
368 | 
369 | # Joint Attention processor for MM-DiT
370 | # modified from diffusers/src/diffusers/models/attention_processor.py
371 | 
372 | class JointAttnProcessor:
373 |     def __init__(self):
374 |         pass
375 | 
376 |     def __call__(
377 |         self,
378 |         attn: Attention,
379 |         x: float['b n d'], # noised input x
380 |         c: float['b nt d'] = None,  # context c, here text
381 |         mask: bool['b n'] | None = None,
382 |         rope = None,  # rotary position embedding for x
383 |         c_rope = None,  # rotary position embedding for c
384 |     ) -> torch.FloatTensor:
385 |         residual = x
386 | 
387 |         batch_size = c.shape[0]
388 | 
389 |         # `sample` projections.
390 |         query = attn.to_q(x)
391 |         key = attn.to_k(x)
392 |         value = attn.to_v(x)
393 | 
394 |         # `context` projections.
395 |         c_query = attn.to_q_c(c)
396 |         c_key = attn.to_k_c(c)
397 |         c_value = attn.to_v_c(c)
398 | 
399 |         # apply rope for context and noised input independently
400 |         if rope is not None:
401 |             freqs, xpos_scale = rope
402 |             q_xpos_scale, k_xpos_scale = (xpos_scale, xpos_scale ** -1.) if xpos_scale is not None else (1., 1.)
403 |             query = apply_rotary_pos_emb(query, freqs, q_xpos_scale)
404 |             key = apply_rotary_pos_emb(key, freqs, k_xpos_scale)
405 |         if c_rope is not None:
406 |             freqs, xpos_scale = c_rope
407 |             q_xpos_scale, k_xpos_scale = (xpos_scale, xpos_scale ** -1.) if xpos_scale is not None else (1., 1.)
408 |             c_query = apply_rotary_pos_emb(c_query, freqs, q_xpos_scale)
409 |             c_key = apply_rotary_pos_emb(c_key, freqs, k_xpos_scale)
410 | 
411 |         # attention
412 |         query = torch.cat([query, c_query], dim=1)
413 |         key = torch.cat([key, c_key], dim=1)
414 |         value = torch.cat([value, c_value], dim=1)
415 | 
416 |         inner_dim = key.shape[-1]
417 |         head_dim = inner_dim // attn.heads
418 |         query = query.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)
419 |         key = key.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)
420 |         value = value.view(batch_size, -1, attn.heads, head_dim).transpose(1, 2)
421 | 
422 |         # mask. e.g. inference got a batch with different target durations, mask out the padding
423 |         if mask is not None:
424 |             attn_mask = F.pad(mask, (0, c.shape[1]), value = True)  # no mask for c (text)
425 |             attn_mask = rearrange(attn_mask, 'b n -> b 1 1 n')
426 |             attn_mask = attn_mask.expand(batch_size, attn.heads, query.shape[-2], key.shape[-2])
427 |         else:
428 |             attn_mask = None
429 | 
430 |         x = F.scaled_dot_product_attention(query, key, value, attn_mask=attn_mask, dropout_p=0.0, is_causal=False)
431 |         x = x.transpose(1, 2).reshape(batch_size, -1, attn.heads * head_dim)
432 |         x = x.to(query.dtype)
433 | 
434 |         # Split the attention outputs.
435 |         x, c = (
436 |             x[:, :residual.shape[1]],
437 |             x[:, residual.shape[1]:],
438 |         )
439 | 
440 |         # linear proj
441 |         x = attn.to_out[0](x)
442 |         # dropout
443 |         x = attn.to_out[1](x)
444 |         if not attn.context_pre_only:
445 |             c = attn.to_out_c(c)
446 | 
447 |         if mask is not None:
448 |             mask = rearrange(mask, 'b n -> b n 1')
449 |             x = x.masked_fill(~mask, 0.)
450 |             # c = c.masked_fill(~mask, 0.)  # no mask for c (text)
451 | 
452 |         return x, c
453 | 
454 | 
455 | # DiT Block
456 | 
457 | class DiTBlock(nn.Module):
458 | 
459 |     def __init__(self, dim, heads, dim_head, ff_mult = 4, dropout = 0.1):
460 |         super().__init__()
461 |         
462 |         self.attn_norm = AdaLayerNormZero(dim)
463 |         self.attn = Attention(
464 |             processor = AttnProcessor(),
465 |             dim = dim,
466 |             heads = heads,
467 |             dim_head = dim_head,
468 |             dropout = dropout,
469 |             )
470 |         
471 |         self.ff_norm = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)
472 |         self.ff = FeedForward(dim = dim, mult = ff_mult, dropout = dropout, approximate = "tanh")
473 | 
474 |     def forward(self, x, t, mask = None, rope = None): # x: noised input, t: time embedding
475 |         # pre-norm & modulation for attention input
476 |         norm, gate_msa, shift_mlp, scale_mlp, gate_mlp = self.attn_norm(x, emb=t)
477 | 
478 |         # attention
479 |         attn_output = self.attn(x=norm, mask=mask, rope=rope)
480 | 
481 |         # process attention output for input x
482 |         x = x + gate_msa.unsqueeze(1) * attn_output
483 |         
484 |         norm = self.ff_norm(x) * (1 + scale_mlp[:, None]) + shift_mlp[:, None]
485 |         ff_output = self.ff(norm)
486 |         x = x + gate_mlp.unsqueeze(1) * ff_output
487 | 
488 |         return x
489 | 
490 | 
491 | # MMDiT Block https://arxiv.org/abs/2403.03206
492 | 
493 | class MMDiTBlock(nn.Module):
494 |     r""" 
495 |     modified from diffusers/src/diffusers/models/attention.py
496 | 
497 |     notes.
498 |     _c: context related. text, cond, etc. (left part in sd3 fig2.b)
499 |     _x: noised input related. (right part)
500 |     context_pre_only: the last layer only does prenorm + modulation for the context, since no ffn follows
501 |     """
502 | 
503 |     def __init__(self, dim, heads, dim_head, ff_mult = 4, dropout = 0.1, context_pre_only = False):
504 |         super().__init__()
505 | 
506 |         self.context_pre_only = context_pre_only
507 |         
508 |         self.attn_norm_c = AdaLayerNormZero_Final(dim) if context_pre_only else AdaLayerNormZero(dim)
509 |         self.attn_norm_x = AdaLayerNormZero(dim)
510 |         self.attn = Attention(
511 |             processor = JointAttnProcessor(),
512 |             dim = dim,
513 |             heads = heads,
514 |             dim_head = dim_head,
515 |             dropout = dropout,
516 |             context_dim = dim,
517 |             context_pre_only = context_pre_only,
518 |             )
519 | 
520 |         if not context_pre_only:
521 |             self.ff_norm_c = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)
522 |             self.ff_c = FeedForward(dim = dim, mult = ff_mult, dropout = dropout, approximate = "tanh")
523 |         else:
524 |             self.ff_norm_c = None
525 |             self.ff_c = None
526 |         self.ff_norm_x = nn.LayerNorm(dim, elementwise_affine=False, eps=1e-6)
527 |         self.ff_x = FeedForward(dim = dim, mult = ff_mult, dropout = dropout, approximate = "tanh")
528 | 
529 |     def forward(self, x, c, t, mask = None, rope = None, c_rope = None): # x: noised input, c: context, t: time embedding
530 |         # pre-norm & modulation for attention input
531 |         if self.context_pre_only:
532 |             norm_c = self.attn_norm_c(c, t)
533 |         else:
534 |             norm_c, c_gate_msa, c_shift_mlp, c_scale_mlp, c_gate_mlp = self.attn_norm_c(c, emb=t)
535 |         norm_x, x_gate_msa, x_shift_mlp, x_scale_mlp, x_gate_mlp = self.attn_norm_x(x, emb=t)
536 | 
537 |         # attention
538 |         x_attn_output, c_attn_output = self.attn(x=norm_x, c=norm_c, mask=mask, rope=rope, c_rope=c_rope)
539 | 
540 |         # process attention output for context c
541 |         if self.context_pre_only:
542 |             c = None
543 |         else: # if not last layer
544 |             c = c + c_gate_msa.unsqueeze(1) * c_attn_output
545 | 
546 |             norm_c = self.ff_norm_c(c) * (1 + c_scale_mlp[:, None]) + c_shift_mlp[:, None]
547 |             c_ff_output = self.ff_c(norm_c)
548 |             c = c + c_gate_mlp.unsqueeze(1) * c_ff_output
549 | 
550 |         # process attention output for input x
551 |         x = x + x_gate_msa.unsqueeze(1) * x_attn_output
552 |         
553 |         norm_x = self.ff_norm_x(x) * (1 + x_scale_mlp[:, None]) + x_shift_mlp[:, None]
554 |         x_ff_output = self.ff_x(norm_x)
555 |         x = x + x_gate_mlp.unsqueeze(1) * x_ff_output
556 | 
557 |         return c, x
558 | 
559 | 
560 | # time step conditioning embedding
561 | 
562 | class TimestepEmbedding(nn.Module):
563 |     def __init__(self, dim, freq_embed_dim=256):
564 |         super().__init__()
565 |         self.time_embed = SinusPositionEmbedding(freq_embed_dim)
566 |         self.time_mlp = nn.Sequential(
567 |             nn.Linear(freq_embed_dim, dim),
568 |             nn.SiLU(),
569 |             nn.Linear(dim, dim)
570 |         )
571 | 
572 |     def forward(self, timestep: float['b']):
573 |         time_hidden = self.time_embed(timestep)
574 |         time = self.time_mlp(time_hidden)  # b d
575 |         return time
576 | 
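
Note on `TimestepEmbedding` above: it feeds the scalar flow time through `SinusPositionEmbedding` before the MLP. The sketch below reproduces that sinusoidal formula for a single timestep in pure Python, assuming the same default `scale` of 1000 and the same sine-then-cosine layout. The function name `sinus_embedding` is hypothetical and exists only for this illustration; the repository's version operates on batched `torch` tensors.

```python
import math


def sinus_embedding(t: float, dim: int, scale: float = 1000.0):
    """Sinusoidal embedding of one scalar timestep, as a plain list of floats."""
    half_dim = dim // 2
    # Frequencies decay geometrically from 1 down to 1/10000.
    step = math.log(10000) / (half_dim - 1)
    freqs = [math.exp(-step * i) for i in range(half_dim)]
    args = [scale * t * f for f in freqs]
    # First half holds the sines, second half the cosines.
    return [math.sin(a) for a in args] + [math.cos(a) for a in args]


emb = sinus_embedding(0.0, dim=8)
print(len(emb))  # 8
print(emb)       # at t = 0 every sine is 0.0 and every cosine is 1.0
```

Because the frequencies span four orders of magnitude, nearby timesteps differ mostly in the high-frequency components while distant ones also differ in the slow components, which gives the downstream MLP a multi-scale view of `t`.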


--------------------------------------------------------------------------------
/zh_normalization/char_convert.py:
--------------------------------------------------------------------------------
 1 | # coding=utf-8
 2 | # Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
 3 | #
 4 | # Licensed under the Apache License, Version 2.0 (the "License");
 5 | # you may not use this file except in compliance with the License.
 6 | # You may obtain a copy of the License at
 7 | #
 8 | #     http://www.apache.org/licenses/LICENSE-2.0
 9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | # See the License for the specific language governing permissions and
14 | # limitations under the License.
15 | """Traditional and simplified Chinese conversion; a simplified character may correspond to multiple traditional characters.
16 | """
17 | simplified_charcters = '制咖片型超声盘鉴定仔点他命书歌粉巾字帐恤手指记忆棒形转弯沟光○〇㐄㐅㐆㐌㐖毒㐜㐡㐤㐰㐺㑇㑳㒳㒸㔾㗂㗎㝵㞎㞙㞞以㢲㢴㤅㥁㥯㨗㫺㬎㮎㮚㮸㲋㲱㲾㳮涧㵪㶸㷖㷭㹢㹴犬㺢狓㺵碗㽮㿝䍃䔢䖟䖸䗈䗥䗪䝓射䥯䦉䯝鲃鱼䲔䳗鹅䵹鼄䶑一对应映射丁不识下儿子做二休世丘之貉并中台原则串为甚谓干净了百事无成八变五十些人得道鸡升天代如并来去个国政策劲幽灵在欧洲游荡接样萝卜坑侧化传价元论醇共再准刀两断切分耕耘收获钱货物向看旧就绪险刻千金动劳永逸匙零夜半卡通回复返影踪反常态口咬气句话同吐快吹周味呼诺呜品红锅哄而散起唱和问三知生熟团漆黑火糟堆场空块面塌糊涂尘染壁厢夔已足多情露水大早到晚夫妻当关万莫开失古恨套所料既往孔见提师要家主审寸阴难买斗牛小撮部阵局展身层巴掌帆风顺席地带过年计于春头载四季期被蛇怕井绳度愿式份弹顷深前律径心意念差愁孤行俱全房厅交遮打技长把抓死拿眼泪鼻涕钥锁折段抿拍即合扫排掬挥拨拥上入击洞掷揽改故辙败文值名斑方面旁族日秋餐隔雅里终父旦时晌会霎间晃暴寒曝更月望垠际朝夕本正经利杯羹东西板枝独秀根筋杆进条龙服务概模次函数又性程总付步脚印趋登毛拔呵氧氮碳决雌雄波未平派谎言流清楚白准溜烟潭有获闻是处降琴鹤甲病发可拾沙目然了直以相眨穿睹瞥瞬矢的解石鸟神教秉虔诚秘种窝蜂穷窍笑置笔苟勾销抹杀煞等奖箍节吃箭仇双雕诗筹箩筐系列纸级士官统丝毫挂维网尽线微吭响股脑胎脉承腔臂力致效资源址器举功投般说讲规贸易叶障着慎满皆输号木电池衣倾钟高低视仁觉醒览遗角银币触溃九鼎蔽抄出驷马追重语破贫洗贯走路安蹴至几蹶振跃役胆汗较辈轮辞赞退六连遍递边针血锤音错门思闪真倒项栽雾类保护川先惊乍体哄鳞爪鸣滴泡邻域党专鼓作齐炒丑烯亥克内酯冬加奴卯肝炎基尺梁街裤镐客宠庭巳汝昌烷玲磊糖肇酉醛啷青县韪良香骨鲷丂七集河市弦喜嘴张舌堵区工业姊妹星架构巧彩扭歪拼凑余热曜武州爷浮屠美乡老阶树荤素碎落能魄鳃鳗珠丄丅丆万俟丈尚摸母娘量管群亚虎必我堂令申件装伏位博侠义界表女墟台戏臭皮匠胜诸葛亮赛顶倍催请运算包立叉戟离疫苗土史志演围揭瓦晒夷姑婆帝村宝烂尖杉碱屉桌山岔岛由纪峡坝库镇废从德后拗汤治旬食明昧曹朋友框栏极权幂曲归依猫民氟硼氯磷铁江侗自旅法司洋浦梅园温暖湾焦班幸用田略番叠皇炮捶硝苯酸腺苷棱草镜穗跳远索锦纲聚氰胺联店胚膲爱色堇紫罗兰芝茶饭菱云虫藏藩乱叛苏亲债凳学座恐恋柱测肌腹衩锥系貂企乌跪叩军车农题迭都甘油屯奏键短阿姨陪姐只顾茅庐槽驾魂鲜鹿页其菜单乘任供势午齿汉组织吊调泻唇坡城报坟外夸将尉建筑岸岗公床扬新剑升杭林栗校楼标款汽社浣海商馆剧院钢华港机械广媒环球融第医科证券综财乐育游涨犹岭疏瘾睑确兵领导缴肢膛船艾瑟尔苍蔡虞效衫覆访诉课谕议轨述野钩限敌鞋颌颔颚饶首龈站例修凡划垂届属崽颏厨拜挫摆放旋削棋榻槛礼沉注滑营狱画确仪聘花葬诏员跌辖周达酒锚闸陷陆雨雪飞威丌于丹久乏予理评产亢卑亦乎舞己悲矩圆词害志但住佞佳便俗信票案幅翁倦伦假偏倚斜亏鬼敲停备伤脾胃仅此像俭匮免宜穴焉戴兼容许冻伯仲负彼昼皂轩轾实刊划颠卫战哥比省非好黄饰别拘束掩奶睬选择摇扰烦苦枚写协厌及格受欢迎约只估侵犯割状告或缺抗拒挽撤救药喻磨灭端倪少逆逾越避靠适吉誉吝玉含延咎歹听啻渊善谋均匀堪忍够太惹妙妥妨孕症孝术室完纳推冠积宣疑辩栗碴称屈挠屑干涉衡待很忙恶忿怎么怠急耻恭息悦惑惜惟想愉愧怍慌愤启懂懈怀材才紧招认扣抵拉舍也罢插揣冒搭撞南墙扩核支攻敢雷攀敬里吗需景智暇曾罪遇朽枉止况竞争辱求愈渝溶济左右袒困补爽特寂寞示弱找谢畏强疾徐痛痒冤符眠睦瞅董何厚云措活疲羞者轻玻璃祥兆禁移稂莠稳佛换答简结果盟绝缕途给谈否羁翼耐肖胫毋宁兴舒若菲莱痕迹窠臼虚衰脸兔撒鹰棺范该详讳抬泰让须眉象众赀账费灰赖奇虑训辍辨菽麦辛近送透逞徒速续逮捕遂遑违逊斧钺艰醉锈随观弃显饱脂肪使丏丐帮丒且慢末丕替桃宗王尊凉爵各图屋脊粮署录坛吾禄职胄袭君厦丗北壑桐疹损逢陵鹬丙寅戌氨腈唑纶辰酮脱氢酶醚丞丢现掉纱帽弄扯炮碗丠両丣坐存激肩臻蒂莲悖序驱丨丩丫挺杈髻鬟细介俄伊犁京尼布订普渡央委监察检查剂圈设警队斯督剩震境航舶革防托播促质版蝾螈锋研艺历残消频谱精密制造陲邮候埔坚压坜凹汇执府究邦俘摄寮彬狼岳肺肿庸英讯诊埋粒胞括控码韩暑枪枢砥澳哇牟寿甸钻探篇签缀缝继耳肯照妇埃悬璧轴柜台辣搁浅邪跑纤阮阳私囊魔丮丰姿采丱烧丳丵丶丷丸参寨朗桂瑞砂衷霞貌凤仆舰因嫌宰峰干络牌持旨祭祷簿编罚宾办丼丿乀乂乃乄仰慕盛旷留考验阔乆乇么丑麽乊湖燃乑乒乓乕乖僻忤戾离谬迕乗危肥劫除隙浪婿乙炔肠酰吡咯盐乚乛乜嘢卿玄宫尾狐龟塔嶷兄弟泉章霄钉耙乞扎哀怜恕讨乢乣乤乥乧乨乩童乪乫乭乳晕汁液瑶浆牙癌突窦罩腐胶猪酪蛋糕菌瘤乴乵乶乷乸乹乺乼乾俸冰嘉哕嚎坤妈尸垒旱枯涸俐渴潮涩煸豆燥爹瘦瘪癣瞪袋脆姜贝隆馏乿亀亁叫咕攘扔搞男砸窜蓬麻亃亄亅却亇迟典今临繁累卵奉婚聪躬巨与迁添裂副宿岁怪恶尕仑愣杆硅硫钛铀锰芑杂异钠砷胂磺琥珀舱棍簧胡茬盗
浩盆贩郎腿亍洪亐互欠助勉惠操斥诿系户译亓墓碑刑铃卅渠缤纷斗米旗宪钒灯徽瘟祖拳福谷丰脏腑绑肉腌苓蕴桥铺霸颜闹判喷冈底蛙陉矿亖亘亜罕们娜桑那努哈喀弗烈曼松森杜氏杯奥琛敦戊穆圣裔汇薛孙亟亡佚虏羊牢奋释卷卸契媾感额睫缠谊趾塞挤纽阻还配驰庄亨洛祚亪享津沪畿郊慈菴枇杷膏亭阁锃丽亳亶亹诛初责翻疯偶杰丛稠妖拖寰居吸授慧蜗吞壮魅狗矛盾益渣患忧稀描猿梦暂涯畜祸缘沸搜引擎臣横纭谁混援蒸兽狮税剖亻亼亽亡什献刹邡么仂仃仄仆富怨仈仉毕昔晨壳绍仍仏仒仕宦仗欺恃腰叹叹炬梓讫施仙后琼逝仚仝仞仟悔仡佬偿填泊拓扑簇羔购顿钦佩发棻阃驭养亿儆尤借帧赈凌叙帖李柔刚沃眦睚戒讹取飨读仨仫仮著泳卧躺韶夏裁仳仵唯贤凭钓诞仿似宋佛讽伀硕盼鹅伄儅伈伉俪柯始娃迈戈坦堡帕茨萨庙玛莉莎藤霍姆伋伍奢胥廷芳豪伎俩侍汛勒希羲雏伐憩整谟闲闲伕伙伴颐伜伝伢叔恒兹恩翰伱伲侣伶俜悧鼬伸懒缩喇叭伹伺伻伽倻辐伾似佃伫布乔妮墨佉卢佌贷劣廉昂档浓矮伞洼缓耗胸谷迷挡率龋宅沫舍疗佐贰佑占优据铧尝呢须鲁晓佗佘余坪寺瓜铳僧蒙芒陀龛哼呕坊奸孽弊揖祟茧缚誓贼佝偻瞀佟你夺赶佡佢佣佤佧贾佪佫佯佰佱洁绩酿肴佴卷佶佷佸佹佺佻佼佽佾具唤窘坏娱怒慨硬习惯聋膨胀蔓骇贵痹侀侁侂侃侄侅鸿燕侇侈糜靡侉侌妾侏儒仓鼠侐侑侔仑侘侚链侜偎傍钴循柳葫芦附価侮骂蔑侯岩截蚀局贴壶嬛宴捷携桶笺酌俣狭膝狄俅俉俊俏俎俑俓俔谚俚俛黎健呈固墒增守康箱湿祐镖镳杠盒靖膜龄俞豹猎噪孚封札筒托衍鸽剪撰稿炼厂禊练缮葺俯瞰撑冲效俳俴俵俶俷俺备俾伥倂倅储卒惶敷猝逃颉蓄崇隐倌倏忽刺蜡烛噍嚼坍扁抽毙葱楣灌灶粪背薮卖赔闭霉腾倓倔幸倘倜傥倝借箸挹浇阅倡狂倢倣値倥偬倨傲倩匡嗣冲柝珍倬倭寇猩倮倶倷倹勤赞偁偃充伪吏嗓寐惺扮拱芫茜藉虢钞偈伟晶偌宕距析滤殿疼瘫注颇偓偕鸭歇滞偝偟偢忘怡旺偨偩逼偫偭偯偰偱偲侦缉蹄偷减惰漏窥窃偸偺迹傀儡傅傈僳骂篱傎奎琳迪叟芭傒傔傕伧悉荒傜傞傢傣芽逼佣婢傮睨寄檄诵谣颂伛担辜弓惨蒿悼疤傺傻屄臆巢泄箧羡盖轧颓傿㑩僄僇佥僊働僎侨僔僖僚僝伪僣僤侥僦猴偾僩僬僭僮僯僰雇僵殖签静僾僿征陇儁侬儃儇侩朴薄儊儋儌儍傧儓俦侪拟尽儜儞儤儦儩汰哉寡渥裕酷儭儱罐儳儵儹傩俨儽兀臬臲鹫允勋勋宙宵帅憝彝谐嫂阋畅沛溢盈饥赫凶悍狠猛顽愚妣斩秦遣鞭耀敏荣槃泽爆碟磁秃缆辉霁卤朵娄孜烽酱勃汀箕裘钳耶蒙蕾彻兑软遭黜兎児韵媳爸兕觥兖兙兛兜售鍪肚兝兞兟兡兢兣樽殓涅睡禀籍赘泌啡肽奸幕涵涝熵疚眷稃衬讧赴焕椒歼植跏没试误猜栖窗肋袖颊兪卦撇胡岐廓轿疸枫茴珑厕秩募勺吨寓斤历亩迫筷厘最淫螺韬兮宽匪筛襄赢轭复兲诈刃堰戎痞蚁饷它冀铸冂冃円冇冉册嫁厉砺竭醮冏牧冑冓冔冕冖冗冘冞冢窄抑诬冥冫烘菇蛰冷凝坨橇淇淋炭饼砖碛窖醋雕雹霜冱冶炉艳嘲峻滩淡漠煖飕饮冼冽凃凄怆梗凅凇净凊凋敝蒙凔凛遵汞脢凞几凢処凰凯凵凶焰凸折刷纹预丧喽奔巡榜殡芙蓉租笼辑鞘萃凼锯镬刁蛮刂娩崩批拆摊掰蘖骤歧颗秒袂赃勿嘱忌磋琢肤刈羽刎讼戮舂桨艇刓刖霹雳刜创犊刡恙墅帜筵致劫劫刨昏默攸尿欲熏润薰圭删刮痧铲刱刲刳刴刵踏磅戳柏槐绣芹苋猬舟铭鹄鹜劫剁剃辫刭锉履铅克剌姻咽哨廊掠桅沿召瞻翅赵卜渺茫郭剒剔剕沥剚愎毅讷才剜剥啄采剞剟剡剣剤䌽剐肾驶黏剰袍剀紊铲剸剺剽剿劁劂札劈啪柴扳啦刘奭姥夼昫涓熙禅禹锡翔雁鹗刽刿弩柄蜻蛉劒劓劖劘劙澜篑赏矶釜晋甜薪逐劦熔纣虐赤囚劬劭労劵效劻劼劾峭艮勅勇励勍勐腊脖庞漫饲荡粥辄勖勗勘骄馁碌泮雇捐竹骑殊阱绩朴恳谨剿勧勩勯勰劢勋勷劝惩慰诫谏勹芡践阑匁庇拯粟扎袱裹饺匆遽匈匉匊匋匍匐茎匏匕妆痰脓蛹斋苑烤蹈塘羌熊阀螳螂疆碚竿纬荷茵邙魏匚匜匝匟扶稷匣匦拢匸匹耦匽匾匿卂叮疮禧轸堤棚迢钧炼卄卆遐卉瓷盲瓶当胱腱裸卋卌卍卐怯污贱鄙龌龊陋卓溪唐梯渔陈枣泥漳浔涧梨芬谯赡辕迦郑単驴弈洽鳌卛占筮卝卞卟吩啉屎翠厄卣卨卪卬卮榫袄玺绶钮蚤惧殆笃耸卲帘帙绕恤卼卽厂厎厓厔厖厗奚厘厍厜厝谅厕厤厥厪腻孢厮厰厳厣厹厺粕垢芜菁厼厾叁悟茸薯叄吵笄悌哺讥坫垄弧芯杠潜婴刍袁诘贪谍煽馈驳収岳缔灾贿骗叚叡吻拦蘑蜜诀燧玩砚筝椎蔺铜逗骊另觅叨唠谒杵姓喊嚷嚣咚咛塑寻恼憎擦只泣渗蝠叱吒咄咤喝籀黛舵舷叵叶铎懿昭穰苴辽叻叼吁堑嫖赌瞧爬众抒吅吆夥卺橡涤抱纵摩郡唁坠扇篮膀袜颈吋忾谘酬哭妓媛暗表缰迩妃羿絮蕃浑拐葵暮隅吔吖啶嗪戚吜啬噬咽吟哦咏吠吧唧嗒咐吪隽咀征燐苞茹钙哧吮吰吱嘎吲哚吴栋娇窟孟箫忠晗淞阖闾趼宇呐睛嘘拂捧疵熄竽笛糠吼吽呀吕韦蒙呃呆笨呇贡呉罄呋喃呎呏呔呠呡痴呣呤呦呧瑛眩扒晬淑姬瑜璇鹃呪呫哔嚅嗫呬呯呰呱呲咧噌钝呴呶呷呸呺呻哱咻啸噜吁坎坷逻呿咁咂咆哮咇咈咋蟹煦珅蔼咍咑咒诅咔哒嚓咾哝哩喱咗咠咡咢咣咥咦咨嗟询咩咪咫啮啮咭咮咱咲咳呛嗽咴啕咸咹咺呙喉咿婉恸悯赋矜绿茗蓝哂抢瞒哆嗦啰噻啾滨彗哋哌哎唷哟哏哐哞哢哤哪里哫啼喘哰哲萎蚌哳咩哽哿呗唅唆唈唉唎唏哗尧棣殇璜睿肃唔睇唕吣唞唣喳唪唬唰喏唲唳唵嘛唶唸唹唻唼唾唿啁啃鹦鹉啅埠栈榷祺铺鞅飙啊啍啎啐啓啕啖啗啜哑祈啢衔啤啥啫啱啲啵啺饥啽噶昆沁喁喂喆裙喈咙喋喌喎喑喒喓喔粗喙幛庆滋鹊喟喣喤喥喦喧骚喨喩梆吃葡萄喭驼挑吓碰枞
瓣纯疱藻趟铬喵営喹喺喼喿嗀嗃嗄嗅嗈嗉嗊嗍嗐嗑嗔诟嗕嗖嗙嗛嗜痂癖嗝嗡嗤嗥嗨唢嗬嗯嗰嗲嗵叽嗷嗹嗾嗿嘀嘁嘂嘅惋嘈峪禾荫啀嘌嘏嘐嘒啯啧嘚唛嘞嘟囔嘣嘥嘦嘧嘬嘭这谑严敞馋松哓嘶嗥呒虾嘹嘻啴嘿噀噂噅噇噉噎噏噔噗噘噙噚咝噞噢噤蝉皿噩噫噭嗳噱哙噳嚏涌洒欲巫霏噷噼嚃嚄嚆抖哜尝嚔苏嚚嚜嚞嚟呖嚬嚭嚮嚯亸喾饬按竣苛嚵嘤啭冁呓膪谦囍囒囓囗囘萧酚飘溅谛囝溯眸纥銮鹘囟殉囡団囤囥囧囨囱囫囵囬囮囯囲図囶囷囸囹圄圉拟囻囿圀圂圃圊粹蠹赦圌垦圏滚鲱凿枘圕圛圜圞坯埂壤骸炕祠窑豚绅魠鲮鳖圧握圩圪垯圬圮圯炸岬幔毯祇窨菩溉圳圴圻圾坂坆沾坋坌舛壈昆垫墩椅坒坓坩埚坭坰坱坳坴坵坻坼杨挣涎帘垃垈垌垍垓垔垕垗垚垛垝垣垞垟垤垧垮垵垺垾垿埀畔埄埆埇埈埌殃隍埏埒埕埗埜垭埤埦埧埭埯埰埲埳埴埵埶绋埸培怖桩础辅埼埽堀诃侄庑堃堄摧磐贞韧砌堈堉垩堋堌堍堎垴堙堞堠礁堧堨舆堭堮蜓摘堲堳堽堿塁塄塈煤茔棵塍垲埘塓绸塕鸦沽虱塙冢塝缪塡坞埙塥塩塬塱场螨塼塽塾塿墀墁墈墉墐夯増毁墝墠墦渍钵墫墬堕墰墺墙橱壅壆壊壌壎壒榨蒜壔壕壖圹垆壜壝垅壡壬壭壱売壴壹壻壸寝壿夂夅夆変夊夌漱邑夓腕泄甥御骼夗夘夙衮瑙妊娠醣枭珊莺鹭戗幻魇夤蹀秘擂鸫姚宛闺屿庾挞拇賛蛤裨菠氅漓捞湄蚊霆鲨箐篆篷荆肆舅荔鲆巷惭骰辟邱镕镰阪漂烩鲵鲽鳄鸨胪鹏妒峨谭枰晏玑癸祝秤竺牡籁恢罡蝼蝎赐绒御梭夬夭砣榆怙枕夶夹馅奄崛葩谲奈贺祀赠奌奂奓奕䜣詝奘奜奠奡奣陶奨奁魁奫奬奰娲孩贬隶酥宄狡猾她姹嫣妁毡荼皋膻蝇嫔妄妍嫉媚娆妗趣妚妞妤碍妬娅妯娌妲妳妵妺姁姅姉姗姒姘姙姜姝姞姣姤姧姫姮娥姱姸姺姽婀娀诱慑胁娉婷娑娓娟娣娭娯娵娶娸娼婊婐婕婞婤婥溪孺婧婪婬婹婺婼婽媁媄媊媕媞媟媠媢媬媮妫媲媵媸媺媻媪眯媿嫄嫈袅嫏嫕妪嫘嫚嫜嫠嫡嫦嫩嫪毐嫫嫬嫰妩嫺娴嫽嫿妫嬃嬅嬉耍婵痴艳嬔嬖嬗嫱袅嫒嬢嬷嬦嬬嬭幼嬲嬴婶嬹嬾嬿孀娘孅娈孏曰癫屏孑孓雀孖斟篓谜摺孛矻鸠崮轲祜鸾孥邈毓棠膑孬孭孰孱孳孵泛罔衔孻孪宀宁冗拙株薇掣抚琪瓿榴谧弥宊濂祁瑕宍宏碁宓邸谳実潢町宥宧宨宬徵崎骏掖阙臊煮禽蚕宸豫寀寁寥寃檐庶寎暄碜寔寖寘寙寛寠苫寤肘洱滥蒗陕核寪弘绰螽宝擅疙瘩晷対檐専尃尅赎绌缭畴衅尌峙醌襟痲碧屁昊槌淘恵瀑牝畑莓缸羚觑蔻脏躁尔尓锐尗尙尜尟尢尥尨尪尬尭尰擒尲尶尴尸尹潽蠖蛾尻扣梢蚴鳍脬蹲屇屌蚵屐屃挪屖屘屙屛屝屡屣峦嶂岩舄屧屦屩屪屃屮戍驻钾崖嵛巅旮旯楂榄榉芋茱萸靛麓屴屹屺屼岀岊岌岍阜岑彭巩岒岝岢岚岣岧岨岫岱岵岷峁峇峋峒峓峞峠嵋峨峰峱岘峹峿崀崁崆祯崋崌崃岖昆崒崔嵬巍萤颢崚崞崟崠峥巆崤崦崧殂岽崱崳崴崶崿嵂嵇嵊泗嵌嵎嵒嵓岁嵙嵞嵡嵩嵫嵯嵴嵼嵾嵝崭崭晴嶋嶌嶒嶓嵚崂嶙嶝嶞峤嶡嶢峄嶨嶭嶮嶰嶲岙嵘巂巃巇巉岿巌巓巘巛滇芎巟巠弋回巣巤炊擘蜥蟒蛊觋巰蜀彦淖杏茂甫楞巻巽帼巿帛斐鲫蕊帑帔帗帚琉汶帟帡帣帨裙帯帰帷帹暆帏幄帮幋幌幏帻幙帮幞幠幡幢幦幨幩幪帱幭幯幰遥蹉跎馀庚鉴幵幷稚邃庀庁広庄庈庉笠庋跋庖牺庠庤庥鲸庬庱庳庴庵馨衢庹庿廃厩廆廋廌廎廏廐廑廒荫廖廛厮搏锣廞弛袤廥廧廨廪廱绵踵髓廸迫瓯邺廻廼廾廿躔弁皱弇弌弍弎弐弑吊诡憾荐弝弢弣弤弨弭弮弰弪霖繇焘斌旭溥骞弶弸弼弾彀彄别累纠强彔彖彘彟彟陌彤贻彧绘虹彪炳雕蔚鸥彰瘅彲彳彴仿彷徉徨彸彽踩敛旆徂徇徊渭畲铉裼従筌徘徙徜徕膳苏萌渐徬徭醺徯徳徴潘徻徼忀瘁胖燎怦悸颤扉犀澎湃砰恍惚绞隘忉惮挨饿忐忑忒忖応忝忞耿忡忪忭忮忱忸怩忻悠懑怏遏怔怗怚怛怞怼黍讶怫怭懦怱怲恍怵惕怸怹恁恂恇恉恌恏恒恓恔恘恚恛恝恞恟恠恣恧眄恪恫恬澹恰恿悀悁悃悄悆悊悐悒晦悚悛悜悝悤您悩悪悮悰悱凄恻德悴怅惘闷悻悾惄愫钟蒐惆惇惌惎惏惓惔惙惛耄惝疟浊恿惦德恽惴蠢惸拈愀愃愆愈愊愍愐愑愒愓愔愕恪氓蠢騃昵惬赧悫愬愮愯恺愼慁恿慅慆慇霭慉慊愠慝慥怄怂慬慱悭慴慵慷戚焚憀灼郁憃惫憋憍眺捏轼愦憔憖憙憧憬憨憪憭怃憯憷憸憹憺懃懅懆邀懊懋怿懔懐懞懠懤懥恹懫懮懰懱毖懵遁梁雍忏懽戁戄戆戉戋戕戛戝戛戠戡戢戣戤戥戦戬戭戯轰戱披菊牖戸戹戺戻卯戽锹扂楔扃扆扈扊杖牵绢铐镯赉扐搂搅烊盹瞌跟趸镲靶鼾払扗玫腮扛扞扠扡扢盔押扤扦扱罾揄绥鞍郤窾扻扼扽抃抆抈抉抌抏瞎抔缳缢擞抜拗択抨摔歉蹿牾抶抻搐泵菸拃拄拊髀抛拌脯拎拏拑擢秧沓曳挛迂拚拝拠拡拫拭拮踢拴拶拷攒拽掇芥橐簪摹疔挈瓢骥捺蹻挌挍挎挐拣挓挖掘浚挙揍聩挲挶挟挿捂捃捄捅捆捉捋胳膊揎捌捍捎躯蛛捗捘捙捜捥捩扪捭据捱捻捼捽掀掂抡臀膘掊掎掏掐笙掔掗掞棉芍掤搪阐掫掮掯揉掱掲掽掾揃揅揆搓揌诨揕揗揘揜揝揞揠揥揩揪揫橥遒麈揰揲揵揶揸背揺搆搉搊搋搌搎搔搕撼橹捣搘搠搡搢搣搤搥搦搧搨搬楦裢讪赸掏搰搲搳搴揾搷搽搾搿摀摁摂摃摎掴摒摓跤摙摛掼摞摠摦喉羯摭摮挚摰摲抠摴抟摷掺摽撂撃撅稻撊撋挦锏泼撕撙撚㧑挢撢掸撦撅撩撬撱朔揿蚍蜉挝捡擀掳闯擉缶觚擐擕擖擗擡擣擤澡腚擧擨擩擫擭摈拧撷擸撸擽擿攃摅撵攉攥攐攓撄搀撺每攩攫辔澄攮攰攲攴轶攷砭讦攽碘敁敃敇敉叙敎筏敔敕敖闰诲敜煌敧敪敳敹敺敻敿斁衽斄牒绉诌斉斎斓鹑谰驳鳢斒筲斛斝斞斠斡斢斨斫斮晾沂潟颖绛邵斲斸釳於琅斾斿旀旗旃旄涡旌旎旐旒旓旖旛旝旟旡旣浴旰獭魃旴时旻旼旽昀昃昄昇昉晰躲澈熹皎皓矾昑昕
昜昝昞昡昤晖笋昦昨是昱昳昴昶昺昻晁蹇隧蔬髦晄晅晒晛晜晞晟晡晢晤晥曦晩萘莹顗晿暁暋暌暍暐暔暕煅旸暝暠暡曚暦暨暪朦胧昵暲殄冯暵暸暹暻暾曀晔昙曈曌曏曐暧曘曙曛叠昽曩骆曱甴肱曷牍禺锟曽沧耽朁朅朆杪栓夸竟粘绦朊膺朏朐朓朕朘朙瞄觐溘饔飧朠朢朣栅椆淀虱朩朮朰朱炆璋钰炽鹮朳槿朵朾朿杅杇杌陧欣钊湛漼楷瀍煜玟缨翱肇舜贽适逵杓杕杗杙荀蘅杝杞脩珓筊杰榔狍閦颦缅莞杲杳眇杴杶杸杻杼枋枌枒枓衾葄翘纾逋枙狸桠枟槁枲枳枴枵枷枸橼枹枻柁柂柃柅柈柊柎某柑橘柒柘柙柚柜柞栎柟柢柣柤柩柬柮柰柲橙柶柷柸柺査柿栃栄栒栔栘栝栟柏栩栫栭栱栲栳栴檀栵栻桀骜桁镁桄桉桋桎梏椹葚桓桔桕桜桟桫椤桭杯桯桲桴桷桹湘溟梃梊梍梐潼栀枧梜梠梡梣梧梩梱梲梳梴梵梹棁棃樱棐棑棕榈簑绷蓑枨棘棜棨棩棪棫棬棯棰棱棳棸棹椁棼碗椄苕椈椊椋椌椐椑椓椗検椤椪椰椳椴椵椷椸椽椿楀匾楅篪楋楍楎楗楘楙楛楝楟楠楢楥桢楩楪楫楬楮楯楰梅楸楹楻楽榀榃榊榎槺榕榖榘榛狉莽搒笞榠榡榤榥榦榧杩榭榰榱梿霰榼榾桤槊闩槎槑槔槖様槜槢槥椠槪槭椮槱槲槻槼槾樆樊樏樑樕樗樘樛樟樠樧樨権樲樴樵猢狲桦樻罍樾樿橁橄橆桡笥龠橕橚橛辆椭橤橧竖膈跨橾橿檩檃檇柽檍檎檑檖檗桧槚檠樯檨檫檬梼槟檴檵柠棹櫆櫌栉櫜椟櫡槠栌枥榇栊櫹棂茄櫽欀欂欃欐欑栾欙棂溴欨欬欱欵欶欷歔欸欹欻欼欿歁歃歆艎歈歊莳蝶歓歕歘歙歛歜欤歠蹦诠镶蹒跚升陟歩歮歯歰歳歴璞歺瞑歾殁夭殈殍殑殗殜殙殛殒殢殣殥殪殚僵殰殳荃殷殸殹蛟殻肴谤殴毈毉喂毎毑蕈毗毘毚茛邓毧毬毳毷毹毽毾毵牦氄氆靴氉氊氇氍氐聊氕氖気氘氙氚氛氜氝氡汹焊痉氤氲氥氦铝锌氪烃氩铵痤汪浒漉痘盂碾菖蒲蕹蛭螅氵冰氹氺氽烫氾氿渚汆汊汋汍汎汏汐汔汕褟汙汚汜蓠沼秽蔑汧汨汩汭汲汳汴堤汾沄沅沆瀣沇沈葆浸沦湎溺痼疴沌沍沏沐沔沕沘浜畹砾沚沢沬沭沮沰沱灢沴沷籽沺烹濡洄泂肛泅泆涌肓泐泑泒泓泔泖泙泚泜泝泠漩馍涛粼泞藓鳅泩泫泭泯铢泱泲洇洊泾琵琶荽蓟箔洌洎洏洑潄濯洙洚洟洢洣洧洨洩痢滔洫洮洳洴洵洸洹洺洼洿淌蜚浄浉浙赣渫浠浡浤浥淼瀚浬浭翩萍浯浰蜃淀苔蛞蝓蜇螵蛸煲鲤浃浼浽溦涂涊涐涑涒涔滂莅涘涙涪涫涬涮涴涶涷涿淄淅淆淊凄黯淓淙涟淜淝淟淠淢淤渌淦淩猥藿亵淬淮淯淰淳诣涞纺淸淹炖癯绮渇済渉渋渓渕涣渟渢滓渤澥渧渨渮渰渲渶渼湅湉湋湍湑湓湔黔湜湝浈湟湢湣湩湫湮麟湱湲湴涅満沩溍溎溏溛舐漭溠溤溧驯溮溱溲溳溵溷溻溼溽溾滁滃滉滊荥滏稽滕滘汇滝滫滮羼耷卤滹浐煎漈漊漎绎漕漖漘漙沤漜漪漾漥漦漯漰溆漶漷濞潀颍潎潏潕潗潚潝潞潠潦祉疡潲潵滗潸潺潾涠澁澂澃澉澌澍澐澒澔澙渑澣澦澧澨澫澬浍澰澴澶澼熏郁濆濇濈濉濊貊濔疣濜濠濩觞浚濮盥潍濲泺瀁滢渎渖瀌浏瀒瀔濒泸瀛潇潆瀡潴泷濑瀬弥潋瀳瀵瀹瀺瀼沣滠灉灋灒漓灖灏灞灠滦灥灨滟灪蜴灮烬獴灴灸灺炁炅鱿炗炘炙炤炫疽烙钎炯炰炱炲炴炷毁炻烀烋瘴鲳烓烔焙烜烝烳饪烺焃焄耆焌焐焓焗焜焞焠焢焮焯焱焼煁煃煆煇煊熠煍熬煐炜煕暖熏硷霾煚煝煟煠茕矸煨琐炀萁煳煺煻熀熅熇熉罴荧穹炝熘熛熜稔谙烁熤熨熯熰眶蚂颎熳熸熿燀烨燂燄盏燊燋燏燔隼燖焖燠燡灿燨燮燹燻燽燿爇爊爓爚爝爟爨蟾爯爰为爻丬爿牀牁牂牄牋窗牏牓窗釉牚腩蒡虻牠虽蛎牣牤牮牯牲牳牴牷牸牼绊牿靬犂犄犆犇犉犍犎犒荦犗犛犟犠犨犩犪犮犰狳犴犵犺狁甩狃狆狎狒獾狘狙黠狨狩狫狴狷狺狻豕狈蜘猁猇猈猊猋猓猖獗猗猘狰狞犸猞猟獕猭猱猲猳猷猸猹猺玃獀獃獉獍獏獐獒毙獙獚獜獝獞獠獢獣獧鼇蹊狯猃獬豸狝獯鬻獳犷猕猡玁菟玅玆玈珉糁禛郅玍玎玓瓅玔玕玖玗玘玞玠玡玢玤玥玦珏瑰玭玳瑁玶玷玹玼珂珇珈瑚珌馐馔珔珖珙珛珞珡珣珥珧珩珪佩珶珷珺珽琀琁陨玡琇琖琚琠琤琦琨琫琬琭琮琯琰琱琲琅琴珐珲瑀瑂瑄瑉玮瑑瑔瑗瑢瑭瑱瑲瑳瑽瑾瑿璀璨璁璅璆璈琏璊璐璘璚璝璟璠璡璥瑷璩璪璫璯璲玙璸璺璿瓀璎瓖瓘瓒瓛脐瓞瓠瓤瓧瓩瓮瓰瓱瓴瓸瓻瓼甀甁甃甄甇甋甍甎甏甑甒甓甔瓮甖甗饴蔗甙诧钜粱盎锈团甡褥産甪甬甭甮宁铠甹甽甾甿畀畁畇畈畊畋畎畓畚畛畟鄂畤畦畧荻畯畳畵畷畸畽畾疃叠疋疍疎箪疐疒疕疘疝疢疥疧疳疶疿痁痄痊痌痍痏痐痒痔痗瘢痚痠痡痣痦痩痭痯痱痳痵痻痿瘀痖瘃瘈瘉瘊瘌瘏瘐痪瘕瘖瘙瘚瘛疭瘜瘝瘗瘠瘥瘨瘭瘆瘯瘰疬瘳疠瘵瘸瘺瘘瘼癃痨痫癈癎癐癔癙癜癠疖症癞蟆癪瘿痈発踔绀蔫酵皙砬砒翎翳蔹钨镴皑鹎驹暨粤褶皀皁荚皃镈皈皌皋皒朱皕皖皘皜皝皞皤皦皨皪皫皭糙绽皴皲皻皽盅盋碗盍盚盝踞盦盩秋千盬盭眦睁瞤盯盱眙裰盵盻睐眂眅眈眊県眑眕眚眛眞眢眣眭眳眴眵眹瞓眽郛睃睅睆睊睍睎困睒睖睙睟睠睢睥睪睾睯睽睾眯瞈瞋瞍逛瞏瞕瞖眍䁖瞟瞠瞢瞫瞭瞳瞵瞷瞹瞽阇瞿眬矉矍铄矔矗矙瞩矞矟矠矣矧矬矫矰矱硪碇磙罅舫阡、矼矽礓砃砅砆砉砍砑砕砝砟砠砢砦砧砩砫砮砳艏砵砹砼硇硌硍硎硏硐硒硜硖砗磲茚钡硭硻硾碃碉碏碣碓碔碞碡碪碫碬砀碯碲砜碻礴磈磉磎硙磔磕磖磛磟磠磡磤磥蹭磪磬磴磵磹磻硗礀硚礅礌礐礚礜礞礤礧礮砻礲礵礽礿祂祄祅祆禳祊祍祏祓祔祕祗祘祛祧祫祲祻祼饵脔锢禂禇禋祦禔祎隋禖禘禚禜禝禠祃禢禤禥禨禫祢禴禸秆秈秊闱飒秋秏秕笈蘵赁秠秣秪秫秬秭秷秸稊稌稍稑稗稙稛稞稬秸稲稹稼颡稿穂穄穇穈穉穋稣贮穏穜穟秾穑穣穤穧穨穭穮穵穸窿阒窀窂窅窆窈窕窊窋窌窒窗窔窞窣窬黩蹙
窑窳窴窵窭窸窗竁竃竈竑竜并竦竖篦篾笆鲛竾笉笊笎笏笐靥笓笤箓笪笫笭笮笰笱笲笳笵笸笻筀筅筇筈筎筑筘筠筤筥筦笕筒筭箸筰筱筳筴宴筸箂个箊箎箑箒箘箙箛箜篌箝箠箬镞箯箴箾篁筼筜篘篙篚篛篜篝篟篠篡篢篥篧篨篭篰篲筚篴篶篹篼箦簁簃簆簉簋簌簏簜簟簠簥簦簨簬簰簸簻籊藤籒籓籔签籚篯箨籣籥籧笾簖籫籯芾麴籵籸籹籼粁秕粋粑粔粝粛粞粢粧粨粲粳稗粻粽辟粿糅糆糈糌糍糒糔萼糗蛆蹋糢糨糬粽糯糱籴粜糸糺紃蹼鲣霉纡纨绔纫闽襻紑纰纮锭鸢鹞纴紞紟扎紩紬绂绁纻紽紾绐絁絃絅経絍绗絏缡褵絓絖絘絜绚絣螯絪絫聒絰絵绝絺絻絿綀绡綅绠绨绣綌綍綎捆綖綘継続缎绻綦綪线綮綯绾罟蝽綷縩绺绫緁绲緅緆缁绯緌緎総緑绱緖缃缄缂绵缗緤褓缌纂緪緰缑缈缏缇縁縃縄萦缙缒縏缣縕缞縚缜缟缛縠縡縢縦绦縯縰骋缧縳纤缦絷缥縻衙縿繄缫繈繊繋繐缯繖繘繙繠缋繣繨缰缲繸繻缱纁纆纇缬缵纩纑纕缵纙纚纛缾罃罆坛罋罂罎罏罖罘罛罝罠罣罥罦罨罫罭锾罳罶罹罻罽罿羂羃羇芈蕉５１鸵羑羖羌羜羝羢羣羟羧羭羮羰羱羵羶羸藜鲐翀翃翅翊翌翏翕翛翟翡翣翥翦跹翪翫翚翮翯翱翽翾翿板饕鸹锨耋耇耎耏专耒耜耔耞耡耤耨耩耪耧耰鬓耵聍聃聆聎聝聡聦聱聴聂聼阈聿肄肏肐肕腋肙肜肟肧胛肫肬肭肰肴肵肸肼胊胍胏胑胔胗胙胝胠铨胤胦胩胬胭胯胰胲胴胹胻胼胾脇脘脝脞脡脣脤脥脧脰脲脳腆腊腌臜腍腒腓胨腜腠脶腥腧腬腯踝蹬镣腴腶蠕诽膂腽嗉膇膋膔腘膗膙膟黐膣膦膫膰膴膵膷脍臃臄臇臈臌臐臑臓膘臖臙臛臝臞臧蓐诩臽臾臿舀舁鳑鲏舋舎舔舗馆舝舠舡舢舨舭舲舳舴舸舺艁艄艅艉艋艑艕艖艗艘艚艜艟艣舣艨艩舻艬艭荏艴艳艸艹艻艿芃芄芊萰陂藭芏芔芘芚蕙芟芣芤茉芧芨芩芪芮芰鲢芴芷芸荛豢芼芿苄苒苘苙苜蓿苠苡苣荬苤苎苪镑苶苹苺苻苾茀茁范蠡萣茆茇茈茌茍茖茞茠茢茥茦菰茭茯茳藨茷藘茼荁荄荅荇荈菅蜢鸮荍荑荘豆荵荸荠莆莒莔莕莘莙莚莛莜莝莦莨菪莩莪莭莰莿菀菆菉菎菏菐菑菓菔芲菘菝菡菢菣菥蓂菧菫毂蓥菶菷菹醢菺菻菼菾萅萆苌萋萏萐萑萜萩萱萴莴扁萻葇葍葎葑荭葖葙葠葥苇葧葭药葳葴葶葸葹葽蒄蒎莼茏薹莅蒟蒻蒢蒦蒨蒭藁蒯蒱鉾蒴蒹蒺蒽荪蓁蓆蓇蓊蓌蓍蓏蓓蓖蓧蓪蓫荜跣藕苁蓰蓱莼蓷蓺蓼蔀蔂蔃蔆蔇蔉蔊蔋蔌蔎蔕蔘蔙蒌蔟锷蒋雯茑蔯蔳麻蔵蔸蔾荨蒇蕋蕍荞蕐蕑芸莸蕖蕗蕝蕞蕠蕡蒉蕣蕤蕨蕳蓣蕸蕺蕻薀薁薃薅薆荟薉芗薏薐蔷薖薘剃谔钗薜薠薢薤薧薨薫薬薳薶薷薸薽薾薿藄藇藋荩藐藙藚藟藦藳藴苈藷藾蘀蘁蕲苹蘗蘘蘝蘤蘧蘩蘸蘼虀虆虍蟠虒虓虖虡虣虥虩虬虰蛵蛇虷鳟虺虼蚆蚈蚋蚓蚔蚖蚘蚜蚡蚣蚧蚨蚩蚪蚯蚰蜒蚱蚳蚶蚹蚺蚻蚿蛀蛁蛄蛅蝮蛌蛍蛐蟮蛑蛓蛔蛘蛚蛜蛡蛣蜊蛩蛱蜕螫蜅蚬蜈蝣蜋蜍蜎蜑蠊蜛饯蜞蜣蜨蜩蜮蜱蜷蜺蜾蜿蝀蝃蝋蝌蝍蝎蝏蝗蝘蝙蝝鲼蝡蝤蝥猿蝰虻蝲蝴蝻螃蠏蛳螉螋螒螓螗螘螙螚蟥螟螣螥螬螭䗖螾螀蟀蟅蝈蟊蟋蟑蟓蟛蟜蟟蟢虮蟨蟪蟭蛲蟳蛏蟷蟺蟿蠁蠂蠃虿蠋蛴蠓蚝蠗蠙蠚蠛蠜蠧蟏蠩蜂蠮蠰蠲蠵蠸蠼蠽衁衄衄衇衈衉衋衎衒同衖胡衞裳钩衭衲衵衹衺衿袈裟袗袚袟袢袪袮袲袴袷袺袼褙袽裀裉袅裋夹裍裎裒裛裯裱裲裴裾褀褂褉褊裈褎褐褒褓褔褕袆褚褡褢褦褧褪褫袅褯褰褱裆褛褽褾襁褒襆裥襉襋襌襏襚襛襜裣襞襡襢褴襦襫襬襭襮襕襶襼襽襾覂覃覅霸覉覊覌覗觇覚覜觍觎覧覩觊觏覰観觌觔觕觖觜觽觝觡酲觩觫觭觱觳觯觷觼觾觿言赅讣訇訏訑訒诂讬訧訬訳訹证訾詀詅诋毁詈詊讵詑诒诐詗诎察詨诜詶詸詹詻诙诖誂誃诔锄诓誋诳诶悖誙诮诰誧説読誯谇訚谄谆諆諌诤诹诼諕谂谀諝谝諟喧谥諴諵谌谖誊謆謇歌謍謏謑谡谥謡謦謪谪讴謷謼谩哗譅譆譈譊讹譒撰谮鑫譞噪譩谵譬譱譲谴譸譹谫讅讆詟䜩雠讐谗谶讙谠讟谽豁豉豇岂豊豋豌豏豔豞豖豗豜豝豣豦豨豭豱豳豵豶豷豺豻貅貆狸猊貔貘䝙貜貤餍贳餸贶贲赂賏赊赇赒賝赓赕賨赍斗賮賵賸赚赙赜赟贉赆赑贕赝赬赭赱赳迄趁趂趄趐趑趒趔趡趦趫趮趯趱趴趵趷趹趺趿跁跂跅跆踬跄跐跕跖跗跙跛跦跧跩跫跬跮跱跲跴跺跼跽踅踆踈踉踊踒踖踘踜踟躇蹰踠踡踣踤踥踦踧跷踫踮逾踱踊踶踹踺踼踽躞蹁蹂躏蹎蹐蹓蹔跸蹚蹜蹝迹蹠蹡蹢跶蹧蹩蹪蹯鞠蹽躃躄躅踌跻躐踯跞躘躙躗躝躠蹑躜躧躩躭躰躬躶軃軆辊軏轫軘軜軝腭転軥軨軭軱轱辘軷轵轺軽軿輀輂辇辂辁輈挽輗辄辎辋輠輤輬輭輮辏輴輵輶輹輼辗辒轇轏轑轒辚轕轖轗轘轙轝轞轹轳罪辣辞辵辶辺込辿迅迋迍麿迓迣迤逦迥迨迮迸迺迻迿逄逅逌逍逑逓迳逖逡逭逯逴逶逹遄遅侦遘遛遝遢遨遫遯遰遴绕遹遻邂邅邉邋邎邕邗邘邛邠邢邧邨邯郸邰邲邳邴邶邷邽邾邿郃郄郇郈郔郕郗郙郚郜郝郞郏郠郢郪郫郯郰郲郳郴郷郹郾郿鄀鄄郓鄇鄈鄋鄍鄎鄏鄐鄑邹邬鄕郧鄗鄘鄚鄜鄞鄠鄢鄣鄤鄦鄩鄫鄬鄮鄯鄱郐鄷鄹邝鄻鄾鄿酃酅酆酇郦酊酋酎酏酐酣酔酕醄酖酗酞酡酢酤酩酴酹酺醁醅醆醊醍醐醑醓醖醝酝醡醤醨醪醭醯醰酦醲醴醵醸醹醼醽醾釂酾酽釆釈鲈镏阊钆钇钌钯钋鼢鼹钐钏釪釬釭釱钍釸钕钫鈃钭鈆鈇钚鈊鈌钤钣鈒鈤钬钪鈬铌铈钶铛钹铍钸钿鉄鉆铊铇鉌铋鉏铂钷铆钵鉥钲鉨钼钽鉱鉲鉶铰铒鉼铪銍銎铣銕镂铫铦铑铷銤铱铟銧铥铕铯銭銰焊銶锑锉汞鋂锒鋆鋈鋊铤鋍铗鋐鋑鋕鋘鋙锊锓锔锇铓鋭铖锆锂铽鋳鋹鋺鉴镚钎錀锞锖锫锩錍铔锕錔锱铮锛錞锬锜錤錩錬録铼錼锝钔锴鍉镀鍏鍐铡鍚锻锽锸锲锘鍫鍭鍱鍴锶鍹锗针锺锿镅鎉鎋鎌鎍鎏鎒鎓鎗镉鎚鎞镃鎤铩锼鎭鎯镒镍鎴镓鎸鎹
镎镟鏊镆镠镝鏖铿锵鏚镗镘镛鏠鏦錾镤鏸镪鏻鏽鏾铙鐄鐇鐏铹镦镡鐗馗镫镢镨鐡锎镄鐩镌鐬鐱镭鐶鐻鐽镱鑀鑅镔鑐鑕鑚鑛鑢鑤镥鑪镧鑯鑱鑴鑵镊镢钃镻闫闬闶闳閒闵閗閟阂関合閤哄阆閲阉閺阎阏阍阌暗闉阕阗闑闒闿闘闚阚闟闠闤闼阞阢阤阨阬阯阹阼阽陁陑陔陛陜陡陥陬骘陴険陼陾阴隃隈隒隗隞隠隣隤隩隮隰颧隳隷隹雂雈雉雊雎雑雒雗雘雚雝雟雩雰雱驿霂霅霈霊沾霒霓霙霝霢霣霤霨霩霪霫霮靁叇叆靑靓靣腼靪靮靰靳靷靸靺靼靿鞀鞃鞄鞍鞗鞙鞚鞝鞞鞡鞣鞨鞫鞬鞮鞶鞹鞾鞑韅鞯驮韍韎韔韖韘韝韫韡韣韭韭韱韹韺頀刮頄顸顼頍颀颃颁頖頞頠頫頬颅頯頲颕頼悴顋顑颙颛颜顕顚顜颟顣颥颞飐飑台飓颸飏飖颽颾颿飀飂飚飌翻飡飣饲飥饨饫飮飧飶餀餂饸饹餇餈饽哺馂餖餗餚馄馃餟餠餤餧餩餪餫糊餮糇餲饧馎糕饩馈馊馌馒饇馑馓膳饎饐饘饟馕馘馥馝馡馣骝骡馵馹駃駄駅駆駉駋驽駓驵駗骀驸駜骂骈駪駬骃駴骎駹駽駾騂騄骓騆騉騋骒骐麟騑騒験騕骛騠騢騣騤騧骧騵驺骟騺蓦骖骠骢驆驈骅驌骁驎骣驒驔驖驙驦驩驫骺鲠骫骭肮骱骴骶骷髅骾髁髂髄髆膀髇髑髌髋髙髝髞髟髡髣髧髪髫髭髯髲髳髹髺髽髾鬁鬃鬅鬈鬋鬎鬏鬐鬑鬒鬖鬗鬘鬙鬠鬣斗鬫鬬阄鬯鬰鬲鬵鬷魆魈魊魋魍魉魑魖鳔魛魟魣魦魨魬鲂魵魸鮀鲅鮆鲧鲇鲍鲋鮓鲒鲕鮟鱇鮠鮦鮨鲔鲑鮶鮸鮿鲧鯄鯆鲩鯈鲻鯕鲭鲞鯙鯠鲲鯥鲰鲶鳀鯸鳊鲗䲠鹣鳇鰋鳄鳆鰕鰛鰜鲥鰤鳏鰦鳎鳐鳁鳓鰶鲦鲡鰼鰽鱀鱄鳙鱆鳕鱎鱐鳝鳝鳜鲟鲎鱠鳣鱨鲚鱮鱲鱵鱻鲅鳦凫鳯鳲鳷鳻鴂鴃鴄鸩鴈鴎鸰鴔鴗鸳鸯鸲鹆鸱鴠鴢鸪鴥鸸鹋鴳鸻鴷鴽鵀鵁鸺鹁鵖鵙鹈鹕鹅鵟鵩鹌鵫鵵鵷鵻鹍鶂鶊鶏鶒鹙鶗鶡鶤鶦鶬鶱鹟鶵鶸鶹鹡鶿鹚鷁鷃鷄鷇䴘䴘鷊鷏鹧鷕鹥鸷鷞鷟鸶鹪鹩鷩鷫鷭鹇鹇鸴鷾䴙鸂鸇䴙鸏鸑鸒鸓鸬鹳鸜鹂鹸咸鹾麀麂麃麄麇麋麌麐麑麒麚麛麝麤麸面麫麮麯麰麺麾黁黈黉黢黒黓黕黙黝黟黥黦黧黮黰黱黪黶黹黻黼黾鼋鼂鼃鼅鼈鼍鼏鼐鼒冬鼖鼙鼚鼛鼡鼩鼱鼪鼫鼯鼷鼽齁齆齇齈齉齌赍齑龀齕齗龅齚龇齞龃龉龆齢出齧齩齮齯齰齱齵齾厐龑龒龚龖龘龝龡龢龤'
18 | 
19 | traditional_characters = '制咖片型超聲盤鑒定仔點他命書歌粉巾字帳恤手指記憶棒形轉彎溝光○〇㐄㐅㐆㐌㐖毒㐜㐡㐤㐰㐺㑇㑳㒳㒸㔾㗂㗎㝵㞎㞙㞞㠯㢲㢴㤅㥁㥯㨗㫺㬎㮎㮚㮸㲋㲱㲾㳮㵎㵪㶸㷖㷭㹢㹴犬㺢狓㺵㼝㽮㿝䍃䔢䖟䖸䗈䗥䗪䝓䠶䥯䦉䯝䰾魚䲔䳗䳘䵹鼄䶑一對應映射丁不識下兒子做二休世丘之貉並中台原則串為甚謂乾淨了百事無成八變五十些人得道雞升天代如併來去個國政策勁幽靈在歐洲遊蕩接樣蘿蔔坑側化傳價元論醇共再准刀兩斷切分耕耘收穫錢貨物向看舊就緒險刻千金動勞永逸匙零夜半卡通回復返影蹤反常態口咬氣句話同吐快吹周味呼諾嗚品紅鍋哄而散起唱和問三知生熟團漆黑火糟堆場空塊麵塌糊塗塵染壁廂夔已足多情露水大早到晚夫妻當關萬莫開失古恨套所料既往孔見提師要家主審寸陰難買鬥牛小撮部陣局展身層巴掌帆風順席地帶過年計於春頭載四季期被蛇怕井繩度願式份彈頃深前律徑心意念差愁孤行俱全房廳交遮打技長把抓死拿眼淚鼻涕鑰鎖折段抿拍即合掃排掬揮撥擁上入擊洞擲攬改故轍敗文值名斑方面旁族日秋餐隔雅里終父旦時晌會霎間晃暴寒曝更月望垠際朝夕本正經利杯羹東西板枝獨秀根筋桿進條龍服務概模次函數又性程總付步腳印趨登毛拔呵氧氮碳決雌雄波未平派謊言流清楚白準溜煙潭有獲聞是處降琴鶴甲病發可拾沙目然瞭直以相眨穿睹瞥瞬矢的解石鳥神教秉虔誠秘種窩蜂窮竅笑置筆苟勾銷抹殺煞等獎箍節吃箭仇雙鵰詩籌籮筐系列紙級士官統絲毫掛維網盡線微吭響股腦胎脈承腔臂力致效資源址器舉功投般說講規貿易葉障著慎滿皆輸號木電池衣傾鐘高低視仁覺醒覽遺角銀幣觸潰九鼎蔽抄出駟馬追重語破貧洗貫走路安蹴至幾蹶振躍役膽汗較輩輪辭贊退六連遍遞邊針血錘音錯門思閃真倒項栽霧類保護川先驚乍體鬨鱗爪鳴滴泡鄰域黨專鼓作齊炒丑烯亥克內酯冬加奴卯肝炎基尺梁街褲鎬客寵庭巳汝昌烷玲磊糖肇酉醛啷青縣韙良香骨鯛丂七集河市弦喜嘴張舌堵區工業姊妹星架構巧彩扭歪拼湊餘熱曜武州爺浮屠美鄉老階樹葷素碎落能魄鰓鰻珠丄丅丆万俟丈尚摸母娘量管群亞虎必我堂令申件裝伏位博俠義界表女墟臺戲臭皮匠勝諸葛亮賽頂倍催請運算包立叉戟離疫苗土史志演圍揭瓦曬夷姑婆帝村寶爛尖杉鹼屜桌山岔島由紀峽壩庫鎮廢從德後拗湯治旬食明昧曹朋友框欄極權冪曲歸依貓民氟硼氯磷鐵江侗自旅法司洋浦梅園溫暖灣焦班幸用田略番疊皇炮捶硝苯酸腺苷稜草鏡穗跳遠索錦綱聚氰胺聯店胚膲愛色堇紫羅蘭芝茶飯菱雲蟲藏藩亂叛蘇親債凳學座恐戀柱測肌腹衩錐係貂企烏跪叩軍車農題迭都甘油屯奏鍵短阿姨陪姐隻顧茅廬槽駕魂鮮鹿頁其菜單乘任供勢午齒漢組織吊調瀉唇坡城報墳外夸將尉建築岸崗公床揚新劍昇杭林栗校樓標款汽社浣海商館劇院鋼華港機械廣媒環球融第醫科證券綜財樂育游漲猶嶺疏癮瞼確兵領導繳肢膛船艾瑟爾蒼蔡虞傚衫覆訪訴課諭議軌述野鉤限敵鞋頜頷顎饒首齦站例修凡劃垂屆屬崽頦廚拜挫擺放旋削棋榻檻禮沉注滑營獄畫确儀聘花葬詔員跌轄週達酒錨閘陷陸雨雪飛威丌于丹久乏予理評產亢卑亦乎舞己悲矩圓詞害誌但住佞佳便俗信票案幅翁倦倫假偏倚斜虧鬼敲停備傷脾胃僅此像儉匱免宜穴焉戴兼容許凍伯仲負彼晝皂軒輊實刊划顛衛戰哥比省非好黃飾別拘束掩奶睬選擇搖擾煩苦枚寫協厭及格受歡迎約只估侵犯割狀告或缺抗拒挽撤救藥喻磨滅端倪少逆逾越避靠適吉譽吝玉含延咎歹聽啻淵善謀均勻堪忍夠太惹妙妥妨孕症孝術室完納推冠積宣疑辯慄碴稱屈撓屑干涉衡待很忙惡忿怎麼怠急恥恭息悅惑惜惟想愉愧怍慌憤啟懂懈懷材才緊招認扣抵拉捨也罷插揣冒搭撞南牆擴核支攻敢雷攀敬裡嗎需景智暇曾罪遇朽枉止況競爭辱求癒渝溶濟左右袒困補爽特寂寞示弱找謝畏強疾徐痛癢冤符眠睦瞅董何厚云措活疲羞者輕玻璃祥兆禁移稂莠穩佛換答簡結果盟絕縷途給談否羈翼耐肖脛毋寧興舒若菲萊痕跡窠臼虛衰臉兔撒鷹棺範該詳諱抬泰讓鬚眉象眾貲賬費灰賴奇慮訓輟辨菽麥辛近送透逞徒速續逮捕遂遑違遜斧鉞艱醉鏽隨觀棄顯飽脂肪使丏丐幫丒且慢末丕替桃宗王尊涼爵各圖屋脊糧署錄壇吾祿職胄襲君廈丗北壑桐疹損逢陵鷸丙寅戌氨腈唑綸辰酮脫氫酶醚丞丟現掉紗帽弄扯砲碗丠両丣坐存激肩臻蒂蓮悖序驅丨丩丫挺杈髻鬟細介俄伊犁京尼布訂普渡央委監察檢查劑圈設警隊斯督剩震境航舶革防托播促質版蠑螈鋒研藝歷殘消頻譜精密製造陲郵候埔堅壓壢凹匯執府究邦俘攝寮彬狼嶽肺腫庸英訊診埋粒胞括控碼韓暑槍樞砥澳哇牟壽甸鑽探篇簽綴縫繼耳肯照婦埃懸璧軸櫃檯辣擱淺邪跑纖阮陽私囊魔丮丰姿采丱燒丳丵丶丷丸參寨朗桂瑞砂衷霞貌鳳僕艦因嫌宰峰幹絡牌持旨祭禱簿編罰賓辦丼丿乀乂乃乄仰慕盛曠留考驗闊乆乇么醜麼乊湖燃乑乒乓乕乖僻忤戾离謬迕乗危肥劫除隙浪婿乙炔腸酰吡咯鹽乚乛乜嘢卿玄宮尾狐龜塔嶷兄弟泉章霄釘耙乞扎哀憐恕討乢乣乤乥乧乨乩童乪乫乭乳暈汁液瑤漿牙癌突竇罩腐膠豬酪蛋糕菌瘤乴乵乶乷乸乹乺乼乾俸冰嘉噦嚎坤媽屍壘旱枯涸俐渴潮澀煸豆燥爹瘦癟癬瞪袋脆薑貝隆餾乿亀亁叫咕攘扔搞男砸竄蓬麻亃亄亅卻亇遲典今臨繁累卵奉婚聰躬巨與遷添裂副宿歲怪噁尕崙愣杆硅硫鈦鈾錳芑雜異鈉砷胂磺琥珀艙棍簧胡
茬盜浩盆販郎腿亍洪亐互欠助勉惠操斥諉繫戶譯亓墓碑刑鈴卅渠繽紛斗米旗憲釩燈徽瘟祖拳福穀豐臟腑綁肉醃苓蘊橋鋪霸顏鬧判噴岡底蛙陘礦亖亙亜罕們娜桑那努哈喀弗烈曼松森杜氏盃奧琛敦戊穆聖裔彙薛孫亟亡佚虜羊牢奮釋卷卸契媾感額睫纏誼趾塞擠紐阻還配馳莊亨洛祚亪享津滬畿郊慈菴枇杷膏亭閣鋥麗亳亶亹誅初責翻瘋偶傑叢稠妖拖寰居吸授慧蝸吞壯魅狗矛盾益渣患憂稀描猿夢暫涯畜禍緣沸搜引擎臣橫紜誰混援蒸獸獅稅剖亻亼亽亾什獻剎邡麽仂仃仄仆富怨仈仉畢昔晨殼紹仍仏仒仕宦仗欺恃腰嘆歎炬梓訖施仙后瓊逝仚仝仞仟悔仡佬償填泊拓撲簇羔購頓欽佩髮棻閫馭養億儆尤藉幀賑凌敘帖李柔剛沃眥睚戒訛取饗讀仨仫仮著泳臥躺韶夏裁仳仵唯賢憑釣誕仿似宋彿諷伀碩盼鵝伄儅伈伉儷柯始娃邁戈坦堡帕茨薩廟瑪莉莎藤霍姆伋伍奢胥廷芳豪伎倆侍汛勒希羲雛伐憩整謨閑閒伕伙伴頤伜伝伢叔恆茲恩翰伱伲侶伶俜悧鼬伸懶縮喇叭伹伺伻伽倻輻伾佀佃佇佈喬妮墨佉盧佌貸劣廉昂檔濃矮傘窪緩耗胸谷迷擋率齲宅沫舍療佐貳佑佔優據鏵嘗呢須魯曉佗佘余坪寺瓜銃僧蒙芒陀龕哼嘔坊姦孽弊揖祟繭縛誓賊佝僂瞀佟你奪趕佡佢佣佤佧賈佪佫佯佰佱潔績釀餚佴捲佶佷佸佹佺佻佼佽佾具喚窘壞娛怒慨硬習慣聾膨脹蔓駭貴痺侀侁侂侃侄侅鴻燕侇侈糜靡侉侌妾侏儒倉鼠侐侑侔侖侘侚鏈侜偎傍鈷循柳葫蘆附価侮罵蔑侯岩截蝕侷貼壺嬛宴捷攜桶箋酌俁狹膝狄俅俉俊俏俎俑俓俔諺俚俛黎健呈固墒增守康箱濕祐鏢鑣槓盒靖膜齡俞豹獵噪孚封札筒託衍鴿剪撰稿煉廠禊練繕葺俯瞰撐衝俲俳俴俵俶俷俺俻俾倀倂倅儲卒惶敷猝逃頡蓄崇隱倌倏忽刺蠟燭噍嚼坍扁抽斃蔥楣灌灶糞背藪賣賠閉霉騰倓倔倖倘倜儻倝借箸挹澆閱倡狂倢倣値倥傯倨傲倩匡嗣沖柝珍倬倭寇猩倮倶倷倹勤讚偁偃充偽吏嗓寐惺扮拱芫茜藉虢鈔偈偉晶偌宕距析濾殿疼癱註頗偓偕鴨歇滯偝偟偢忘怡旺偨偩偪偫偭偯偰偱偲偵緝蹄偷減惰漏窺竊偸偺迹傀儡傅傈僳傌籬傎奎琳迪叟芭傒傔傕傖悉荒傜傞傢傣芽逼傭婢傮睨寄檄誦謠頌傴擔辜弓慘蒿悼疤傺傻屄臆巢洩篋羨蓋軋頹傿儸僄僇僉僊働僎僑僔僖僚僝僞僣僤僥僦猴僨僩僬僭僮僯僰僱僵殖籤靜僾僿征隴儁儂儃儇儈朴薄儊儋儌儍儐儓儔儕儗儘儜儞儤儦儩汰哉寡渥裕酷儭儱罐儳儵儹儺儼儽兀臬臲鷲允勛勳宙宵帥憝彞諧嫂鬩暢沛溢盈飢赫兇悍狠猛頑愚妣斬秦遣鞭耀敏榮槃澤爆碟磁禿纜輝霽鹵朵婁孜烽醬勃汀箕裘鉗耶懞蕾徹兌軟遭黜兎児韻媳爸兕觥兗兙兛兜售鍪肚兝兞兟兡兢兣樽殮涅睡稟籍贅泌啡肽奸幕涵澇熵疚眷稃襯訌赴煥椒殲植跏沒試誤猜棲窗肋袖頰兪卦撇鬍岐廓轎疸楓茴瓏廁秩募勺噸寓斤曆畝迫筷釐最淫螺韜兮寬匪篩襄贏軛複兲詐刃堰戎痞蟻餉它冀鑄冂冃円冇冉冊嫁厲礪竭醮冏牧冑冓冔冕冖冗冘冞冢窄抑誣冥冫烘菇蟄冷凝坨橇淇淋炭餅磚磧窖醋雕雹霜冱冶爐艷嘲峻灘淡漠煖颼飲冼冽凃凄愴梗凅凇凈凊凋敝濛凔凜遵汞脢凞几凢処凰凱凵凶焰凸摺刷紋預喪嘍奔巡榜殯芙蓉租籠輯鞘萃凼鋸鑊刁蠻刂娩崩批拆攤掰櫱驟歧顆秒袂贓勿囑忌磋琢膚刈羽刎訟戮舂槳艇刓刖霹靂刜創犢刡恙墅幟筵緻刦刧刨昏默攸尿慾薰潤薰圭刪刮痧鏟刱刲刳刴刵踏磅戳柏槐繡芹莧蝟舟銘鵠鶩刼剁剃辮剄剉履鉛剋剌姻咽哨廊掠桅沿召瞻翅趙卜渺茫郭剒剔剕瀝剚愎毅訥纔剜剝啄採剞剟剡剣剤綵剮腎駛黏剰袍剴紊剷剸剺剽剿劁劂劄劈啪柴扳啦劉奭姥夼昫涓熙禪禹錫翔雁鶚劊劌弩柄蜻蛉劒劓劖劘劙瀾簣賞磯釜晉甜薪逐劦熔紂虐赤囚劬劭労劵効劻劼劾峭艮勅勇勵勍勐臘脖龐漫飼盪粥輒勖勗勘驕餒碌泮雇捐竹騎殊阱勣樸懇謹勦勧勩勯勰勱勲勷勸懲慰誡諫勹芡踐闌匁庇拯粟紮袱裹餃匆遽匈匉匊匋匍匐莖匏匕妝痰膿蛹齋苑烤蹈塘羌熊閥螳螂疆碚竿緯荷茵邙魏匚匜匝匟扶稷匣匭攏匸匹耦匽匾匿卂叮瘡禧軫堤棚迢鈞鍊卄卆遐卉瓷盲瓶噹胱腱裸卋卌卍卐怯污賤鄙齷齪陋卓溪唐梯漁陳棗泥漳潯澗梨芬譙贍轅迦鄭単驢弈洽鰲卛占筮卝卞卟吩啉屎翠厄卣卨卪卬卮榫襖璽綬鈕蚤懼殆篤聳卲帘帙繞卹卼卽厂厎厓厔厖厗奚厘厙厜厝諒厠厤厥厪膩孢厮厰厳厴厹厺粕垢蕪菁厼厾叁悟茸薯叄吵笄悌哺譏坫壟弧芯杠潛嬰芻袁詰貪諜煽饋駁収岳締災賄騙叚叡吻攔蘑蜜訣燧玩硯箏椎藺銅逗驪另覓叨嘮謁杵姓喊嚷囂咚嚀塑尋惱憎擦祇泣滲蝠叱吒咄咤喝籀黛舵舷叵叶鐸懿昭穰苴遼叻叼吁塹嫖賭瞧爬衆抒吅吆夥巹橡滌抱縱摩郡唁墜扇籃膀襪頸吋愾諮酬哭妓媛暗錶韁邇妃羿絮蕃渾拐葵暮隅吔吖啶嗪戚吜嗇噬嚥吟哦詠吠吧唧嗒咐吪雋咀徵燐苞茹鈣哧吮吰吱嘎吲哚吳棟嬌窟孟簫忠晗淞闔閭趼宇吶睛噓拂捧疵熄竽笛糠吼吽呀呂韋矇呃呆笨呇貢呉罄呋喃呎呏呔呠呡癡呣呤呦呧瑛眩扒晬淑姬瑜璇鵑呪呫嗶嚅囁呬呯呰呱呲咧噌鈍呴呶呷呸呺呻哱咻嘯嚕籲坎坷邏呿咁咂咆哮咇咈咋蟹煦珅藹咍咑咒詛咔噠嚓咾噥哩喱咗咠咡咢咣咥咦咨嗟詢咩咪咫嚙齧咭咮咱咲咳嗆嗽咴咷咸咹咺咼喉咿婉慟憫賦矜綠茗藍哂搶瞞哆嗦囉噻啾濱彗哋哌哎唷喲哏哐哞哢哤哪裏哫啼喘哰哲萎蚌哳哶哽哿唄唅唆唈唉唎唏嘩堯棣殤璜睿肅唔睇唕唚唞唣喳唪唬唰喏唲唳唵嘛唶唸唹唻唼唾唿啁啃鸚鵡啅埠棧榷祺舖鞅飆啊啍啎啐啓啕啖啗啜啞祈啢啣啤啥啫啱啲啵啺饑啽噶崑沁喁喂喆裙喈嚨喋喌喎喑喒喓喔粗喙幛慶滋鵲喟喣喤喥喦喧騷喨喩梆喫葡萄喭駝挑嚇
碰樅瓣純皰藻趟鉻喵営喹喺喼喿嗀嗃嗄嗅嗈嗉嗊嗍嗐嗑嗔詬嗕嗖嗙嗛嗜痂癖嗝嗡嗤嗥嗨嗩嗬嗯嗰嗲嗵嘰嗷嗹嗾嗿嘀嘁嘂嘅惋嘈峪禾蔭嘊嘌嘏嘐嘒嘓嘖嘚嘜嘞嘟囔嘣嘥嘦嘧嘬嘭這謔嚴敞饞鬆嘵嘶嘷嘸蝦嘹嘻嘽嘿噀噂噅噇噉噎噏噔噗噘噙噚噝噞噢噤蟬皿噩噫噭噯噱噲噳嚏涌灑欲巫霏噷噼嚃嚄嚆抖嚌嚐嚔囌嚚嚜嚞嚟嚦嚬嚭嚮嚯嚲嚳飭按竣苛嚵嚶囀囅囈膪謙囍囒囓囗囘蕭酚飄濺諦囝溯眸紇鑾鶻囟殉囡団囤囥囧囨囪囫圇囬囮囯囲図囶囷囸囹圄圉擬囻囿圀圂圃圊粹蠹赦圌墾圏滾鯡鑿枘圕圛圜圞坯埂壤骸炕祠窯豚紳魠鯪鱉圧握圩圪垯圬圮圯炸岬幔毯祇窨菩溉圳圴圻圾坂坆沾坋坌舛壈昆墊墩椅坒坓坩堝坭坰坱坳坴坵坻坼楊掙涎簾垃垈垌垍垓垔垕垗垚垛垝垣垞垟垤垧垮垵垺垾垿埀畔埄埆埇埈埌殃隍埏埒埕埗埜埡埤埦埧埭埯埰埲埳埴埵埶紼埸培怖樁礎輔埼埽堀訶姪廡堃堄摧磐貞韌砌堈堉堊堋堌堍堎堖堙堞堠礁堧堨輿堭堮蜓摘堲堳堽堿塁塄塈煤塋棵塍塏塒塓綢塕鴉沽虱塙塚塝繆塡塢塤塥塩塬塱塲蟎塼塽塾塿墀墁墈墉墐夯増毀墝墠墦漬缽墫墬墮墰墺墻櫥壅壆壊壌壎壒榨蒜壔壕壖壙壚壜壝壠壡壬壭壱売壴壹壻壼寢壿夂夅夆変夊夌漱邑夓腕泄甥禦骼夗夘夙袞瑙妊娠醣梟珊鶯鷺戧幻魘夤蹀祕擂鶇姚宛閨嶼庾撻拇賛蛤裨菠氅漓撈湄蚊霆鯊箐篆篷荊肆舅荔鮃巷慚骰辟邱鎔鐮阪漂燴鯢鰈鱷鴇臚鵬妒峨譚枰晏璣癸祝秤竺牡籟恢罡螻蠍賜絨御梭夬夭砣榆怙枕夶夾餡奄崛葩譎奈賀祀贈奌奐奓奕訢詝奘奜奠奡奣陶奨奩魁奫奬奰媧孩貶隸酥宄狡猾她奼嫣妁氈荼皋膻蠅嬪妄妍嫉媚嬈妗趣妚妞妤礙妬婭妯娌妲妳妵妺姁姅姉姍姒姘姙姜姝姞姣姤姧姫姮娥姱姸姺姽婀娀誘懾脅娉婷娑娓娟娣娭娯娵娶娸娼婊婐婕婞婤婥谿孺婧婪婬婹婺婼婽媁媄媊媕媞媟媠媢媬媮媯媲媵媸媺媻媼眯媿嫄嫈嫋嫏嫕嫗嫘嫚嫜嫠嫡嫦嫩嫪毐嫫嫬嫰嫵嫺嫻嫽嫿嬀嬃嬅嬉耍嬋痴豔嬔嬖嬗嬙嬝嬡嬢嬤嬦嬬嬭幼嬲嬴嬸嬹嬾嬿孀孃孅孌孏曰癲屏孑孓雀孖斟簍謎摺孛矻鳩崮軻祜鸞孥邈毓棠臏孬孭孰孱孳孵泛罔銜孻孿宀宁宂拙株薇掣撫琪瓿榴謐彌宊濂祁瑕宍宏碁宓邸讞実潢町宥宧宨宬徵崎駿掖闕臊煮禽蠶宸豫寀寁寥寃簷庶寎暄磣寔寖寘寙寛寠苫寤肘洱濫蒗陝覈寪弘綽螽寳擅疙瘩晷対檐専尃尅贖絀繚疇釁尌峙醌襟痲碧屁昊槌淘恵瀑牝畑莓缸羚覷蔻髒躁尒尓銳尗尙尜尟尢尥尨尪尬尭尰擒尲尶尷尸尹潽蠖蛾尻釦梢蚴鰭脬蹲屇屌蚵屐屓挪屖屘屙屛屝屢屣巒嶂巖舄屧屨屩屪屭屮戍駐鉀崖嵛巔旮旯楂欖櫸芋茱萸靛麓屴屹屺屼岀岊岌岍阜岑彭鞏岒岝岢嵐岣岧岨岫岱岵岷峁峇峋峒峓峞峠嵋峩峯峱峴峹峿崀崁崆禎崋崌崍嶇崐崒崔嵬巍螢顥崚崞崟崠崢巆崤崦崧殂崬崱崳崴崶崿嵂嵇嵊泗嵌嵎嵒嵓嵗嵙嵞嵡嵩嵫嵯嵴嵼嵾嶁嶃嶄晴嶋嶌嶒嶓嶔嶗嶙嶝嶞嶠嶡嶢嶧嶨嶭嶮嶰嶲嶴嶸巂巃巇巉巋巌巓巘巛滇芎巟巠弋迴巣巤炊擘蜥蟒蠱覡巰蜀彥淖杏茂甫楞巻巽幗巿帛斐鯽蕊帑帔帗帚琉汶帟帡帣帨帬帯帰帷帹暆幃幄幇幋幌幏幘幙幚幞幠幡幢幦幨幩幪幬幭幯幰遙蹉跎餘庚鑑幵幷稚邃庀庁広庄庈庉笠庋跋庖犧庠庤庥鯨庬庱庳庴庵馨衢庹庿廃廄廆廋廌廎廏廐廑廒廕廖廛廝搏鑼廞弛袤廥廧廨廩廱綿踵髓廸廹甌鄴廻廼廾廿躔弁皺弇弌弍弎弐弒弔詭憾薦弝弢弣弤弨弭弮弰弳霖繇燾斌旭溥騫弶弸弼弾彀彄彆纍糾彊彔彖彘彟彠陌彤貽彧繪虹彪炳彫蔚鷗彰癉彲彳彴彷彷徉徨彸彽踩斂旆徂徇徊渭畬鉉裼従筌徘徙徜徠膳甦萌漸徬徭醺徯徳徴潘徻徼忀瘁胖燎怦悸顫扉犀澎湃砰恍惚絞隘忉憚挨餓忐忑忒忖応忝忞耿忡忪忭忮忱忸怩忻悠懣怏遏怔怗怚怛怞懟黍訝怫怭懦怱怲怳怵惕怸怹恁恂恇恉恌恏恒恓恔恘恚恛恝恞恟恠恣恧眄恪恫恬澹恰恿悀悁悃悄悆悊悐悒晦悚悛悜悝悤您悩悪悮悰悱悽惻悳悴悵惘悶悻悾惄愫鍾蒐惆惇惌惎惏惓惔惙惛耄惝瘧濁惥惦惪惲惴惷惸拈愀愃愆愈愊愍愐愑愒愓愔愕愙氓蠢騃昵愜赧愨愬愮愯愷愼慁慂慅慆慇靄慉慊慍慝慥慪慫慬慱慳慴慵慷慼焚憀灼鬱憃憊憋憍眺捏軾憒憔憖憙憧憬憨憪憭憮憯憷憸憹憺懃懅懆邀懊懋懌懍懐懞懠懤懥懨懫懮懰懱毖懵遁樑雍懺懽戁戄戇戉戔戕戛戝戞戠戡戢戣戤戥戦戩戭戯轟戱披菊牖戸戹戺戻戼戽鍬扂楔扃扆扈扊杖牽絹銬鐲賚扐摟攪烊盹瞌跟躉鑔靶鼾払扗玫腮扛扞扠扡扢盔押扤扦扱罾揄綏鞍郤窾扻扼扽抃抆抈抉抌抏瞎抔繯縊擻抜抝択抨摔歉躥牾抶抻搐泵菸拃拄拊髀拋拌脯拎拏拑擢秧沓曳攣迂拚拝拠拡拫拭拮踢拴拶拷攢拽掇芥橐簪摹疔挈瓢驥捺蹻挌挍挎挐揀挓挖掘浚挙揍聵挲挶挾挿捂捃捄捅捆捉捋胳膊揎捌捍捎軀蛛捗捘捙捜捥捩捫捭据捱捻捼捽掀掂掄臀膘掊掎掏掐笙掔掗掞棉芍掤搪闡掫掮掯揉掱掲掽掾揃揅揆搓揌諢揕揗揘揜揝揞揠揥揩揪揫櫫遒麈揰揲揵揶揸揹揺搆搉搊搋搌搎搔搕撼櫓搗搘搠搡搢搣搤搥搦搧搨搬楦褳訕赸搯搰搲搳搴搵搷搽搾搿摀摁摂摃摎摑摒摓跤摙摛摜摞摠摦睺羯摭摮摯摰摲摳摴摶摷摻摽撂撃撅稻撊撋撏鐧潑撕撙撚撝撟撢撣撦撧撩撬撱朔撳蚍蜉撾撿擀擄闖擉缶觚擐擕擖擗擡擣擤澡腚擧擨擩擫擭擯擰擷擸擼擽擿攃攄攆攉攥攐攓攖攙攛每攩攫轡澄攮攰攲攴軼攷砭訐攽碘敁敃敇敉敍敎筏敔敕敖閏誨敜煌敧敪敱敹敺敻敿斁衽斄牒縐謅斉斎斕鶉讕駮鱧斒筲斛斝斞斠斡斢斨斫斮晾沂潟穎絳邵斲斸釳於琅斾斿旀旂旃旄渦旌旎旐旒旓旖旛旝旟旡旣浴旰獺魃旴旹旻旼旽昀昃昄昇昉晰躲澈熹皎皓礬
昑昕昜昝昞昡昤暉筍昦昨昰昱昳昴昶昺昻晁蹇隧蔬髦晄晅晒晛晜晞晟晡晢晤晥曦晩萘瑩顗晿暁暋暌暍暐暔暕煅暘暝暠暡曚暦暨暪朦朧暱暲殄馮暵暸暹暻暾曀曄曇曈曌曏曐曖曘曙曛曡曨曩駱曱甴肱曷牘禺錕曽滄耽朁朅朆杪栓誇竟粘絛朊膺朏朐朓朕朘朙瞄覲溘饔飧朠朢朣柵椆澱蝨朩朮朰朱炆璋鈺熾鹮朳槿朶朾朿杅杇杌隉欣釗湛漼楷瀍煜玟纓翱肈舜贄适逵杓杕杗杙荀蘅杝杞脩珓筊杰榔狍閦顰緬莞杲杳眇杴杶杸杻杼枋枌枒枓衾葄翹紓逋枙狸椏枟槁枲枳枴枵枷枸櫞枹枻柁柂柃柅柈柊柎某柑橘柒柘柙柚柜柞櫟柟柢柣柤柩柬柮柰柲橙柶柷柸柺査柿栃栄栒栔栘栝栟栢栩栫栭栱栲栳栴檀栵栻桀驁桁鎂桄桉桋桎梏椹葚桓桔桕桜桟桫欏桭桮桯桲桴桷桹湘溟梃梊梍梐潼梔梘梜梠梡梣梧梩梱梲梳梴梵梹棁棃櫻棐棑棕櫚簑繃蓑棖棘棜棨棩棪棫棬棯棰棱棳棸棹槨棼椀椄苕椈椊椋椌椐椑椓椗検椤椪椰椳椴椵椷椸椽椿楀楄楅篪楋楍楎楗楘楙楛楝楟楠楢楥楨楩楪楫楬楮楯楰楳楸楹楻楽榀榃榊榎槺榕榖榘榛狉莽榜笞榠榡榤榥榦榧榪榭榰榱槤霰榼榾榿槊閂槎槑槔槖様槜槢槥槧槪槭槮槱槲槻槼槾樆樊樏樑樕樗樘樛樟樠樧樨権樲樴樵猢猻樺樻罍樾樿橁橄橆橈笥龠橕橚橛輛橢橤橧豎膈跨橾橿檁檃檇檉檍檎檑檖檗檜檟檠檣檨檫檬檮檳檴檵檸櫂櫆櫌櫛櫜櫝櫡櫧櫨櫪櫬櫳櫹櫺茄櫽欀欂欃欐欑欒欙欞溴欨欬欱欵欶欷歔欸欹欻欼欿歁歃歆艎歈歊蒔蝶歓歕歘歙歛歜歟歠蹦詮鑲蹣跚陞陟歩歮歯歰歳歴璞歺瞑歾歿殀殈殍殑殗殜殙殛殞殢殣殥殪殫殭殰殳荃殷殸殹蛟殻殽謗毆毈毉餵毎毑蕈毗毘毚茛鄧毧毬毳毷毹毽毾毿氂氄氆靴氉氊氌氍氐聊氕氖気氘氙氚氛氜氝氡洶焊痙氤氳氥氦鋁鋅氪烴氬銨痤汪滸漉痘盂碾菖蒲蕹蛭螅氵氷氹氺氽燙氾氿渚汆汊汋汍汎汏汐汔汕褟汙汚汜蘺沼穢衊汧汨汩汭汲汳汴隄汾沄沅沆瀣沇沈葆浸淪湎溺痼痾沌沍沏沐沔沕沘浜畹礫沚沢沬沭沮沰沱灢沴沷籽沺烹濡洄泂肛泅泆湧肓泐泑泒泓泔泖泙泚泜泝泠漩饃濤粼濘蘚鰍泩泫泭泯銖泱泲洇洊涇琵琶荽薊箔洌洎洏洑潄濯洙洚洟洢洣洧洨洩痢滔洫洮洳洴洵洸洹洺洼洿淌蜚浄浉浙贛渫浠浡浤浥淼瀚浬浭翩萍浯浰蜃淀苔蛞蝓蜇螵蛸煲鯉浹浼浽溦涂涊涐涑涒涔滂涖涘涙涪涫涬涮涴涶涷涿淄淅淆淊淒黯淓淙漣淜淝淟淠淢淤淥淦淩猥藿褻淬淮淯淰淳詣淶紡淸淹燉癯綺渇済渉渋渓渕渙渟渢滓渤澥渧渨渮渰渲渶渼湅湉湋湍湑湓湔黔湜湝湞湟湢湣湩湫湮麟湱湲湴湼満溈溍溎溏溛舐漭溠溤溧馴溮溱溲溳溵溷溻溼溽溾滁滃滉滊滎滏稽滕滘滙滝滫滮羼耷滷滹滻煎漈漊漎繹漕漖漘漙漚漜漪漾漥漦漯漰漵漶漷濞潀潁潎潏潕潗潚潝潞潠潦祉瘍潲潵潷潸潺潾潿澁澂澃澉澌澍澐澒澔澙澠澣澦澧澨澫澬澮澰澴澶澼熏郁濆濇濈濉濊貊濔疣濜濠濩觴濬濮盥濰濲濼瀁瀅瀆瀋瀌瀏瀒瀔瀕瀘瀛瀟瀠瀡瀦瀧瀨瀬瀰瀲瀳瀵瀹瀺瀼灃灄灉灋灒灕灖灝灞灠灤灥灨灩灪蜴灮燼獴灴灸灺炁炅魷炗炘炙炤炫疽烙釺炯炰炱炲炴炷燬炻烀烋瘴鯧烓烔焙烜烝烳飪烺焃焄耆焌焐焓焗焜焞焠焢焮焯焱焼煁煃煆煇煊熠煍熬煐煒煕煗燻礆霾煚煝煟煠煢矸煨瑣煬萁煳煺煻熀熅熇熉羆熒穹熗熘熛熜稔諳爍熤熨熯熰眶螞熲熳熸熿燀燁燂燄盞燊燋燏燔隼燖燜燠燡燦燨燮燹燻燽燿爇爊爓爚爝爟爨蟾爯爰爲爻爿爿牀牁牂牄牋牎牏牓牕釉牚腩蒡虻牠雖蠣牣牤牮牯牲牳牴牷牸牼絆牿靬犂犄犆犇犉犍犎犒犖犗犛犟犠犨犩犪犮犰狳犴犵犺狁甩狃狆狎狒獾狘狙黠狨狩狫狴狷狺狻豕狽蜘猁猇猈猊猋猓猖獗猗猘猙獰獁猞猟獕猭猱猲猳猷猸猹猺玃獀獃獉獍獏獐獒獘獙獚獜獝獞獠獢獣獧鼇蹊獪獫獬豸獮獯鬻獳獷獼玀玁菟玅玆玈珉糝禛郅玍玎玓瓅玔玕玖玗玘玞玠玡玢玤玥玦玨瑰玭玳瑁玶玷玹玼珂珇珈瑚珌饈饌珔珖珙珛珞珡珣珥珧珩珪珮珶珷珺珽琀琁隕琊琇琖琚琠琤琦琨琫琬琭琮琯琰琱琲瑯琹琺琿瑀瑂瑄瑉瑋瑑瑔瑗瑢瑭瑱瑲瑳瑽瑾瑿璀璨璁璅璆璈璉璊璐璘璚璝璟璠璡璥璦璩璪璫璯璲璵璸璺璿瓀瓔瓖瓘瓚瓛臍瓞瓠瓤瓧瓩瓮瓰瓱瓴瓸瓻瓼甀甁甃甄甇甋甍甎甏甑甒甓甔甕甖甗飴蔗甙詫鉅粱盎銹糰甡褥産甪甬甭甮甯鎧甹甽甾甿畀畁畇畈畊畋畎畓畚畛畟鄂畤畦畧荻畯畳畵畷畸畽畾疃疉疋疍疎簞疐疒疕疘疝疢疥疧疳疶疿痁痄痊痌痍痏痐痒痔痗瘢痚痠痡痣痦痩痭痯痱痳痵痻痿瘀瘂瘃瘈瘉瘊瘌瘏瘐瘓瘕瘖瘙瘚瘛瘲瘜瘝瘞瘠瘥瘨瘭瘮瘯瘰癧瘳癘瘵瘸瘺瘻瘼癃癆癇癈癎癐癔癙癜癠癤癥癩蟆癪癭癰発踔紺蔫酵皙砬砒翎翳蘞鎢鑞皚鵯駒鱀粵褶皀皁莢皃鎛皈皌皐皒硃皕皖皘皜皝皞皤皦皨皪皫皭糙綻皴皸皻皽盅盋盌盍盚盝踞盦盩鞦韆盬盭眦睜瞤盯盱眙裰盵盻睞眂眅眈眊県眑眕眚眛眞眢眣眭眳眴眵眹瞓眽郛睃睅睆睊睍睎睏睒睖睙睟睠睢睥睪睪睯睽睾瞇瞈瞋瞍逛瞏瞕瞖瞘瞜瞟瞠瞢瞫瞭瞳瞵瞷瞹瞽闍瞿矓矉矍鑠矔矗矙矚矞矟矠矣矧矬矯矰矱硪碇磙罅舫阡、矼矽礓砃砅砆砉砍砑砕砝砟砠砢砦砧砩砫砮砳艏砵砹砼硇硌硍硎硏硐硒硜硤硨磲茚鋇硭硻硾碃碉碏碣碓碔碞碡碪碫碬碭碯碲碸碻礡磈磉磎磑磔磕磖磛磟磠磡磤磥蹭磪磬磴磵磹磻磽礀礄礅礌礐礚礜礞礤礧礮礱礲礵礽礿祂祄祅祆禳祊祍祏祓祔祕祗祘祛祧祫祲祻祼餌臠錮禂禇禋禑禔禕隋禖禘禚禜禝禠禡禢禤禥禨禫禰禴禸稈秈秊闈颯秌秏秕笈蘵賃秠秣秪秫秬秭秷秸稊稌稍稑稗稙稛稞稬稭稲稹稼顙稾穂穄穇穈穉穋穌貯穏穜穟穠穡穣穤穧穨穭穮穵穸窿闃窀窂窅窆窈窕窊窋窌窒窓窔窞窣窬
黷蹙窰窳窴窵窶窸窻竁竃竈竑竜竝竦竪篦篾笆鮫竾笉笊笎笏笐靨笓笤籙笪笫笭笮笰笱笲笳笵笸笻筀筅筇筈筎筑筘筠筤筥筦筧筩筭筯筰筱筳筴讌筸箂箇箊箎箑箒箘箙箛箜篌箝箠箬鏃箯箴箾篁篔簹篘篙篚篛篜篝篟篠篡篢篥篧篨篭篰篲篳篴篶篹篼簀簁簃簆簉簋簌簏簜簟簠簥簦簨簬簰簸簻籊籐籒籓籔籖籚籛籜籣籥籧籩籪籫籯芾麴籵籸籹籼粁粃粋粑粔糲粛粞粢粧粨粲粳粺粻粽闢粿糅糆糈糌糍糒糔萼糗蛆蹋糢糨糬糭糯糱糴糶糸糺紃蹼鰹黴紆紈絝紉閩襻紑紕紘錠鳶鷂紝紞紟紥紩紬紱紲紵紽紾紿絁絃絅経絍絎絏縭褵絓絖絘絜絢絣螯絪絫聒絰絵絶絺絻絿綀綃綅綆綈綉綌綍綎綑綖綘継続緞綣綦綪綫綮綯綰罟蝽綷縩綹綾緁緄緅緆緇緋緌緎総緑緔緖緗緘緙緜緡緤緥緦纂緪緰緱緲緶緹縁縃縄縈縉縋縏縑縕縗縚縝縞縟縠縡縢縦縧縯縰騁縲縳縴縵縶縹縻衙縿繄繅繈繊繋繐繒繖繘繙繠繢繣繨繮繰繸繻繾纁纆纇纈纉纊纑纕纘纙纚纛缾罃罆罈罋罌罎罏罖罘罛罝罠罣罥罦罨罫罭鍰罳罶罹罻罽罿羂羃羇羋蕉５１鴕羑羖羗羜羝羢羣羥羧羭羮羰羱羵羶羸藜鮐翀翃翄翊翌翏翕翛翟翡翣翥翦躚翪翫翬翮翯翺翽翾翿闆饕鴰鍁耋耇耎耏耑耒耜耔耞耡耤耨耩耪耬耰鬢耵聹聃聆聎聝聡聦聱聴聶聼閾聿肄肏肐肕腋肙肜肟肧胛肫肬肭肰肴肵肸肼胊胍胏胑胔胗胙胝胠銓胤胦胩胬胭胯胰胲胴胹胻胼胾脇脘脝脞脡脣脤脥脧脰脲脳腆腊腌臢腍腒腓腖腜腠腡腥腧腬腯踝蹬鐐腴腶蠕誹膂膃膆膇膋膔膕膗膙膟黐膣膦膫膰膴膵膷膾臃臄臇臈臌臐臑臓臕臖臙臛臝臞臧蓐詡臽臾臿舀舁鰟鮍舋舎舔舗舘舝舠舡舢舨舭舲舳舴舸舺艁艄艅艉艋艑艕艖艗艘艚艜艟艣艤艨艩艫艬艭荏艴艶艸艹艻艿芃芄芊萰陂藭芏芔芘芚蕙芟芣芤茉芧芨芩芪芮芰鰱芴芷芸蕘豢芼芿苄苒苘苙苜蓿苠苡苣蕒苤苧苪鎊苶苹苺苻苾茀茁范蠡萣茆茇茈茌茍茖茞茠茢茥茦菰茭茯茳藨茷藘茼荁荄荅荇荈菅蜢鴞荍荑荘荳荵荸薺莆莒莔莕莘莙莚莛莜莝莦莨菪莩莪莭莰莿菀菆菉菎菏菐菑菓菔菕菘菝菡菢菣菥蓂菧菫轂鎣菶菷菹醢菺菻菼菾萅萆萇萋萏萐萑萜萩萱萴萵萹萻葇葍葎葑葒葖葙葠葥葦葧葭葯葳葴葶葸葹葽蒄蒎蒓蘢薹蒞蒟蒻蒢蒦蒨蒭藁蒯蒱鉾蒴蒹蒺蒽蓀蓁蓆蓇蓊蓌蓍蓏蓓蓖蓧蓪蓫蓽跣藕蓯蓰蓱蓴蓷蓺蓼蔀蔂蔃蔆蔇蔉蔊蔋蔌蔎蔕蔘蔙蔞蔟鍔蔣雯蔦蔯蔳蔴蔵蔸蔾蕁蕆蕋蕍蕎蕐蕑蕓蕕蕖蕗蕝蕞蕠蕡蕢蕣蕤蕨蕳蕷蕸蕺蕻薀薁薃薅薆薈薉薌薏薐薔薖薘薙諤釵薜薠薢薤薧薨薫薬薳薶薷薸薽薾薿藄藇藋藎藐藙藚藟藦藳藴藶藷藾蘀蘁蘄蘋蘗蘘蘝蘤蘧蘩蘸蘼虀虆虍蟠虒虓虖虡虣虥虩虯虰蛵虵虷鱒虺虼蚆蚈蚋蚓蚔蚖蚘蚜蚡蚣蚧蚨蚩蚪蚯蚰蜒蚱蚳蚶蚹蚺蚻蚿蛀蛁蛄蛅蝮蛌蛍蛐蟮蛑蛓蛔蛘蛚蛜蛡蛣蜊蛩蛺蛻螫蜅蜆蜈蝣蜋蜍蜎蜑蠊蜛餞蜞蜣蜨蜩蜮蜱蜷蜺蜾蜿蝀蝃蝋蝌蝍蝎蝏蝗蝘蝙蝝鱝蝡蝤蝥蝯蝰蝱蝲蝴蝻螃蠏螄螉螋螒螓螗螘螙螚蟥螟螣螥螬螭螮螾螿蟀蟅蟈蟊蟋蟑蟓蟛蟜蟟蟢蟣蟨蟪蟭蟯蟳蟶蟷蟺蟿蠁蠂蠃蠆蠋蠐蠓蠔蠗蠙蠚蠛蠜蠧蠨蠩蠭蠮蠰蠲蠵蠸蠼蠽衁衂衄衇衈衉衋衎衒衕衖衚衞裳鈎衭衲衵衹衺衿袈裟袗袚袟袢袪袮袲袴袷袺袼褙袽裀裉裊裋裌裍裎裒裛裯裱裲裴裾褀褂褉褊褌褎褐褒褓褔褕褘褚褡褢褦褧褪褫褭褯褰褱襠褸褽褾襁襃襆襇襉襋襌襏襚襛襜襝襞襡襢襤襦襫襬襭襮襴襶襼襽襾覂覃覅覇覉覊覌覗覘覚覜覥覦覧覩覬覯覰観覿觔觕觖觜觽觝觡酲觩觫觭觱觳觶觷觼觾觿言賅訃訇訏訑訒詁託訧訬訳訹証訾詀詅詆譭詈詊詎詑詒詖詗詘詧詨詵詶詸詹詻詼詿誂誃誄鋤誆誋誑誒誖誙誚誥誧説読誯誶誾諂諄諆諌諍諏諑諕諗諛諝諞諟諠諡諴諵諶諼謄謆謇謌謍謏謑謖謚謡謦謪謫謳謷謼謾譁譅譆譈譊譌譒譔譖鑫譞譟譩譫譬譱譲譴譸譹譾讅讆讋讌讎讐讒讖讙讜讟谽豁豉豇豈豊豋豌豏豔豞豖豗豜豝豣豦豨豭豱豳豵豶豷豺豻貅貆貍貎貔貘貙貜貤饜貰餸貺賁賂賏賒賕賙賝賡賧賨賫鬭賮賵賸賺賻賾贇贉贐贔贕贗赬赭赱赳迄趁趂趄趐趑趒趔趡趦趫趮趯趲趴趵趷趹趺趿跁跂跅跆躓蹌跐跕跖跗跙跛跦跧跩跫跬跮跱跲跴跺跼跽踅踆踈踉踊踒踖踘踜踟躇躕踠踡踣踤踥踦踧蹺踫踮踰踱踴踶踹踺踼踽躞蹁蹂躪蹎蹐蹓蹔蹕蹚蹜蹝蹟蹠蹡蹢躂蹧蹩蹪蹯鞠蹽躃躄躅躊躋躐躑躒躘躙躛躝躠躡躦躧躩躭躰躳躶軃軆輥軏軔軘軜軝齶転軥軨軭軱軲轆軷軹軺軽軿輀輂輦輅輇輈輓輗輙輜輞輠輤輬輭輮輳輴輵輶輹輼輾轀轇轏轑轒轔轕轖轗轘轙轝轞轢轤辠辢辤辵辶辺込辿迅迋迍麿迓迣迤邐迥迨迮迸迺迻迿逄逅逌逍逑逓逕逖逡逭逯逴逶逹遄遅遉遘遛遝遢遨遫遯遰遴遶遹遻邂邅邉邋邎邕邗邘邛邠邢邧邨邯鄲邰邲邳邴邶邷邽邾邿郃郄郇郈郔郕郗郙郚郜郝郞郟郠郢郪郫郯郰郲郳郴郷郹郾郿鄀鄄鄆鄇鄈鄋鄍鄎鄏鄐鄑鄒鄔鄕鄖鄗鄘鄚鄜鄞鄠鄢鄣鄤鄦鄩鄫鄬鄮鄯鄱鄶鄷鄹鄺鄻鄾鄿酃酅酆酇酈酊酋酎酏酐酣酔酕醄酖酗酞酡酢酤酩酴酹酺醁醅醆醊醍醐醑醓醖醝醞醡醤醨醪醭醯醰醱醲醴醵醸醹醼醽醾釂釃釅釆釈鱸鎦閶釓釔釕鈀釙鼢鼴釤釧釪釬釭釱釷釸釹鈁鈃鈄鈆鈇鈈鈊鈌鈐鈑鈒鈤鈥鈧鈬鈮鈰鈳鐺鈸鈹鈽鈿鉄鉆鉈鉋鉌鉍鉏鉑鉕鉚鉢鉥鉦鉨鉬鉭鉱鉲鉶鉸鉺鉼鉿銍銎銑銕鏤銚銛銠銣銤銥銦銧銩銪銫銭銰銲銶銻銼銾鋂鋃鋆鋈鋊鋌鋍鋏鋐鋑鋕鋘鋙鋝鋟鋦鋨鋩鋭鋮鋯鋰鋱鋳鋹鋺鋻鏰鐱錀錁錆錇錈錍錏錒錔錙錚錛錞錟錡錤錩錬録錸錼鍀鍆鍇鍉鍍鍏鍐鍘鍚鍛鍠鍤鍥鍩鍫鍭鍱鍴鍶鍹鍺鍼鍾鎄鎇鎉鎋鎌鎍鎏鎒鎓鎗鎘鎚鎞鎡鎤鎩鎪鎭鎯鎰鎳鎴鎵
鎸鎹鎿鏇鏊鏌鏐鏑鏖鏗鏘鏚鏜鏝鏞鏠鏦鏨鏷鏸鏹鏻鏽鏾鐃鐄鐇鐏鐒鐓鐔鐗馗鐙鐝鐠鐡鐦鐨鐩鐫鐬鐱鐳鐶鐻鐽鐿鑀鑅鑌鑐鑕鑚鑛鑢鑤鑥鑪鑭鑯鑱鑴鑵鑷钁钃镻閆閈閌閎閒閔閗閟閡関閤閤閧閬閲閹閺閻閼閽閿闇闉闋闐闑闒闓闘闚闞闟闠闤闥阞阢阤阨阬阯阹阼阽陁陑陔陛陜陡陥陬騭陴険陼陾隂隃隈隒隗隞隠隣隤隩隮隰顴隳隷隹雂雈雉雊雎雑雒雗雘雚雝雟雩雰雱驛霂霅霈霊霑霒霓霙霝霢霣霤霨霩霪霫霮靁靆靉靑靚靣靦靪靮靰靳靷靸靺靼靿鞀鞃鞄鞌鞗鞙鞚鞝鞞鞡鞣鞨鞫鞬鞮鞶鞹鞾韃韅韉馱韍韎韔韖韘韝韞韡韣韭韮韱韹韺頀颳頄頇頊頍頎頏頒頖頞頠頫頬顱頯頲頴頼顇顋顑顒顓顔顕顚顜顢顣顬顳颭颮颱颶颸颺颻颽颾颿飀飂飈飌飜飡飣飤飥飩飫飮飱飶餀餂餄餎餇餈餑餔餕餖餗餚餛餜餟餠餤餧餩餪餫餬餮餱餲餳餺餻餼餽餿饁饅饇饉饊饍饎饐饘饟饢馘馥馝馡馣騮騾馵馹駃駄駅駆駉駋駑駓駔駗駘駙駜駡駢駪駬駰駴駸駹駽駾騂騄騅騆騉騋騍騏驎騑騒験騕騖騠騢騣騤騧驤騵騶騸騺驀驂驃驄驆驈驊驌驍驎驏驒驔驖驙驦驩驫骺鯁骫骭骯骱骴骶骷髏骾髁髂髄髆髈髐髑髕髖髙髝髞髟髡髣髧髪髫髭髯髲髳髹髺髽髾鬁鬃鬅鬈鬋鬎鬏鬐鬑鬒鬖鬗鬘鬙鬠鬣鬪鬫鬬鬮鬯鬰鬲鬵鬷魆魈魊魋魍魎魑魖鰾魛魟魣魦魨魬魴魵魸鮀鮁鮆鮌鮎鮑鮒鮓鮚鮞鮟鱇鮠鮦鮨鮪鮭鮶鮸鮿鯀鯄鯆鯇鯈鯔鯕鯖鯗鯙鯠鯤鯥鯫鯰鯷鯸鯿鰂鰆鶼鰉鰋鰐鰒鰕鰛鰜鰣鰤鰥鰦鰨鰩鰮鰳鰶鰷鱺鰼鰽鱀鱄鱅鱆鱈鱎鱐鱓鱔鱖鱘鱟鱠鱣鱨鱭鱮鱲鱵鱻鲅鳦鳧鳯鳲鳷鳻鴂鴃鴄鴆鴈鴎鴒鴔鴗鴛鴦鴝鵒鴟鴠鴢鴣鴥鴯鶓鴳鴴鴷鴽鵀鵁鵂鵓鵖鵙鵜鶘鵞鵟鵩鵪鵫鵵鵷鵻鵾鶂鶊鶏鶒鶖鶗鶡鶤鶦鶬鶱鶲鶵鶸鶹鶺鶿鷀鷁鷃鷄鷇鷈鷉鷊鷏鷓鷕鷖鷙鷞鷟鷥鷦鷯鷩鷫鷭鷳鷴鷽鷾鷿鸂鸇鸊鸏鸑鸒鸓鸕鸛鸜鸝鹸鹹鹺麀麂麃麄麇麋麌麐麑麒麚麛麝麤麩麪麫麮麯麰麺麾黁黈黌黢黒黓黕黙黝黟黥黦黧黮黰黱黲黶黹黻黼黽黿鼂鼃鼅鼈鼉鼏鼐鼒鼕鼖鼙鼚鼛鼡鼩鼱鼪鼫鼯鼷鼽齁齆齇齈齉齌齎齏齔齕齗齙齚齜齞齟齬齠齢齣齧齩齮齯齰齱齵齾龎龑龒龔龖龘龝龡龢龤'
20 | 
21 | assert len(simplified_charcters) == len(traditional_characters)
22 | 
23 | s2t_dict = {}
24 | t2s_dict = {}
25 | for i, item in enumerate(simplified_charcters):
26 |     s2t_dict[item] = traditional_characters[i]
27 |     t2s_dict[traditional_characters[i]] = item
28 | 
29 | 
30 | def tranditional_to_simplified(text: str) -> str:
31 |     return "".join(
32 |         t2s_dict.get(item, item) for item in text)
33 | 
34 | 
35 | def simplified_to_traditional(text: str) -> str:
36 |     return "".join(
37 |         s2t_dict.get(item, item) for item in text)
38 | 
39 | 
40 | if __name__ == "__main__":
41 |     text = "一般是指存取一個應用程式啟動時始終顯示在網站或網頁瀏覽器中的一個或多個初始網頁等畫面存在的站點"
42 |     print(text)
43 |     text_simple = tranditional_to_simplified(text)
44 |     print(text_simple)
45 |     text_traditional = simplified_to_traditional(text_simple)
46 |     print(text_traditional)
47 | 
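
The conversion above pairs two equal-length strings position by position and looks each character up in a dict. A minimal sketch of the same idea using `str.maketrans`, assuming a hypothetical three-character subset of the tables (the names `simplified`, `traditional`, `to_traditional`, and `to_simplified` are illustrative and not part of this repository):

```python
# Hypothetical three-character subset; the real tables are far longer.
simplified = "个国发"
traditional = "個國發"

# The positional pairing only makes sense if both strings have the same length.
assert len(simplified) == len(traditional)

# str.maketrans pairs the characters of its two arguments by position.
s2t_table = str.maketrans(simplified, traditional)
t2s_table = str.maketrans(traditional, simplified)


def to_traditional(text: str) -> str:
    # Characters absent from the table pass through unchanged.
    return text.translate(s2t_table)


def to_simplified(text: str) -> str:
    return text.translate(t2s_table)
```

As with the dict-based version, characters that are not in the table are left as they are, so mixed Chinese and Latin text can be passed through directly.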


--------------------------------------------------------------------------------
/f5_model/utils.py:
--------------------------------------------------------------------------------
  1 | from __future__ import annotations
  2 | 
  3 | import os
  4 | import re
  5 | import math
  6 | import random
  7 | import string
  8 | from tqdm import tqdm
  9 | from collections import defaultdict
 10 | 
 11 | import matplotlib
 12 | matplotlib.use("Agg")
 13 | import matplotlib.pylab as plt
 14 | 
 15 | import torch
 16 | import torch.nn.functional as F
 17 | from torch.nn.utils.rnn import pad_sequence
 18 | import torchaudio
 19 | 
 20 | import einx
 21 | from einops import rearrange, reduce
 22 | 
 23 | import jieba
 24 | from pypinyin import lazy_pinyin, Style
 25 | import zhconv
 26 | from zhon.hanzi import punctuation
 27 | from jiwer import compute_measures
 28 | 
 29 | from funasr import AutoModel
 30 | from faster_whisper import WhisperModel
 31 | 
 32 | from .ecapa_tdnn import ECAPA_TDNN_SMALL
 33 | from .modules import MelSpec
 34 | 
 35 | 
 36 | # seed everything
 37 | 
 38 | def seed_everything(seed = 0):
 39 |     random.seed(seed)
 40 |     os.environ['PYTHONHASHSEED'] = str(seed)
 41 |     torch.manual_seed(seed)
 42 |     torch.cuda.manual_seed(seed)
 43 |     torch.cuda.manual_seed_all(seed)
 44 |     torch.backends.cudnn.deterministic = True
 45 |     torch.backends.cudnn.benchmark = False
 46 | 
 47 | # helpers
 48 | 
 49 | def exists(v):
 50 |     return v is not None
 51 | 
 52 | def default(v, d):
 53 |     return v if exists(v) else d
 54 | 
 55 | # tensor helpers
 56 | 
 57 | def lens_to_mask(
 58 |     t: int['b'],
 59 |     length: int | None = None
 60 | ) -> bool['b n']:
 61 | 
 62 |     if not exists(length):
 63 |         length = t.amax()
 64 | 
 65 |     seq = torch.arange(length, device = t.device)
 66 |     return einx.less('n, b -> b n', seq, t)
 67 | 
 68 | def mask_from_start_end_indices(
 69 |     seq_len: int['b'],
 70 |     start: int['b'],
 71 |     end: int['b']
 72 | ):
 73 |     max_seq_len = seq_len.max().item()  
 74 |     seq = torch.arange(max_seq_len, device = start.device).long()
 75 |     return einx.greater_equal('n, b -> b n', seq, start) & einx.less('n, b -> b n', seq, end)
 76 | 
 77 | def mask_from_frac_lengths(
 78 |     seq_len: int['b'],
 79 |     frac_lengths: float['b']
 80 | ):
 81 |     lengths = (frac_lengths * seq_len).long()
 82 |     max_start = seq_len - lengths
 83 | 
 84 |     rand = torch.rand_like(frac_lengths)
 85 |     start = (max_start * rand).long().clamp(min = 0)
 86 |     end = start + lengths
 87 | 
 88 |     return mask_from_start_end_indices(seq_len, start, end)
 89 | 
 90 | def maybe_masked_mean(
 91 |     t: float['b n d'],
 92 |     mask: bool['b n'] | None = None
 93 | ) -> float['b d']:
 94 | 
 95 |     if not exists(mask):
 96 |         return t.mean(dim = 1)
 97 | 
 98 |     t = einx.where('b n, b n d, -> b n d', mask, t, 0.)
 99 |     num = reduce(t, 'b n d -> b d', 'sum')
100 |     den = reduce(mask.float(), 'b n -> b', 'sum')
101 | 
102 |     return einx.divide('b d, b -> b d', num, den.clamp(min = 1.))
103 | 
104 | 
105 | # simple utf-8 tokenizer, since paper went character based
106 | def list_str_to_tensor(
107 |     text: list[str],
108 |     padding_value = -1
109 | ) -> int['b nt']:
110 |     list_tensors = [torch.tensor([*bytes(t, 'UTF-8')]) for t in text]  # ByT5 style
111 |     text = pad_sequence(list_tensors, padding_value = padding_value, batch_first = True)
112 |     return text
113 | 
114 | # char tokenizer, based on custom dataset's extracted .txt file
115 | def list_str_to_idx(
116 |     text: list[str] | list[list[str]],
117 |     vocab_char_map: dict[str, int],  # {char: idx}
118 |     padding_value = -1
119 | ) -> int['b nt']:
120 |     list_idx_tensors = [torch.tensor([vocab_char_map.get(c, 0) for c in t]) for t in text]  # pinyin or char style
121 |     text = pad_sequence(list_idx_tensors, padding_value = padding_value, batch_first = True)
122 |     return text
123 | 
124 | 
125 | # Get tokenizer
126 | 
127 | def get_tokenizer(dataset_name, tokenizer: str = "pinyin"):
128 |     '''
129 |     tokenizer   - "pinyin": g2p applied to Chinese characters only; requires a .txt vocab file
130 |                 - "char": character-level tokenizer; requires a .txt vocab file
131 |                 - "byte": UTF-8 byte tokenizer; no vocab file needed
132 |     vocab_size  - for "pinyin": all available pinyin types, common alphabets (including accented letters) and symbols
133 |                 - for "char": derived from unfiltered character and symbol counts of the custom dataset
134 |                 - for "byte": fixed at 256 (the range of a single byte)
135 |     '''
136 |     if tokenizer in ["pinyin", "char"]:
137 |         data_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
138 |         with open (os.path.join(data_dir,f"data/{dataset_name}_{tokenizer}/vocab.txt"), "r", encoding="utf-8") as f:
139 |             vocab_char_map = {}
140 |             for i, char in enumerate(f):
141 |                 vocab_char_map[char.rstrip("\n")] = i  # strip only the newline; a final line without one stays intact
142 |         vocab_size = len(vocab_char_map)
143 |         assert vocab_char_map[" "] == 0, "space must have index 0 in vocab.txt, because index 0 is used for unknown characters"
144 | 
145 |     elif tokenizer == "byte":
146 |         vocab_char_map = None
147 |         vocab_size = 256
148 | 
149 |     return vocab_char_map, vocab_size
150 | 
151 | 
152 | # convert char to pinyin
153 | 
154 | def convert_char_to_pinyin(text_list, polyphone = True):
155 |     final_text_list = []
156 |     god_knows_why_en_testset_contains_zh_quote = str.maketrans({'“': '"', '”': '"', '‘': "'", '’': "'"})  # in case librispeech (orig no-pc) test-clean
157 |     custom_trans = str.maketrans({';': ','})  # add custom trans here, to address oov
158 |     for text in text_list:
159 |         char_list = []
160 |         text = text.translate(god_knows_why_en_testset_contains_zh_quote)
161 |         text = text.translate(custom_trans)
162 |         for seg in jieba.cut(text):
163 |             seg_byte_len = len(bytes(seg, 'UTF-8'))
164 |             if seg_byte_len == len(seg):  # if pure alphabets and symbols
165 |                 if char_list and seg_byte_len > 1 and char_list[-1] not in " :'\"":
166 |                     char_list.append(" ")
167 |                 char_list.extend(seg)
168 |             elif polyphone and seg_byte_len == 3 * len(seg):  # if pure chinese characters
169 |                 seg = lazy_pinyin(seg, style=Style.TONE3, tone_sandhi=True)
170 |                 for c in seg:
171 |                     if c not in "。，、；：？！《》【】—…":
172 |                         char_list.append(" ")
173 |                     char_list.append(c)
174 |             else:  # if mixed chinese characters, alphabets and symbols
175 |                 for c in seg:
176 |                     if ord(c) < 256:
177 |                         char_list.extend(c)
178 |                     else:
179 |                         if c not in "。，、；：？！《》【】—…":
180 |                             char_list.append(" ")
181 |                             char_list.extend(lazy_pinyin(c, style=Style.TONE3, tone_sandhi=True))
182 |                         else:  # if is zh punc
183 |                             char_list.append(c)
184 |         final_text_list.append(char_list)
185 | 
186 |     return final_text_list
187 | 
188 | 
189 | # save spectrogram
190 | def save_spectrogram(spectrogram, path):
191 |     plt.figure(figsize=(12, 4))
192 |     plt.imshow(spectrogram, origin='lower', aspect='auto')
193 |     plt.colorbar()
194 |     plt.savefig(path)
195 |     plt.close()
196 | 
197 | 
198 | # seedtts testset metainfo: utt, prompt_text, prompt_wav, gt_text, gt_wav
199 | def get_seedtts_testset_metainfo(metalst):
200 |     f = open(metalst); lines = f.readlines(); f.close()
201 |     metainfo = []
202 |     for line in lines:
203 |         if len(line.strip().split('|')) == 5:
204 |             utt, prompt_text, prompt_wav, gt_text, gt_wav = line.strip().split('|')
205 |         elif len(line.strip().split('|')) == 4:
206 |             utt, prompt_text, prompt_wav, gt_text = line.strip().split('|')
207 |             gt_wav = os.path.join(os.path.dirname(metalst), "wavs", utt + ".wav")
208 |         if not os.path.isabs(prompt_wav):
209 |             prompt_wav = os.path.join(os.path.dirname(metalst), prompt_wav)
210 |         metainfo.append((utt, prompt_text, prompt_wav, gt_text, gt_wav))
211 |     return metainfo
212 | 
213 | 
214 | # librispeech test-clean metainfo: gen_utt, ref_txt, ref_wav, gen_txt, gen_wav
215 | def get_librispeech_test_clean_metainfo(metalst, librispeech_test_clean_path):
216 |     f = open(metalst); lines = f.readlines(); f.close()
217 |     metainfo = []
218 |     for line in lines:
219 |         ref_utt, ref_dur, ref_txt, gen_utt, gen_dur, gen_txt = line.strip().split('\t')
220 | 
221 |         # ref_txt = ref_txt[0] + ref_txt[1:].lower() + '.'  # if use librispeech test-clean (no-pc)
222 |         ref_spk_id, ref_chaptr_id, _ =  ref_utt.split('-')
223 |         ref_wav = os.path.join(librispeech_test_clean_path, ref_spk_id, ref_chaptr_id, ref_utt + '.flac')
224 | 
225 |         # gen_txt = gen_txt[0] + gen_txt[1:].lower() + '.'  # if use librispeech test-clean (no-pc)
226 |         gen_spk_id, gen_chaptr_id, _ =  gen_utt.split('-')
227 |         gen_wav = os.path.join(librispeech_test_clean_path, gen_spk_id, gen_chaptr_id, gen_utt + '.flac')
228 | 
229 |         metainfo.append((gen_utt, ref_txt, ref_wav, " " + gen_txt, gen_wav))
230 | 
231 |     return metainfo
232 | 
233 | 
234 | # padded to max length mel batch
235 | def padded_mel_batch(ref_mels):
236 |     max_mel_length = torch.LongTensor([mel.shape[-1] for mel in ref_mels]).amax()
237 |     padded_ref_mels = []
238 |     for mel in ref_mels:
239 |         padded_ref_mel = F.pad(mel, (0, max_mel_length - mel.shape[-1]), value = 0)
240 |         padded_ref_mels.append(padded_ref_mel)
241 |     padded_ref_mels = torch.stack(padded_ref_mels)
242 |     padded_ref_mels = rearrange(padded_ref_mels, 'b d n -> b n d')
243 |     return padded_ref_mels
244 | 
245 | 
246 | # get prompts from metainfo containing: utt, prompt_text, prompt_wav, gt_text, gt_wav
247 | 
248 | def get_inference_prompt(
249 |     metainfo, 
250 |     speed = 1., tokenizer = "pinyin", polyphone = True, 
251 |     target_sample_rate = 24000, n_mel_channels = 100, hop_length = 256, target_rms = 0.1,
252 |     use_truth_duration = False,
253 |     infer_batch_size = 1, num_buckets = 200, min_secs = 3, max_secs = 40,
254 | ):
255 |     prompts_all = []
256 | 
257 |     min_tokens = min_secs * target_sample_rate // hop_length
258 |     max_tokens = max_secs * target_sample_rate // hop_length
259 | 
260 |     batch_accum = [0] * num_buckets
261 |     utts, ref_rms_list, ref_mels, ref_mel_lens, total_mel_lens, final_text_list = \
262 |         ([[] for _ in range(num_buckets)] for _ in range(6))
263 | 
264 |     mel_spectrogram = MelSpec(target_sample_rate=target_sample_rate, n_mel_channels=n_mel_channels, hop_length=hop_length)
265 | 
266 |     for utt, prompt_text, prompt_wav, gt_text, gt_wav in tqdm(metainfo, desc="Processing prompts..."):
267 | 
268 |         # Audio
269 |         ref_audio, ref_sr = torchaudio.load(prompt_wav)
270 |         ref_rms = torch.sqrt(torch.mean(torch.square(ref_audio)))
271 |         if ref_rms < target_rms:
272 |             ref_audio = ref_audio * target_rms / ref_rms
273 |         assert ref_audio.shape[-1] > 5000, f"Empty prompt wav: {prompt_wav}, or torchaudio backend issue."
274 |         if ref_sr != target_sample_rate:
275 |             resampler = torchaudio.transforms.Resample(ref_sr, target_sample_rate)
276 |             ref_audio = resampler(ref_audio)
277 | 
278 |         # Text
279 |         if len(prompt_text[-1].encode('utf-8')) == 1:
280 |             prompt_text = prompt_text + " "
281 |         text = [prompt_text + gt_text]
282 |         if tokenizer == "pinyin":
283 |             text_list = convert_char_to_pinyin(text, polyphone = polyphone)
284 |         else:
285 |             text_list = text
286 | 
287 |         # Duration, mel frame length
288 |         ref_mel_len = ref_audio.shape[-1] // hop_length
289 |         if use_truth_duration:
290 |             gt_audio, gt_sr = torchaudio.load(gt_wav)
291 |             if gt_sr != target_sample_rate:
292 |                 resampler = torchaudio.transforms.Resample(gt_sr, target_sample_rate)
293 |                 gt_audio = resampler(gt_audio)
294 |             total_mel_len = ref_mel_len + int(gt_audio.shape[-1] / hop_length / speed)
295 | 
296 |             # # test vocoder resynthesis
297 |             # ref_audio = gt_audio
298 |         else:
299 |             zh_pause_punc = r"[。，、；：？！]"  # character class, so each pause mark is matched on its own
300 |             ref_text_len = len(prompt_text.encode('utf-8')) + 3 * len(re.findall(zh_pause_punc, prompt_text))
301 |             gen_text_len = len(gt_text.encode('utf-8')) + 3 * len(re.findall(zh_pause_punc, gt_text))
302 |             total_mel_len = ref_mel_len + int(ref_mel_len / ref_text_len * gen_text_len / speed)
303 | 
304 |         # to mel spectrogram
305 |         ref_mel = mel_spectrogram(ref_audio)
306 |         ref_mel = rearrange(ref_mel, '1 d n -> d n')
307 | 
308 |         # deal with batch
309 |         assert infer_batch_size > 0, "infer_batch_size should be greater than 0."
310 |         assert min_tokens <= total_mel_len <= max_tokens, \
311 |             f"Audio {utt} has duration {total_mel_len*hop_length//target_sample_rate}s out of range [{min_secs}, {max_secs}]."
312 |         bucket_i = math.floor((total_mel_len - min_tokens) / (max_tokens - min_tokens + 1) * num_buckets)
313 | 
314 |         utts[bucket_i].append(utt)
315 |         ref_rms_list[bucket_i].append(ref_rms)
316 |         ref_mels[bucket_i].append(ref_mel)
317 |         ref_mel_lens[bucket_i].append(ref_mel_len)
318 |         total_mel_lens[bucket_i].append(total_mel_len)
319 |         final_text_list[bucket_i].extend(text_list)
320 | 
321 |         batch_accum[bucket_i] += total_mel_len
322 | 
323 |         if batch_accum[bucket_i] >= infer_batch_size:
324 |             # print(f"\n{len(ref_mels[bucket_i][0][0])}\n{ref_mel_lens[bucket_i]}\n{total_mel_lens[bucket_i]}")
325 |             prompts_all.append((
326 |                 utts[bucket_i], 
327 |                 ref_rms_list[bucket_i], 
328 |                 padded_mel_batch(ref_mels[bucket_i]), 
329 |                 ref_mel_lens[bucket_i], 
330 |                 total_mel_lens[bucket_i], 
331 |                 final_text_list[bucket_i]
332 |             ))
333 |             batch_accum[bucket_i] = 0
334 |             utts[bucket_i], ref_rms_list[bucket_i], ref_mels[bucket_i], ref_mel_lens[bucket_i], total_mel_lens[bucket_i], final_text_list[bucket_i] = [], [], [], [], [], []
335 | 
336 |     # add residual
337 |     for bucket_i, bucket_frames in enumerate(batch_accum):
338 |         if bucket_frames > 0:
339 |             prompts_all.append((
340 |                 utts[bucket_i], 
341 |                 ref_rms_list[bucket_i], 
342 |                 padded_mel_batch(ref_mels[bucket_i]), 
343 |                 ref_mel_lens[bucket_i], 
344 |                 total_mel_lens[bucket_i], 
345 |                 final_text_list[bucket_i]
346 |             ))
347 |     # shuffle so that the last workers are not left with only the easy batches
348 |     random.seed(666)
349 |     random.shuffle(prompts_all)
350 | 
351 |     return prompts_all
352 | 
353 | 
354 | # build (gen_wav, prompt_wav, gt_text) tuples from a seed-tts test meta list
355 | # https://github.com/BytedanceSpeech/seed-tts-eval
356 | 
357 | def get_seed_tts_test(metalst, gen_wav_dir, gpus):
358 |     # read the meta list; the context manager closes the file even if reading fails
359 |     with open(metalst) as f:
360 |         lines = f.readlines()
361 | 
362 |     test_set_ = []
363 |     for line in tqdm(lines):
364 |         fields = line.strip().split('|')
365 |         if len(fields) not in (4, 5):
366 |             continue  # skip malformed lines rather than reuse values from a previous line
367 |         utt, prompt_text, prompt_wav, gt_text = fields[:4]  # a fifth field (gt_wav) is optional and unused here
368 | 
369 |         gen_wav = os.path.join(gen_wav_dir, utt + '.wav')
370 |         if not os.path.exists(gen_wav):
371 |             continue
372 |         if not os.path.isabs(prompt_wav):
373 |             prompt_wav = os.path.join(os.path.dirname(metalst), prompt_wav)
374 | 
375 |         test_set_.append((gen_wav, prompt_wav, gt_text))
376 | 
377 |     num_jobs = len(gpus)
378 |     if num_jobs == 1:
379 |         return [(gpus[0], test_set_)]
380 |     
381 |     wav_per_job = len(test_set_) // num_jobs + 1
382 |     test_set = []
383 |     for i in range(num_jobs):
384 |         test_set.append((gpus[i], test_set_[i*wav_per_job:(i+1)*wav_per_job]))
385 | 
386 |     return test_set
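
The per-GPU split used at the end of the function above can be looked at on its own. The sketch below mirrors that slicing. The name `split_across_gpus` is hypothetical and introduced only for illustration.

```python
def split_across_gpus(items, gpus):
    # Mirrors the sharding above: one contiguous slice of items per GPU id.
    num_jobs = len(gpus)
    if num_jobs == 1:
        return [(gpus[0], items)]
    # "// num_jobs + 1" rounds the shard size up, so trailing shards can be
    # shorter than the others, or even empty when there are few items.
    per_job = len(items) // num_jobs + 1
    return [(gpus[i], items[i * per_job:(i + 1) * per_job]) for i in range(num_jobs)]
```

For ten items on GPUs `[0, 1, 2]` the shard sizes are 4, 4, and 2. For four items on four GPUs the shard size is 2, so the last two GPUs receive empty lists, which a caller may want to filter out before launching workers.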
387 | 
388 | 
389 | # get librispeech test-clean cross sentence test
390 | 
391 | def get_librispeech_test(metalst, gen_wav_dir, gpus, librispeech_test_clean_path, eval_ground_truth = False):
392 |     # read the meta list; the context manager closes the file even if reading fails
393 |     with open(metalst) as f:
394 |         lines = f.readlines()
395 | 
396 |     test_set_ = []
397 |     for line in tqdm(lines):
398 |         ref_utt, ref_dur, ref_txt, gen_utt, gen_dur, gen_txt = line.strip().split('\t')
399 | 
400 |         if eval_ground_truth:
401 |             gen_spk_id, gen_chaptr_id, _ =  gen_utt.split('-')
402 |             gen_wav = os.path.join(librispeech_test_clean_path, gen_spk_id, gen_chaptr_id, gen_utt + '.flac')
403 |         else:
404 |             gen_wav = os.path.join(gen_wav_dir, gen_utt + '.wav')
405 |             if not os.path.exists(gen_wav):
406 |                 raise FileNotFoundError(f"Generated wav not found: {gen_utt}")
407 | 
408 |         ref_spk_id, ref_chaptr_id, _ =  ref_utt.split('-')
409 |         ref_wav = os.path.join(librispeech_test_clean_path, ref_spk_id, ref_chaptr_id, ref_utt + '.flac')
410 | 
411 |         test_set_.append((gen_wav, ref_wav, gen_txt))
412 | 
413 |     num_jobs = len(gpus)
414 |     if num_jobs == 1:
415 |         return [(gpus[0], test_set_)]
416 |     
417 |     wav_per_job = len(test_set_) // num_jobs + 1
418 |     test_set = []
419 |     for i in range(num_jobs):
420 |         test_set.append((gpus[i], test_set_[i*wav_per_job:(i+1)*wav_per_job]))
421 | 
422 |     return test_set
423 | 
424 | 
425 | # load asr model
426 | 
427 | def load_asr_model(lang, ckpt_dir = ""):
428 |     if lang == "zh":
429 |         model = AutoModel(
430 |             model = os.path.join(ckpt_dir, "paraformer-zh"), 
431 |             # vad_model = os.path.join(ckpt_dir, "fsmn-vad"), 
432 |             # punc_model = os.path.join(ckpt_dir, "ct-punc"),
433 |             # spk_model = os.path.join(ckpt_dir, "cam++"), 
434 |             disable_update=True,
435 |             )  # following seed-tts setting
436 |     elif lang == "en":
437 |         model_size = "large-v3" if ckpt_dir == "" else ckpt_dir
438 |         model = WhisperModel(model_size, device="cuda", compute_type="float16")
439 |     return model
440 | 
441 | 
442 | # WER Evaluation, the way Seed-TTS does
443 | 
444 | def run_asr_wer(args):
445 |     rank, lang, test_set, ckpt_dir = args
446 | 
447 |     if lang == "zh":
448 |         torch.cuda.set_device(rank)
449 |     elif lang == "en":
450 |         os.environ["CUDA_VISIBLE_DEVICES"] = str(rank)
451 |     else:
452 |         raise NotImplementedError("lang support only 'zh' (funasr paraformer-zh), 'en' (faster-whisper-large-v3), for now.")
453 | 
454 |     asr_model = load_asr_model(lang, ckpt_dir = ckpt_dir)
455 | 
456 |     punctuation_all = punctuation + string.punctuation
457 |     wers = []
458 | 
459 |     for gen_wav, prompt_wav, truth in tqdm(test_set):
460 |         if lang == "zh":
461 |             res = asr_model.generate(input=gen_wav, batch_size_s=300, disable_pbar=True)
462 |             hypo = res[0]["text"]
463 |             hypo = zhconv.convert(hypo, 'zh-cn')
464 |         elif lang == "en":
465 |             segments, _ = asr_model.transcribe(gen_wav, beam_size=5, language="en")
466 |             hypo = ''
467 |             for segment in segments:
468 |                 hypo = hypo + ' ' + segment.text
469 | 
470 |         # raw_truth = truth
471 |         # raw_hypo = hypo
472 | 
473 |         for x in punctuation_all:
474 |             truth = truth.replace(x, '')
475 |             hypo = hypo.replace(x, '')
476 | 
477 |         truth = truth.replace('  ', ' ')
478 |         hypo = hypo.replace('  ', ' ')
479 | 
480 |         if lang == "zh":
481 |             truth = " ".join(truth)  # character-level tokens for Chinese
482 |             hypo = " ".join(hypo)
483 |         elif lang == "en":
484 |             truth = truth.lower()
485 |             hypo = hypo.lower()
486 | 
487 |         measures = compute_measures(truth, hypo)
488 |         wer = measures["wer"]
489 | 
490 |         # ref_list = truth.split(" ")
491 |         # subs = measures["substitutions"] / len(ref_list)
492 |         # dele = measures["deletions"] / len(ref_list)
493 |         # inse = measures["insertions"] / len(ref_list)
494 | 
495 |         wers.append(wer)
496 | 
497 |     return wers
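
The text normalization applied to both reference and hypothesis before scoring can be tried without loading an ASR model. The sketch below reproduces only that step. The source combines an imported `punctuation` set with `string.punctuation`; the `CJK_PUNCT` subset here is an assumption standing in for that import, and `normalize_for_wer` is a hypothetical helper name.

```python
import string

# Illustrative subset of full-width punctuation (an assumption; the source
# uses an imported `punctuation` constant whose origin is not shown here).
CJK_PUNCT = "，。！？、；：《》「」【】"


def normalize_for_wer(text, lang):
    # Strip punctuation, collapse doubled spaces once, then either split
    # Chinese into space-separated characters or lowercase English.
    for ch in CJK_PUNCT + string.punctuation:
        text = text.replace(ch, '')
    text = text.replace('  ', ' ')
    if lang == "zh":
        return " ".join(text)
    return text.lower()
```

Because Chinese text is split into single characters, the "word" error rate computed afterwards is effectively a character error rate for `zh`. Note also that apostrophes are part of `string.punctuation`, so English contractions lose them before comparison.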
498 | 
499 | 
500 | # SIM Evaluation
501 | 
502 | def run_sim(args):
503 |     rank, test_set, ckpt_dir = args
504 |     device = f"cuda:{rank}"
505 | 
506 |     model = ECAPA_TDNN_SMALL(feat_dim=1024, feat_type='wavlm_large', config_path=None)
507 |     state_dict = torch.load(ckpt_dir, map_location=lambda storage, loc: storage)
508 |     model.load_state_dict(state_dict['model'], strict=False)
509 | 
510 |     use_gpu = torch.cuda.is_available()
511 |     if use_gpu:
512 |         model = model.cuda(device)
513 |     model.eval()
514 | 
515 |     sim_list = []
516 |     for wav1, wav2, truth in tqdm(test_set):
517 | 
518 |         wav1, sr1 = torchaudio.load(wav1)
519 |         wav2, sr2 = torchaudio.load(wav2)
520 | 
521 |         resample1 = torchaudio.transforms.Resample(orig_freq=sr1, new_freq=16000)
522 |         resample2 = torchaudio.transforms.Resample(orig_freq=sr2, new_freq=16000)
523 |         wav1 = resample1(wav1)
524 |         wav2 = resample2(wav2)
525 | 
526 |         if use_gpu:
527 |             wav1 = wav1.cuda(device)
528 |             wav2 = wav2.cuda(device)
529 |         with torch.no_grad():
530 |             emb1 = model(wav1)
531 |             emb2 = model(wav2)
532 |         
533 |         sim = F.cosine_similarity(emb1, emb2)[0].item()
534 |         # print(f"VSim score between two audios: {sim:.4f} (-1.0, 1.0).")
535 |         sim_list.append(sim)
536 |     
537 |     return sim_list
538 | 
539 | 
540 | # filter func for dirty data with many repetitions
541 | 
542 | def repetition_found(text, length = 2, tolerance = 10):
543 |     pattern_count = defaultdict(int)
544 |     for i in range(len(text) - length + 1):
545 |         pattern = text[i:i + length]
546 |         pattern_count[pattern] += 1
547 |     for pattern, count in pattern_count.items():
548 |         if count > tolerance:
549 |             return True
550 |     return False
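
A quick way to see what the repetition filter above reacts to is to run an equivalent on a few strings. This sketch re-expresses the same sliding-window count with `collections.Counter`; `has_repetition` is a hypothetical name used so the example stays self-contained.

```python
from collections import Counter


def has_repetition(text, length=2, tolerance=10):
    # Count every substring of the given length and report whether any of
    # them occurs more than `tolerance` times (same rule as the filter above).
    counts = Counter(text[i:i + length] for i in range(len(text) - length + 1))
    return any(count > tolerance for count in counts.values())


if __name__ == "__main__":
    print(has_repetition("ab" * 11))          # "ab" occurs 11 times
    print(has_repetition("ab" * 10))          # "ab" occurs exactly 10 times
    print(has_repetition("abcabcabc", 3, 2))  # "abc" occurs 3 times
```

The comparison is strict, so a pattern that appears exactly `tolerance` times does not trigger the filter.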
551 | 
552 | 
553 | # load model checkpoint for inference
554 | 
555 | def load_checkpoint(model, ckpt_path, device, use_ema = True):
556 |     from ema_pytorch import EMA
557 | 
558 |     ckpt_type = ckpt_path.split(".")[-1]
559 |     if ckpt_type == "safetensors":
560 |         from safetensors.torch import load_file
561 |         checkpoint = load_file(ckpt_path, device=device)
562 |     else:
563 |         checkpoint = torch.load(ckpt_path, map_location=device)
564 | 
565 |     if use_ema:
566 |         ema_model = EMA(model, include_online_model = False).to(device)
567 |         if ckpt_type == "safetensors":
568 |             ema_model.load_state_dict(checkpoint)
569 |         else:
570 |             ema_model.load_state_dict(checkpoint['ema_model_state_dict'])
571 |         ema_model.copy_params_from_ema_to_model()
572 |     else:
573 |         model.load_state_dict(checkpoint['model_state_dict'])
574 |         
575 |     return model


--------------------------------------------------------------------------------
/data/Emilia_ZH_EN_pinyin/vocab.txt:
--------------------------------------------------------------------------------
   1 |  
   2 | !
   3 | "
   4 | #
   5 | $
   6 | %
   7 | &
   8 | '
   9 | (
  10 | )
  11 | *
  12 | +
  13 | ,
  14 | -
  15 | .
  16 | /
  17 | 0
  18 | 1
  19 | 2
  20 | 3
  21 | 4
  22 | 5
  23 | 6
  24 | 7
  25 | 8
  26 | 9
  27 | :
  28 | ;
  29 | =
  30 | >
  31 | ?
  32 | @
  33 | A
  34 | B
  35 | C
  36 | D
  37 | E
  38 | F
  39 | G
  40 | H
  41 | I
  42 | J
  43 | K
  44 | L
  45 | M
  46 | N
  47 | O
  48 | P
  49 | Q
  50 | R
  51 | S
  52 | T
  53 | U
  54 | V
  55 | W
  56 | X
  57 | Y
  58 | Z
  59 | [
  60 | \
  61 | ]
  62 | _
  63 | a
  64 | a1
  65 | ai1
  66 | ai2
  67 | ai3
  68 | ai4
  69 | an1
  70 | an3
  71 | an4
  72 | ang1
  73 | ang2
  74 | ang4
  75 | ao1
  76 | ao2
  77 | ao3
  78 | ao4
  79 | b
  80 | ba
  81 | ba1
  82 | ba2
  83 | ba3
  84 | ba4
  85 | bai1
  86 | bai2
  87 | bai3
  88 | bai4
  89 | ban1
  90 | ban2
  91 | ban3
  92 | ban4
  93 | bang1
  94 | bang2
  95 | bang3
  96 | bang4
  97 | bao1
  98 | bao2
  99 | bao3
 100 | bao4
 101 | bei
 102 | bei1
 103 | bei2
 104 | bei3
 105 | bei4
 106 | ben1
 107 | ben2
 108 | ben3
 109 | ben4
 110 | beng
 111 | beng1
 112 | beng2
 113 | beng3
 114 | beng4
 115 | bi1
 116 | bi2
 117 | bi3
 118 | bi4
 119 | bian1
 120 | bian2
 121 | bian3
 122 | bian4
 123 | biao1
 124 | biao2
 125 | biao3
 126 | bie1
 127 | bie2
 128 | bie3
 129 | bie4
 130 | bin1
 131 | bin4
 132 | bing1
 133 | bing2
 134 | bing3
 135 | bing4
 136 | bo
 137 | bo1
 138 | bo2
 139 | bo3
 140 | bo4
 141 | bu2
 142 | bu3
 143 | bu4
 144 | c
 145 | ca1
 146 | cai1
 147 | cai2
 148 | cai3
 149 | cai4
 150 | can1
 151 | can2
 152 | can3
 153 | can4
 154 | cang1
 155 | cang2
 156 | cao1
 157 | cao2
 158 | cao3
 159 | ce4
 160 | cen1
 161 | cen2
 162 | ceng1
 163 | ceng2
 164 | ceng4
 165 | cha1
 166 | cha2
 167 | cha3
 168 | cha4
 169 | chai1
 170 | chai2
 171 | chan1
 172 | chan2
 173 | chan3
 174 | chan4
 175 | chang1
 176 | chang2
 177 | chang3
 178 | chang4
 179 | chao1
 180 | chao2
 181 | chao3
 182 | che1
 183 | che2
 184 | che3
 185 | che4
 186 | chen1
 187 | chen2
 188 | chen3
 189 | chen4
 190 | cheng1
 191 | cheng2
 192 | cheng3
 193 | cheng4
 194 | chi1
 195 | chi2
 196 | chi3
 197 | chi4
 198 | chong1
 199 | chong2
 200 | chong3
 201 | chong4
 202 | chou1
 203 | chou2
 204 | chou3
 205 | chou4
 206 | chu1
 207 | chu2
 208 | chu3
 209 | chu4
 210 | chua1
 211 | chuai1
 212 | chuai2
 213 | chuai3
 214 | chuai4
 215 | chuan1
 216 | chuan2
 217 | chuan3
 218 | chuan4
 219 | chuang1
 220 | chuang2
 221 | chuang3
 222 | chuang4
 223 | chui1
 224 | chui2
 225 | chun1
 226 | chun2
 227 | chun3
 228 | chuo1
 229 | chuo4
 230 | ci1
 231 | ci2
 232 | ci3
 233 | ci4
 234 | cong1
 235 | cong2
 236 | cou4
 237 | cu1
 238 | cu4
 239 | cuan1
 240 | cuan2
 241 | cuan4
 242 | cui1
 243 | cui3
 244 | cui4
 245 | cun1
 246 | cun2
 247 | cun4
 248 | cuo1
 249 | cuo2
 250 | cuo4
 251 | d
 252 | da
 253 | da1
 254 | da2
 255 | da3
 256 | da4
 257 | dai1
 258 | dai2
 259 | dai3
 260 | dai4
 261 | dan1
 262 | dan2
 263 | dan3
 264 | dan4
 265 | dang1
 266 | dang2
 267 | dang3
 268 | dang4
 269 | dao1
 270 | dao2
 271 | dao3
 272 | dao4
 273 | de
 274 | de1
 275 | de2
 276 | dei3
 277 | den4
 278 | deng1
 279 | deng2
 280 | deng3
 281 | deng4
 282 | di1
 283 | di2
 284 | di3
 285 | di4
 286 | dia3
 287 | dian1
 288 | dian2
 289 | dian3
 290 | dian4
 291 | diao1
 292 | diao3
 293 | diao4
 294 | die1
 295 | die2
 296 | die4
 297 | ding1
 298 | ding2
 299 | ding3
 300 | ding4
 301 | diu1
 302 | dong1
 303 | dong3
 304 | dong4
 305 | dou1
 306 | dou2
 307 | dou3
 308 | dou4
 309 | du1
 310 | du2
 311 | du3
 312 | du4
 313 | duan1
 314 | duan2
 315 | duan3
 316 | duan4
 317 | dui1
 318 | dui4
 319 | dun1
 320 | dun3
 321 | dun4
 322 | duo1
 323 | duo2
 324 | duo3
 325 | duo4
 326 | e
 327 | e1
 328 | e2
 329 | e3
 330 | e4
 331 | ei2
 332 | en1
 333 | en4
 334 | er
 335 | er2
 336 | er3
 337 | er4
 338 | f
 339 | fa1
 340 | fa2
 341 | fa3
 342 | fa4
 343 | fan1
 344 | fan2
 345 | fan3
 346 | fan4
 347 | fang1
 348 | fang2
 349 | fang3
 350 | fang4
 351 | fei1
 352 | fei2
 353 | fei3
 354 | fei4
 355 | fen1
 356 | fen2
 357 | fen3
 358 | fen4
 359 | feng1
 360 | feng2
 361 | feng3
 362 | feng4
 363 | fo2
 364 | fou2
 365 | fou3
 366 | fu1
 367 | fu2
 368 | fu3
 369 | fu4
 370 | g
 371 | ga1
 372 | ga2
 373 | ga3
 374 | ga4
 375 | gai1
 376 | gai2
 377 | gai3
 378 | gai4
 379 | gan1
 380 | gan2
 381 | gan3
 382 | gan4
 383 | gang1
 384 | gang2
 385 | gang3
 386 | gang4
 387 | gao1
 388 | gao2
 389 | gao3
 390 | gao4
 391 | ge1
 392 | ge2
 393 | ge3
 394 | ge4
 395 | gei2
 396 | gei3
 397 | gen1
 398 | gen2
 399 | gen3
 400 | gen4
 401 | geng1
 402 | geng3
 403 | geng4
 404 | gong1
 405 | gong3
 406 | gong4
 407 | gou1
 408 | gou2
 409 | gou3
 410 | gou4
 411 | gu
 412 | gu1
 413 | gu2
 414 | gu3
 415 | gu4
 416 | gua1
 417 | gua2
 418 | gua3
 419 | gua4
 420 | guai1
 421 | guai2
 422 | guai3
 423 | guai4
 424 | guan1
 425 | guan2
 426 | guan3
 427 | guan4
 428 | guang1
 429 | guang2
 430 | guang3
 431 | guang4
 432 | gui1
 433 | gui2
 434 | gui3
 435 | gui4
 436 | gun3
 437 | gun4
 438 | guo1
 439 | guo2
 440 | guo3
 441 | guo4
 442 | h
 443 | ha1
 444 | ha2
 445 | ha3
 446 | hai1
 447 | hai2
 448 | hai3
 449 | hai4
 450 | han1
 451 | han2
 452 | han3
 453 | han4
 454 | hang1
 455 | hang2
 456 | hang4
 457 | hao1
 458 | hao2
 459 | hao3
 460 | hao4
 461 | he1
 462 | he2
 463 | he4
 464 | hei1
 465 | hen2
 466 | hen3
 467 | hen4
 468 | heng1
 469 | heng2
 470 | heng4
 471 | hong1
 472 | hong2
 473 | hong3
 474 | hong4
 475 | hou1
 476 | hou2
 477 | hou3
 478 | hou4
 479 | hu1
 480 | hu2
 481 | hu3
 482 | hu4
 483 | hua1
 484 | hua2
 485 | hua4
 486 | huai2
 487 | huai4
 488 | huan1
 489 | huan2
 490 | huan3
 491 | huan4
 492 | huang1
 493 | huang2
 494 | huang3
 495 | huang4
 496 | hui1
 497 | hui2
 498 | hui3
 499 | hui4
 500 | hun1
 501 | hun2
 502 | hun4
 503 | huo
 504 | huo1
 505 | huo2
 506 | huo3
 507 | huo4
 508 | i
 509 | j
 510 | ji1
 511 | ji2
 512 | ji3
 513 | ji4
 514 | jia
 515 | jia1
 516 | jia2
 517 | jia3
 518 | jia4
 519 | jian1
 520 | jian2
 521 | jian3
 522 | jian4
 523 | jiang1
 524 | jiang2
 525 | jiang3
 526 | jiang4
 527 | jiao1
 528 | jiao2
 529 | jiao3
 530 | jiao4
 531 | jie1
 532 | jie2
 533 | jie3
 534 | jie4
 535 | jin1
 536 | jin2
 537 | jin3
 538 | jin4
 539 | jing1
 540 | jing2
 541 | jing3
 542 | jing4
 543 | jiong3
 544 | jiu1
 545 | jiu2
 546 | jiu3
 547 | jiu4
 548 | ju1
 549 | ju2
 550 | ju3
 551 | ju4
 552 | juan1
 553 | juan2
 554 | juan3
 555 | juan4
 556 | jue1
 557 | jue2
 558 | jue4
 559 | jun1
 560 | jun4
 561 | k
 562 | ka1
 563 | ka2
 564 | ka3
 565 | kai1
 566 | kai2
 567 | kai3
 568 | kai4
 569 | kan1
 570 | kan2
 571 | kan3
 572 | kan4
 573 | kang1
 574 | kang2
 575 | kang4
 576 | kao1
 577 | kao2
 578 | kao3
 579 | kao4
 580 | ke1
 581 | ke2
 582 | ke3
 583 | ke4
 584 | ken3
 585 | keng1
 586 | kong1
 587 | kong3
 588 | kong4
 589 | kou1
 590 | kou2
 591 | kou3
 592 | kou4
 593 | ku1
 594 | ku2
 595 | ku3
 596 | ku4
 597 | kua1
 598 | kua3
 599 | kua4
 600 | kuai3
 601 | kuai4
 602 | kuan1
 603 | kuan2
 604 | kuan3
 605 | kuang1
 606 | kuang2
 607 | kuang4
 608 | kui1
 609 | kui2
 610 | kui3
 611 | kui4
 612 | kun1
 613 | kun3
 614 | kun4
 615 | kuo4
 616 | l
 617 | la
 618 | la1
 619 | la2
 620 | la3
 621 | la4
 622 | lai2
 623 | lai4
 624 | lan2
 625 | lan3
 626 | lan4
 627 | lang1
 628 | lang2
 629 | lang3
 630 | lang4
 631 | lao1
 632 | lao2
 633 | lao3
 634 | lao4
 635 | le
 636 | le1
 637 | le4
 638 | lei
 639 | lei1
 640 | lei2
 641 | lei3
 642 | lei4
 643 | leng1
 644 | leng2
 645 | leng3
 646 | leng4
 647 | li
 648 | li1
 649 | li2
 650 | li3
 651 | li4
 652 | lia3
 653 | lian2
 654 | lian3
 655 | lian4
 656 | liang2
 657 | liang3
 658 | liang4
 659 | liao1
 660 | liao2
 661 | liao3
 662 | liao4
 663 | lie1
 664 | lie2
 665 | lie3
 666 | lie4
 667 | lin1
 668 | lin2
 669 | lin3
 670 | lin4
 671 | ling2
 672 | ling3
 673 | ling4
 674 | liu1
 675 | liu2
 676 | liu3
 677 | liu4
 678 | long1
 679 | long2
 680 | long3
 681 | long4
 682 | lou1
 683 | lou2
 684 | lou3
 685 | lou4
 686 | lu1
 687 | lu2
 688 | lu3
 689 | lu4
 690 | luan2
 691 | luan3
 692 | luan4
 693 | lun1
 694 | lun2
 695 | lun4
 696 | luo1
 697 | luo2
 698 | luo3
 699 | luo4
 700 | lv2
 701 | lv3
 702 | lv4
 703 | lve3
 704 | lve4
 705 | m
 706 | ma
 707 | ma1
 708 | ma2
 709 | ma3
 710 | ma4
 711 | mai2
 712 | mai3
 713 | mai4
 714 | man1
 715 | man2
 716 | man3
 717 | man4
 718 | mang2
 719 | mang3
 720 | mao1
 721 | mao2
 722 | mao3
 723 | mao4
 724 | me
 725 | mei2
 726 | mei3
 727 | mei4
 728 | men
 729 | men1
 730 | men2
 731 | men4
 732 | meng
 733 | meng1
 734 | meng2
 735 | meng3
 736 | meng4
 737 | mi1
 738 | mi2
 739 | mi3
 740 | mi4
 741 | mian2
 742 | mian3
 743 | mian4
 744 | miao1
 745 | miao2
 746 | miao3
 747 | miao4
 748 | mie1
 749 | mie4
 750 | min2
 751 | min3
 752 | ming2
 753 | ming3
 754 | ming4
 755 | miu4
 756 | mo1
 757 | mo2
 758 | mo3
 759 | mo4
 760 | mou1
 761 | mou2
 762 | mou3
 763 | mu2
 764 | mu3
 765 | mu4
 766 | n
 767 | n2
 768 | na1
 769 | na2
 770 | na3
 771 | na4
 772 | nai2
 773 | nai3
 774 | nai4
 775 | nan1
 776 | nan2
 777 | nan3
 778 | nan4
 779 | nang1
 780 | nang2
 781 | nang3
 782 | nao1
 783 | nao2
 784 | nao3
 785 | nao4
 786 | ne
 787 | ne2
 788 | ne4
 789 | nei3
 790 | nei4
 791 | nen4
 792 | neng2
 793 | ni1
 794 | ni2
 795 | ni3
 796 | ni4
 797 | nian1
 798 | nian2
 799 | nian3
 800 | nian4
 801 | niang2
 802 | niang4
 803 | niao2
 804 | niao3
 805 | niao4
 806 | nie1
 807 | nie4
 808 | nin2
 809 | ning2
 810 | ning3
 811 | ning4
 812 | niu1
 813 | niu2
 814 | niu3
 815 | niu4
 816 | nong2
 817 | nong4
 818 | nou4
 819 | nu2
 820 | nu3
 821 | nu4
 822 | nuan3
 823 | nuo2
 824 | nuo4
 825 | nv2
 826 | nv3
 827 | nve4
 828 | o
 829 | o1
 830 | o2
 831 | ou1
 832 | ou2
 833 | ou3
 834 | ou4
 835 | p
 836 | pa1
 837 | pa2
 838 | pa4
 839 | pai1
 840 | pai2
 841 | pai3
 842 | pai4
 843 | pan1
 844 | pan2
 845 | pan4
 846 | pang1
 847 | pang2
 848 | pang4
 849 | pao1
 850 | pao2
 851 | pao3
 852 | pao4
 853 | pei1
 854 | pei2
 855 | pei4
 856 | pen1
 857 | pen2
 858 | pen4
 859 | peng1
 860 | peng2
 861 | peng3
 862 | peng4
 863 | pi1
 864 | pi2
 865 | pi3
 866 | pi4
 867 | pian1
 868 | pian2
 869 | pian4
 870 | piao1
 871 | piao2
 872 | piao3
 873 | piao4
 874 | pie1
 875 | pie2
 876 | pie3
 877 | pin1
 878 | pin2
 879 | pin3
 880 | pin4
 881 | ping1
 882 | ping2
 883 | po1
 884 | po2
 885 | po3
 886 | po4
 887 | pou1
 888 | pu1
 889 | pu2
 890 | pu3
 891 | pu4
 892 | q
 893 | qi1
 894 | qi2
 895 | qi3
 896 | qi4
 897 | qia1
 898 | qia3
 899 | qia4
 900 | qian1
 901 | qian2
 902 | qian3
 903 | qian4
 904 | qiang1
 905 | qiang2
 906 | qiang3
 907 | qiang4
 908 | qiao1
 909 | qiao2
 910 | qiao3
 911 | qiao4
 912 | qie1
 913 | qie2
 914 | qie3
 915 | qie4
 916 | qin1
 917 | qin2
 918 | qin3
 919 | qin4
 920 | qing1
 921 | qing2
 922 | qing3
 923 | qing4
 924 | qiong1
 925 | qiong2
 926 | qiu1
 927 | qiu2
 928 | qiu3
 929 | qu1
 930 | qu2
 931 | qu3
 932 | qu4
 933 | quan1
 934 | quan2
 935 | quan3
 936 | quan4
 937 | que1
 938 | que2
 939 | que4
 940 | qun2
 941 | r
 942 | ran2
 943 | ran3
 944 | rang1
 945 | rang2
 946 | rang3
 947 | rang4
 948 | rao2
 949 | rao3
 950 | rao4
 951 | re2
 952 | re3
 953 | re4
 954 | ren2
 955 | ren3
 956 | ren4
 957 | reng1
 958 | reng2
 959 | ri4
 960 | rong1
 961 | rong2
 962 | rong3
 963 | rou2
 964 | rou4
 965 | ru2
 966 | ru3
 967 | ru4
 968 | ruan2
 969 | ruan3
 970 | rui3
 971 | rui4
 972 | run4
 973 | ruo4
 974 | s
 975 | sa1
 976 | sa2
 977 | sa3
 978 | sa4
 979 | sai1
 980 | sai4
 981 | san1
 982 | san2
 983 | san3
 984 | san4
 985 | sang1
 986 | sang3
 987 | sang4
 988 | sao1
 989 | sao2
 990 | sao3
 991 | sao4
 992 | se4
 993 | sen1
 994 | seng1
 995 | sha1
 996 | sha2
 997 | sha3
 998 | sha4
 999 | shai1
1000 | shai2
1001 | shai3
1002 | shai4
1003 | shan1
1004 | shan3
1005 | shan4
1006 | shang
1007 | shang1
1008 | shang3
1009 | shang4
1010 | shao1
1011 | shao2
1012 | shao3
1013 | shao4
1014 | she1
1015 | she2
1016 | she3
1017 | she4
1018 | shei2
1019 | shen1
1020 | shen2
1021 | shen3
1022 | shen4
1023 | sheng1
1024 | sheng2
1025 | sheng3
1026 | sheng4
1027 | shi
1028 | shi1
1029 | shi2
1030 | shi3
1031 | shi4
1032 | shou1
1033 | shou2
1034 | shou3
1035 | shou4
1036 | shu1
1037 | shu2
1038 | shu3
1039 | shu4
1040 | shua1
1041 | shua2
1042 | shua3
1043 | shua4
1044 | shuai1
1045 | shuai3
1046 | shuai4
1047 | shuan1
1048 | shuan4
1049 | shuang1
1050 | shuang3
1051 | shui2
1052 | shui3
1053 | shui4
1054 | shun3
1055 | shun4
1056 | shuo1
1057 | shuo4
1058 | si1
1059 | si2
1060 | si3
1061 | si4
1062 | song1
1063 | song3
1064 | song4
1065 | sou1
1066 | sou3
1067 | sou4
1068 | su1
1069 | su2
1070 | su4
1071 | suan1
1072 | suan4
1073 | sui1
1074 | sui2
1075 | sui3
1076 | sui4
1077 | sun1
1078 | sun3
1079 | suo
1080 | suo1
1081 | suo2
1082 | suo3
1083 | t
1084 | ta1
1085 | ta2
1086 | ta3
1087 | ta4
1088 | tai1
1089 | tai2
1090 | tai4
1091 | tan1
1092 | tan2
1093 | tan3
1094 | tan4
1095 | tang1
1096 | tang2
1097 | tang3
1098 | tang4
1099 | tao1
1100 | tao2
1101 | tao3
1102 | tao4
1103 | te4
1104 | teng2
1105 | ti1
1106 | ti2
1107 | ti3
1108 | ti4
1109 | tian1
1110 | tian2
1111 | tian3
1112 | tiao1
1113 | tiao2
1114 | tiao3
1115 | tiao4
1116 | tie1
1117 | tie2
1118 | tie3
1119 | tie4
1120 | ting1
1121 | ting2
1122 | ting3
1123 | tong1
1124 | tong2
1125 | tong3
1126 | tong4
1127 | tou
1128 | tou1
1129 | tou2
1130 | tou4
1131 | tu1
1132 | tu2
1133 | tu3
1134 | tu4
1135 | tuan1
1136 | tuan2
1137 | tui1
1138 | tui2
1139 | tui3
1140 | tui4
1141 | tun1
1142 | tun2
1143 | tun4
1144 | tuo1
1145 | tuo2
1146 | tuo3
1147 | tuo4
1148 | u
1149 | v
1150 | w
1151 | wa
1152 | wa1
1153 | wa2
1154 | wa3
1155 | wa4
1156 | wai1
1157 | wai3
1158 | wai4
1159 | wan1
1160 | wan2
1161 | wan3
1162 | wan4
1163 | wang1
1164 | wang2
1165 | wang3
1166 | wang4
1167 | wei1
1168 | wei2
1169 | wei3
1170 | wei4
1171 | wen1
1172 | wen2
1173 | wen3
1174 | wen4
1175 | weng1
1176 | weng4
1177 | wo1
1178 | wo2
1179 | wo3
1180 | wo4
1181 | wu1
1182 | wu2
1183 | wu3
1184 | wu4
1185 | x
1186 | xi1
1187 | xi2
1188 | xi3
1189 | xi4
1190 | xia1
1191 | xia2
1192 | xia4
1193 | xian1
1194 | xian2
1195 | xian3
1196 | xian4
1197 | xiang1
1198 | xiang2
1199 | xiang3
1200 | xiang4
1201 | xiao1
1202 | xiao2
1203 | xiao3
1204 | xiao4
1205 | xie1
1206 | xie2
1207 | xie3
1208 | xie4
1209 | xin1
1210 | xin2
1211 | xin4
1212 | xing1
1213 | xing2
1214 | xing3
1215 | xing4
1216 | xiong1
1217 | xiong2
1218 | xiu1
1219 | xiu3
1220 | xiu4
1221 | xu
1222 | xu1
1223 | xu2
1224 | xu3
1225 | xu4
1226 | xuan1
1227 | xuan2
1228 | xuan3
1229 | xuan4
1230 | xue1
1231 | xue2
1232 | xue3
1233 | xue4
1234 | xun1
1235 | xun2
1236 | xun4
1237 | y
1238 | ya
1239 | ya1
1240 | ya2
1241 | ya3
1242 | ya4
1243 | yan1
1244 | yan2
1245 | yan3
1246 | yan4
1247 | yang1
1248 | yang2
1249 | yang3
1250 | yang4
1251 | yao1
1252 | yao2
1253 | yao3
1254 | yao4
1255 | ye1
1256 | ye2
1257 | ye3
1258 | ye4
1259 | yi
1260 | yi1
1261 | yi2
1262 | yi3
1263 | yi4
1264 | yin1
1265 | yin2
1266 | yin3
1267 | yin4
1268 | ying1
1269 | ying2
1270 | ying3
1271 | ying4
1272 | yo1
1273 | yong1
1274 | yong2
1275 | yong3
1276 | yong4
1277 | you1
1278 | you2
1279 | you3
1280 | you4
1281 | yu1
1282 | yu2
1283 | yu3
1284 | yu4
1285 | yuan1
1286 | yuan2
1287 | yuan3
1288 | yuan4
1289 | yue1
1290 | yue4
1291 | yun1
1292 | yun2
1293 | yun3
1294 | yun4
1295 | z
1296 | za1
1297 | za2
1298 | za3
1299 | zai1
1300 | zai3
1301 | zai4
1302 | zan1
1303 | zan2
1304 | zan3
1305 | zan4
1306 | zang1
1307 | zang4
1308 | zao1
1309 | zao2
1310 | zao3
1311 | zao4
1312 | ze2
1313 | ze4
1314 | zei2
1315 | zen3
1316 | zeng1
1317 | zeng4
1318 | zha1
1319 | zha2
1320 | zha3
1321 | zha4
1322 | zhai1
1323 | zhai2
1324 | zhai3
1325 | zhai4
1326 | zhan1
1327 | zhan2
1328 | zhan3
1329 | zhan4
1330 | zhang1
1331 | zhang2
1332 | zhang3
1333 | zhang4
1334 | zhao1
1335 | zhao2
1336 | zhao3
1337 | zhao4
1338 | zhe
1339 | zhe1
1340 | zhe2
1341 | zhe3
1342 | zhe4
1343 | zhen1
1344 | zhen2
1345 | zhen3
1346 | zhen4
1347 | zheng1
1348 | zheng2
1349 | zheng3
1350 | zheng4
1351 | zhi1
1352 | zhi2
1353 | zhi3
1354 | zhi4
1355 | zhong1
1356 | zhong2
1357 | zhong3
1358 | zhong4
1359 | zhou1
1360 | zhou2
1361 | zhou3
1362 | zhou4
1363 | zhu1
1364 | zhu2
1365 | zhu3
1366 | zhu4
1367 | zhua1
1368 | zhua2
1369 | zhua3
1370 | zhuai1
1371 | zhuai3
1372 | zhuai4
1373 | zhuan1
1374 | zhuan2
1375 | zhuan3
1376 | zhuan4
1377 | zhuang1
1378 | zhuang4
1379 | zhui1
1380 | zhui4
1381 | zhun1
1382 | zhun2
1383 | zhun3
1384 | zhuo1
1385 | zhuo2
1386 | zi
1387 | zi1
1388 | zi2
1389 | zi3
1390 | zi4
1391 | zong1
1392 | zong2
1393 | zong3
1394 | zong4
1395 | zou1
1396 | zou2
1397 | zou3
1398 | zou4
1399 | zu1
1400 | zu2
1401 | zu3
1402 | zuan1
1403 | zuan3
1404 | zuan4
1405 | zui2
1406 | zui3
1407 | zui4
1408 | zun1
1409 | zuo
1410 | zuo1
1411 | zuo2
1412 | zuo3
1413 | zuo4
1414 | {
1415 | ~
1416 | ¡
1417 | ¢
1418 | £
1419 | ¥
1420 | §
1421 | ¨
1422 | ©
1423 | «
1424 | ®
1425 | ¯
1426 | °
1427 | ±
1428 | ²
1429 | ³
1430 | ´
1431 | µ
1432 | ·
1433 | ¹
1434 | º
1435 | »
1436 | ¼
1437 | ½
1438 | ¾
1439 | ¿
1440 | À
1441 | Á
1442 | Â
1443 | Ã
1444 | Ä
1445 | Å
1446 | Æ
1447 | Ç
1448 | È
1449 | É
1450 | Ê
1451 | Í
1452 | Î
1453 | Ñ
1454 | Ó
1455 | Ö
1456 | ×
1457 | Ø
1458 | Ú
1459 | Ü
1460 | Ý
1461 | Þ
1462 | ß
1463 | à
1464 | á
1465 | â
1466 | ã
1467 | ä
1468 | å
1469 | æ
1470 | ç
1471 | è
1472 | é
1473 | ê
1474 | ë
1475 | ì
1476 | í
1477 | î
1478 | ï
1479 | ð
1480 | ñ
1481 | ò
1482 | ó
1483 | ô
1484 | õ
1485 | ö
1486 | ø
1487 | ù
1488 | ú
1489 | û
1490 | ü
1491 | ý
1492 | Ā
1493 | ā
1494 | ă
1495 | ą
1496 | ć
1497 | Č
1498 | č
1499 | Đ
1500 | đ
1501 | ē
1502 | ė
1503 | ę
1504 | ě
1505 | ĝ
1506 | ğ
1507 | ħ
1508 | ī
1509 | į
1510 | İ
1511 | ı
1512 | Ł
1513 | ł
1514 | ń
1515 | ņ
1516 | ň
1517 | ŋ
1518 | Ō
1519 | ō
1520 | ő
1521 | œ
1522 | ř
1523 | Ś
1524 | ś
1525 | Ş
1526 | ş
1527 | Š
1528 | š
1529 | Ť
1530 | ť
1531 | ũ
1532 | ū
1533 | ź
1534 | Ż
1535 | ż
1536 | Ž
1537 | ž
1538 | ơ
1539 | ư
1540 | ǎ
1541 | ǐ
1542 | ǒ
1543 | ǔ
1544 | ǚ
1545 | ș
1546 | ț
1547 | ɑ
1548 | ɔ
1549 | ɕ
1550 | ə
1551 | ɛ
1552 | ɜ
1553 | ɡ
1554 | ɣ
1555 | ɪ
1556 | ɫ
1557 | ɴ
1558 | ɹ
1559 | ɾ
1560 | ʃ
1561 | ʊ
1562 | ʌ
1563 | ʒ
1564 | ʔ
1565 | ʰ
1566 | ʷ
1567 | ʻ
1568 | ʾ
1569 | ʿ
1570 | ˈ
1571 | ː
1572 | ˙
1573 | ˜
1574 | ˢ
1575 | ́
1576 | ̅
1577 | Α
1578 | Β
1579 | Δ
1580 | Ε
1581 | Θ
1582 | Κ
1583 | Λ
1584 | Μ
1585 | Ξ
1586 | Π
1587 | Σ
1588 | Τ
1589 | Φ
1590 | Χ
1591 | Ψ
1592 | Ω
1593 | ά
1594 | έ
1595 | ή
1596 | ί
1597 | α
1598 | β
1599 | γ
1600 | δ
1601 | ε
1602 | ζ
1603 | η
1604 | θ
1605 | ι
1606 | κ
1607 | λ
1608 | μ
1609 | ν
1610 | ξ
1611 | ο
1612 | π
1613 | ρ
1614 | ς
1615 | σ
1616 | τ
1617 | υ
1618 | φ
1619 | χ
1620 | ψ
1621 | ω
1622 | ϊ
1623 | ό
1624 | ύ
1625 | ώ
1626 | ϕ
1627 | ϵ
1628 | Ё
1629 | А
1630 | Б
1631 | В
1632 | Г
1633 | Д
1634 | Е
1635 | Ж
1636 | З
1637 | И
1638 | Й
1639 | К
1640 | Л
1641 | М
1642 | Н
1643 | О
1644 | П
1645 | Р
1646 | С
1647 | Т
1648 | У
1649 | Ф
1650 | Х
1651 | Ц
1652 | Ч
1653 | Ш
1654 | Щ
1655 | Ы
1656 | Ь
1657 | Э
1658 | Ю
1659 | Я
1660 | а
1661 | б
1662 | в
1663 | г
1664 | д
1665 | е
1666 | ж
1667 | з
1668 | и
1669 | й
1670 | к
1671 | л
1672 | м
1673 | н
1674 | о
1675 | п
1676 | р
1677 | с
1678 | т
1679 | у
1680 | ф
1681 | х
1682 | ц
1683 | ч
1684 | ш
1685 | щ
1686 | ъ
1687 | ы
1688 | ь
1689 | э
1690 | ю
1691 | я
1692 | ё
1693 | і
1694 | ְ
1695 | ִ
1696 | ֵ
1697 | ֶ
1698 | ַ
1699 | ָ
1700 | ֹ
1701 | ּ
1702 | ־
1703 | ׁ
1704 | א
1705 | ב
1706 | ג
1707 | ד
1708 | ה
1709 | ו
1710 | ז
1711 | ח
1712 | ט
1713 | י
1714 | כ
1715 | ל
1716 | ם
1717 | מ
1718 | ן
1719 | נ
1720 | ס
1721 | ע
1722 | פ
1723 | ק
1724 | ר
1725 | ש
1726 | ת
1727 | أ
1728 | ب
1729 | ة
1730 | ت
1731 | ج
1732 | ح
1733 | د
1734 | ر
1735 | ز
1736 | س
1737 | ص
1738 | ط
1739 | ع
1740 | ق
1741 | ك
1742 | ل
1743 | م
1744 | ن
1745 | ه
1746 | و
1747 | ي
1748 | َ
1749 | ُ
1750 | ِ
1751 | ْ
1752 | ก
1753 | ข
1754 | ง
1755 | จ
1756 | ต
1757 | ท
1758 | น
1759 | ป
1760 | ย
1761 | ร
1762 | ว
1763 | ส
1764 | ห
1765 | อ
1766 | ฮ
1767 | ั
1768 | า
1769 | ี
1770 | ึ
1771 | โ
1772 | ใ
1773 | ไ
1774 | ่
1775 | ้
1776 | ์
1777 | ḍ
1778 | Ḥ
1779 | ḥ
1780 | ṁ
1781 | ṃ
1782 | ṅ
1783 | ṇ
1784 | Ṛ
1785 | ṛ
1786 | Ṣ
1787 | ṣ
1788 | Ṭ
1789 | ṭ
1790 | ạ
1791 | ả
1792 | Ấ
1793 | ấ
1794 | ầ
1795 | ậ
1796 | ắ
1797 | ằ
1798 | ẻ
1799 | ẽ
1800 | ế
1801 | ề
1802 | ể
1803 | ễ
1804 | ệ
1805 | ị
1806 | ọ
1807 | ỏ
1808 | ố
1809 | ồ
1810 | ộ
1811 | ớ
1812 | ờ
1813 | ở
1814 | ụ
1815 | ủ
1816 | ứ
1817 | ữ
1818 | ἀ
1819 | ἁ
1820 | Ἀ
1821 | ἐ
1822 | ἔ
1823 | ἰ
1824 | ἱ
1825 | ὀ
1826 | ὁ
1827 | ὐ
1828 | ὲ
1829 | ὸ
1830 | ᾶ
1831 | ᾽
1832 | ῆ
1833 | ῇ
1834 | ῶ
1835 | ‎
1836 | ‑
1837 | ‒
1838 | –
1839 | —
1840 | ―
1841 | ‖
1842 | †
1843 | ‡
1844 | •
1845 | …
1846 | ‧
1847 | ‬
1848 | ′
1849 | ″
1850 | ⁄
1851 | ⁡
1852 | ⁰
1853 | ⁴
1854 | ⁵
1855 | ⁶
1856 | ⁷
1857 | ⁸
1858 | ⁹
1859 | ₁
1860 | ₂
1861 | ₃
1862 | €
1863 | ₱
1864 | ₹
1865 | ₽
1866 | ℃
1867 | ℏ
1868 | ℓ
1869 | №
1870 | ℝ
1871 | ™
1872 | ⅓
1873 | ⅔
1874 | ⅛
1875 | →
1876 | ∂
1877 | ∈
1878 | ∑
1879 | −
1880 | ∗
1881 | √
1882 | ∞
1883 | ∫
1884 | ≈
1885 | ≠
1886 | ≡
1887 | ≤
1888 | ≥
1889 | ⋅
1890 | ⋯
1891 | █
1892 | ♪
1893 | ⟨
1894 | ⟩
1895 | 、
1896 | 。
1897 | 《
1898 | 》
1899 | 「
1900 | 」
1901 | 【
1902 | 】
1903 | あ
1904 | う
1905 | え
1906 | お
1907 | か
1908 | が
1909 | き
1910 | ぎ
1911 | く
1912 | ぐ
1913 | け
1914 | げ
1915 | こ
1916 | ご
1917 | さ
1918 | し
1919 | じ
1920 | す
1921 | ず
1922 | せ
1923 | ぜ
1924 | そ
1925 | ぞ
1926 | た
1927 | だ
1928 | ち
1929 | っ
1930 | つ
1931 | で
1932 | と
1933 | ど
1934 | な
1935 | に
1936 | ね
1937 | の
1938 | は
1939 | ば
1940 | ひ
1941 | ぶ
1942 | へ
1943 | べ
1944 | ま
1945 | み
1946 | む
1947 | め
1948 | も
1949 | ゃ
1950 | や
1951 | ゆ
1952 | ょ
1953 | よ
1954 | ら
1955 | り
1956 | る
1957 | れ
1958 | ろ
1959 | わ
1960 | を
1961 | ん
1962 | ァ
1963 | ア
1964 | ィ
1965 | イ
1966 | ウ
1967 | ェ
1968 | エ
1969 | オ
1970 | カ
1971 | ガ
1972 | キ
1973 | ク
1974 | ケ
1975 | ゲ
1976 | コ
1977 | ゴ
1978 | サ
1979 | ザ
1980 | シ
1981 | ジ
1982 | ス
1983 | ズ
1984 | セ
1985 | ゾ
1986 | タ
1987 | ダ
1988 | チ
1989 | ッ
1990 | ツ
1991 | テ
1992 | デ
1993 | ト
1994 | ド
1995 | ナ
1996 | ニ
1997 | ネ
1998 | ノ
1999 | バ
2000 | パ
2001 | ビ
2002 | ピ
2003 | フ
2004 | プ
2005 | ヘ
2006 | ベ
2007 | ペ
2008 | ホ
2009 | ボ
2010 | ポ
2011 | マ
2012 | ミ
2013 | ム
2014 | メ
2015 | モ
2016 | ャ
2017 | ヤ
2018 | ュ
2019 | ユ
2020 | ョ
2021 | ヨ
2022 | ラ
2023 | リ
2024 | ル
2025 | レ
2026 | ロ
2027 | ワ
2028 | ン
2029 | ・
2030 | ー
2031 | ㄋ
2032 | ㄍ
2033 | ㄎ
2034 | ㄏ
2035 | ㄓ
2036 | ㄕ
2037 | ㄚ
2038 | ㄜ
2039 | ㄟ
2040 | ㄤ
2041 | ㄥ
2042 | ㄧ
2043 | ㄱ
2044 | ㄴ
2045 | ㄷ
2046 | ㄹ
2047 | ㅁ
2048 | ㅂ
2049 | ㅅ
2050 | ㅈ
2051 | ㅍ
2052 | ㅎ
2053 | ㅏ
2054 | ㅓ
2055 | ㅗ
2056 | ㅜ
2057 | ㅡ
2058 | ㅣ
2059 | 㗎
2060 | 가
2061 | 각
2062 | 간
2063 | 갈
2064 | 감
2065 | 갑
2066 | 갓
2067 | 갔
2068 | 강
2069 | 같
2070 | 개
2071 | 거
2072 | 건
2073 | 걸
2074 | 겁
2075 | 것
2076 | 겉
2077 | 게
2078 | 겠
2079 | 겨
2080 | 결
2081 | 겼
2082 | 경
2083 | 계
2084 | 고
2085 | 곤
2086 | 골
2087 | 곱
2088 | 공
2089 | 과
2090 | 관
2091 | 광
2092 | 교
2093 | 구
2094 | 국
2095 | 굴
2096 | 귀
2097 | 귄
2098 | 그
2099 | 근
2100 | 글
2101 | 금
2102 | 기
2103 | 긴
2104 | 길
2105 | 까
2106 | 깍
2107 | 깔
2108 | 깜
2109 | 깨
2110 | 께
2111 | 꼬
2112 | 꼭
2113 | 꽃
2114 | 꾸
2115 | 꿔
2116 | 끔
2117 | 끗
2118 | 끝
2119 | 끼
2120 | 나
2121 | 난
2122 | 날
2123 | 남
2124 | 납
2125 | 내
2126 | 냐
2127 | 냥
2128 | 너
2129 | 넘
2130 | 넣
2131 | 네
2132 | 녁
2133 | 년
2134 | 녕
2135 | 노
2136 | 녹
2137 | 놀
2138 | 누
2139 | 눈
2140 | 느
2141 | 는
2142 | 늘
2143 | 니
2144 | 님
2145 | 닙
2146 | 다
2147 | 닥
2148 | 단
2149 | 달
2150 | 닭
2151 | 당
2152 | 대
2153 | 더
2154 | 덕
2155 | 던
2156 | 덥
2157 | 데
2158 | 도
2159 | 독
2160 | 동
2161 | 돼
2162 | 됐
2163 | 되
2164 | 된
2165 | 될
2166 | 두
2167 | 둑
2168 | 둥
2169 | 드
2170 | 들
2171 | 등
2172 | 디
2173 | 따
2174 | 딱
2175 | 딸
2176 | 땅
2177 | 때
2178 | 떤
2179 | 떨
2180 | 떻
2181 | 또
2182 | 똑
2183 | 뚱
2184 | 뛰
2185 | 뜻
2186 | 띠
2187 | 라
2188 | 락
2189 | 란
2190 | 람
2191 | 랍
2192 | 랑
2193 | 래
2194 | 랜
2195 | 러
2196 | 런
2197 | 럼
2198 | 렇
2199 | 레
2200 | 려
2201 | 력
2202 | 렵
2203 | 렸
2204 | 로
2205 | 록
2206 | 롬
2207 | 루
2208 | 르
2209 | 른
2210 | 를
2211 | 름
2212 | 릉
2213 | 리
2214 | 릴
2215 | 림
2216 | 마
2217 | 막
2218 | 만
2219 | 많
2220 | 말
2221 | 맑
2222 | 맙
2223 | 맛
2224 | 매
2225 | 머
2226 | 먹
2227 | 멍
2228 | 메
2229 | 면
2230 | 명
2231 | 몇
2232 | 모
2233 | 목
2234 | 몸
2235 | 못
2236 | 무
2237 | 문
2238 | 물
2239 | 뭐
2240 | 뭘
2241 | 미
2242 | 민
2243 | 밌
2244 | 밑
2245 | 바
2246 | 박
2247 | 밖
2248 | 반
2249 | 받
2250 | 발
2251 | 밤
2252 | 밥
2253 | 방
2254 | 배
2255 | 백
2256 | 밸
2257 | 뱀
2258 | 버
2259 | 번
2260 | 벌
2261 | 벚
2262 | 베
2263 | 벼
2264 | 벽
2265 | 별
2266 | 병
2267 | 보
2268 | 복
2269 | 본
2270 | 볼
2271 | 봐
2272 | 봤
2273 | 부
2274 | 분
2275 | 불
2276 | 비
2277 | 빔
2278 | 빛
2279 | 빠
2280 | 빨
2281 | 뼈
2282 | 뽀
2283 | 뿅
2284 | 쁘
2285 | 사
2286 | 산
2287 | 살
2288 | 삼
2289 | 샀
2290 | 상
2291 | 새
2292 | 색
2293 | 생
2294 | 서
2295 | 선
2296 | 설
2297 | 섭
2298 | 섰
2299 | 성
2300 | 세
2301 | 셔
2302 | 션
2303 | 셨
2304 | 소
2305 | 속
2306 | 손
2307 | 송
2308 | 수
2309 | 숙
2310 | 순
2311 | 술
2312 | 숫
2313 | 숭
2314 | 숲
2315 | 쉬
2316 | 쉽
2317 | 스
2318 | 슨
2319 | 습
2320 | 슷
2321 | 시
2322 | 식
2323 | 신
2324 | 실
2325 | 싫
2326 | 심
2327 | 십
2328 | 싶
2329 | 싸
2330 | 써
2331 | 쓰
2332 | 쓴
2333 | 씌
2334 | 씨
2335 | 씩
2336 | 씬
2337 | 아
2338 | 악
2339 | 안
2340 | 않
2341 | 알
2342 | 야
2343 | 약
2344 | 얀
2345 | 양
2346 | 얘
2347 | 어
2348 | 언
2349 | 얼
2350 | 엄
2351 | 업
2352 | 없
2353 | 었
2354 | 엉
2355 | 에
2356 | 여
2357 | 역
2358 | 연
2359 | 염
2360 | 엽
2361 | 영
2362 | 옆
2363 | 예
2364 | 옛
2365 | 오
2366 | 온
2367 | 올
2368 | 옷
2369 | 옹
2370 | 와
2371 | 왔
2372 | 왜
2373 | 요
2374 | 욕
2375 | 용
2376 | 우
2377 | 운
2378 | 울
2379 | 웃
2380 | 워
2381 | 원
2382 | 월
2383 | 웠
2384 | 위
2385 | 윙
2386 | 유
2387 | 육
2388 | 윤
2389 | 으
2390 | 은
2391 | 을
2392 | 음
2393 | 응
2394 | 의
2395 | 이
2396 | 익
2397 | 인
2398 | 일
2399 | 읽
2400 | 임
2401 | 입
2402 | 있
2403 | 자
2404 | 작
2405 | 잔
2406 | 잖
2407 | 잘
2408 | 잡
2409 | 잤
2410 | 장
2411 | 재
2412 | 저
2413 | 전
2414 | 점
2415 | 정
2416 | 제
2417 | 져
2418 | 졌
2419 | 조
2420 | 족
2421 | 좀
2422 | 종
2423 | 좋
2424 | 죠
2425 | 주
2426 | 준
2427 | 줄
2428 | 중
2429 | 줘
2430 | 즈
2431 | 즐
2432 | 즘
2433 | 지
2434 | 진
2435 | 집
2436 | 짜
2437 | 짝
2438 | 쩌
2439 | 쪼
2440 | 쪽
2441 | 쫌
2442 | 쭈
2443 | 쯔
2444 | 찌
2445 | 찍
2446 | 차
2447 | 착
2448 | 찾
2449 | 책
2450 | 처
2451 | 천
2452 | 철
2453 | 체
2454 | 쳐
2455 | 쳤
2456 | 초
2457 | 촌
2458 | 추
2459 | 출
2460 | 춤
2461 | 춥
2462 | 춰
2463 | 치
2464 | 친
2465 | 칠
2466 | 침
2467 | 칩
2468 | 칼
2469 | 커
2470 | 켓
2471 | 코
2472 | 콩
2473 | 쿠
2474 | 퀴
2475 | 크
2476 | 큰
2477 | 큽
2478 | 키
2479 | 킨
2480 | 타
2481 | 태
2482 | 터
2483 | 턴
2484 | 털
2485 | 테
2486 | 토
2487 | 통
2488 | 투
2489 | 트
2490 | 특
2491 | 튼
2492 | 틀
2493 | 티
2494 | 팀
2495 | 파
2496 | 팔
2497 | 패
2498 | 페
2499 | 펜
2500 | 펭
2501 | 평
2502 | 포
2503 | 폭
2504 | 표
2505 | 품
2506 | 풍
2507 | 프
2508 | 플
2509 | 피
2510 | 필
2511 | 하
2512 | 학
2513 | 한
2514 | 할
2515 | 함
2516 | 합
2517 | 항
2518 | 해
2519 | 햇
2520 | 했
2521 | 행
2522 | 허
2523 | 험
2524 | 형
2525 | 혜
2526 | 호
2527 | 혼
2528 | 홀
2529 | 화
2530 | 회
2531 | 획
2532 | 후
2533 | 휴
2534 | 흐
2535 | 흔
2536 | 희
2537 | 히
2538 | 힘
2539 | ﷺ
2540 | ﷻ
2541 | ！
2542 | ，
2543 | ？
2544 | �
2545 | 𠮶
2546 | 


--------------------------------------------------------------------------------