├── .gitattributes
├── .gitignore
├── OutputWordLstToAniUseClass.py
├── OutputWordToAnkiUseClass
├── README.md
├── example pictures
    ├── 初始界面.png
    ├── 导入混淆词excel.png
    ├── 新版界面.png
    ├── 查询单个词.png
    ├── 查询混淆词.png
    └── 解析导入的混淆词.png
└── test.xlsx


/.gitattributes:
--------------------------------------------------------------------------------
###############################################################################
# Set default behavior to automatically normalize line endings.
###############################################################################
* text=auto

###############################################################################
# Set default behavior for command prompt diff.
#
# This is need for earlier builds of msysgit that does not have it on by
# default for csharp files.
# Note: This is only used by command line
###############################################################################
#*.cs diff=csharp

###############################################################################
# Set the merge driver for project and solution files
#
# Merging from the command prompt will add diff markers to the files if there
# are conflicts (Merging from VS is not affected by the settings below, in VS
# the diff markers are never inserted). Diff markers may cause the following
# file extensions to fail to load in VS. An alternative would be to treat
# these files as binary and thus will always conflict and require user
# intervention with every merge. To do so, just uncomment the entries below
###############################################################################
#*.sln merge=binary
#*.csproj merge=binary
#*.vbproj merge=binary
#*.vcxproj merge=binary
#*.vcproj merge=binary
#*.dbproj merge=binary
#*.fsproj merge=binary
#*.lsproj merge=binary
#*.wixproj merge=binary
#*.modelproj merge=binary
#*.sqlproj merge=binary
#*.wwaproj merge=binary

###############################################################################
# behavior for image files
#
# image files are treated as binary by default.
###############################################################################
#*.jpg binary
#*.png binary
#*.gif binary

###############################################################################
# diff behavior for common document formats
#
# Convert binary document formats to text before diffing them. This feature
# is only available from the command line. Turn it on by uncommenting the
# entries below.
###############################################################################
#*.doc diff=astextplain
#*.DOC diff=astextplain
#*.docx diff=astextplain
#*.DOCX diff=astextplain
#*.dot diff=astextplain
#*.DOT diff=astextplain
#*.pdf diff=astextplain
#*.PDF diff=astextplain
#*.rtf diff=astextplain
#*.RTF diff=astextplain

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
## Ignore Visual Studio temporary files, build results, and
## files generated by popular Visual Studio add-ons.

# User-specific files
*.suo
*.user
*.userosscache
*.sln.docstates

# User-specific files (MonoDevelop/Xamarin Studio)
*.userprefs

# Build results
[Dd]ebug/
[Dd]ebugPublic/
[Rr]elease/
[Rr]eleases/
x64/
x86/
bld/
[Bb]in/
[Oo]bj/
[Ll]og/

# Visual Studio 2015 cache/options directory
.vs/
# Uncomment if you have tasks that create the project's static files in wwwroot
#wwwroot/

# MSTest test Results
[Tt]est[Rr]esult*/
[Bb]uild[Ll]og.*

# NUNIT
*.VisualState.xml
TestResult.xml

# Build Results of an ATL Project
[Dd]ebugPS/
[Rr]eleasePS/
dlldata.c

# DNX
project.lock.json
project.fragment.lock.json
artifacts/

*_i.c
*_p.c
*_i.h
*.ilk
*.meta
*.obj
*.pch
*.pdb
*.pgc
*.pgd
*.rsp
*.sbr
*.tlb
*.tli
*.tlh
*.tmp
*.tmp_proj
*.log
*.vspscc
*.vssscc
.builds
*.pidb
*.svclog
*.scc

# Chutzpah Test files
_Chutzpah*

# Visual C++ cache files
ipch/
*.aps
*.ncb
*.opendb
*.opensdf
*.sdf
*.cachefile
*.VC.db
*.VC.VC.opendb

# Visual Studio profiler
*.psess
*.vsp
*.vspx
*.sap

# TFS 2012 Local Workspace
$tf/

# Guidance Automation Toolkit
*.gpState

# ReSharper is a .NET coding add-in
_ReSharper*/
*.[Rr]e[Ss]harper
*.DotSettings.user

# JustCode is a .NET coding add-in
.JustCode

# TeamCity is a build add-in
_TeamCity*

# DotCover is a Code Coverage Tool
*.dotCover

# NCrunch
_NCrunch_*
.*crunch*.local.xml
nCrunchTemp_*

# MightyMoose
*.mm.*
AutoTest.Net/

# Web workbench (sass)
.sass-cache/

# Installshield output folder
[Ee]xpress/

# DocProject is a documentation generator add-in
DocProject/buildhelp/
DocProject/Help/*.HxT
DocProject/Help/*.HxC
DocProject/Help/*.hhc
DocProject/Help/*.hhk
DocProject/Help/*.hhp
DocProject/Help/Html2
DocProject/Help/html

# Click-Once directory
publish/

# Publish Web Output
*.[Pp]ublish.xml
*.azurePubxml
# TODO: Comment the next line if you want to checkin your web deploy settings
# but database connection strings (with potential passwords) will be unencrypted
#*.pubxml
*.publishproj

# Microsoft Azure Web App publish settings. Comment the next line if you want to
# checkin your Azure Web App publish settings, but sensitive information contained
# in these scripts will be unencrypted
PublishScripts/

# NuGet Packages
*.nupkg
# The packages folder can be ignored because of Package Restore
**/packages/*
# except build/, which is used as an MSBuild target.
!**/packages/build/
# Uncomment if necessary however generally it will be regenerated when needed
#!**/packages/repositories.config
# NuGet v3's project.json files produces more ignoreable files
*.nuget.props
*.nuget.targets

# Microsoft Azure Build Output
csx/
*.build.csdef

# Microsoft Azure Emulator
ecf/
rcf/

# Windows Store app package directories and files
AppPackages/
BundleArtifacts/
Package.StoreAssociation.xml
_pkginfo.txt

# Visual Studio cache files
# files ending in .cache can be ignored
*.[Cc]ache
# but keep track of directories ending in .cache
!*.[Cc]ache/

# Others
ClientBin/
~$*
*~
*.dbmdl
*.dbproj.schemaview
*.jfm
*.pfx
*.publishsettings
node_modules/
orleans.codegen.cs

# Since there are multiple workflows, uncomment next line to ignore bower_components
# (https://github.com/github/gitignore/pull/1529#issuecomment-104372622)
#bower_components/

# RIA/Silverlight projects
Generated_Code/

# Backup & report files from converting an old project file
# to a newer Visual Studio version.
# Backup files are not needed,
# because we have git ;-)
_UpgradeReport_Files/
Backup*/
UpgradeLog*.XML
UpgradeLog*.htm

# SQL Server files
*.mdf
*.ldf

# Business Intelligence projects
*.rdl.data
*.bim.layout
*.bim_*.settings

# Microsoft Fakes
FakesAssemblies/

# GhostDoc plugin setting file
*.GhostDoc.xml

# Node.js Tools for Visual Studio
.ntvs_analysis.dat

# Visual Studio 6 build log
*.plg

# Visual Studio 6 workspace options file
*.opt

# Visual Studio LightSwitch build output
**/*.HTMLClient/GeneratedArtifacts
**/*.DesktopClient/GeneratedArtifacts
**/*.DesktopClient/ModelManifest.xml
**/*.Server/GeneratedArtifacts
**/*.Server/ModelManifest.xml
_Pvt_Extensions

# Paket dependency manager
.paket/paket.exe
paket-files/

# FAKE - F# Make
.fake/

# JetBrains Rider
.idea/
*.sln.iml

# CodeRush
.cr/

# Python Tools for Visual Studio (PTVS)
__pycache__/
*.pyc

--------------------------------------------------------------------------------
/OutputWordLstToAniUseClass.py:
--------------------------------------------------------------------------------
import codecs  # UTF-8 file I/O
import os
import re
import tkinter as tk
import urllib.parse
import urllib.request
from tkinter import messagebox
from tkinter.filedialog import askdirectory, askopenfilename

import pyexcel_xlsx
from bs4 import BeautifulSoup


class Methods:
    """Small helper functions."""

    @staticmethod
    def is_chinese(uchar):
        """Return True if the character lies in the CJK Unified Ideographs block."""
        return '\u4e00' <= uchar <= '\u9fff'


class Translator:
    """Look up an English word and build a dictionary with its translation, examples, etc."""
    dictionary_type_tuple = ("bing", "iciba", "youdict")
    query_time_out = 30  # timeout (seconds) for requests to the online dictionaries

    def __init__(self, word, dictionary_type):
        """Initialize the result dictionary and query the selected online dictionary."""
        self.__result_dictionary = {"word": word, "translation": "", "example_english": "", "example_chinese": "",
                                    "root": "", "vocabulary_range": ""}
        # Fill in the result dictionary
        self.__load_result_dictionary(dictionary_type)

    def __load_result_dictionary(self, dictionary_type):
        if self.dictionary_type_tuple[0] == dictionary_type:
            self.__bing_dictionary()
        elif self.dictionary_type_tuple[1] == dictionary_type:
            self.__iciba_dictionary()
        elif self.dictionary_type_tuple[2] == dictionary_type:
            self.__youdict_dictionary()
        else:
            self.__result_dictionary["error"] = "No dictionary found"

    def get_result_dictionary(self):
        return self.__result_dictionary

    def __bing_dictionary(self):
        # Percent-encode the word so phrases with spaces form a valid URL
        url = "http://cn.bing.com/dict/search?q=" + urllib.parse.quote(self.__result_dictionary["word"])
        response = urllib.request.urlopen(url, timeout=Translator.query_time_out)
        html = response.read().decode("utf-8")
        soup = BeautifulSoup(html, 'lxml')
        # Headword
        self.__result_dictionary["word"] = soup.find('div', id="headword").text
        # Definitions
        express = ""
        expresses = soup.find_all('span', class_='def')
        for item in expresses:
            express = item.text + express
        self.__result_dictionary["translation"] = express
        # Example sentences (find() returns None when the page has no example)
        english_example = soup.find('div', class_='sen_en')
        self.__result_dictionary["example_english"] = english_example.get_text() if english_example else ""
        chinese_example = soup.find('div', class_='sen_cn')
        self.__result_dictionary["example_chinese"] = chinese_example.get_text() if chinese_example else ""

    def __iciba_dictionary(self):
        headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
                                 "Chrome/100.0.4896.127 Safari/537.36"}
        url = "http://www.iciba.com/word?w=" + urllib.parse.quote(self.__result_dictionary["word"])
        req = urllib.request.Request(url=url, headers=headers)
        response = urllib.request.urlopen(req, timeout=Translator.query_time_out)
        html = response.read().decode("utf-8")
        soup = BeautifulSoup(html, 'lxml')
        # Headword
        word = soup.find('h1', class_="Mean_word__hwr_g").text
        self.__result_dictionary["word"] = re.sub(r'\s', '', word)  # remove all whitespace from the string
        # Vocabulary lists the word belongs to
        try:
            self.__result_dictionary["vocabulary_range"] = soup.find("p", class_="Mean_tag__K_C8K").text
        except AttributeError:
            self.__result_dictionary["vocabulary_range"] = "Unknown"
        # Definitions
        express = ""
        expresses = soup.find_all('ul', class_='Mean_part__UI9M6')
        for item in expresses:
            express = item.text + express
        self.__result_dictionary["translation"] = re.sub(r'\s', '', express)
        # Word root (at most the first three paragraphs)
        root = ""
        try:
            for cnt, item in enumerate(soup.find(class_='Affix_affix__iiL_9').find_all("p")):
                root += item.text + " "
                if cnt == 2:
                    break
            self.__result_dictionary["root"] = root
        except AttributeError:
            self.__result_dictionary["root"] = "Unknown"
        # Example sentence:
        self.__result_dictionary["example_english"] = soup.find(class_="NormalSentence_sentence__Jr9aj").find(
            class_="NormalSentence_en__BKdCu").text
        self.__result_dictionary["example_chinese"] = soup.find(class_="NormalSentence_sentence__Jr9aj").find(
            class_="NormalSentence_cn__gyUtC").text

    def __youdict_dictionary(self):
        url = "http://www.youdict.com/w/" + urllib.parse.quote(self.__result_dictionary["word"])
        response = urllib.request.urlopen(url, timeout=Translator.query_time_out)
        html = response.read().decode("utf-8")
        soup = BeautifulSoup(html, 'lxml')
        # Headword
        try:
            word = soup.find('h3', id="yd-word").text
            self.__result_dictionary["word"] = re.sub(r'\s', '', word)  # remove all whitespace from the string
        except AttributeError:
            self.__result_dictionary["word"] = "No words found"
        # Vocabulary lists the word belongs to
        try:
            self.__result_dictionary["vocabulary_range"] = soup.find(style="margin-bottom:6px;").text.split("\n")[
                1]  # drop the line breaks around this sentence
        except AttributeError:
            self.__result_dictionary["vocabulary_range"] = "Unknown"
        # Definitions:
        try:
            expresses = soup.find(id='yd-word-meaning')
            express = ""
            for item in expresses:
                express = item.text + express
            self.__result_dictionary["translation"] = re.sub(r'\s', '', express)
        except AttributeError:
            self.__result_dictionary["translation"] = "Parsing translation failed"
        # Example sentence
        try:
            example = soup.find(id='yd-liju').text
            # Only the first example is needed; cut the trailing part that names the source dictionary.
            # str.index raises ValueError when the marker is missing, so that is caught below as well.
            delete_number = example.index("来自")
            example = example[4:delete_number]
            for char in example:
                if Methods.is_chinese(char):
                    # Split at the first Chinese character: English before it, Chinese from it onwards
                    example_list = example.split(char, 1)
                    self.__result_dictionary["example_english"] = example_list[0]
                    self.__result_dictionary["example_chinese"] = char + example_list[1]
                    break
        except (AttributeError, ValueError):
            self.__result_dictionary["example_english"] = "Parsing example failed"
        # Word root
        root = ""
        try:
            for item in soup.find(id='yd-ciyuan').contents[1:]:  # skip the first element
                root = root + item.text
            self.__result_dictionary["root"] = root
        except AttributeError:
            self.__result_dictionary["root"] = "Parsing root failed"


class WordListProcessor:
    """Parse an imported word book and return a list of words."""
    word_list_type_dict = {'youdao': dict(file_type=[("txt file", ".txt")], function=0),
                           'confused_words': dict(file_type=[("Excel file", ".xlsx")], function=1),
                           'eduic': dict(file_type=[("txt file", ".txt")], function=2)}

    def __init__(self, file, word_list_type):
        self.__result_words_list = []

        function = WordListProcessor.word_list_type_dict[word_list_type]["function"]
        if function == 0:
            self.__parse_youdao_words(file)
        elif function == 1:
            self.__parse_confused_words(file)
        elif function == 2:
            self.__parse_eudic_words(file)
        # To support another word book format, add one more elif here

    def __parse_confused_words(self, file):
        xls_data = pyexcel_xlsx.get_data(file)
        for sheet in xls_data.keys():
            excel_list = xls_data[sheet]
            for words in excel_list:
                # Join the cells of a row with "-"; str() guards against non-text cells
                self.__result_words_list.append("-".join(str(word) for word in words))
            break  # only the first sheet is used

    def __parse_youdao_words(self, file):
        with open(file, 'r', encoding="utf-8") as f1:  # open the file
            txt_string = f1.read()  # read the whole file into a string
        for item in txt_string.split("\n"):
            # Keep only entry lines: digits, then a comma, then a space
            if re.match(r'\d+, ', item):
                item = item.split(" ")[1]  # keep the word, drop the index and the phonetic transcription
                self.__result_words_list.append(item)  # add to the list

    def __parse_eudic_words(self, file):
        with open(file, 'r', encoding="utf-8") as f1:  # open the file
            txt_string = f1.read()  # read the whole file into a string
        for item in txt_string.split("\n"):
            if re.match(r'\d+@', item):
                item = item.split("@")[1]
                self.__result_words_list.append(item)  # add to the list

    def get_result_words_list(self):
        return self.__result_words_list


class Framework(tk.Tk):
    """Main window."""

    def __init__(self, *args, **kwargs):
        tk.Tk.__init__(self, *args, **kwargs)
        self.button_parse_list_box_words = None
        self.list_box_words_list = None
        self.button_input_word_book_confirm = None
        self.__selected_input_word_book_type = None
        self.export_path = None
        self.entry_word = None
        self.label_vocabulary = None
        self.text_show_all = None
        self.message = None
        self.__selected_dictionary_type = None
        self.__configuring_panel_size("normal")
        self.__place_widgets()
        self.protocol("WM_DELETE_WINDOW", self.__on_closing)

    def __configuring_panel_size(self, mode="normal"):
        if mode == "large":
            self.geometry("605x500")  # widen the window to show the list box
        elif mode == "normal":
            self.geometry("445x500")

    def __place_widgets(self):
        """Create and place all widgets."""

        def place_message():
            # Status message area:
            self.message = tk.Message(text="Waiting for input", width=100)
            self.message.grid(row=1, column=3, columnspan=2, rowspan=2, sticky="N")

        def place_word_input_part():
            # Word entry box + query button
            self.entry_word = tk.Entry()
            self.entry_word.grid(row=0, column=2, columnspan=2, padx=0)

            self.button_word_query = tk.Button(text="Query", command=self.__command_button_word_query)
            self.button_word_query.grid(row=0, column=4)

        def place_label_vocabulary():
            # Vocabulary lists (CET4, postgraduate entrance exam, etc.)
            self.label_vocabulary = tk.Label(text="Vocabulary range")
            self.label_vocabulary.grid(row=2, column=0, columnspan=4, sticky="W", padx=10)

        def place_text_show_all():
            # Shows the text that will be exported
            self.text_show_all = tk.Text(width=60, height=28)
            self.text_show_all.grid(row=5, column=0, columnspan=6, rowspan=6, padx=10)

        def place_button_export_part():
            # Button that writes the contents of the text widget in word list format
            self.button_export_confirm = tk.Button(text="Export", command=self.__command_button_output_confirm)
            self.button_export_confirm.grid(row=12, column=3, columnspan=2)
            # Export path string
            self.export_path = tk.StringVar(self)
            # Export path label
            self.label_select_export_path = tk.Label(text="Select export path:")
            self.label_select_export_path.grid(row=12, column=0, padx=10)
            # Export path entry box
            self.entry_export_path = tk.Entry(self, textvariable=self.export_path)
            self.entry_export_path.grid(row=12, column=1)
            # Button that opens the directory chooser
            self.button_select_export_path = tk.Button(text="Select", command=self.__command_button_select_output_path)
            self.button_select_export_path.grid(row=12, column=2, sticky="W")

        def place_select_dictionary_part():
            # Online dictionary selector
            self.label_source_choice = tk.Label(text="Choice source :")
            self.label_source_choice.grid(row=0, column=0, sticky="W", padx=10)
            self.__selected_dictionary_type = tk.StringVar(self)
            self.__selected_dictionary_type.set(Translator.dictionary_type_tuple[1])  # default value
            self.option_menu_select_online_dictionary = tk.OptionMenu(self, self.__selected_dictionary_type,
                                                                      *Translator.dictionary_type_tuple,
                                                                      command=self.__command_option_menu_changed)
            self.option_menu_select_online_dictionary.grid(row=0, column=1, sticky="W")

        def place_input_word_book_part():
            # Word book import controls
            self.__activate_input_word_feature = False  # whether the word book import panel is active
            self.label_choice_word_book = tk.Label(text="Choice word book :")
            self.label_choice_word_book.grid(row=1, column=0, sticky="W", padx=10)
            self.__selected_input_word_book_type = tk.StringVar(self)
            self.__selected_input_word_book_type.set("null")  # default value
            self.option_menu_select_input_word_book_type = tk.OptionMenu(self, self.__selected_input_word_book_type,
                                                                         *WordListProcessor.word_list_type_dict,
                                                                         command=self.__command_option_menu_changed)
            self.option_menu_select_input_word_book_type.grid(row=1, column=1, sticky="W")
            self.button_input_word_book_confirm = tk.Button(text="Confirm",
                                                            command=self.__command_button_input_word_book_confirm)
            self.button_input_word_book_confirm.grid(row=1, column=2, sticky="W")
            # Hidden widgets below; they are gridded only while a word book is being imported
            self.list_box_words_list = tk.Listbox(self, height=23)
            self.button_parse_list_box_words = tk.Button(text="Confirm",
                                                         command=self.__command_button_parse_list_box_words)

        place_message()
        place_word_input_part()
        place_label_vocabulary()
        place_text_show_all()
        place_button_export_part()
        place_select_dictionary_part()
        place_input_word_book_part()

    def __output_query_result_to_text_show_all(self, query_string):
        def output_single_word_result_to_text_show_all(word):
            # If the query fails, show the error and return
            try:
                translator = Translator(word, self.__selected_dictionary_type.get())
                word_dictionary = translator.get_result_dictionary()
                self.message.config(text="Success!")
            except Exception as e:
                self.message.config(text=str(e))
                return
            input_string = "{0}\n{1}<br/><br/>{2}\n<br/>{3}\n{4}\n".format(word_dictionary["word"],
                                                                         word_dictionary["translation"],
                                                                         word_dictionary["example_chinese"],
                                                                         word_dictionary["example_english"],
                                                                         word_dictionary["root"])
            self.text_show_all.insert("end", input_string)
            self.text_show_all.insert("end", "-----------------\n")
            self.label_vocabulary.config(text=word_dictionary["vocabulary_range"])

        def output_confused_words_list_result_to_text_show_all(word_list):
            word_dictionaries_list = []
            # If any query fails, show the error and return
            try:
                for word in word_list:
                    translator = Translator(word, self.__selected_dictionary_type.get())
                    word_dictionaries_list.append(translator.get_result_dictionary())
            except Exception as e:
                self.message.config(text=str(e))
                return
            separator = "<br/><----><br/>"
            lines = [
                # words:
                "<---->".join(d["word"] for d in word_dictionaries_list),
                # translation:
                separator.join(d["translation"] + "<br/>" + d["example_chinese"] for d in word_dictionaries_list),
                # example:
                separator.join(d["example_english"] for d in word_dictionaries_list),
                # root:
                separator.join(d["root"] for d in word_dictionaries_list),
            ]
            self.text_show_all.insert("end", "\n".join(lines) + "\n")
            self.text_show_all.insert("end", "-----------------\n")

        # Confusable words are entered as one string joined by "-"
        input_words_list = query_string.split("-")
        if len(input_words_list) == 1:
            output_single_word_result_to_text_show_all(input_words_list[0])
        else:
            output_confused_words_list_result_to_text_show_all(input_words_list)

    def __command_button_word_query(self):
        input_word = self.entry_word.get().strip()
        # Do nothing when the entry box is empty
        if not input_word:
            self.message.config(text="No input word detected")
            return
        # Do nothing when no online dictionary is selected
        if self.__selected_dictionary_type.get() == "null":
            self.message.config(text="The online dictionary type has not been selected")
            return
        self.entry_word.delete(0, "end")  # clear the entry box
        self.__output_query_result_to_text_show_all(input_word)

    # Called when the Export button is clicked
    def __command_button_output_confirm(self):
        # Check that an export path has been chosen
        if self.export_path.get() == "":
            self.message.config(text="No output path selected yet")
            return
        # Convert the displayed text: "\t" between fields, "\r\n" between cards
        input_str = self.text_show_all.get("1.0", "end")
        input_str = input_str.replace("\n", "\t")
        input_str = input_str.replace("\t-----------------\t", "\r\n")
        input_str = input_str[0:-1]  # drop the trailing \t
        # Append to the output file
        with codecs.open(os.path.join(self.export_path.get(), "Anki_words.txt"), "a+", "utf-8") as fo:
            fo.write(input_str)
        self.message.config(text="Output completed")
        self.text_show_all.delete("1.0", "end")  # clear the display area

    def __command_button_input_word_book_confirm(self):
        if self.__activate_input_word_feature is False:
            # Do nothing when no word book type is selected
            if self.__selected_input_word_book_type.get() == "null":
                self.message.config(text="The word book type has not been selected")
                return
            self.__activate_input_word_feature = True
            self.button_input_word_book_confirm.config(text="On")
            self.__configuring_panel_size("large")
            self.list_box_words_list.grid(row=5, column=6)  # show the list box
            self.button_parse_list_box_words.grid(row=12, column=6)  # show the confirm-import button
            # Open the file chooser only when the list is empty; otherwise keep the current list
            if self.list_box_words_list.size() == 0:
                try:
                    file = askopenfilename(
                        filetypes=WordListProcessor.word_list_type_dict
                        [self.__selected_input_word_book_type.get()]['file_type'])
                    word_list_processor = WordListProcessor(file, self.__selected_input_word_book_type.get())
                    query_list = word_list_processor.get_result_words_list()
                    for words in query_list:
                        self.list_box_words_list.insert('end', words)  # add to the list box
                except Exception as e:
                    self.message.config(text=str(e))
                    # Toggle the import panel off again
                    self.__command_button_input_word_book_confirm()
        else:
            self.__activate_input_word_feature = False
            self.__configuring_panel_size("normal")
            self.list_box_words_list.grid_forget()  # hide the list box
            self.button_parse_list_box_words.grid_forget()  # hide the confirm-import button
            self.button_input_word_book_confirm.config(text="Off")

    def __command_button_parse_list_box_words(self):
        """Called when the Confirm button below the list box is clicked."""
        if self.list_box_words_list.size() > 0:
            # Do nothing when no online dictionary is selected
            if self.__selected_dictionary_type.get() == "null":
                self.message.config(text="The online dictionary type has not been selected")
                return
            list_item = self.list_box_words_list.get(0)
            self.list_box_words_list.delete(0)  # remove the first entry
            self.__output_query_result_to_text_show_all(list_item)

    def __command_option_menu_changed(self, event):
        """Called when an option menu selection changes."""
        self.message.config(text="select " + event)

    def __command_button_select_output_path(self):
        """Called when the Select (export path) button is clicked."""
        self.export_path.set(askdirectory())

    def __on_closing(self):
        not_export_yet = len(self.text_show_all.get("1.0", "end")) > 1
        if not_export_yet:
            # Ask for confirmation before closing
            if messagebox.askokcancel("Quit", "There are still words left to export\nDo you Really want to quit?"):
                self.destroy()
        else:
            self.destroy()


if __name__ == "__main__":
    window = Framework()
    window.title("Anki assistant")
    window.mainloop()

--------------------------------------------------------------------------------
/OutputWordToAnkiUseClass:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangzilinn/Anki-assistant/718eec3b2a80ebe3fc960fde8b61d306239bde02/OutputWordToAnkiUseClass

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Anki assistant
Scrapes the online Bing Dictionary / iciba / youdict and generates word lists that Anki can import
### Version 5 interface:
#### Initial interface:
![image](https://github.com/wangzilinn/Anki-assisent/blob/master/example%20pictures/%E6%96%B0%E7%89%88%E7%95%8C%E9%9D%A2.png)
#### Querying a single word
![image](https://github.com/wangzilinn/Anki-assisent/blob/master/example%20pictures/%E6%9F%A5%E8%AF%A2%E5%8D%95%E4%B8%AA%E8%AF%8D.png)
#### Querying confusable words
![image](https://github.com/wangzilinn/Anki-assisent/blob/master/example%20pictures/%E6%9F%A5%E8%AF%A2%E6%B7%B7%E6%B7%86%E8%AF%8D.png)
#### Importing confusable words
![image](https://github.com/wangzilinn/Anki-assisent/blob/master/example%20pictures/%E8%A7%A3%E6%9E%90%E5%AF%BC%E5%85%A5%E7%9A%84%E6%B7%B7%E6%B7%86%E8%AF%8D.png)
#### Parsing the imported confusable words
![image](https://github.com/wangzilinn/Anki-assisent/blob/master/example%20pictures/%E5%AF%BC%E5%85%A5%E6%B7%B7%E6%B7%86%E8%AF%8Dexcel.png)
### Version 5 features:
1. Import a Youdao word book (exported as txt; must be converted to UTF-8)
2. Import a Eudic word book (exported as txt)
3. Import an Excel file of confusable words (a sample file, test.xlsx, is in this repository)
4. Look up typed or imported words in the online dictionaries bing\iciba\youdict
5. Export an Anki word list

### Notes from earlier versions:
#### I have been memorizing vocabulary every day lately and discovered Anki, a flashcard tool that schedules reviews along the forgetting curve. It is a fantastic tool, and after a few days my memorization efficiency had improved considerably.
#### But adding words to the word list took a lot of time every day. The typical process was:
1. Find the word to memorize
2. Search Baidu/Google/Bing for definitions, example sentences, and so on
3. Paste what was found into the word list

To reinforce the memory, I typed many of these steps by hand instead of copying (which later turned out not to help at all)

#### To save time, I picked up Python again after a long break and wrote a script with a graphical interface. The interface is simple and consists of:
1. A word query box
2. A confirm button
3. A query result area
4. A button that generates the word list

### How to use:
1. Enter the word to add, or import a word book (note: the supported Youdao word book is the txt file exported by the Youdao word software, and it must be converted to UTF-8 manually)
2. Click Query for a typed word, or Confirm for an imported word book
3. After all words have been added, click Export
4. In the exported Anki word list, fields within a card are separated by \t and cards are separated by \r\n

### Program flow:
1. Get a word or a word book from the user
2. Scrape the word's definitions and example sentences from an online dictionary site
3. Format the scraped data
4. Write the result to a file

### Notes for modifying the source:
1. The source is included. It needs third-party libraries that do not ship with Python: bs4 and pyexcel_xlsx, plus lxml, which the code passes to BeautifulSoup as the parser (urllib and codecs are part of the standard library)
2. The default output file name is Anki_words.txt
3. Baidu and Google Translate are not used because both sites apply countermeasures (obfuscation and ajax); Bing was the easy option
4. Anki needs UTF-8 input, so the output file has to be written with an explicit UTF-8 encoding; the code uses `codecs.open` for this

### Journal
From the first need to the second version, I spent about 5 hours over two days, coding while googling to fill in the gaps. Picking a language up again after a year or so is a strange feeling; after getting used to C/C++, writing Python feels like walking with a baby walker.

The code was all written either half awake after getting up or at night when I could not sleep, so the naming, spelling, and program structure are bound to be unreasonable in places. Following the principle of "good enough, extend as needed" (this is software for my own use; work for a boss is another matter), the repository will stay as it is for now. Unless the sites go down, it will not be updated again.

That promise did not last long: real-world use revealed crashes in many situations, which have now been fixed.

Added a label showing which exam vocabulary range a word belongs to.

September 6, 2018: Refactored the whole program in an object-oriented style and added quite a few new features. I am now fairly familiar with Python's common features.

April 28, 2022: It has been four years since I last maintained this software. I was an undergraduate then, and now I am about to finish my master's degree. I recently wanted to study English again, so I picked this little program back up. So far I have fixed the lookup for one of the dictionaries; I will tidy up the README later.

[![996.icu](https://img.shields.io/badge/link-996.icu-red.svg)](https://996.icu)

--------------------------------------------------------------------------------
/example pictures/初始界面.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangzilinn/Anki-assistant/718eec3b2a80ebe3fc960fde8b61d306239bde02/example pictures/初始界面.png

--------------------------------------------------------------------------------
/example pictures/导入混淆词excel.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangzilinn/Anki-assistant/718eec3b2a80ebe3fc960fde8b61d306239bde02/example pictures/导入混淆词excel.png

--------------------------------------------------------------------------------
/example pictures/新版界面.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangzilinn/Anki-assistant/718eec3b2a80ebe3fc960fde8b61d306239bde02/example pictures/新版界面.png

--------------------------------------------------------------------------------
/example pictures/查询单个词.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangzilinn/Anki-assistant/718eec3b2a80ebe3fc960fde8b61d306239bde02/example pictures/查询单个词.png

--------------------------------------------------------------------------------
/example pictures/查询混淆词.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangzilinn/Anki-assistant/718eec3b2a80ebe3fc960fde8b61d306239bde02/example pictures/查询混淆词.png

--------------------------------------------------------------------------------
/example pictures/解析导入的混淆词.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangzilinn/Anki-assistant/718eec3b2a80ebe3fc960fde8b61d306239bde02/example pictures/解析导入的混淆词.png

--------------------------------------------------------------------------------
/test.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangzilinn/Anki-assistant/718eec3b2a80ebe3fc960fde8b61d306239bde02/test.xlsx

--------------------------------------------------------------------------------