├── LDA-demo
    ├── dataset.txt
    ├── dataset_cn.txt
    ├── main.py
    └── stopwords.dic
├── LSA-demo
    └── lsa.py
├── PLSA-demo
    ├── dataset.txt
    ├── main.py
    └── op_dataset.txt
├── README.md
└── images
    ├── lda1.png
    ├── lda2.png
    ├── lda3.png
    ├── lsa.png
    ├── plsa1.png
    └── plsa2.png

/LDA-demo/dataset.txt:
--------------------------------------------------------------------------------
1 | One Piece (Japanese: ワンピース Hepburn: Wan Pīsu?) is a Japanese manga series written and illustrated by Eiichiro Oda. It has been serialized in Shueisha's Weekly Shōnen Jump magazine since July 19, 1997, with the chapters collected into eighty-one tankōbon volumes to date. One Piece follows the adventures of Monkey D. Luffy, a funny young man whose body gained the properties of rubber after unintentionally eating a Devil Fruit. With his diverse crew of pirates, named the Straw Hat Pirates, Luffy explores the grand line in search of the world's ultimate treasure known as "One Piece" in order to become the next King of the Pirates.
2 | The manga has been adapted into an original video animation (OVA) produced by Production I.G in 1998, and an anime series produced by Toei Animation, which began broadcasting in Japan in 1999 and has aired 745 episodes to date. Additionally, Toei has developed thirteen animated feature films, two OVAs, and five television specials. Several companies have developed various types of merchandising such as a trading card game, and a large number of video games. The manga series was licensed for an English language release in North America & the United Kingdom by Viz Media and in Australia by Madman Entertainment. The anime series was licensed by 4Kids Entertainment for an English-language release in North America, before the license was dropped and subsequently acquired by Funimation in 2007.
3 | One Piece has received widespread critical acclaim, primarily for its art, characterization and humor.
Several volumes of the manga have broken publishing records, including highest initial print run of any book in Japan. The official website for Eiichiro Oda's One Piece manga announced that the manga has set a Guinness World Record for "the most copies published for the same comic book series by a single author". As of June 2015, the manga has 320 million copies printed worldwide,[1] making it the best-selling manga series in history 4 | The world of One Piece is populated by humans and numerous other races such as "fishmen" (a race of fish/human hybrid, similar to mermen and mermaids), Minkmen (a race of humanoid animals), and giants. It is covered by two vast oceans, which are divided by a massive mountain range called the Red Line (赤い土の大陸レッドライン Reddo Rain?), which is also the only continent in the world.[2] The Grand Line (偉大なる航路グランドライン Gurando Rain), a sea that runs perpendicular to the Red Line, further divides them into four seas: North Blue (北の海ノースブルー Nōsu Burū), East Blue (東の海イーストブルー Īsuto Burū), West Blue (西の海ウェストブルー Uesuto Burū) and South Blue (南の海サウスブルー Sausu Burū).[3] Surrounding the Grand Line are two regions called Calm Belts (凪の帯カームベルト kāmu beruto), similar to horse latitudes, which experience almost no wind and ocean currents and are breeding ground for huge sea creatures called neptunians (海王類 kaiōrui, lit. "sea kings"). Because of this, the calm belts are very effective barriers for those trying to enter the Grand Line.[4] However, navy ships, members of an intergovernmental organization known as the World Government, are able to use a sea-prism stone (海楼石 kairōseki) to mask their presence from the sea kings and can simply pass through the calm belts. 
All other ships are forced to take a more dangerous route, going through a mountain at the first intersection of the Grand Line and the Red Line, a canal system known as Reverse Mountain (リヴァース・マウンテン Rivāsu Maunten).[5] Sea water from each of the four seas runs up that mountain and merges at the top to flow down a fifth canal and into the first half of the Grand Line.[6] The second half of the Grand Line, beyond the second intersection with the Red Line, is also known as the New World (新世界 Shin Sekai).[7] 5 | The currents and weather on the Grand Line's open sea are extremely unpredictable, whereas in the vicinity of islands the climate is stable.[8] The magnetic fields within the Grand Line cause normal compasses to malfunction, making it even more difficult to navigate,[9] and instead a special compass called a Log Pose (記録指針ログポース Rogu Pōsu?) must be used.[10] The Log Pose works by locking on to one island's magnetic field and then locking on to another island's magnetic field.[11] The time for it to set depends on the island.[12] This process can be bypassed by obtaining an Eternal Pose (永久指針エターナルポース Etānaru Pōsu), a Log Pose variation that is permanently set to a specific island and never changes.[13] 6 | The world of One Piece includes anachronisms, such as the Transponder Snails (電伝虫 Den-Den Mushi?), snail-like animals that can be attached to electric equipment and function as rotary phones,[14] fax machines,[14] surveillance cameras,[15] and similar devices.[15] Dials貝 (daiaru), the shells of certain sky-dwelling animals, can be used to store kinetic energy, wind, sound, images, heat, and the like and have various applications.[16] 7 | A Devil Fruit (悪魔の実 Akuma no Mi) is a type of fruit that, which when eaten, confers a power on the consumer.[17] There are three categories of Devil Fruits;[18] Zoan (動物系ゾオン Zoon) fruits allow the user to fully and partially transform into a specific animal; as well as real-life animals. 
Some Zoan fruits allow the user to transform into mythical creatures.[19] Logia (自然系ロギア Rogia) fruits give control over and allow the user "to change their living body structure into the powers of nature".[18] Paramecia (超人系パラミシア Paramishia) is a category for fruits that give the user superhuman abilities.[20] They are said to be incarnations of the sea devil himself, and as a result, Devil Fruit users cannot swim in sea water, as "they are hated by the sea".[21] Sea-prism stone also has this effect. When even partially submerged in sea water, they lose all of their strength and coordination, although some abilities remain, such as Luffy still being able to stretch after being totally submerged. "Moving" water, such as rain or waves, does not have this effect. When a Devil Fruit user dies, the powers will be reincarnated into a new Devil Fruit. Devil Fruit powers can be in a stage called "Awakened", where the user can turn anything around them, besides their own bodies, into what their Devil Fruit power is.[22] 8 | Haki (覇気?, lit. "Ambition") is a latent ability every living being in the world of One Piece possesses, though it is quiescent in most. There are three varieties of Haki: Color of Observation (見聞色の覇気 Kenbunshoku no Haki) allows to sense the presence of other beings and predict their movement. Color of Armament (武装色の覇気 Busōshoku no Haki) allows one to envelop body parts and even inanimate forms with a force akin to an invisible armor that possesses defensive and offensive properties, also allowing one to inflict harm upon Devil Fruit users. The rare Color of the Conquering King (覇王色の覇気 Haōshoku no Haki) enables one to intimidate or render beings of weak will power unconscious.[23] Haki is shown that when it's used too much, it will stop working for certain periods of time. 9 | The series begins with the execution of Gol D. Roger, a man known as the King of the Pirates (海賊王 Kaizokuō?). 
Just before his death, Roger announces that his treasure, the One Piece (ひとつなぎの大秘宝ワンピース Wan Pīsu?), will be available to anyone who finds it, beginning the Great Pirate Era (大海賊時代 Dai Kaizoku Jidai?). As a result, countless pirates set out to the Grand Line to look for the treasure. 10 | Twenty-two years have passed since Roger's execution, and Monkey D. Luffy, a young man inspired by his childhood idol and powerful pirate "Red Haired" Shanks, sets off on a journey from the East Blue sea to find the One Piece and proclaim himself as the King of the Pirates. In an effort to organize his own crew, the Straw Hat Pirates (麦わら海賊団篇 Mugiwara Kaizoku-dan?), Luffy rescues and befriends a swordsman named Roronoa Zoro, and they head off in search of the One Piece. They are joined by Nami, a navigator and thief; Usopp, a sniper and a liar; and Vinsmoke Sanji, a womanizing chef; in their journey, they acquire a ship named the Going Merry and engage in confrontations with pirates including Buggy the Clown, Captain Kuro and Don Krieg. Later, Luffy encounters Arlong, a fishman and member of the former Sun Pirates who thinks that fishmen are superior to humans. After Luffy defeats Arlong, Nami officially joins Luffy's crew and the Navy places a bounty on Luffy's head. Luffy goes to Loguetown (the town of beginning and end; where Gol D. Roger was born and later executed) then meets Captain Smoker, a Navy captain who can turn into smoke. He briefly captures Luffy, but Luffy is saved by his father, Monkey D. Dragon, who is identified by Smoker simply as Dragon, the most wanted man in the world. 11 | After making their way into the Grand Line, the group meets Nefeltari Vivi, a princess who wants to help save her country, the Alabasta Kingdom, from the crime syndicate Baroque Works. They later befriend the doctor and anthropomorphized reindeer Tony Tony Chopper while in Drum Island and along the way fight Baroque Works agents. 
The Straw Hat Pirates arrive in Alabasta, leading to battles with Baroque Works and their leader, Sir Crocodile a member of the 7 warlords of the sea, who sought to usurp the Alabasta royal line and obtain one of the three ancient weapons, Pluton. Luffy finally defeats Crocodile and liberates Alabasta. Soon after, Nico Robin, an archaeologist formerly employed as Baroque Works vice-president, joins Luffy's crew. 12 | They soon encounter Blackbeard, the alias of Marshall D. Teach, who also aspires to become the King of the Pirates. After traveling to the flying island of Skypiea, the crew gets involved in a war between the Skypieans and the Shandorians, leading to a confrontation against the island's ruler, Eneru, who has the power of lightning. Luffy defeats Eneru to save Skypiea and end the war. The crew soon meets the navy admiral Aokiji, who reveals that Robin was involved in searching for Poneglyphs, which are stones with markings left by an ancient civilization to reveal the missing 100 years of history that the World Government had erased. The group goes to Water 7, where they battle the cyborg shipwright Franky and are informed that the Going Merry is damaged beyond repair, leading to Usopp's temporary departure from the Straw Hat Pirates. However, Cipher Pol No. 9, the World Government's intelligence agency, captures Robin and Franky for information regarding the Poneglyphs and the ancient weapons that may come from them. Franky liberates himself from the government after burning his blueprints for one of the weapons, Pluton, and teams up with the Straw Hat Pirates to declare war on the government, resulting in battles between CP9 and its members. The final long battle with CP9 ends when the crew saves Robin. To replace the sunken Going Merry, Franky constructs a new ship, the Thousand Sunny, for the Straw Hats and officially joins the crew. 
Soon after, the crew helps a musician skeleton named Brook, who was revived after eating a Devil Fruit, find his shadow aboard the gigantic pirate ship Thriller Bark, which has been stolen by Gecko Moriah. After defeating Moriah, Brook joins Luffy's crew. 13 | After arriving at Sabaody Archipelago, the crew prepares to sail to the New World, the second half of Grand Line. While there, they befriend Silvers Rayleigh, the vice captain of Roger's pirate crew who they ask to coat their ship so they can travel underwater. The crew eventually gets separated during a battle with Bartholomew Kuma, a huge cyborg under the control of the Navy, at the Sabaody Archipelago; Luffy is transported to the all-female island Amazon Lily. Having learned that his adoptive older brother and the son of Gol D. Roger, Portgas D. Ace, has been detained at the government prison Impel Down, Luffy traverses there and liberates several of its prisoners, including the fish man Jimbei and some former enemies. However, Luffy soon learns that Ace is at Marineford to be executed at Navy headquarters. As Luffy arrives, war breaks out between the Navy and a group of pirates led by the legendary pirate Edward Newgate, also known by the alias Whitebeard. In the ensuing chaos, Whitebeard and Ace are killed. At Rayleigh's request, Luffy has his crewmates undergo rigorous training regimens, some under the tutelage of prominent figures. 14 | Two years later, the crew regroups at Sabaody Archipelago and journeys to Fishman Island to enter the New World. During this time, a group of fishman pirates appear, seeking supremacy against humans, and hold a coup d'état to decide the fate of the island. However, after Nami forgives Jimbei for releasing Arlong into the East Blue, the Straw Hats defeat the fishmen pirates, saving the island. 
The Straw Hats leave Fishman Island and finally reach the New World, though not before starting a feud with Charlotte "Big Mom" Linlin, one of the "Four Emperors", the strongest pirates in the New World. Entering the half-burning and half-freezing island Punk Hazard, the crew encounters an old acquaintance and a recently appointed Warlord, Trafalgar Law, and they form an alliance to take down Kaido, another of the Four Emperors. The alliance is drawn into a fierce battle against Caesar Clown, the scientist responsible for the destruction of Punk Hazard four years earlier, who is in the service of Donquixote Doflamingo. 15 | After Caesar's defeat, the alliance travels to Dressrosa, a kingdom ruled by Doflamingo, in an attempt to destroy the Smile factory, which supplies artificial Devil Fruits to Kaido. In Dressrosa, Luffy competes in a competition to receive the Flame-Flame Fruit, formerly used by Ace. Sanji, Nami, Brook, and Chopper leave Dressrosa when the Big Mom Pirates arrive to attack the Thousand Sunny. Luffy's other adoptive brother, Sabo, arrives and assumes Luffy's place in the competition to earn the Flame-Flame Fruit as Luffy leaves to rescue his friends. Luffy and Trafalgar Law defeat Doflamingo in a battle for Dressrosa's fate and save the country; however, they are forced to flee Dressrosa from naval forces led by Admiral Fujitora to regroup with the others. After escaping and gaining new bounties, seven pirate crews form an alliance and swear their loyalty to Luffy, creating the Straw Hat Grand Fleet. 16 | Luffy, Zoro, Usopp, Robin, Franky, Law, Kin’emon, and Kanjuro head to the mysterious island Zou, which lies on top of a giant elephant named Zunisha, in search for the remaining Straw Hats, the Heart Pirates, Momonosuke, Caesar Clown, and finally the ninja Raizo who was traveling with the samurai. While on Zou, they meet the anthropomorphic animal race known as the Mink Tribe. 
The crew also discovers that Sanji has suddenly been swept up in a personal conflict, and that Zou has been under siege by Jack of the Beasts Pirates. However, Jack is utterly defeated by Zunisha and peace is restored to Zou. Luffy, accompanied by Nami, Chopper, Brook, and the minks Pekoms, Pedro, and Carrot, heads to Whole Cake Island, where Big Mom resides. They attempt to rescue Sanji from his arranged marriage with Charlotte Purin, one of Big Mom's daughters, which was set up to finalize a political alliance between the Charlotte Family and the Vinsmoke Family.
--------------------------------------------------------------------------------
/LDA-demo/main.py:
--------------------------------------------------------------------------------
import re
import time

import jieba
import numpy as np


# Preprocessing: tokenize, drop stop words, assign each word an integer id,
# and represent every document as a list of word ids.
def preprocessing():
    # Read the stop-word file into a set for constant-time membership tests.
    with open('stopwords.dic', 'r', encoding='utf-8') as file:
        stopwords = {line.strip() for line in file}

    # Read the dataset: one document per line.
    with open('dataset.txt', 'r', encoding='utf-8') as file:
        documents = [document.strip() for document in file]

    word2id = {}
    id2word = {}
    docs = []
    currentWordId = 0

    for document in documents:
        currentDocument = []
        # Tokenize.
        segList = jieba.cut(document)
        for word in segList:
            word = word.lower().strip()
            # Keep words longer than one character that contain no digits
            # and are not stop words.
            if len(word) > 1 and not re.search('[0-9]', word) and word not in stopwords:
                if word not in word2id:
                    word2id[word] = currentWordId
                    id2word[currentWordId] = word
                    currentWordId += 1
                currentDocument.append(word2id[word])
        docs.append(currentDocument)
    return docs, word2id, id2word


# Initialization: sample a topic for every word and update the counts of the
# sampled topic. Before any assignment the counts hold only the priors, so the
# first draw is uniform over topics; later draws follow the counts so far.
def randomInitialize():
    for d, doc in enumerate(docs):
        zCurrentDoc = []
        for w in doc:
            pz = np.divide(np.multiply(ndz[d, :], nzw[:, w]), nz)
            z = np.random.multinomial(1, pz / pz.sum()).argmax()
            zCurrentDoc.append(z)
            ndz[d, z] += 1
            nzw[z, w] += 1
            nz[z] += 1
        Z.append(zCurrentDoc)


# One sweep of Gibbs sampling.
def gibbsSampling():
    # Resample the topic of every word in every document.
    for d, doc in enumerate(docs):
        for index, w in enumerate(doc):
            z = Z[d][index]
            # Remove the word's current topic assignment from the counts.
            ndz[d, z] -= 1
            nzw[z, w] -= 1
            nz[z] -= 1
            # Recompute the (unnormalized) probability of each topic for this word.
            pz = np.divide(np.multiply(ndz[d, :], nzw[:, w]), nz)
            # Draw a new topic from the normalized distribution.
            z = np.random.multinomial(1, pz / pz.sum()).argmax()
            Z[d][index] = z
            # Add the newly sampled topic assignment to the counts.
            ndz[d, z] += 1
            nzw[z, w] += 1
            nz[z] += 1


# Perplexity of the corpus under the current smoothed count estimates.
def perplexity():
    nd = np.sum(ndz, 1)
    n = 0
    ll = 0.0
    for d, doc in enumerate(docs):
        for w in doc:
            ll = ll + np.log(((nzw[:, w] / nz) * (ndz[d, :] / nd[d])).sum())
            n = n + 1
    return np.exp(ll / (-n))


alpha = 5            # prior added to every document-topic count
beta = 0.1           # prior added to every topic-word count
iterationNum = 50    # number of Gibbs sweeps
Z = []               # Z[d][i] is the topic assigned to word i of document d
K = 10               # number of topics
docs, word2id, id2word = preprocessing()
N = len(docs)        # number of documents
M = len(word2id)     # vocabulary size
ndz = np.zeros([N, K]) + alpha    # document-topic counts plus alpha
nzw = np.zeros([K, M]) + beta     # topic-word counts plus beta
nz = np.zeros([K]) + M * beta     # per-topic word totals plus M * beta
randomInitialize()
for i in range(0, iterationNum):
    gibbsSampling()
    print(time.strftime('%X'), "Iteration: ", i, " Completed", " Perplexity: ", perplexity())

# Collect the highest-count words of each topic, most frequent first.
topicwords = []
maxTopicWordsNum = 10
for z in range(0, K):
    ids = nzw[z, :].argsort()[::-1][:maxTopicWordsNum]
    topicwords.append([id2word[j] for j in ids])

--------------------------------------------------------------------------------
/LDA-demo/stopwords.dic:
--------------------------------------------------------------------------------
1 | 'd 2 | 'll 3
| 'm 4 | 're 5 | 's 6 | 't 7 | 've 8 | ZT 9 | ZZ 10 | a 11 | a's 12 | able 13 | about 14 | above 15 | abst 16 | accordance 17 | according 18 | accordingly 19 | across 20 | act 21 | actually 22 | added 23 | adj 24 | adopted 25 | affected 26 | affecting 27 | affects 28 | after 29 | afterwards 30 | again 31 | against 32 | ah 33 | ain't 34 | all 35 | allow 36 | allows 37 | almost 38 | alone 39 | along 40 | already 41 | also 42 | although 43 | always 44 | am 45 | among 46 | amongst 47 | an 48 | and 49 | announce 50 | another 51 | any 52 | anybody 53 | anyhow 54 | anymore 55 | anyone 56 | anything 57 | anyway 58 | anyways 59 | anywhere 60 | apart 61 | apparently 62 | appear 63 | appreciate 64 | appropriate 65 | approximately 66 | are 67 | area 68 | areas 69 | aren 70 | aren't 71 | arent 72 | arise 73 | around 74 | as 75 | aside 76 | ask 77 | asked 78 | asking 79 | asks 80 | associated 81 | at 82 | auth 83 | available 84 | away 85 | awfully 86 | b 87 | back 88 | backed 89 | backing 90 | backs 91 | be 92 | became 93 | because 94 | become 95 | becomes 96 | becoming 97 | been 98 | before 99 | beforehand 100 | began 101 | begin 102 | beginning 103 | beginnings 104 | begins 105 | behind 106 | being 107 | beings 108 | believe 109 | below 110 | beside 111 | besides 112 | best 113 | better 114 | between 115 | beyond 116 | big 117 | biol 118 | both 119 | brief 120 | briefly 121 | but 122 | by 123 | c 124 | c'mon 125 | c's 126 | ca 127 | came 128 | can 129 | can't 130 | cannot 131 | cant 132 | case 133 | cases 134 | cause 135 | causes 136 | certain 137 | certainly 138 | changes 139 | clear 140 | clearly 141 | co 142 | com 143 | come 144 | comes 145 | concerning 146 | consequently 147 | consider 148 | considering 149 | contain 150 | containing 151 | contains 152 | corresponding 153 | could 154 | couldn't 155 | couldnt 156 | course 157 | currently 158 | d 159 | date 160 | definitely 161 | describe 162 | described 163 | despite 164 | did 165 | didn't 166 | differ 167 | different 168 | 
differently 169 | discuss 170 | do 171 | does 172 | doesn't 173 | doing 174 | don't 175 | done 176 | down 177 | downed 178 | downing 179 | downs 180 | downwards 181 | due 182 | during 183 | e 184 | each 185 | early 186 | ed 187 | edu 188 | effect 189 | eg 190 | eight 191 | eighty 192 | either 193 | else 194 | elsewhere 195 | end 196 | ended 197 | ending 198 | ends 199 | enough 200 | entirely 201 | especially 202 | et 203 | et-al 204 | etc 205 | even 206 | evenly 207 | ever 208 | every 209 | everybody 210 | everyone 211 | everything 212 | everywhere 213 | ex 214 | exactly 215 | example 216 | except 217 | f 218 | face 219 | faces 220 | fact 221 | facts 222 | far 223 | felt 224 | few 225 | ff 226 | fifth 227 | find 228 | finds 229 | first 230 | five 231 | fix 232 | followed 233 | following 234 | follows 235 | for 236 | former 237 | formerly 238 | forth 239 | found 240 | four 241 | from 242 | full 243 | fully 244 | further 245 | furthered 246 | furthering 247 | furthermore 248 | furthers 249 | g 250 | gave 251 | general 252 | generally 253 | get 254 | gets 255 | getting 256 | give 257 | given 258 | gives 259 | giving 260 | go 261 | goes 262 | going 263 | gone 264 | good 265 | goods 266 | got 267 | gotten 268 | great 269 | greater 270 | greatest 271 | greetings 272 | group 273 | grouped 274 | grouping 275 | groups 276 | h 277 | had 278 | hadn't 279 | happens 280 | hardly 281 | has 282 | hasn't 283 | have 284 | haven't 285 | having 286 | he 287 | he's 288 | hed 289 | hello 290 | help 291 | hence 292 | her 293 | here 294 | here's 295 | hereafter 296 | hereby 297 | herein 298 | heres 299 | hereupon 300 | hers 301 | herself 302 | hes 303 | hi 304 | hid 305 | high 306 | higher 307 | highest 308 | him 309 | himself 310 | his 311 | hither 312 | home 313 | hopefully 314 | how 315 | howbeit 316 | however 317 | hundred 318 | i 319 | i'd 320 | i'll 321 | i'm 322 | i've 323 | id 324 | ie 325 | if 326 | ignored 327 | im 328 | immediate 329 | immediately 330 | importance 331 | 
important 332 | in 333 | inasmuch 334 | inc 335 | include 336 | indeed 337 | index 338 | indicate 339 | indicated 340 | indicates 341 | information 342 | inner 343 | insofar 344 | instead 345 | interest 346 | interested 347 | interesting 348 | interests 349 | into 350 | invention 351 | inward 352 | is 353 | isn't 354 | it 355 | it'd 356 | it'll 357 | it's 358 | itd 359 | its 360 | itself 361 | j 362 | just 363 | k 364 | keep 365 | keeps 366 | kept 367 | keys 368 | kg 369 | kind 370 | km 371 | knew 372 | know 373 | known 374 | knows 375 | l 376 | large 377 | largely 378 | last 379 | lately 380 | later 381 | latest 382 | latter 383 | latterly 384 | least 385 | less 386 | lest 387 | let 388 | let's 389 | lets 390 | like 391 | liked 392 | likely 393 | line 394 | little 395 | long 396 | longer 397 | longest 398 | look 399 | looking 400 | looks 401 | ltd 402 | m 403 | made 404 | mainly 405 | make 406 | makes 407 | making 408 | man 409 | many 410 | may 411 | maybe 412 | me 413 | mean 414 | means 415 | meantime 416 | meanwhile 417 | member 418 | members 419 | men 420 | merely 421 | mg 422 | might 423 | million 424 | miss 425 | ml 426 | more 427 | moreover 428 | most 429 | mostly 430 | mr 431 | mrs 432 | much 433 | mug 434 | must 435 | my 436 | myself 437 | n 438 | n't 439 | na 440 | name 441 | namely 442 | nay 443 | nd 444 | near 445 | nearly 446 | necessarily 447 | necessary 448 | need 449 | needed 450 | needing 451 | needs 452 | neither 453 | never 454 | nevertheless 455 | new 456 | newer 457 | newest 458 | next 459 | nine 460 | ninety 461 | no 462 | nobody 463 | non 464 | none 465 | nonetheless 466 | noone 467 | nor 468 | normally 469 | nos 470 | not 471 | noted 472 | nothing 473 | novel 474 | now 475 | nowhere 476 | number 477 | numbers 478 | o 479 | obtain 480 | obtained 481 | obviously 482 | of 483 | off 484 | often 485 | oh 486 | ok 487 | okay 488 | old 489 | older 490 | oldest 491 | omitted 492 | on 493 | once 494 | one 495 | ones 496 | only 497 | onto 498 | open 
499 | opened 500 | opening 501 | opens 502 | or 503 | ord 504 | order 505 | ordered 506 | ordering 507 | orders 508 | other 509 | others 510 | otherwise 511 | ought 512 | our 513 | ours 514 | ourselves 515 | out 516 | outside 517 | over 518 | overall 519 | owing 520 | own 521 | p 522 | page 523 | pages 524 | part 525 | parted 526 | particular 527 | particularly 528 | parting 529 | parts 530 | past 531 | per 532 | perhaps 533 | place 534 | placed 535 | places 536 | please 537 | plus 538 | point 539 | pointed 540 | pointing 541 | points 542 | poorly 543 | possible 544 | possibly 545 | potentially 546 | pp 547 | predominantly 548 | present 549 | presented 550 | presenting 551 | presents 552 | presumably 553 | previously 554 | primarily 555 | probably 556 | problem 557 | problems 558 | promptly 559 | proud 560 | provides 561 | put 562 | puts 563 | q 564 | que 565 | quickly 566 | quite 567 | qv 568 | r 569 | ran 570 | rather 571 | rd 572 | re 573 | readily 574 | really 575 | reasonably 576 | recent 577 | recently 578 | ref 579 | refs 580 | regarding 581 | regardless 582 | regards 583 | related 584 | relatively 585 | research 586 | respectively 587 | resulted 588 | resulting 589 | results 590 | right 591 | room 592 | rooms 593 | run 594 | s 595 | said 596 | same 597 | saw 598 | say 599 | saying 600 | says 601 | sec 602 | second 603 | secondly 604 | seconds 605 | section 606 | see 607 | seeing 608 | seem 609 | seemed 610 | seeming 611 | seems 612 | seen 613 | sees 614 | self 615 | selves 616 | sensible 617 | sent 618 | serious 619 | seriously 620 | seven 621 | several 622 | shall 623 | she 624 | she'll 625 | shed 626 | shes 627 | should 628 | shouldn't 629 | show 630 | showed 631 | showing 632 | shown 633 | showns 634 | shows 635 | side 636 | sides 637 | significant 638 | significantly 639 | similar 640 | similarly 641 | since 642 | six 643 | slightly 644 | small 645 | smaller 646 | smallest 647 | so 648 | some 649 | somebody 650 | somehow 651 | someone 652 | somethan 653 
| something 654 | sometime 655 | sometimes 656 | somewhat 657 | somewhere 658 | soon 659 | sorry 660 | specifically 661 | specified 662 | specify 663 | specifying 664 | state 665 | states 666 | still 667 | stop 668 | strongly 669 | sub 670 | substantially 671 | successfully 672 | such 673 | sufficiently 674 | suggest 675 | sup 676 | sure 677 | t 678 | t's 679 | take 680 | taken 681 | taking 682 | tell 683 | tends 684 | th 685 | than 686 | thank 687 | thanks 688 | thanx 689 | that 690 | that'll 691 | that's 692 | that've 693 | thats 694 | the 695 | their 696 | theirs 697 | them 698 | themselves 699 | then 700 | thence 701 | there 702 | there'll 703 | there's 704 | there've 705 | thereafter 706 | thereby 707 | thered 708 | therefore 709 | therein 710 | thereof 711 | therere 712 | theres 713 | thereto 714 | thereupon 715 | these 716 | they 717 | they'd 718 | they'll 719 | they're 720 | they've 721 | theyd 722 | theyre 723 | thing 724 | things 725 | think 726 | thinks 727 | third 728 | this 729 | thorough 730 | thoroughly 731 | those 732 | thou 733 | though 734 | thoughh 735 | thought 736 | thoughts 737 | thousand 738 | three 739 | throug 740 | through 741 | throughout 742 | thru 743 | thus 744 | til 745 | tip 746 | to 747 | today 748 | together 749 | too 750 | took 751 | toward 752 | towards 753 | tried 754 | tries 755 | truly 756 | try 757 | trying 758 | ts 759 | turn 760 | turned 761 | turning 762 | turns 763 | twice 764 | two 765 | u 766 | un 767 | under 768 | unfortunately 769 | unless 770 | unlike 771 | unlikely 772 | until 773 | unto 774 | up 775 | upon 776 | ups 777 | us 778 | use 779 | used 780 | useful 781 | usefully 782 | usefulness 783 | uses 784 | using 785 | usually 786 | uucp 787 | v 788 | value 789 | various 790 | very 791 | via 792 | viz 793 | vol 794 | vols 795 | vs 796 | w 797 | want 798 | wanted 799 | wanting 800 | wants 801 | was 802 | wasn't 803 | way 804 | ways 805 | we 806 | we'd 807 | we'll 808 | we're 809 | we've 810 | wed 811 | welcome 812 | 
well 813 | wells 814 | went 815 | were 816 | weren't 817 | what 818 | what'll 819 | what's 820 | whatever 821 | whats 822 | when 823 | whence 824 | whenever 825 | where 826 | where's 827 | whereafter 828 | whereas 829 | whereby 830 | wherein 831 | wheres 832 | whereupon 833 | wherever 834 | whether 835 | which 836 | while 837 | whim 838 | whither 839 | who 840 | who'll 841 | who's 842 | whod 843 | whoever 844 | whole 845 | whom 846 | whomever 847 | whos 848 | whose 849 | why 850 | widely 851 | will 852 | willing 853 | wish 854 | with 855 | within 856 | without 857 | won't 858 | wonder 859 | words 860 | work 861 | worked 862 | working 863 | works 864 | world 865 | would 866 | wouldn't 867 | www 868 | x 869 | y 870 | year 871 | years 872 | yes 873 | yet 874 | you 875 | you'd 876 | you'll 877 | you're 878 | you've 879 | youd 880 | young 881 | younger 882 | youngest 883 | your 884 | youre 885 | yours 886 | yourself 887 | yourselves 888 | z 889 | zero 890 | zt 891 | zz 892 | あ 893 | い 894 | う 895 | え 896 | お 897 | か 898 | き 899 | く 900 | け 901 | こ 902 | さ 903 | し 904 | す 905 | せ 906 | そ 907 | た 908 | ち 909 | つ 910 | て 911 | と 912 | な 913 | に 914 | ぬ 915 | ね 916 | の 917 | は 918 | ひ 919 | ふ 920 | へ 921 | ほ 922 | ま 923 | み 924 | む 925 | め 926 | も 927 | や 928 | ゆ 929 | よ 930 | ら 931 | り 932 | る 933 | れ 934 | ろ 935 | わ 936 | を 937 | ん 938 | ア 939 | イ 940 | ウ 941 | エ 942 | オ 943 | カ 944 | キ 945 | ク 946 | ケ 947 | コ 948 | サ 949 | シ 950 | ス 951 | セ 952 | ソ 953 | タ 954 | チ 955 | ツ 956 | テ 957 | ト 958 | ナ 959 | ニ 960 | ヌ 961 | ネ 962 | ノ 963 | ハ 964 | ヒ 965 | フ 966 | ヘ 967 | ホ 968 | マ 969 | ミ 970 | ム 971 | メ 972 | モ 973 | ヤ 974 | ユ 975 | ヨ 976 | ラ 977 | リ 978 | ル 979 | レ 980 | ロ 981 | ワ 982 | ヲ 983 | ン 984 | が 985 | ぎ 986 | ぐ 987 | げ 988 | ご 989 | ざ 990 | じ 991 | ず 992 | ぜ 993 | ぞ 994 | だ 995 | ぢ 996 | づ 997 | で 998 | ど 999 | ば 1000 | び 1001 | ぶ 1002 | べ 1003 | ぼ 1004 | ガ 1005 | ギ 1006 | グ 1007 | ゲ 1008 | ゴ 1009 | ザ 1010 | ジ 1011 | ズ 1012 | ゼ 1013 | ゾ 1014 | ダ 1015 | ヂ 1016 | ヅ 
1017 | デ 1018 | ド 1019 | バ 1020 | ビ 1021 | ブ 1022 | ベ 1023 | ボ 1024 | ぱ 1025 | ぴ 1026 | ぷ 1027 | ぺ 1028 | ぽ 1029 | パ 1030 | ピ 1031 | プ 1032 | ペ 1033 | ポ 1034 | ぁ 1035 | ぃ 1036 | ぅ 1037 | ぇ 1038 | ぉ 1039 | ゃ 1040 | ゅ 1041 | ょ 1042 | っ 1043 | ゎ 1044 | ゐ 1045 | ゑ 1046 | ァ 1047 | ィ 1048 | ゥ 1049 | ェ 1050 | ォ 1051 | ャ 1052 | ュ 1053 | ョ 1054 | ッ 1055 | ヮ 1056 | ヰ 1057 | ヱ 1058 | ヵ 1059 | ヶ 1060 | ゛ 1061 | ゜ 1062 | 「 1063 | 」 1064 | ・ 1065 | 、 1066 | 。 1067 | ー 1068 | ` 1069 | ~ 1070 | ! 1071 | @ 1072 | # 1073 | $ 1074 | % 1075 | ^ 1076 | & 1077 | * 1078 | ( 1079 | ) 1080 | _ 1081 | + 1082 | - 1083 | = 1084 | [ 1085 | ] 1086 | { 1087 | } 1088 | \ 1089 | | 1090 | ; 1091 | : 1092 | ' 1093 | " 1094 | < 1095 | > 1096 | , 1097 | . 1098 | / 1099 | ? 1100 | ! 1101 | " 1102 | # 1103 | $ 1104 | % 1105 | & 1106 | ' 1107 | ( 1108 | ) 1109 | * 1110 | + 1111 | , 1112 | - 1113 | -- 1114 | . 1115 | .. 1116 | ... 1117 | ...... 1118 | ................... 1119 | ./ 1120 | .一 1121 | .数 1122 | .日 1123 | / 1124 | // 1125 | 0 1126 | 1 1127 | 2 1128 | 3 1129 | 4 1130 | 5 1131 | 6 1132 | 7 1133 | 8 1134 | 9 1135 | : 1136 | :// 1137 | :: 1138 | ; 1139 | < 1140 | = 1141 | > 1142 | >> 1143 | ? 1144 | @ 1145 | A 1146 | Lex 1147 | [ 1148 | \ 1149 | ] 1150 | ^ 1151 | _ 1152 | ` 1153 | exp 1154 | sub 1155 | sup 1156 | | 1157 | } 1158 | ~ 1159 | ~~~~ 1160 | · 1161 | × 1162 | ××× 1163 | Δ 1164 | Ψ 1165 | γ 1166 | μ 1167 | φ 1168 | φ. 1169 | В 1170 | — 1171 | —— 1172 | ——— 1173 | ‘ 1174 | ’ 1175 | ’‘ 1176 | “ 1177 | ” 1178 | ”, 1179 | … 1180 | …… 1181 | …………………………………………………③ 1182 | ′∈ 1183 | ′| 1184 | ℃ 1185 | Ⅲ 1186 | ↑ 1187 | → 1188 | ∈[ 1189 | ∪φ∈ 1190 | ≈ 1191 | ① 1192 | ② 1193 | ②c 1194 | ③ 1195 | ③] 1196 | ④ 1197 | ⑤ 1198 | ⑥ 1199 | ⑦ 1200 | ⑧ 1201 | ⑨ 1202 | ⑩ 1203 | ── 1204 | ■ 1205 | ▲ 1206 |   1207 | 、 1208 | 。 1209 | 〈 1210 | 〉 1211 | 《 1212 | 》 1213 | 》), 1214 | 」 1215 | 『 1216 | 』 1217 | 【 1218 | 】 1219 | 〔 1220 | 〕 1221 | 〕〔 1222 | ㈧ 1223 | 一 1224 | 一. 
1225 | 一一 1226 | 一下 1227 | 一个 1228 | 一些 1229 | 一何 1230 | 一切 1231 | 一则 1232 | 一则通过 1233 | 一天 1234 | 一定 1235 | 一方面 1236 | 一旦 1237 | 一时 1238 | 一来 1239 | 一样 1240 | 一次 1241 | 一片 1242 | 一番 1243 | 一直 1244 | 一致 1245 | 一般 1246 | 一起 1247 | 一转眼 1248 | 一边 1249 | 一面 1250 | 七 1251 | 万一 1252 | 三 1253 | 三天两头 1254 | 三番两次 1255 | 三番五次 1256 | 上 1257 | 上下 1258 | 上升 1259 | 上去 1260 | 上来 1261 | 上述 1262 | 上面 1263 | 下 1264 | 下列 1265 | 下去 1266 | 下来 1267 | 下面 1268 | 不 1269 | 不一 1270 | 不下 1271 | 不久 1272 | 不了 1273 | 不亦乐乎 1274 | 不仅 1275 | 不仅...而且 1276 | 不仅仅 1277 | 不仅仅是 1278 | 不会 1279 | 不但 1280 | 不但...而且 1281 | 不光 1282 | 不免 1283 | 不再 1284 | 不力 1285 | 不单 1286 | 不变 1287 | 不只 1288 | 不可 1289 | 不可开交 1290 | 不可抗拒 1291 | 不同 1292 | 不外 1293 | 不外乎 1294 | 不够 1295 | 不大 1296 | 不如 1297 | 不妨 1298 | 不定 1299 | 不对 1300 | 不少 1301 | 不尽 1302 | 不尽然 1303 | 不巧 1304 | 不已 1305 | 不常 1306 | 不得 1307 | 不得不 1308 | 不得了 1309 | 不得已 1310 | 不必 1311 | 不怎么 1312 | 不怕 1313 | 不惟 1314 | 不成 1315 | 不拘 1316 | 不择手段 1317 | 不敢 1318 | 不料 1319 | 不断 1320 | 不日 1321 | 不时 1322 | 不是 1323 | 不曾 1324 | 不止 1325 | 不止一次 1326 | 不比 1327 | 不消 1328 | 不满 1329 | 不然 1330 | 不然的话 1331 | 不特 1332 | 不独 1333 | 不由得 1334 | 不知不觉 1335 | 不管 1336 | 不管怎样 1337 | 不经意 1338 | 不胜 1339 | 不能 1340 | 不能不 1341 | 不至于 1342 | 不若 1343 | 不要 1344 | 不论 1345 | 不起 1346 | 不足 1347 | 不过 1348 | 不迭 1349 | 不问 1350 | 不限 1351 | 与 1352 | 与其 1353 | 与其说 1354 | 与否 1355 | 与此同时 1356 | 专门 1357 | 且 1358 | 且不说 1359 | 且说 1360 | 两者 1361 | 严格 1362 | 严重 1363 | 个 1364 | 个人 1365 | 个别 1366 | 中小 1367 | 中间 1368 | 丰富 1369 | 串行 1370 | 临 1371 | 临到 1372 | 为 1373 | 为主 1374 | 为了 1375 | 为什么 1376 | 为什麽 1377 | 为何 1378 | 为止 1379 | 为此 1380 | 为着 1381 | 主张 1382 | 主要 1383 | 举凡 1384 | 举行 1385 | 乃 1386 | 乃至 1387 | 乃至于 1388 | 么 1389 | 之 1390 | 之一 1391 | 之前 1392 | 之后 1393 | 之後 1394 | 之所以 1395 | 之类 1396 | 乌乎 1397 | 乎 1398 | 乒 1399 | 乘 1400 | 乘势 1401 | 乘机 1402 | 乘胜 1403 | 乘虚 1404 | 乘隙 1405 | 九 1406 | 也 1407 | 也好 1408 | 也就是说 1409 | 也是 1410 | 也罢 1411 | 了 1412 | 了解 1413 | 争取 1414 | 二 1415 | 二来 1416 | 二话不说 1417 | 二话没说 1418 | 于 1419 | 于是 1420 | 
于是乎 1421 | 云云 1422 | 云尔 1423 | 互 1424 | 互相 1425 | 五 1426 | 些 1427 | 交口 1428 | 亦 1429 | 产生 1430 | 亲口 1431 | 亲手 1432 | 亲眼 1433 | 亲自 1434 | 亲身 1435 | 人 1436 | 人人 1437 | 人们 1438 | 人家 1439 | 人民 1440 | 什么 1441 | 什么样 1442 | 什麽 1443 | 仅 1444 | 仅仅 1445 | 今 1446 | 今后 1447 | 今天 1448 | 今年 1449 | 今後 1450 | 介于 1451 | 仍 1452 | 仍旧 1453 | 仍然 1454 | 从 1455 | 从不 1456 | 从严 1457 | 从中 1458 | 从事 1459 | 从今以后 1460 | 从优 1461 | 从古到今 1462 | 从古至今 1463 | 从头 1464 | 从宽 1465 | 从小 1466 | 从新 1467 | 从无到有 1468 | 从早到晚 1469 | 从未 1470 | 从来 1471 | 从此 1472 | 从此以后 1473 | 从而 1474 | 从轻 1475 | 从速 1476 | 从重 1477 | 他 1478 | 他人 1479 | 他们 1480 | 他是 1481 | 他的 1482 | 代替 1483 | 以 1484 | 以上 1485 | 以下 1486 | 以为 1487 | 以便 1488 | 以免 1489 | 以前 1490 | 以及 1491 | 以后 1492 | 以外 1493 | 以後 1494 | 以故 1495 | 以期 1496 | 以来 1497 | 以至 1498 | 以至于 1499 | 以致 1500 | 们 1501 | 任 1502 | 任何 1503 | 任凭 1504 | 任务 1505 | 企图 1506 | 伙同 1507 | 会 1508 | 伟大 1509 | 传 1510 | 传说 1511 | 传闻 1512 | 似乎 1513 | 似的 1514 | 但 1515 | 但凡 1516 | 但愿 1517 | 但是 1518 | 何 1519 | 何乐而不为 1520 | 何以 1521 | 何况 1522 | 何处 1523 | 何妨 1524 | 何尝 1525 | 何必 1526 | 何时 1527 | 何止 1528 | 何苦 1529 | 何须 1530 | 余外 1531 | 作为 1532 | 你 1533 | 你们 1534 | 你是 1535 | 你的 1536 | 使 1537 | 使得 1538 | 使用 1539 | 例如 1540 | 依 1541 | 依据 1542 | 依照 1543 | 依靠 1544 | 便 1545 | 便于 1546 | 促进 1547 | 保持 1548 | 保管 1549 | 保险 1550 | 俺 1551 | 俺们 1552 | 倍加 1553 | 倍感 1554 | 倒不如 1555 | 倒不如说 1556 | 倒是 1557 | 倘 1558 | 倘使 1559 | 倘或 1560 | 倘然 1561 | 倘若 1562 | 借 1563 | 借以 1564 | 借此 1565 | 假使 1566 | 假如 1567 | 假若 1568 | 偏偏 1569 | 做到 1570 | 偶尔 1571 | 偶而 1572 | 傥然 1573 | 像 1574 | 儿 1575 | 允许 1576 | 元/吨 1577 | 充其极 1578 | 充其量 1579 | 充分 1580 | 先不先 1581 | 先后 1582 | 先後 1583 | 先生 1584 | 光 1585 | 光是 1586 | 全体 1587 | 全力 1588 | 全年 1589 | 全然 1590 | 全身心 1591 | 全部 1592 | 全都 1593 | 全面 1594 | 八 1595 | 八成 1596 | 公然 1597 | 六 1598 | 兮 1599 | 共 1600 | 共同 1601 | 共总 1602 | 关于 1603 | 其 1604 | 其一 1605 | 其中 1606 | 其二 1607 | 其他 1608 | 其余 1609 | 其后 1610 | 其它 1611 | 其实 1612 | 其次 1613 | 具体 1614 | 具体地说 1615 | 具体来说 1616 | 具体说来 1617 | 具有 1618 | 兼之 1619 | 内 1620 | 
再 1621 | 再其次 1622 | 再则 1623 | 再有 1624 | 再次 1625 | 再者 1626 | 再者说 1627 | 再说 1628 | 冒 1629 | 冲 1630 | 决不 1631 | 决定 1632 | 决非 1633 | 况且 1634 | 准备 1635 | 凑巧 1636 | 凝神 1637 | 几 1638 | 几乎 1639 | 几度 1640 | 几时 1641 | 几番 1642 | 几经 1643 | 凡 1644 | 凡是 1645 | 凭 1646 | 凭借 1647 | 出 1648 | 出于 1649 | 出去 1650 | 出来 1651 | 出现 1652 | 分别 1653 | 分头 1654 | 分期 1655 | 分期分批 1656 | 切 1657 | 切不可 1658 | 切切 1659 | 切勿 1660 | 切莫 1661 | 则 1662 | 则甚 1663 | 刚 1664 | 刚好 1665 | 刚巧 1666 | 刚才 1667 | 初 1668 | 别 1669 | 别人 1670 | 别处 1671 | 别是 1672 | 别的 1673 | 别管 1674 | 别说 1675 | 到 1676 | 到了儿 1677 | 到处 1678 | 到头 1679 | 到头来 1680 | 到底 1681 | 到目前为止 1682 | 前后 1683 | 前此 1684 | 前者 1685 | 前进 1686 | 前面 1687 | 加上 1688 | 加之 1689 | 加以 1690 | 加入 1691 | 加强 1692 | 动不动 1693 | 动辄 1694 | 勃然 1695 | 匆匆 1696 | 十分 1697 | 千 1698 | 千万 1699 | 千万千万 1700 | 半 1701 | 单 1702 | 单单 1703 | 单纯 1704 | 即 1705 | 即令 1706 | 即使 1707 | 即便 1708 | 即刻 1709 | 即如 1710 | 即将 1711 | 即或 1712 | 即是说 1713 | 即若 1714 | 却 1715 | 却不 1716 | 历 1717 | 原来 1718 | 去 1719 | 又 1720 | 又及 1721 | 及 1722 | 及其 1723 | 及时 1724 | 及至 1725 | 双方 1726 | 反之 1727 | 反之亦然 1728 | 反之则 1729 | 反倒 1730 | 反倒是 1731 | 反应 1732 | 反手 1733 | 反映 1734 | 反而 1735 | 反过来 1736 | 反过来说 1737 | 取得 1738 | 取道 1739 | 受到 1740 | 变成 1741 | 古来 1742 | 另 1743 | 另一个 1744 | 另一方面 1745 | 另外 1746 | 另悉 1747 | 另方面 1748 | 另行 1749 | 只 1750 | 只当 1751 | 只怕 1752 | 只是 1753 | 只有 1754 | 只消 1755 | 只要 1756 | 只限 1757 | 叫 1758 | 叫做 1759 | 召开 1760 | 叮咚 1761 | 叮当 1762 | 可 1763 | 可以 1764 | 可好 1765 | 可是 1766 | 可能 1767 | 可见 1768 | 各 1769 | 各个 1770 | 各人 1771 | 各位 1772 | 各地 1773 | 各式 1774 | 各种 1775 | 各级 1776 | 各自 1777 | 合理 1778 | 同 1779 | 同一 1780 | 同时 1781 | 同样 1782 | 后 1783 | 后来 1784 | 后者 1785 | 后面 1786 | 向 1787 | 向使 1788 | 向着 1789 | 吓 1790 | 吗 1791 | 否则 1792 | 吧 1793 | 吧哒 1794 | 吱 1795 | 呀 1796 | 呃 1797 | 呆呆地 1798 | 呐 1799 | 呕 1800 | 呗 1801 | 呜 1802 | 呜呼 1803 | 呢 1804 | 周围 1805 | 呵 1806 | 呵呵 1807 | 呸 1808 | 呼哧 1809 | 呼啦 1810 | 咋 1811 | 和 1812 | 咚 1813 | 咦 1814 | 咧 1815 | 咱 1816 | 咱们 1817 | 咳 1818 | 哇 1819 | 哈 1820 | 哈哈 1821 | 哉 1822 | 哎 1823 
| 哎呀 1824 | 哎哟 1825 | 哗 1826 | 哗啦 1827 | 哟 1828 | 哦 1829 | 哩 1830 | 哪 1831 | 哪个 1832 | 哪些 1833 | 哪儿 1834 | 哪天 1835 | 哪年 1836 | 哪怕 1837 | 哪样 1838 | 哪边 1839 | 哪里 1840 | 哼 1841 | 哼唷 1842 | 唉 1843 | 唯有 1844 | 啊 1845 | 啊呀 1846 | 啊哈 1847 | 啊哟 1848 | 啐 1849 | 啥 1850 | 啦 1851 | 啪达 1852 | 啷当 1853 | 喀 1854 | 喂 1855 | 喏 1856 | 喔唷 1857 | 喽 1858 | 嗡 1859 | 嗡嗡 1860 | 嗬 1861 | 嗯 1862 | 嗳 1863 | 嘎 1864 | 嘎嘎 1865 | 嘎登 1866 | 嘘 1867 | 嘛 1868 | 嘻 1869 | 嘿 1870 | 嘿嘿 1871 | 四 1872 | 因 1873 | 因为 1874 | 因了 1875 | 因此 1876 | 因着 1877 | 因而 1878 | 固 1879 | 固然 1880 | 在 1881 | 在下 1882 | 在于 1883 | 地 1884 | 均 1885 | 坚决 1886 | 坚持 1887 | 基于 1888 | 基本 1889 | 基本上 1890 | 处在 1891 | 处处 1892 | 处理 1893 | 复杂 1894 | 多 1895 | 多么 1896 | 多亏 1897 | 多多 1898 | 多多少少 1899 | 多多益善 1900 | 多少 1901 | 多年前 1902 | 多年来 1903 | 多数 1904 | 多次 1905 | 够瞧的 1906 | 大 1907 | 大不了 1908 | 大举 1909 | 大事 1910 | 大体 1911 | 大体上 1912 | 大凡 1913 | 大力 1914 | 大多 1915 | 大多数 1916 | 大大 1917 | 大家 1918 | 大张旗鼓 1919 | 大批 1920 | 大抵 1921 | 大概 1922 | 大略 1923 | 大约 1924 | 大致 1925 | 大都 1926 | 大量 1927 | 大面儿上 1928 | 失去 1929 | 奇 1930 | 奈 1931 | 奋勇 1932 | 她 1933 | 她们 1934 | 她是 1935 | 她的 1936 | 好 1937 | 好在 1938 | 好的 1939 | 好象 1940 | 如 1941 | 如上 1942 | 如上所述 1943 | 如下 1944 | 如今 1945 | 如何 1946 | 如其 1947 | 如前所述 1948 | 如同 1949 | 如常 1950 | 如是 1951 | 如期 1952 | 如果 1953 | 如次 1954 | 如此 1955 | 如此等等 1956 | 如若 1957 | 始而 1958 | 姑且 1959 | 存在 1960 | 存心 1961 | 孰料 1962 | 孰知 1963 | 宁 1964 | 宁可 1965 | 宁愿 1966 | 宁肯 1967 | 它 1968 | 它们 1969 | 它们的 1970 | 它是 1971 | 它的 1972 | 安全 1973 | 完全 1974 | 完成 1975 | 定 1976 | 实现 1977 | 实际 1978 | 宣布 1979 | 容易 1980 | 密切 1981 | 对 1982 | 对于 1983 | 对应 1984 | 对待 1985 | 对方 1986 | 对比 1987 | 将 1988 | 将才 1989 | 将要 1990 | 将近 1991 | 小 1992 | 少数 1993 | 尔 1994 | 尔后 1995 | 尔尔 1996 | 尔等 1997 | 尚且 1998 | 尤其 1999 | 就 2000 | 就地 2001 | 就是 2002 | 就是了 2003 | 就是说 2004 | 就此 2005 | 就算 2006 | 就要 2007 | 尽 2008 | 尽可能 2009 | 尽如人意 2010 | 尽心尽力 2011 | 尽心竭力 2012 | 尽快 2013 | 尽早 2014 | 尽然 2015 | 尽管 2016 | 尽管如此 2017 | 尽量 2018 | 局外 2019 | 居然 2020 | 届时 2021 | 属于 2022 | 屡 2023 | 屡屡 2024 | 
屡次 2025 | 屡次三番 2026 | 岂 2027 | 岂但 2028 | 岂止 2029 | 岂非 2030 | 川流不息 2031 | 左右 2032 | 巨大 2033 | 巩固 2034 | 差一点 2035 | 差不多 2036 | 己 2037 | 已 2038 | 已矣 2039 | 已经 2040 | 巴 2041 | 巴巴 2042 | 带 2043 | 帮助 2044 | 常 2045 | 常常 2046 | 常言说 2047 | 常言说得好 2048 | 常言道 2049 | 平素 2050 | 年复一年 2051 | 并 2052 | 并不 2053 | 并不是 2054 | 并且 2055 | 并排 2056 | 并无 2057 | 并没 2058 | 并没有 2059 | 并肩 2060 | 并非 2061 | 广大 2062 | 广泛 2063 | 应当 2064 | 应用 2065 | 应该 2066 | 庶乎 2067 | 庶几 2068 | 开外 2069 | 开始 2070 | 开展 2071 | 引起 2072 | 弗 2073 | 弹指之间 2074 | 强烈 2075 | 强调 2076 | 归 2077 | 归根到底 2078 | 归根结底 2079 | 归齐 2080 | 当 2081 | 当下 2082 | 当中 2083 | 当儿 2084 | 当前 2085 | 当即 2086 | 当口儿 2087 | 当地 2088 | 当场 2089 | 当头 2090 | 当庭 2091 | 当时 2092 | 当然 2093 | 当真 2094 | 当着 2095 | 形成 2096 | 彻夜 2097 | 彻底 2098 | 彼 2099 | 彼时 2100 | 彼此 2101 | 往 2102 | 往往 2103 | 待 2104 | 待到 2105 | 很 2106 | 很多 2107 | 很少 2108 | 後来 2109 | 後面 2110 | 得 2111 | 得了 2112 | 得出 2113 | 得到 2114 | 得天独厚 2115 | 得起 2116 | 心里 2117 | 必 2118 | 必定 2119 | 必将 2120 | 必然 2121 | 必要 2122 | 必须 2123 | 快 2124 | 快要 2125 | 忽地 2126 | 忽然 2127 | 怎 2128 | 怎么 2129 | 怎么办 2130 | 怎么样 2131 | 怎奈 2132 | 怎样 2133 | 怎麽 2134 | 怕 2135 | 急匆匆 2136 | 怪 2137 | 怪不得 2138 | 总之 2139 | 总是 2140 | 总的来看 2141 | 总的来说 2142 | 总的说来 2143 | 总结 2144 | 总而言之 2145 | 恍然 2146 | 恐怕 2147 | 恰似 2148 | 恰好 2149 | 恰如 2150 | 恰巧 2151 | 恰恰 2152 | 恰恰相反 2153 | 恰逢 2154 | 您 2155 | 您们 2156 | 您是 2157 | 惟其 2158 | 惯常 2159 | 意思 2160 | 愤然 2161 | 愿意 2162 | 慢说 2163 | 成为 2164 | 成年 2165 | 成年累月 2166 | 成心 2167 | 我 2168 | 我们 2169 | 我是 2170 | 我的 2171 | 或 2172 | 或则 2173 | 或多或少 2174 | 或是 2175 | 或曰 2176 | 或者 2177 | 或许 2178 | 战斗 2179 | 截然 2180 | 截至 2181 | 所 2182 | 所以 2183 | 所在 2184 | 所幸 2185 | 所有 2186 | 所谓 2187 | 才 2188 | 才能 2189 | 扑通 2190 | 打 2191 | 打从 2192 | 打开天窗说亮话 2193 | 扩大 2194 | 把 2195 | 抑或 2196 | 抽冷子 2197 | 拦腰 2198 | 拿 2199 | 按 2200 | 按时 2201 | 按期 2202 | 按照 2203 | 按理 2204 | 按说 2205 | 挨个 2206 | 挨家挨户 2207 | 挨次 2208 | 挨着 2209 | 挨门挨户 2210 | 挨门逐户 2211 | 换句话说 2212 | 换言之 2213 | 据 2214 | 据实 2215 | 据悉 2216 | 据我所知 2217 | 据此 2218 | 据称 2219 | 据说 2220 | 掌握 2221 | 
接下来 2222 | 接着 2223 | 接著 2224 | 接连不断 2225 | 放量 2226 | 故 2227 | 故意 2228 | 故此 2229 | 故而 2230 | 敞开儿 2231 | 敢 2232 | 敢于 2233 | 敢情 2234 | 数/ 2235 | 整个 2236 | 断然 2237 | 方 2238 | 方便 2239 | 方才 2240 | 方能 2241 | 方面 2242 | 旁人 2243 | 无 2244 | 无宁 2245 | 无法 2246 | 无论 2247 | 既 2248 | 既...又 2249 | 既往 2250 | 既是 2251 | 既然 2252 | 日复一日 2253 | 日渐 2254 | 日益 2255 | 日臻 2256 | 日见 2257 | 时候 2258 | 昂然 2259 | 明显 2260 | 明确 2261 | 是 2262 | 是不是 2263 | 是以 2264 | 是否 2265 | 是的 2266 | 显然 2267 | 显著 2268 | 普通 2269 | 普遍 2270 | 暗中 2271 | 暗地里 2272 | 暗自 2273 | 更 2274 | 更为 2275 | 更加 2276 | 更进一步 2277 | 曾 2278 | 曾经 2279 | 替 2280 | 替代 2281 | 最 2282 | 最后 2283 | 最大 2284 | 最好 2285 | 最後 2286 | 最近 2287 | 最高 2288 | 有 2289 | 有些 2290 | 有关 2291 | 有利 2292 | 有力 2293 | 有及 2294 | 有所 2295 | 有效 2296 | 有时 2297 | 有点 2298 | 有的 2299 | 有的是 2300 | 有着 2301 | 有著 2302 | 望 2303 | 朝 2304 | 朝着 2305 | 末##末 2306 | 本 2307 | 本人 2308 | 本地 2309 | 本着 2310 | 本身 2311 | 权时 2312 | 来 2313 | 来不及 2314 | 来得及 2315 | 来看 2316 | 来着 2317 | 来自 2318 | 来讲 2319 | 来说 2320 | 极 2321 | 极为 2322 | 极了 2323 | 极其 2324 | 极力 2325 | 极大 2326 | 极度 2327 | 极端 2328 | 构成 2329 | 果然 2330 | 果真 2331 | 某 2332 | 某个 2333 | 某些 2334 | 某某 2335 | 根据 2336 | 根本 2337 | 格外 2338 | 梆 2339 | 概 2340 | 次第 2341 | 欢迎 2342 | 欤 2343 | 正值 2344 | 正在 2345 | 正如 2346 | 正巧 2347 | 正常 2348 | 正是 2349 | 此 2350 | 此中 2351 | 此后 2352 | 此地 2353 | 此处 2354 | 此外 2355 | 此时 2356 | 此次 2357 | 此间 2358 | 殆 2359 | 毋宁 2360 | 每 2361 | 每个 2362 | 每天 2363 | 每年 2364 | 每当 2365 | 每时每刻 2366 | 每每 2367 | 每逢 2368 | 比 2369 | 比及 2370 | 比如 2371 | 比如说 2372 | 比方 2373 | 比照 2374 | 比起 2375 | 比较 2376 | 毕竟 2377 | 毫不 2378 | 毫无 2379 | 毫无例外 2380 | 毫无保留地 2381 | 汝 2382 | 沙沙 2383 | 没 2384 | 没奈何 2385 | 没有 2386 | 沿 2387 | 沿着 2388 | 注意 2389 | 活 2390 | 深入 2391 | 清楚 2392 | 满 2393 | 满足 2394 | 漫说 2395 | 焉 2396 | 然 2397 | 然则 2398 | 然后 2399 | 然後 2400 | 然而 2401 | 照 2402 | 照着 2403 | 牢牢 2404 | 特别是 2405 | 特殊 2406 | 特点 2407 | 犹且 2408 | 犹自 2409 | 独 2410 | 独自 2411 | 猛然 2412 | 猛然间 2413 | 率尔 2414 | 率然 2415 | 现代 2416 | 现在 2417 | 理应 2418 | 理当 2419 | 理该 2420 | 瑟瑟 2421 | 甚且 
2422 | 甚么 2423 | 甚或 2424 | 甚而 2425 | 甚至 2426 | 甚至于 2427 | 用 2428 | 用来 2429 | 甫 2430 | 甭 2431 | 由 2432 | 由于 2433 | 由是 2434 | 由此 2435 | 由此可见 2436 | 略 2437 | 略为 2438 | 略加 2439 | 略微 2440 | 白 2441 | 白白 2442 | 的 2443 | 的确 2444 | 的话 2445 | 皆可 2446 | 目前 2447 | 直到 2448 | 直接 2449 | 相似 2450 | 相信 2451 | 相反 2452 | 相同 2453 | 相对 2454 | 相对而言 2455 | 相应 2456 | 相当 2457 | 相等 2458 | 省得 2459 | 看 2460 | 看上去 2461 | 看出 2462 | 看到 2463 | 看来 2464 | 看样子 2465 | 看看 2466 | 看见 2467 | 看起来 2468 | 真是 2469 | 真正 2470 | 眨眼 2471 | 着 2472 | 着呢 2473 | 矣 2474 | 矣乎 2475 | 矣哉 2476 | 知道 2477 | 砰 2478 | 确定 2479 | 碰巧 2480 | 社会主义 2481 | 离 2482 | 种 2483 | 积极 2484 | 移动 2485 | 究竟 2486 | 穷年累月 2487 | 突出 2488 | 突然 2489 | 窃 2490 | 立 2491 | 立刻 2492 | 立即 2493 | 立地 2494 | 立时 2495 | 立马 2496 | 竟 2497 | 竟然 2498 | 竟而 2499 | 第 2500 | 第二 2501 | 等 2502 | 等到 2503 | 等等 2504 | 策略地 2505 | 简直 2506 | 简而言之 2507 | 简言之 2508 | 管 2509 | 类如 2510 | 粗 2511 | 精光 2512 | 紧接着 2513 | 累年 2514 | 累次 2515 | 纯 2516 | 纯粹 2517 | 纵 2518 | 纵令 2519 | 纵使 2520 | 纵然 2521 | 练习 2522 | 组成 2523 | 经 2524 | 经常 2525 | 经过 2526 | 结合 2527 | 结果 2528 | 给 2529 | 绝 2530 | 绝不 2531 | 绝对 2532 | 绝非 2533 | 绝顶 2534 | 继之 2535 | 继后 2536 | 继续 2537 | 继而 2538 | 维持 2539 | 综上所述 2540 | 缕缕 2541 | 罢了 2542 | 老 2543 | 老大 2544 | 老是 2545 | 老老实实 2546 | 考虑 2547 | 者 2548 | 而 2549 | 而且 2550 | 而况 2551 | 而又 2552 | 而后 2553 | 而外 2554 | 而已 2555 | 而是 2556 | 而言 2557 | 而论 2558 | 联系 2559 | 联袂 2560 | 背地里 2561 | 背靠背 2562 | 能 2563 | 能否 2564 | 能够 2565 | 腾 2566 | 自 2567 | 自个儿 2568 | 自从 2569 | 自各儿 2570 | 自后 2571 | 自家 2572 | 自己 2573 | 自打 2574 | 自身 2575 | 臭 2576 | 至 2577 | 至于 2578 | 至今 2579 | 至若 2580 | 致 2581 | 般的 2582 | 良好 2583 | 若 2584 | 若夫 2585 | 若是 2586 | 若果 2587 | 若非 2588 | 范围 2589 | 莫 2590 | 莫不 2591 | 莫不然 2592 | 莫如 2593 | 莫若 2594 | 莫非 2595 | 获得 2596 | 藉以 2597 | 虽 2598 | 虽则 2599 | 虽然 2600 | 虽说 2601 | 蛮 2602 | 行为 2603 | 行动 2604 | 表明 2605 | 表示 2606 | 被 2607 | 要 2608 | 要不 2609 | 要不是 2610 | 要不然 2611 | 要么 2612 | 要是 2613 | 要求 2614 | 见 2615 | 规定 2616 | 觉得 2617 | 譬喻 2618 | 譬如 2619 | 认为 2620 | 认真 2621 | 认识 2622 | 让 
2623 | 许多 2624 | 论 2625 | 论说 2626 | 设使 2627 | 设或 2628 | 设若 2629 | 诚如 2630 | 诚然 2631 | 话说 2632 | 该 2633 | 该当 2634 | 说明 2635 | 说来 2636 | 说说 2637 | 请勿 2638 | 诸 2639 | 诸位 2640 | 诸如 2641 | 谁 2642 | 谁人 2643 | 谁料 2644 | 谁知 2645 | 谨 2646 | 豁然 2647 | 贼死 2648 | 赖以 2649 | 赶 2650 | 赶快 2651 | 赶早不赶晚 2652 | 起 2653 | 起先 2654 | 起初 2655 | 起头 2656 | 起来 2657 | 起见 2658 | 起首 2659 | 趁 2660 | 趁便 2661 | 趁势 2662 | 趁早 2663 | 趁机 2664 | 趁热 2665 | 趁着 2666 | 越是 2667 | 距 2668 | 跟 2669 | 路经 2670 | 转动 2671 | 转变 2672 | 转贴 2673 | 轰然 2674 | 较 2675 | 较为 2676 | 较之 2677 | 较比 2678 | 边 2679 | 达到 2680 | 达旦 2681 | 迄 2682 | 迅速 2683 | 过 2684 | 过于 2685 | 过去 2686 | 过来 2687 | 运用 2688 | 近 2689 | 近几年来 2690 | 近年来 2691 | 近来 2692 | 还 2693 | 还是 2694 | 还有 2695 | 还要 2696 | 这 2697 | 这一来 2698 | 这个 2699 | 这么 2700 | 这么些 2701 | 这么样 2702 | 这么点儿 2703 | 这些 2704 | 这会儿 2705 | 这儿 2706 | 这就是说 2707 | 这时 2708 | 这样 2709 | 这次 2710 | 这点 2711 | 这种 2712 | 这般 2713 | 这边 2714 | 这里 2715 | 这麽 2716 | 进入 2717 | 进去 2718 | 进来 2719 | 进步 2720 | 进而 2721 | 进行 2722 | 连 2723 | 连同 2724 | 连声 2725 | 连日 2726 | 连日来 2727 | 连袂 2728 | 连连 2729 | 迟早 2730 | 迫于 2731 | 适应 2732 | 适当 2733 | 适用 2734 | 逐步 2735 | 逐渐 2736 | 通常 2737 | 通过 2738 | 造成 2739 | 逢 2740 | 遇到 2741 | 遭到 2742 | 遵循 2743 | 遵照 2744 | 避免 2745 | 那 2746 | 那个 2747 | 那么 2748 | 那么些 2749 | 那么样 2750 | 那些 2751 | 那会儿 2752 | 那儿 2753 | 那时 2754 | 那末 2755 | 那样 2756 | 那般 2757 | 那边 2758 | 那里 2759 | 那麽 2760 | 部分 2761 | 都 2762 | 鄙人 2763 | 采取 2764 | 里面 2765 | 重大 2766 | 重新 2767 | 重要 2768 | 鉴于 2769 | 针对 2770 | 长期以来 2771 | 长此下去 2772 | 长线 2773 | 长话短说 2774 | 问题 2775 | 间或 2776 | 防止 2777 | 阿 2778 | 附近 2779 | 陈年 2780 | 限制 2781 | 陡然 2782 | 除 2783 | 除了 2784 | 除却 2785 | 除去 2786 | 除外 2787 | 除开 2788 | 除此 2789 | 除此之外 2790 | 除此以外 2791 | 除此而外 2792 | 除非 2793 | 随 2794 | 随后 2795 | 随时 2796 | 随着 2797 | 随著 2798 | 隔夜 2799 | 隔日 2800 | 难得 2801 | 难怪 2802 | 难说 2803 | 难道 2804 | 难道说 2805 | 集中 2806 | 零 2807 | 需要 2808 | 非但 2809 | 非常 2810 | 非徒 2811 | 非得 2812 | 非特 2813 | 非独 2814 | 靠 2815 | 顶多 2816 | 顷 2817 | 顷刻 2818 | 顷刻之间 2819 | 顷刻间 2820 | 顺 2821 | 顺着 
2822 | 顿时 2823 | 颇 2824 | 风雨无阻 2825 | 饱 2826 | 首先 2827 | 马上 2828 | 高低 2829 | 高兴 2830 | 默然 2831 | 默默地 2832 | 齐 2833 | ︿ 2834 | ! 2835 | # 2836 | $ 2837 | % 2838 | & 2839 | ' 2840 | ( 2841 | ) 2842 | )÷(1- 2843 | )、 2844 | * 2845 | + 2846 | +ξ 2847 | ++ 2848 | , 2849 | ,也 2850 | - 2851 | -β 2852 | -- 2853 | -[*]- 2854 | . 2855 | / 2856 | 0 2857 | 0:2 2858 | 1 2859 | 1. 2860 | 12% 2861 | 2 2862 | 2.3% 2863 | 3 2864 | 4 2865 | 5 2866 | 5:0 2867 | 6 2868 | 7 2869 | 8 2870 | 9 2871 | : 2872 | ; 2873 | < 2874 | <± 2875 | <Δ 2876 | <λ 2877 | <φ 2878 | << 2879 | = 2880 | =″ 2881 | =☆ 2882 | =( 2883 | =- 2884 | =[ 2885 | ={ 2886 | > 2887 | >λ 2888 | ? 2889 | @ 2890 | A 2891 | LI 2892 | R.L. 2893 | ZXFITL 2894 | [ 2895 | [①①] 2896 | [①②] 2897 | [①③] 2898 | [①④] 2899 | [①⑤] 2900 | [①⑥] 2901 | [①⑦] 2902 | [①⑧] 2903 | [①⑨] 2904 | [①A] 2905 | [①B] 2906 | [①C] 2907 | [①D] 2908 | [①E] 2909 | [①] 2910 | [①a] 2911 | [①c] 2912 | [①d] 2913 | [①e] 2914 | [①f] 2915 | [①g] 2916 | [①h] 2917 | [①i] 2918 | [①o] 2919 | [② 2920 | [②①] 2921 | [②②] 2922 | [②③] 2923 | [②④ 2924 | [②⑤] 2925 | [②⑥] 2926 | [②⑦] 2927 | [②⑧] 2928 | [②⑩] 2929 | [②B] 2930 | [②G] 2931 | [②] 2932 | [②a] 2933 | [②b] 2934 | [②c] 2935 | [②d] 2936 | [②e] 2937 | [②f] 2938 | [②g] 2939 | [②h] 2940 | [②i] 2941 | [②j] 2942 | [③①] 2943 | [③⑩] 2944 | [③F] 2945 | [③] 2946 | [③a] 2947 | [③b] 2948 | [③c] 2949 | [③d] 2950 | [③e] 2951 | [③g] 2952 | [③h] 2953 | [④] 2954 | [④a] 2955 | [④b] 2956 | [④c] 2957 | [④d] 2958 | [④e] 2959 | [⑤] 2960 | [⑤]] 2961 | [⑤a] 2962 | [⑤b] 2963 | [⑤d] 2964 | [⑤e] 2965 | [⑤f] 2966 | [⑥] 2967 | [⑦] 2968 | [⑧] 2969 | [⑨] 2970 | [⑩] 2971 | [*] 2972 | [- 2973 | [] 2974 | ] 2975 | ]∧′=[ 2976 | ][ 2977 | _ 2978 | a] 2979 | b] 2980 | c] 2981 | e] 2982 | f] 2983 | ng昉 2984 | { 2985 | {- 2986 | | 2987 | } 2988 | }> 2989 | ~ 2990 | ~± 2991 | ~+ 2992 | ¥ -------------------------------------------------------------------------------- /LSA-demo/lsa.py: 
--------------------------------------------------------------------------------
import re

import numpy as np
from matplotlib import pyplot
from scipy import linalg

# Documents
documents = [
    "Roronoa Zoro, nicknamed \"Pirate Hunter\" Zoro, is a fictional character in the One Piece franchise created by Eiichiro Oda.",
    "In the story, Zoro is the first to join Monkey D. Luffy after he is saved from being executed at the Marine Base. ",
    "Zoro is an expert swordsman who uses three swords for his Three Sword Style, but is also capable of the one and two-sword styles. ",
    "Zoro seems to be more comfortable and powerful using three swords, but he also uses one sword or two swords against weaker enemies.",
    "In One Piece, Luffy sails from the East Blue to the Grand Line in search of the legendary treasure One Piece to succeed Gol D. Roger as the King of the Pirates. ",
    "Luffy is the captain of the Straw Hat Pirates and along his journey, he recruits new crew members with unique abilities and personalities. ",
    "Luffy often thinks with his stomach and gorges himself to comical levels. ",
    "However, Luffy is not as naive as many people believe him to be, showing more understanding in situations than people often expect. ",
    "Knowing the dangers ahead, Luffy is willing to risk his life to reach his goal to become the King of the Pirates, and protect his crew.",
    "Adopted and raised by Navy seaman turned tangerine farmer Bellemere, Nami and her older sister Nojiko, have to witness their mother being murdered by the infamous Arlong.",
    "Nami, still a child but already an accomplished cartographer who dreams of drawing a complete map of the world, joins the pirates, hoping to eventually buy freedom for her village. ",
    "Growing up as a pirate-hating pirate, drawing maps for Arlong and stealing treasure from other pirates, Nami becomes an excellent burglar, pickpocket and navigator with an exceptional ability to forecast weather.",
    "After Arlong betrays her, and he and his gang are defeated by the Straw Hat Pirates, Nami joins the latter in pursuit of her dream."
]
print(len(documents))
# Stop words
stopwords = ['a','an', 'after', 'also', 'and', 'as', 'be', 'being', 'but', 'by', 'd', 'for', 'from', 'he', 'her', 'his', 'in', 'is', 'more', 'of', 'often', 'the', 'to', 'who', 'with', 'people']
# Regular expression matching the punctuation to strip
punctuation_regex = '[,.;"]+'
# Map: key is a word, value is the list of IDs of the documents it occurs in
# (one entry per occurrence, so an ID repeats if the word repeats in a document)
dictionary = {}

# ID of the document currently being processed
currentDocId = 0

# Process each document in turn
for d in documents:
    words = d.split()
    for w in words:
        # Strip punctuation
        w = re.sub(punctuation_regex, '', w.lower())
        if w in stopwords:
            continue
        elif w in dictionary:
            dictionary[w].append(currentDocId)
        else:
            dictionary[w] = [currentDocId]
    currentDocId += 1

# Words that occur in at least two distinct documents are chosen as keywords
keywords = [k for k in dictionary.keys() if len(set(dictionary[k])) > 1]
keywords.sort()
print("keywords:\n", keywords, "\n")

# Build the word-document matrix (rows: keywords, columns: documents, values: occurrence counts)
X = np.zeros([len(keywords), currentDocId])
for i, k in enumerate(keywords):
    for d in dictionary[k]:
        X[i,d] += 1


# Singular value decomposition (the third return value is V transposed)
U,sigma,V = linalg.svd(X, full_matrices=True)

print("U:\n", U, "\n")
print("SIGMA:\n", sigma, "\n")
print("V:\n", V, "\n")

# Coordinates of words and documents after reducing to targetDimension dimensions
targetDimension = 2
U2 = U[0:, 0:targetDimension]
V2 = V[0:targetDimension, 0:]
sigma2 = np.diag(sigma[0:targetDimension])
print(U2.shape, sigma2.shape, V2.shape)

# Compare the original matrix with its low-rank reconstruction
X2 = np.dot(np.dot(U2, sigma2), V2)
print("X:\n", X)
print("X2:\n", X2)

# Start plotting
pyplot.title("LSA")
pyplot.xlabel('x')
pyplot.ylabel('y')

# Plot the points that represent words
# Each row of U2 holds the coordinates of one word (targetDimension values); the first two are plotted
for i in range(len(U2)):
    pyplot.text(U2[i][0], U2[i][1], keywords[i], fontsize=10)
    print("(", U2[i][0], ",", U2[i][1], ")", keywords[i])
x = U2.T[0]
y = U2.T[1]
pyplot.plot(x, y, '.')

# Plot the points that represent documents
# Each column of V2 holds the coordinates of one document (targetDimension values); the first two are plotted
for i in range(len(V2[0])):
    pyplot.text(V2[0][i], V2[1][i], ('D%d' %(i+1)), fontsize=10)
    print("(", V2[0][i], ",", V2[1][i], ")", ('D%d' %(i+1)))
x = V2[0]
y = V2[1]
pyplot.plot(x, y, 'x')


# Save the figure to the current working directory
pyplot.savefig("lsa.png", dpi=100)
-------------------------------------------------------------------------------- /PLSA-demo/dataset.txt: -------------------------------------------------------------------------------- 1 | One Piece (Japanese: ワンピース Hepburn: Wan Pīsu?) is a Japanese manga series written and illustrated by Eiichiro Oda. It has been serialized in Shueisha's Weekly Shōnen Jump magazine since July 19, 1997, with the chapters collected into eighty-one tankōbon volumes to date. One Piece follows the adventures of Monkey D. Luffy, a funny young man whose body gained the properties of rubber after unintentionally eating a Devil Fruit. With his diverse crew of pirates, named the Straw Hat Pirates, Luffy explores the grand line in search of the world's ultimate treasure known as "One Piece" in order to become the next King of the Pirates. 2 | The manga has been adapted into an original video animation (OVA) produced by Production I.G in 1998, and an anime series produced by Toei Animation, which began broadcasting in Japan in 1999 and has aired 745 episodes to date. Additionally, Toei has developed thirteen animated feature films, two OVAs, and five television specials. Several companies have developed various types of merchandising such as a trading card game, and a large number of video games. 
The manga series was licensed for an English language release in North America & the United Kingdom by Viz Media and in Australia by Madman Entertainment. The anime series was licensed by 4Kids Entertainment for an English-language release in North America, before the license was dropped and subsequently acquired by Funimation in 2007. 3 | One Piece has received widespread critical acclaim, primarily for its art, characterization and humor. Several volumes of the manga have broken publishing records, including highest initial print run of any book in Japan. The official website for Eiichiro Oda's One Piece manga announced that the manga has set a Guinness World Record for "the most copies published for the same comic book series by a single author". As of June 2015, the manga has 320 million copies printed worldwide,[1] making it the best-selling manga series in history 4 | The world of One Piece is populated by humans and numerous other races such as "fishmen" (a race of fish/human hybrid, similar to mermen and mermaids), Minkmen (a race of humanoid animals), and giants. It is covered by two vast oceans, which are divided by a massive mountain range called the Red Line (赤い土の大陸レッドライン Reddo Rain?), which is also the only continent in the world.[2] The Grand Line (偉大なる航路グランドライン Gurando Rain), a sea that runs perpendicular to the Red Line, further divides them into four seas: North Blue (北の海ノースブルー Nōsu Burū), East Blue (東の海イーストブルー Īsuto Burū), West Blue (西の海ウェストブルー Uesuto Burū) and South Blue (南の海サウスブルー Sausu Burū).[3] Surrounding the Grand Line are two regions called Calm Belts (凪の帯カームベルト kāmu beruto), similar to horse latitudes, which experience almost no wind and ocean currents and are breeding ground for huge sea creatures called neptunians (海王類 kaiōrui, lit. "sea kings"). 
Because of this, the calm belts are very effective barriers for those trying to enter the Grand Line.[4] However, navy ships, members of an intergovernmental organization known as the World Government, are able to use a sea-prism stone (海楼石 kairōseki) to mask their presence from the sea kings and can simply pass through the calm belts. All other ships are forced to take a more dangerous route, going through a mountain at the first intersection of the Grand Line and the Red Line, a canal system known as Reverse Mountain (リヴァース・マウンテン Rivāsu Maunten).[5] Sea water from each of the four seas runs up that mountain and merges at the top to flow down a fifth canal and into the first half of the Grand Line.[6] The second half of the Grand Line, beyond the second intersection with the Red Line, is also known as the New World (新世界 Shin Sekai).[7] 5 | The currents and weather on the Grand Line's open sea are extremely unpredictable, whereas in the vicinity of islands the climate is stable.[8] The magnetic fields within the Grand Line cause normal compasses to malfunction, making it even more difficult to navigate,[9] and instead a special compass called a Log Pose (記録指針ログポース Rogu Pōsu?) 
must be used.[10] The Log Pose works by locking on to one island's magnetic field and then locking on to another island's magnetic field.[11] The time for it to set depends on the island.[12] This process can be bypassed by obtaining an Eternal Pose (永久指針エターナルポース Etānaru Pōsu), a Log Pose variation that is permanently set to a specific island and never changes.[13] 6 | The world of One Piece includes anachronisms, such as the Transponder Snails (電伝虫 Den-Den Mushi?), snail-like animals that can be attached to electric equipment and function as rotary phones,[14] fax machines,[14] surveillance cameras,[15] and similar devices.[15] Dials貝 (daiaru), the shells of certain sky-dwelling animals, can be used to store kinetic energy, wind, sound, images, heat, and the like and have various applications.[16] 7 | A Devil Fruit (悪魔の実 Akuma no Mi) is a type of fruit that, which when eaten, confers a power on the consumer.[17] There are three categories of Devil Fruits;[18] Zoan (動物系ゾオン Zoon) fruits allow the user to fully and partially transform into a specific animal; as well as real-life animals. Some Zoan fruits allow the user to transform into mythical creatures.[19] Logia (自然系ロギア Rogia) fruits give control over and allow the user "to change their living body structure into the powers of nature".[18] Paramecia (超人系パラミシア Paramishia) is a category for fruits that give the user superhuman abilities.[20] They are said to be incarnations of the sea devil himself, and as a result, Devil Fruit users cannot swim in sea water, as "they are hated by the sea".[21] Sea-prism stone also has this effect. When even partially submerged in sea water, they lose all of their strength and coordination, although some abilities remain, such as Luffy still being able to stretch after being totally submerged. "Moving" water, such as rain or waves, does not have this effect. When a Devil Fruit user dies, the powers will be reincarnated into a new Devil Fruit. 
Devil Fruit powers can be in a stage called "Awakened", where the user can turn anything around them, besides their own bodies, into what their Devil Fruit power is.[22] 8 | Haki (覇気?, lit. "Ambition") is a latent ability every living being in the world of One Piece possesses, though it is quiescent in most. There are three varieties of Haki: Color of Observation (見聞色の覇気 Kenbunshoku no Haki) allows to sense the presence of other beings and predict their movement. Color of Armament (武装色の覇気 Busōshoku no Haki) allows one to envelop body parts and even inanimate forms with a force akin to an invisible armor that possesses defensive and offensive properties, also allowing one to inflict harm upon Devil Fruit users. The rare Color of the Conquering King (覇王色の覇気 Haōshoku no Haki) enables one to intimidate or render beings of weak will power unconscious.[23] Haki is shown that when it's used too much, it will stop working for certain periods of time. 9 | The series begins with the execution of Gol D. Roger, a man known as the King of the Pirates (海賊王 Kaizokuō?). Just before his death, Roger announces that his treasure, the One Piece (ひとつなぎの大秘宝ワンピース Wan Pīsu?), will be available to anyone who finds it, beginning the Great Pirate Era (大海賊時代 Dai Kaizoku Jidai?). As a result, countless pirates set out to the Grand Line to look for the treasure. 10 | Twenty-two years have passed since Roger's execution, and Monkey D. Luffy, a young man inspired by his childhood idol and powerful pirate "Red Haired" Shanks, sets off on a journey from the East Blue sea to find the One Piece and proclaim himself as the King of the Pirates. In an effort to organize his own crew, the Straw Hat Pirates (麦わら海賊団篇 Mugiwara Kaizoku-dan?), Luffy rescues and befriends a swordsman named Roronoa Zoro, and they head off in search of the One Piece. 
They are joined by Nami, a navigator and thief; Usopp, a sniper and a liar; and Vinsmoke Sanji, a womanizing chef; in their journey, they acquire a ship named the Going Merry and engage in confrontations with pirates including Buggy the Clown, Captain Kuro and Don Krieg. Later, Luffy encounters Arlong, a fishman and member of the former Sun Pirates who thinks that fishmen are superior to humans. After Luffy defeats Arlong, Nami officially joins Luffy's crew and the Navy places a bounty on Luffy's head. Luffy goes to Loguetown (the town of beginning and end; where Gol D. Roger was born and later executed) then meets Captain Smoker, a Navy captain who can turn into smoke. He briefly captures Luffy, but Luffy is saved by his father, Monkey D. Dragon, who is identified by Smoker simply as Dragon, the most wanted man in the world. 11 | After making their way into the Grand Line, the group meets Nefeltari Vivi, a princess who wants to help save her country, the Alabasta Kingdom, from the crime syndicate Baroque Works. They later befriend the doctor and anthropomorphized reindeer Tony Tony Chopper while in Drum Island and along the way fight Baroque Works agents. The Straw Hat Pirates arrive in Alabasta, leading to battles with Baroque Works and their leader, Sir Crocodile a member of the 7 warlords of the sea, who sought to usurp the Alabasta royal line and obtain one of the three ancient weapons, Pluton. Luffy finally defeats Crocodile and liberates Alabasta. Soon after, Nico Robin, an archaeologist formerly employed as Baroque Works vice-president, joins Luffy's crew. 12 | They soon encounter Blackbeard, the alias of Marshall D. Teach, who also aspires to become the King of the Pirates. After traveling to the flying island of Skypiea, the crew gets involved in a war between the Skypieans and the Shandorians, leading to a confrontation against the island's ruler, Eneru, who has the power of lightning. Luffy defeats Eneru to save Skypiea and end the war. 
The crew soon meets the navy admiral Aokiji, who reveals that Robin was involved in searching for Poneglyphs, which are stones with markings left by an ancient civilization to reveal the missing 100 years of history that the World Government had erased. The group goes to Water 7, where they battle the cyborg shipwright Franky and are informed that the Going Merry is damaged beyond repair, leading to Usopp's temporary departure from the Straw Hat Pirates. However, Cipher Pol No. 9, the World Government's intelligence agency, captures Robin and Franky for information regarding the Poneglyphs and the ancient weapons that may come from them. Franky liberates himself from the government after burning his blueprints for one of the weapons, Pluton, and teams up with the Straw Hat Pirates to declare war on the government, resulting in battles between CP9 and its members. The final long battle with CP9 ends when the crew saves Robin. To replace the sunken Going Merry, Franky constructs a new ship, the Thousand Sunny, for the Straw Hats and officially joins the crew. Soon after, the crew helps a musician skeleton named Brook, who was revived after eating a Devil Fruit, find his shadow aboard the gigantic pirate ship Thriller Bark, which has been stolen by Gecko Moriah. After defeating Moriah, Brook joins Luffy's crew. 13 | After arriving at Sabaody Archipelago, the crew prepares to sail to the New World, the second half of Grand Line. While there, they befriend Silvers Rayleigh, the vice captain of Roger's pirate crew who they ask to coat their ship so they can travel underwater. The crew eventually gets separated during a battle with Bartholomew Kuma, a huge cyborg under the control of the Navy, at the Sabaody Archipelago; Luffy is transported to the all-female island Amazon Lily. Having learned that his adoptive older brother and the son of Gol D. Roger, Portgas D. 
Ace, has been detained at the government prison Impel Down, Luffy traverses there and liberates several of its prisoners, including the fish man Jimbei and some former enemies. However, Luffy soon learns that Ace is at Marineford to be executed at Navy headquarters. As Luffy arrives, war breaks out between the Navy and a group of pirates led by the legendary pirate Edward Newgate, also known by the alias Whitebeard. In the ensuing chaos, Whitebeard and Ace are killed. At Rayleigh's request, Luffy has his crewmates undergo rigorous training regimens, some under the tutelage of prominent figures. 14 | Two years later, the crew regroups at Sabaody Archipelago and journeys to Fishman Island to enter the New World. During this time, a group of fishman pirates appear, seeking supremacy against humans, and hold a coup d'état to decide the fate of the island. However, after Nami forgives Jimbei for releasing Arlong into the East Blue, the Straw Hats defeat the fishmen pirates, saving the island. The Straw Hats leave Fishman Island and finally reach the New World, though not before starting a feud with Charlotte "Big Mom" Linlin, one of the "Four Emperors", the strongest pirates in the New World. Entering the half-burning and half-freezing island Punk Hazard, the crew encounters an old acquaintance and a recently appointed Warlord, Trafalgar Law, and they form an alliance to take down Kaido, another of the Four Emperors. The alliance is drawn into a fierce battle against Caesar Clown, the scientist responsible for the destruction of Punk Hazard four years earlier, who is in the service of Donquixote Doflamingo. 15 | After Caesar's defeat, the alliance travels to Dressrosa, a kingdom ruled by Doflamingo, in an attempt to destroy the Smile factory, which supplies artificial Devil Fruits to Kaido. In Dressrosa, Luffy competes in a competition to receive the Flame-Flame Fruit, formerly used by Ace. 
Sanji, Nami, Brook, and Chopper leave Dressrosa when the Big Mom Pirates arrive to attack the Thousand Sunny. Luffy's other adoptive brother, Sabo, arrives and assumes Luffy's place in the competition to earn the Flame-Flame Fruit as Luffy leaves to rescue his friends. Luffy and Trafalgar Law defeat Doflamingo in a battle for Dressrosa's fate and save the country; however, they are forced to flee Dressrosa from naval forces led by Admiral Fujitora to regroup with the others. After escaping and gaining new bounties, seven pirate crews form an alliance and swear their loyalty to Luffy, creating the Straw Hat Grand Fleet. 16 | Luffy, Zoro, Usopp, Robin, Franky, Law, Kin’emon, and Kanjuro head to the mysterious island Zou, which lies on top of a giant elephant named Zunisha, in search for the remaining Straw Hats, the Heart Pirates, Momonosuke, Caesar Clown, and finally the ninja Raizo who was traveling with the samurai. While on Zou, they meet the anthropomorphic animal race known as the Mink Tribe. The crew also discovers that Sanji has suddenly been swept up in a personal conflict, and that Zou has been under siege by Jack of the Beasts Pirates. However, Jack is utterly defeated by Zunisha and peace is restored to Zou. Luffy, accompanied by Nami, Chopper, Brook, and the minks Pekoms, Pedro, and Carrot, heads to Whole Cake Island, where Big Mom resides. They attempt to rescue Sanji from his arranged marriage with Charlotte Purin, one of Big Mom's daughters, which was set up to finalize a political alliance between the Charlotte Family and the Vinsmoke Family. 
--------------------------------------------------------------------------------
/PLSA-demo/main.py:
--------------------------------------------------------------------------------
import codecs
import re
import time

import numpy as np


# Read the dataset: one document per line
with codecs.open('dataset.txt', 'r', 'utf-8') as file:
    documents = [document.strip() for document in file]

# Total number of documents
N = len(documents)

# Number of topics
K = 10

#==============================================================================
# Count how often each word occurs in each document and in the whole corpus
#==============================================================================

# dict: key is a word, value is its total count across all documents
wordCount = {}

# list: one dict per document; key is a word, value is its count in that document
wordCountPerDocument = []

# Regular expression for the punctuation to strip.
# Note: inside the character class, `#-_` is a range from '#' to '_', so it
# also strips digits and other ASCII symbols such as ':', '/' and apostrophes.
punctuationRegex = '[,.;"?!#-_…()`|“”‘]+'

stopwords = ['a','an', 'after', 'also', 'they', 'man', 'zou', 'can', 'and', 'as', 'up', 'soon', 'be', 'being', 'but', 'by', 'd', 'for', 'from', 'he', 'her', 'his', 'in', 'is', 'more', 'of', 'often', 'the', 'to', 'who', 'with', 'people', 'or', 'it', 'that', 'its', 'are', 'has', 'was', 'on', 'at', 'have', 'into', 'no', 'which']

for d in documents:
    words = d.split()
    wordCountCurrentDoc = {}
    for w in words:
        # Lowercase, strip punctuation, then filter out stopwords and noise
        w = re.sub(punctuationRegex, '', w.lower())
        if len(w) <= 1 or re.search('http', w) or re.search('[0-9]', w) or w in stopwords:
            continue
        # Otherwise count the word, both globally and for this document
        if w in wordCount:
            wordCount[w] += 1
        else:
            wordCount[w] = 1
        if w in wordCountCurrentDoc:
            wordCountCurrentDoc[w] += 1
        else:
            wordCountCurrentDoc[w] = 1
    wordCountPerDocument.append(wordCountCurrentDoc)


#==============================================================================
# Build the vocabulary
#==============================================================================

# dict: key is a word, value is the word's index
dictionary = {}
# dict: key is a word's index, value is the word
dictionaryReverse = {}

index = 0
for word in wordCount.keys():
    # Keep only words that occur more than once in the corpus
    if wordCount[word] > 1:
        dictionary[word] = index
        dictionaryReverse[index] = word
        index += 1

# Vocabulary size
M = len(dictionary)

#==============================================================================
# Build the document-word matrix
#==============================================================================

# int32 avoids the overflow that int8 would hit for counts above 127
X = np.zeros([N, M], np.int32)

for word in dictionary.keys():
    j = dictionary[word]
    for i in range(0, N):
        if word in wordCountPerDocument[i]:
            X[i, j] = wordCountPerDocument[i][word]


#==============================================================================
# Initialize the parameters
#==============================================================================

# lamda[i, j] : p(zj|di)
lamda = np.random.random([N, K])
for i in range(0, N):
    normalization = lamda[i, :].sum()
    for j in range(0, K):
        lamda[i, j] /= normalization

# theta[i, j] : p(wj|zi)
theta = np.random.random([K, M])
for i in range(0, K):
    normalization = theta[i, :].sum()
    for j in range(0, M):
        theta[i, j] /= normalization

#==============================================================================
# Matrix holding the posterior probability of the latent variable
#==============================================================================

# p[i, j, k] : p(zk|di,wj)
p = np.zeros([N, M, K])

#==============================================================================
# E-Step
#==============================================================================
def EStep():
    for i in range(0, N):
        for j in range(0, M):
            denominator = 0
            for k in range(0, K):
                p[i, j, k] = theta[k, j] * lamda[i, k]
                denominator += p[i, j, k]
            if denominator == 0:
                for k in range(0, K):
                    p[i, j, k] = 0
            else:
                for k in range(0, K):
                    p[i, j, k] /= denominator


#==============================================================================
# M-Step
#==============================================================================
def MStep():
    # Update theta
    for k in range(0, K):
        denominator = 0
        for j in range(0, M):
            theta[k, j] = 0
            for i in range(0, N):
                theta[k, j] += X[i, j] * p[i, j, k]
            denominator += theta[k, j]
        if denominator == 0:
            for j in range(0, M):
                theta[k, j] = 1.0 / M
        else:
            for j in range(0, M):
                theta[k, j] /= denominator

    # Update lamda
    for i in range(0, N):
        for k in range(0, K):
            lamda[i, k] = 0
            denominator = 0
            for j in range(0, M):
                lamda[i, k] += X[i, j] * p[i, j, k]
                denominator += X[i, j]
            if denominator == 0:
                lamda[i, k] = 1.0 / K
            else:
                lamda[i, k] /= denominator

def LogLikelihood():
    loglikelihood = 0
    for i in range(0, N):
        for j in range(0, M):
            tmp = 0
            for k in range(0, K):
                tmp += theta[k, j] * lamda[i, k]
            if tmp > 0:
                loglikelihood += X[i, j] * np.log(tmp)
    print('loglikelihood : ', loglikelihood)

#==============================================================================
# EM algorithm
#==============================================================================
LogLikelihood()
for i in range(0, 20):
    EStep()
    MStep()
    print("[", time.strftime('%Y-%m-%d %H:%M:%S'), "] After iteration", i + 1)
    LogLikelihood()


#==============================================================================
# Get the top words of each topic
#==============================================================================
topicwords = []
maxTopicWordsNum = 10
for i in range(0, K):
    topicword = []
    # argsort is ascending, so inserting at the front yields descending order
    ids = theta[i, :].argsort()
    for j in ids:
        topicword.insert(0, dictionaryReverse[j])
    topicwords.append(topicword[0:min(maxTopicWordsNum, len(topicword))])
--------------------------------------------------------------------------------
/PLSA-demo/op_dataset.txt:
--------------------------------------------------------------------------------
1 | Roronoa Zoro, nicknamed Pirate Hunter Zoro, is a fictional character in the One Piece franchise created by Eiichiro Oda.
2 | In the story, Zoro is the first to join Monkey D. Luffy after he is saved from being executed at the Marine Base.
3 | Zoro is an expert swordsman who uses three swords for his Three Sword Style, but is also capable of the one and two-sword styles.
4 | Zoro seems to be more comfortable and powerful using three swords, but he also uses one sword or two swords against weaker enemies.
5 | In One Piece, Luffy sails from the East Blue to the Grand Line in search of the legendary treasure One Piece to succeed Gol D. Roger as the King of the Pirates.
6 | Luffy is the captain of the Straw Hat Pirates and along his journey, he recruits new crew members with unique abilities and personalities.
7 | Luffy often thinks with his stomach and gorges himself to comical levels.
8 | However, Luffy is not as naive as many people believe him to be, showing more understanding in situations than people often expect.
9 | Knowing the dangers ahead, Luffy is willing to risk his life to reach his goal to become the King of the Pirates, and protect his crew.
10 | Adopted and raised by Navy seaman turned tangerine farmer Bellemere, Nami and her older sister Nojiko, have to witness their mother being murdered by the infamous Arlong.
11 | Nami, still a child but already an accomplished cartographer who dreams of drawing a complete map of the world, joins the pirates, hoping to eventually buy freedom for her village.
12 | Growing up as a pirate-hating pirate, drawing maps for Arlong and stealing treasure from other pirates, Nami becomes an excellent burglar, pickpocket and navigator with an exceptional ability to forecast weather.
13 | After Arlong betrays her, and he and his gang are defeated by the Straw Hat Pirates, Nami joins the latter in pursuit of her dream.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Natural Language Processing
2 |
3 | Implemented
4 | ============
5 | ## LSA (Latent Semantic Analysis)
6 |
7 | ![lsa](https://github.com/laserwave/NLP-demos/blob/master/images/lsa.png)
8 |
9 | [Natural Language Processing: LSA](http://zhikaizhang.cn/2016/05/31/%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E5%A4%84%E7%90%86%E4%B9%8BLSA/)
10 |
11 | ## PLSA (Probabilistic Latent Semantic Analysis)
12 |
13 | ![plsa1](https://github.com/laserwave/NLP-demos/blob/master/images/plsa2.png)
14 |
15 | ![plsa2](https://github.com/laserwave/NLP-demos/blob/master/images/plsa1.png)
16 |
17 | [Natural Language Processing: PLSA](http://zhikaizhang.cn/2016/06/17/%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E5%A4%84%E7%90%86%E4%B9%8BPLSA/)
18 |
19 | ## LDA (Latent Dirichlet Allocation)
20 |
21 | ![lda1](https://github.com/laserwave/NLP-demos/blob/master/images/lda1.png)
22 |
23 | ![lda2](https://github.com/laserwave/NLP-demos/blob/master/images/lda2.png)
24 |
25 | ![lda3](https://github.com/laserwave/NLP-demos/blob/master/images/lda3.png)
26 |
27 | [Natural Language Processing: LDA](http://zhikaizhang.cn/2016/06/29/%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E5%A4%84%E7%90%86%E4%B9%8BLDA/)
28 |
29 | Author
30 | ============
31 |
32 | * ZhikaiZhang
33 | * Email
34 | * Blog
35 |
--------------------------------------------------------------------------------
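The EM updates in `/PLSA-demo/main.py` are written as explicit nested loops over documents, words, and topics. As a rough illustration only, the same E-step and M-step can be expressed with NumPy broadcasting. The sketch below is not part of the repository: the function names `plsa_em` and `log_likelihood`, the `seed` parameter, and the use of `np.random.default_rng` are assumptions made for this example, while `lamda`, `theta`, `X`, `N`, `M`, and `K` follow the meanings used in `main.py`.

```python
import numpy as np


def plsa_em(X, K, n_iter=20, seed=0):
    """Hypothetical vectorized PLSA EM; X is an (N, M) document-word count matrix."""
    rng = np.random.default_rng(seed)
    N, M = X.shape
    # lamda[i, k] ~ p(z_k | d_i), theta[k, j] ~ p(w_j | z_k), rows normalized
    lamda = rng.random((N, K))
    lamda /= lamda.sum(axis=1, keepdims=True)
    theta = rng.random((K, M))
    theta /= theta.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: p[i, j, k] = p(z_k | d_i, w_j), normalized over topics
        p = lamda[:, None, :] * theta.T[None, :, :]          # shape (N, M, K)
        denom = p.sum(axis=2, keepdims=True)
        p = np.divide(p, denom, out=np.zeros_like(p), where=denom > 0)
        # M-step: expected counts weighted by the posterior
        weighted = X[:, :, None] * p                         # shape (N, M, K)
        theta = weighted.sum(axis=0).T                       # shape (K, M)
        theta_sum = theta.sum(axis=1, keepdims=True)
        theta = np.divide(theta, theta_sum,
                          out=np.full_like(theta, 1.0 / M), where=theta_sum > 0)
        lamda = weighted.sum(axis=1)                         # shape (N, K)
        doc_len = X.sum(axis=1, keepdims=True)
        lamda = np.divide(lamda, doc_len,
                          out=np.full_like(lamda, 1.0 / K), where=doc_len > 0)
    return lamda, theta


def log_likelihood(X, lamda, theta):
    """Sum of X[i, j] * log p(w_j | d_i) over entries with positive count."""
    mix = lamda @ theta                                      # shape (N, M)
    mask = (X > 0) & (mix > 0)
    return float((X[mask] * np.log(mix[mask])).sum())
```

Calling `plsa_em(X, K)` on the matrix built in `main.py` should play the same role as the loop of `EStep()` and `MStep()` calls there. The trade-off is memory: the intermediate arrays hold N × M × K floats, which is fine for the small demo datasets but may not suit a large corpus.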
/images/lda1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laserwave/topic_models/be50d698525d62ecefc3406987d44539108d4fc2/images/lda1.png
--------------------------------------------------------------------------------
/images/lda2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laserwave/topic_models/be50d698525d62ecefc3406987d44539108d4fc2/images/lda2.png
--------------------------------------------------------------------------------
/images/lda3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laserwave/topic_models/be50d698525d62ecefc3406987d44539108d4fc2/images/lda3.png
--------------------------------------------------------------------------------
/images/lsa.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laserwave/topic_models/be50d698525d62ecefc3406987d44539108d4fc2/images/lsa.png
--------------------------------------------------------------------------------
/images/plsa1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laserwave/topic_models/be50d698525d62ecefc3406987d44539108d4fc2/images/plsa1.png
--------------------------------------------------------------------------------
/images/plsa2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/laserwave/topic_models/be50d698525d62ecefc3406987d44539108d4fc2/images/plsa2.png
--------------------------------------------------------------------------------