# chinese-word2vec

Pretrained word2vec/glove/swivel binary embedding files trained on a Chinese corpus.

word2vec: https://code.google.com/p/word2vec/
glove: http://nlp.stanford.edu/projects/glove/
swivel: https://github.com/tensorflow/models/tree/master/swivel
        http://arxiv.org/abs/1602.02215

Training corpus: encyclopedia web pages plus several categorized news corpora (after full-width/half-width character conversion)
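The README does not say how the full-width/half-width conversion was done or in which direction. As a minimal sketch, assuming the common convention of mapping full-width ASCII variants (U+FF01 to U+FF5E) and the ideographic space (U+3000) to their half-width equivalents, a query string could be normalized like this before lookup. The function name `to_halfwidth` is hypothetical and not part of this repository.

```python
def to_halfwidth(text: str) -> str:
    """Map full-width ASCII variants and the ideographic space to half-width.

    Assumption: this mirrors a typical preprocessing step; the exact rules
    used to build the training corpus are not documented in the README.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:
            # Ideographic (full-width) space -> ASCII space
            code = 0x20
        elif 0xFF01 <= code <= 0xFF5E:
            # Full-width forms sit at a fixed offset from ASCII 0x21..0x7E
            code -= 0xFEE0
        out.append(chr(code))
    return "".join(out)


if __name__ == "__main__":
    print(to_halfwidth("ＡＢＣ１２３，　ｏｋ！"))
```

Chinese characters and other code points outside those two ranges pass through unchanged, so the function is safe to apply to mixed text.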

Binary files (to read the word2vec binary format, refer to the word2vec source code):
    word2vec CBOW: utf8  2.18G
        http://pan.baidu.com/s/1qX334vE
    word2vec SKIP: utf8  2.17G
        http://pan.baidu.com/s/1bogTzfp
    word2vec CBOW: gb18030 (reduced corpus)  2.31G
        http://pan.baidu.com/s/1jHumqjW
    word2vec SKIP: gb18030 (reduced corpus)  1.47G
        http://pan.baidu.com/s/1ntVJBYD
    swivel: utf8
        not yet uploaded
    glove: utf8   9.19G
        http://pan.baidu.com/s/1i3XowWP
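For the word2vec files, the layout written by the original word2vec tool is a text header `<vocab_size> <dim>` followed, for each word, by the word's bytes, a single space, `dim` 32-bit floats, and a trailing newline. The following is a minimal sketch of a reader for that layout, assuming little-endian floats and that the files listed here follow the standard format (not verified against these downloads). The function name `read_word2vec_binary` and the `limit` parameter are hypothetical; pass `encoding="gb18030"` for the gb18030 files. This layout does not necessarily apply to the glove or swivel files.

```python
import struct
from typing import BinaryIO, Dict, List, Optional


def read_word2vec_binary(
    stream: BinaryIO,
    encoding: str = "utf-8",
    limit: Optional[int] = None,
) -> Dict[str, List[float]]:
    """Read vectors from a stream in the original word2vec binary layout."""
    # Header line: "<vocab_size> <dim>\n" in ASCII.
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    if limit is not None:
        vocab_size = min(vocab_size, limit)

    vector_bytes = 4 * dim  # each component is a 32-bit float
    vectors: Dict[str, List[float]] = {}
    for _ in range(vocab_size):
        # A word is a run of bytes terminated by a single space.
        word_bytes = bytearray()
        while True:
            ch = stream.read(1)
            if not ch:
                raise EOFError("unexpected end of data while reading a word")
            if ch == b" ":
                break
            if ch != b"\n":  # skip the newline trailing the previous vector
                word_bytes += ch
        word = word_bytes.decode(encoding, errors="replace")

        raw = stream.read(vector_bytes)
        if len(raw) != vector_bytes:
            raise EOFError("unexpected end of data while reading a vector")
        # Assumption: floats were written on a little-endian machine.
        vectors[word] = list(struct.unpack("<%df" % dim, raw))
    return vectors
```

A call might look like `with open(path, "rb") as f: vecs = read_word2vec_binary(f, limit=10000)`, where `path` points at one of the downloaded files. Plain Python lists are memory-hungry for files of this size, so for full loads a library loader such as gensim's `KeyedVectors.load_word2vec_format(path, binary=True, encoding=...)` is likely the more practical choice.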

Other:
    Word vectors trained on 125 million UK tweets:
        Reference: https://figshare.com/articles/UK_Twitter_word_embeddings/4052331
        Download: https://ndownloader.figshare.com/articles/4052331/versions/1
