├── LICENSE
├── README-CV.md
└── README.md


/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2020 xiaoxiong74
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README-CV.md:
--------------------------------------------------------------------------------
 1 | # Cool-NLPCV (continuously updated...)
 2 | Some Cool NLP and CV Repositories and Solutions   
 3 | 
 4 | [Cool-NLP](README.md) | Cool-CV
 5 | 
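 6 | This list collects open-source solutions, datasets, tools, and learning materials for common CV tasks, for study and quick lookup. It is shared here for reference. Sharing and Stars are welcome, thank you!  
 7 | Updates are ongoing but irregular, and contributions are welcome.  
 8 | 
 9 | All content comes from the internet. If anything infringes your rights, please contact me and I will remove it.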
10 | 
11 | 1. Machine Learning & Deep Learning: Selected Introductory Resources  
12 | * [Python: 100 days from novice to master](https://github.com/jackfrued/Python-100-Days)
13 | * [Chinese notes for the Stanford 2014 machine learning course (Andrew Ng)](https://github.com/fengdu78/Coursera-ML-AndrewNg-Notes)
14 | * [Code implementations for "Statistical Learning Methods", 2nd edition](https://github.com/fengdu78/lihang-code)
15 | * [Chinese notes for the Coursera deep learning course (deeplearning.ai, Andrew Ng)](https://github.com/fengdu78/deeplearning_ai_books)
16 | * ["Dive into Deep Learning", TensorFlow 2.0 version](http://zh.d2l.ai/)
17 | * ["Dive into Deep Learning", PyTorch version](https://github.com/ShusenTang/Dive-into-DL-PyTorch)
18 | * [Deep-learning-with-keras-notebooks](https://github.com/erhwenkuo/deep-learning-with-keras-notebooks)
19 | * [TensorFlow 2 tutorial and introductory deep learning guide](https://github.com/snowkylin/tensorflow-handbook)  
20 | * [A practical tutorial on PyTorch model training](https://github.com/TingsongYu/PyTorch_Tutorial)
21 | * [Formula derivations for "Machine Learning" (the "watermelon book")](https://github.com/datawhalechina/pumpkin-book)
22 | * [Data-Science-Notes: data science notes and collected materials](https://github.com/fengdu78/Data-Science-Notes)
23 | * [Notes on Hung-yi Lee's "Deep Reinforcement Learning"](https://github.com/datawhalechina/leedeeprl-notes)
24 | * [Pandas tutorial in Chinese](https://datawhalechina.github.io/joyful-pandas/build/html/%E7%9B%AE%E5%BD%95/ch3.html)
25 | * [Docker images of deep learning environments for various frameworks](https://github.com/ufoym/deepo)
26 | 
27 | 2. Face Recognition & Face Feature Extraction
28 | * [InsightFace: 2D and 3D Face Analysis Project](https://github.com/deepinsight/insightface)
29 | * [Accelerated Training for Massive Classification via Dynamic Class Selection (HF-Softmax)](https://github.com/yl-1993/hfsoftmax)
30 | * [Multi-task face recognition framework based on PyTorch](https://github.com/XiaohangZhan/face_recognition_framework)
31 | 
32 | 3. Face Image Quality Assessment
33 | * [Code and information for face image quality assessment with SER-FIQ](https://github.com/pterhoer/FaceImageQuality)
34 | * [FIIQA-PyTorch: face image illumination quality assessment implemented in PyTorch](https://github.com/yangyuke001/FIIQA-PyTorch)
35 | 
36 | 4. Face Clustering
37 | * [Learning to Cluster Faces (CVPR 2019, CVPR 2020)](https://github.com/yl-1993/learn-to-cluster)
38 | * [Learning to Cluster Faces by Infomap](https://github.com/xiaoxiong74/face-cluster-by-infomap)
39 | 
40 | 5. Face Detection
41 | * [Open-source face mask detection models and data](https://github.com/xiaoxiong74/FaceMaskDetection)
42 | * [Lightweight face detection model: Ultra-Light-Fast-Generic-Face-Detector-1MB](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/README_CN.md)
43 | * [Ultra-lightweight face detector with landmark detection](https://github.com/biubug6/Face-Detector-1MB-with-landmark)
44 | * [Tencent Youtu high-accuracy dual-branch face detector](https://github.com/Tencent/FaceDetection-DSFD)
45 | 
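46 | 6. Object Detection
47 | * [Object detection: YOLO v4, v3 and v2](https://github.com/AlexeyAB/darknet)
48 | * [Object-detection-based system for recognizing construction-site hard hats and entry into restricted danger zones, dataset included](https://github.com/PeterH0323/Smart_Construction)
49 | * [License plate detection based on RetinaFace, with a new model of only 1.8 MB](https://github.com/zeusees/License-Plate-Detector)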
50 | 
51 | 7. Object Tracking
52 | * [MMTracking: a unified video object perception platform](https://github.com/open-mmlab/mmtracking)
53 | * [Multi-object tracking with DeepSort](https://github.com/nwojke/deep_sort)
54 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # Cool-NLPCV (continuously updated...)
  2 | Some Cool NLP and CV Repositories and Solutions  
  3 |  
  4 | Cool-NLP | [Cool-CV](README-CV.md)  
  5 | 
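  6 | This list collects open-source solutions, datasets, tools, learning materials, and quality blog posts for common NLP tasks, for study and quick lookup. It is shared here for reference. Sharing and Stars are welcome, thank you!  
  7 | Updates are ongoing but irregular, and contributions are welcome. 
  8 | 
  9 | All content comes from the internet. If anything infringes your rights, please contact me and I will remove it.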
 10 | 
 11 | 1. Machine Learning & Deep Learning: Selected Introductory Resources  
 12 | * [Python: 100 days from novice to master](https://github.com/jackfrued/Python-100-Days)
 13 | * [Chinese notes for the Stanford 2014 machine learning course (Andrew Ng)](https://github.com/fengdu78/Coursera-ML-AndrewNg-Notes)
 14 | * [Code implementations for "Statistical Learning Methods", 2nd edition](https://github.com/fengdu78/lihang-code)
 15 | * [Chinese notes for the Coursera deep learning course (deeplearning.ai, Andrew Ng)](https://github.com/fengdu78/deeplearning_ai_books)
 16 | * ["Dive into Deep Learning", TensorFlow 2.0 version](http://zh.d2l.ai/)
 17 | * ["Dive into Deep Learning", PyTorch version](https://github.com/ShusenTang/Dive-into-DL-PyTorch)
 18 | * [Deep-learning-with-keras-notebooks](https://github.com/erhwenkuo/deep-learning-with-keras-notebooks)
 19 | * [TensorFlow 2 tutorial and introductory deep learning guide](https://github.com/snowkylin/tensorflow-handbook)  
 20 | * [A practical tutorial on PyTorch model training](https://github.com/TingsongYu/PyTorch_Tutorial)
 21 | * [Formula derivations for "Machine Learning" (the "watermelon book")](https://github.com/datawhalechina/pumpkin-book)
 22 | * [Data-Science-Notes: data science notes and collected materials](https://github.com/fengdu78/Data-Science-Notes)
 23 | * [Notes on Hung-yi Lee's "Deep Reinforcement Learning"](https://github.com/datawhalechina/leedeeprl-notes)
 24 | * [Pandas tutorial in Chinese](https://datawhalechina.github.io/joyful-pandas/build/html/%E7%9B%AE%E5%BD%95/ch3.html)
 25 | * [Docker images of deep learning environments for various frameworks](https://github.com/ufoym/deepo)
 26 | 
 27 | 2. Word Vectors & BERT-Family Pretrained Models
 28 | * [100+ Chinese Word Vectors: over a hundred pretrained Chinese word vectors](https://github.com/Embedding/Chinese-Word-Vectors)  
 29 | * [Tencent word vectors](https://ai.tencent.com/ailab/nlp/embedding.html)
 30 | * [Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)](https://github.com/ymcui/Chinese-BERT-wwm)
 31 | * [Google's official BERT](https://github.com/google-research/bert)
 32 | * [Chinese ELECTRA pretrained models](https://github.com/ymcui/Chinese-ELECTRA)
 33 | * [Chinese XLNet pretrained models](https://github.com/ymcui/Chinese-XLNet)
 34 | * [Chinese MacBERT pretrained models](https://github.com/ymcui/MacBERT)
 35 | * [Chinese ALBERT pretrained models](https://github.com/brightmart/albert_zh)
 36 | * [A collection of open-source pretrained language models](https://github.com/ZhuiyiTechnology/pretrained-models)
 37 | * [BERT and word embeddings pretrained on JD customer-service dialogue data (42 GB, 1.2 billion sentences)](https://github.com/jd-aig/nlp_baai/tree/master/pretrained_models_and_embeddings)
 38 | * [WoBERT: a word-granularity Chinese BERT](https://github.com/ZhuiyiTechnology/WoBERT)
 39 | * [A collection of high-quality Chinese pretrained models: state-of-the-art large models, fastest small models, and similarity-specialized models](https://github.com/CLUEbenchmark/CLUEPretrainedModels)
 40 | 
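 41 | 3. NLP Datasets & Data Download Sites
 42 | * [Task-oriented dialogue data, text classification, entity recognition & POS tagging, search matching, recommender systems, encyclopedia data, coreference resolution, Chinese cloze datasets, classical Chinese poetry database, insurance-industry corpus, Chinese character decomposition dictionary, Chinese dataset platforms](https://github.com/InsaneLife/ChineseNLPCorpus)
 43 | * [Sentiment/opinion/review polarity analysis, Chinese named entity recognition, recommender systems, FAQ question answering systems](https://github.com/SophonPlus/ChineseNlpCorpus) 
 44 | * [Wikipedia, news corpora, encyclopedia Q&A, community Q&A, Chinese-English translation corpora](https://github.com/brightmart/nlp_chinese_corpus) 
 45 | * [Chinese language understanding evaluation benchmark, including representative datasets, baseline (pretrained) models, corpora, and leaderboards](https://github.com/CLUEbenchmark/CLUE)
 46 | * [Knowledge graph datasets: common sense, cities, finance, agriculture, geography, meteorology, social networks, IoT, healthcare, entertainment, daily life, business, travel, science and education, etc.](http://openkg.cn/dataset)
 47 | * [COVID-19 open knowledge graph](http://openkg.cn/dataset/39801d1b-0b51-4cde-a06c-62def5a70563)
 48 | * ["大词林": 750,000 open-source core entities, plus lists of fine-grained concepts and relations around them](http://openkg.cn/dataset/hit)
 49 | * [Large-scale medical dialogue dataset: 1.1 million medical consultations and 4 million doctor-patient dialogue entries](https://github.com/UCSD-AI4H/Medical-Dialogue-System)
 50 | * [Chinese medical dialogue dataset on COVID-19 and other types of pneumonia](https://github.com/UCSD-AI4H/COVID-Dialogue)
 51 | * [MedQuAD: (English) medical question answering dataset](https://github.com/abachaa/MedQuAD)
 52 | * [Chinese medical dialogue data: a Chinese medical dialogue dataset](https://github.com/Toyhom/Chinese-medical-dialogue-data)
 53 | * [Large-scale Chinese knowledge graph data](https://github.com/ownthink/KnowledgeGraphData)
 54 | * [Chinese speech corpus: about 3,200 speakers, about 900 hours of audio, about 1.13 million text entries, about 13 million characters in total](https://github.com/KuangDD/zhvoice)
 55 | * [THUOCL (THU Open Chinese Lexicon): a Chinese lexicon](https://github.com/thunlp/THUOCL)
 56 | * [Twelve categories of million-scale common semantic dictionaries for Chinese processing, including a 340,000-entry abstract semantics base, a 340,000-entry antonym base, and a 430,000-entry synonym base](https://github.com/liuhuanyong/ChineseSemanticKB)
 57 | * [Baidu Zhidao Q&A corpus, with over 5.8 million questions, 9.38 million answers, and 5,800 category labels](https://github.com/liuhuanyong/MiningZhiDaoQACorpus)
 58 | * [Company name corpus and organization name corpus](https://github.com/wainshine/Company-Names-Corpus)
 59 | * [Chinese and English NLP datasets](https://github.com/loujie0822/CLUEDatasetSearch)
 60 | * [chinese-poetry: the most comprehensive database of classical Chinese poetry and literary collections](https://github.com/chinese-poetry/chinese-poetry)
 61 | * [BAAI Open Data Research Center](https://open.baai.ac.cn/home)
 62 | * [Baidu Brain](https://ai.baidu.com/broad/download)
 63 | * [DiDi open data initiative](https://outreach.didichuxing.com/app-vue/)
 64 | * [Tianchi dataset collection (covering text, images, recommendation, traffic, speech, etc.)](https://tianchi.aliyun.com/dataset)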
 65 | 
 66 | 
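 67 | 4. Unified-Framework Implementations of Various Tasks Based on BERT (bert4keras):
 68 | * [Chinese word segmentation, entity recognition, text (sentiment) classification, reading comprehension, title generation, relation extraction (triple extraction), adversarial training, image captioning, text generation](https://github.com/bojone/bert4keras/tree/master/examples)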
 69 | 
 70 | 5. [BAT machine learning interview: 1000-question series](https://blog.csdn.net/v_JULY_v/article/details/78121924)
 71 | 
 72 | 6. [Macadam is an NLP toolkit built on TensorFlow (Keras) and bert4keras, focused on text classification, sequence labeling, and relation extraction](https://github.com/yongzhuo/Macadam)
 73 | * Supports embeddings such as RANDOM, WORD2VEC, FASTTEXT, BERT, ALBERT, ROBERTA, NEZHA, XLNET, ELECTRA, and GPT-2
 74 | * Supports text classification algorithms such as FineTune, FastText, TextCNN, CharCNN, BiRNN, RCNN, DCNN, CRNN, DeepMoji, SelfAttention, HAN, and Capsule
 75 | * Supports sequence labeling algorithms such as CRF, Bi-LSTM-CRF, CNN-LSTM, DGCNN, Bi-LSTM-LAN, Lattice-LSTM-Batch, and MRC
 76 | 
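 77 | 7. Paper Collections & Practical Experience
 78 | * [Papers and open-source code projects from top NLP-related conferences (e.g., ACL, EMNLP, NAACL, COLING, AAAI, IJCAI)](https://github.com/yizhen20133868/NLP-Conferences-Code)
 79 | * [A curated list of classic, top-conference, and must-read NLP papers across multiple areas](https://blog.csdn.net/qq_42189083/article/details/106424610)
 80 | * [Detailed write-ups of deep learning models deployed in production at major companies, mainly in search, recommendation, and NLP](https://github.com/DA-southampton/Tech_Aarticle)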
 81 | 
 82 | 8. Named Entity Recognition Collection
 83 | * [TF-based: BERT-BiLSTM-CRF-NER](https://github.com/macanv/BERT-BiLSTM-CRF-NER)
 84 | * [TF + PyTorch: CLUENER fine-grained named entity recognition](https://github.com/CLUEbenchmark/CLUENER2020)
 85 | * [PyTorch-based: Chinese NER (Named Entity Recognition) using BERT (Softmax, CRF, Span)](https://github.com/lonePatient/BERT-NER-Pytorch)
 86 | * [TF-based: practice and exploration of named entity recognition](https://github.com/wavewangyue/ner)
 87 | * [How does industry solve NER? 12 tricks to share](https://zhuanlan.zhihu.com/p/152463745)
 88 | * [The right way to approach Chinese NER: a summary of lexicon-enhancement methods (from Lattice LSTM to FLAT)](https://zhuanlan.zhihu.com/p/142615620)
 89 | * [LatticeLSTM with batch parallelism support](https://github.com/LeeSureman/Batch_Parallel_LatticeLSTM)
 90 | * [medical_NER: named entity recognition for a Chinese medical knowledge graph](https://github.com/pumpkinduo/KnowledgeGraph_NER)
 91 | * [Named entity recognition implemented with BERT/CRF](https://github.com/Louis-udm/NER-BERT-CRF)
 92 | * [Chinese NER with the pretrained language model ALBERT](https://github.com/ProHiryu/albert-chinese-ner)
 93 | * [Sequence labeling with BiLSTM-CRF, BERT, and related methods](https://github.com/qiufengyuyi/sequence_tagging)
 94 | * [Medical entity recognition with BiLSTM+CRF, including medical NER data](https://github.com/DengYangyong/medical_entity_recognize)
 95 | * [DeepIE: deep-learning-based information extraction](https://github.com/loujie0822/DeepIE)
 96 | 
 97 | 9. Text (Sentiment) Classification
 98 | * [Several common text classification models built on CNNs, RNNs, and pretrained NLP models](https://github.com/xiaoxiong74/textClassifier)
 99 | * [Chinese text classification in PyTorch: TextCNN, TextRNN, FastText, TextRCNN, BiLSTM_Attention, DPCNN, Transformer](https://github.com/649453932/Chinese-Text-Classification-Pytorch)
100 | * [Tencent's open-source deep learning text classification toolkit NeuralNLP-NeuralClassifier, based on PyTorch](https://github.com/Tencent/NeuralNLP-NeuralClassifier)
101 | * [Keras-TextClassification](https://github.com/yongzhuo/Keras-TextClassification)
102 | * [Chinese ULMFiT: sentiment analysis and text classification](https://github.com/bigboNed3/chinese_ulmfit)
103 | * [Text classification based on BERT, XLNet + CNN, LSTM, GRU](https://github.com/zhanlaoban/Transformers_for_Text_Classification)
104 | * [How to solve 11 key problems in NLP classification tasks](https://zhuanlan.zhihu.com/p/183852900)
105 | * [A survey and summary of text classification resources (with code)](https://github.com/xiaoqian19940510/text-classification-surveys)
106 | 
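107 | 10. Relation Extraction (Triple Extraction)
108 | * [Chinese relation extraction based on distant supervision](https://github.com/xiaofei05/Distant-Supervised-Chinese-Relation-Extraction)
109 | * [A lightweight information extraction model based on DGCNN and probabilistic graphs](https://kexue.fm/archives/6671)
110 | * [Triple extraction with bert4keras](https://kexue.fm/archives/7161)
111 | * [A winning information extraction solution: nested NER + relation extraction + entity normalization](https://zhuanlan.zhihu.com/p/326302618)
112 | * [A roundup of ACL 2020 papers on information extraction](https://blog.csdn.net/qq_42189083/article/details/106424416)
113 | * [A summary of entity-relation extraction methods in NLP](https://zhuanlan.zhihu.com/p/77868938)
114 | * [DeepKE: a PyTorch-based deep learning framework for Chinese relation extraction](https://github.com/zjunlp/deepke)
115 | * [TensorFlow-based entity and relation extraction: a solution to the information extraction task of the 2019 Language and Intelligence Technology Competition](https://github.com/yuanxiaosc/Entity-Relation-Extraction)
116 | * [A cascade pointer framework for triple extraction](https://github.com/weizhepei/CasRel)
117 | * [A summary of event extraction methods (with code)](https://github.com/xiaoqian19940510/Event-Extraction)
118 | * [DeepIE: deep-learning-based information extraction](https://github.com/loujie0822/DeepIE)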
119 | 
120 | 
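121 | 11. Text Generation & Text Summarization
122 | * [Building your own DialoGPT: an LM-based generative multi-turn dialogue model](https://kexue.fm/archives/7718)
123 | * [CAIL 2020 judicial summarization: SPACES, "extract-then-generate" long-text summarization (CAIL write-up)](https://github.com/bojone/SPACES)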
124 | 
125 | 12. Reading Comprehension
126 | * [MLM-based reading comprehension question answering](https://kexue.fm/archives/7148)
127 | 
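128 | 13. Knowledge Graphs
129 | * [Intelligent question answering system based on a medicine knowledge graph](https://github.com/YeYzheng/KGQA-Based-On-medicine)
130 | * [JD product knowledge graph](https://github.com/liuhuanyong/ProductKnowledgeGraph)
131 | * [Military-domain knowledge graph question answering project](https://github.com/liuhuanyong/QAonMilitaryKG)
132 | * [Extracting triples from Chinese Baidu Baike pages to build a Chinese knowledge graph](https://github.com/lixiang0/WEB_KG)
133 | * [Question answering system based on knowledge graphs](https://github.com/fighting41love/funNLP)
134 | * ["Knowledge Graph" course materials](https://github.com/npubird/KnowledgeGraphCourse)
135 | * [Agricultural knowledge graph (AgriKG): information retrieval, named entity recognition, relation extraction, intelligent question answering, and decision support for agriculture](https://github.com/qq547276542/Agriculture_KnowledgeGraph)
136 | * [Knowledge graph construction and KG-based automatic question answering: a sizable disease-centered knowledge graph for the medical domain, used to provide automatic question answering and analysis services](https://github.com/liuhuanyong/QASystemOnMedicalKG)
137 | * [Knowledge graph learning resources, offering a systematic learning path](https://github.com/husthuke/awesome-knowledge-graph)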
138 | 
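139 | 14. Text Similarity Computation (Judgment)
140 | * [A roundup of Chinese question and sentence similarity competitions and solutions](https://github.com/ShuaichiLi/Chinese-sentence-similarity-task)
141 | * [Top-1 team solution for similar case matching at CAIL](https://github.com/GuidoPaul/CAIL2019)
142 | * [TensorFlow versions of common text matching models, using the QA_corpus dataset](https://github.com/terrifyzhao/text_matching)
143 | * [Text matching models such as DSSM, ESIM, ABCNN, and BIMPM, using the official LCQMC data](https://github.com/pengming617/text_matching)
144 | * [Similar-sentence judgment model based on a Siamese BiLSTM, with training and test datasets provided](https://github.com/liuhuanyong/SiameseSentenceSimilarity)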
145 | 
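146 | 15. Attention Mechanism & Transformer
147 | * [Attention models in deep learning](https://zhuanlan.zhihu.com/p/37601161)
148 | * [A brief reading of "Attention is All You Need" (introduction + code)](https://kexue.fm/archives/4765)
149 | * [Easy to follow: the attention mechanism illustrated in 8 steps](https://zhuanlan.zhihu.com/p/94077451)
150 | * [Transformers are like a play: it all comes down to the mask](https://kexue.fm/archives/6933)
151 | * [Abandon illusions and fully embrace the Transformer: comparing the three major NLP feature extractors (CNN/RNN/TF)](https://zhuanlan.zhihu.com/p/54743941)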
152 | 
153 | 
154 | 16. Chatbots & Question Answering
155 | * [Applications, architectures, and algorithms of intelligent customer service and chatbots: shared resources and introductions](https://github.com/lizhe2004/chatbot-list)
156 | * [Microsoft's chatbot framework BotFramework](https://github.com/microsoft/botframework)
157 | * [The chatbot framework RASA](https://github.com/RasaHQ/rasa)
158 | * [GPT2 for Chinese chitchat](https://github.com/yangjianxin1/GPT2-chitchat)
159 | * [Chatbot for the finance and legal domains (also with chitchat capability)](https://github.com/charlesXu86/Chatbot_CN)
160 | * [Chatbot built on rasa_nlu, rasa_core, and rasa_core_sdk](https://github.com/xiaoxiong74/rasa_chatbot)
161 | 
162 | 17. Embedding Series
163 | * [Comparing word vectors in NLP: word2vec/glove/fastText/elmo/GPT/bert](https://zhuanlan.zhihu.com/p/56382372)
164 | * [PTMs riding the wind and waves: two years of technical progress in pretrained models](https://zhuanlan.zhihu.com/p/254821426)
165 | * [A ten-thousand-character deep dive into word vectors (W2C/Fasttext/Glove)](https://zhuanlan.zhihu.com/p/164999424)
166 | * [Ten must-read papers for getting started with embeddings](https://blog.csdn.net/qq_42189083/article/details/106429008)
167 | 
168 | 18. BERT Explained Series
169 | * [The BERT model illustrated](https://cloud.tencent.com/developer/article/1389555)
170 | * [NLP pretrained models: from transformer to albert](https://zhuanlan.zhihu.com/p/85221503)
171 | * [Innovation in the BERT era (applications): progress in applying BERT across NLP fields](https://zhuanlan.zhihu.com/p/68446772)
172 | * [From word embeddings to BERT: a history of pretraining techniques in NLP](https://zhuanlan.zhihu.com/p/49271699)
173 | * [XLNet: how it works and how it compares with BERT](https://zhuanlan.zhihu.com/p/70257427)
174 | 
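175 | 19. NLP task collections, including but not limited to word vectors, named entity recognition, text classification, text generation, text similarity, relation extraction, Chinese word segmentation, POS tagging, sentiment analysis, new word discovery, keyword extraction, text summarization, and text clustering
176 | * [NLP papers and code covering topic models, word vectors, named entity recognition, text classification, text generation, text similarity, and other NLP algorithms, based on Keras and TensorFlow](https://github.com/msgi/nlp-journey)
177 | * [Jiagu NLP toolkit: built on models such as BiLSTM, it provides knowledge graph relation extraction, Chinese word segmentation, POS tagging, named entity recognition, sentiment analysis, new word discovery, keyword extraction, text summarization, text clustering, and more](https://github.com/ownthink/Jiagu)
178 | * [Texthero: an efficient text data processing package covering preprocessing, keyword extraction, named entity recognition, vector space analysis, text visualization, and more](https://github.com/jbesomi/texthero)
179 | * [BERT applications in PyTorch, including named entity recognition, sentiment analysis, text classification, and text similarity](https://github.com/rsanshierli/EasyBert)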
180 | 
181 | 20. Basic NLP Toolkits
182 | * [Tsinghua THULAC](https://github.com/thunlp/THULAC)
183 | * [HanLP](https://github.com/hankcs/HanLP)
184 | * [Harbin Institute of Technology LTP](https://github.com/HIT-SCIR/ltp)
185 | * [Jieba](https://github.com/yanyiwu/cppjieba)
186 | * [NLPIR Chinese word segmentation](https://github.com/NLPIR-team/NLPIR)
187 | * [JioNLP: an accurate, efficient, zero-barrier preprocessing toolkit for Chinese NLP tasks](https://github.com/dongrixinyu/JioNLP)
188 | * [Time-Extractor: time extraction, conversion, and normalization for Chinese text](https://github.com/xiaoxiong74/Time-Extractor)
189 | * [TexSmart: text understanding tools and services](https://ai.tencent.com/ailab/nlp/texsmart/zh/)
190 | 
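191 | 21. Adversarial Text, Data Augmentation, Few-Shot, Zero-Shot, Semi-Supervised
192 | * [TextAttack: a framework for adversarial attacks, data augmentation, and model training in NLP](https://github.com/QData/TextAttack)
193 | * [A brief discussion of adversarial training: significance, methods, and reflections (with a Keras implementation)](https://kexue.fm/archives/7234)
194 | * [EDA data augmentation tool for Chinese corpora](https://github.com/zhanlaoban/eda_nlp_for_Chinese)
195 | * [Understanding adversarial training in NLP in one article: FGSM/FGM/PGD/FreeAT/YOPO/FreeLB/SMART](https://zhuanlan.zhihu.com/p/103593948)
196 | * [Adversarial training in NLP + PyTorch implementation](https://zhuanlan.zhihu.com/p/91269728)
197 | * [BERT's MLM can also do few-shot learning](https://kexue.fm/archives/7764)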
198 | 
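199 | 22. NLP Annotation Tools and Platforms
200 | * [BRAT: a web-based text annotation tool](https://github.com/nlplab/brat)
201 | * [YEDDA](https://github.com/jiesutd/YEDDA)
202 | * [MarkTool: a general-purpose web-based text annotation tool supporting large-scale entity, relation, and event annotation, text classification, and more](https://github.com/FXLP/MarkTool)
203 | * [doccano: an all-in-one text annotation tool](https://github.com/doccano/doccano)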
204 | 
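205 | 23. NLP Interview Guides
206 | * [Essential for NLP algorithm interviews, and the most complete ever! PTMs: a comprehensive summary of NLP pretrained models](https://zhuanlan.zhihu.com/p/115014536)
207 | * [NLP/AI interview records (continuously updated; the most complete pretraining summary)](https://zhuanlan.zhihu.com/p/57153934)
208 | * [Knowledge points commonly tested in machine learning and NLP interviews, with code implementations](https://github.com/NLP-LOVE/ML-NLP)
209 | * [Soul-searching questions about Attention and Transformer](https://zhuanlan.zhihu.com/p/336606129)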
210 | 
211 | 24. AI Technology Report Series
212 | * [Tsinghua University AI technology report series](https://reports.aminer.cn/)
213 | 
214 | 25. [NLP research groups in China](https://zhuanlan.zhihu.com/p/142465929)
215 | 
216 | 26. Speech Recognition
217 | * [MASR: Chinese speech recognition](https://github.com/nobody132/masr)
218 | * [A Deep-Learning-Based Chinese Speech Recognition System](https://github.com/nl8590687/ASRT_SpeechRecognition)
219 | 
220 | 27. Seq2Seq
221 | * [Unsupervised translation between programming languages (Python, C++, Java)](https://github.com/facebookresearch/TransCoder?utm_source=catalyzex.com)
222 | 
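223 | 28. Selected Competitions
224 | * [Top solutions from NLP competitions](https://github.com/zhpmatrix/nlp-competitions-list-review)
225 | * [Champion solution of the first Chinese NL2SQL Challenge](https://github.com/nudtnlp/tianchi-nl2sql-top1)
226 | * [Third-place solution and code of the first Chinese NL2SQL Challenge](https://github.com/beader/tianchi_nl2sql)
227 | * [Kaggle competition handbook: a roundup of solutions](https://mp.weixin.qq.com/s/2dd8l4MpyI3UzdTSWGthyA)
228 | * [A collection of top solutions from recommendation algorithm competitions](https://zhuanlan.zhihu.com/p/317708353)
229 | * [Data competition Top Solution: an open-source collection of top data competition solutions](https://github.com/Smilexuhc/Data-Competition-TopSolution)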
230 | 
231 | 29. Model Distillation
232 | * [A complete guide to BERT model distillation (principles/tips/code)](https://zhuanlan.zhihu.com/p/273378905)
233 | * [A PyTorch-based knowledge distillation toolkit for NLP](https://github.com/airaria/TextBrewer)
234 | 
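235 | 30. Training Tips
236 | * [Distributed training, mixed-precision training, gradient accumulation...: training large models gracefully, in one article](https://zhuanlan.zhihu.com/p/110278004)
237 | * [A hands-on summary of BERT pretraining](https://zhuanlan.zhihu.com/p/337212893)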
238 | 
239 | 31. Competition Sites
240 | * [Alibaba Cloud Tianchi](https://tianchi.aliyun.com/)
241 | * [DataFountain](https://www.datafountain.cn/)
242 | * [Biendata competitions](https://www.biendata.xyz/)
243 | * [Heywhale (和鲸) community KESCI](https://www.kesci.com/home/competition)
244 | * [DC-lab](https://www.dclab.run/index.html)
245 | * [Kaggle](https://www.kaggle.com/)
246 | * [Turingtopia (图灵联邦)](https://www.turingtopia.com/competitionnew)
247 | * [Flyai](https://www.flyai.com/)
248 | * [Eval](https://eval.ai/web/challenges/list)
249 | 
250 | 32. Paper Search and Download
251 | * [SCI-HUB](https://sci-hub.se/)
252 | * [arXiv](https://arxiv.org/)
253 | * [卖萌屋 academic site](https://arxiv.xixiaoyao.cn/)
254 | 
255 | 33. Recommender Systems
256 | * [An easy-to-use, extensible package of deep learning click-through-rate prediction algorithms](https://github.com/shenweichen/DeepCTR)
257 | 
258 | 
259 | 
260 | 
261 | 
262 | 
263 | 
264 | 


--------------------------------------------------------------------------------