├── test
│   ├── china.jpg
│   ├── huoguo.jpg
│   ├── longmao.jpg
│   ├── chongqing.jpg
│   └── ch_stopwords.txt
├── output
│   ├── active_hour.png
│   ├── most_hot_word.jpg
│   ├── most_active_hour.jpg
│   ├── sentiment_index.png
│   ├── most_active_person.jpg
│   └── most_hot_topic_tr.jpg
├── README.md
├── requirement.txt
├── src
│   ├── __init__.py
│   ├── config.py
│   ├── mredis.py
│   ├── main.py
│   ├── classify.py
│   └── analyse.py
├── .gitignore
└── LICENSE

/test/china.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Times125/QQ_Chat_Records_Analyse/HEAD/test/china.jpg
--------------------------------------------------------------------------------
/test/huoguo.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Times125/QQ_Chat_Records_Analyse/HEAD/test/huoguo.jpg
--------------------------------------------------------------------------------
/test/longmao.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Times125/QQ_Chat_Records_Analyse/HEAD/test/longmao.jpg
--------------------------------------------------------------------------------
/test/chongqing.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Times125/QQ_Chat_Records_Analyse/HEAD/test/chongqing.jpg
--------------------------------------------------------------------------------
/output/active_hour.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Times125/QQ_Chat_Records_Analyse/HEAD/output/active_hour.png
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# QQ_Chat_Records_Analyse
## Analysis of QQ group chat logs
- First and foremost a small program written out of personal interest;
- counts the most active members of the group;
- mines the range of topics the group discusses, to gauge the group's character.
--------------------------------------------------------------------------------
/output/most_hot_word.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Times125/QQ_Chat_Records_Analyse/HEAD/output/most_hot_word.jpg
--------------------------------------------------------------------------------
/output/most_active_hour.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Times125/QQ_Chat_Records_Analyse/HEAD/output/most_active_hour.jpg
--------------------------------------------------------------------------------
/output/sentiment_index.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Times125/QQ_Chat_Records_Analyse/HEAD/output/sentiment_index.png
--------------------------------------------------------------------------------
/output/most_active_person.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Times125/QQ_Chat_Records_Analyse/HEAD/output/most_active_person.jpg
--------------------------------------------------------------------------------
/output/most_hot_topic_tr.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Times125/QQ_Chat_Records_Analyse/HEAD/output/most_hot_topic_tr.jpg
--------------------------------------------------------------------------------
/requirement.txt:
--------------------------------------------------------------------------------
nltk==3.2.5
numpy==1.14.0
scikit-learn==0.19.1
scipy==1.0.0
six==1.11.0
virtualenv==15.1.0
# The packages below are imported by the code under src/ but were missing from
# the original list; versions are deliberately left unpinned.
jieba
matplotlib
wordcloud
snownlp
redis
gensim
pandas
pillow
--------------------------------------------------------------------------------
/src/__init__.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
@Author:lichunhui
@Time: 2018/1/14 18:06
@Description:
"""
--------------------------------------------------------------------------------
/src/config.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
@Author:lichunhui
@Time: 2018/1/14 18:07
@Description:
"""
import os

# Directory holding the chat log and the stop-word list
material = os.path.join(os.path.abspath('..'), 'test')

# Directory the generated files are written to
output_path = os.path.join(os.path.abspath('..'), 'output')

# Directory holding trained models
model_path = os.path.join(os.path.abspath('..'), 'model')

# Font used by the word cloud (a raw string keeps the backslashes literal)
font_path = r'C:\Windows\Fonts\SIMYOU.TTF'
--------------------------------------------------------------------------------
/src/mredis.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
@Author:lichunhui
@Time: 2018/1/15 0:27
@Description:
"""
import redis


class MyRedis(object):
    def __init__(self):
        pass

    @staticmethod
    def get_redis_instance():
        # Pooled connection to a local Redis; decode_responses=True makes the
        # client return str instead of bytes.
        pool = redis.ConnectionPool(host='127.0.0.1', port=6379, decode_responses=True)
        rdis = redis.Redis(connection_pool=pool)
        return rdis
--------------------------------------------------------------------------------
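
For reference, a minimal sketch of how MyRedis can be used. It assumes a Redis server is actually listening on 127.0.0.1:6379, and the key and value are purely illustrative; the pipeline itself only touches Redis in code that is commented out in src/analyse.py.

from src.mredis import MyRedis

rdis = MyRedis.get_redis_instance()
rdis.set('last_run', '2018-01-15')  # illustrative key/value
print(rdis.get('last_run'))         # a str, since decode_responses=True
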
/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

.idea

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
--------------------------------------------------------------------------------
/src/main.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
@Author:lichunhui
@Time: 2018/1/14 18:12
@Description:
"""

from src.analyse import *


def main():
    read_data()
    analyse()
    # lst = ['感觉搞无机的太少了', '她是个善良的女孩,温柔可爱', '今天天气不错,让我心情舒畅', '我真的很伤心啊', '我真的很无聊啊', '他们肯定太无聊了']
    # print(lst.index('我真的很伤心啊'))
    # print([i for i, x in enumerate(lst) if x == '我真的很伤心啊'])
    # for i in lst:
    #     s = SnowNLP(i)
    #     print(s.sentiments)


if __name__ == '__main__':
    main()
    # import matplotlib.pyplot as plt
    #
    # x = [1, 2, 3, 4, 5]
    # y1 = [6, 7, 8, 9, 10]
    # plt.plot(x, y1, marker='*', ms=10, color='y')
    # plt.show()
    # import matplotlib.pyplot as plt
    #
    # aaa = {'1209101643': 0.5427812816478225, '745500530': 0.5507599248949117, '1051617442':
    #        0.5530599685557689, '1045048548': 0.5499328897910526, '2642574488': 0.5446383498406289, '402376637':
    #        0.5407447368409908, '1764011453': 0.5369646476046138, '1251680944': 0.5354333781521334, '1219398106':
    #        0.5364054328018658, '599292912': 0.5364656951947343, '1913409863': 0.53484067081695, '309031584':
    #        0.5344930942030883, '545184326': 0.5341362432855117, '1329636023': 0.53325216105234, '710258421':
    #        0.5330333058797951, '673853621': 0.5326860825299103, '754315527': 0.5321570979667783, '1965083476':
    #        0.5316905993321258, '1141569440': 0.5316969969482886, '2275444386': 0.5306393358854631,
    #        'loyalnovakshayne@qq.com': 0.5310336921617058, '1115133277': 0.5309162264601218, '1715474300':
    #        0.5309849111992685, '122780300': 0.5309933866294124, '10000': 0.5284466697252308}
    # x = []
    # y = []
    # for k, v in aaa.items():
    #     print(v)
    #     x.append(k)
    #     y.append(v)
    # # Work around garbled CJK glyphs in matplotlib
    # plt.rcParams['font.sans-serif'] = ['simHei']
    # plt.rcParams['axes.unicode_minus'] = False
    # plt.title('重庆老乡群群友情感指数')
    # plt.ylim(0, 1)  # y-axis range
    # plt.ylabel('情感指数')  # y-axis label
    # plt.xlabel('用户')  # x-axis label
    # plt.bar(x=x, height=y, color='rgbycmk')
    # plt.xticks(range(25), x, rotation=90, fontsize=6)
    # plt.legend()
    # plt.show()
--------------------------------------------------------------------------------
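
main.py imports through the src package while config.py resolves its directories from os.path.abspath('..'), so the code appears to assume the project root is on sys.path and src/ is the working directory (the layout an IDE such as PyCharm sets up by default). A sketch of an equivalent launcher under those assumptions, as a hypothetical run.py placed in the project root:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os
import sys

root = os.path.dirname(os.path.abspath(__file__))  # assumed: this file sits in the project root
sys.path.insert(0, root)               # so `from src.analyse import *` resolves
os.chdir(os.path.join(root, 'src'))    # so os.path.abspath('..') in config.py is the root

from src.main import main

main()
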
/src/classify.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
@Author:lichunhui
@Time: 2018/1/17 20:19
@Description:
"""
import os

import jieba
import numpy as np
import pandas as pd
from sklearn.svm import SVC
from sklearn.externals import joblib
from gensim.models.word2vec import Word2Vec
from sklearn.model_selection import train_test_split


class SVMClassifer:
    @classmethod
    def load_file(cls):
        # neg = pd.read_excel('data/neg.xls', header=None, index=None)
        # pos = pd.read_excel('data/pos.xls', header=None, index=None)
        #
        # cw = lambda x: list(jieba.cut(x))
        # pos['words'] = pos[0].apply(cw)
        # neg['words'] = neg[0].apply(cw)
        #
        # # use 1 for positive sentiment, 0 for negative
        # y = np.concatenate((np.ones(len(pos)), np.zeros(len(neg))))
        # x_train, x_test, y_train, y_test = train_test_split(
        #     np.concatenate((pos['words'], neg['words'])), y, test_size=0.2)
        # np.save('svm_data/y_train.npy', y_train)
        # np.save('svm_data/y_test.npy', y_test)

        with open(os.path.dirname(__file__) + '/../dataset/pos.txt', 'r') as f:
            conts = f.readlines()
            pos_res = list()
            for cont in conts:
                pos_res.append(list(jieba.cut(cont)))

        with open(os.path.dirname(__file__) + '/../dataset/neg.txt', 'r') as f:
            conts = f.readlines()
            neg_res = list()
            for cont in conts:
                neg_res.append(list(jieba.cut(cont)))

        # use 1 for positive sentiment, 0 for negative
        y = np.concatenate((np.ones(len(pos_res)), np.zeros(len(neg_res))))
        x_train, x_test, y_train, y_test = train_test_split(
            np.concatenate((pos_res, neg_res)), y, test_size=0.2)
        np.save('svm_data/y_train.npy', y_train)
        np.save('svm_data/y_test.npy', y_test)

        return x_train, x_test

    # Average the vectors of all words in a sentence
    @classmethod
    def build_wordvector(cls, text, size, imdb_w2v):
        vec = np.zeros(size).reshape((1, size))
        count = 0.
        for word in text:
            try:
                vec += imdb_w2v[word].reshape((1, size))
                count += 1.
            except KeyError:
                continue
        if count != 0:
            vec /= count
        return vec

    # Compute and persist the sentence vectors
    @classmethod
    def save_train_vecs(cls, x_train, x_test):
        n_dim = 300
        # Initialize model and build vocab
        imdb_w2v = Word2Vec(size=n_dim, min_count=10)
        imdb_w2v.build_vocab(x_train)

        # Train the model over train_reviews (this may take several minutes)
        imdb_w2v.train(x_train, total_examples=imdb_w2v.corpus_count, epochs=imdb_w2v.iter)
        train_vecs = np.concatenate([cls.build_wordvector(z, n_dim, imdb_w2v) for z in x_train])
        # train_vecs = scale(train_vecs)

        np.save('svm_data/train_vecs.npy', train_vecs)

        # Train word2vec on test tweets
        imdb_w2v.train(x_test, total_examples=imdb_w2v.corpus_count, epochs=imdb_w2v.iter)
        imdb_w2v.save('svm_data/w2v_model/w2v_model.pkl')
        # Build test tweet vectors then scale
        test_vecs = np.concatenate([cls.build_wordvector(z, n_dim, imdb_w2v) for z in x_test])
        # test_vecs = scale(test_vecs)
        np.save('svm_data/test_vecs.npy', test_vecs)

    @classmethod
    def get_data(cls):
        train_vecs = np.load('svm_data/train_vecs.npy')
        y_train = np.load('svm_data/y_train.npy')
        test_vecs = np.load('svm_data/test_vecs.npy')
        y_test = np.load('svm_data/y_test.npy')
        return train_vecs, y_train, test_vecs, y_test

    # Train the SVM model
    @classmethod
    def train(cls):
        x_train, x_test = cls.load_file()
        cls.save_train_vecs(x_train, x_test)
        train_vecs, y_train, test_vecs, y_test = cls.get_data()
        clf = SVC(kernel='rbf', verbose=True)
        clf.fit(train_vecs, y_train)
        # Persist to the same path that predict() loads from
        joblib.dump(clf, 'svm_data/svm_model/model.pkl')
        print(clf.score(test_vecs, y_test))

    # Build the vector of a single sentence to be classified
    @classmethod
    def get_predict_vecs(cls, words):
        n_dim = 300
        imdb_w2v = Word2Vec.load('svm_data/w2v_model/w2v_model.pkl')
        # imdb_w2v.train(words)
        train_vecs = cls.build_wordvector(words, n_dim, imdb_w2v)
        # print train_vecs.shape
        return train_vecs

    # Sentiment judgement for a single sentence
    @classmethod
    def predict(cls, string):
        words = jieba.lcut(string)
        words_vecs = cls.get_predict_vecs(words)
        clf = joblib.load('svm_data/svm_model/model.pkl')
        result = clf.predict(words_vecs)
        return result[0]
--------------------------------------------------------------------------------
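
A minimal sketch of the intended train-then-predict workflow. It assumes dataset/pos.txt, dataset/neg.txt and a writable svm_data/ tree exist relative to the working directory (none of them ship with the repository), and the test sentence is made up:

from src.classify import SVMClassifer

# One-off training pass: fits word2vec and the SVM, persisting both under svm_data/.
SVMClassifer.train()

# Afterwards single sentences can be scored: 1.0 = positive, 0.0 = negative.
print(SVMClassifer.predict('这家火锅味道真不错'))
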
/test/ch_stopwords.txt:
--------------------------------------------------------------------------------
1 | $ 2 | 0 3 | 1 4 | 2 5 | 3 6 | 4 7 | 5 8 | 6 9 | 7 10 | 8 11 | 9 12 | ? 13 | _ 14 | “ 15 | ” 16 | 、 17 | 。 18 | 《 19 | 》 20 | ( 21 | ) 22 | ⌒ 23 | ~ 24 | ⊙ 25 | .
26 | ~ 27 | → 28 | 一 29 | 一些 30 | 一何 31 | 一切 32 | 一则 33 | 一方面 34 | 一旦 35 | 一来 36 | 一样 37 | 一般 38 | 一转眼 39 | 万一 40 | 上 41 | 上下 42 | 下 43 | 不 44 | 不仅 45 | 不但 46 | 不光 47 | 不单 48 | 不只 49 | 不外乎 50 | 不如 51 | 不妨 52 | 不尽 53 | 不尽然 54 | 不得 55 | 不怕 56 | 不惟 57 | 不成 58 | 不拘 59 | 不料 60 | 不是 61 | 不比 62 | 不然 63 | 不特 64 | 不独 65 | 不管 66 | 不至于 67 | 不若 68 | 不论 69 | 不过 70 | 不问 71 | 与 72 | 与其 73 | 与其说 74 | 与否 75 | 与此同时 76 | 且 77 | 且不说 78 | 且说 79 | 两者 80 | 个 81 | 个别 82 | 临 83 | 为 84 | 为了 85 | 为什么 86 | 为何 87 | 为止 88 | 为此 89 | 为着 90 | 乃 91 | 乃至 92 | 乃至于 93 | 么 94 | 之 95 | 之一 96 | 之所以 97 | 之类 98 | 乌乎 99 | 乎 100 | 乘 101 | 也 102 | 也好 103 | 也罢 104 | 了 105 | 二来 106 | 于 107 | 于是 108 | 于是乎 109 | 云云 110 | 云尔 111 | 些 112 | 亦 113 | 人 114 | 人们 115 | 人家 116 | 什么 117 | 什么样 118 | 今 119 | 介于 120 | 仍 121 | 仍旧 122 | 从 123 | 从此 124 | 从而 125 | 他 126 | 他人 127 | 他们 128 | 以 129 | 以上 130 | 以为 131 | 以便 132 | 以免 133 | 以及 134 | 以故 135 | 以期 136 | 以来 137 | 以至 138 | 以至于 139 | 以致 140 | 们 141 | 任 142 | 任何 143 | 任凭 144 | 似的 145 | 但 146 | 但凡 147 | 但是 148 | 何 149 | 何以 150 | 何况 151 | 何处 152 | 何时 153 | 余外 154 | 作为 155 | 你 156 | 你们 157 | 使 158 | 使得 159 | 例如 160 | 依 161 | 依据 162 | 依照 163 | 便于 164 | 俺 165 | 俺们 166 | 倘 167 | 倘使 168 | 倘或 169 | 倘然 170 | 倘若 171 | 借 172 | 假使 173 | 假如 174 | 假若 175 | 傥然 176 | 像 177 | 儿 178 | 先不先 179 | 光是 180 | 全体 181 | 全部 182 | 兮 183 | 关于 184 | 其 185 | 其一 186 | 其中 187 | 其二 188 | 其他 189 | 其余 190 | 其它 191 | 其次 192 | 具体地说 193 | 具体说来 194 | 兼之 195 | 内 196 | 再 197 | 再其次 198 | 再则 199 | 再有 200 | 再者 201 | 再者说 202 | 再说 203 | 冒 204 | 冲 205 | 况且 206 | 几 207 | 几时 208 | 凡 209 | 凡是 210 | 凭 211 | 凭借 212 | 出于 213 | 出来 214 | 分别 215 | 则 216 | 则甚 217 | 别 218 | 别人 219 | 别处 220 | 别是 221 | 别的 222 | 别管 223 | 别说 224 | 到 225 | 前后 226 | 前此 227 | 前者 228 | 加之 229 | 加以 230 | 即 231 | 即令 232 | 即使 233 | 即便 234 | 即如 235 | 即或 236 | 即若 237 | 却 238 | 去 239 | 又 240 | 又及 241 | 及 242 | 及其 243 | 及至 244 | 反之 245 | 反而 246 | 反过来 247 | 反过来说 248 | 受到 249 | 另 250 | 另一方面 251 | 另外 252 | 另悉 253 | 只 254 | 只当 255 | 只怕 256 | 只是 257 | 只有 258 | 只消 259 | 只要 260 | 只限 261 | 叫 262 | 叮咚 263 | 可 264 | 可以 265 | 可是 266 | 可见 267 | 各 268 | 各个 269 | 各位 270 | 各种 271 | 各自 272 | 同 273 | 同时 274 | 后 275 | 后者 276 | 向 277 | 向使 278 | 向着 279 | 吓 280 | 吗 281 | 否则 282 | 吧 283 | 吧哒 284 | 吱 285 | 呀 286 | 呃 287 | 呕 288 | 呗 289 | 呜 290 | 呜呼 291 | 呢 292 | 呵 293 | 呵呵 294 | 呸 295 | 呼哧 296 | 咋 297 | 和 298 | 咚 299 | 咦 300 | 咧 301 | 咱 302 | 咱们 303 | 咳 304 | 哇 305 | 哈 306 | 哈哈 307 | 哉 308 | 哎 309 | 哎呀 310 | 哎哟 311 | 哗 312 | 哟 313 | 哦 314 | 哩 315 | 哪 316 | 哪个 317 | 哪些 318 | 哪儿 319 | 哪天 320 | 哪年 321 | 哪怕 322 | 哪样 323 | 哪边 324 | 哪里 325 | 哼 326 | 哼唷 327 | 唉 328 | 唯有 329 | 啊 330 | 啐 331 | 啥 332 | 啦 333 | 啪达 334 | 啷当 335 | 喂 336 | 喏 337 | 喔唷 338 | 喽 339 | 嗡 340 | 嗡嗡 341 | 嗬 342 | 嗯 343 | 嗳 344 | 嘎 345 | 嘎登 346 | 嘘 347 | 嘛 348 | 嘻 349 | 嘿 350 | 嘿嘿 351 | 因 352 | 因为 353 | 因了 354 | 因此 355 | 因着 356 | 因而 357 | 固然 358 | 在 359 | 在下 360 | 在于 361 | 地 362 | 基于 363 | 处在 364 | 多 365 | 多么 366 | 多少 367 | 大 368 | 大家 369 | 她 370 | 她们 371 | 好 372 | 如 373 | 如上 374 | 如上所述 375 | 如下 376 | 如何 377 | 如其 378 | 如同 379 | 如是 380 | 如果 381 | 如此 382 | 如若 383 | 始而 384 | 孰料 385 | 孰知 386 | 宁 387 | 宁可 388 | 宁愿 389 | 宁肯 390 | 它 391 | 它们 392 | 对 393 | 对于 394 | 对待 395 | 对方 396 | 对比 397 | 将 398 | 小 399 | 尔 400 | 尔后 401 | 尔尔 402 | 尚且 403 | 就 404 | 就是 405 | 就是了 406 | 就是说 407 | 就算 408 | 就要 409 | 尽 410 | 尽管 411 | 尽管如此 412 | 岂但 413 | 己 414 | 已 415 | 已矣 416 | 巴 417 | 巴巴 418 | 并 419 | 并且 420 | 并非 421 | 庶乎 422 | 庶几 423 | 开外 424 | 开始 425 | 归 426 | 归齐 427 | 当 428 | 当地 429 | 当然 430 | 当着 431 | 彼 432 | 彼时 433 | 彼此 434 | 往 435 | 待 436 | 很 437 | 得 438 | 得了 439 | 怎 440 | 
怎么 441 | 怎么办 442 | 怎么样 443 | 怎奈 444 | 怎样 445 | 总之 446 | 总的来看 447 | 总的来说 448 | 总的说来 449 | 总而言之 450 | 恰恰相反 451 | 您 452 | 惟其 453 | 慢说 454 | 我 455 | 我们 456 | 或 457 | 或则 458 | 或是 459 | 或曰 460 | 或者 461 | 截至 462 | 所 463 | 所以 464 | 所在 465 | 所幸 466 | 所有 467 | 才 468 | 才能 469 | 打 470 | 打从 471 | 把 472 | 抑或 473 | 拿 474 | 按 475 | 按照 476 | 换句话说 477 | 换言之 478 | 据 479 | 据此 480 | 接着 481 | 故 482 | 故此 483 | 故而 484 | 旁人 485 | 无 486 | 无宁 487 | 无论 488 | 既 489 | 既往 490 | 既是 491 | 既然 492 | 时候 493 | 是 494 | 是以 495 | 是的 496 | 曾 497 | 替 498 | 替代 499 | 最 500 | 有 501 | 有些 502 | 有关 503 | 有及 504 | 有时 505 | 有的 506 | 望 507 | 朝 508 | 朝着 509 | 本 510 | 本人 511 | 本地 512 | 本着 513 | 本身 514 | 来 515 | 来着 516 | 来自 517 | 来说 518 | 极了 519 | 果然 520 | 果真 521 | 某 522 | 某个 523 | 某些 524 | 某某 525 | 根据 526 | 欤 527 | 正值 528 | 正如 529 | 正巧 530 | 正是 531 | 此 532 | 此地 533 | 此处 534 | 此外 535 | 此时 536 | 此次 537 | 此间 538 | 毋宁 539 | 每 540 | 每当 541 | 比 542 | 比及 543 | 比如 544 | 比方 545 | 没奈何 546 | 沿 547 | 沿着 548 | 漫说 549 | 焉 550 | 然则 551 | 然后 552 | 然而 553 | 照 554 | 照着 555 | 犹且 556 | 犹自 557 | 甚且 558 | 甚么 559 | 甚或 560 | 甚而 561 | 甚至 562 | 甚至于 563 | 用 564 | 用来 565 | 由 566 | 由于 567 | 由是 568 | 由此 569 | 由此可见 570 | 的 571 | 的确 572 | 的话 573 | 直到 574 | 相对而言 575 | 省得 576 | 看 577 | 眨眼 578 | 着 579 | 着呢 580 | 矣 581 | 矣乎 582 | 矣哉 583 | 离 584 | 竟而 585 | 第 586 | 等 587 | 等到 588 | 等等 589 | 简言之 590 | 管 591 | 类如 592 | 紧接着 593 | 纵 594 | 纵令 595 | 纵使 596 | 纵然 597 | 经 598 | 经过 599 | 结果 600 | 给 601 | 继之 602 | 继后 603 | 继而 604 | 综上所述 605 | 罢了 606 | 者 607 | 而 608 | 而且 609 | 而况 610 | 而后 611 | 而外 612 | 而已 613 | 而是 614 | 而言 615 | 能 616 | 能否 617 | 腾 618 | 自 619 | 自个儿 620 | 自从 621 | 自各儿 622 | 自后 623 | 自家 624 | 自己 625 | 自打 626 | 自身 627 | 至 628 | 至于 629 | 至今 630 | 至若 631 | 致 632 | 般的 633 | 若 634 | 若夫 635 | 若是 636 | 若果 637 | 若非 638 | 莫不然 639 | 莫如 640 | 莫若 641 | 虽 642 | 虽则 643 | 虽然 644 | 虽说 645 | 被 646 | 要 647 | 要不 648 | 要不是 649 | 要不然 650 | 要么 651 | 要是 652 | 譬喻 653 | 譬如 654 | 让 655 | 许多 656 | 论 657 | 设使 658 | 设或 659 | 设若 660 | 诚如 661 | 诚然 662 | 该 663 | 说来 664 | 诸 665 | 诸位 666 | 诸如 667 | 谁 668 | 谁人 669 | 谁料 670 | 谁知 671 | 贼死 672 | 赖以 673 | 赶 674 | 起 675 | 起见 676 | 趁 677 | 趁着 678 | 越是 679 | 距 680 | 跟 681 | 较 682 | 较之 683 | 边 684 | 过 685 | 还 686 | 还是 687 | 还有 688 | 还要 689 | 这 690 | 这一来 691 | 这个 692 | 这么 693 | 这么些 694 | 这么样 695 | 这么点儿 696 | 这些 697 | 这会儿 698 | 这儿 699 | 这就是说 700 | 这时 701 | 这样 702 | 这次 703 | 这般 704 | 这边 705 | 这里 706 | 进而 707 | 连 708 | 连同 709 | 逐步 710 | 通过 711 | 遵循 712 | 遵照 713 | 那 714 | 那个 715 | 那么 716 | 那么些 717 | 那么样 718 | 那些 719 | 那会儿 720 | 那儿 721 | 那时 722 | 那样 723 | 那般 724 | 那边 725 | 那里 726 | 都 727 | 鄙人 728 | 鉴于 729 | 针对 730 | 阿 731 | 除 732 | 除了 733 | 除外 734 | 除开 735 | 除此之外 736 | 除非 737 | 随 738 | 随后 739 | 随时 740 | 随着 741 | 难道说 742 | 非但 743 | 非徒 744 | 非特 745 | 非独 746 | 靠 747 | 顺 748 | 顺着 749 | 首先 750 | 查看 751 | 一个 752 | 753 | ! 754 | , 755 | : 756 | ; 757 | ? 758 | ? 759 | @ 760 | - 761 | / 762 | [ 763 | ] 764 | ( 765 | ) 766 | * 767 | & 768 | ^ 769 | % 770 | # 771 | ` 772 | · 773 | { 774 | } 775 | | 776 | < 777 | > 778 | = 779 | —— 780 | 【 781 | 】 782 | 请使用手机QQ进行查看 783 | 请使用新版手机QQ查收 -------------------------------------------------------------------------------- /src/analyse.py: -------------------------------------------------------------------------------- 1 | #! 
/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
@Author:lichunhui
@Time: 2018/1/13 22:59
@Description:
"""
import os
import re
import jieba
import jieba.analyse
import matplotlib.pyplot as plt
from wordcloud import WordCloud
from collections import Counter
from src.config import *
from scipy.misc import imread
from src.mredis import MyRedis
from snownlp import SnowNLP
from multiprocessing import Pool, Manager
from src.classify import SVMClassifer

entity = []    # (date, qq, content) triples
times = []     # date-time stamp of every message
hours = []     # hour of day of every message
qqs = []       # QQ number of every message's sender
contents = []  # body of every message


# Read the exported txt file and pull out the wanted fields with regexes
def read_data():
    """
    Extract the fields we care about from an exported QQ group chat log:
    timestamp, member and message content.
    :return: Nothing
    """
    fn = os.path.join(material, 'records.txt')
    with open(fn, 'r', encoding='utf-8', errors='ignore') as reader:
        txt = reader.read()
    re_pat = r'20[\d-]{8}\s[\d:]{7,8}\s+[^\n]+(?:\d{5,11}|@\w+\.[comnetcn]{2,3})\)'  # matches a record header
    log_title_arr = re.findall(re_pat, txt)  # record headers, e.g. ['2016-06-24 15:42:52 张某(40**21)', …]
    log_content_arr = re.split(re_pat, txt)[1:]  # record bodies, e.g. ['\n', '\n选修的\n\n', '\n就怕这次…]

    print(len(log_title_arr), '---', len(log_content_arr))
    for i in range(len(log_title_arr)):
        date = re.search(r'20[\d-]{8}\s[\d:]{7,8}', log_title_arr[i]).group()  # timestamp in the record header
        qq = re.search(r'(?<=\()[^\)]+', log_title_arr[i]).group()  # QQ number in the record header
        content = log_content_arr[i].strip('\n')  # message body
        hour = re.search(r'(?<=\s)[^\:]+', date).group()  # hour of day
        times.append(date)        # every timestamp
        hours.append(hour)        # every hour of day
        qqs.append(qq)            # every QQ number
        contents.append(content)  # every message body
        entity.append((date, qq, content))
    print(len(times), '---', len(contents))
    '''
    rdis = MyRedis.get_redis_instance()  # grab a redis instance
    rdis.set('entity', entity)
    rdis.set('times', times)
    rdis.set('qqs', qqs)
    rdis.set('contents', contents)
    '''
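
A quick illustration of what the header regex in read_data() matches, run against a made-up two-line record (the name and QQ number are invented):

import re

sample = ('2016-06-24 15:42:52 张某(40211234)\n'
          '选修的\n')
re_pat = r'20[\d-]{8}\s[\d:]{7,8}\s+[^\n]+(?:\d{5,11}|@\w+\.[comnetcn]{2,3})\)'
print(re.findall(re_pat, sample))    # ['2016-06-24 15:42:52 张某(40211234)']
print(re.split(re_pat, sample)[1:])  # ['\n选修的\n']
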
# Analyse the chat log
def analyse():
    """
    Analyse the records: total number of participants, recent activity, the
    most active members, hot topics, per-user sentiment, and so on.
    :return: Nothing
    """
    counter_res = Counter(qqs)  # how often each member spoke
    most_active_person = dict(counter_res.most_common(25))  # the 25 most active members
    total_chat_person = len(counter_res)  # total number of people who spoke
    print(most_active_person)
    print(total_chat_person)
    draw(most_active_person, 'most_active_person.jpg', 'chongqing.jpg')
    # Tokenize the messages with jieba
    with open(os.path.join(material, 'ch_stopwords.txt'), 'r', encoding='utf-8', errors='ignore') as reader:
        lines = reader.readlines()
        stopwords = [sw.strip('\n') for sw in lines]
        stopwords.append(' ')
        stopwords.append('\n')
    # print('stopwords =====', stopwords)
    res_list = []
    usr_dict = os.path.join(material, 'jieba_usr_dict.txt')  # optional user dictionary (not shipped with the repo)
    if os.path.exists(usr_dict):
        jieba.load_userdict(usr_dict)
    for content in contents:
        seg_list = jieba.cut(content, cut_all=False)
        seg_list = [sl for sl in seg_list if sl not in stopwords]
        if len(seg_list) != 0:
            res_list.extend(seg_list)
    # The most frequent tokens
    content_counter = Counter(res_list)
    most_hot_topic = dict(content_counter.most_common(600))  # the most-discussed topics
    print(most_hot_topic)
    draw(most_hot_topic, 'most_hot_word.jpg', 'longmao.jpg')

    # Keyword extraction based on the TextRank algorithm
    sentence = ''.join(res_list)
    # print(sentence, '===')
    text_rank = jieba.analyse.textrank(sentence, topK=150, withWeight=False,
                                       allowPOS=('ns', 'n', 'vn', 'v', 'a'))  # top 150 keywords ('a' is jieba's adjective tag)
    print(text_rank)
    draw(text_rank, 'most_hot_topic_tr.jpg', 'china.jpg')

    # Find the hours of the day in which the group is most active
    hours_counter_res = Counter(hours)
    hours_counter = dict([(int(k), v) for k, v in hours_counter_res.most_common(24)])
    x1 = []
    y1 = []
    for i in range(24):
        x1.append(i)
        y1.append(hours_counter.get(i, 0))
    print(hours_counter)
    # Work around garbled CJK glyphs in matplotlib
    plt.rcParams['font.sans-serif'] = ['simHei']
    plt.rcParams['axes.unicode_minus'] = False
    plt.title('重庆老乡群群友活跃分布')
    plt.ylabel('记录条数')  # y-axis label
    plt.xlabel('时刻')  # x-axis label
    plt.xticks(range(24), x1)
    plt.plot(x1, y1, marker='*', ms=10, color='y')  # line chart
    plt.show()
    # draw(hours_counter, 'most_active_hour.jpg', 'huoguo.jpg')

    # Analyse each user's sentiment, as a rough proxy for temperament
    pool = Pool(4)
    manager = Manager()
    que = manager.Queue()
    user_sentiment = {}
    for qq in most_active_person:
        txt = []
        person_index = [i for i, x in enumerate(qqs) if x == qq]  # indices of this member's messages
        for i in person_index:
            txt.append(contents[i])
        pool.apply_async(emotion_analyse, (que, txt, qq))
    pool.close()
    pool.join()
    print('All subprocess done')
    x = []
    y = []
    while not que.empty():
        print('this is while')
        qq, sentiment_index = que.get(True)
        user_sentiment[qq] = sentiment_index
        x.append(qq)
        y.append(sentiment_index)
    print(user_sentiment)
    print(x)
    print(y)
    # Work around garbled CJK glyphs in matplotlib
    plt.rcParams['font.sans-serif'] = ['simHei']
    plt.rcParams['axes.unicode_minus'] = False
    plt.title('重庆老乡群群友情感指数')
    plt.ylim(0, 1)  # y-axis range
    plt.ylabel('情感指数')  # y-axis label
    plt.xlabel('用户')  # x-axis label
    plt.bar(x=x, height=y, color='rgbycmk')
    plt.xticks(range(25), x, rotation=90, fontsize=6)
    plt.show()
    """

    Half year: 2017/07 -- 2018/01 {'1209101643': 0.5427812816478225, '745500530': 0.5507599248949117,
    '1051617442': 0.5530599685557689, '1045048548': 0.5499328897910526, '2642574488': 0.5446383498406289,
    '402376637': 0.5407447368409908, '1764011453': 0.5369646476046138, '1251680944': 0.5354333781521334,
    '1219398106': 0.5364054328018658, '599292912': 0.5364656951947343, '1913409863': 0.53484067081695,
    '309031584': 0.5344930942030883, '545184326': 0.5341362432855117, '1329636023': 0.53325216105234,
    '710258421': 0.5330333058797951, '673853621': 0.5326860825299103, '754315527': 0.5321570979667783,
    '1965083476': 0.5316905993321258, '1141569440': 0.5316969969482886, '2275444386': 0.5306393358854631,
    'loyalnovakshayne@qq.com': 0.5310336921617058, '1115133277': 0.5309162264601218, '1715474300': 0.5309849111992685,
    '122780300': 0.5309933866294124, '10000': 0.5284466697252308, '1227832488': 0.5283585964515772,
    '1316141782': 0.5285967890143103, '1344754848': 0.5286674077843782, '1241927211': 0.528424292537605,
    '1278678748': 0.5280859518868842, '1196042086': 0.5280284458864, '784319656': 0.5284060134894644,
    '452335534': 0.5286984475623131, '2914707056': 0.5287701425196093, '1278109751': 0.5290585856871963,
    '837125065': 0.5291170420305971, '1064278183': 0.5291016493609904, '798955866': 0.5294630944685925,
    '1085578433': 0.52930914248514, '422938365': 0.5295712616540428, '1457556035': 0.5296832347357361,
    '1129344867': 0.5296765800798483, '1822118385': 0.5297533989900461, '2422197668': 0.5297322708242689,
    '1061275041': 0.5298537037626699, '393989256': 0.5297992500496315, '1105081309': 0.5299694494350515,
    '747544291': 0.5301582266477316, '357190318': 0.5301595769523562, '121289126': 0.5302845157303446,
    '1095736889': 0.5303553062869905, '1272082503': 0.5305090612032322, '309941018': 0.5304935382513758,
    '251604829': 0.5305276413428799, '876705832': 0.5307866870732014, '1351461971': 0.5308601636552932,
    '363374239': 0.5308308896578008, '328054694': 0.5307810877049866, '504133892': 0.5307463796609506,
    '641097301': 0.5308282628491252, '506640120': 0.5308740152452233, '812749634': 0.5309081934474766,
    '961357326': 0.5310443894354496, '568588075': 0.5311020621084173, '904349494': 0.5311619995291751,
    '1098224615': 0.5311377240414275, '865845406': 0.5311768177693404, '750479251': 0.531187383565594,
    '1030201081': 0.5311473874833691, '769031687': 0.5309991086644246, '1006392533': 0.5309762045576085,
    '1009132098': 0.531084708728667, '877697004': 0.5310824284067153, '835194461': 0.5310894638634346,
    '815263592': 0.5311570717128403, '836483716': 0.5312497820063115, '295083887': 0.531316229230391,
    '986048812': 0.5312283218489628, '1732376703': 0.53123102278883, '1176781811': 0.531317643774081,
    '1146751230': 0.5313541708299178, '464870515': 0.5312882048083162, '591985453': 0.5313028049854518,
    '1503484527': 0.5313465383938655, '846522909': 0.5313583909940992, '244665984': 0.5313870427657331,
    '761631604': 0.5313994868892827, '1647036905': 0.5313838526108119, '2280737669': 0.5314268451643539,
    '505420996': 0.5315191037434139, '296585171': 0.5314684837901023, '1985059595': 0.5315002259629534,
    '3038798137': 0.5315600243751035, '1478620363': 0.5315204256725709, '1194906693': 0.5315078156451286,
    '849350590': 0.5315612858290759, '1069045056': 0.5315969839355288, '1195912410': 0.5316059291585217,
    '1172596168': 0.5315992909922406, '774110694': 0.5316443354015329}

    Full year: 2017/01 -- 2018/01

    """
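
The fan-out pattern used in analyse() above (apply_async workers writing into a Manager queue, which is drained only after close() and join()) in isolation, with a trivial worker standing in for emotion_analyse; the QQ numbers here are made up:

from multiprocessing import Pool, Manager


def worker(que, qq, n):
    # Stand-in for emotion_analyse: push one (key, result) pair onto the queue.
    que.put((qq, n * n), True)


if __name__ == '__main__':
    pool = Pool(4)
    manager = Manager()
    que = manager.Queue()
    for i, qq in enumerate(['10001', '10002', '10003']):
        pool.apply_async(worker, (que, qq, i))
    pool.close()
    pool.join()
    results = {}
    while not que.empty():
        qq, val = que.get(True)
        results[qq] = val
    print(results)  # {'10001': 0, '10002': 1, '10003': 4}, in some order
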
# Sentiment analysis for a single QQ user
def emotion_analyse(que, txt_lst=None, qq=None):
    """
    Score one user's messages with SnowNLP, as a rough hint at the user's
    temperament and habits.
    :param que: queue the result is pushed onto
    :param txt_lst: the messages to analyse
    :param qq: the QQ number of the user being analysed
    :return: Nothing; the (qq, average sentiment score) pair is put on the queue
    """
    if txt_lst is None or qq is None:
        return
    print('analyse %s emotions and pid is %d and len is %d' % (qq, os.getpid(), len(txt_lst)))
    sentiment_scores = 0.0
    for txt in txt_lst:
        if len(txt) == 0:
            txt = ' '
        s = SnowNLP(txt)
        sentiment_scores += s.sentiments
    # print(sentiment_scores, '<<<=====', qq)
    scores = sentiment_scores / len(txt_lst)
    print('task %s done! and score is %.5f and len is %d' % (qq, scores, len(txt_lst)))
    res = (qq, scores)
    que.put(res, True)
    # return sentiment_scores / len(txt_lst)
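
For context, SnowNLP's sentiments attribute is the estimated probability that a sentence is positive, so scores near 1 read as upbeat and scores near 0 as negative. A quick check, reusing two sentences from the commented-out experiment in main.py:

from snownlp import SnowNLP

print(SnowNLP('今天天气不错,让我心情舒畅').sentiments)  # should land close to 1 (positive)
print(SnowNLP('我真的很伤心啊').sentiments)            # should land closer to 0 (negative)
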
# Render the statistics
def draw(counter_res, file_name, mask_name):
    """
    Show the result as a word cloud.
    :param counter_res: dict or list
    :param file_name: the output file name
    :param mask_name: mask picture file name
    :return: Nothing
    """
    mask = imread(os.path.join(material, mask_name))
    wc = WordCloud(font_path=font_path, width=1046, height=1066, background_color="white",
                   relative_scaling=.6, max_words=1000, mask=mask, max_font_size=100)
    if type(counter_res) is dict:
        wc.fit_words(counter_res)  # font size tracks word frequency
    elif type(counter_res) is list:
        wc.generate(' '.join(counter_res))
    wc.to_file(os.path.join(output_path, file_name))  # export the word cloud to a file
    '''
    # draw with matplotlib instead
    plt.figure()
    plt.imshow(wc)
    plt.axis("off")
    plt.show()
    '''
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof.
47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | --------------------------------------------------------------------------------