├── Response.txt
├── Sessions.txt
├── .idea
│   ├── scopes
│   │   └── scope_settings.xml
│   ├── encodings.xml
│   ├── vcs.xml
│   ├── misc.xml
│   ├── modules.xml
│   ├── crawl_wechat.iml
│   └── workspace.xml
├── readme.md
├── prepare_request.py
├── new_crawl_wechat.py
└── New_Response.txt

/Response.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/songluyi/crawl_wechat/HEAD/Response.txt

--------------------------------------------------------------------------------
/Sessions.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/songluyi/crawl_wechat/HEAD/Sessions.txt

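--------------------------------------------------------------------------------
/readme.md:
--------------------------------------------------------------------------------
# WeChat Official Account Batch Crawler
## Requirements
1. Python 3.5 is required. Running under 2.7 triggers a few small bugs, and I currently don't have the time to port it to 2.7.

2. As for libraries, everything I use is standard fare. If you don't even have requests or ftplib, there isn't much I can say. pymysql works well for connecting to MySQL.

## Q&A

1. I need to collect data from ten to twenty thousand official accounts. Can your program handle that?
**A: Yes. For larger volumes, consider running the desktop version of WeChat in a virtual machine and using a script to simulate clicks in bulk to obtain keys. If there is strong demand, I will consider rewriting the FTP retrieval method.**

2. How does this program fetch official-account articles in bulk?
**A: Fiddler captures the requests and responses that the desktop WeChat client makes when visiting an official account. The program then parses the msglist in the response, extracts each article's link, description, publish time, account name and so on, and bulk-inserts them into a database. Remember to change the database settings to your own. Note that the key scheme has changed to one key per official account, and a key is valid for 30 minutes, so I don't see much point in using the key to fetch article history. You can of course still build the request from the key yourself: fetching the history list only requires frommsgid and count. I won't spell out how I assemble it here.**

3. How do I use this program to get the result I want?
**A: This program is only a demo. Running new_crawl_wechat converts response.txt into new_response.txt. The response file is a plain-text dump of the traffic that Fiddler captured. How to configure Fiddler to capture WeChat traffic, and how to modify its rules so that it writes the response.txt you need, is all covered on my blog.**
[Fiddler tutorial](http://www.songluyi.com/%E5%88%A9%E7%94%A8fiddler-%E6%88%AA%E8%8E%B7%E5%BE%AE%E4%BF%A1%E4%BC%A0%E8%BE%93%E6%95%B0%E6%8D%AE-%EF%BC%88%E6%96%B9%E4%BE%BF%E6%8A%93%E5%8F%96%E5%85%AC%E4%BC%97%E5%8F%B7%E4%BF%A1%E6%81%AF%EF%BC%89/)
4. I think your approach is dumb and not Pythonic at all. How do I contact you to tell you so?
**A: See the blog link on the left, or use my Gmail address. Thanks. (Please bring your own superior method and code.)**

5. Is this lousy program reliable? Has anyone forked it and got it working?
**A: It has been tested successfully by others and is already running in production at a company. It is adequate for lab or small-scale use.**

## Files
1. prepare_request.py is the earlier version. Back then a key was not tied to a single official account, so one key could fetch the article lists of many accounts. We all know how that turned out. You can still borrow its key-assembly logic or other ideas.
2. new_crawl_wechat.py is the latest version. Run it and you will see immediately what it does: it parses the text dump of the responses to mp.weixin.qq.com requests captured by Fiddler 4, extracts the content, and stores it in MySQL.
3. The other txt files were captured by Fiddler. New_Response.txt is generated by my program; I have documented why it works this way.

## Technical depth
1. Well, apart from being a pain, crawling WeChat involves no real technical depth.

## Changelog
1. Version 1.1: fixed a bug where articles could not be retrieved when several pages of history were pulled.

## Planned updates
1. During crawling, multiprocessing sometimes causes WeChat to raise connection_error. I plan to write a batch-proxy helper to make switching IPs easy.

2. If you have any questions, open an issue or email me directly. If I see it, I will help with your problem.

--------------------------------------------------------------------------------
/prepare_request.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
# 2016/9/5 10:36
"""
-------------------------------------------------------------------------------
Function:   Parse the WeChat request messages captured by Fiddler
Version:    1.0
Author:     SLY
Contact:    slysly759@gmail.com

-------------------------------------------------------------------------------
"""
import ftplib
import string
import time
from collections import deque

import requests


class Get_Wechat(object):
    def __init__(self):
        self.time = time.strftime("%m_%d_%H_%M_%S", time.localtime())
        # Filled in by set_config(). Parsing is deferred because
        # New_Session.txt only exists after Change_Ftp_Txt() has run.
        self.header = {}
        self.row_request = {}

    def get_data_from_ftp(self):
        """Download the captured session dump from an FTP server."""
        ftp = ftplib.FTP()
        ftp.connect()  # supply your FTP host (and port) here
        ftp.login()    # supply your user name and password here
        print(ftp.pwd())  # pwd() reports the current directory; cwd() needs a path
        # Note: this saves 'Session.txt', while Change_Ftp_Txt() reads
        # 'Sessions.txt'. Adjust one of the names so that they match.
        DownLocalFilename = "Session.txt"
        DownRoteFilename = "Session.txt"
        with open(DownLocalFilename, 'wb') as f:
            ftp.retrbinary('RETR ' + DownRoteFilename, f.write, 1024)
        ftp.close()

    def Change_Ftp_Txt(self):
        """The dump arrives as a byte stream that cannot be parsed as a
        string directly, so rewrite it as clean, printable text."""
        with open('Sessions.txt', 'r', encoding='gbk', errors='ignore') as f, \
                open('New_Session.txt', 'wb') as file_write_new:
            for i in f:
                # Remove leftover b'...' wrappers and quote characters.
                i = i.replace("b'", '').replace("'", '')
                # Keep printable ASCII only.
                hope = ''.join(ch for ch in i if ch in string.printable)
                # Skip comment lines and blank lines.
                if hope.startswith('#') or not hope.split():
                    continue
                file_write_new.write(hope.encode('utf-8'))

    def set_config(self):
        """Parse the last 18 lines of the cleaned dump into request headers
        and query-string parameters."""
        with open('New_Session.txt', 'r') as f:
            data = deque(f, 18)
        header = {}
        final_request = {}
        form_request = []
        for i in data:
            print(i)
            if ':' in i:
                # Split on the first colon only, so values that contain
                # colons themselves are kept whole.
                name, value = i.split(':', 1)
                header[name] = value.strip()
            else:
                form_request.append(i)
        if 'CONNECT mp.weixin.qq.com' in header and 'Request header' in header and 'Request body' in header:
            # Drop Fiddler's bookkeeping entries. The original used map(),
            # which is lazy in Python 3 and therefore did nothing unless
            # wrapped in list(); a plain loop avoids that trap.
            for key in ('CONNECT mp.weixin.qq.com', 'Request header', 'Request body',
                        'Request url', 'Proxy-Connection'):
                header.pop(key, None)
        # The first non-header line is the request line: "GET ...?a=b&c=d".
        query = form_request[0].replace('GET ', '').split('?', 1)[1]
        for j in query.split('&'):
            # Split on the first '=' only, so base64 padding such as '=='
            # at the end of a value is preserved.
            name, value = j.split('=', 1)
            final_request[name] = value.strip()
        self.header = header
        self.row_request = final_request
        return [header, final_request]

    def start_request_test(self):
        header = self.header
        print(header)
        url = 'http://mp.weixin.qq.com/mp/getmasssendmsg?__biz=MzA3ODA5NjgyOA==&uin=MjM3ODE4ODcxMg%3D%3D&key=7b81aac53bd2393dd33aa07750d5189e6ad025e93260f9226ce99f1123681c3e1a6521425204ed0293c7ff7e0e46d9b30805712ed1b1ac86&devicetype=Windows+7&version=62020025&lang=zh_CN&ascene=7&pass_ticket=tqneZzemQw0OsH5VSC1z2VTlN0A8OO4eU0VgGGMf6%2BLyPYb8ZdDef%2FX9mWb2gerS&wx_header=1'
        # Pass the headers by keyword: the second positional argument of
        # requests.get() is params, not headers.
        data = requests.get(url, headers=header)
        data.encoding = 'utf-8'
        print(data.text)


if __name__ == "__main__":
    go = Get_Wechat()
    go.Change_Ftp_Txt()
    go.set_config()
    go.start_request_test()
    # Damn it, WeChat has changed its encryption scheme yet again!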
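
The header and query-string parsing in `set_config` can also be sketched with the standard library alone. This is a minimal illustration, not the author's method: the function name `parse_captured_request` and the sample request lines are hypothetical, and the real layout of the Fiddler dump may differ from what is assumed here.

```python
from urllib.parse import parse_qsl, urlsplit


def parse_captured_request(lines):
    """Split captured request lines into (headers, query_params).

    Assumes one "GET <target> HTTP/1.1" request line and "Name: value"
    header lines; this layout is an assumption for illustration.
    """
    headers, params = {}, {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith('GET '):
            # Take the request target and let urllib split the query string.
            target = line.split()[1]
            query = urlsplit(target).query
            params = dict(parse_qsl(query, keep_blank_values=True))
        elif ':' in line:
            # Split on the first colon only, so URLs in values survive.
            name, value = line.split(':', 1)
            headers[name.strip()] = value.strip()
    return headers, params


# Hypothetical sample lines, shaped like an HTTP request.
sample = [
    'GET /mp/getmasssendmsg?__biz=MzA3ODA5NjgyOA==&lang=zh_CN&wx_header=1 HTTP/1.1',
    'Host: mp.weixin.qq.com',
    'Referer: http://mp.weixin.qq.com/',
]
headers, params = parse_captured_request(sample)
print(headers)
print(params)
```

Note that `parse_qsl` percent-decodes values (for example `%3D%3D` becomes `==`), which may or may not be what you want if the parameters are later pasted back into a URL verbatim.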

--------------------------------------------------------------------------------
/new_crawl_wechat.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
# 2016/9/7 13:47
"""
-------------------------------------------------------------------------------
Function:   Crawl WeChat official-account articles from captured responses
Version:    1.1
Author:     SLY
Contact:    slysly759@gmail.com

-------------------------------------------------------------------------------
"""
import ftplib
import multiprocessing
import re
import string
import time
from multiprocessing import Pool as ThreadPool  # a process pool, despite the alias

import pymysql
import requests

# Change these settings to match your own database.
DB_CONFIG = dict(host="localhost", user="root", password="070801382",
                 database="world", port=3308, charset='utf8')
ARTICLE_URL = re.compile(r"http://mp\.weixin\.qq\.com/s(.*?)#")


class fuck_wechat(object):
    def __init__(self):
        self.time = time.strftime("%m_%d_%H_%M_%S", time.localtime())
        self.ID = 1

    def get_data_from_ftp(self):
        """Download the captured dump from an FTP server."""
        ftp = ftplib.FTP()
        ftp.connect()  # supply your FTP host (and port) here
        ftp.login()    # supply your user name and password here
        print(ftp.pwd())  # pwd() reports the current directory; cwd() needs a path
        DownLocalFilename = "Session.txt"
        DownRoteFilename = "Session.txt"
        with open(DownLocalFilename, 'wb') as f:
            ftp.retrbinary('RETR ' + DownRoteFilename, f.write, 1024)
        ftp.close()

    def change_txt(self):
        """Rewrite Response.txt as printable-ASCII text in New_Response.txt."""
        with open('Response.txt', 'r', encoding='gbk', errors='ignore') as f, \
                open('New_Response.txt', 'wb') as file_write_new:
            for i in f:
                i = i.replace("b'", '').replace("'", '')
                hope = ''.join(ch for ch in i if ch in string.printable)
                # Skip comment lines and blank lines.
                if hope.startswith('#') or not hope.split():
                    continue
                file_write_new.write(hope.encode('utf-8'))

    def return_all_article(self):
        start_row = []
        end_row = []
        # The continue_* lists handle the later history pages, which WeChat
        # delivers as JSON.
        continue_start_row = []
        continue_end_row = []
        with open('New_Response.txt', 'r', encoding='utf-8', errors='ignore') as f:
            file_data = f.readlines()
        for index, need_data in enumerate(file_data):
            if 'msgList' in need_data:
                start_row.append(index)
            if '{"ret":0,' in need_data:
                continue_start_row.append(index)
            if 'if(!!window.__initCatch)' in need_data:
                end_row.append(index)
            if 'csp_nonce_str' in need_data:
                continue_end_row.append(index)
        print(start_row)
        print(end_row)
        print(continue_start_row)
        print(continue_end_row)
        all_article = []
        if start_row:
            # Pair each start marker with its own end marker (the original
            # start_row[i-1] paired the first block with the last start) and
            # slice inclusively so the marker lines themselves are searched.
            for start, end in zip(start_row, end_row):
                row_article_list = ''.join(file_data[start:end + 1])
                row_article_list = (row_article_list.replace('\t', '').replace(' ', '')
                                    .replace('"', '').replace('\\\\', '')
                                    .replace('amp;amp;', '').replace(',', ''))
                print(row_article_list)
                result = ARTICLE_URL.findall(row_article_list)
                all_article.extend('http://mp.weixin.qq.com/s' + x for x in result)
        else:
            print('error: the response contains no history-page data; please check!')

        if continue_end_row:
            for start, end in zip(continue_start_row, continue_end_row):
                row_article_list = ''.join(file_data[start:end + 1])
                row_article_list = row_article_list.replace('\\', '').replace('amp;', '')
                print(row_article_list)
                result = ARTICLE_URL.findall(row_article_list)
                all_article.extend('http://mp.weixin.qq.com/s' + x for x in result)
        else:
            print('info: the response contains no follow-up pages; '
                  'ignore this if you never simulated clicking for more history.')
        return all_article

    def start_request(self, url):
        """Fetch one article page and pull its metadata out of the inline JS."""
        try:
            self.ID += 1
            data = requests.get(url, timeout=30)
            data.encoding = 'utf-8'
            s = data.text
            nick_name = re.findall('var nickname =(.*);', s)[0].replace('"', '')
            app_uni = re.findall('var appuin =(.*);', s)[0].replace('"', '').replace('||', '')
            msg_title = re.findall('var msg_title = (.*);', s)[0].replace('"', '')
            msg_desc = re.findall('var msg_desc = (.*);', s)[0].replace('"', '')
            publish_time = re.findall('var publish_time = (.*);', s)[0].replace('"', '').replace(' ||', '').replace(' ', '')
            print('Finish one')
            print(url)
            return (nick_name, app_uni, msg_title, msg_desc, url, publish_time)
        except (requests.exceptions.RequestException, IndexError):
            # requests raises its own exception classes rather than the
            # built-in TimeoutError/ConnectionError; IndexError means one of
            # the expected variables was missing from the page.
            return None

    def get_max_id(self):
        db = pymysql.connect(**DB_CONFIG)
        try:
            with db.cursor() as cursor:
                cursor.execute("SELECT max(id) from wechet_db")
                data = cursor.fetchone()
        finally:
            db.close()
        if data:
            return data[0]
        raise ConnectionError

    def insert_db(self, data):
        db = pymysql.connect(**DB_CONFIG)
        sql = "INSERT INTO wechet_db(nick_name,app_uni,msg_title,msg_desc,msg_url,publish_time) VALUES(%s,%s,%s,%s,%s,%s)"
        try:
            with db.cursor() as cursor:
                cursor.executemany(sql, data)
            db.commit()
        finally:
            db.close()


if __name__ == "__main__":
    start_time = time.time()
    pool = ThreadPool(multiprocessing.cpu_count() * 2)
    wtf = fuck_wechat()
    # wtf.get_data_from_ftp()
    # If the response file generated by Fiddler lives on an FTP server, the
    # call above can fetch it into this directory.
    wtf.change_txt()
    article_lists = wtf.return_all_article()
    print(article_lists)
    print(len(article_lists))
    results = pool.map(wtf.start_request, article_lists)
    pool.close()
    pool.join()
    # Failed requests return None; drop them before the database insert.
    results = [row for row in results if row is not None]
    print(results)
    print(len(results))
    cost = time.time() - start_time  # elapsed time in seconds
    print('Elapsed time (s):')
    print(cost)
    wtf.insert_db(results)  # comment this out if you have not set up a database
    print('Inserted into the database successfully')

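
For the JSON follow-up pages, an alternative to string replacement plus regular expressions is to decode the body with the `json` module. The sketch below is an assumption-laden illustration rather than the script's method: it assumes an intact response body (not the ASCII-filtered New_Response.txt, where stripped characters can break the JSON), it uses only the field names visible in the sample response (`general_msg_list`, `list`, `app_msg_ext_info`, `content_url`), and the function name `extract_article_urls` is hypothetical.

```python
import json


def extract_article_urls(body):
    """Return the content_url of every entry in one JSON history page.

    `general_msg_list` is itself a JSON document stored as a string,
    so it has to be decoded a second time.
    """
    outer = json.loads(body)
    inner = json.loads(outer["general_msg_list"])
    urls = []
    for item in inner.get("list", []):
        info = item.get("app_msg_ext_info") or {}
        url = info.get("content_url")
        if url:
            # Undo the HTML escaping of '&' that appears in the URLs.
            urls.append(url.replace("&amp;", "&"))
    return urls


# A hypothetical body, built to mirror the shape of the sample response.
inner = {"list": [{
    "comm_msg_info": {"id": 1000000030, "datetime": 1472126281},
    "app_msg_ext_info": {
        "title": "example",
        "content_url": ("http://mp.weixin.qq.com/s?__biz=MzA3ODA5NjgyOA=="
                        "&amp;mid=2650764595&amp;idx=1#wechat_redirect"),
        "multi_app_msg_item_list": [],
    },
}]}
body = json.dumps({"ret": 0, "errmsg": "ok", "general_msg_list": json.dumps(inner)})
urls = extract_article_urls(body)
print(urls)
```

Unlike the regular expression in `return_all_article`, this reads only `content_url`, so `source_url` links that also point at mp.weixin.qq.com are not picked up.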
--------------------------------------------------------------------------------
/New_Response.txt:
--------------------------------------------------------------------------------
Response code: 200
Response body:
Response code: 200
Response body:
81 | 84 | 91 | 108 | 109 | 239 | 398 | 399 | 400 | 401 | Response code: 302 402 | Response body: 403 | Response code: 200 404 | Response body: {"ret":0,"errmsg":"ok","general_msg_list":"{\"list\":[{\"comm_msg_info\":{\"id\":1000000030,\"type\":49,\"datetime\":1472126281,\"fakeid\":\"3078096828\",\"status\":2,\"content\":\"\"},\"app_msg_ext_info\":{\"title\":\"S_`O.^6rRvP _N1\\",\"digest\":\"b6r b0\",\"content\":\"\",\"fileid\":503280946,\"content_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=2650764595&idx=1&sn=59401d1d63613cedea09f056e85a214f&scene=4#wechat_redirect\",\"source_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=200113055&idx=1&sn=0581952a4acba0c54fcafe3d20cbeedc#rd\",\"cover\":\"http:\\\/\\\/mmbiz.qpic.cn\\\/mmbiz_jpg\\\/QWypbO5WwQOrhkb59LyZ6ribla22JaLAcEUlApniaTia8l5n9vsgicJn0XUdwyCyuuQCLPEspuWVtbRMakLNaCHyUw\\\/0?wx_fmt=jpeg\",\"subtype\":9,\"is_multi\":0,\"multi_app_msg_item_list\":[],\"author\":\"Z\\",\"copyright_stat\":100}},{\"comm_msg_info\":{\"id\":1000000029,\"type\":49,\"datetime\":1472039972,\"fakeid\":\"3078096828\",\"status\":2,\"content\":\"\"},\"app_msg_ext_info\":{\"title\":\"-6ra bO bN>k\",\"digest\":\"-N ,{ N9_ T Ta0\",\"content\":\"\",\"fileid\":503280943,\"content_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=2650764592&idx=1&sn=98b9a2843c8fd908c57ab688c143518d&scene=4#wechat_redirect\",\"source_url\":\"http:\\\/\\\/wd.koudai.com\\\/item.html?itemID=144041673\",\"cover\":\"http:\\\/\\\/mmbiz.qpic.cn\\\/mmbiz_jpg\\\/QWypbO5WwQPJclUT9nv6OVnWFicjoicHL0rpycjNvbeKV84ibEYK9vI6jb610QjR3jnrJRzGkzIiaT37riclatU6Wxg\\\/0?wx_fmt=jpeg\",\"subtype\":9,\"is_multi\":0,\"multi_app_msg_item_list\":[],\"author\":\"-",\"copyright_stat\":11}},{\"comm_msg_info\":{\"id\":1000000028,\"type\":49,\"datetime\":1471953462,\"fakeid\":\"3078096828\",\"status\":2,\"content\":\"\"},\"app_msg_ext_info\":{\"title\":\",Ta[ g}YYT b1\(Wb",\"digest\":\" g~6 gaU gnR& & 
\",\"content\":\"\",\"fileid\":503280934,\"content_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=2650764590&idx=1&sn=abad53163cfca3676be5a5af1b612995&scene=4#wechat_redirect\",\"source_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=200113055&idx=1&sn=0581952a4acba0c54fcafe3d20cbeedc#rd\",\"cover\":\"http:\\\/\\\/mmbiz.qpic.cn\\\/mmbiz_jpg\\\/QWypbO5WwQNPhMHib6UN2yu70sqpbXekkFaf6vTBHk2Tymh7ico7pxsicO2ICUCSyWmyoF87ejG1LkwhAJnwQ07Qg\\\/0?wx_fmt=jpeg\",\"subtype\":9,\"is_multi\":0,\"multi_app_msg_item_list\":[],\"author\":\"Chris Maggio\",\"copyright_stat\":100}},{\"comm_msg_info\":{\"id\":1000000027,\"type\":49,\"datetime\":1471867106,\"fakeid\":\"3078096828\",\"status\":2,\"content\":\"\"},\"app_msg_ext_info\":{\"title\":\"aasT@wHN\",\"digest\":\"1\w==vu ,

mu;m`^Oc1rZ\",\"digest\":\"b&^@wau bbO@b g$O0\",\"content\":\"\",\"fileid\":503280926,\"content_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=2650764576&idx=1&sn=626742b8d6365e8d6a7c858f2ba0476d&scene=4#wechat_redirect\",\"source_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=200113055&idx=1&sn=0581952a4acba0c54fcafe3d20cbeedc#rd\",\"cover\":\"http:\\\/\\\/mmbiz.qpic.cn\\\/mmbiz_jpg\\\/QWypbO5WwQPQicq2lic5yJMic7L0F66jVlzDXwNia0rLlx4mQOrKBClXgzzgW1x3CrWzsRtZeVKDNThEaKKrFqiauIg\\\/0?wx_fmt=jpeg\",\"subtype\":9,\"is_multi\":0,\"multi_app_msg_item_list\":[],\"author\":\""}\",\"copyright_stat\":100}},{\"comm_msg_info\":{\"id\":1000000025,\"type\":49,\"datetime\":1471435092,\"fakeid\":\"3078096828\",\"status\":2,\"content\":\"\"},\"app_msg_ext_info\":{\"title\":\"-```ZP%vR 405 | N\ }YtS 406 | NY\",\"digest\":\"6qKb:gNN] S` y OKbLNS_6q_N 407 | NO:y1_~\",\"content\":\"\",\"fileid\":503280923,\"content_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=2650764573&idx=1&sn=d3db0fea58b5ef31514e8474682972dd&scene=4#wechat_redirect\",\"source_url\":\"http:\\\/\\\/wd.koudai.com\\\/i\\\/756562562?wfr=c\",\"cover\":\"http:\\\/\\\/mmbiz.qpic.cn\\\/mmbiz_jpg\\\/QWypbO5WwQNAZSbOgRYvxRRibKicqhbEicAHH8xTmtEDX8OxlUeNXMcZghrw3eRVLElkvX1G3PkFPXibXO7tKAGzUg\\\/0?wx_fmt=jpeg\",\"subtype\":9,\"is_multi\":0,\"multi_app_msg_item_list\":[],\"author\":\"-",\"copyright_stat\":11}},{\"comm_msg_info\":{\"id\":1000000024,\"type\":49,\"datetime\":1471348819,\"fakeid\":\"3078096828\",\"status\":2,\"content\":\"\"},\"app_msg_ext_info\":{\"title\":\" 408 | NN 
wRT\",\"digest\":\"begf[cKN~\",\"content\":\"\",\"fileid\":503280914,\"content_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=2650764567&idx=1&sn=4af79ea8ede2bf87f4e694bfa5a9eac5&scene=4#wechat_redirect\",\"source_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=200113055&idx=1&sn=0581952a4acba0c54fcafe3d20cbeedc#rd\",\"cover\":\"http:\\\/\\\/mmbiz.qpic.cn\\\/mmbiz_jpg\\\/QWypbO5WwQOfhnIZ2TQaeTDZYdz47TVPwJy3MKZXJPpejNQDOKdJnbuRzJH9G4E0E7pDoBBFzRUJWibeBRiaeUXQ\\\/0?wx_fmt=jpeg\",\"subtype\":9,\"is_multi\":0,\"multi_app_msg_item_list\":[],\"author\":\"T\",\"copyright_stat\":11}},{\"comm_msg_info\":{\"id\":1000000023,\"type\":49,\"datetime\":1471262434,\"fakeid\":\"3078096828\",\"status\":2,\"content\":\"\"},\"app_msg_ext_info\":{\"title\":\"!o@s=N\",\"digest\":\",T!oHQuNLuT0W@s=Nwn~\",\"content\":\"\",\"fileid\":503280908,\"content_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=2650764561&idx=1&sn=fe02c9ffce1cd2a757ab281b57074fde&scene=4#wechat_redirect\",\"source_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=200113055&idx=1&sn=0581952a4acba0c54fcafe3d20cbeedc#rd\",\"cover\":\"http:\\\/\\\/mmbiz.qpic.cn\\\/mmbiz_jpg\\\/QWypbO5WwQMxYX3IAzAM9qyTDwhPRKKfNaMficGeN9icerTSZZpMxxtCXF3PCB5PVyQAdau8rV0VmGoa3PSy1R5A\\\/0?wx_fmt=jpeg\",\"subtype\":9,\"is_multi\":0,\"multi_app_msg_item_list\":[],\"author\":\"!o\",\"copyright_stat\":100}},{\"comm_msg_info\":{\"id\":1000000022,\"type\":49,\"datetime\":1470916614,\"fakeid\":\"3078096828\",\"status\":2,\"content\":\"\"},\"app_msg_ext_info\":{\"title\":\"(W6KNY\",\"digest\":\"(W*NYulb 409 | N gR g gnfg udk +Y 410 | YUOBl0f4NawMR(W 411 | N b _NQQKN-N g^yf \b(W 
N?agw0Reg0\",\"content\":\"\",\"fileid\":503280898,\"content_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=2650764555&idx=1&sn=c97ab229e39bb714bc013559f51ff5a3&scene=4#wechat_redirect\",\"source_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=200113055&idx=1&sn=0581952a4acba0c54fcafe3d20cbeedc#rd\",\"cover\":\"http:\\\/\\\/mmbiz.qpic.cn\\\/mmbiz\\\/QWypbO5WwQOxYdAF2aS49Gicl2BiaoTjxymickMemkdPJhuiaLoLu7ibDR7VPO6aYQK1ib22NMYyo5ib0g58kX8ibnFBPw\\\/0?wx_fmt=jpeg\",\"subtype\":9,\"is_multi\":0,\"multi_app_msg_item_list\":[],\"author\":\"FT4YE",\"copyright_stat\":11}},{\"comm_msg_info\":{\"id\":1000000021,\"type\":49,\"datetime\":1470830342,\"fakeid\":\"3078096828\",\"status\":2,\"content\":\"\"},\"app_msg_ext_info\":{\"title\":\"-Zb+stYR/eyrLvR N_\",\"digest\":\"-N0R :N`O&^egNTRR>k0\",\"content\":\"\",\"fileid\":503280896,\"content_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=2650764545&idx=1&sn=83615a2f4c794700e733c013dd35b18e&scene=4#wechat_redirect\",\"source_url\":\"http:\\\/\\\/weidian.com\\\/i\\\/1924016923?wfr=c\",\"cover\":\"http:\\\/\\\/mmbiz.qpic.cn\\\/mmbiz\\\/QWypbO5WwQMwpFYQMXa8efNAR1ricxNUlBzsOqmbIriap8sfIKTyqraibJZwf19b4jz7UA0bIW4R0pwfIla6B3iaaA\\\/0?wx_fmt=jpeg\",\"subtype\":9,\"is_multi\":0,\"multi_app_msg_item_list\":[],\"author\":\"-",\"copyright_stat\":11}}]}","bizuin_code":"MzA3ODA5NjgyOA==","uin_code":"MjM3ODE4ODcxMg==","key":"7b81aac53bd2393d480b224df019845af9ebb35a85902cbb253ae71aabbf3d9de586981c3dddd319e563c5ca69197c05007b94be125feb75","is_friend":1,"is_continue":1,"count":10,"csp_nonce_str":1458529357} 412 | Response code: 200 413 | Response body: {"ret":0,"errmsg":"ok","general_msg_list":"{\"list\":[{\"comm_msg_info\":{\"id\":1000000020,\"type\":49,\"datetime\":1470743752,\"fakeid\":\"3078096828\",\"status\":2,\"content\":\"\"},\"app_msg_ext_info\":{\"title\":\"(W 8T@k(TeYhV0`O/f$O RRga>o1\/f\RO:NN)Y_g}Y 421 | N& & CT (u_N)Y_Nb `O& & 
\",\"content\":\"\",\"fileid\":503280840,\"content_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=2650764493&idx=1&sn=ab27e1d8397e62bdb70640a87a821822&scene=4#wechat_redirect\",\"source_url\":\"http:\\\/\\\/mp.weixin.qq.com\\\/s?__biz=MzA3ODA5NjgyOA==&mid=200113055&idx=1&sn=0581952a4acba0c54fcafe3d20cbeedc#rd\",\"cover\":\"http:\\\/\\\/mmbiz.qpic.cn\\\/mmbiz\\\/QWypbO5WwQMM4r3TFUp4ctOV63enhgWibGb3obIS6WwNcn68OsEz9yBlUvkpyEhbzmmOdoThzLPeI7KNQR2DicHQ\\\/0?wx_fmt=jpeg\",\"subtype\":9,\"is_multi\":0,\"multi_app_msg_item_list\":[],\"author\":\"T\",\"copyright_stat\":11}},{\"comm_msg_info\":{\"id\":1000000011,\"type\":49,\"datetime\":1468497455,\"fakeid\":\"3078096828\",\"status\":2,\"content\":\"\"},\"app_msg_ext_info\":{\"title\":\"Fe+Ynl-Ng ZP 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 75 | 76 | 89 | 90 | 91 | true 92 | 93 | 94 | 99 | 100 | 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 | 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 | 117 | 118 | 119 | 120 | 121 | 122 | 123 | 124 | 125 | 126 | 129 | 130 | 131 | 132 | 135 | 136 | 139 | 140 | 141 | 142 | 143 | 144 | 145 | 146 | 147 | 148 | 149 | 150 | 151 | 152 | 153 | 156 | 157 | 158 | 159 | 180 | 181 | 202 | 203 | 224 | 225 | 244 | 245 | 264 | 265 | 282 | 283 | 296 | 297 | 319 | 320 | 344 | 345 | 358 | 359 | 360 | 361 | 362 | 381 | 382 | 401 | 402 | 421 | 422 | 443 | 444 | 445 | 446 | 447 | 448 | true 449 | 450 | 451 | 452 | 453 | 454 | 455 | 456 | 457 | 458 | 459 | 460 | 461 | 462 | 463 | 464 | 465 | 466 | 467 | 468 | 469 | 470 | 471 | 472 | 473 | 474 | 475 | 476 | 1473042662082 477 | 480 | 481 | 482 | 483 | 484 | 485 | 486 | 487 | 488 | 489 | 490 | 491 | 492 | 493 | 494 | 495 | 496 | 497 | 498 | 499 | 500 | 501 | 502 | 503 | 