├── README.md
├── comments
    ├── 0.txt
    ├── comment_urls.csv
    └── comment_v1.csv
├── configs.py
├── dataCrawl.py
├── dataProcess.py
├── dataShow.py
├── result
    └── 2.txt
└── stopwords
    ├── 1.txt
    └── 哈工大停用词表.txt


/README.md:
--------------------------------------------------------------------------------
1 | # Traditional Text Analysis of Weibo Comments
2 | Focuses on one broad topic: search the overall topic by keyword, split it into sub-topics, and under each sub-topic crawl several hot Weibo posts together with their comments, so that both the posts and the comments stay diverse.
3 | **Word clouds, sentiment-polarity distribution, keyword-extraction visualization, and more**
4 | 
--------------------------------------------------------------------------------
/comments/0.txt:
--------------------------------------------------------------------------------
1 | 1
2 | 
--------------------------------------------------------------------------------
/comments/comment_urls.csv:
--------------------------------------------------------------------------------
1 | sub_topic_name,comment_url
2 | 高以翔,https://weibo.cn/comment/Ii6kWuOWR
3 | 高以翔,https://weibo.cn/comment/Iilr76LjR
4 | 高以翔,https://weibo.cn/comment/IiBnqpqT7
5 | 高以翔,https://weibo.cn/comment/Ii5QKf6EH
6 | 高以翔,https://weibo.cn/comment/IizbPnjVe
7 | 高以翔,https://weibo.cn/comment/Ii6s9xVZP
8 | 高以翔,https://weibo.cn/comment/Iiwl3xzII
9 | 高以翔,https://weibo.cn/comment/IiAXx1crO
10 | 高以翔,https://weibo.cn/comment/Ii8cRrMi8
11 | 高以翔,https://weibo.cn/comment/Ii6WYuIRq
12 | 高以翔,https://weibo.cn/comment/Ii5Gep0Lc
13 | 高以翔,https://weibo.cn/comment/Ii9NC3BqH
14 | 高以翔,https://weibo.cn/comment/Ii5XNyYxQ
15 | 高以翔,https://weibo.cn/comment/IiuMM2Qfz
16 | 高以翔,https://weibo.cn/comment/IipT3tVaz
17 | 高以翔,https://weibo.cn/comment/IinfksybE
18 | 高以翔,https://weibo.cn/comment/IiAR13k7n
19 | 高以翔,https://weibo.cn/comment/Ii6o4kBKf
20 | 高以翔,https://weibo.cn/comment/IitNqaXXS
21 | 高以翔,https://weibo.cn/comment/Ii6cdqdwz
22 | 高以翔去世,https://weibo.cn/comment/IipNXjNZe
23 | 高以翔去世,https://weibo.cn/comment/IifBYcJyd
24 | 高以翔去世,https://weibo.cn/comment/Ii7MN70E2
25 | 高以翔去世,https://weibo.cn/comment/IiB3RqHIP
26 | 高以翔去世,https://weibo.cn/comment/IirV5xhUs
27 | 高以翔去世,https://weibo.cn/comment/IicxftnLC
28 | 
高以翔去世,https://weibo.cn/comment/Iigo37Ork
29 | 高以翔去世,https://weibo.cn/comment/IictAzxYm
30 | 高以翔去世,https://weibo.cn/comment/Ii8BL7p4k
31 | 高以翔去世,https://weibo.cn/comment/Ii928l3yf
32 | 高以翔去世,https://weibo.cn/comment/Iiufmki31
33 | 高以翔去世,https://weibo.cn/comment/Ii7OOowMQ
34 | 高以翔去世,https://weibo.cn/comment/Iiz3Z2KrC
35 | 高以翔去世,https://weibo.cn/comment/Iia0VlXzu
36 | 高以翔去世,https://weibo.cn/comment/Iiqyozyjg
37 | 高以翔去世,https://weibo.cn/comment/Ii7KAig2u
38 | 高以翔去世,https://weibo.cn/comment/Ii8fo5OHw
39 | 高以翔去世,https://weibo.cn/comment/Ii8vcuGIh
40 | 高以翔去世,https://weibo.cn/comment/Iib5LkRhk
41 | 高以翔去世,https://weibo.cn/comment/IigXmr5Wv
42 | 高以翔怎么了,https://weibo.cn/comment/Ii7jlyQvu
43 | 高以翔怎么了,https://weibo.cn/comment/Ii6u3s6oc
44 | 高以翔怎么了,https://weibo.cn/comment/Ii6XwvpZV
45 | 高以翔怎么了,https://weibo.cn/comment/Ii6v6sK1T
46 | 高以翔怎么了,https://weibo.cn/comment/Ii5QKf6EH
47 | 高以翔怎么了,https://weibo.cn/comment/Ii7Pyu7XN
48 | 高以翔怎么了,https://weibo.cn/comment/Ii6s9xVZP
49 | 高以翔怎么了,https://weibo.cn/comment/Ii6RwbMaR
50 | 高以翔怎么了,https://weibo.cn/comment/Ii5Gep0Lc
51 | 高以翔怎么了,https://weibo.cn/comment/Ii6WYuIRq
52 | 高以翔怎么了,https://weibo.cn/comment/Ii6Af0XAA
53 | 高以翔怎么了,https://weibo.cn/comment/Ii6RLfxM7
54 | 高以翔怎么了,https://weibo.cn/comment/Ii6HWvImi
55 | 高以翔怎么了,https://weibo.cn/comment/Ii6z392Km
56 | 高以翔怎么了,https://weibo.cn/comment/Ii7i5eyab
57 | 高以翔怎么了,https://weibo.cn/comment/Ii7o4FOpz
58 | 高以翔怎么了,https://weibo.cn/comment/Ii70Adwki
59 | 高以翔怎么了,https://weibo.cn/comment/Ii7LEeZmZ
60 | 高以翔怎么了,https://weibo.cn/comment/Ii7x35xgR
61 | 高以翔怎么了,https://weibo.cn/comment/Ii7aJzUm4
62 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IitGZxays
63 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IiuZYgrvV
64 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/Iitv8mCfE
65 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IitKMb8bg
66 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IitIxzjFs
67 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IivLZBUV2
68 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IizMPfPnX
69 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/Iiv8o7Spe
70 | 高以翔好友婚礼 
彩虹,https://weibo.cn/comment/IizNpjatM
71 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/Iiyw591nj
72 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/Iiu3pwEWB
73 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IiAcWzc7l
74 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IivLXxywW
75 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IiuLOF2aI
76 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IivtSruHw
77 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IiuG90VfS
78 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IivCvzzAZ
79 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/Iiv3z3p8l
80 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/IiuXO68lt
81 | 高以翔好友婚礼 彩虹,https://weibo.cn/comment/Iiv8Fejiq
82 | 高以翔女友更新签名,https://weibo.cn/comment/IipNXjNZe
83 | 高以翔女友更新签名,https://weibo.cn/comment/Iip8J5evR
84 | 高以翔女友更新签名,https://weibo.cn/comment/IiqpqwUEp
85 | 高以翔女友更新签名,https://weibo.cn/comment/IiqPdo8RL
86 | 高以翔女友更新签名,https://weibo.cn/comment/Iiqjse55e
87 | 高以翔女友更新签名,https://weibo.cn/comment/IiqeZm1xs
88 | 高以翔女友更新签名,https://weibo.cn/comment/IiryHhHyR
89 | 高以翔女友更新签名,https://weibo.cn/comment/IipRscSR6
90 | 高以翔女友更新签名,https://weibo.cn/comment/IiqaWcA8s
91 | 高以翔女友更新签名,https://weibo.cn/comment/IiuLOF2aI
92 | 高以翔女友更新签名,https://weibo.cn/comment/IiuXO68lt
93 | 高以翔女友更新签名,https://weibo.cn/comment/IiqvxjOWi
94 | 高以翔女友更新签名,https://weibo.cn/comment/IirsNb0ru
95 | 高以翔女友更新签名,https://weibo.cn/comment/Iiqlu8BD7
96 | 高以翔女友更新签名,https://weibo.cn/comment/Iitif4gZc
97 | 高以翔女友更新签名,https://weibo.cn/comment/Iiq4wsO7T
98 | 高以翔女友更新签名,https://weibo.cn/comment/IisRGzzZw
99 | 高以翔女友更新签名,https://weibo.cn/comment/IiqcKb2cK
100 | 高以翔女友更新签名,https://weibo.cn/comment/Iiqufh844
101 | 高以翔女友更新签名,https://weibo.cn/comment/IivCvzzAZ
102 | 高以翔追思会,https://weibo.cn/comment/IicxftnLC
103 | 高以翔追思会,https://weibo.cn/comment/Iihqa5WM7
104 | 高以翔追思会,https://weibo.cn/comment/IigXmr5Wv
105 | 高以翔追思会,https://weibo.cn/comment/Ii891iyrC
106 | 高以翔追思会,https://weibo.cn/comment/IigMwaywq
107 | 高以翔追思会,https://weibo.cn/comment/IiBOksqhh
108 | 高以翔追思会,https://weibo.cn/comment/Iihh5aGm3
109 | 高以翔追思会,https://weibo.cn/comment/IiBEUBt3H
110 | 
高以翔追思会,https://weibo.cn/comment/IilvF46G5
111 | 高以翔追思会,https://weibo.cn/comment/IihjnAgIo
112 | 高以翔追思会,https://weibo.cn/comment/IihgWjqSR
113 | 高以翔追思会,https://weibo.cn/comment/IizbPnjVe
114 | 高以翔追思会,https://weibo.cn/comment/Iii1sATzz
115 | 高以翔追思会,https://weibo.cn/comment/IifMS0D0R
116 | 高以翔追思会,https://weibo.cn/comment/IiqLjmQ8Z
117 | 高以翔追思会,https://weibo.cn/comment/IiqGHlJJF
118 | 高以翔追思会,https://weibo.cn/comment/IiiJh4JH9
119 | 高以翔追思会,https://weibo.cn/comment/IigTyk43C
120 | 高以翔追思会,https://weibo.cn/comment/IiiONvta8
121 | 高以翔追思会,https://weibo.cn/comment/IiiDqcgEO
122 | 高以翔助理发博,https://weibo.cn/comment/IiAaCmoFd
123 | 高以翔助理发博,https://weibo.cn/comment/IizTCoaKj
124 | 高以翔助理发博,https://weibo.cn/comment/IiAemErEJ
125 | 高以翔助理发博,https://weibo.cn/comment/IizXceXZR
126 | 高以翔助理发博,https://weibo.cn/comment/IivuRE2JK
127 | 高以翔助理发博,https://weibo.cn/comment/IiAhWexgq
128 | 高以翔助理发博,https://weibo.cn/comment/IizJCoKE9
129 | 高以翔助理发博,https://weibo.cn/comment/IizCRnIJE
130 | 高以翔助理发博,https://weibo.cn/comment/IigSYgegk
131 | 高以翔助理发博,https://weibo.cn/comment/IizKUv8uM
132 | 高以翔助理发博,https://weibo.cn/comment/IiAf16kjb
133 | 高以翔助理发博,https://weibo.cn/comment/IiAIjeRB5
134 | 高以翔助理发博,https://weibo.cn/comment/IiAOq0JJc
135 | 高以翔助理发博,https://weibo.cn/comment/IiAgLfj4f
136 | 高以翔助理发博,https://weibo.cn/comment/IiAkjocwd
137 | 高以翔助理发博,https://weibo.cn/comment/IiAMOs2jn
138 | 高以翔助理发博,https://weibo.cn/comment/IiAaK2tj5
139 | 高以翔助理发博,https://weibo.cn/comment/IiAbN5py3
140 | 高以翔助理发博,https://weibo.cn/comment/IiAdmyrhb
141 | 高以翔助理发博,https://weibo.cn/comment/IiAqdu8Ta
142 | 高以翔爸爸,https://weibo.cn/comment/Iim0wiJGf
143 | 高以翔爸爸,https://weibo.cn/comment/IilRH5t1Z
144 | 高以翔爸爸,https://weibo.cn/comment/IifUohpHE
145 | 高以翔爸爸,https://weibo.cn/comment/IiiZz3VaH
146 | 高以翔爸爸,https://weibo.cn/comment/IiiVvjdcz
147 | 高以翔爸爸,https://weibo.cn/comment/IilSYj6o3
148 | 高以翔爸爸,https://weibo.cn/comment/IilQfw12a
149 | 高以翔爸爸,https://weibo.cn/comment/IilPuft8S
150 | 高以翔爸爸,https://weibo.cn/comment/IilTdnpVy
151 | 
高以翔爸爸,https://weibo.cn/comment/IilUO1Nng
152 | 高以翔爸爸,https://weibo.cn/comment/IilW4CLzE
153 | 高以翔爸爸,https://weibo.cn/comment/IilPt6xT4
154 | 高以翔爸爸,https://weibo.cn/comment/IilRivJEI
155 | 高以翔爸爸,https://weibo.cn/comment/Iim4gzJNI
156 | 高以翔爸爸,https://weibo.cn/comment/Iim66BY8R
157 | 高以翔爸爸,https://weibo.cn/comment/IiArTyBJx
158 | 高以翔爸爸,https://weibo.cn/comment/IilXH5rE9
159 | 高以翔爸爸,https://weibo.cn/comment/Iily2uemB
160 | 高以翔爸爸,https://weibo.cn/comment/Iimo2j5xy
161 | 高以翔爸爸,https://weibo.cn/comment/IipbsFkzb
162 | 高以翔追我吧录制画面,https://weibo.cn/comment/Iigo37Ork
163 | 高以翔追我吧录制画面,https://weibo.cn/comment/IictAzxYm
164 | 高以翔追我吧录制画面,https://weibo.cn/comment/Ii8fo5OHw
165 | 高以翔追我吧录制画面,https://weibo.cn/comment/IihCBiC3U
166 | 高以翔追我吧录制画面,https://weibo.cn/comment/IigKREWiV
167 | 高以翔追我吧录制画面,https://weibo.cn/comment/IifFYnZBL
168 | 高以翔追我吧录制画面,https://weibo.cn/comment/IifLL8KpE
169 | 高以翔追我吧录制画面,https://weibo.cn/comment/IifJXtVLk
170 | 高以翔追我吧录制画面,https://weibo.cn/comment/IigoDtrcv
171 | 高以翔追我吧录制画面,https://weibo.cn/comment/IifPQ55sf
172 | 高以翔追我吧录制画面,https://weibo.cn/comment/Iic6qt0vH
173 | 高以翔追我吧录制画面,https://weibo.cn/comment/IifZMcxS7
174 | 高以翔追我吧录制画面,https://weibo.cn/comment/Ii6yjkWdS
175 | 高以翔追我吧录制画面,https://weibo.cn/comment/IicF6aF0j
176 | 高以翔追我吧录制画面,https://weibo.cn/comment/IiqrknuSQ
177 | 高以翔追我吧录制画面,https://weibo.cn/comment/IifKR2Npg
178 | 高以翔追我吧录制画面,https://weibo.cn/comment/Iih7dxuV1
179 | 高以翔追我吧录制画面,https://weibo.cn/comment/IifGXhqcz
180 | 高以翔追我吧录制画面,https://weibo.cn/comment/Iig5cavK1
181 | 高以翔追我吧录制画面,https://weibo.cn/comment/IiuLD8Z5y
182 | 高以翔猝死,https://weibo.cn/comment/Ii7LkFHrq
183 | 高以翔猝死,https://weibo.cn/comment/Ii7MUcbAm
184 | 高以翔猝死,https://weibo.cn/comment/Ii7Qjtmkn
185 | 高以翔猝死,https://weibo.cn/comment/Ii8dZu285
186 | 高以翔猝死,https://weibo.cn/comment/Ii8UvhtmP
187 | 高以翔猝死,https://weibo.cn/comment/Ii9GhgA9s
188 | 高以翔猝死,https://weibo.cn/comment/Ii9QIanWM
189 | 高以翔猝死,https://weibo.cn/comment/Ii82zbuUf
190 | 高以翔猝死,https://weibo.cn/comment/Ii7ZOEMYr
191 | 
高以翔猝死,https://weibo.cn/comment/IikUsERBz
192 | 高以翔猝死,https://weibo.cn/comment/Ii7OYa8fB
193 | 高以翔猝死,https://weibo.cn/comment/Iil9Xng89
194 | 高以翔猝死,https://weibo.cn/comment/Ii7UavZFx
195 | 高以翔猝死,https://weibo.cn/comment/Ii7TQqAV7
196 | 高以翔猝死,https://weibo.cn/comment/Ii7PCCbIO
197 | 高以翔猝死,https://weibo.cn/comment/IicGT0ZUp
198 | 高以翔猝死,https://weibo.cn/comment/Ii8bIe5Rf
199 | 高以翔猝死,https://weibo.cn/comment/Iigo89UIY
200 | 高以翔猝死,https://weibo.cn/comment/Ii8qaaoIw
201 | 高以翔猝死,https://weibo.cn/comment/Iia5ZtSfU
202 | 悼念高以翔,https://weibo.cn/comment/Ii8CenApK
203 | 悼念高以翔,https://weibo.cn/comment/Ii8Rl4hUU
204 | 悼念高以翔,https://weibo.cn/comment/Ii8GNbmnY
205 | 悼念高以翔,https://weibo.cn/comment/Ii8Scx9Yo
206 | 悼念高以翔,https://weibo.cn/comment/Ii8F3zrHG
207 | 悼念高以翔,https://weibo.cn/comment/Iit1FjJCC
208 | 悼念高以翔,https://weibo.cn/comment/IilMGcIpC
209 | 悼念高以翔,https://weibo.cn/comment/Ii96flESm
210 | 悼念高以翔,https://weibo.cn/comment/Ii8NRCMXi
211 | 悼念高以翔,https://weibo.cn/comment/IihYTnqU7
212 | 悼念高以翔,https://weibo.cn/comment/IibtDryFP
213 | 悼念高以翔,https://weibo.cn/comment/Ii8iD2AeQ
214 | 悼念高以翔,https://weibo.cn/comment/IirVHjdVP
215 | 悼念高以翔,https://weibo.cn/comment/Ii96Ak8tt
216 | 悼念高以翔,https://weibo.cn/comment/IipTlhEyw
217 | 悼念高以翔,https://weibo.cn/comment/Iim07dsie
218 | 悼念高以翔,https://weibo.cn/comment/Ii97ZdJCK
219 | 悼念高以翔,https://weibo.cn/comment/Ii8WRvYhD
220 | 悼念高以翔,https://weibo.cn/comment/Iilvk32r6
221 | 悼念高以翔,https://weibo.cn/comment/IihuigP7Y
222 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiqDq3PoL
223 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/Iir3Eq3B5
224 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiqNH8Y6j
225 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiqaWcA8s
226 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiALFfDxL
227 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiAcWzc7l
228 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IisKFBWE7
229 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiqGpnl5W
230 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/Iircb7ycs
231 | 
浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiAeMjrzd
232 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiqHJdYmz
233 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiqEZyFVP
234 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/Iir2Gtxjh
235 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiqDq3PoL
236 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/Iirro0XsJ
237 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiuoYEqfu
238 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IivWuiHY5
239 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiBr8CAPH
240 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/Iiv91mWuc
241 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/IiqTGibYu
242 | 浙江卫视阻拦高以翔粉丝举行追思会,https://weibo.cn/comment/Iitf07Syj
243 | 高以翔父母安慰节目组,https://weibo.cn/comment/IifBYcJyd
244 | 高以翔父母安慰节目组,https://weibo.cn/comment/IifUohpHE
245 | 高以翔父母安慰节目组,https://weibo.cn/comment/IigKREWiV
246 | 高以翔父母安慰节目组,https://weibo.cn/comment/Iih2QxyUm
247 | 高以翔父母安慰节目组,https://weibo.cn/comment/Iig7GnLG4
248 | 高以翔父母安慰节目组,https://weibo.cn/comment/IifJXtVLk
249 | 高以翔父母安慰节目组,https://weibo.cn/comment/IignA9jfm
250 | 高以翔父母安慰节目组,https://weibo.cn/comment/Iih2btKjp
251 | 高以翔父母安慰节目组,https://weibo.cn/comment/IifFm2Nfq
252 | 高以翔父母安慰节目组,https://weibo.cn/comment/IifPQ55sf
253 | 高以翔父母安慰节目组,https://weibo.cn/comment/IigbGm2N0
254 | 高以翔父母安慰节目组,https://weibo.cn/comment/Iigeocsr4
255 | 高以翔父母安慰节目组,https://weibo.cn/comment/IifN3e5Qw
256 | 高以翔父母安慰节目组,https://weibo.cn/comment/IigaVBVLn
257 | 高以翔父母安慰节目组,https://weibo.cn/comment/Iig1SbqaP
258 | 高以翔父母安慰节目组,https://weibo.cn/comment/IigKIbWwI
259 | 高以翔父母安慰节目组,https://weibo.cn/comment/IigRayEZL
260 | 高以翔父母安慰节目组,https://weibo.cn/comment/IifBYcJyd
261 | 高以翔父母安慰节目组,https://weibo.cn/comment/IiggYD57a
262 | 高以翔父母安慰节目组,https://weibo.cn/comment/IifT74Wca
263 | 高以翔父母安慰节目组,https://weibo.cn/comment/IigVtp6rb
264 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IizIhoAU7
265 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IiAemErEJ
266 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IizXceXZR
267 | 
毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IiuZYgrvV
268 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IipvUkzxI
269 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IizTCoaKj
270 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IiAjMBnFH
271 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IiBTi6E8d
272 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IizJCoKE9
273 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IivuRE2JK
274 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IiB3RqHIP
275 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IiAkjocwd
276 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IizCRnIJE
277 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/Iip8J5evR
278 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IizNpjatM
279 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IiADUb6e7
280 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IiqpqwUEp
281 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/Iir3Eq3B5
282 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IizMPfPnX
283 | 毛加恩婚礼照为高以翔保留位置,https://weibo.cn/comment/IiALFfDxL
284 | 高以翔女朋友,https://weibo.cn/comment/Ii858qJad
285 | 高以翔女朋友,https://weibo.cn/comment/Ii83zwrQc
286 | 高以翔女朋友,https://weibo.cn/comment/Ii87KFcxl
287 | 高以翔女朋友,https://weibo.cn/comment/Ii8cRrMi8
288 | 高以翔女朋友,https://weibo.cn/comment/Ii7TcvT8f
289 | 高以翔女朋友,https://weibo.cn/comment/Ii8R9e6xm
290 | 高以翔女朋友,https://weibo.cn/comment/Ii87ogm0t
291 | 高以翔女朋友,https://weibo.cn/comment/Ii86x2cKQ
292 | 高以翔女朋友,https://weibo.cn/comment/Ii8bqfiIa
293 | 高以翔女朋友,https://weibo.cn/comment/Ii83qDO9g
294 | 高以翔女朋友,https://weibo.cn/comment/Ii81zqW54
295 | 高以翔女朋友,https://weibo.cn/comment/Ii8cTv2eU
296 | 高以翔女朋友,https://weibo.cn/comment/Ii882FHX9
297 | 高以翔女朋友,https://weibo.cn/comment/Ii84Irm93
298 | 高以翔女朋友,https://weibo.cn/comment/Ii84B5C8g
299 | 高以翔女朋友,https://weibo.cn/comment/Ii8rrpeli
300 | 高以翔女朋友,https://weibo.cn/comment/Ii896ARdf
301 | 高以翔女朋友,https://weibo.cn/comment/Ii8hCluJB
302 | 高以翔女朋友,https://weibo.cn/comment/Ii8sCErB5
303 | 高以翔女朋友,https://weibo.cn/comment/Ii8fT6Fpn
304 | 高以翔才35岁,https://weibo.cn/comment/Ii7XgASy2
305 | 
高以翔才35岁,https://weibo.cn/comment/Ii7Y6ipSV
306 | 高以翔才35岁,https://weibo.cn/comment/Ii7TbmUFA
307 | 高以翔才35岁,https://weibo.cn/comment/Ii85AvDg4
308 | 高以翔才35岁,https://weibo.cn/comment/Ii7KazFSs
309 | 高以翔才35岁,https://weibo.cn/comment/Ii81oxhfV
310 | 高以翔才35岁,https://weibo.cn/comment/Ii89bBx6K
311 | 高以翔才35岁,https://weibo.cn/comment/Ii7OjnBaB
312 | 高以翔才35岁,https://weibo.cn/comment/Ii83qDO9g
313 | 高以翔才35岁,https://weibo.cn/comment/Ii81Ywj94
314 | 高以翔才35岁,https://weibo.cn/comment/Ii84Irm93
315 | 高以翔才35岁,https://weibo.cn/comment/Ii7Js1Sn3
316 | 高以翔才35岁,https://weibo.cn/comment/Ii84wlKPE
317 | 高以翔才35岁,https://weibo.cn/comment/Ii89L5TF3
318 | 高以翔才35岁,https://weibo.cn/comment/Ii8fT6Fpn
319 | 高以翔才35岁,https://weibo.cn/comment/Ii82Bimwo
320 | 高以翔才35岁,https://weibo.cn/comment/Ii7YLFawO
321 | 高以翔才35岁,https://weibo.cn/comment/Ii84oFgBq
322 | 高以翔才35岁,https://weibo.cn/comment/Ii82EwJZI
323 | 高以翔才35岁,https://weibo.cn/comment/Ii82WewuD
324 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IiiaCieTZ
325 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/Iijtkntoo
326 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IipvB5QIW
327 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IioYoo8B3
328 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/Iiio8wSYZ
329 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/Iil46rTa5
330 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IioM5s5G4
331 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/Iii4mxzWt
332 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IiltPw21n
333 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IigjwnEQz
334 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IiiYtFkOr
335 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IijwC0Uor
336 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IijCRmK68
337 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IikONlMiO
338 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IikdeikF6
339 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IilN266d1
340 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IikztmaSj
341 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/IijC4kwrp
342 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/Iil8uAAVw
343 | 林书豪球鞋印高以翔合影,https://weibo.cn/comment/Iijs2iZVD
344 | 
高以翔粉丝诉求,https://weibo.cn/comment/Iitx6nzOU
345 | 高以翔粉丝诉求,https://weibo.cn/comment/IityC5lf8
346 | 高以翔遇见王沥川,https://weibo.cn/comment/Ii8AgjFU6
347 | 高以翔遇见王沥川,https://weibo.cn/comment/Ii8lRlhzr
348 | 高以翔遇见王沥川,https://weibo.cn/comment/H3mgwnyel
349 | 高以翔遇见王沥川,https://weibo.cn/comment/GgVu6pZH6
350 | 高以翔遇见王沥川,https://weibo.cn/comment/G8gP8qHCm
351 | 高以翔遇见王沥川,https://weibo.cn/comment/G6dixyGjL
352 | 高以翔遇见王沥川,https://weibo.cn/comment/G4VALqguZ
353 | 高以翔遇见王沥川,https://weibo.cn/comment/FEOEO9uem
354 | 高以翔遇见王沥川,https://weibo.cn/comment/FsEv3dpag
355 | 高以翔遇见王沥川,https://weibo.cn/comment/FnVJ96tgm
356 | 高以翔遇见王沥川,https://weibo.cn/comment/Fj8UVo1bl
357 | 高以翔遇见王沥川,https://weibo.cn/comment/Fj8VJFOMf
358 | 高以翔遇见王沥川,https://weibo.cn/comment/FhwEt5Phn
359 | 高以翔遇见王沥川,https://weibo.cn/comment/FdCbMvgWd
360 | 高以翔遇见王沥川,https://weibo.cn/comment/Ff5N2aOce
361 | 高以翔遇见王沥川,https://weibo.cn/comment/F8kYcoh3Q
362 | 高以翔遇见王沥川,https://weibo.cn/comment/F3A0BCTuF
363 | 高以翔遇见王沥川,https://weibo.cn/comment/F5y477xRp
364 | 高以翔遇见王沥川,https://weibo.cn/comment/F3SiWb3xZ
365 | 高以翔最后一次露面,https://weibo.cn/comment/Ii89tkP6R
366 | 高以翔最后一次露面,https://weibo.cn/comment/Ii83zwrQc
367 | 高以翔最后一次露面,https://weibo.cn/comment/Ii7TEBFHU
368 | 高以翔最后一次露面,https://weibo.cn/comment/Ii86R9kwf
369 | 高以翔最后一次露面,https://weibo.cn/comment/Ii82zdWqD
370 | 高以翔最后一次露面,https://weibo.cn/comment/Ii8GFt2U4
371 | 高以翔最后一次露面,https://weibo.cn/comment/Ii81oxhfV
372 | 高以翔最后一次露面,https://weibo.cn/comment/Ii81Ywj94
373 | 高以翔最后一次露面,https://weibo.cn/comment/Ii9b9qIah
374 | 高以翔最后一次露面,https://weibo.cn/comment/Ii7QP6n5Z
375 | 高以翔最后一次露面,https://weibo.cn/comment/Ii8aWo5uH
376 | 高以翔最后一次露面,https://weibo.cn/comment/Ii7VoaA0o
377 | 高以翔最后一次露面,https://weibo.cn/comment/Ii7Y6ALwS
378 | 高以翔最后一次露面,https://weibo.cn/comment/Ii7YE06Pv
379 | 高以翔最后一次露面,https://weibo.cn/comment/Ii82EwJZI
380 | 高以翔最后一次露面,https://weibo.cn/comment/Ii8AKa629
381 | 高以翔最后一次露面,https://weibo.cn/comment/Ii82WewuD
382 | 高以翔最后一次露面,https://weibo.cn/comment/Ii81fBITj
383 | 
高以翔最后一次露面,https://weibo.cn/comment/Ii8bNyrAf
384 | 高以翔最后一次露面,https://weibo.cn/comment/Ii8YdsIF0
385 | 
--------------------------------------------------------------------------------
/configs.py:
--------------------------------------------------------------------------------
1 | Cookie = {
2 |     # "Cookie":"SINAGLOBAL=611428576138.9926.1529031585340; Ugrow-G0=140ad66ad7317901fc818d7fd7743564; login_sid_t=738f2639b7ec9ed4bda4bda71f80fbe1; cross_origin_proto=SSL; _s_tentry=login.sina.com.cn; Apache=1358246560106.4663.1574947692615; ULV=1574947692624:5:4:2:1358246560106.4663.1574947692615:1574945962066; YF-V5-G0=260e732907e3bd813efaef67866e5183; YF-Page-G0=7f483edf167a381b771295af62b14a27|1574954499|1574954484; crossidccode=CODE-gz-1IAwCP-43YF5X-Pks0uH6BiIchdC4a9cfb2; ALF=1606532846; SSOLoginState=1574996846; SCF=Al_WvdbtXEH8Yao4ordpDA-hA_gh9UtndibTXXHvBmUFdV0yXBqO484CDzp5Bj7z_mKcBav8xgTRM0uEdohuprg.; SUB=_2A25w5Ps_DeRhGeBK6FQU8CvJwjSIHXVTkGv3rDV8PUNbmtBeLUXakW9NR93hbJIseJD1haFYH8STgaMVsRkpM1yY; SUBP=0033WrSXqPxfM725Ws9jqgMF55529P9D9W5zny7FLIxKUOfwzh7B98AA5JpX5KzhUgL.FoqXe0qfeh-f1Kn2dJLoIfQLxKqL1K.L1-2LxKBLBonL122LxKML1-BLBK2LxKBLBonL12BLxKqL1hnLBoeLxKML1-2L1hBLxK-L1K5LB-eLxKqLB-BL12eLxK-LBo5L1K2LxK-LBo.LBoBt; SUHB=0FQ7qQX89hc79G; wvr=6; webim_unReadCount=%7B%22time%22%3A1574996883376%2C%22dm_pub_total%22%3A0%2C%22chat_group_client%22%3A0%2C%22allcountNum%22%3A56%2C%22msgbox%22%3A0%7D; UOR=www.google.com.hk,weibo.com,www.google.com"
3 |     "Cookie": "OUTFOX_SEARCH_USER_ID_NCOO=190685000.04271516; _T_WM=98250732886; MLOGIN=1; ALF=1577588846; SCF=Al_WvdbtXEH8Yao4ordpDA-hA_gh9UtndibTXXHvBmUFmyO4zBrfxMlN7lhYijtYCVMofHDmXW5FC2WkqI33Inc.; SUB=_2A25w5PyKDeRhGeBK6FQU8CvJwjSIHXVQJoTCrDV6PUJbktANLU_TkW1NR93hbEKLhuRhdBRHhHRwXA89Ut8v0cKO; SUBP=0033WrSXqPxfM725Ws9jqgMF55529P9D9W5zny7FLIxKUOfwzh7B98AA5JpX5K-hUgL.FoqXe0qfeh-f1Kn2dJLoIfQLxKqL1K.L1-2LxKBLBonL122LxKML1-BLBK2LxKBLBonL12BLxKqL1hnLBoeLxKML1-2L1hBLxK-L1K5LB-eLxKqLB-BL12eLxK-LBo5L1K2LxK-LBo.LBoBt; SUHB=0oU4rKHNUbbvXs; 
SSOLoginState=1574997210"
4 | }
5 | headers = {
6 |     'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:55.0) Gecko/20100101 Firefox/55.0',
7 |     'Host': 'weibo.cn',
8 |     'Accept': 'application/json, text/plain, */*',
9 |     'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
10 |     'Accept-Encoding': 'gzip, deflate, br',
11 |     'Connection': 'keep-alive',
12 | }
13 | 
14 | headers_sub_topic = {
15 |     'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:55.0) Gecko/20100101 Firefox/55.0',
16 |     'Host': 's.weibo.com',
17 |     'Accept': 'application/json, text/plain, */*',
18 |     'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
19 |     'Accept-Encoding': 'gzip, deflate, br',
20 |     'Connection': 'keep-alive',
21 | }
22 | 
23 | headers_topic = {
24 |     'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:55.0) Gecko/20100101 Firefox/55.0',
25 |     'Host': 's.weibo.com',
26 |     'Accept': 'application/json, text/plain, */*',
27 |     'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
28 |     'Accept-Encoding': 'gzip, deflate, br',
29 | }
30 | 
--------------------------------------------------------------------------------
/dataCrawl.py:
--------------------------------------------------------------------------------
1 | import sys
2 | sys.path.append("..")
3 | import requests
4 | import time
5 | import random
6 | from bs4 import BeautifulSoup
7 | # from tools import Cookie_Process
8 | # from tools.Emoji_Process import filter_emoji
9 | import pandas as pd
10 | 
11 | from configs import Cookie, headers, headers_topic, headers_sub_topic
12 | 
13 | 
14 | # Cookie = {
15 | #     "Cookie":'SINAGLOBAL=611428576138.9926.1529031585340; wb_view_log=1280*7201.5; wvr=6; Ugrow-G0=140ad66ad7317901fc818d7fd7743564; login_sid_t=738f2639b7ec9ed4bda4bda71f80fbe1; cross_origin_proto=SSL; _s_tentry=login.sina.com.cn; Apache=1358246560106.4663.1574947692615; ULV=1574947692624:5:4:2:1358246560106.4663.1574947692615:1574945962066; ALF=1606483698; SSOLoginState=1574947698; 
SCF=Al_WvdbtXEH8Yao4ordpDA-hA_gh9UtndibTXXHvBmUFzHYP1JlL3i6hA8vy0DzC5TNsedQkLtWUdvNHGhm8UmE.; SUB=_2A25w27sjDeRhGeBK6FQU8CvJwjSIHXVTkKvrrDV8PUNbmtBeLRPNkW9NR93hbGsq6Cqd6bUBQo17s-KlQu2Y7IRL; SUBP=0033WrSXqPxfM725Ws9jqgMF55529P9D9W5zny7FLIxKUOfwzh7B98AA5JpX5KzhUgL.FoqXe0qfeh-f1Kn2dJLoIfQLxKqL1K.L1-2LxKBLBonL122LxKML1-BLBK2LxKBLBonL12BLxKqL1hnLBoeLxKML1-2L1hBLxK-L1K5LB-eLxKqLB-BL12eLxK-LBo5L1K2LxK-LBo.LBoBt; SUHB=04W-JfJec9qftr; YF-V5-G0=260e732907e3bd813efaef67866e5183; UOR=www.google.com.hk,weibo.com,www.google.com; YF-Page-G0=7f483edf167a381b771295af62b14a27|1574954484|1574954484; wb_view_log_6436505598=1280*7201.5; webim_unReadCount=%7B%22time%22%3A1574954492285%2C%22dm_pub_total%22%3A0%2C%22chat_group_client%22%3A0%2C%22allcountNum%22%3A0%2C%22msgbox%22%3A0%7D',
16 | # }
17 | 
18 | # Cookie = {
19 | #     # "Cookie":"SINAGLOBAL=611428576138.9926.1529031585340; Ugrow-G0=140ad66ad7317901fc818d7fd7743564; login_sid_t=738f2639b7ec9ed4bda4bda71f80fbe1; cross_origin_proto=SSL; _s_tentry=login.sina.com.cn; Apache=1358246560106.4663.1574947692615; ULV=1574947692624:5:4:2:1358246560106.4663.1574947692615:1574945962066; YF-V5-G0=260e732907e3bd813efaef67866e5183; YF-Page-G0=7f483edf167a381b771295af62b14a27|1574954499|1574954484; crossidccode=CODE-gz-1IAwCP-43YF5X-Pks0uH6BiIchdC4a9cfb2; ALF=1606532846; SSOLoginState=1574996846; SCF=Al_WvdbtXEH8Yao4ordpDA-hA_gh9UtndibTXXHvBmUFdV0yXBqO484CDzp5Bj7z_mKcBav8xgTRM0uEdohuprg.; SUB=_2A25w5Ps_DeRhGeBK6FQU8CvJwjSIHXVTkGv3rDV8PUNbmtBeLUXakW9NR93hbJIseJD1haFYH8STgaMVsRkpM1yY; SUBP=0033WrSXqPxfM725Ws9jqgMF55529P9D9W5zny7FLIxKUOfwzh7B98AA5JpX5KzhUgL.FoqXe0qfeh-f1Kn2dJLoIfQLxKqL1K.L1-2LxKBLBonL122LxKML1-BLBK2LxKBLBonL12BLxKqL1hnLBoeLxKML1-2L1hBLxK-L1K5LB-eLxKqLB-BL12eLxK-LBo5L1K2LxK-LBo.LBoBt; SUHB=0FQ7qQX89hc79G; wvr=6; webim_unReadCount=%7B%22time%22%3A1574996883376%2C%22dm_pub_total%22%3A0%2C%22chat_group_client%22%3A0%2C%22allcountNum%22%3A56%2C%22msgbox%22%3A0%7D; UOR=www.google.com.hk,weibo.com,www.google.com"
20 | #     "Cookie":"OUTFOX_SEARCH_USER_ID_NCOO=190685000.04271516; _T_WM=98250732886; MLOGIN=1; ALF=1577588846; SCF=Al_WvdbtXEH8Yao4ordpDA-hA_gh9UtndibTXXHvBmUFmyO4zBrfxMlN7lhYijtYCVMofHDmXW5FC2WkqI33Inc.; SUB=_2A25w5PyKDeRhGeBK6FQU8CvJwjSIHXVQJoTCrDV6PUJbktANLU_TkW1NR93hbEKLhuRhdBRHhHRwXA89Ut8v0cKO; SUBP=0033WrSXqPxfM725Ws9jqgMF55529P9D9W5zny7FLIxKUOfwzh7B98AA5JpX5K-hUgL.FoqXe0qfeh-f1Kn2dJLoIfQLxKqL1K.L1-2LxKBLBonL122LxKML1-BLBK2LxKBLBonL12BLxKqL1hnLBoeLxKML1-2L1hBLxK-L1K5LB-eLxKqLB-BL12eLxK-LBo5L1K2LxK-LBo.LBoBt; SUHB=0oU4rKHNUbbvXs; SSOLoginState=1574997210"
21 | # }
22 | # headers = {
23 | #     'User-agent' : 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:55.0) Gecko/20100101 Firefox/55.0',
24 | #     'Host': 'weibo.cn',
25 | #     'Accept' : 'application/json, text/plain, */*',
26 | #     'Accept-Language' : 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
27 | #     'Accept-Encoding' : 'gzip, deflate, br',
28 | #     'Connection' : 'keep-alive',
29 | # }
30 | # headers_sub_topic = {
31 | #     'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:55.0) Gecko/20100101 Firefox/55.0',
32 | #     'Host': 's.weibo.com',
33 | #     'Accept': 'application/json, text/plain, */*',
34 | #     'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
35 | #     'Accept-Encoding': 'gzip, deflate, br',
36 | #     'Connection': 'keep-alive',
37 | # }
38 | # headers_topic = {
39 | #     'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:55.0) Gecko/20100101 Firefox/55.0',
40 | #     'Host': 's.weibo.com',
41 | #     'Accept': 'application/json, text/plain, */*',
42 | #     'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
43 | #     'Accept-Encoding': 'gzip, deflate, br',
44 | # }
45 | # df_comment = pd.DataFrame(columns=(
46 | #     'idx', 'wid', 'content', 'uid', 'uname', 'likes', 'created_time', 'keyword'
47 | # ))
48 | # line = 0
49 | 
50 | df_comment = pd.read_csv('comments/comment_v1.csv')  # forward slash: '\c' is not a valid escape and only worked on Windows
51 | line = df_comment.shape[0]
52 | 
53 | url_comment = 'https://weibo.cn/comment/{}?&page={}'
54 | # Crawl the comments of a single Weibo post
55 | 
56 | def 
fetch_comment_data(wbid, keyword, cookies):
57 |     global line
58 | 
59 |     print('https://weibo.cn/comment/{}'.format(wbid))
60 |     r_comment = requests.get('https://weibo.cn/comment/{}'.format(wbid), headers=headers, cookies=cookies)
61 |     soup_comment = BeautifulSoup(r_comment.text, 'lxml')
62 |     flag = False
63 |     try:
64 |         flag = soup_comment.select('.c')[-1].text.startswith('还没有人针对')  # page text shown when a post has no comments yet
65 |     except Exception as e:
66 |         page_num = 1
67 | 
68 |     if flag:
69 |         print("--------- No one has commented on this post! ---------\n")
70 |         return
71 |     else:
72 |         try:
73 |             page_num = int(soup_comment.select_one(".pa").text.split()[-1].split("/")[-1].split("页")[0])  # total pages, parsed from the pager's trailing ".../<total>页"
74 |         except Exception as e:
75 |             page_num = 1
76 | 
77 |     print(page_num)
78 |     if page_num > 200:
79 |         page_num = 200
80 | 
81 |     page_id = 1
82 |     flag = -1
83 |     # commentinfos = []
84 |     print("--------- Post {} has {} pages of comments in total ---------\n".format(wbid, page_num))
85 |     while page_id <= page_num:  # was `<`, which skipped the last page and every single-page post
86 |         if flag == line:
87 |             print('May have been blocked ------')
88 |             break
89 |         elif flag != line:
90 |             flag = line
91 |             print('{} comments crawled so far ......'.format(line))
92 |             time.sleep(random.uniform(1,2))   # sleep for a random interval
93 | 
94 |         print("++++++ Crawling post {} comments, page {} ...... 
++++++\n".format(wbid, page_id))
95 |         r_comment = requests.get(url_comment.format(wbid, page_id), headers=headers, cookies=cookies)
96 |         time.sleep(random.uniform(2, 3))  # sleep for a random interval
97 | 
98 |         soup_comment = BeautifulSoup(r_comment.text, 'lxml')
99 |         comment_list = soup_comment.select(".c")  # select the elements with class "c"
100 | 
101 |         for l in comment_list:
102 |             if str(l.get("id")).startswith("C_"):
103 |                 # comment_content = filter_emoji(l.select_one(".ctt").text)
104 |                 comment_content = l.select_one(".ctt").text
105 |                 comment_userid = l.select_one("a").get("href")[3:]
106 |                 comment_username = l.select_one("a").text
107 |                 comment_like = l.select_one(".cc").text.strip()[2:-1]
108 |                 # comment_createtime = time_process(l.select_one(".ct").text.strip()[:-5])
109 |                 comment_createtime = l.select_one(".ct").text.strip()[:-5]
110 | 
111 |                 commentinfo = [line, wbid, comment_content, comment_userid,
112 |                                comment_username, comment_like, comment_createtime, keyword]
113 | 
114 |                 df_comment.loc[line] = commentinfo
115 |                 line += 1
116 |                 if line%100==0:
117 |                     df_comment.to_csv('comments/comment_v1.csv', index=False)
118 |                     time.sleep(20)
119 |         page_id = page_id + 1
120 | 
121 |     df_comment.to_csv('comments/comment_v1.csv', index=False)
122 |     print("--------- All comments of this post have been crawled! ---------\n\n")
123 | 
124 | 
125 | def search_topic_content_id(keyword):
126 |     # Search by keyword starting from the topic page; posts under a topic are more likely to have enough comments
127 |     topic_content = requests.get('https://s.weibo.com/topic?q={}&pagetype=topic&topic=1&Refer=weibo_topic'.format(keyword), headers=headers_topic)
128 |     if topic_content.status_code!=200:
129 |         print('connection error')
130 |         return [], []  # was -1, which the caller cannot unpack into two lists
131 | 
132 |     soup_content = BeautifulSoup(topic_content.text, 'lxml')
133 |     names = soup_content.select(".name")
134 |     sub_topic_herfs = []
135 |     sub_topic_name = []
136 | 
137 |     for name in names:
138 |         # print(name.get("href"))  # link of each sub-topic under the topic
139 |         sub_topic_herfs.append(name.get("href"))
140 |         # print(name.text.strip('#'))  # name of each sub-topic under the topic
141 |         sub_topic_name.append(name.text.strip('#'))
142 |     # 
print(len(names))
143 |     # print(names[0])
144 | 
145 |     return sub_topic_herfs, sub_topic_name
146 | 
147 | def crawl_sub_topic_content(sub_topic_url, cookies):
148 |     '''
149 |     :param sub_topic_url: URL of the sub-topic
150 |     :return: list in which each element is the comment URL of one post (20 per sub-topic)
151 |     '''
152 |     sub_topic_url += '&xsort=hot&suball=1&tw=hotweibo&Refer=weibo_hot'  # sort the sub-topic's posts by "hot", so each post has more comments
153 |     sub_contents = requests.get(sub_topic_url, headers=headers_sub_topic, cookies=cookies)
154 |     # print(sub_contents.status_code)
155 |     if sub_contents.status_code!=200:
156 |         print('connect to sub topic error')
157 |         return []  # was 0, which the caller cannot iterate over
158 |     # print(sub_contents.text)
159 |     soup_contents = BeautifulSoup(sub_contents.text, 'lxml')
160 | 
161 |     content_urls = soup_contents.select('.from')  # used to obtain each post's unique 9-character ID, e.g. Ii5QKf6EH
162 |     comment_urls = []
163 | 
164 |     for c_str in content_urls:
165 |         # print(c_str)
166 |         only_str = c_str.select_one("a").get("href")[2:].split('/')[2][:9]  # the post's unique 9-character ID
167 |         comment_urls.append('https://weibo.cn/comment/' + only_str)
168 |     return comment_urls
169 | 
170 | 
171 | if __name__ == '__main__':
172 |     # cookie = Cookie_Process.write_cookie()  # load the cookie stored in a file
173 |     df_comment_urls = pd.DataFrame(columns=(
174 |         'sub_topic_name', 'comment_url'
175 |     ))
176 |     line_url = 0
177 |     time_start = time.time()
178 |     sub_topic_urls, sub_names = search_topic_content_id('高以翔')
179 |     for url, name in zip(sub_topic_urls, sub_names):
180 |         print(line_url)
181 |         sub_comment_urls = crawl_sub_topic_content(url, cookies=Cookie)
182 |         for comment_url in sub_comment_urls:
183 |             df_comment_urls.loc[line_url] = [name, comment_url]
184 |             line_url += 1
185 |         time.sleep(2)
186 |     print(df_comment_urls.shape)
187 |     df_comment_urls.to_csv('comments/comment_urls.csv', index=False)
188 |     # crawl_sub_topic_content('https://s.weibo.com/weibo?q=%23%E9%AB%98%E4%BB%A5%E7%BF%94%23', cookies=Cookie)
189 |     # search_all_comment(input("Enter the keyword of the Weibo comments to search: "))
190 |     # contents_strs = ['Ii7jlyQvu', 'Ii6ZY7972', 'Iihqa5WM7', 'IigXmr5Wv', 
'IipNXjNZe', 'IipRscSR6'] #'Ii8Y09uyw', 'Ii7KAig2u','Ii9fuDZtm','IicxftnLC','IiltQpmgz'] #'Ii6s9xVZP', 'Ii7MN70E2'] 191 | # for content_str in contents_strs: 192 | # fetch_comment_data(content_str, '高以翔', Cookie) 193 | # time.sleep(20) 194 | 195 | time_end = time.time() 196 | 197 | print('本次操作数据全部爬取成功,爬取用时秒数:', (time_end - time_start)) 198 | 199 | 200 | 201 | 202 | 203 | 204 | -------------------------------------------------------------------------------- /dataProcess.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | from collections import Counter 3 | import re 4 | import jieba as jb 5 | import jieba.posseg 6 | import jieba.analyse as analyse 7 | # from snownlp import SnowNLP 8 | from snownlp import sentiment 9 | 10 | 11 | from dataShow import plot_most_frequent, sentiment_distribution, words_cloud 12 | 13 | 14 | def clean_str(texts): 15 | texts_copy = [] 16 | for text in texts: 17 | # if math.isnan(text): 18 | if text.__class__.__name__ == 'float': # 去掉一些nan值 19 | continue 20 | if text.startswith('回复@'): 21 | text = re.sub('回复@[^:].+:', '', text) 22 | text = re.sub('[。!!.]{3,}', '', text) 23 | if len(text) >= 4: 24 | texts_copy.append(text) 25 | return texts_copy 26 | 27 | 28 | def word_frequence(texts, most_common=100): 29 | stop_words = [line.strip() for line in open('stopwords\哈工大停用词表.txt', 'r', encoding='utf-8').readlines()] 30 | c = Counter() 31 | for text in texts: 32 | cuted_text = jieba.posseg.cut(text) # 带词性的分词 33 | for word, pos in cuted_text: 34 | if word not in stop_words and (pos.startswith('v') or pos.startswith('n')): 35 | c[word] += 1 36 | common_words = c.most_common(most_common) 37 | for common_tuple in common_words: 38 | print('s', common_tuple[0], 's', ' ', common_tuple[1], ' ', common_tuple[1]/len(texts)) 39 | return common_words 40 | 41 | 42 | def sentiment_analyse(texts): 43 | sentiments = [] 44 | for text in texts: 45 | sentiments.append(sentiment.classify(text)) 46 | return 
sentiments 47 | 48 | 49 | def extract_tags(texts): 50 | document = ' '.join(texts) 51 | keywords_textrank = analyse.textrank(document, topK=30) 52 | # print(keywords) 53 | keywords_tfidf = analyse.tfidf(document, topK=30) 54 | # print(keywords) 55 | return keywords_textrank, keywords_tfidf 56 | 57 | 58 | if __name__ == "__main__": 59 | df_comments = pd.read_csv('comments\comment_v1.csv') 60 | comments = df_comments['content'] 61 | print(len(comments)) 62 | comments = clean_str(comments) 63 | print(len(comments)) 64 | print() 65 | print(comments[:50]) 66 | extract_tags(comments[:100]) 67 | commons = word_frequence(comments) 68 | words_cloud(comments) 69 | # plot_most_frequent(commons) 70 | # sentiments = sentiment_analyse(comments) 71 | # sentiment_distribution(sentiments, comments) 72 | # 73 | # for s, t in zip(sentiments, comments): 74 | # pos = 0 75 | # neg = 0 76 | # if s>0.95: 77 | # print(round(s, 3), t) 78 | # pos += 1 79 | # elif s<0.05: 80 | # print(round(s, 3), t) 81 | # neg += 1 82 | # print(pos, neg) 83 | # plt.plot(range(len(sentiments)), sorted(sentiments)) 84 | # plt.show() 85 | -------------------------------------------------------------------------------- /dataShow.py: -------------------------------------------------------------------------------- 1 | import matplotlib.pyplot as plt 2 | import numpy as np 3 | import matplotlib 4 | from wordcloud import WordCloud 5 | import jieba as jb 6 | 7 | matplotlib.rcParams['font.family'] = 'Times New Roman' 8 | plt.style.use('dark_background') 9 | 10 | 11 | def plot_most_frequent(common_words): 12 | # rows = len(common_words/10) #一张图画十个 13 | data = np.zeros(len(common_words)) 14 | words = [] 15 | for ii in range(len(common_words)): 16 | print(common_words[ii]) 17 | words.append(common_words[ii][0]) 18 | data[ii] = common_words[ii][1] 19 | 20 | plt.figure(figsize=(20, 5)) 21 | plt.bar(range(len(data)), data, width=0.4) 22 | plt.xticks(range(len(data)), words, rotation=90) 23 | plt.title('词频图 (名词与动词)') 24 | 
plt.xlabel('词') 25 | plt.ylabel('频次') 26 | plt.savefig('result\word_frequent.png', dpi=600, bbox_inches='tight') 27 | plt.show() 28 | 29 | # data = data.reshape((-1, 10)) 30 | 31 | # plt.figure(figsize=(5, 10)) 32 | # for ii in range(data.shape[0]): 33 | # plt.subplot(data.shape[0], 1, ii+1) 34 | # plt.bar(range(len(data[ii])), data[ii], width=0.4) 35 | # plt.ylim([0, common_words[0][1] + 30]) 36 | # plt.show() 37 | 38 | return 0 39 | 40 | 41 | def words_cloud(comments): 42 | # pass 43 | wcd = WordCloud( 44 | background_color='black', 45 | repeat=False, 46 | max_words=100, 47 | height=400, 48 | width=500, 49 | mask=None, 50 | max_font_size=50, 51 | min_font_size=10, 52 | colormap='Reds', 53 | mode='RGBA' 54 | ) 55 | comments = ' '.join(comments) 56 | ss = ' '.join(jb.lcut(comments)) 57 | wcd.generate(ss) 58 | wcd.to_image() 59 | wcd.to_file('result\wordCloud.png') 60 | print('+++++ 词云已经生成完毕 +++++') 61 | return 0 62 | 63 | 64 | def sentiment_distribution(sentiments, comments, up=0.95, down=0.05): 65 | pos = 0 66 | neg = 0 67 | for s, t in zip(sentiments, comments): 68 | if s > up: 69 | print(round(s, 3), t) 70 | pos += 1 71 | elif s < down: 72 | print(round(s, 3), t) 73 | neg += 1 74 | print(pos, neg) 75 | # plt.plot(range(len(sentiments)), sorted(sentiments)) 76 | plt.scatter(range(len(sentiments)), sentiments, s=0.4) 77 | plt.scatter(range(len(sentiments)), sorted(sentiments), s=0.2, marker='*') 78 | plt.ylim([0, 1.1]) 79 | plt.ylabel('Sentiment') 80 | plt.savefig('result\sentiment.png', dpi=1080, bbox_inches='tight') 81 | plt.show() 82 | 83 | return 0 84 | -------------------------------------------------------------------------------- /result/2.txt: -------------------------------------------------------------------------------- 1 | 2 2 | -------------------------------------------------------------------------------- /stopwords/1.txt: -------------------------------------------------------------------------------- 1 | 1 2 | 
-------------------------------------------------------------------------------- /stopwords/哈工大停用词表.txt: -------------------------------------------------------------------------------- 1 | 2 | ——— 3 | 》), 4 | )÷(1- 5 | ”, 6 | )、 7 | =( 8 | : 9 | → 10 | ℃ 11 | & 12 | * 13 | 一一 14 | ~~~~ 15 | ’ 16 | . 17 | 『 18 | .一 19 | ./ 20 | -- 21 | 』 22 | =″ 23 | 【 24 | [*] 25 | }> 26 | [⑤]] 27 | [①D] 28 | c] 29 | ng昉 30 | * 31 | // 32 | [ 33 | ] 34 | [②e] 35 | [②g] 36 | ={ 37 | } 38 | ,也 39 | ‘ 40 | A 41 | [①⑥] 42 | [②B] 43 | [①a] 44 | [④a] 45 | [①③] 46 | [③h] 47 | ③] 48 | 1. 49 | -- 50 | [②b] 51 | ’‘ 52 | ××× 53 | [①⑧] 54 | 0:2 55 | =[ 56 | [⑤b] 57 | [②c] 58 | [④b] 59 | [②③] 60 | [③a] 61 | [④c] 62 | [①⑤] 63 | [①⑦] 64 | [①g] 65 | ∈[ 66 | [①⑨] 67 | [①④] 68 | [①c] 69 | [②f] 70 | [②⑧] 71 | [②①] 72 | [①C] 73 | [③c] 74 | [③g] 75 | [②⑤] 76 | [②②] 77 | 一. 78 | [①h] 79 | .数 80 | [] 81 | [①B] 82 | 数/ 83 | [①i] 84 | [③e] 85 | [①①] 86 | [④d] 87 | [④e] 88 | [③b] 89 | [⑤a] 90 | [①A] 91 | [②⑧] 92 | [②⑦] 93 | [①d] 94 | [②j] 95 | 〕〔 96 | ][ 97 | :// 98 | ′∈ 99 | [②④ 100 | [⑤e] 101 | 12% 102 | b] 103 | ... 104 | ................... 105 | …………………………………………………③ 106 | ZXFITL 107 | [③F] 108 | 」 109 | [①o] 110 | ]∧′=[ 111 | ∪φ∈ 112 | ′| 113 | {- 114 | ②c 115 | } 116 | [③①] 117 | R.L. 118 | [①E] 119 | Ψ 120 | -[*]- 121 | ↑ 122 | .日 123 | [②d] 124 | [② 125 | [②⑦] 126 | [②②] 127 | [③e] 128 | [①i] 129 | [①B] 130 | [①h] 131 | [①d] 132 | [①g] 133 | [①②] 134 | [②a] 135 | f] 136 | [⑩] 137 | a] 138 | [①e] 139 | [②h] 140 | [②⑥] 141 | [③d] 142 | [②⑩] 143 | e] 144 | 〉 145 | 】 146 | 元/吨 147 | [②⑩] 148 | 2.3% 149 | 5:0 150 | [①] 151 | :: 152 | [②] 153 | [③] 154 | [④] 155 | [⑤] 156 | [⑥] 157 | [⑦] 158 | [⑧] 159 | [⑨] 160 | …… 161 | —— 162 | ? 163 | 、 164 | 。 165 | “ 166 | ” 167 | 《 168 | 》 169 | ! 170 | , 171 | : 172 | ; 173 | ? 174 | . 175 | , 176 | . 177 | ' 178 | ? 179 | · 180 | ——— 181 | ── 182 | ? 
183 | — 184 | < 185 | > 186 | ( 187 | ) 188 | 〔 189 | 〕 190 | [ 191 | ] 192 | ( 193 | ) 194 | - 195 | + 196 | ~ 197 | × 198 | / 199 | / 200 | ① 201 | ② 202 | ③ 203 | ④ 204 | ⑤ 205 | ⑥ 206 | ⑦ 207 | ⑧ 208 | ⑨ 209 | ⑩ 210 | Ⅲ 211 | В 212 | " 213 | ; 214 | # 215 | @ 216 | γ 217 | μ 218 | φ 219 | φ. 220 | × 221 | Δ 222 | ■ 223 | ▲ 224 | sub 225 | exp 226 | sup 227 | sub 228 | Lex 229 | # 230 | % 231 | & 232 | ' 233 | + 234 | +ξ 235 | ++ 236 | - 237 | -β 238 | < 239 | <± 240 | <Δ 241 | <λ 242 | <φ 243 | << 244 | = 245 | = 246 | =☆ 247 | =- 248 | > 249 | >λ 250 | _ 251 | ~± 252 | ~+ 253 | [⑤f] 254 | [⑤d] 255 | [②i] 256 | ≈ 257 | [②G] 258 | [①f] 259 | LI 260 | ㈧ 261 | [- 262 | ...... 263 | 〉 264 | [③⑩] 265 | 第二 266 | 一番 267 | 一直 268 | 一个 269 | 一些 270 | 许多 271 | 种 272 | 有的是 273 | 也就是说 274 | 末##末 275 | 啊 276 | 阿 277 | 哎 278 | 哎呀 279 | 哎哟 280 | 唉 281 | 俺 282 | 俺们 283 | 按 284 | 按照 285 | 吧 286 | 吧哒 287 | 把 288 | 罢了 289 | 被 290 | 本 291 | 本着 292 | 比 293 | 比方 294 | 比如 295 | 鄙人 296 | 彼 297 | 彼此 298 | 边 299 | 别 300 | 别的 301 | 别说 302 | 并 303 | 并且 304 | 不比 305 | 不成 306 | 不单 307 | 不但 308 | 不独 309 | 不管 310 | 不光 311 | 不过 312 | 不仅 313 | 不拘 314 | 不论 315 | 不怕 316 | 不然 317 | 不如 318 | 不特 319 | 不惟 320 | 不问 321 | 不只 322 | 朝 323 | 朝着 324 | 趁 325 | 趁着 326 | 乘 327 | 冲 328 | 除 329 | 除此之外 330 | 除非 331 | 除了 332 | 此 333 | 此间 334 | 此外 335 | 从 336 | 从而 337 | 打 338 | 待 339 | 但 340 | 但是 341 | 当 342 | 当着 343 | 到 344 | 得 345 | 的 346 | 的话 347 | 等 348 | 等等 349 | 地 350 | 第 351 | 叮咚 352 | 对 353 | 对于 354 | 多 355 | 多少 356 | 而 357 | 而况 358 | 而且 359 | 而是 360 | 而外 361 | 而言 362 | 而已 363 | 尔后 364 | 反过来 365 | 反过来说 366 | 反之 367 | 非但 368 | 非徒 369 | 否则 370 | 嘎 371 | 嘎登 372 | 该 373 | 赶 374 | 个 375 | 各 376 | 各个 377 | 各位 378 | 各种 379 | 各自 380 | 给 381 | 根据 382 | 跟 383 | 故 384 | 故此 385 | 固然 386 | 关于 387 | 管 388 | 归 389 | 果然 390 | 果真 391 | 过 392 | 哈 393 | 哈哈 394 | 呵 395 | 和 396 | 何 397 | 何处 398 | 何况 399 | 何时 400 | 嘿 401 | 哼 402 | 哼唷 403 | 呼哧 404 | 乎 405 | 哗 406 | 还是 407 | 还有 408 | 换句话说 409 | 换言之 410 | 或 411 | 或是 412 | 或者 413 | 
极了 414 | 及 415 | 及其 416 | 及至 417 | 即 418 | 即便 419 | 即或 420 | 即令 421 | 即若 422 | 即使 423 | 几 424 | 几时 425 | 己 426 | 既 427 | 既然 428 | 既是 429 | 继而 430 | 加之 431 | 假如 432 | 假若 433 | 假使 434 | 鉴于 435 | 将 436 | 较 437 | 较之 438 | 叫 439 | 接着 440 | 结果 441 | 借 442 | 紧接着 443 | 进而 444 | 尽 445 | 尽管 446 | 经 447 | 经过 448 | 就 449 | 就是 450 | 就是说 451 | 据 452 | 具体地说 453 | 具体说来 454 | 开始 455 | 开外 456 | 靠 457 | 咳 458 | 可 459 | 可见 460 | 可是 461 | 可以 462 | 况且 463 | 啦 464 | 来 465 | 来着 466 | 离 467 | 例如 468 | 哩 469 | 连 470 | 连同 471 | 两者 472 | 了 473 | 临 474 | 另 475 | 另外 476 | 另一方面 477 | 论 478 | 嘛 479 | 吗 480 | 慢说 481 | 漫说 482 | 冒 483 | 么 484 | 每 485 | 每当 486 | 们 487 | 莫若 488 | 某 489 | 某个 490 | 某些 491 | 拿 492 | 哪 493 | 哪边 494 | 哪儿 495 | 哪个 496 | 哪里 497 | 哪年 498 | 哪怕 499 | 哪天 500 | 哪些 501 | 哪样 502 | 那 503 | 那边 504 | 那儿 505 | 那个 506 | 那会儿 507 | 那里 508 | 那么 509 | 那么些 510 | 那么样 511 | 那时 512 | 那些 513 | 那样 514 | 乃 515 | 乃至 516 | 呢 517 | 能 518 | 你 519 | 你们 520 | 您 521 | 宁 522 | 宁可 523 | 宁肯 524 | 宁愿 525 | 哦 526 | 呕 527 | 啪达 528 | 旁人 529 | 呸 530 | 凭 531 | 凭借 532 | 其 533 | 其次 534 | 其二 535 | 其他 536 | 其它 537 | 其一 538 | 其余 539 | 其中 540 | 起 541 | 起见 542 | 起见 543 | 岂但 544 | 恰恰相反 545 | 前后 546 | 前者 547 | 且 548 | 然而 549 | 然后 550 | 然则 551 | 让 552 | 人家 553 | 任 554 | 任何 555 | 任凭 556 | 如 557 | 如此 558 | 如果 559 | 如何 560 | 如其 561 | 如若 562 | 如上所述 563 | 若 564 | 若非 565 | 若是 566 | 啥 567 | 上下 568 | 尚且 569 | 设若 570 | 设使 571 | 甚而 572 | 甚么 573 | 甚至 574 | 省得 575 | 时候 576 | 什么 577 | 什么样 578 | 使得 579 | 是 580 | 是的 581 | 首先 582 | 谁 583 | 谁知 584 | 顺 585 | 顺着 586 | 似的 587 | 虽 588 | 虽然 589 | 虽说 590 | 虽则 591 | 随 592 | 随着 593 | 所 594 | 所以 595 | 他 596 | 他们 597 | 他人 598 | 它 599 | 它们 600 | 她 601 | 她们 602 | 倘 603 | 倘或 604 | 倘然 605 | 倘若 606 | 倘使 607 | 腾 608 | 替 609 | 通过 610 | 同 611 | 同时 612 | 哇 613 | 万一 614 | 往 615 | 望 616 | 为 617 | 为何 618 | 为了 619 | 为什么 620 | 为着 621 | 喂 622 | 嗡嗡 623 | 我 624 | 我们 625 | 呜 626 | 呜呼 627 | 乌乎 628 | 无论 629 | 无宁 630 | 毋宁 631 | 嘻 632 | 吓 633 | 相对而言 634 | 像 635 | 向 636 | 向着 637 | 嘘 638 | 呀 639 | 焉 640 | 沿 641 | 沿着 642 | 
要 643 | 要不 644 | 要不然 645 | 要不是 646 | 要么 647 | 要是 648 | 也 649 | 也罢 650 | 也好 651 | 一 652 | 一般 653 | 一旦 654 | 一方面 655 | 一来 656 | 一切 657 | 一样 658 | 一则 659 | 依 660 | 依照 661 | 矣 662 | 以 663 | 以便 664 | 以及 665 | 以免 666 | 以至 667 | 以至于 668 | 以致 669 | 抑或 670 | 因 671 | 因此 672 | 因而 673 | 因为 674 | 哟 675 | 用 676 | 由 677 | 由此可见 678 | 由于 679 | 有 680 | 有的 681 | 有关 682 | 有些 683 | 又 684 | 于 685 | 于是 686 | 于是乎 687 | 与 688 | 与此同时 689 | 与否 690 | 与其 691 | 越是 692 | 云云 693 | 哉 694 | 再说 695 | 再者 696 | 在 697 | 在下 698 | 咱 699 | 咱们 700 | 则 701 | 怎 702 | 怎么 703 | 怎么办 704 | 怎么样 705 | 怎样 706 | 咋 707 | 照 708 | 照着 709 | 者 710 | 这 711 | 这边 712 | 这儿 713 | 这个 714 | 这会儿 715 | 这就是说 716 | 这里 717 | 这么 718 | 这么点儿 719 | 这么些 720 | 这么样 721 | 这时 722 | 这些 723 | 这样 724 | 正如 725 | 吱 726 | 之 727 | 之类 728 | 之所以 729 | 之一 730 | 只是 731 | 只限 732 | 只要 733 | 只有 734 | 至 735 | 至于 736 | 诸位 737 | 着 738 | 着呢 739 | 自 740 | 自从 741 | 自个儿 742 | 自各儿 743 | 自己 744 | 自家 745 | 自身 746 | 综上所述 747 | 总的来看 748 | 总的来说 749 | 总的说来 750 | 总而言之 751 | 总之 752 | 纵 753 | 纵令 754 | 纵然 755 | 纵使 756 | 遵照 757 | 作为 758 | 兮 759 | 呃 760 | 呗 761 | 咚 762 | 咦 763 | 喏 764 | 啐 765 | 喔唷 766 | 嗬 767 | 嗯 768 | 嗳 769 | --------------------------------------------------------------------------------
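The stop word list above is meant to be loaded once and used for membership tests while counting words. The following is a minimal sketch of that idea, assuming one stop word per line; `load_stopwords` and `filter_tokens` are hypothetical helper names rather than functions from this repository, and the in-memory sample stands in for the real file.

```python
import io


def load_stopwords(lines):
    """Build a stop word set from an iterable of lines, skipping blank lines."""
    return {line.strip() for line in lines if line.strip()}


def filter_tokens(tokens, stop_words):
    """Return the tokens that are not stop words, preserving their order."""
    return [token for token in tokens if token not in stop_words]


# Hypothetical sample standing in for stopwords/哈工大停用词表.txt
sample_file = io.StringIO("的\n了\n\n我们\n")
stop_words = load_stopwords(sample_file)

# Tokens as a segmenter such as jieba might produce them (hand-written here)
tokens = ["我们", "喜欢", "的", "电影"]
kept = filter_tokens(tokens, stop_words)
print(kept)
```

Using a set rather than a list keeps each lookup constant-time, which matters when every token of every comment is checked against several hundred stop words.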