├── README.md
├── __pycache__
    ├── get_distance.cpython-36.pyc
    ├── qqlist.cpython-36.pyc
    ├── settings.cpython-36.pyc
    └── transCoordinateSystem.cpython-36.pyc
├── baidu_relitu.html
├── chromedriver.exe
├── data.txt
├── data2txt.bat
├── debug.log
├── example
    ├── example2018-10-06-13-22-00.txt
    ├── example2018-10-11-18-44-06.txt
    └── example2018-11-19-10-25-19.txt
├── get_distance.py
├── get_grid.py
├── log.log
├── main.py
├── qqlist.py
├── read_data_to_txt.py
├── run.bat
├── settings.py
└── transCoordinateSystem.py


/README.md:
--------------------------------------------------------------------------------
#### Preface

Data can no longer be scraped at present!!!

An update to the login policy of Yichuxing (宜出行) made it impossible to scrape crowd-flow data by logging in with QQ directly. The code has recently been upgraded to solve this problem, and data can be scraped successfully again, as shown below. Sample data:
![Sample data](https://img-blog.csdnimg.cn/2019091609403833.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3FxXzM0NDY0OTI2,size_16,color_FFFFFF,t_70)

Visualization:

![Visualization](https://img-blog.csdnimg.cn/20190916094104405.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3FxXzM0NDY0OTI2,size_16,color_FFFFFF,t_70)
A free online tool for scraping Tencent Location Big Data is also available, but that data is less precise. If you need high precision, scraping Yichuxing is recommended.

Tool link: [Tencent crowd-flow scraping tool](http://www.mapboxx.cn/tool/tencet/)

Visualization of Tencent Location Big Data:
![Tencent Location Big Data visualization](https://img-blog.csdnimg.cn/20190828153306464.jpg?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3FxXzM0NDY0OTI2,size_16,color_FFFFFF,t_70)

You are welcome to join the map data scraping discussion group: 626697156. Problems can be reported to the group owner.

#### About the tool:
This tool scrapes the data interface provided by Yichuxing inside WeChat and can collect crowd-flow data for a given area at the current point in time. Address: https://liujiao111.github.io/

#### Environment:
- Windows
- Python 3+
- Third-party packages: install whatever is missing (the scripts import, for example, `requests`, `pandas`, `selenium` and `xlwt`)

#### Usage guide:
- Register several QQ accounts and put them in `qqlist.py` in the current directory. Each entry is a `[QQ number, password]` pair, in the following format:
```
qq_list = [["11111111", "11111111"],
           ["11111111", "11111111"],
           ["11111111", "11111111"],
           ["11111111", "11111111"],
           ["11111111", "11111111"],
           ["11111111", "11111111"],
           ["11111111", "11111111"],
           ["11111111", "11111111"]]
```
Register as many QQ accounts as the size of the target city requires (I tried scraping Laibin City with 6 accounts without any problem), because each QQ account can only scrape a limited amount of data. Put every account in this file, following the existing format.
- Determine the rectangular area of the city to scrape. Use the coordinate picker provided by Baidu Maps to find the coordinates of the top-left, top-right, bottom-right and bottom-left corners of the city (they only need to form a rough rectangle; precision is not important), and enter the four points in `settings.py` as the four variables below. Example:
```
city_bound_point_A = [114.286652,30.642638] # top-left corner, x619
city_bound_point_D = [114.239273,30.580588] # bottom-left corner, 农场十一队
city_bound_point_B = [114.462433,30.574677] # top-right corner, 木妙
city_bound_point_C = [114.418488,30.479746] # bottom-right corner, 牛场右下角
```
- Double-click `run.bat` to start the scraper. Note that the program opens Google Chrome automatically and logs in; this is normal. Chrome must be installed beforehand.
- After a while (it varies, roughly 10 minutes), the results are stored in the `example` folder. The txt files there hold the scraped crowd-flow data and are named after the time of the scrape. The data has four fields, `count`, `wgs_lng`, `wgs_lat` and `time`: crowd count, longitude, latitude and time point. The file can later be parsed to extract the data you need and to visualize it.

- Code is provided that parses a crowd-flow txt file and generates the data needed for a Baidu Maps heat map: see `read_data_to_txt.py` in this directory. Change `file_path` in the code to the path of the crowd-flow txt file generated in the `example` folder above, save, and run `data2txt.bat`. This generates two crowd-flow data files in the current directory: `data_flow.xls` and `data_flow.txt`.
- Copy the content of the txt file generated above into `baidu_relitu.html` (the file is fairly large, please be patient): replace what follows `var points = ` on line 31 of the HTML file with the text from `data_flow.txt`, and replace the center coordinates on line 27 with the coordinates of any point from line 31. Save the HTML file and open it in a browser to see the heat map on the map.

--------------------------------------------------------------------------------
/__pycache__/get_distance.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liujiao111/map-easygo/2b90f76f11b61349d2e2829efb56f0c2e5703b74/__pycache__/get_distance.cpython-36.pyc
--------------------------------------------------------------------------------
/__pycache__/qqlist.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liujiao111/map-easygo/2b90f76f11b61349d2e2829efb56f0c2e5703b74/__pycache__/qqlist.cpython-36.pyc
--------------------------------------------------------------------------------
/__pycache__/settings.cpython-36.pyc:
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/liujiao111/map-easygo/2b90f76f11b61349d2e2829efb56f0c2e5703b74/__pycache__/settings.cpython-36.pyc -------------------------------------------------------------------------------- /__pycache__/transCoordinateSystem.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liujiao111/map-easygo/2b90f76f11b61349d2e2829efb56f0c2e5703b74/__pycache__/transCoordinateSystem.cpython-36.pyc -------------------------------------------------------------------------------- /chromedriver.exe: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liujiao111/map-easygo/2b90f76f11b61349d2e2829efb56f0c2e5703b74/chromedriver.exe -------------------------------------------------------------------------------- /data.txt: -------------------------------------------------------------------------------- 1 | OBJECTID,x,y 2 | 0,114.26942623315855,30.63226138309839 3 | 1,114.27865432656893,30.6286287161043 4 | 2,114.28789448746554,30.62494023540019 5 | 3,114.29715277615125,30.621175233005072 6 | 4,114.30643069884272,30.61733030397934 7 | 5,114.31572496555344,30.613420235421266 8 | 6,114.32502836212625,30.609474699161108 9 | 7,114.33433153100722,30.60553150825684 10 | 8,114.34362524689199,30.60162799282649 11 | 9,114.35290266021845,30.597792486470645 12 | 10,114.3621609891741,30.594037889324714 13 | 11,114.37140226753486,30.59035879335141 14 | 12,114.38063297241659,30.58673283121706 15 | 13,114.38986261357938,30.583125933081607 16 | 14,114.39910160545467,30.579500272196338 17 | 15,114.40835890952748,30.575823059372546 18 | 16,114.41763998946705,30.572074151523626 19 | 17,114.4269455512678,30.568250715246975 20 | 18,114.43627136116008,30.564367868563846 21 | 19,114.44560918607687,30.560455154370043 22 | 20,114.26419058978227,30.625281027986087 23 | 
21,114.27343769819608,30.621474358136727 24 | 22,114.28269185676669,30.61762935838196 25 | 23,114.29196114914146,30.61371728364662 26 | 24,114.30124969427061,30.609724510684966 27 | 25,114.31055677800482,30.605655792539483 28 | 26,114.31987711630347,30.601533259372268 29 | 27,114.32920218487422,30.597391392835924 30 | 28,114.33852230393886,30.59326913678466 31 | 29,114.34782899299422,30.589200973036345 32 | 30,114.35711704852268,30.585209031879934 33 | 31,114.36638586153947,30.58129806678315 34 | 32,114.3756396669607,30.57745445715887 35 | 33,114.384886663037,30.57364946708039 36 | 34,114.39413720018716,30.56984599934493 37 | 35,114.40340145445602,30.566007273109705 38 | 36,114.41268712150988,30.562105407811547 39 | 37,114.42199766420748,30.558127919370353 40 | 38,114.4313315203655,30.55408062090202 41 | 39,114.44068245604548,30.549986255964868 42 | 40,114.25895597322919,30.618300803487735 43 | 41,114.26822531832637,30.614308120177473 44 | 42,114.2774958421729,30.61029742536859 45 | 43,114.28677709160701,30.606234019536817 46 | 44,114.29607555628806,30.602094977938474 47 | 45,114.3053932247838,30.59787457549871 48 | 46,114.31472720533232,30.593585718054758 49 | 47,114.32407049555233,30.589257041367382 50 | 48,114.33341371850241,30.584926349794667 51 | 49,114.34274741861493,30.58063192091462 52 | 50,114.35206438158323,30.576403700876362 53 | 51,114.36136143678193,30.572256441096346 54 | 52,114.37064032035208,30.568186373910333 55 | 53,114.37990739423917,30.564172198854195 56 | 54,114.38917228141486,30.56018014549262 57 | 55,114.3984457292604,30.556171928134315 58 | 56,114.40773719334987,30.552113734199807 59 | 57,114.4170527005166,30.54798414834937 60 | 58,114.426393487061,30.543779164217273 61 | 59,114.43575572925646,30.539513116010493 62 | 60,114.25371999080413,30.611327568598718 63 | 61,114.2630146714876,30.607136423230408 64 | 62,114.27230421831752,30.602948898779974 65 | 63,114.2815989681222,30.598726896704232 66 | 64,114.29090736574085,30.594439813215573 67 | 
65,114.30023403911693,30.590071772427475 68 | 66,114.30957879031271,30.58562543158839 69 | 67,114.31893673459012,30.58112146273595 70 | 68,114.32829955118943,30.576593827269384 71 | 69,114.33765755020531,30.572081945656898 72 | 70,114.34700207003895,30.567621592454625 73 | 71,114.3563276429908,30.563236645943878 74 | 72,114.36563342018016,30.558933621561614 75 | 73,114.37492351869561,30.554700265815995 76 | 74,114.38420620415293,30.550508535771918 77 | 75,114.39349209294747,30.54632126184159 78 | 76,114.4027917876175,30.542100929454595 79 | 77,114.41211349241576,30.537818520119064 80 | 78,114.42146116279929,30.533460340057886 81 | 79,114.43083361989405,30.529031237425045 82 | 80,114.24848017621045,30.604367558174022 83 | 81,114.25780294769426,30.599966007933595 84 | 82,114.26711429613302,30.595589398350356 85 | 83,114.27642461262408,30.59119907387378 86 | 84,114.28574373733923,30.586758967306363 87 | 85,114.29507868555581,30.582244135569027 88 | 86,114.30443206785478,30.577646704245087 89 | 87,114.3138015719004,30.572977818875206 90 | 88,114.32318062301177,30.568265139095512 91 | 89,114.33256006399537,30.563546459903808 92 | 90,114.34193045602302,30.558860955801816 93 | 91,114.35128445707679,30.554240103369597 94 | 92,114.36061871610495,30.549700413787246 95 | 93,114.3699348336355,30.545239680862014 96 | 94,114.37923915710593,30.54083762266784 97 | 95,114.38854145127576,30.536460761072423 98 | 96,114.3978527476554,30.532070386829812 99 | 97,114.40718287039486,30.52763173311865 100 | 98,114.4165382139557,30.52312219771092 101 | 99,114.42592029130874,30.51853667943309 102 | 100,114.24323421152309,30.59742592063169 103 | 101,114.25258730324909,30.59280344070319 104 | 102,114.26192318889092,30.588225297724485 105 | 103,114.2712515699852,30.58365519728103 106 | 104,114.28058301629544,30.579054258911075 107 | 105,114.28992650294718,30.574390240363588 108 | 106,114.29928736159246,30.569645222300032 109 | 107,114.3086661252687,30.564819935247233 110 | 
108,114.31805853404538,30.559933701968795 111 | 109,114.32745669410025,30.555020003289094 112 | 110,114.33685111110412,30.550118709296502 113 | 111,114.3462331132596,30.54526680614884 114 | 112,114.3555970877762,30.540489804627448 115 | 113,114.36494199806286,30.53579585565373 116 | 114,114.37427181726413,30.531173958204686 117 | 115,114.3835947678547,30.526596677946028 118 | 116,114.39292153773322,30.522026729389744 119 | 117,114.40226288503236,30.51742586248569 120 | 118,114.41162718979412,30.512763951656723 121 | 119,114.42101852596055,30.508026139296735 122 | 120,114.23798014684108,30.590506340739044 123 | 121,114.24736513042276,30.58565463529439 124 | 122,114.25672809436851,30.58086326596601 125 | 123,114.2660773526323,30.576101074517638 126 | 124,114.2754234776988,30.571329258264992 127 | 125,114.2847768258863,30.566510604105567 128 | 126,114.2941451576859,30.561618376157387 129 | 127,114.3035319130092,30.556642746694685 130 | 128,114.31293554348797,30.55159324052544 131 | 129,114.32235005215246,30.54649660715533 132 | 130,114.33176660463299,30.541390619243597 133 | 131,114.3411758229167,30.536315262595544 134 | 132,114.35057021189515,30.53130340241457 135 | 133,114.35994613783922,30.526373135256033 136 | 134,114.36930488358954,30.521523636251942 137 | 135,114.37865252293219,30.51673548174892 138 | 136,114.38799863551922,30.511975365743876 139 | 137,114.39735415849944,30.507204086716452 140 | 138,114.40672887704636,30.502385906928648 141 | 139,114.41612914426577,30.4974970625239 142 | 140,114.23271659516251,30.583610774381636 143 | 141,114.2421343077957,30.57852442351316 144 | 142,114.25152656197389,30.573509785409023 145 | 143,114.26089968118588,30.568543258330674 146 | 144,114.27026350073812,30.56358903485989 147 | 145,114.27962906255519,30.558607652102296 148 | 146,114.28900607029792,30.553565444286477 149 | 147,114.29840070590726,30.548442644899996 150 | 148,114.30781431950412,30.543238204563853 151 | 149,114.31724329477673,30.53797017120277 152 | 
150,114.32668011175416,30.5326715352149 153 | 151,114.33611534473306,30.527382522716593 154 | 152,114.34554011209816,30.52214116679943 155 | 153,114.35494838863018,30.516974397766 156 | 154,114.3643386242328,30.51189177012939 157 | 155,114.37371427783984,30.506883316048327 158 | 156,114.3830831328685,30.50192203179042 159 | 157,114.39245555048899,30.496970400443654 160 | 158,114.40184207082021,30.49198939359158 161 | 159,114.41125092968859,30.486947805815 162 | 160,114.22744288159973,30.576739313652016 163 | 161,114.23689340481016,30.571416205180462 164 | 162,114.24631671612048,30.56617067936642 165 | 163,114.2557166821247,30.560988568672208 166 | 164,114.26510169659564,30.55583978555366 167 | 165,114.27448271738871,30.550685532281282 168 | 166,114.28387073830498,30.54548762542595 169 | 167,114.29327429500431,30.54021770420819 170 | 168,114.30269759247217,30.534864116968755 171 | 169,114.31213969068756,30.529434833087233 172 | 170,114.32159493149406,30.523955676987008 173 | 171,114.33105449390787,30.518464299567153 174 | 172,114.34050869745977,30.513001322709357 175 | 173,114.34994949679269,30.507600769819938 176 | 174,114.35937256688625,30.50228206705558 177 | 175,114.36877847695642,30.497045524554025 178 | 176,114.37817266886607,30.49187237455681 179 | 177,114.38756424142997,30.486729353346973 180 | 178,114.39696382783957,30.481576727614694 181 | 179,114.4063810716959,30.476377843232388 182 | 180,114.22215912999471,30.569890190864914 183 | 181,114.23164181887678,30.564331701844488 184 | 182,114.24109741301237,30.558850683139568 185 | 183,114.25052702290617,30.55344358695479 186 | 184,114.25993697844815,30.548088369747976 187 | 185,114.2693373593867,30.54274979241147 188 | 186,114.27873966991625,30.537387919214858 189 | 187,114.28815420993794,30.531967774473685 190 | 188,114.29758776506786,30.52646783527694 191 | 189,114.30704215867195,30.520885315721564 192 | 190,114.31651400154306,30.515236967020464 193 | 191,114.32599568835678,30.50955519427735 194 | 
192,114.3354773932542,30.50388041612677 195 | 193,114.34494958029822,30.498251497098845 196 | 194,114.35440542390796,30.492696546769746 197 | 195,114.36384255782177,30.4872262924551 198 | 196,114.3732637328084,30.481831614231282 199 | 197,114.38267622489187,30.47648583118705 200 | 198,114.392090135326,30.471151186703263 201 | 199,114.40151598978997,30.46578797269528 202 | 200,114.2746650197356,30.63923489943446 203 | 201,114.28387719925516,30.635765566856623 204 | 202,114.29310527586765,30.632227072237804 205 | 203,114.30235288183155,30.628608206011258 206 | 204,114.311618859565,30.624915934064767 207 | 205,114.32089764984939,30.62117390425684 208 | 206,114.33018069595221,30.617417119283434 209 | 207,114.33945853706953,30.613683998207488 210 | 208,114.34872310729882,30.610007651144407 211 | 209,114.35796970980984,30.606408373381722 212 | 210,114.36719821033238,30.60288908445785 213 | 211,114.3764131725889,30.59943475919489 214 | 212,114.38562290043453,30.596015978031208 215 | 213,114.3948376022714,30.592595775453198 216 | 214,114.40406709556461,30.589138206543545 217 | 215,114.41331857650589,30.585616657305845 218 | 216,114.42259496698962,30.582019984972785 219 | 217,114.43189422040537,30.578355075567313 220 | 218,114.44120974872546,30.57464523176044 221 | 219,114.45053187595089,30.570924764175963 222 | 220,114.22744288159973,30.576739313652016 223 | 221,114.23689340481016,30.571416205180462 224 | 222,114.24631671612048,30.56617067936642 225 | 223,114.2557166821247,30.560988568672208 226 | 224,114.26510169659564,30.55583978555366 227 | 225,114.27448271738871,30.550685532281282 228 | 226,114.28387073830498,30.54548762542595 229 | 227,114.29327429500431,30.54021770420819 230 | 228,114.30269759247217,30.534864116968755 231 | 229,114.31213969068756,30.529434833087233 232 | 230,114.32159493149406,30.523955676987008 233 | 231,114.33105449390787,30.518464299567153 234 | 232,114.34050869745977,30.513001322709357 235 | 233,114.34994949679269,30.507600769819938 236 | 
234,114.35937256688625,30.50228206705558
235,114.36877847695642,30.497045524554025
236,114.37817266886607,30.49187237455681
237,114.38756424142997,30.486729353346973
238,114.39696382783957,30.481576727614694
239,114.4063810716959,30.476377843232388
--------------------------------------------------------------------------------
/data2txt.bat:
--------------------------------------------------------------------------------
python read_data_to_txt.py
pause
--------------------------------------------------------------------------------
/debug.log:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liujiao111/map-easygo/2b90f76f11b61349d2e2829efb56f0c2e5703b74/debug.log
--------------------------------------------------------------------------------
/get_distance.py:
--------------------------------------------------------------------------------
from math import asin, cos, fabs, radians, sin, sqrt

EARTH_RADIUS = 6371  # mean Earth radius in km


def hav(theta):
    """Haversine function: sin^2(theta / 2)."""
    s = sin(theta / 2)
    return s * s


def get_distance(lat0, lng0, lat1, lng1):
    """Distance in km between two points on a sphere, using the haversine formula."""
    # convert latitude and longitude from degrees to radians
    lat0 = radians(lat0)
    lat1 = radians(lat1)
    lng0 = radians(lng0)
    lng1 = radians(lng1)

    dlng = fabs(lng0 - lng1)
    dlat = fabs(lat0 - lat1)
    h = hav(dlat) + cos(lat0) * cos(lat1) * hav(dlng)
    distance = 2 * EARTH_RADIUS * asin(sqrt(h))

    return distance

--------------------------------------------------------------------------------
/get_grid.py:
--------------------------------------------------------------------------------
from get_distance import get_distance
import transCoordinateSystem
import settings

# TODO
require_distance = 1  # required grid spacing in km (Baidu Maps coordinates)

print(settings.city_bound_point_A)

# Wuhan
# The four points are the corners of the rectangle
# point_A = [114.286652,30.642638]  # top-left corner, x619
# point_D = [114.239273,30.580588]  # bottom-left corner, 农场十一队
# point_B = [114.462433,30.574677]  # top-right corner, 木妙
# point_C = [114.418488,30.479746]  # bottom-right corner, 牛场右下角


point_A = settings.city_bound_point_A  # top-left corner
point_D = settings.city_bound_point_D  # bottom-left corner
point_B = settings.city_bound_point_B  # top-right corner
point_C = settings.city_bound_point_C  # bottom-right corner

# distance between A and B
distance_AB = get_distance(point_A[1], point_A[0], point_B[1], point_B[0])
print(distance_AB)  # 13.381562710218205

# distance between A and D
distance_AD = get_distance(point_A[1], point_A[0], point_D[1], point_D[0])
print(distance_AD)  # 7.735240077970707

# number of interior division points on edge AB: ceil(distance / spacing) - 1
num_split_AB = int(distance_AB / require_distance if distance_AB % require_distance == 0 else distance_AB // require_distance + 1) - 1
print('Number of division points on edge AB:', str(num_split_AB))

# number of interior division points on edge AD
num_split_AD = int(distance_AD / require_distance if distance_AD % require_distance == 0 else distance_AD // require_distance + 1) - 1
print('Number of division points on edge AD:', str(num_split_AD))


# coordinates of the points along segment AB (A, the division points, then B)
points_coordinate_AB = []
for i in range(num_split_AB + 1):
    x = point_A[0] + (point_B[0] - point_A[0]) / (num_split_AB + 1) * i
    y = point_A[1] + (point_B[1] - point_A[1]) / (num_split_AB + 1) * i
    points_coordinate_AB.append([x, y])
points_coordinate_AB.append(point_B)  # add point B itself
print('Points along segment AB:', points_coordinate_AB)

# coordinates of the points along segment DC (D, the division points, then C)
points_coordinate_DC = []
for i in range(num_split_AB + 1):
    x = point_D[0] + (point_C[0] - point_D[0]) / (num_split_AB + 1) * i
    y = point_D[1] + (point_C[1] - point_D[1]) / (num_split_AB + 1) * i
    points_coordinate_DC.append([x, y])
points_coordinate_DC.append(point_C)  # add point C itself
print('Points along segment DC:', points_coordinate_DC)


# Coordinates of the grid intersections between AB and DC.
# Note: because of (i + 1), the rows run from the first row below AB to one row
# beyond DC, so the DC row is produced here and appended once more below.
points_coordinate_result = []
for i in range(num_split_AD + 2):
    for j in range(num_split_AB + 2):
        x = points_coordinate_AB[j][0] + (points_coordinate_DC[j][0] - points_coordinate_AB[j][0]) / (num_split_AD + 1) * (i + 1)
        y = points_coordinate_AB[j][1] + (points_coordinate_DC[j][1] - points_coordinate_AB[j][1]) / (num_split_AD + 1) * (i + 1)
        points_coordinate_result.append([x, y])

points_coordinate_result.extend(points_coordinate_AB)
points_coordinate_result.extend(points_coordinate_DC)

print('======================================')
print(points_coordinate_result)

'''
latA = 116.371067
lonA = 39.959193
latB = 116.483032
lonB = 39.973018

d2 = get_distance(lonA, latA, lonB, latB)
print(d2) #9.664978035037414
'''


def create_yichuxing_data(points_coordinate_result):
    '''Write the grid points in the format expected by the Yichuxing scraper.'''
    text = 'OBJECTID,x,y\n'
    for i, point in enumerate(points_coordinate_result):
        # the grid was built in Baidu (BD-09) coordinates; the scraper expects WGS-84
        lng, lat = transCoordinateSystem.bd09_to_wgs84(point[0], point[1])
        text += str(i) + ',' + str(lng) + ',' + str(lat) + '\n'

    with open('data.txt', 'w', encoding='utf-8') as f:
        f.write(text.strip())
        print('Written successfully')


create_yichuxing_data(points_coordinate_result)

--------------------------------------------------------------------------------
/log.log:
--------------------------------------------------------------------------------
2018-09-05 17:31:06 round started
2018-09-05 17:32:54 round started
2018-09-05 17:38:46 round started
2018-09-05 17:39:44 round started
2018-09-05 17:40:28 round started
2018-09-06 09:19:24 round started
2018-09-06 09:26:39 round finished, starting deduplication
2018-09-06 09:26:39 deduplication finished, waiting for next round
2018-09-06 11:12:44 round started
2018-09-06 11:18:10 round started
2018-09-06 11:31:22 round started
2018-09-06 13:58:27 round started
2018-09-07 09:10:07 round started
2018-09-07 09:14:41 round finished, starting deduplication
2018-09-07 09:14:41 deduplication finished, waiting for next round
2018-09-29 11:14:59 round started
2018-09-30 19:41:18 round started
2018-09-30 19:42:40 round finished, starting deduplication
2018-09-30 19:42:40 deduplication finished, waiting for next round
2018-10-06 12:59:23 round started
2018-10-06 13:01:05 round started
2018-10-06 13:04:27 round started
2018-10-06 13:22:00 round started
2018-10-06 19:35:46 round started
2018-10-11 18:44:06 round started
2018-11-19 10:25:19 round started
2018-11-19 17:45:38 round started
2018-11-19 17:45:53 round started
2018-11-19 17:51:56 round started
2018-11-19 17:52:59 round started

--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
#!/usr/local/bin/python3
# coding: utf-8
# __author__ = "lsg"
# __date__ = 2017/10/16 16:11

# standard library
import json
import sys
import time

# third-party packages
import pandas
import requests
from requests.exceptions import RequestException
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

# project modules
import qqlist
import settings
import transCoordinateSystem


class CookieException(Exception):
    """Raised when a response suggests the current cookie has to be replaced."""


class easygospider():
    # initialise the basic settings
    def __init__(self):
        self.cookie = None
        # name of the input file
        self.input = settings.xy_name
        # where the output files are saved
        self.filepath = settings.filepath
        self.filename = settings.filename
        # work on a copy so that pop() does not empty qqlist.qq_list itself
        self.qq_number_list = list(qqlist.qq_list)

    # build the grid used for scraping and turn it into request parameters
    def initial_paramslist(self):
        """
        :return: List[]
        """
        # The input file must hold the WGS-84 coordinates of the centre points of a
        # 2.5 km fishnet grid, in the format of data.txt.
        # read the fishnet centre points
        center = []
        with open(self.input, 'r', encoding='utf-8') as f:
            for item in f.readlines()[1:]:
                center.append(tuple(item.strip().split(",")[-2:]))
        # Bounding box of each cell (built with a 2.6 km edge so that no point is
        # missed; overlaps are handled by the final deduplication).
        bounds = []
        for item in center:
            lng, lat = item
            lng, lat = float(lng), float(lat)
            bounds.append([lng - 0.5 * settings.lng_delta,
                           lng + 0.5 * settings.lng_delta,
                           lat - 0.5 * settings.lat_delta,
                           lat + 0.5 * settings.lat_delta])
        # build the params to be requested
        paramslist = []
        for item in bounds:
            lng_min, lng_max, lat_min, lat_max = item
            # lng_min, lat_min = transCoordinateSystem.wgs84_to_gcj02(lng_min, lat_min)
            # lng_max, lat_max = transCoordinateSystem.wgs84_to_gcj02(lng_max, lat_max)
            params = {"lng_min": lng_min,
                      "lat_max": lat_max,
                      "lng_max": lng_max,
                      "lat_min": lat_min,
                      "level": 16,
                      "city": "厦门",
                      "lat": "undefined",
                      "lng": "undefined",
                      "_token": ""}
            paramslist.append(params)
        return paramslist

    def get_cookie(self):
        while True:
            try:
                # Selenium 3 style call; newer Selenium releases expect a Service object
                chrome_login = webdriver.Chrome(executable_path="chromedriver.exe")
                chrome_login.implicitly_wait(10)
                chrome_login.get(
                    "http://c.easygo.qq.com/eg_toc/map.html?origin=csfw&cityid=110000")
                try:
                    qq_ = self.qq_number_list.pop()
                except IndexError:
                    # all accounts used: start again from a fresh copy of the list
                    self.qq_number_list = list(qqlist.qq_list)
                    qq_ = self.qq_number_list.pop()
                qq_num = qq_[0]
                qq_passwd = qq_[1]
                chrome_login.find_element(By.ID, "u").send_keys(qq_num)
                chrome_login.find_element(By.ID, "p").send_keys(qq_passwd)
                chrome_login.find_element(By.ID, "go").click()
                # check whether a captcha ("安全验证", security verification) is shown
                time.sleep(5)
                if "安全验证" in chrome_login.page_source:
                    if settings.CAPTCHA_RECOGNIZ:
                        input('Solve the captcha manually, then press Enter')
                    else:
                        chrome_login.close()
                        continue

                # collect the cookies
                cookies = chrome_login.get_cookies()
                chrome_login.quit()
                user_cookie = {}
                for cookie in cookies:
                    user_cookie[cookie["name"]] = cookie["value"]
                return user_cookie
            except WebDriverException:
                pass
            finally:
                try:
                    chrome_login.close()
                except Exception:
                    pass

    def spyder(self, cookie, params):
        user_header = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36",
            "Referer": "http://c.easygo.qq.com/eg_toc/map.html?origin=csfw"
        }
        url = "http://c.easygo.qq.com/api/egc/heatmapdata"
        while True:
            try:
                r = requests.get(url, headers=user_header,
                                 cookies=cookie, params=params)
                # print(r.status_code)
                if r.status_code == 200:
                    print('Request succeeded, response text:', r.text)
                    return r.text
                else:
                    raise CookieException
            except RequestException:
                # network error: retry the same request
                continue

    def save(self, text, time_now, file_name):
        # create the file with a header row if it does not exist yet
        try:
            with open(file_name, 'r') as f:
                f.readline()
        except FileNotFoundError:
            with open(file_name, 'w', encoding='utf-8') as f:
                f.write('count,wgs_lng,wgs_lat,time\n')
        # write the data
        with open(file_name, "a", encoding="utf-8") as f:
            if text is None:
                return
            payload = json.loads(text)
            node_list = payload["data"]
            try:
                min_count = payload['max_data'] / 40
                for i in node_list:
                    # This algorithm comes from the JS behind the Yichuxing web page:
                    # http://c.easygo.qq.com/eg_toc/js/map-55f0ea7694.bundle.js
                    i['count'] = i['count'] // min_count
                    gcj_lng = 1e-6 * (250.0 * i['grid_x'] + 125.0)
                    gcj_lat = 1e-6 * (250.0 * i['grid_y'] + 125.0)
                    lng, lat = transCoordinateSystem.gcj02_to_wgs84(gcj_lng, gcj_lat)
                    f.write(str(i['count']) + "," + str(lng) + "," + str(lat) + "," + time_now + "\n")
            except IndexError:
                pass
                # print("no points in this area")
            except TypeError:
                print(node_list)
                raise CookieException

    def remove_duplicate(self, filepath):
        df = pandas.read_csv(filepath, sep=',')
        df = df.drop_duplicates()
        # the suffix "去重结果" means "deduplicated result"
        csv_name = filepath.replace(".txt", "去重结果.csv")
        df.to_csv(csv_name, index=False)

    def exec(self):
        while True:
            time_now = time.time()
            time_now_str = time.strftime('%Y-%m-%d-%H-%M-%S', time.localtime(time_now))
            write_log("round started")
            cookie = self.get_cookie()
            i = 1
            params_list = self.initial_paramslist()
            for params in params_list:

                # switch to the next QQ account after settings.fre requests
                if i % settings.fre == 0:
                    cookie = self.get_cookie()
                while True:
                    try:
                        text = self.spyder(cookie, params)
                        self.save(text, time_now_str, file_name=self.filepath + self.filename + time_now_str + ".txt")
                        break
                    except CookieException:
                        cookie = self.get_cookie()

                view_bar(i, len(params_list))
                i += 1
            write_log("round finished, starting deduplication")
            self.remove_duplicate(self.filepath + self.filename + time_now_str + ".txt")
            write_log("deduplication finished, waiting for next round")
            # never pass a negative value to sleep() when a round ran longer than sleeptime
            time.sleep(max(0, settings.sleeptime - int(time.time() - time_now)))


def write_log(content):
    with open("log.log", 'a', encoding='utf-8') as f:
        info = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()) + " " + content + "\n"
        sys.stdout.write(info)
        f.write(info)


def view_bar(num, total):
    rate = float(num) / float(total)
    rate_num = int(rate * 100)
    r = '\r[%s%s]%d%%' % ("=" * rate_num, " " * (100 - rate_num), rate_num)
    sys.stdout.write(r)


if __name__ == "__main__":
    app = easygospider()
    app.exec()
--------------------------------------------------------------------------------
/qqlist.py:
--------------------------------------------------------------------------------
qq_list = [["4546565", "56565"],
           ["56565", "56565"],
           ["787856556", "56565"],
           ["34343434", "56565"],
           ["565656", "56565"],
           ["23125354", "56565"],
           ["22324545", "56565"],
           ["4545453", "56565"]]

--------------------------------------------------------------------------------
/read_data_to_txt.py:
--------------------------------------------------------------------------------
from transCoordinateSystem import wgs84_to_bd09
import xlwt

file_path = 'C:\\Users\\hgvgh\\Desktop\\宜出行热力图抓取\\easygo\\example\\example2018-11-19-10-25-19.txt'


def read_result_to_points(file_path):
    '''
    Convert easygo data into the format used by the Baidu Maps heat map.
    :param file_path: path of the scraped crowd-flow txt file
    :return:
    '''
    points = []
    with open(file_path, encoding='utf-8') as f:
        lines = f.readlines()

    # write the data to an Excel workbook
    book = xlwt.Workbook(encoding='utf-8', style_compression=0)
    sheet = book.add_sheet('0', cell_overwrite_ok=True)
    # first row (column titles)
    sheet.write(0, 0, 'lng')
    sheet.write(0, 1, 'lat')
    sheet.write(0, 2, 'count')

    # skip the header line; data rows start at sheet row 1
    for row, line in enumerate(lines[1:], start=1):
        line_str = line.split(',')
        count = int(float(line_str[0]))
        lng = float(line_str[1])
        lat = float(line_str[2])
        lng, lat = wgs84_to_bd09(lng, lat)

        points.append({'lng': lng, 'lat': lat, 'count': count})

        sheet.write(row, 0, lng)
        sheet.write(row, 1, lat)
        sheet.write(row, 2, count)

    book.save('data_flow.xls')

    with open('data_flow.txt', 'w') as f:
        f.write('[')
        print('Total number of heat points:', len(points))
        for point in points:
            f.write(str(point) + ',')
        f.write(']')


read_result_to_points(file_path)

--------------------------------------------------------------------------------
/run.bat:
--------------------------------------------------------------------------------
python get_grid.py
python ./main.py
pause
--------------------------------------------------------------------------------
/settings.py:
--------------------------------------------------------------------------------
#!/usr/local/bin/python3
# coding: utf-8
# __author__ = "Brady Hu"
# __date__ = 2017/10/16 16:11

import os

# scraper settings

# file holding the fishnet point coordinates
xy_name = "data.txt"

# corner points of the area to scrape
city_bound_point_A = [114.286652,30.642638]  # top-left corner, x619
city_bound_point_D = [114.239273,30.580588]  # bottom-left corner, 农场十一队
city_bound_point_B = [114.462433,30.574677]  # top-right corner, 木妙
city_bound_point_C = [114.418488,30.479746]  # bottom-right corner, 牛场右下角


# Output directory. Do not put it on the system drive, or problems will occur.
# Use "." for the current directory, e.g. "./example/"
filepath = r"./example/"
if not os.path.exists(filepath):
    os.makedirs(filepath)
filename = "example"

# interval between scraping rounds
sleeptime = 3600  # in seconds (for example, 7200 seconds is 2 hours)
# number of requests made with each QQ account
fre = 80

# edge length of the square scraped per request (unit: km)
edge = 2.6


# the parameters below do not need to be changed
lng_delta = 0.01167*edge
lat_delta = 0.009*edge
# Whether captchas are solved manually: set to True to solve them by hand,
# or to False to skip the account when a captcha appears.
CAPTCHA_RECOGNIZ = False
--------------------------------------------------------------------------------
/transCoordinateSystem.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python3
# -*- coding:utf-8 -*-

__author = "Brady_Hu"

import math

x_pi = 3.14159265358979324 * 3000.0 / 180.0
pi = 3.1415926535897932384626  # π
a = 6378245.0  # semi-major axis
ee = 0.00669342162296594323  # first eccentricity squared


def gcj02_to_bd09(lng, lat):
    """
    Convert GCJ-02 ("Mars" coordinates) to BD-09 (Baidu coordinates).
    Google / AMap -> Baidu
    :param lng: GCJ-02 longitude
    :param lat: GCJ-02 latitude
    :return: converted coordinates as a list
    """
    z = math.sqrt(lng * lng + lat * lat) + 0.00002 * math.sin(lat * x_pi)
    theta = math.atan2(lat, lng) + 0.000003 * math.cos(lng * x_pi)
    bd_lng = z * math.cos(theta) + 0.0065
    bd_lat = z * math.sin(theta) + 0.006
    return [bd_lng, bd_lat]


def bd09_to_gcj02(bd_lon, bd_lat):
    """
    Convert BD-09 (Baidu coordinates) to GCJ-02 ("Mars" coordinates).
    Baidu -> Google / AMap
    :param bd_lat: BD-09 latitude
    :param bd_lon: BD-09 longitude
    :return: converted coordinates as a list
    """
    x = bd_lon - 0.0065
    y = bd_lat - 0.006
    z = math.sqrt(x * x + y * y) - 0.00002 * math.sin(y * x_pi)
    theta = math.atan2(y, x) - 0.000003 * math.cos(x * x_pi)
    gg_lng = z * math.cos(theta)
    gg_lat = z * math.sin(theta)
    return [gg_lng, gg_lat]


def wgs84_to_gcj02(lng, lat):
    """
    Convert WGS-84 to GCJ-02 ("Mars" coordinates).
    :param lng: WGS-84 longitude
    :param lat: WGS-84 latitude
    :return: converted coordinates as a list
    """
    if out_of_china(lng, lat):  # points outside China are returned unchanged
        return [lng, lat]
    dlat = _transformlat(lng - 105.0, lat - 35.0)
    dlng = _transformlng(lng - 105.0, lat - 35.0)
    radlat = lat / 180.0 * pi
    magic = math.sin(radlat)
    magic = 1 - ee * magic * magic
    sqrtmagic = math.sqrt(magic)
    dlat = (dlat * 180.0) / ((a * (1 - ee)) / (magic * sqrtmagic) * pi)
    dlng = (dlng * 180.0) / (a / sqrtmagic * math.cos(radlat) * pi)
    mglat = lat + dlat
    mglng = lng + dlng
    return [mglng, mglat]


def gcj02_to_wgs84(lng, lat):
    """
    Convert GCJ-02 ("Mars" coordinates) to WGS-84.
    :param lng: GCJ-02 longitude
    :param lat: GCJ-02 latitude
    :return:
    """
    if out_of_china(lng, lat):
return lng, lat 77 | dlat = _transformlat(lng - 105.0, lat - 35.0) 78 | dlng = _transformlng(lng - 105.0, lat - 35.0) 79 | radlat = lat / 180.0 * pi 80 | magic = math.sin(radlat) 81 | magic = 1 - ee * magic * magic 82 | sqrtmagic = math.sqrt(magic) 83 | dlat = (dlat * 180.0) / ((a * (1 - ee)) / (magic * sqrtmagic) * pi) 84 | dlng = (dlng * 180.0) / (a / sqrtmagic * math.cos(radlat) * pi) 85 | mglat = lat + dlat 86 | mglng = lng + dlng 87 | return [lng * 2 - mglng, lat * 2 - mglat] 88 | 89 | 90 | def bd09_to_wgs84(bd_lon, bd_lat): 91 | lon, lat = bd09_to_gcj02(bd_lon, bd_lat) 92 | return gcj02_to_wgs84(lon, lat) 93 | 94 | 95 | def wgs84_to_bd09(lon, lat): 96 | lon, lat = wgs84_to_gcj02(lon, lat) 97 | return gcj02_to_bd09(lon, lat) 98 | 99 | 100 | def _transformlat(lng, lat): 101 | ret = -100.0 + 2.0 * lng + 3.0 * lat + 0.2 * lat * lat + \ 102 | 0.1 * lng * lat + 0.2 * math.sqrt(math.fabs(lng)) 103 | ret += (20.0 * math.sin(6.0 * lng * pi) + 20.0 * 104 | math.sin(2.0 * lng * pi)) * 2.0 / 3.0 105 | ret += (20.0 * math.sin(lat * pi) + 40.0 * 106 | math.sin(lat / 3.0 * pi)) * 2.0 / 3.0 107 | ret += (160.0 * math.sin(lat / 12.0 * pi) + 320 * 108 | math.sin(lat * pi / 30.0)) * 2.0 / 3.0 109 | return ret 110 | 111 | 112 | def _transformlng(lng, lat): 113 | ret = 300.0 + lng + 2.0 * lat + 0.1 * lng * lng + \ 114 | 0.1 * lng * lat + 0.1 * math.sqrt(math.fabs(lng)) 115 | ret += (20.0 * math.sin(6.0 * lng * pi) + 20.0 * 116 | math.sin(2.0 * lng * pi)) * 2.0 / 3.0 117 | ret += (20.0 * math.sin(lng * pi) + 40.0 * 118 | math.sin(lng / 3.0 * pi)) * 2.0 / 3.0 119 | ret += (150.0 * math.sin(lng / 12.0 * pi) + 300.0 * 120 | math.sin(lng / 30.0 * pi)) * 2.0 / 3.0 121 | return ret 122 | 123 | 124 | def out_of_china(lng, lat): 125 | """ 126 | 判断是否在国内,不在国内不做偏移 127 | :param lng: 128 | :param lat: 129 | :return: 130 | """ 131 | return not (lng > 73.66 and lng < 135.05 and lat > 3.86 and lat < 53.55) 132 | 133 | if __name__ == '__main__': 134 | lng = 128.543 135 | lat = 37.065 136 | result1 
= gcj02_to_bd09(lng, lat) 137 | result2 = bd09_to_gcj02(lng, lat) 138 | result3 = wgs84_to_gcj02(lng, lat) 139 | result4 = gcj02_to_wgs84(lng, lat) 140 | result5 = bd09_to_wgs84(lng, lat) 141 | result6 = wgs84_to_bd09(lng, lat) 142 | 143 | print(result1, result2, result3, result4, result5, result6) --------------------------------------------------------------------------------
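The `gcj02_to_bd09` and `bd09_to_gcj02` functions in transCoordinateSystem.py are meant to undo each other, but the inverse is only approximate. The sketch below is one way to check how close the round trip gets. It repeats the arithmetic of those two functions so that it runs on its own; `X_PI` and `round_trip_error` are names introduced here for illustration and are not part of the repository. The sample point is corner A from settings.py, treated as a GCJ-02 coordinate purely as a test value.

```python
import math

# Same constant as x_pi in transCoordinateSystem.py (renamed for this sketch)
X_PI = 3.14159265358979324 * 3000.0 / 180.0


def gcj02_to_bd09(lng, lat):
    """GCJ-02 -> BD-09, same arithmetic as the module's function."""
    z = math.sqrt(lng * lng + lat * lat) + 0.00002 * math.sin(lat * X_PI)
    theta = math.atan2(lat, lng) + 0.000003 * math.cos(lng * X_PI)
    return [z * math.cos(theta) + 0.0065, z * math.sin(theta) + 0.006]


def bd09_to_gcj02(bd_lng, bd_lat):
    """BD-09 -> GCJ-02, same arithmetic as the module's function."""
    x = bd_lng - 0.0065
    y = bd_lat - 0.006
    z = math.sqrt(x * x + y * y) - 0.00002 * math.sin(y * X_PI)
    theta = math.atan2(y, x) - 0.000003 * math.cos(x * X_PI)
    return [z * math.cos(theta), z * math.sin(theta)]


def round_trip_error(lng, lat):
    """Hypothetical helper: largest absolute difference, in degrees,
    after converting GCJ-02 -> BD-09 -> GCJ-02."""
    bd_lng, bd_lat = gcj02_to_bd09(lng, lat)
    back_lng, back_lat = bd09_to_gcj02(bd_lng, bd_lat)
    return max(abs(back_lng - lng), abs(back_lat - lat))


if __name__ == "__main__":
    # Corner point A from settings.py, used only as a sample input
    print(round_trip_error(114.286652, 30.642638))
```

If the printed error is far smaller than the grid step derived from `edge` in settings.py, the approximation should not matter for heatmap display. A check of this kind is mostly useful after editing the constants or porting the functions to another language.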