├── .history
│   ├── README_20200518070417.md
│   ├── README_20200518070450.md
│   ├── README_20200518070457.md
│   ├── README_20200518070459.md
│   ├── README_20200518070503.md
│   ├── README_20200518070507.md
│   ├── README_20200518070509.md
│   ├── README_20200518070510.md
│   ├── README_20200518070515.md
│   ├── README_20200518070517.md
│   ├── README_20200518070520.md
│   ├── README_20200518070522.md
│   ├── README_20200518070523.md
│   ├── README_20200518070949.md
│   ├── README_20200518070953.md
│   ├── README_20200518070955.md
│   ├── README_20200518070957.md
│   ├── README_20200518071001.md
│   ├── README_20200518071003.md
│   ├── README_20200518071009.md
│   ├── README_20200518072121.md
│   ├── README_20200518072125.md
│   ├── README_20200518072126.md
│   ├── README_20200518072849.md
│   ├── README_20200518073517.md
│   ├── README_20200518073538.md
│   ├── README_20200518073539.md
│   ├── README_20200518073947.md
│   ├── README_20200518073949.md
│   ├── README_20200518074657.md
│   ├── README_20200518074700.md
│   ├── README_20200518074701.md
│   ├── README_20200518074705.md
│   └── code
│       ├── readme_20200518070417.txt
│       ├── readme_20200518071146.txt
│       ├── readme_20200518071204.txt
│       └── readme_20200518071207.txt
├── .idea
│   ├── .gitignore
│   ├── BigData.iml
│   ├── inspectionProfiles
│   │   ├── Project_Default.xml
│   │   └── profiles_settings.xml
│   ├── misc.xml
│   ├── modules.xml
│   └── vcs.xml
├── README.md
├── charts
│   ├── README.md
│   ├── data_info.png
│   ├── 全国2018年各气象站平均气温分布图(单位_℃).png
│   ├── 全国2018年各气象站最低气温分布图(单位_℃).png
│   ├── 全国2018年各气象站最高气温分布图(单位_℃).png
│   ├── 全国2018年各气象站降水量分布图(单位_mm).png
│   ├── 全国历年平均降水量变化曲线.png
│   └── 全国历年气温变化曲线.png
├── code
│   ├── readme.txt
│   ├── show.py
│   ├── show_map.py
│   ├── stations_to_mysql.py
│   ├── year_all_stations.py
│   └── year_china.py
├── data
│   ├── china_stations.txt
│   └── readme.txt
└── 答辩.pdf

/.idea/.gitignore:
--------------------------------------------------------------------------------
# Default ignored files
/shelf/
/workspace.xml
# Datasource local storage ignored files
/dataSources/
/dataSources.local.xml
# Editor-based HTTP Client requests
/httpRequests/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
#### "Fundamentals of Big Data Technology" final course project

## Spark-based analysis of China's historical weather data

- **Getting the weather data**

  - Download the weather data from [www.data.gov](http://www.data.gov/).
  - The dataset integrates daily climate observations from roughly 30 different sources and contains measurements from more than 90,000 land-based stations worldwide; only the Chinese stations are extracted from it.
  - ![image-20200817203416072](charts/data_info.png)

- **Processing the data with PySpark** (a minimal sketch of this flow appears after the figures below)

  - **Read all files into a single RDD**: `rdd = sc.wholeTextFiles("file:///" + os.getcwd() + "/china/")`
  - **Clean the data**: strip letter flags, -9999 and other invalid values.
  - **Compute and sort**: work out the maximum, minimum and mean values and sort by year, producing the main yearly weather statistics for every station and for the whole country.
  - **Store the results in a database**: the computed results and the station metadata from stations.txt are written to separate MySQL tables.

- **Data visualization**

  Mainly uses the third-party Python libraries matplotlib and pyecharts.

![](charts/全国历年气温变化曲线.png)

![](charts/全国2018年各气象站降水量分布图(单位_mm).png)

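The block below is a condensed sketch of that flow, not the project's actual code (the full scripts live under code/). It assumes the raw fixed-width station files are in ./china/ and that MySQL runs locally with a weather database; the table name yearly_summary, the host and the password are placeholders invented for this example. For brevity it averages every element, whereas the real scripts take the maximum for TMAX and the minimum for TMIN.

``` python
import os
import re

import pymysql
from pyspark import SparkContext

sc = SparkContext("local[*]", "weather-sketch")


def clean(line):
    # key = station id + year + month + element code (first 21 chars), value = the daily readings
    key, rest = line[:21], line[21:]
    rest = re.sub(r"[A-Za-z]", " ", rest).replace("-9999", " ")  # drop quality flags and missing values
    return key, [int(v) for v in rest.split()]


records = (sc.wholeTextFiles("file:///" + os.getcwd() + "/china/")
             .flatMap(lambda name_text: name_text[1].splitlines())
             .filter(lambda line: len(line) > 21)
             .map(clean)
             .filter(lambda kv: len(kv[1]) > 0))                      # skip months with no valid readings

yearly = (records.mapValues(lambda vals: sum(vals) / len(vals))          # monthly mean
                 .map(lambda kv: (kv[0][:15] + kv[0][17:21], kv[1]))     # drop the month from the key
                 .groupByKey()
                 .mapValues(list)
                 .mapValues(lambda ms: round(sum(ms) / len(ms) / 10.0, 1)))  # yearly mean, tenths -> units

# placeholder credentials and table name -- adjust for your own environment
conn = pymysql.connect(host="localhost", port=3306, user="root",
                       password="your_password", db="weather", charset="utf8")
with conn.cursor() as cur:
    cur.execute("CREATE TABLE IF NOT EXISTS yearly_summary ("
                "id VARCHAR(32) NOT NULL PRIMARY KEY, value FLOAT)")
    for key, value in yearly.collect():
        cur.execute("REPLACE INTO yearly_summary VALUES (%s, %s)", (key, value))
conn.commit()
conn.close()
sc.stop()
```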

Thanks
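Every script builds its database connection with a call shaped like the one below (copied from code/show.py); the host, password and database name are the values to adapt for a local setup — localhost and your_password here are placeholders:

``` python
connect = pymysql.Connect(host='localhost', port=3306, user='root',
                          passwd='your_password', db='weather', charset='utf8')
```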

```
Change the MySQL host to localhost (127.0.0.1) and set the password to your own.
```
--------------------------------------------------------------------------------
/charts/README.md:
--------------------------------------------------------------------------------
## Visualization results
--------------------------------------------------------------------------------
/charts/data_info.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lutgcy/BigData-Spark/0711d1e064040d7209ccacf8cfe83e7c7593e70a/charts/data_info.png
--------------------------------------------------------------------------------
/charts/全国2018年各气象站平均气温分布图(单位_℃).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lutgcy/BigData-Spark/0711d1e064040d7209ccacf8cfe83e7c7593e70a/charts/全国2018年各气象站平均气温分布图(单位_℃).png
--------------------------------------------------------------------------------
/charts/全国2018年各气象站最低气温分布图(单位_℃).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lutgcy/BigData-Spark/0711d1e064040d7209ccacf8cfe83e7c7593e70a/charts/全国2018年各气象站最低气温分布图(单位_℃).png
--------------------------------------------------------------------------------
/charts/全国2018年各气象站最高气温分布图(单位_℃).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lutgcy/BigData-Spark/0711d1e064040d7209ccacf8cfe83e7c7593e70a/charts/全国2018年各气象站最高气温分布图(单位_℃).png
--------------------------------------------------------------------------------
/charts/全国2018年各气象站降水量分布图(单位_mm).png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lutgcy/BigData-Spark/0711d1e064040d7209ccacf8cfe83e7c7593e70a/charts/全国2018年各气象站降水量分布图(单位_mm).png
--------------------------------------------------------------------------------
/charts/全国历年平均降水量变化曲线.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lutgcy/BigData-Spark/0711d1e064040d7209ccacf8cfe83e7c7593e70a/charts/全国历年平均降水量变化曲线.png
--------------------------------------------------------------------------------
/charts/全国历年气温变化曲线.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/lutgcy/BigData-Spark/0711d1e064040d7209ccacf8cfe83e7c7593e70a/charts/全国历年气温变化曲线.png
--------------------------------------------------------------------------------
/code/readme.txt:
--------------------------------------------------------------------------------
PySpark and visualization source code
--------------------------------------------------------------------------------
/code/show.py:
--------------------------------------------------------------------------------
 1 | import pymysql 2 | import matplotlib.pyplot as plt 3 | 4 | import pylab as mpl # import matplotlib as mpl 5 | 6 | # 设置汉字格式 7 | # sans-serif就是无衬线字体,是一种通用字体族。 8 | # 常见的无衬线字体有 Trebuchet MS, Tahoma, Verdana, Arial, Helvetica,SimHei 中文的幼圆、隶书等等 9 | mpl.rcParams['font.sans-serif'] = ['FangSong'] # 指定默认字体 10 | mpl.rcParams['axes.unicode_minus'] = False # 解决保存图像是负号'-'显示为方块的问题 11 | 12 | 13 | def get_connect(): # 获取连接 14 | connect = pymysql.Connect(host='47.98.52.37', port=3306, user='root', passwd='Gcy2018%s', db='weather', charset='utf8') 15 | return connect 16 | 17 | 18 | def execute_sql(sql): # 执行SQL语句并返回执行结果的游标 19 | try: 20 | connect = 
get_connect() 21 | cursor = connect.cursor() 22 | cursor.execute(sql) 23 | connect.commit() 24 | connect.close() 25 | return cursor 26 | except: 27 | connect.rollback() 28 | 29 | 30 | sql = '''SELECT * FROM weather.every_year_china 31 | where id >= 1951 and id <= 2018;''' 32 | cursor = execute_sql(sql) 33 | 34 | TMAX, TMIN, TAVG, PRCP = [], [], [], [] 35 | 36 | for e in cursor.fetchall(): 37 | if -60 <= float(e[1]) <= 60: 38 | TMAX.append(e[1]) 39 | else: 40 | TMAX.append(None) 41 | TMIN.append(e[2]) 42 | TAVG.append(e[3]) 43 | PRCP.append(e[4] * 365) 44 | 45 | index = TMAX.index(None) 46 | TMAX[index] = (TMAX[index - 1] + TMAX[index + 1]) / 2 47 | 48 | print(TMAX) 49 | print(TMIN) 50 | print(TAVG) 51 | print(PRCP) 52 | 53 | 54 | # ——————————————————————————我是一条分割线————————————————————————————— 55 | 56 | plt.plot([i for i in range(1951, 2019)], TMAX, 'r', label='最高气温') 57 | plt.plot([i for i in range(1951, 2019)], TMIN, 'b--', label='最低气温') 58 | plt.plot([i for i in range(1951, 2019)], TAVG, 'g-.', label='平均气温') 59 | 60 | 61 | plt.xticks([i for i in range(1951, 2019)], fontsize=7) 62 | 63 | plt.title('全国气温变化曲线', fontproperties='SimHei', fontsize=15, color='green') 64 | plt.xlabel('年份(1951-2018) / 年', fontproperties='SimHei', fontsize=15, color='green') 65 | plt.ylabel('气温 / ℃', fontproperties='SimHei', fontsize=15, color='green') 66 | 67 | plt.legend(bbox_to_anchor=[1, 1.1]) 68 | plt.grid() 69 | plt.show() 70 | 71 | # ——————————————————————————我是一条分割线————————————————————————————— 72 | 73 | plt.plot([i for i in range(1951, 2019)], PRCP, 'g.-', label='全年降水量') 74 | 75 | plt.xticks([i for i in range(1951, 2019)], fontsize=7) 76 | 77 | plt.title('全国气象站年平均降水量变化曲线', fontproperties='SimHei', fontsize=15, color='red') 78 | plt.xlabel('年份(1951-2018) / 年', fontproperties='SimHei', fontsize=15, color='red') 79 | plt.ylabel('降水量 / mm', fontproperties='SimHei', fontsize=15, color='red') 80 | 81 | plt.legend(bbox_to_anchor=[0.1, 1.1]) 82 | plt.grid() 83 | plt.show() 84 | -------------------------------------------------------------------------------- /code/show_map.py: -------------------------------------------------------------------------------- 1 | import pymysql 2 | from pyecharts import Geo, Style 3 | 4 | import os 5 | print(os.getcwd()) 6 | 7 | 8 | def get_connect(): # 获取连接 9 | # connect = pymysql.Connect(host='47.98.52.37', port=3306, user='root', passwd='Gcy2018%s', db='weather', charset='utf8') 10 | connect = pymysql.Connect(host='localhost', port=3306, user='root', passwd='n3483226', db='weather', charset='utf8') 11 | return connect 12 | 13 | 14 | def execute_sql(sql): # 执行SQL语句并返回执行结果的游标 15 | try: 16 | connect = get_connect() 17 | cursor = connect.cursor() 18 | cursor.execute(sql) 19 | connect.commit() 20 | connect.close() 21 | return cursor 22 | except: 23 | connect.rollback() 24 | 25 | 26 | cursor = execute_sql("SELECT * FROM weather.stations;") 27 | 28 | stations_dict = dict() 29 | for e in cursor.fetchall(): 30 | stations_dict.update({e[0]: tuple([e[1], e[2], e[3]])}) 31 | 32 | sql = "SELECT * FROM all_city;" 33 | cursor = execute_sql(sql) 34 | TAVG, PRCP, TMAX, TMIN = [], [], [], [] # 数据对 35 | x_y = {} # 坐标对 36 | 37 | for key, value in stations_dict.items(): 38 | x_y.update({value[2]: [value[1], value[0]]}) 39 | 40 | for e in cursor.fetchall(): 41 | if e[0][-4:] == "2018": # 选择年份 42 | key = e[0][0:-4] 43 | TAVG.append(tuple([stations_dict[e[0][:-4]][2], e[3]])) 44 | PRCP.append(tuple([stations_dict[e[0][:-4]][2], e[4]])) 45 | TMAX.append(tuple([stations_dict[e[0][:-4]][2], e[1]])) 46 | 
TMIN.append(tuple([stations_dict[e[0][:-4]][2], e[2]])) 47 | 48 | res = [] 49 | for e in PRCP: 50 | if e[1] is not None: 51 | res.append(tuple([e[0], int(e[1] * 365)])) 52 | PRCP = res 53 | 54 | res = [] 55 | for e in TAVG: 56 | if e[1] is not None: 57 | res.append(e) 58 | TAVG = res 59 | 60 | res = [] 61 | for e in TMAX: 62 | if e[1] is not None: 63 | res.append(e) 64 | TMAX = res 65 | 66 | res = [] 67 | for e in TMIN: 68 | if e[1] is not None: 69 | res.append(e) 70 | TMIN = res 71 | 72 | PRCP_city = [i[0] for i in PRCP] 73 | PRCP_value = [i[1] for i in PRCP] 74 | 75 | TAVG_city = [i[0] for i in TAVG] 76 | TAVG_value = [i[1] for i in TAVG] 77 | 78 | TMAX_city = [i[0] for i in TMAX] 79 | TMAX_value = [i[1] for i in TMAX] 80 | 81 | TMIN_city = [i[0] for i in TMIN] 82 | TMIN_value = [i[1] for i in TMIN] 83 | # 84 | # print(SNWD) 85 | # print(max([i[1] for i in SNWD])) 86 | # print(min([i[1] for i in SNWD])) 87 | 88 | # 导入自定义的地点经纬度 89 | # attr:标签名称(在例子里面就是地点) 90 | # value:数值(在例子里就是流动人员) 91 | # visual_range:可视化的数值范围 92 | # symbol_size:散点的大小 93 | # visual_text_color:标签颜色 94 | # is_piecewise :颜色是否分段显示(False为渐变,True为分段) 95 | # is_visualmap:是否映射(数量与颜色深浅是否挂钩) 96 | # maptype :地图类型,可以是中国地图,省地图,市地图等等 97 | # visual_split_number :可视化数值分组 98 | # geo_cities_coords:自定义的经纬度 99 | 100 | style = Style(title_color="#fff", title_pos="center", width=1600, height=900, background_color="#404a59") 101 | 102 | pieces = [ 103 | {'max': 300, 'label': '0-300', 'color': '#FF7F00'}, 104 | {'min': 300, 'max': 600, 'label': '300-600', 'color': '#EEE685'}, 105 | {'min': 600, 'max': 1200, 'label': '600-1200', 'color': '#FFD700'}, 106 | {'min': 1200, 'max': 1600, 'label': '1200-1600', 'color': '#ccffff'}, 107 | {'min': 1600, 'max': 2400, 'label': '1600-2400', 'color': '#0099ff'}, 108 | {'min': 2400, 'max': 3200, 'label': '2400-3200', 'color': '#0066ff'}, 109 | {'min': 3200, 'label': '3200以上', 'color': '#0000ff'} # 有下限无上限 110 | ] 111 | 112 | # ——————————————————————————我是一条分割线————————————————————————————— 113 | geo = Geo('全国2018年各气象站降水量分布图(单位/mm)', **style.init_style) 114 | geo.add("", attr=PRCP_city, value=PRCP_value, symbol_size=10, visual_text_color="#fff", is_piecewise=True, 115 | is_visualmap=True, maptype='china', 116 | pieces=pieces, # 注意,要想pieces生效,必须is_piecewise = True, 117 | geo_cities_coords=x_y) 118 | 119 | geo.render("maps/PRCP.html") 120 | # ——————————————————————————我是一条分割线————————————————————————————— 121 | 122 | g = Geo('全国2018年各气象站平均气温分布图(单位/℃)', **style.init_style) 123 | g.add("", attr=TAVG_city, value=TAVG_value, visual_range=[-4, 32], visual_split_number=9, symbol_size=10, 124 | visual_text_color="#fff", is_piecewise=True, 125 | is_visualmap=True, maptype='china', 126 | geo_cities_coords=x_y) 127 | 128 | g.render("maps/TAVG.html") 129 | # ——————————————————————————我是一条分割线————————————————————————————— 130 | 131 | g = Geo('全国2018年各气象站最高气温分布图(单位/℃)', **style.init_style) 132 | g.add("", attr=TMAX_city, value=TMAX_value, visual_range=[0, 40], visual_split_number=8, symbol_size=10, 133 | visual_text_color="#fff", is_piecewise=True, 134 | is_visualmap=True, maptype='china', 135 | geo_cities_coords=x_y) 136 | 137 | g.render("maps/TMAX.html") 138 | # ——————————————————————————我是一条分割线————————————————————————————— 139 | g = Geo('全国2018年各气象站最低气温分布图(单位/℃)', **style.init_style) 140 | g.add("", attr=TMIN_city, value=TMIN_value, visual_range=[-40, 10], visual_split_number=5, symbol_size=10, 141 | visual_text_color="#fff", is_piecewise=True, 142 | is_visualmap=True, maptype='china', 143 | geo_cities_coords=x_y) 144 | 145 | 
g.render("maps/TMIN.html") 146 | -------------------------------------------------------------------------------- /code/stations_to_mysql.py: -------------------------------------------------------------------------------- 1 | # @data 2019/12/9 16:45 2 | import pymysql 3 | 4 | 5 | def get_connect(): # 获取连接 6 | # connect = pymysql.Connect(host='47.98.52.37', port=3306, user='root', passwd='Gcy2018%s', db='weather', charset='utf8') 7 | connect = pymysql.Connect(host='localhost', port=3306, user='root', passwd='n3483226', db='weather', charset='utf8') 8 | return connect 9 | 10 | 11 | def execute_sql(sql): # 执行SQL语句并返回执行结果的游标 12 | try: 13 | connect = get_connect() 14 | cursor = connect.cursor() 15 | cursor.execute(sql) 16 | connect.commit() 17 | connect.close() 18 | return cursor 19 | except pymysql.MySQLError: 20 | connect.rollback() 21 | 22 | 23 | sql = """CREATE TABLE stations ( 24 | id varchar(255) NOT NULL PRIMARY KEY, 25 | latitude float NOT NULL, 26 | longitude float NOT NULL, 27 | name varchar(255) NOT NULL)""" 28 | 29 | # execute_sql(sql) 30 | 31 | connect = get_connect() 32 | cursor = connect.cursor() 33 | 34 | with open('china_stations.txt') as f: 35 | for line in f: 36 | res = line.split(" ") 37 | res = list(filter(lambda x: False if x == '' else True, res)) 38 | sql = 'INSERT INTO stations VALUES ("%s", %s, %s, "%s")' % tuple([res[0], float(res[1]), float(res[2]), res[4]]) 39 | print(sql) 40 | cursor.execute(sql) 41 | print(list(res)) 42 | 43 | connect.commit() 44 | connect.close() 45 | -------------------------------------------------------------------------------- /code/year_all_stations.py: -------------------------------------------------------------------------------- 1 | # @data 2019/12/9 12:34 2 | 3 | from pyspark import SparkContext 4 | import os 5 | import numpy 6 | import re 7 | import pymysql 8 | 9 | 10 | def get_connect(): # 获取连接 11 | # connect = pymysql.Connect(host='47.98.52.37', port=3306, user='root', passwd='Gcy2018%s', db='weather', charset='utf8') 12 | connect = pymysql.Connect(host='localhost', port=3306, user='root', passwd='n3483226', db='test', charset='utf8') 13 | return connect 14 | 15 | 16 | def execute_sql(sql): # 执行SQL语句并返回执行结果的游标 17 | try: 18 | connect = get_connect() 19 | cursor = connect.cursor() 20 | cursor.execute(sql) 21 | connect.commit() 22 | connect.close() 23 | return cursor 24 | except: 25 | connect.rollback() 26 | 27 | 28 | sc = SparkContext("local[*]", "weather") 29 | 30 | 31 | def data_clean(line): # 去除字母, -9999等无效数据 32 | temp = re.sub(r'[A-Z]|[a-z]', ' ', line[21:]).replace("-9999", " ") 33 | res = list(filter(lambda x: False if x == '' else True, (line[:21] + temp).split(" "))) 34 | if len(res) == 1: # 数据全为-9999 35 | return tuple([res[0], '-9999']) 36 | else: 37 | return tuple([res[0], " ".join(res[1:])]) 38 | 39 | 40 | def calculate(x): # 41 | if x[0][-1] == 'X': 42 | max_tem = max(list(int(n) for n in x[1].split(" "))) 43 | return tuple([x[0][:15] + x[0][17:], str(max_tem)]) 44 | elif x[0][-1] == 'N': 45 | min_tem = min(list(int(n) for n in x[1].split(' '))) 46 | return tuple([x[0][:15] + x[0][17:], str(min_tem)]) 47 | else: 48 | ave = "{:.1f}".format(numpy.mean(list(int(n) for n in x[1].split(" ")))) 49 | return tuple([x[0][:15] + x[0][17:], str(ave)]) 50 | 51 | 52 | rdd = (sc.wholeTextFiles("file:///" + os.getcwd() + "/china/")).map(lambda x: x[1].split('\n'))\ 53 | .flatMap(lambda x: x[:-1]).map(data_clean).map(calculate) 54 | 55 | 56 | def merge_map(tp): # TMAX TMIN TAVG PRCP SNWD 57 | flag = tp[0][-1] 58 | values = 
tp[1].split(",") 59 | values = list(filter(lambda x: False if x == -9999.0 else True, [float(n) for n in values])) 60 | if len(values) == 0: 61 | return tuple([tp[0], 'null']) 62 | if flag == 'X': 63 | return tuple([tp[0], "{:.1f}".format(max(values) / 10)]) 64 | elif flag == 'N': 65 | return tuple([tp[0], "{:.1f}".format(min(values) / 10)]) 66 | else: 67 | return tuple([tp[0], "{:.1f}".format(numpy.mean(values) / 10)]) 68 | 69 | 70 | rdd = rdd.reduceByKey(lambda x, y: x + "," + y).sortByKey(lambda x: x[0]).map(merge_map) 71 | 72 | 73 | sql = """CREATE TABLE all_city ( 74 | id varchar(255) NOT NULL PRIMARY KEY, 75 | TMAX float, 76 | TMIN float, 77 | TAVG float, 78 | PRCP float, 79 | SNWD float 80 | )""" 81 | 82 | execute_sql(sql) 83 | execute_sql("delete from all_city") 84 | db = get_connect() 85 | cursor = db.cursor() 86 | 87 | row_dict = {'id': 'null', 'TMAX': 'null', 'TMIN': 'null', 'TAVG': 'null', 'PRCP': 'null', 'SNWD': 'null'} 88 | 89 | for e in rdd.collect(): # TMAX TMIN TAVG PRCP SNWD 90 | column = e[0][-4:] 91 | primary_key = e[0][:-4] 92 | if row_dict['id'] == primary_key or row_dict['id'] == 'null': 93 | # print(tuple(row_dict.values())) 94 | row_dict['id'] = primary_key 95 | row_dict[column] = float(e[1]) 96 | else: 97 | sql = "INSERT INTO all_city VALUES ('%s', %s, %s, %s, %s, %s)" % tuple(row_dict.values()) 98 | cursor.execute(sql) 99 | print(tuple(row_dict.values())) 100 | row_dict['id'], row_dict['TMAX'], row_dict['TMIN'], row_dict['TAVG'], row_dict['PRCP'], row_dict['SNWD'] =\ 101 | 'null', 'null', 'null', 'null', 'null', 'null' 102 | row_dict[column] = float(e[1]) 103 | db.commit() 104 | db.close() 105 | 106 | sc.stop() 107 | -------------------------------------------------------------------------------- /code/year_china.py: -------------------------------------------------------------------------------- 1 | # @data 2019/12/9 15:46 2 | 3 | from pyspark import SparkContext 4 | import os 5 | import numpy 6 | import re 7 | import pymysql 8 | 9 | 10 | def get_connect(): # 获取连接 11 | # connect = pymysql.Connect(host='47.98.52.37', port=3306, user='root', passwd='Gcy2018%s', db='weather', charset='utf8') 12 | connect = pymysql.Connect(host='localhost', port=3306, user='root', passwd='n3483226', db='weather', charset='utf8') 13 | return connect 14 | 15 | 16 | def execute_sql(sql): # 执行SQL语句并返回执行结果的游标 17 | try: 18 | connect = get_connect() 19 | cursor = connect.cursor() 20 | cursor.execute(sql) 21 | connect.commit() 22 | connect.close() 23 | return cursor 24 | except: 25 | connect.rollback() 26 | 27 | 28 | sc = SparkContext("local[*]", "weather") 29 | 30 | 31 | def data_clean(line): # 去除字母, -9999等无效数据 32 | temp = re.sub(r'[A-Z]|[a-z]', ' ', line[21:]).replace("-9999", " ") 33 | res = list(filter(lambda x: False if x == '' else True, (line[:21] + temp).split(" "))) # 过滤空值 34 | if len(res) == 1: # 数据全为-9999 35 | return tuple([res[0], '-9999']) 36 | else: # 返回为一个(key, value) 元组, 37 | return tuple([res[0], " ".join(res[1:])]) # key为站点编号 + year + month + 气象数据类型, value为每行内容,即一个月每天的数据 38 | 39 | 40 | def calculate(x): # 各类气象数据的计算 41 | if x[0][-1] == 'X': 42 | max_tem = max(list(int(n) for n in x[1].split(" "))) # 最高气温,求最大值 43 | return tuple([x[0][:15] + x[0][17:], str(max_tem)]) 44 | elif x[0][-1] == 'N': 45 | min_tem = min(list(int(n) for n in x[1].split(' '))) # 最低气温,求最小值 46 | return tuple([x[0][:15] + x[0][17:], str(min_tem)]) 47 | else: 48 | ave = "{:.1f}".format(numpy.mean(list(int(n) for n in x[1].split(" ")))) # 其余数据均求平均值 49 | return tuple([x[0][:15] + x[0][17:], str(ave)]) 50 | 
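# Note: the slicing in data_clean()/calculate() above assumes GHCN-Daily style fixed-width
# records (layout inferred from the data files; it is not documented in this repo):
# chars 0-10 hold the station id, 11-14 the year, 15-16 the month and 17-20 the element
# code (TMAX/TMIN/TAVG/PRCP/SNWD), followed by that month's 31 daily values.
# x[0][:15] + x[0][17:] therefore drops the month, so the reduceByKey stage below
# groups all months of one element for a given station and year.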
51 | 52 | # 读取china文件夹下所有文件 将文件内容按行切分 53 | rdd = (sc.wholeTextFiles("file:///" + os.getcwd() + "/china/")).map(lambda x: x[1].split('\n')) \ 54 | .flatMap(lambda x: x[:-1]).map(data_clean).map(calculate) 55 | # 展开,每一行内容为rdd的一个元素 数据清洗 初步计算 56 | 57 | 58 | def merge_map(tp): # TMAX TMIN TAVG PRCP SNWD 59 | flag = tp[0][-1] 60 | values = tp[1].split(",") 61 | values = list(filter(lambda x: False if x == -9999.0 else True, [float(n) for n in values])) # 过滤无效值 62 | if len(values) == 0: 63 | return tuple([tp[0][11:], 'null']) 64 | if flag == 'X': # 将key去掉站点编号, 65 | return tuple([tp[0][11:], "{:.1f}".format(max(values) / 10)]) 66 | elif flag == 'N': 67 | return tuple([tp[0][11:], "{:.1f}".format(min(values) / 10)]) 68 | else: 69 | return tuple([tp[0][11:], "{:.1f}".format(numpy.mean(values) / 10)]) 70 | 71 | 72 | def res(x): # 计算最大值,最小值,平均值,返回元组 73 | if x[0][-1] == 'X': 74 | max_tem = max(list(float(n) for n in x[1].split(" "))) 75 | return tuple([x[0], str(max_tem)]) 76 | elif x[0][-1] == 'N': 77 | min_tem = min(list(float(n) for n in x[1].split(' '))) 78 | return tuple([x[0], str(min_tem)]) 79 | else: 80 | ave = "{:.1f}".format(numpy.mean(list(float(n) for n in x[1].split(" ")))) 81 | return tuple([x[0], str(ave)]) 82 | 83 | 84 | # 按key归并 按key排序(年份) 85 | rdd = rdd.reduceByKey(lambda x, y: x + "," + y).sortByKey(lambda x: x[0]).map(merge_map) 86 | # 按key归并 相同年份相同类型气象数据归为一组 计算出最终结果(year+气象数据类型, value) 按key年份排序 87 | rdd = rdd.reduceByKey(lambda x, y: x + " " + y).map(res).sortByKey(lambda x: x[0]) 88 | 89 | sql = """CREATE TABLE every_year_china ( 90 | id varchar(255) NOT NULL PRIMARY KEY, 91 | TMAX float, 92 | TMIN float, 93 | TAVG float, 94 | PRCP float, 95 | SNWD float 96 | )""" 97 | 98 | execute_sql(sql) 99 | execute_sql("delete from every_year_china") 100 | db = get_connect() 101 | cursor = db.cursor() 102 | row_dict = {'id': 'null', 'TMAX': 'null', 'TMIN': 'null', 'TAVG': 'null', 'PRCP': 'null', 'SNWD': 'null'} 103 | 104 | for e in rdd.collect(): # TMAX TMIN TAVG PRCP SNWD 存入mysql数据库 105 | column = e[0][-4:] 106 | primary_key = e[0][:-4] 107 | if row_dict['id'] == primary_key or row_dict['id'] == 'null': 108 | row_dict['id'] = primary_key 109 | row_dict[column] = float(e[1]) 110 | else: 111 | sql = "INSERT INTO every_year_china VALUES ('%s', %s, %s, %s, %s, %s)" % tuple(row_dict.values()) 112 | cursor.execute(sql) 113 | # print(tuple(row_dict.values())) 114 | row_dict['id'], row_dict['TMAX'], row_dict['TMIN'], row_dict['TAVG'], row_dict['PRCP'], row_dict['SNWD'] = \ 115 | 'null', 'null', 'null', 'null', 'null', 'null' 116 | row_dict[column] = float(e[1]) 117 | db.commit() 118 | db.close() 119 | -------------------------------------------------------------------------------- /data/china_stations.txt: -------------------------------------------------------------------------------- 1 | CHM00045004 22.3330 114.1670 26.0 KING'S PARK 45004 2 | CHM00045005 22.3000 114.1670 62.0 HONG KONG OBSERVATORY 45005 3 | CHM00050353 51.7170 126.6500 179.0 HUMA 50353 4 | CHM00050434 50.4500 121.7000 733.0 TULIHE 50434 5 | CHM00050527 49.2500 119.7000 650.0 HAILAR GSN 50527 6 | CHM00050557 49.1670 125.2330 243.0 NENJIANG 50557 7 | CHM00050564 49.4330 127.3500 235.0 SUNWU 50564 8 | CHM00050632 48.7670 121.9170 739.0 BUGT 50632 9 | CHM00050658 48.0500 125.8830 237.0 KESHAN 50658 10 | CHM00050727 47.1670 119.9330 997.0 ARXAN 50727 11 | CHM00050745 47.3830 123.9170 148.0 QIQIHAR GSN 50745 12 | CHM00050756 47.4330 126.9670 240.0 HAILUN 50756 13 | CHM00050788 47.2330 131.9830 65.0 FUJIN 50788 14 | CHM00050854 46.3830 
125.3170 150.0 ANDA 50854 15 | CHM00050915 45.5170 116.9670 840.0 ULIASTAI 50915 16 | CHM00050949 45.0830 124.8670 136.0 QIAN GORLOS 50949 17 | CHM00050953 45.7500 126.7670 143.0 HARBIN 50953 18 | CHM00050963 45.9670 128.7330 110.0 TONGHE 50963 19 | CHM00050968 45.2170 127.9670 191.0 SHANGZHI 50968 20 | CHM00050978 45.2830 130.9490 234.0 JIXI 50978 21 | CHM00051076 47.7330 88.0830 737.0 ALTAY GSN 51076 22 | CHM00051087 46.9830 89.5170 827.0 FUYUN 51087 23 | CHM00051156 46.7830 85.7170 1294.0 HOBOKSAR 51156 24 | CHM00051243 45.6000 84.8500 428.0 KARAMAY 51243 25 | CHM00051334 44.6170 82.9000 321.0 JINGHE 51334 26 | CHM00051379 44.0170 89.5670 794.0 QITAI 51379 27 | CHM00051431 43.9500 81.3330 664.0 YINING 51431 28 | CHM00051463 43.8000 87.6500 947.0 WU LU MU QI GSN 51463 29 | CHM00051573 42.9330 89.2000 37.0 TURPAN 51573 30 | CHM00051644 41.7170 82.9500 1100.0 KUQA 51644 31 | CHM00051656 41.7500 86.1330 933.0 KORLA 51656 32 | CHM00051709 39.4670 75.9830 1380.0 KASHI GSN 51709 33 | CHM00051716 39.8000 78.5670 1117.0 BACHU 51716 34 | CHM00051747 39.0000 83.6670 1099.0 TAZHONG 51747 35 | CHM00051765 40.6330 87.7000 847.0 TIKANLIK 51765 36 | CHM00051777 39.0330 88.1670 889.0 RUOQIANG GSN 51777 37 | CHM00051811 38.4330 77.2670 1232.0 SHACHE 51811 38 | CHM00051828 37.1330 79.9330 1375.0 HOTAN GSN 51828 39 | CHM00051855 38.1500 85.5500 1248.0 QIEMO/QARQAN 51855 40 | CHM00052203 42.8170 93.5170 739.0 HAMI GSN 52203 41 | CHM00052267 41.9500 101.0670 941.0 EJIN QI 52267 42 | CHM00052323 41.8000 97.0330 1770.0 MAZONG SHAN 52323 43 | CHM00052418 40.1500 94.6830 1140.0 DUNHUANG 52418 44 | CHM00052436 40.2670 97.0330 1527.0 YUMENZHEN 52436 45 | CHM00052495 40.7500 104.5000 1329.0 BAYAN MOD 52495 46 | CHM00052533 39.7670 98.4830 1478.0 JIUQUAN GSN 52533 47 | CHM00052602 38.8330 93.3830 2771.0 LENGHU 52602 48 | CHM00052681 38.6330 103.0830 1367.0 MINQIN 52681 49 | CHM00052713 37.8500 95.3670 3174.0 DA-QAIDAM 52713 50 | CHM00052754 37.3330 100.1330 3302.0 GANGCA 52754 51 | CHM00052787 37.2000 102.8670 3044.0 WUSHAOLING 52787 52 | CHM00052818 36.4170 94.9000 2809.0 GOLMUD 52818 53 | CHM00052836 36.3000 98.1000 3192.0 DULAN GSN 52836 54 | CHM00052866 36.6170 101.7670 2296.0 XINING 52866 55 | CHM00052889 36.0500 103.8830 1518.0 LANZHOU 52889 56 | CHM00052955 35.5830 100.7500 3123.0 GUINAN 52955 57 | CHM00052957 35.2700 100.6500 3289.0 TONGDE 52957 58 | CHM00052983 35.8670 104.1500 1875.0 YU ZHONG GSN 52983 59 | CHM00053068 43.6500 112.0000 966.0 ERENHOT GSN 53068 60 | CHM00053192 44.0170 114.9500 1128.0 ABAG QI 53192 61 | CHM00053231 41.4500 106.3830 1510.0 HAILS 53231 62 | CHM00053276 42.4000 112.9000 1152.0 JURH 53276 63 | CHM00053336 41.5670 108.5170 1290.0 HALIUT 53336 64 | CHM00053352 41.7000 110.4330 1377.0 BAILING-MIAO 53352 65 | CHM00053391 41.9000 114.0000 1484.0 HUADE 53391 66 | CHM00053463 40.8170 111.6830 1065.0 HOHHOT 53463 67 | CHM00053487 40.1000 113.3330 1069.0 DATONG 53487 68 | CHM00053502 39.7830 105.7500 1143.0 JARTAI 53502 69 | CHM00053529 39.1000 107.9830 1381.0 OTOG QI 53529 70 | CHM00053614 38.4670 106.2000 1112.0 YINCHUAN GSN 53614 71 | CHM00053646 38.2330 109.7000 1058.0 YULIN 53646 72 | CHM00053673 38.7500 112.7000 838.0 YUANPING 53673 73 | CHM00053698 38.0330 114.4170 81.0 SHIJIAZHUANG 53698 74 | CHM00053723 37.8000 107.3830 1356.0 YANCHI 53723 75 | CHM00053772 37.7830 112.5500 779.0 TAIYUAN 53772 76 | CHM00053845 36.6000 109.5000 959.0 YAN AN 53845 77 | CHM00053863 37.0330 111.9170 745.0 JIEXIU 53863 78 | CHM00053898 36.0500 114.4000 64.0 ANYANG 53898 79 | CHM00053915 35.5500 
106.6670 1348.0 PINGLIANG 53915 80 | CHM00053959 35.0500 111.0500 365.0 YUNCHENG 53959 81 | CHM00054012 44.5830 117.6000 997.0 XI UJIMQIN QI 54012 82 | CHM00054026 44.5670 120.9000 266.0 JARUD QI 54026 83 | CHM00054027 43.9830 119.4000 485.0 LINDONG 54027 84 | CHM00054094 44.5670 129.6000 242.0 MUDANJIANG 54094 85 | CHM00054096 44.3830 131.1500 498.0 SUIFENHE 54096 86 | CHM00054102 43.9500 116.1170 1004.0 XILIN HOT 54102 87 | CHM00054115 43.6000 118.0670 800.0 LINXI 54115 88 | CHM00054135 43.6000 122.2670 180.0 TONGLIAO 54135 89 | CHM00054157 43.1830 124.3330 167.0 SIPING 54157 90 | CHM00054161 43.9000 125.2170 238.0 CHANGCHUN 54161 91 | CHM00054208 42.1830 116.4670 1247.0 DUOLUN 54208 92 | CHM00054218 42.3000 118.8330 567.0 CHIFENG 54218 93 | CHM00054236 42.4170 122.5330 84.0 ZHANGWU 54236 94 | CHM00054292 42.8670 129.5000 178.0 YANJI 54292 95 | CHM00054324 41.5500 120.4500 176.0 CHAOYANG 54324 96 | CHM00054337 41.1330 121.1170 70.0 JINZHOU 54337 97 | CHM00054342 41.7330 123.5170 43.0 SHENYANG GSN 54342 98 | CHM00054346 41.3170 123.7830 185.0 BENXI 54346 99 | CHM00054374 41.7170 126.9170 333.0 LINJIANG 54374 100 | CHM00054405 40.4170 115.5000 538.0 HUAILAI 54405 101 | CHM00054423 40.9670 117.9170 423.0 CHENGDE 54423 102 | CHM00054471 40.6670 122.2000 4.0 YINGKOU 54471 103 | CHM00054483 40.8830 123.9000 235.0 CAOHEKOU 54483 104 | CHM00054497 40.0500 124.3330 14.0 DANDONG 54497 105 | CHM00054511 39.9330 116.2830 55.0 BEIJING GSN 54511 106 | CHM00054527 39.1000 117.1670 5.0 TIANJIN 54527 107 | CHM00054539 39.4330 118.9000 12.0 LETING 54539 108 | CHM00054616 38.3330 116.8330 11.0 CANGZHOU 54616 109 | CHM00054618 38.0830 116.5500 13.0 POTOU 54618 110 | CHM00054662 38.9000 121.6330 97.0 DALIAN 54662 111 | CHM00054725 37.5000 117.5330 12.0 HUIMIN 54725 112 | CHM00054776 37.4000 122.6830 47.0 CHENGSHANTOU 54776 113 | CHM00054823 36.6000 117.0500 169.0 JINAN 54823 114 | CHM00054843 36.7670 119.1830 22.0 WEIFANG 54843 115 | CHM00054857 36.0670 120.3330 77.0 QINGDAO GSN 54857 116 | CHM00054906 35.2500 115.4330 51.0 HEZE/CAOZHOU 54906 117 | CHM00054909 35.0830 115.5170 51.0 DINGTAO 54909 118 | CHM00054916 35.5670 116.8500 53.0 YANZHOU 54916 119 | CHM00055228 32.5000 80.0830 4280.0 SHIQUANHE 55228 120 | CHM00055279 31.3670 90.0170 4701.0 BAINGOIN 55279 121 | CHM00055299 31.4830 92.0670 4508.0 NAGQU 55299 122 | CHM00055472 30.9500 88.6330 4670.0 XAINZA 55472 123 | CHM00055578 29.2500 88.8830 3837.0 XIGAZE 55578 124 | CHM00055591 29.6670 91.1330 3650.0 LHASA GSN 55591 125 | CHM00056004 34.2170 92.4330 4535.0 TUOTUOHE 56004 126 | CHM00056021 34.1330 95.7830 4176.0 QUMARLEB 56021 127 | CHM00056029 33.0000 96.9670 3717.0 YUSHU 56029 128 | CHM00056033 34.9170 98.2170 4273.0 MADOI 56033 129 | CHM00056046 33.7500 99.6500 3968.0 DARLAG 56046 130 | CHM00056079 33.5830 102.9670 3441.0 RUO'ERGAI 56079 131 | CHM00056080 35.0000 102.9000 2910.0 HEZUO 56080 132 | CHM00056096 33.4000 104.9170 1079.0 WUDU 56096 133 | CHM00056106 31.8830 93.7830 4024.0 SOG XIAN 56106 134 | CHM00056116 31.4170 95.6000 3874.0 DENGQEN 56116 135 | CHM00056137 31.1500 97.1670 3307.0 QAMDO GSN 56137 136 | CHM00056146 31.6170 100.0000 3394.0 GARZE 56146 137 | CHM00056172 31.9000 102.2330 2666.0 BARKAM 56172 138 | CHM00056182 32.6500 103.5670 2852.0 SONGPAN 56182 139 | CHM00056187 30.7500 103.8670 541.0 WENJIANG GSN 56187 140 | CHM00056257 30.0000 100.2670 3950.0 LITANG 56257 141 | CHM00056294 30.6670 104.0170 508.0 CHENGDU 56294 142 | CHM00056444 28.4500 98.8830 3320.0 DEQEN 56444 143 | CHM00056462 29.0000 101.5000 2994.0 JIULONG 56462 144 | 
CHM00056492 28.8000 104.6000 342.0 YIBIN 56492 145 | CHM00056571 27.9000 102.2670 1599.0 XICHANG GSN 56571 146 | CHM00056651 26.8330 100.4670 2394.0 LIJING 56651 147 | CHM00056671 26.6500 102.2500 1788.0 HUILI 56671 148 | CHM00056739 25.1170 98.4830 1649.0 TENGCHONG GSN 56739 149 | CHM00056768 25.0170 101.5170 1820.0 CHUXIONG 56768 150 | CHM00056778 25.0170 102.6830 1892.0 KUNMING 56778 151 | CHM00056951 23.9500 100.2170 1503.0 LINCANG 56951 152 | CHM00056954 22.5670 99.9330 1054.0 LANCANG 56954 153 | CHM00056964 22.7670 100.9830 1303.0 SIMAO 56964 154 | CHM00056985 23.3830 103.3830 1302.0 MENGZI GSN 56985 155 | CHM00057006 34.5830 105.7500 1143.0 TIANSHUI 57006 156 | CHM00057014 34.5670 105.8670 1086.0 BEIDAO 57014 157 | CHM00057036 34.3000 108.9330 479.0 XIAN 57036 158 | CHM00057067 34.0500 111.0330 570.0 LUSHI 57067 159 | CHM00057083 34.7170 113.6500 111.0 ZHENGZHOU GSN 57083 160 | CHM00057127 33.0670 107.0330 509.0 HANZHONG 57127 161 | CHM00057131 34.4330 108.9670 411.0 JINGHE 57131 162 | CHM00057237 32.0670 108.0330 674.0 WANYUAN 57237 163 | CHM00057265 32.3830 111.6670 91.0 GUANGHUA 57265 164 | CHM00057290 33.0000 114.0170 83.0 ZHUMADIAN 57290 165 | CHM00057297 32.1330 114.0500 115.0 XINYANG 57297 166 | CHM00057411 30.8000 106.0830 310.0 NANCHONG 57411 167 | CHM00057447 30.2830 109.4670 458.0 ENSHI 57447 168 | CHM00057461 30.7000 111.3000 134.0 YICHANG GSN 57461 169 | CHM00057494 30.6000 114.0500 34.0 WUHAN 57494 170 | CHM00057515 29.5170 106.4830 351.0 CHONG-QING 57515 171 | CHM00057516 29.5830 106.4670 416.0 CHONGQING 57516 172 | CHM00057633 28.8330 108.7670 665.0 YOUYANG 57633 173 | CHM00057662 29.0500 111.6830 35.0 CHANGDE 57662 174 | CHM00057679 28.2000 113.0830 46.0 CHANGSHA 57679 175 | CHM00057687 28.2330 112.8670 68.0 CHANGSHA 57687 176 | CHM00057707 27.3000 105.2330 1511.0 BIJIE 57707 177 | CHM00057713 27.7000 106.8830 845.0 ZUNYI 57713 178 | CHM00057745 27.4500 109.6830 273.0 ZHIJIANG GSN 57745 179 | CHM00057799 27.1170 114.9670 78.0 JIAN 57799 180 | CHM00057816 26.5830 106.7330 1223.0 GUIYANG 57816 181 | CHM00057866 26.2330 111.6170 174.0 LINGLING 57866 182 | CHM00057902 25.4330 105.1830 1379.0 XINGREN 57902 183 | CHM00057957 25.3330 110.3000 174.0 GUILIN 57957 184 | CHM00057993 25.8670 115.0000 138.0 GANZHOU GSN 57993 185 | CHM00058027 34.2830 117.1500 42.0 XUZHOU 58027 186 | CHM00058040 34.8500 119.1330 10.0 GANYU 58040 187 | CHM00058102 33.8830 115.7670 42.0 BOXIAN 58102 188 | CHM00058221 32.9500 117.3670 22.0 BENGBU 58221 189 | CHM00058238 31.9330 118.9000 15.0 NANJING 58238 190 | CHM00058251 32.8500 120.2830 5.0 DONGTAI 58251 191 | CHM00058314 31.4000 116.3330 68.0 HUOSHAN 58314 192 | CHM00058321 31.8670 117.2330 36.0 HEFEI 58321 193 | CHM00058362 31.4000 121.4670 4.0 SHANGHAI GSN 58362 194 | CHM00058367 31.1670 121.4330 7.0 SHANGHAI/HONGQIAO 58367 195 | CHM00058424 30.5330 117.0500 20.0 ANQING 58424 196 | CHM00058457 30.2330 120.1670 43.0 HANGZHOU 58457 197 | CHM00058477 30.0330 122.1170 37.0 DINGHAI 58477 198 | CHM00058527 29.3000 117.2000 60.0 JINGDEZHEN 58527 199 | CHM00058606 28.6000 115.9170 50.0 NANCHANG GSN 58606 200 | CHM00058633 28.9670 118.8670 71.0 QU XIAN 58633 201 | CHM00058659 28.0170 120.6670 7.0 WENZHOU 58659 202 | CHM00058666 28.4500 121.8830 84.0 DACHEN DAO 58666 203 | CHM00058715 27.5830 116.6500 82.0 NANCHENG 58715 204 | CHM00058752 27.7830 120.6500 38.0 RUIAN 58752 205 | CHM00058834 26.6330 118.0000 128.0 NANPING 58834 206 | CHM00058847 26.0830 119.2830 14.0 FUZHOU 58847 207 | CHM00058921 25.9670 117.3500 204.0 YONGAN 58921 208 | CHM00059023 
24.7000 108.0500 214.0 HECHI 59023 209 | CHM00059082 24.6670 113.6000 68.0 SHAOGUAN 59082 210 | CHM00059134 24.4830 118.0830 18.0 XIAMEN 59134 211 | CHM00059211 23.9000 106.6000 177.0 BAISE 59211 212 | CHM00059254 23.4000 110.0830 44.0 GUIPING 59254 213 | CHM00059265 23.4830 111.3000 120.0 WUZHOU 59265 214 | CHM00059287 23.2170 113.4830 71.0 GUANGZHOU GSN 59287 215 | CHM00059293 23.8000 114.7330 41.0 HEYUAN 59293 216 | CHM00059316 23.4000 116.6830 3.0 SHANTOU GSN 59316 217 | CHM00059417 22.3670 106.7500 129.0 LONGZHOU 59417 218 | CHM00059431 22.6330 108.2170 126.0 NANNING GSN 59431 219 | CHM00059501 22.7830 115.3670 5.0 SHANWEI 59501 220 | CHM00059632 21.9500 108.6170 6.0 QINZHOU 59632 221 | CHM00059663 21.8670 111.9670 22.0 YANGJIANG 59663 222 | CHM00059758 20.0000 110.2500 24.0 HAIKOU GSN 59758 223 | CHM00059838 19.1000 108.6170 8.0 DONGFANG 59838 224 | CHM00059855 19.2330 110.4670 25.0 QIONGHAI 59855 225 | CHM00059948 18.2170 109.5830 7.0 SANYA 59948 226 | CHM00059981 16.8330 112.3330 5.0 XISHA DAO 59981 227 | CHW00043204 31.2333 121.4500 27.1 HUNG CHIA 228 | CHW00043205 36.1833 120.3833 14.0 TSINGTAO -------------------------------------------------------------------------------- /data/readme.txt: -------------------------------------------------------------------------------- 1 | 数据从www.data.gov上获取 2 | -------------------------------------------------------------------------------- /答辩.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lutgcy/BigData-Spark/0711d1e064040d7209ccacf8cfe83e7c7593e70a/答辩.pdf --------------------------------------------------------------------------------