├── 1. 파이썬 기본 문법과 pandas (개정).ipynb
├── 2.1 결국 가격이다 (개정).ipynb
├── 2.2  '기준'이 필요해! (개정).ipynb
├── 3.1 부동산 수요 (개정).ipynb
├── 3.2 부동산 공급 (개정).ipynb
├── 4.1 학군 (개정).ipynb
├── 4.2 일자리 (개정).ipynb
├── 5. 지도와 부동산 (개정).ipynb
└── 데이터
    ├── 2018년 2차_졸업생의 진로 현황(전체).xlsx
    ├── 2018년 시도별 행정구역별 사설학원 현황.xlsx
    ├── 2018년_공시대상학교정보(전체).xlsx
    ├── KIKcd_B.20190101.xlsx
    ├── SIG_201804.zip
    ├── XrProjection 변환결과
    │   ├── TL_SCCO_SIG_WGS84.shp
    │   ├── TL_SCCO_SIG_WGS84.shx
    │   └── tl_scco_sig_wgs84.dbf
    ├── XrProjection 설치파일
    │   └── setup.exe
    ├── json 변환결과
    │   └── TL_SCCO_SIG_WGS84.json
    ├── ★(월간)KB주택가격동향_시계열(2019.01)12831994601335062.xls
    ├── 시·군·구별+미분양현황_2082_128_20181229151931.xlsx
    ├── 주택건설인허가실적.xlsx
    ├── 평균매매가격_아파트.xlsx
    ├── 평균전세가격_아파트.xlsx
    ├── 행정구역_시군구_별_주민등록세대수_20190107134842.xlsx
    ├── 행정구역_시도_별_1인당_지역내총생산__지역총소득__개인소득_20180821155737.xlsx
    └── 행정구역_시도_별_1인당_지역내총생산__지역총소득__개인소득_20190310191045.xlsx


/1. 파이썬 기본 문법과 pandas (개정).ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "## Basic Python data types"
  8 |    ]
  9 |   },
 10 |   {
 11 |    "cell_type": "code",
 12 |    "execution_count": null,
 13 |    "metadata": {
 14 |     "scrolled": true
 15 |    },
 16 |    "outputs": [],
 17 |    "source": [
 18 |     "3"
 19 |    ]
 20 |   },
 21 |   {
 22 |    "cell_type": "code",
 23 |    "execution_count": null,
 24 |    "metadata": {},
 25 |    "outputs": [],
 26 |    "source": [
 27 |     "3.0"
 28 |    ]
 29 |   },
 30 |   {
 31 |    "cell_type": "code",
 32 |    "execution_count": null,
 33 |    "metadata": {},
 34 |    "outputs": [],
 35 |    "source": [
 36 |     "'3'"
 37 |    ]
 38 |   },
 39 |   {
 40 |    "cell_type": "code",
 41 |    "execution_count": null,
 42 |    "metadata": {},
 43 |    "outputs": [],
 44 |    "source": [
 45 |     "type(3)"
 46 |    ]
 47 |   },
 48 |   {
 49 |    "cell_type": "code",
 50 |    "execution_count": null,
 51 |    "metadata": {},
 52 |    "outputs": [],
 53 |    "source": [
 54 |     "type(3.0)"
 55 |    ]
 56 |   },
 57 |   {
 58 |    "cell_type": "code",
 59 |    "execution_count": null,
 60 |    "metadata": {},
 61 |    "outputs": [],
 62 |    "source": [
 63 |     "type('3')"
 64 |    ]
 65 |   },
 66 |   {
 67 |    "cell_type": "markdown",
 68 |    "metadata": {},
 69 |    "source": [
 70 |     "## Basic Python operations"
 71 |    ]
 72 |   },
 73 |   {
 74 |    "cell_type": "code",
 75 |    "execution_count": null,
 76 |    "metadata": {},
 77 |    "outputs": [],
 78 |    "source": [
 79 |     "3 + 3.0"
 80 |    ]
 81 |   },
 82 |   {
 83 |    "cell_type": "code",
 84 |    "execution_count": null,
 85 |    "metadata": {},
 86 |    "outputs": [],
 87 |    "source": [
 88 |     "3 + '3'"
 89 |    ]
 90 |   },
 91 |   {
 92 |    "cell_type": "code",
 93 |    "execution_count": null,
 94 |    "metadata": {},
 95 |    "outputs": [],
 96 |    "source": [
 97 |     "'Hello, ' + 'world'"
 98 |    ]
 99 |   },
100 |   {
101 |    "cell_type": "code",
102 |    "execution_count": null,
103 |    "metadata": {},
104 |    "outputs": [],
105 |    "source": [
106 |     "'Hello' * 5"
107 |    ]
108 |   },
109 |   {
110 |    "cell_type": "markdown",
111 |    "metadata": {},
112 |    "source": [
113 |     "## Variables"
114 |    ]
115 |   },
116 |   {
117 |    "cell_type": "code",
118 |    "execution_count": null,
119 |    "metadata": {},
120 |    "outputs": [],
121 |    "source": [
122 |     "test_name = 3"
123 |    ]
124 |   },
125 |   {
126 |    "cell_type": "code",
127 |    "execution_count": null,
128 |    "metadata": {},
129 |    "outputs": [],
130 |    "source": [
131 |     "test_name2 = 3.0"
132 |    ]
133 |   },
134 |   {
135 |    "cell_type": "code",
136 |    "execution_count": null,
137 |    "metadata": {},
138 |    "outputs": [],
139 |    "source": [
140 |     "test_name3 = '3'"
141 |    ]
142 |   },
143 |   {
144 |    "cell_type": "markdown",
145 |    "metadata": {},
146 |    "source": [
147 |     "## Lists"
148 |    ]
149 |   },
150 |   {
151 |    "cell_type": "code",
152 |    "execution_count": null,
153 |    "metadata": {},
154 |    "outputs": [],
155 |    "source": [
156 |     "menu = ['짜장면', '돈까스', '냉면']"
157 |    ]
158 |   },
159 |   {
160 |    "cell_type": "code",
161 |    "execution_count": null,
162 |    "metadata": {},
163 |    "outputs": [],
164 |    "source": [
165 |     "menu"
166 |    ]
167 |   },
168 |   {
169 |    "cell_type": "code",
170 |    "execution_count": null,
171 |    "metadata": {},
172 |    "outputs": [],
173 |    "source": [
174 |     "menu[0]"
175 |    ]
176 |   },
177 |   {
178 |    "cell_type": "code",
179 |    "execution_count": null,
180 |    "metadata": {},
181 |    "outputs": [],
182 |    "source": [
183 |     "menu[1:3]"
184 |    ]
185 |   },
186 |   {
187 |    "cell_type": "markdown",
188 |    "metadata": {},
189 |    "source": [
190 |     "## Dictionaries"
191 |    ]
192 |   },
193 |   {
194 |    "cell_type": "code",
195 |    "execution_count": null,
196 |    "metadata": {},
197 |    "outputs": [],
198 |    "source": [
199 |     "grade = {'수학':90, '영어':100, '국어':70}"
200 |    ]
201 |   },
202 |   {
203 |    "cell_type": "code",
204 |    "execution_count": null,
205 |    "metadata": {},
206 |    "outputs": [],
207 |    "source": [
208 |     "grade"
209 |    ]
210 |   },
211 |   {
212 |    "cell_type": "code",
213 |    "execution_count": null,
214 |    "metadata": {},
215 |    "outputs": [],
216 |    "source": [
217 |     "grade['수학']"
218 |    ]
219 |   },
220 |   {
221 |    "cell_type": "markdown",
222 |    "metadata": {},
223 |    "source": [
224 |     "## for loops"
225 |    ]
226 |   },
227 |   {
228 |    "cell_type": "code",
229 |    "execution_count": null,
230 |    "metadata": {},
231 |    "outputs": [],
232 |    "source": [
233 |     "for temp in [1, 2, 3, 4, 5]:\n",
234 |     "    print(temp)"
235 |    ]
236 |   },
237 |   {
238 |    "cell_type": "code",
239 |    "execution_count": null,
240 |    "metadata": {},
241 |    "outputs": [],
242 |    "source": [
243 |     "for temp in range(1, 100):\n",
244 |     "    print(temp, end='\\t')"
245 |    ]
246 |   },
247 |   {
248 |    "cell_type": "markdown",
249 |    "metadata": {},
250 |    "source": [
251 |     "## while loops"
252 |    ]
253 |   },
254 |   {
255 |    "cell_type": "code",
256 |    "execution_count": null,
257 |    "metadata": {},
258 |    "outputs": [],
259 |    "source": [
260 |     "a = 1\n",
261 |     "while a < 6:\n",
262 |     "    print(a)\n",
263 |     "    a = a + 1"
264 |    ]
265 |   },
266 |   {
267 |    "cell_type": "markdown",
268 |    "metadata": {},
269 |    "source": [
270 |     "## Conditionals"
271 |    ]
272 |   },
273 |   {
274 |    "cell_type": "code",
275 |    "execution_count": null,
276 |    "metadata": {},
277 |    "outputs": [],
278 |    "source": [
279 |     "number = 10\n",
280 |     "\n",
281 |     "if number > 0:\n",
282 |     "    print('positive')\n",
283 |     "elif number < 0:\n",
284 |     "    print('negative')\n",
285 |     "else:\n",
286 |     "    print(0)"
287 |    ]
288 |   },
289 |   {
290 |    "cell_type": "markdown",
291 |    "metadata": {},
292 |    "source": [
293 |     "## Functions"
294 |    ]
295 |   },
296 |   {
297 |    "cell_type": "code",
298 |    "execution_count": null,
299 |    "metadata": {},
300 |    "outputs": [],
301 |    "source": [
302 |     "height = 3\n",
303 |     "bottom = 4\n",
304 |     "hypotenuse = (height ** 2 + bottom ** 2) ** 0.5"
305 |    ]
306 |   },
307 |   {
308 |    "cell_type": "code",
309 |    "execution_count": null,
310 |    "metadata": {},
311 |    "outputs": [],
312 |    "source": [
313 |     "hypotenuse"
314 |    ]
315 |   },
316 |   {
317 |    "cell_type": "code",
318 |    "execution_count": null,
319 |    "metadata": {},
320 |    "outputs": [],
321 |    "source": [
322 |     "def pita():\n",
323 |     "    height_ft = 3\n",
324 |     "    bottom_ft = 4\n",
325 |     "    hypotenuse_ft = (height_ft ** 2 + bottom_ft ** 2) ** 0.5"
326 |    ]
327 |   },
328 |   {
329 |    "cell_type": "code",
330 |    "execution_count": null,
331 |    "metadata": {},
332 |    "outputs": [],
333 |    "source": [
334 |     "pita()"
335 |    ]
336 |   },
337 |   {
338 |    "cell_type": "code",
339 |    "execution_count": null,
340 |    "metadata": {},
341 |    "outputs": [],
342 |    "source": [
343 |     "hypotenuse_ft"
344 |    ]
345 |   },
346 |   {
347 |    "cell_type": "code",
348 |    "execution_count": null,
349 |    "metadata": {},
350 |    "outputs": [],
351 |    "source": [
352 |     "def pita():\n",
353 |     "    height_ft = 3\n",
354 |     "    bottom_ft = 4\n",
355 |     "    hypotenuse_ft = (height_ft ** 2 + bottom_ft ** 2) ** 0.5\n",
356 |     "    return hypotenuse_ft"
357 |    ]
358 |   },
359 |   {
360 |    "cell_type": "code",
361 |    "execution_count": null,
362 |    "metadata": {},
363 |    "outputs": [],
364 |    "source": [
365 |     "pita()"
366 |    ]
367 |   },
368 |   {
369 |    "cell_type": "code",
370 |    "execution_count": null,
371 |    "metadata": {},
372 |    "outputs": [],
373 |    "source": [
374 |     "def pita(height_ft, bottom_ft):\n",
375 |     "    hypotenuse_ft = (height_ft ** 2 + bottom_ft ** 2) ** 0.5\n",
376 |     "    return hypotenuse_ft"
377 |    ]
378 |   },
379 |   {
380 |    "cell_type": "code",
381 |    "execution_count": null,
382 |    "metadata": {},
383 |    "outputs": [],
384 |    "source": [
385 |     "print(pita(3, 4))\n",
386 |     "print(pita(2, 3))"
387 |    ]
388 |   },
389 |   {
390 |    "cell_type": "markdown",
391 |    "metadata": {},
392 |    "source": [
393 |     "## Pandas"
394 |    ]
395 |   },
396 |   {
397 |    "cell_type": "code",
398 |    "execution_count": null,
399 |    "metadata": {},
400 |    "outputs": [],
401 |    "source": [
402 |     "import pandas as pd"
403 |    ]
404 |   },
405 |   {
406 |    "cell_type": "code",
407 |    "execution_count": null,
408 |    "metadata": {},
409 |    "outputs": [],
410 |    "source": [
411 |     "series_ex = pd.Series([100, 90, 120, 110, 105])\n",
412 |     "\n",
413 |     "series_ex"
414 |    ]
415 |   },
416 |   {
417 |    "cell_type": "code",
418 |    "execution_count": null,
419 |    "metadata": {},
420 |    "outputs": [],
421 |    "source": [
422 |     "series_ex2 = pd.Series([100, 90, 120, 110, 105], index=['월', '화', '수', '목', '금'])\n",
423 |     "\n",
424 |     "series_ex2"
425 |    ]
426 |   },
427 |   {
428 |    "cell_type": "code",
429 |    "execution_count": null,
430 |    "metadata": {},
431 |    "outputs": [],
432 |    "source": [
433 |     "series_ex2['월']"
434 |    ]
435 |   },
436 |   {
437 |    "cell_type": "code",
438 |    "execution_count": null,
439 |    "metadata": {},
440 |    "outputs": [],
441 |    "source": [
442 |     "series_ex2.iloc[1]"
443 |    ]
444 |   },
445 |   {
446 |    "cell_type": "code",
447 |    "execution_count": null,
448 |    "metadata": {},
449 |    "outputs": [],
450 |    "source": [
451 |     "dataframe_ex = pd.DataFrame({'서울GDP':[100, 110, 120, 130, 135], \n",
452 |     "              '부산GDP':[70, 90, 80, 90, 100], \n",
453 |     "              '광주GDP':[50, 60, 65, 69, 80]}, \n",
454 |     "             index=[2012, 2013, 2014, 2015, 2016])\n",
455 |     "\n",
456 |     "dataframe_ex"
457 |    ]
458 |   },
459 |   {
460 |    "cell_type": "code",
461 |    "execution_count": null,
462 |    "metadata": {},
463 |    "outputs": [],
464 |    "source": [
465 |     "dataframe_ex['서울GDP']"
466 |    ]
467 |   },
468 |   {
469 |    "cell_type": "code",
470 |    "execution_count": null,
471 |    "metadata": {},
472 |    "outputs": [],
473 |    "source": [
474 |     "dataframe_ex.loc[2015]"
475 |    ]
476 |   },
477 |   {
478 |    "cell_type": "code",
479 |    "execution_count": null,
480 |    "metadata": {},
481 |    "outputs": [],
482 |    "source": [
483 |     "print(dataframe_ex['서울GDP'][2015])\n",
484 |     "print(dataframe_ex.loc[2015]['서울GDP'])"
485 |    ]
486 |   },
487 |   {
488 |    "cell_type": "code",
489 |    "execution_count": null,
490 |    "metadata": {},
491 |    "outputs": [],
492 |    "source": [
493 |     "series_ex2.sort_values()"
494 |    ]
495 |   },
496 |   {
497 |    "cell_type": "code",
498 |    "execution_count": null,
499 |    "metadata": {},
500 |    "outputs": [],
501 |    "source": [
502 |     "series_ex2.sort_values(ascending=False)"
503 |    ]
504 |   },
505 |   {
506 |    "cell_type": "code",
507 |    "execution_count": null,
508 |    "metadata": {},
509 |    "outputs": [],
510 |    "source": [
511 |     "dataframe_ex.sort_values(by='서울GDP', ascending=False)"
512 |    ]
513 |   },
514 |   {
515 |    "cell_type": "code",
516 |    "execution_count": null,
517 |    "metadata": {},
518 |    "outputs": [],
519 |    "source": [
520 |     "series_ex2[ series_ex2 > 100 ]"
521 |    ]
522 |   },
523 |   {
524 |    "cell_type": "code",
525 |    "execution_count": null,
526 |    "metadata": {},
527 |    "outputs": [],
528 |    "source": [
529 |     "dataframe_ex[ dataframe_ex['부산GDP'] > 80 ]"
530 |    ]
531 |   },
532 |   {
533 |    "cell_type": "markdown",
534 |    "metadata": {},
535 |    "source": [
536 |     "## Comments"
537 |    ]
538 |   },
539 |   {
540 |    "cell_type": "code",
541 |    "execution_count": null,
542 |    "metadata": {},
543 |    "outputs": [],
544 |    "source": [
545 |     "# print('This is a comment.')\n",
546 |     "print('This is not a comment.')"
547 |    ]
548 |   }
549 |  ],
550 |  "metadata": {
551 |   "kernelspec": {
552 |    "display_name": "Python 3",
553 |    "language": "python",
554 |    "name": "python3"
555 |   },
556 |   "language_info": {
557 |    "codemirror_mode": {
558 |     "name": "ipython",
559 |     "version": 3
560 |    },
561 |    "file_extension": ".py",
562 |    "mimetype": "text/x-python",
563 |    "name": "python",
564 |    "nbconvert_exporter": "python",
565 |    "pygments_lexer": "ipython3",
566 |    "version": "3.6.4"
567 |   }
568 |  },
569 |  "nbformat": 4,
570 |  "nbformat_minor": 2
571 | }
572 | 
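
The function cells in the notebook above define `pita()` first without and then with a `return`, and evaluate `hypotenuse_ft` in between. A minimal sketch of what that sequence demonstrates, assuming a fresh kernel in which no global `hypotenuse_ft` exists. The names `pita_no_return` and `pita_with_return` are hypothetical and introduced only for this illustration:

```python
def pita_no_return(height_ft, bottom_ft):
    # The result is bound to a local name and discarded when the function exits.
    hypotenuse_ft = (height_ft ** 2 + bottom_ft ** 2) ** 0.5


def pita_with_return(height_ft, bottom_ft):
    # Returning the value hands it back to the caller.
    return (height_ft ** 2 + bottom_ft ** 2) ** 0.5


result_without = pita_no_return(3, 4)   # None, because nothing was returned
result_with = pita_with_return(3, 4)    # 5.0

try:
    hypotenuse_ft  # names local to a function are not visible out here
    leaked = True
except NameError:
    leaked = False

print(result_without, result_with, leaked)
```

Calling a function without `return` still succeeds; it simply evaluates to `None`, which is why the notebook's first `pita()` cell shows no output.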


--------------------------------------------------------------------------------
/2.1 결국 가격이다 (개정).ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "code",
  5 |    "execution_count": null,
  6 |    "metadata": {},
  7 |    "outputs": [],
  8 |    "source": [
  9 |     "# [Example 2.1] Attempt to read the Excel file with pandas.read_excel\n",
 10 |     "\n",
 11 |     "import pandas as pd\n",
 12 |     "\n",
 13 |     "# Enter the directory and file name of the data downloaded from KB as a single string, as shown below\n",
 14 |     "#path = r'C:\\Users\\JK\\Desktop\\부동산 데이터\\KB부동산\\월간\\★(월간)KB주택가격동향_시계열(2019.01)12831994601335062.xls'\n",
 15 |     "path = r'<directory of the KB Excel file you downloaded>\\<KB Excel file name>.xls'\n",
 16 |     "raw_data = pd.read_excel(path, sheet_name='매매종합')"
 17 |    ]
 18 |   },
 19 |   {
 20 |    "cell_type": "code",
 21 |    "execution_count": null,
 22 |    "metadata": {
 23 |     "scrolled": true
 24 |    },
 25 |    "outputs": [],
 26 |    "source": [
 27 |     "# [Example 2.2] Load the Excel data with the xlwings module and store it as a DataFrame\n",
 28 |     "\n",
 29 |     "import pandas as pd\n",
 30 |     "import xlwings as xw\n",
 31 |     "\n",
 32 |     "path = r'<directory of the file you downloaded>\\<KB Excel file name>.xls'\n",
 33 |     "wb = xw.Book(path)  # open the workbook through Excel\n",
 34 |     "sheet = wb.sheets['매매종합']  # composite sale price sheet\n",
 35 |     "row_num = sheet.range(1,1).end('down').end('down').end('down').row  # jump down three times from A1 to find the last row\n",
 36 |     "data_range = 'A2:GE' + str(row_num)\n",
 37 |     "raw_data = sheet[data_range].options(pd.DataFrame, index=False, header=True).value"
 38 |    ]
 39 |   },
 40 |   {
 41 |    "cell_type": "code",
 42 |    "execution_count": null,
 43 |    "metadata": {},
 44 |    "outputs": [],
 45 |    "source": [
 46 |     "# [Example 2.3] Get the province/city (si-do) row and the district (gu-gun) row as lists\n",
 47 |     "\n",
 48 |     "big_col = list(raw_data.columns)\n",
 49 |     "small_col = list(raw_data.iloc[0])"
 50 |    ]
 51 |   },
 52 |   {
 53 |    "cell_type": "code",
 54 |    "execution_count": null,
 55 |    "metadata": {},
 56 |    "outputs": [],
 57 |    "source": [
 58 |     "# [Example 2.4] Replace None entries in the small_col list\n",
 59 |     "\n",
 60 |     "for num, gu_data in enumerate(small_col):\n",
 61 |     "    if gu_data is None:\n",
 62 |     "        small_col[num] = big_col[num]"
 63 |    ]
 64 |   },
 65 |   {
 66 |    "cell_type": "code",
 67 |    "execution_count": null,
 68 |    "metadata": {},
 69 |    "outputs": [],
 70 |    "source": [
 71 |     "# [Example 2.5] Complete the small_col and big_col lists\n",
 72 |     "\n",
 73 |     "bignames = '서울 대구 부산 대전 광주 인천 울산 세종 경기 강원 충북 충남 전북 전남 경북 경남 제주도 6개광역시 5개광역시 수도권 기타지방 구분 전국'\n",
 74 |     "bigname_list = bignames.split(' ')\n",
 75 |     "big_col = list(raw_data.columns)\n",
 76 |     "small_col = list(raw_data.iloc[0])\n",
 77 |     "\n",
 78 |     "for num, gu_data in enumerate(small_col):\n",
 79 |     "    if gu_data is None:\n",
 80 |     "        small_col[num] = big_col[num]\n",
 81 |     "    \n",
 82 |     "    check = num\n",
 83 |     "    while True:\n",
 84 |     "        if big_col[check] in bigname_list:\n",
 85 |     "            big_col[num] = big_col[check]\n",
 86 |     "            break\n",
 87 |     "        else:\n",
 88 |     "            check = check - 1"
 89 |    ]
 90 |   },
 91 |   {
 92 |    "cell_type": "code",
 93 |    "execution_count": null,
 94 |    "metadata": {},
 95 |    "outputs": [],
 96 |    "source": [
 97 |     "# [Example 2.6] Fix the exceptional entries in small_col and big_col\n",
 98 |     "\n",
 99 |     "big_col[129] = '경기' \n",
100 |     "big_col[130] = '경기'\n",
101 |     "small_col[185] = '서귀포'"
102 |    ]
103 |   },
104 |   {
105 |    "cell_type": "code",
106 |    "execution_count": null,
107 |    "metadata": {},
108 |    "outputs": [],
109 |    "source": [
110 |     "# [Example 2.7] Assign the new columns\n",
111 |     "\n",
112 |     "raw_data.columns = [big_col, small_col]\n",
113 |     "new_col_data = raw_data.drop([0,1])"
114 |    ]
115 |   },
116 |   {
117 |    "cell_type": "code",
118 |    "execution_count": null,
119 |    "metadata": {},
120 |    "outputs": [],
121 |    "source": [
122 |     "# [Example 2.8] Consolidated code: read the data, reset the columns, and store the result as a DataFrame\n",
123 |     "\n",
124 |     "import pandas as pd\n",
125 |     "import xlwings as xw\n",
126 |     "\n",
127 |     "path = r'<directory of the file you downloaded>\\<KB Excel file name>.xls'\n",
128 |     "wb = xw.Book(path)                \n",
129 |     "sheet = wb.sheets['매매종합']   \n",
130 |     "row_num = sheet.range(1,1).end('down').end('down').end('down').row  \n",
131 |     "data_range = 'A2:GE' + str(row_num)\n",
132 |     "raw_data = sheet[data_range].options(pd.DataFrame, index=False, header=True).value \n",
133 |     "\n",
134 |     "bignames = '서울 대구 부산 대전 광주 인천 울산 세종 경기 강원 충북 충남 전북 전남 경북 경남 제주도 6개광역시 5개광역시 수도권 기타지방 구분 전국'\n",
135 |     "bigname_list = bignames.split(' ')\n",
136 |     "big_col = list(raw_data.columns)\n",
137 |     "small_col = list(raw_data.iloc[0])\n",
138 |     "\n",
139 |     "for num, gu_data in enumerate(small_col):\n",
140 |     "    if gu_data is None:\n",
141 |     "        small_col[num] = big_col[num]\n",
142 |     "    \n",
143 |     "    check = num\n",
144 |     "    while True:\n",
145 |     "        if big_col[check] in bigname_list:\n",
146 |     "            big_col[num] = big_col[check]\n",
147 |     "            break\n",
148 |     "        else:\n",
149 |     "            check = check - 1\n",
150 |     "            \n",
151 |     "big_col[129] = '경기' \n",
152 |     "big_col[130] = '경기'\n",
153 |     "small_col[185] = '서귀포'\n",
154 |     "\n",
155 |     "raw_data.columns = [big_col, small_col]\n",
156 |     "new_col_data = raw_data.drop([0,1])"
157 |    ]
158 |   },
159 |   {
160 |    "cell_type": "code",
161 |    "execution_count": null,
162 |    "metadata": {},
163 |    "outputs": [],
164 |    "source": [
165 |     "# [Example 2.9] Build a list of dates for the index\n",
166 |     "\n",
167 |     "index_list = list(new_col_data['구분']['구분'])\n",
168 |     "\n",
169 |     "new_index = []\n",
170 |     "\n",
171 |     "for num, raw_index in enumerate(index_list):\n",
172 |     "    temp = str(raw_index).split('.')\n",
173 |     "    if int(temp[0]) > 12 :\n",
174 |     "        if len(temp[0]) == 2:\n",
175 |     "            new_index.append('19' + temp[0] + '.' + temp[1])\n",
176 |     "        else:\n",
177 |     "            new_index.append(temp[0] + '.' + temp[1])\n",
178 |     "    else:\n",
179 |     "        new_index.append(new_index[num-1].split('.')[0] + '.' + temp[0])\n",
180 |     "\n",
181 |     "        \n",
182 |     "# [Example 2.10] Set the date list as the index\n",
183 |     "\n",
184 |     "new_col_data.set_index(pd.to_datetime(new_index), inplace=True)\n",
185 |     "cleaned_data  = new_col_data.drop(('구분', '구분'), axis=1)"
186 |    ]
187 |   },
188 |   {
189 |    "cell_type": "code",
190 |    "execution_count": null,
191 |    "metadata": {},
192 |    "outputs": [],
193 |    "source": [
194 |     "# [Example 2.11] Wrap the preprocessing in a function\n",
195 |     "\n",
196 |     "def KBpriceindex_preprocessing(path, data_type):\n",
197 |     "    # path: path (directory plus file name) of the KB data Excel file (string)\n",
198 |     "    # data_type: sheet name, one of '매매종합', '매매APT', '매매연립', '매매단독', '전세종합', '전세APT', '전세연립', '전세단독'\n",
199 |     "    \n",
200 |     "    wb = xw.Book(path)                \n",
201 |     "    sheet = wb.sheets[data_type]   \n",
202 |     "    row_num = sheet.range(1,1).end('down').end('down').end('down').row  \n",
203 |     "    data_range = 'A2:GE' + str(row_num)\n",
204 |     "    raw_data = sheet[data_range].options(pd.DataFrame, index=False, header=True).value \n",
205 |     "    \n",
206 |     "    bignames = '서울 대구 부산 대전 광주 인천 울산 세종 경기 강원 충북 충남 전북 전남 경북 경남 제주도 6개광역시 5개광역시 수도권 기타지방 구분 전국'\n",
207 |     "    bigname_list = bignames.split(' ')\n",
208 |     "    big_col = list(raw_data.columns)\n",
209 |     "    small_col = list(raw_data.iloc[0])\n",
210 |     "\n",
211 |     "    for num, gu_data in enumerate(small_col):\n",
212 |     "        if gu_data is None:\n",
213 |     "            small_col[num] = big_col[num]\n",
214 |     "\n",
215 |     "        check = num\n",
216 |     "        while True:\n",
217 |     "            if big_col[check] in bigname_list:\n",
218 |     "                big_col[num] = big_col[check]\n",
219 |     "                break\n",
220 |     "            else:\n",
221 |     "                check = check - 1\n",
222 |     "                \n",
223 |     "    big_col[129] = '경기' \n",
224 |     "    big_col[130] = '경기'\n",
225 |     "    small_col[185] = '서귀포'\n",
226 |     "    \n",
227 |     "    raw_data.columns = [big_col, small_col]\n",
228 |     "    new_col_data = raw_data.drop([0,1])\n",
229 |     "    \n",
230 |     "    index_list = list(new_col_data['구분']['구분'])\n",
231 |     "\n",
232 |     "    new_index = []\n",
233 |     "\n",
234 |     "    for num, raw_index in enumerate(index_list):\n",
235 |     "        temp = str(raw_index).split('.')\n",
236 |     "        if int(temp[0]) > 12 :\n",
237 |     "            if len(temp[0]) == 2:\n",
238 |     "                new_index.append('19' + temp[0] + '.' + temp[1])\n",
239 |     "            else:\n",
240 |     "                new_index.append(temp[0] + '.' + temp[1])\n",
241 |     "        else:\n",
242 |     "            new_index.append(new_index[num-1].split('.')[0] + '.' + temp[0])\n",
243 |     "\n",
244 |     "    new_col_data.set_index(pd.to_datetime(new_index), inplace=True)\n",
245 |     "    cleaned_data  = new_col_data.drop(('구분', '구분'), axis=1)\n",
246 |     "    return cleaned_data"
247 |    ]
248 |   },
249 |   {
250 |    "cell_type": "code",
251 |    "execution_count": null,
252 |    "metadata": {},
253 |    "outputs": [],
254 |    "source": [
255 |     "# [Example 2.12] Using the preprocessing function\n",
256 |     "\n",
257 |     "# [Example 2.13] Import matplotlib and set a Korean font\n",
258 |     "import matplotlib.pyplot as plt\n",
259 |     "from matplotlib import font_manager, rc\n",
260 |     "%matplotlib inline\n",
261 |     "\n",
262 |     "font_name = font_manager.FontProperties(fname=\"c:/Windows/Fonts/malgun.ttf\").get_name()\n",
263 |     "rc('font', family=font_name)\n",
264 |     "# On macOS, skip the two lines above and use the line below instead\n",
265 |     "# rc('font', family='AppleGothic')\n",
266 |     "plt.rcParams['axes.unicode_minus'] = False\n",
267 |     "\n",
268 |     "\n",
269 |     "# [Example 2.14] Plot the composite sale price index\n",
270 |     "path = r'<directory of the file you downloaded>\\<KB Excel file name>.xls'\n",
271 |     "data_type = '매매종합'\n",
272 |     "new_data = KBpriceindex_preprocessing(path, data_type)\n",
273 |     "new_data['전국']['전국'].plot(legend=True)\n",
274 |     "plt.show()"
275 |    ]
276 |   },
277 |   {
278 |    "cell_type": "code",
279 |    "execution_count": null,
280 |    "metadata": {},
281 |    "outputs": [],
282 |    "source": [
283 |     "# [Example 2.15] Plot data for a specific region over a chosen time range\n",
284 |     "\n",
285 |     "new_data['전국']['전국']['2008-01':].plot(legend=True)\n",
286 |     "plt.show()"
287 |    ]
288 |   },
289 |   {
290 |    "cell_type": "code",
291 |    "execution_count": null,
292 |    "metadata": {},
293 |    "outputs": [],
294 |    "source": [
295 |     "# [Example 2.16] Plot Seoul (서울) and Daegu (대구) with subplot\n",
296 |     "\n",
297 |     "plt.figure(figsize=(10, 5))\n",
298 |     "\n",
299 |     "plt.subplot(1, 2, 1)\n",
300 |     "plt.title('서울')\n",
301 |     "plt.plot(new_data['서울']['서울']['2008-01':])\n",
302 |     "\n",
303 |     "plt.subplot(1, 2, 2)\n",
304 |     "plt.title('대구')\n",
305 |     "plt.plot(new_data['대구']['대구']['2008-01':])\n",
306 |     "\n",
307 |     "plt.show()"
308 |    ]
309 |   },
310 |   {
311 |    "cell_type": "code",
312 |    "execution_count": null,
313 |    "metadata": {},
314 |    "outputs": [],
315 |    "source": [
316 |     "# [Example 2.17] Draw multiple subplots with a for loop\n",
317 |     "\n",
318 |     "spots = '전국 서울 대구 부산'\n",
319 |     "start_date = '2008-1'\n",
320 |     "spot_list = spots.split(' ')\n",
321 |     "num_row = int((len(spot_list)-1)/2)+1\n",
322 |     "\n",
323 |     "plt.figure(figsize=(10, num_row*5))\n",
324 |     "for i, spot in enumerate(spot_list):\n",
325 |     "    plt.subplot(num_row, 2, i+1)\n",
326 |     "    plt.title(spot)\n",
327 |     "    plt.plot(new_data[spot][spot][start_date:])\n",
328 |     "    \n",
329 |     "plt.show()"
330 |    ]
331 |   },
332 |   {
333 |    "cell_type": "code",
334 |    "execution_count": null,
335 |    "metadata": {},
336 |    "outputs": [],
337 |    "source": [
338 |     "# [Example 2.18] Plot price indices in subplots, including districts (gu) within a province/city\n",
339 |     "\n",
340 |     "spots = '서울 서울,마포구 서울,강남구 부산 경기'\n",
341 |     "start_date = '2008-1'\n",
342 |     "spot_list = spots.split(' ')\n",
343 |     "num_row = int((len(spot_list)-1)/2)+1\n",
344 |     "\n",
345 |     "plt.figure(figsize=(10, num_row*5))\n",
346 |     "for i, spot in enumerate(spot_list):\n",
347 |     "    plt.subplot(num_row, 2, i+1)\n",
348 |     "    plt.title(spot)\n",
349 |     "    if ',' in spot:\n",
350 |     "        si, gu = spot.split(',')\n",
351 |     "    else:\n",
352 |     "        si = gu = spot\n",
353 |     "    plt.plot(new_data[si][gu][start_date:])\n",
354 |     "    \n",
355 |     "plt.show()"
356 |    ]
357 |   },
358 |   {
359 |    "cell_type": "code",
360 |    "execution_count": null,
361 |    "metadata": {},
362 |    "outputs": [],
363 |    "source": [
364 |     "# [Example 2.19] Get the price index of every region on a specific date\n",
365 |     "\n",
366 |     "new_data.loc['2018-1-1']"
367 |    ]
368 |   },
369 |   {
370 |    "cell_type": "code",
371 |    "execution_count": null,
372 |    "metadata": {},
373 |    "outputs": [],
374 |    "source": [
375 |     "# [Example 2.20] Compute the percentage change in the price index between two dates\n",
376 |     "\n",
377 |     "(new_data.loc['2018-1-1'] - new_data.loc['2016-1-1']) / new_data.loc['2016-1-1'] * 100"
378 |    ]
379 |   },
380 |   {
381 |    "cell_type": "code",
382 |    "execution_count": null,
383 |    "metadata": {},
384 |    "outputs": [],
385 |    "source": [
386 |     "# [Example 2.21] Sort the percentage changes\n",
387 |     "\n",
388 |     "diff = (new_data.loc['2018-1-1'] - new_data.loc['2016-1-1']) / new_data.loc['2016-1-1'] * 100\n",
389 |     "diff.sort_values()"
390 |    ]
391 |   },
392 |   {
393 |    "cell_type": "code",
394 |    "execution_count": null,
395 |    "metadata": {},
396 |    "outputs": [],
397 |    "source": [
398 |     "# [Example 2.22] Drop regions with missing values and print only the top and bottom 10\n",
399 |     "\n",
400 |     "diff = ((new_data.loc['2018-1-1'] - new_data.loc['2016-1-1']) / new_data.loc['2016-1-1'] * 100).dropna()\n",
401 |     "print(\"Bottom 10\")\n",
402 |     "print(diff.sort_values()[:10])\n",
403 |     "print(' ')\n",
404 |     "print(\"Top 10\")\n",
405 |     "print(diff.sort_values(ascending=False)[:10])"
406 |    ]
407 |   },
408 |   {
409 |    "cell_type": "code",
410 |    "execution_count": null,
411 |    "metadata": {},
412 |    "outputs": [],
413 |    "source": [
414 |     "# [Example 2.23] Visualize the percentage changes as a bar chart\n",
415 |     "\n",
416 |     "import numpy as np\n",
417 |     "from matplotlib import style\n",
418 |     "style.use('ggplot')\n",
419 |     "\n",
420 |     "fig = plt.figure(figsize=(13, 7))\n",
421 |     "ind = np.arange(20)\n",
422 |     "\n",
423 |     "ax = fig.add_subplot(1, 3, 1)\n",
424 |     "plt.title('Bottom 20 by price change rate, 2016.1~2018.1')\n",
425 |     "rects = plt.barh(ind, diff.sort_values()[:20].values,  align='center', height=0.5)\n",
426 |     "plt.yticks(ind, diff.sort_values()[:20].index)\n",
427 |     "for i, rect in enumerate(rects):\n",
428 |     "    ax.text(0.95 * rect.get_width(),\n",
429 |     "            rect.get_y() + rect.get_height() / 2.0,\n",
430 |     "            str(round(diff.sort_values()[:20].values[i],2)) + '%',\n",
431 |     "            ha='left', va='center', bbox=dict(boxstyle=\"round\", fc=(0.5, 0.9, 0.7), ec=\"0.1\"))\n",
432 |     "    \n",
433 |     "ax2 = fig.add_subplot(1, 3, 3)\n",
434 |     "plt.title('Top 20 by price change rate, 2016.1~2018.1')\n",
435 |     "rects2 = plt.barh(ind, diff.sort_values()[-20:].values,  align='center', height=0.5)\n",
436 |     "plt.yticks(ind,  diff.sort_values()[-20:].index)\n",
437 |     "for i, rect in enumerate(rects2):\n",
438 |     "    ax2.text(0.95 * rect.get_width(),\n",
439 |     "             rect.get_y() + rect.get_height() / 2.0,\n",
440 |     "             str(round(diff.sort_values()[-20:].values[i],2)) + '%', \n",
441 |     "             ha='right', va='center', bbox=dict(boxstyle=\"round\", fc=(0.5, 0.9, 0.7), ec=\"0.1\"))\n",
442 |     "\n",
443 |     "plt.show()"
444 |    ]
445 |   },
446 |   {
447 |    "cell_type": "code",
448 |    "execution_count": null,
449 |    "metadata": {},
450 |    "outputs": [],
451 |    "source": [
452 |     "# [Example 2.24] Visualize the percentage changes as a bar chart for selected regions only\n",
453 |     "\n",
454 |     "loca =  '전국 서울 부산 경기 대구 광주 울산 대전'\n",
455 |     "\n",
456 |     "temp_list = loca.split(\" \")\n",
457 |     "loca_list = []\n",
458 |     "for temp in temp_list:\n",
459 |     "    if ',' in temp:\n",
460 |     "        temp_split = temp.split(\",\")\n",
461 |     "        loca_list.append((temp_split[0], temp_split[1]))\n",
462 |     "    else:\n",
463 |     "        loca_list.append((temp, temp))\n",
464 |     "\n",
465 |     "diff = ((new_data.loc['2018-1-1', loca_list] - new_data.loc['2016-1-1', loca_list]) / new_data.loc['2016-1-1', loca_list] * 100).sort_values()\n",
466 |     "\n",
467 |     "num = len(loca_list)\n",
468 |     "fig = plt.figure(figsize=(13, 7))\n",
469 |     "ind = np.arange(num)\n",
470 |     "\n",
471 |     "ax = fig.add_subplot(1, 3, 1)\n",
472 |     "plt.title('Price index change rate, 2016.1~2018.1')\n",
473 |     "rects = plt.barh(ind, diff.head(num).values,  align='center', height=0.5)\n",
474 |     "plt.yticks(ind, diff.head(num).index)\n",
475 |     "for i, rect in enumerate(rects):\n",
476 |     "    ax.text(0.95 * rect.get_width(), rect.get_y() + rect.get_height() / 2.0, str(round(diff.head(num).values[i], 2)) + '%',\n",
477 |     "            ha='left', va='center', bbox=dict(boxstyle=\"round\", fc=(0.5, 0.9, 0.7), ec=\"0.1\"))\n",
478 |     "\n",
479 |     "\n",
480 |     "plt.show()"
481 |    ]
482 |   }
483 |  ],
484 |  "metadata": {
485 |   "kernelspec": {
486 |    "display_name": "Python 3",
487 |    "language": "python",
488 |    "name": "python3"
489 |   },
490 |   "language_info": {
491 |    "codemirror_mode": {
492 |     "name": "ipython",
493 |     "version": 3
494 |    },
495 |    "file_extension": ".py",
496 |    "mimetype": "text/x-python",
497 |    "name": "python",
498 |    "nbconvert_exporter": "python",
499 |    "pygments_lexer": "ipython3",
500 |    "version": "3.6.4"
501 |   }
502 |  },
503 |  "nbformat": 4,
504 |  "nbformat_minor": 2
505 | }
506 | 
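
The calculation behind Example 2.24 can be tried without the KB file. The sketch below is only an illustration: the helper name `pct_change_between` and the toy DataFrame are assumptions, and the columns are flat, whereas `new_data` in the notebook uses two-level (region, sub-region) columns.

```python
import pandas as pd


def pct_change_between(df, start, end, columns):
    """Percentage change from `start` to `end` for the given columns, sorted ascending."""
    base = df.loc[pd.Timestamp(start), columns]
    latest = df.loc[pd.Timestamp(end), columns]
    return ((latest - base) / base * 100).sort_values()


# Hypothetical toy index values, not real KB data
toy = pd.DataFrame(
    {"A": [100.0, 110.0], "B": [200.0, 190.0], "C": [50.0, 50.0]},
    index=pd.to_datetime(["2016-01-01", "2018-01-01"]),
)

change = pct_change_between(toy, "2016-01-01", "2018-01-01", ["A", "B", "C"])
print(change)
```

Sorting before plotting is what lets `plt.barh` draw the bars in rank order, as the notebook does with `diff`.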


--------------------------------------------------------------------------------
/2.2  '기준'이 필요해! (개정).ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "code",
  5 |    "execution_count": null,
  6 |    "metadata": {},
  7 |    "outputs": [],
  8 |    "source": [
  9 |     "# [Example 2.25] Basic setup\n",
 10 |     "\n",
 11 |     "import pandas as pd\n",
 12 |     "import xlwings as xw\n",
 13 |     "\n",
 14 |     "import matplotlib.pyplot as plt\n",
 15 |     "from matplotlib import font_manager, rc\n",
 16 |     "%matplotlib inline\n",
 17 |     "\n",
 18 |     "font_name = font_manager.FontProperties(fname=\"c:/Windows/Fonts/malgun.ttf\").get_name()\n",
 19 |     "rc('font', family=font_name)\n",
 20 |     "# On macOS, skip the two lines above and use the line below instead\n",
 21 |     "# rc('font', family='AppleGothic')\n",
 22 |     "plt.rcParams['axes.unicode_minus'] = False"
 23 |    ]
 24 |   },
 25 |   {
 26 |    "cell_type": "code",
 27 |    "execution_count": null,
 28 |    "metadata": {},
 29 |    "outputs": [],
 30 |    "source": [
 31 |     "# [Example 2.26] Load the income data\n",
 32 |     "\n",
 33 |     "#path = r'C:\\Users\\JK\\Desktop\\부동산 데이터\\행정구역_시도_별_1인당_지역내총생산__지역총소득__개인소득_20180821155737.xlsx'\n",
 34 |     "path = r'directory of the downloaded personal-income Excel file\\downloaded personal-income Excel file name.xlsx'\n",
 35 |     "raw = pd.read_excel(path, sheet_name='데이터', index_col=0)\n",
 36 |     "\n",
 37 |     "raw"
 38 |    ]
 39 |   },
 40 |   {
 41 |    "cell_type": "code",
 42 |    "execution_count": null,
 43 |    "metadata": {},
 44 |    "outputs": [],
 45 |    "source": [
 46 |     "# [Example 2.27] Remove the first row\n",
 47 |     "\n",
 48 |     "raw.drop(['행정구역(시도)별'], inplace=True)"
 49 |    ]
 50 |   },
 51 |   {
 52 |    "cell_type": "code",
 53 |    "execution_count": null,
 54 |    "metadata": {},
 55 |    "outputs": [],
 56 |    "source": [
 57 |     "# [Example 2.28] Replace the province/city names in the index with their abbreviations\n",
 58 |     "\n",
 59 |     "index_list = raw.index\n",
 60 |     "new_index = []\n",
 61 |     "for temp in index_list:\n",
 62 |     "    if temp[-1] == '시':\n",
 63 |     "        new_index.append(temp[:2])\n",
 64 |     "    elif temp[-1] == '도':\n",
 65 |     "        if len(temp) == 3:\n",
 66 |     "            new_index.append(temp[:2])\n",
 67 |     "        elif len(temp) == 4:\n",
 68 |     "            new_index.append(temp[0] + temp[2])\n",
 69 |     "        else:\n",
 70 |     "            new_index.append('제주')\n",
 71 |     "    else:\n",
 72 |     "        new_index.append(temp)\n",
 73 |     "        \n",
 74 |     "raw.index = new_index"
 75 |    ]
 76 |   },
 77 |   {
 78 |    "cell_type": "code",
 79 |    "execution_count": null,
 80 |    "metadata": {},
 81 |    "outputs": [],
 82 |    "source": [
 83 |     "# [Example 2.29] Swap the columns and the index (transpose)\n",
 84 |     "\n",
 85 |     "income_data = raw.T\n",
 86 |     "\n",
 87 |     "income_data"
 88 |    ]
 89 |   },
 90 |   {
 91 |    "cell_type": "code",
 92 |    "execution_count": null,
 93 |    "metadata": {},
 94 |    "outputs": [],
 95 |    "source": [
 96 |     "# [Example 2.30] Wrap the income-data preprocessing in a function\n",
 97 |     "\n",
 98 |     "def income_preprocessing(path):\n",
 99 |     "    # path: path to the personal-income Excel file (string)\n",
100 |     "    \n",
101 |     "    raw = pd.read_excel(path)\n",
102 |     "    raw.drop([0], inplace=True)\n",
103 |     "    raw.set_index('행정구역(시도)별', inplace=True)\n",
104 |     "    index_list = raw.index\n",
105 |     "    new_index = []\n",
106 |     "    for temp in index_list:\n",
107 |     "        if temp[-1] == '시':\n",
108 |     "            new_index.append(temp[:2])\n",
109 |     "        elif temp[-1] == '도':\n",
110 |     "            if len(temp) == 3:\n",
111 |     "                new_index.append(temp[:2])\n",
112 |     "            elif len(temp) == 4:\n",
113 |     "                new_index.append(temp[0] + temp[2])\n",
114 |     "            else:\n",
115 |     "                new_index.append('제주')\n",
116 |     "        else:\n",
117 |     "            new_index.append(temp)\n",
118 |     "\n",
119 |     "    raw.index = new_index\n",
120 |     "    income_data = raw.T\n",
121 |     "    return income_data\n",
122 |     "\n",
123 |     "\n",
124 |     "income_data_path = r'directory of the downloaded personal-income Excel file\\downloaded personal-income Excel file name.xlsx'\n",
125 |     "income_data = income_preprocessing(income_data_path)"
126 |    ]
127 |   },
128 |   {
129 |    "cell_type": "code",
130 |    "execution_count": null,
131 |    "metadata": {},
132 |    "outputs": [],
133 |    "source": [
134 |     "# [Example 2.31] Load the KB price-index data\n",
135 |     "\n",
136 |     "def KBpriceindex_preprocessing(path, data_type):\n",
137 |     "    # path: path to the KB Excel file (string)\n",
138 |     "    # data_type: sheet name, one of ‘매매종합’, ‘매매APT’, ‘매매연립’, ‘매매단독’, ‘전세종합’, ‘전세APT’, ‘전세연립’, ‘전세단독’\n",
139 |     "    \n",
140 |     "    wb = xw.Book(path)                \n",
141 |     "    sheet = wb.sheets[data_type]   \n",
142 |     "    row_num = sheet.range(1,1).end('down').end('down').end('down').row  \n",
143 |     "    data_range = 'A2:GE' + str(row_num)\n",
144 |     "    raw_data = sheet[data_range].options(pd.DataFrame, index=False, header=True).value \n",
145 |     "    \n",
146 |     "    bignames = '서울 대구 부산 대전 광주 인천 울산 세종 경기 강원 충북 충남 전북 전남 경북 경남 제주도 6개광역시 5개광역시 수도권 기타지방 구분 전국'\n",
147 |     "    bigname_list = bignames.split(' ')\n",
148 |     "    big_col = list(raw_data.columns)\n",
149 |     "    small_col = list(raw_data.iloc[0])\n",
150 |     "\n",
151 |     "    for num, gu_data in enumerate(small_col):\n",
152 |     "        if gu_data is None:\n",
153 |     "            small_col[num] = big_col[num]\n",
154 |     "\n",
155 |     "        check = num\n",
156 |     "        while True:\n",
157 |     "            if big_col[check] in bigname_list:\n",
158 |     "                big_col[num] = big_col[check]\n",
159 |     "                break\n",
160 |     "            else:\n",
161 |     "                check = check - 1\n",
162 |     "                \n",
163 |     "    big_col[129] = '경기' \n",
164 |     "    big_col[130] = '경기'\n",
165 |     "    small_col[185] = '서귀포'\n",
166 |     "    \n",
167 |     "    raw_data.columns = [big_col, small_col]\n",
168 |     "    new_col_data = raw_data.drop([0,1])\n",
169 |     "    \n",
170 |     "    index_list = list(new_col_data['구분']['구분'])\n",
171 |     "\n",
172 |     "    new_index = []\n",
173 |     "\n",
174 |     "    for num, raw_index in enumerate(index_list):\n",
175 |     "        temp = str(raw_index).split('.')\n",
176 |     "        if int(temp[0]) > 12 :\n",
177 |     "            if len(temp[0]) == 2:\n",
178 |     "                new_index.append('19' + temp[0] + '.' + temp[1])\n",
179 |     "            else:\n",
180 |     "                new_index.append(temp[0] + '.' + temp[1])\n",
181 |     "        else:\n",
182 |     "            new_index.append(new_index[num-1].split('.')[0] + '.' + temp[0])\n",
183 |     "\n",
184 |     "    new_col_data.set_index(pd.to_datetime(new_index), inplace=True)\n",
185 |     "    cleaned_data  = new_col_data.drop(('구분', '구분'), axis=1)\n",
186 |     "    return cleaned_data\n",
187 |     "\n",
188 |     "\n",
189 |     "\n",
190 |     "path = r'directory of the KB Excel file you downloaded\\KB Excel file name.xls'\n",
191 |     "data_type = '매매종합'\n",
192 |     "price_data = KBpriceindex_preprocessing(path, data_type)"
193 |    ]
194 |   },
195 |   {
196 |    "cell_type": "code",
197 |    "execution_count": null,
198 |    "metadata": {},
199 |    "outputs": [],
200 |    "source": [
201 |     "# [Example 2.32] Plot price-index change against personal-income change for selected regions as subplots\n",
202 |     "\n",
203 |     "location_list = ['전국', '서울', '부산', '대구', '대전' ,'광주', '경기']\n",
204 |     "start_year = '2004'\n",
205 |     "end_year = '2016'\n",
206 |     "\n",
207 |     "num_row = int((len(location_list)-1)/2)+1\n",
208 |     "plt.figure(figsize=(12, num_row*5))\n",
209 |     "for j, location in enumerate(location_list):\n",
210 |     "    year_data = []\n",
211 |     "    for i in range(int(start_year), int(end_year) + 1):\n",
212 |     "        if location == '제주도':\n",
213 |     "            year_data.append(price_data[location]['서귀포'][str(i)+'.12.1'])\n",
214 |     "        else:\n",
215 |     "            year_data.append(price_data[location][location][str(i)+'.12.1'])\n",
216 |     "\n",
217 |     "    temp_df = pd.DataFrame(income_data[location][start_year:end_year])\n",
218 |     "    temp_df.columns = [location + '소득']\n",
219 |     "    temp_df[location + '부동산 가격지수'] = year_data\n",
220 |     "    temp_df['소득 변화율'] = (temp_df[location + '소득']/temp_df[location + '소득'].iloc[0] - 1 )*100\n",
221 |     "    temp_df['부동산 가격 지수 변화율'] = (temp_df[location + '부동산 가격지수']/temp_df[location + '부동산 가격지수'].iloc[0] - 1 )*100\n",
222 |     "\n",
223 |     "    plt.subplot(num_row, 2, j+1)\n",
224 |     "    plt.title(location + ', ' + start_year + ' ~ ' + end_year + '까지')\n",
225 |     "    plt.plot(temp_df['부동산 가격 지수 변화율'], label=location + ' 부동산 가격 지수 변화율')\n",
226 |     "    plt.plot(temp_df['소득 변화율'], label=location + ' 소득 변화율')\n",
227 |     "    plt.legend()"
228 |    ]
229 |   },
230 |   {
231 |    "cell_type": "code",
232 |    "execution_count": null,
233 |    "metadata": {},
234 |    "outputs": [],
235 |    "source": [
236 |     "# [Example 2.33] Load the PIR data (ch02/ 2.2 ‘기준’이 필요해!.ipynb)\n",
237 |     "\n",
238 |     "path = r'directory of the KB Excel file you downloaded\\KB Excel file name.xls'\n",
239 |     "wb = xw.Book(path)                \n",
240 |     "sheet = wb.sheets['PIR(월별)']   \n",
241 |     "row_num = sheet.range('J2').end('down').row  \n",
242 |     "data_range = 'B2:N' + str(row_num)\n",
243 |     "pir_rawdata = sheet[data_range].options(pd.DataFrame, index=False, header=True).value "
244 |    ]
245 |   },
246 |   {
247 |    "cell_type": "code",
248 |    "execution_count": null,
249 |    "metadata": {},
250 |    "outputs": [],
251 |    "source": [
252 |     "# [Example 2.34] Build the upper-level columns that identify the region\n",
253 |     "\n",
254 |     "big_col = list(pir_rawdata.columns)\n",
255 |     "big_col[0] = 'index1'\n",
256 |     "big_col[1] = 'index2'\n",
257 |     "big_col[2] = 'index3'\n",
258 |     "\n",
259 |     "for num, col in enumerate(big_col):\n",
260 |     "    if col is None:\n",
261 |     "        big_col[num] = big_col[num - 1]\n",
262 |     "    else:\n",
263 |     "        pass"
264 |    ]
265 |   },
266 |   {
267 |    "cell_type": "code",
268 |    "execution_count": null,
269 |    "metadata": {
270 |     "scrolled": false
271 |    },
272 |    "outputs": [],
273 |    "source": [
274 |     "# [Example 2.35] Build the lower-level columns that identify the income quintile\n",
275 |     "\n",
276 |     "small_col = list(pir_rawdata.loc[1])\n",
277 |     "small_col[0] = 'index1'\n",
278 |     "small_col[1] = 'index2'\n",
279 |     "small_col[2] = 'index3'"
280 |    ]
281 |   },
282 |   {
283 |    "cell_type": "code",
284 |    "execution_count": null,
285 |    "metadata": {
286 |     "scrolled": true
287 |    },
288 |    "outputs": [],
289 |    "source": [
290 |     "# [Example 2.36] Set the two-level columns\n",
291 |     "\n",
292 |     "pir_rawdata.columns = [big_col, small_col]\n",
293 |     "pir_rawdata.drop([0,1], inplace=True)"
294 |    ]
295 |   },
296 |   {
297 |    "cell_type": "code",
298 |    "execution_count": null,
299 |    "metadata": {},
300 |    "outputs": [],
301 |    "source": [
302 |     "# [Example 2.37] Build the upper-level index\n",
303 |     "\n",
304 |     "big_index = list(pir_rawdata['index1']['index1'])\n",
305 |     "for num, index in enumerate(big_index):\n",
306 |     "    if index is not None:\n",
307 |     "        if isinstance(index, str):\n",
308 |     "            big_index[num] = '20' + index.split(\".\")[0][1:] + '.' + index.split(\".\")[1][:2]\n",
309 |     "        else:\n",
310 |     "            big_index[num] = big_index[num - 1].split(\".\")[0] + \".\" +  str(int(index))\n",
311 |     "    else:\n",
312 |     "        big_index[num] = big_index[num - 1]"
313 |    ]
314 |   },
315 |   {
316 |    "cell_type": "code",
317 |    "execution_count": null,
318 |    "metadata": {},
319 |    "outputs": [],
320 |    "source": [
321 |     "# [Example 2.38] Build the lower-level index and set the two-level index\n",
322 |     "\n",
323 |     "small_index = list(pir_rawdata['index3']['index3'])\n",
324 |     "pir_rawdata.index = [pd.to_datetime(big_index), small_index]\n",
325 |     "\n",
326 |     "del pir_rawdata['index1']\n",
327 |     "del pir_rawdata['index2']\n",
328 |     "del pir_rawdata['index3']\n",
329 |     "\n",
330 |     "pir_rawdata.index.names = ['날짜', '평균주택가격']"
331 |    ]
332 |   },
333 |   {
334 |    "cell_type": "code",
335 |    "execution_count": null,
336 |    "metadata": {},
337 |    "outputs": [],
338 |    "source": [
339 |     "pir_rawdata.xs('3분위', level='평균주택가격')"
340 |    ]
341 |   },
342 |   {
343 |    "cell_type": "code",
344 |    "execution_count": null,
345 |    "metadata": {},
346 |    "outputs": [],
347 |    "source": [
348 |     "# [Example 2.39] Plot the PIR time series of third-quintile house prices for each income quintile in Seoul as subplots\n",
349 |     "\n",
350 |     "gagus = ['1분위', '2분위', '3분위', '4분위', '5분위']\n",
351 |     "location = '서울 Seoul'\n",
352 |     "num_row = int((len(gagus)-1)/2)+1\n",
353 |     "\n",
354 |     "plt.figure(figsize=(10, num_row*5))\n",
355 |     "for i, gagu in enumerate(gagus):\n",
356 |     "    plt.subplot(num_row, 2, i+1)\n",
357 |     "    plt.title(gagu + \" 가구의 중간값(3분위) 주택가격 PIR\")\n",
358 |     "    plt.plot(pir_rawdata.xs('3분위', level='평균주택가격')[location][gagu])\n",
359 |     "    \n",
360 |     "plt.show()"
361 |    ]
362 |   },
363 |   {
364 |    "cell_type": "code",
365 |    "execution_count": null,
366 |    "metadata": {},
367 |    "outputs": [],
368 |    "source": [
369 |     "pir_rawdata"
370 |    ]
371 |   },
372 |   {
373 |    "cell_type": "code",
374 |    "execution_count": null,
375 |    "metadata": {},
376 |    "outputs": [],
377 |    "source": [
378 |     "# [Example 2.40] Add the mean value to the PIR graphs\n",
379 |     "\n",
380 |     "gagus = ['1분위', '2분위', '3분위', '4분위', '5분위']\n",
381 |     "location = '서울 Seoul'\n",
382 |     "house_price_level = '3분위'\n",
383 |     "num_row = int((len(gagus)-1)/2)+1\n",
384 |     "\n",
385 |     "plt.figure(figsize=(10, num_row*5))\n",
386 |     "for i, gagu in enumerate(gagus):\n",
387 |     "    plt.subplot(num_row, 2, i+1)\n",
388 |     "    plt.title(gagu + \" 가구의 중간값(\" + house_price_level + \") 주택가격 PIR\")\n",
389 |     "    series = pir_rawdata.xs(house_price_level, level='평균주택가격')[location][gagu]\n",
390 |     "    plt.plot(series)\n",
391 |     "    long_mean = series.mean()\n",
392 |     "    plt.plot(series.index, [long_mean] * len(series))\n",
393 |     "    \n",
394 |     "plt.show()"
395 |    ]
396 |   }
397 |  ],
398 |  "metadata": {
399 |   "kernelspec": {
400 |    "display_name": "Python 3",
401 |    "language": "python",
402 |    "name": "python3"
403 |   },
404 |   "language_info": {
405 |    "codemirror_mode": {
406 |     "name": "ipython",
407 |     "version": 3
408 |    },
409 |    "file_extension": ".py",
410 |    "mimetype": "text/x-python",
411 |    "name": "python",
412 |    "nbconvert_exporter": "python",
413 |    "pygments_lexer": "ipython3",
414 |    "version": "3.6.4"
415 |   }
416 |  },
417 |  "nbformat": 4,
418 |  "nbformat_minor": 2
419 | }
420 | 
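
The abbreviation rule used in Examples 2.28 and 2.30 can be checked in isolation. The sketch below restates the same branches as a standalone function; the name `abbreviate_sido` is hypothetical and does not appear in the notebook.

```python
def abbreviate_sido(name):
    """Shorten a province/city name using the branches of Example 2.28."""
    if name.endswith("시"):
        # Cities: keep the first two characters (서울특별시 -> 서울)
        return name[:2]
    if name.endswith("도"):
        if len(name) == 3:
            # Three-character provinces (경기도 -> 경기)
            return name[:2]
        if len(name) == 4:
            # Four-character provinces: first and third characters (충청북도 -> 충북)
            return name[0] + name[2]
        # Any longer province name is assumed to be Jeju, as in the notebook
        return "제주"
    # Anything else (for example 전국) is left unchanged
    return name


names = ["서울특별시", "경기도", "충청북도", "제주특별자치도", "전국"]
abbreviated = [abbreviate_sido(n) for n in names]
print(abbreviated)
```

Note that the final province branch maps every province name longer than four characters to '제주', so the rule only holds while Jeju is the sole such name in the input file.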


--------------------------------------------------------------------------------
/3.1 부동산 수요 (개정).ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "code",
  5 |    "execution_count": null,
  6 |    "metadata": {},
  7 |    "outputs": [],
  8 |    "source": [
  9 |     "# [Example 3.1] Preprocess the apartment sale and jeonse index data\n",
 10 |     "import xlwings as xw\n",
 11 |     "import pandas as pd\n",
 12 |     "\n",
 13 |     "def KBpriceindex_preprocessing(path, data_type):\n",
 14 |     "    # path: path to the KB Excel file (string)\n",
 15 |     "    # data_type: sheet name, one of ‘매매종합’, ‘매매APT’, ‘매매연립’, ‘매매단독’, ‘전세종합’, ‘전세APT’, ‘전세연립’, ‘전세단독’\n",
 16 |     "    \n",
 17 |     "    wb = xw.Book(path)                \n",
 18 |     "    sheet = wb.sheets[data_type]   \n",
 19 |     "    row_num = sheet.range(1,1).end('down').end('down').end('down').row  \n",
 20 |     "    data_range = 'A2:GE' + str(row_num)\n",
 21 |     "    raw_data = sheet[data_range].options(pd.DataFrame, index=False, header=True).value \n",
 22 |     "    \n",
 23 |     "    bignames = '서울 대구 부산 대전 광주 인천 울산 세종 경기 강원 충북 충남 전북 전남 경북 경남 제주도 6개광역시 5개광역시 수도권 기타지방 구분 전국'\n",
 24 |     "    bigname_list = bignames.split(' ')\n",
 25 |     "    big_col = list(raw_data.columns)\n",
 26 |     "    small_col = list(raw_data.iloc[0])\n",
 27 |     "\n",
 28 |     "    for num, gu_data in enumerate(small_col):\n",
 29 |     "        if gu_data is None:\n",
 30 |     "            small_col[num] = big_col[num]\n",
 31 |     "\n",
 32 |     "        check = num\n",
 33 |     "        while True:\n",
 34 |     "            if big_col[check] in bigname_list:\n",
 35 |     "                big_col[num] = big_col[check]\n",
 36 |     "                break\n",
 37 |     "            else:\n",
 38 |     "                check = check - 1\n",
 39 |     "                \n",
 40 |     "    big_col[129] = '경기' \n",
 41 |     "    big_col[130] = '경기'\n",
 42 |     "    small_col[185] = '서귀포'\n",
 43 |     "    \n",
 44 |     "    raw_data.columns = [big_col, small_col]\n",
 45 |     "    new_col_data = raw_data.drop([0,1])\n",
 46 |     "    \n",
 47 |     "    index_list = list(new_col_data['구분']['구분'])\n",
 48 |     "\n",
 49 |     "    new_index = []\n",
 50 |     "\n",
 51 |     "    for num, raw_index in enumerate(index_list):\n",
 52 |     "        temp = str(raw_index).split('.')\n",
 53 |     "        if int(temp[0]) > 12 :\n",
 54 |     "            if len(temp[0]) == 2:\n",
 55 |     "                new_index.append('19' + temp[0] + '.' + temp[1])\n",
 56 |     "            else:\n",
 57 |     "                new_index.append(temp[0] + '.' + temp[1])\n",
 58 |     "        else:\n",
 59 |     "            new_index.append(new_index[num-1].split('.')[0] + '.' + temp[0])\n",
 60 |     "\n",
 61 |     "    new_col_data.set_index(pd.to_datetime(new_index), inplace=True)\n",
 62 |     "    cleaned_data  = new_col_data.drop(('구분', '구분'), axis=1)\n",
 63 |     "    return cleaned_data\n",
 64 |     "\n",
 65 |     "\n",
 66 |     "path = r'directory of the KB Excel file you downloaded\\KB Excel file name.xls'\n",
 67 |     "price_index = KBpriceindex_preprocessing(path, '매매apt')\n",
 68 |     "jeonse_index = KBpriceindex_preprocessing(path, '전세apt')"
 69 |    ]
 70 |   },
 71 |   {
 72 |    "cell_type": "code",
 73 |    "execution_count": null,
 74 |    "metadata": {},
 75 |    "outputs": [],
 76 |    "source": [
 77 |     "# [Example 3.2] The KBpriceindex_preprocessing function\n",
 78 |     "\n",
 79 |     "import xlwings as xw\n",
 80 |     "\n",
 81 |     "def KBpriceindex_preprocessing(path, data_type):\n",
 82 |     "    # path: path to the KB Excel file (string)\n",
 83 |     "    # data_type: sheet name, one of ‘매매종합’, ‘매매APT’, ‘매매연립’, ‘매매단독’, ‘전세종합’, ‘전세APT’, ‘전세연립’, ‘전세단독’\n",
 84 |     "    \n",
 85 |     "    wb = xw.Book(path)                \n",
 86 |     "    sheet = wb.sheets[data_type]   \n",
 87 |     "    row_num = sheet.range(1,1).end('down').end('down').end('down').row  \n",
 88 |     "    data_range = 'A2:GE' + str(row_num)\n",
 89 |     "    raw_data = sheet[data_range].options(pd.DataFrame, index=False, header=True).value \n",
 90 |     "    \n",
 91 |     "    bignames = '서울 대구 부산 대전 광주 인천 울산 세종 경기 강원 충북 충남 전북 전남 경북 경남 제주도 6개광역시 5개광역시 수도권 기타지방 구분 전국'\n",
 92 |     "    bigname_list = bignames.split(' ')\n",
 93 |     "    big_col = list(raw_data.columns)\n",
 94 |     "    small_col = list(raw_data.iloc[0])\n",
 95 |     "\n",
 96 |     "    for num, gu_data in enumerate(small_col):\n",
 97 |     "        if gu_data is None:\n",
 98 |     "            small_col[num] = big_col[num]\n",
 99 |     "\n",
100 |     "        check = num\n",
101 |     "        while True:\n",
102 |     "            if big_col[check] in bigname_list:\n",
103 |     "                big_col[num] = big_col[check]\n",
104 |     "                break\n",
105 |     "            else:\n",
106 |     "                check = check - 1\n",
107 |     "                \n",
108 |     "    big_col[129] = '경기' \n",
109 |     "    big_col[130] = '경기'\n",
110 |     "    small_col[185] = '서귀포'\n",
111 |     "    \n",
112 |     "    raw_data.columns = [big_col, small_col]\n",
113 |     "    new_col_data = raw_data.drop([0,1])\n",
114 |     "    \n",
115 |     "    index_list = list(new_col_data['구분']['구분'])\n",
116 |     "\n",
117 |     "    new_index = []\n",
118 |     "\n",
119 |     "    for num, raw_index in enumerate(index_list):\n",
120 |     "        temp = str(raw_index).split('.')\n",
121 |     "        if int(temp[0]) > 20 :\n",
122 |     "            if len(temp[0]) == 2:\n",
123 |     "                new_index.append('19' + temp[0] + '.' + temp[1])\n",
124 |     "            else:\n",
125 |     "                new_index.append(temp[0] + '.' + temp[1])\n",
126 |     "        else:\n",
127 |     "            new_index.append(new_index[num-1].split('.')[0] + '.' + temp[0])\n",
128 |     "\n",
129 |     "    new_col_data.set_index(pd.to_datetime(new_index), inplace=True)\n",
130 |     "    cleaned_data  = new_col_data.drop(('구분', '구분'), axis=1)\n",
131 |     "    return cleaned_data"
132 |    ]
133 |   },
134 |   {
135 |    "cell_type": "code",
136 |    "execution_count": null,
137 |    "metadata": {},
138 |    "outputs": [],
139 |    "source": [
140 |     "# [Example 3.3] Load the apartment sale and jeonse price indices\n",
141 |     "\n",
142 |     "import pandas as pd\n",
143 |     "\n",
144 |     "path = r'directory of the KB Excel file you downloaded\\KB Excel file name.xls'\n",
145 |     "price_index = KBpriceindex_preprocessing(path, '매매apt')\n",
146 |     "jeonse_index = KBpriceindex_preprocessing(path, '전세apt')"
147 |    ]
148 |   },
149 |   {
150 |    "cell_type": "code",
151 |    "execution_count": null,
152 |    "metadata": {},
153 |    "outputs": [],
154 |    "source": [
155 |     "# [Example 3.4] Select rows by date from the sale-index DataFrame\n",
156 |     "\n",
157 |     "from datetime import datetime\n",
158 |     "from dateutil.relativedelta import relativedelta\n",
159 |     "\n",
160 |     "index_date = datetime(2010, 1, 1)\n",
161 |     "time_range = 12\n",
162 |     "prev_date = index_date - relativedelta(months=time_range)\n",
163 |     "\n",
164 |     "print(index_date)\n",
165 |     "print(prev_date)\n",
166 |     "\n",
167 |     "price_index.loc[index_date]"
168 |    ]
169 |   },
170 |   {
171 |    "cell_type": "code",
172 |    "execution_count": null,
173 |    "metadata": {
174 |     "scrolled": true
175 |    },
176 |    "outputs": [],
177 |    "source": [
178 |     "# [Example 3.5] Compute the one-year change in the sale index\n",
179 |     "\n",
180 |     "(price_index.loc[index_date] - price_index.loc[prev_date])/price_index.loc[prev_date]"
181 |    ]
182 |   },
183 |   {
184 |    "cell_type": "code",
185 |    "execution_count": null,
186 |    "metadata": {},
187 |    "outputs": [],
188 |    "source": [
189 |     "# [Example 3.6] Compute the sale and jeonse index change rates and store them in a DataFrame (ch03/ 3.1 부동산 수요.ipynb)\n",
190 |     "\n",
191 |     "demand_df = pd.DataFrame()\n",
192 |     "demand_df['매매증감률'] = (price_index.loc[index_date] - price_index.loc[prev_date])/ price_index.loc[prev_date]\n",
193 |     "demand_df['전세증감률'] = (jeonse_index.loc[index_date] - jeonse_index.loc[prev_date])/jeonse_index.loc[prev_date]"
194 |    ]
195 |   },
196 |   {
197 |    "cell_type": "code",
198 |    "execution_count": null,
199 |    "metadata": {
200 |     "scrolled": true
201 |    },
202 |    "outputs": [],
203 |    "source": [
204 |     "# [Example 3.7] Get the data for the three years before the reference date\n",
205 |     "\n",
206 |     "prev_date2 = index_date - relativedelta(months=time_range*3)\n",
207 |     "price_index[prev_date2:index_date][:-1]"
208 |    ]
209 |   },
210 |   {
211 |    "cell_type": "code",
212 |    "execution_count": null,
213 |    "metadata": {},
214 |    "outputs": [],
215 |    "source": [
216 |     "# [Example 3.8] Store the previous maximum and the reference-date change rate relative to that maximum in demand_df\n",
217 |     "\n",
218 |     "demand_df['이전최대값'] = price_index[prev_date2:index_date][:-1].max()\n",
219 |     "demand_df['최댓값대비증감률'] = (price_index.loc[index_date] - demand_df['이전최대값']) /demand_df['이전최대값']"
220 |    ]
221 |   },
222 |   {
223 |    "cell_type": "code",
224 |    "execution_count": null,
225 |    "metadata": {},
226 |    "outputs": [],
227 |    "source": [
228 |     "# [Example 3.9] Flag whether the sale and jeonse indices rose and store the flags in demand_df\n",
229 |     "\n",
230 |     "demand_df['매매가상승'] = demand_df['매매증감률'] > 0.01\n",
231 |     "demand_df['전세가상승'] = demand_df['전세증감률'] > 0.01"
232 |    ]
233 |   },
234 |   {
235 |    "cell_type": "code",
236 |    "execution_count": null,
237 |    "metadata": {},
238 |    "outputs": [],
239 |    "source": [
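240 |     "# [Example 3.10] Flag whether the jeonse index rose faster than the sale index, and whether the\n",
241 |     "# sale index on the reference date exceeds its maximum over the past three years; store both in demand_df\n",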
242 |     "\n",
243 |     "demand_df['더빠른전세상승'] = demand_df['전세증감률'] > demand_df['매매증감률']\n",
244 |     "demand_df['최댓값대비상승'] = demand_df['최댓값대비증감률'] > 0"
245 |    ]
246 |   },
247 |   {
248 |    "cell_type": "code",
249 |    "execution_count": null,
250 |    "metadata": {},
251 |    "outputs": [],
252 |    "source": [
253 |     "# [Example 3.11] Sum the demand conditions for each region (ch03/ 3.1 부동산 수요.ipynb)\n",
254 |     "\n",
255 |     "demand_df['수요총합'] = demand_df[['매매가상승','전세가상승','더빠른전세상승','최댓값대비상승']].sum(axis=1)"
256 |    ]
257 |   },
258 |   {
259 |    "cell_type": "code",
260 |    "execution_count": null,
261 |    "metadata": {},
262 |    "outputs": [],
263 |    "source": [
264 |     "# [Example 3.12] Keep only the regions whose demand score (수요총합) is 4\n",
265 |     "\n",
266 |     "demand_df = demand_df[demand_df['수요총합'] == 4]"
267 |    ]
268 |   },
269 |   {
270 |    "cell_type": "code",
271 |    "execution_count": null,
272 |    "metadata": {},
273 |    "outputs": [],
274 |    "source": [
275 |     "# [Example 3.13] Select only the regions of interest from demand_df\n",
276 |     "\n",
277 |     "demand_df.loc[[('서울','동대문구'), ('부산', '중구')]]"
278 |    ]
279 |   },
280 |   {
281 |    "cell_type": "code",
282 |    "execution_count": null,
283 |    "metadata": {},
284 |    "outputs": [],
285 |    "source": [
286 |     "# [Example 3.14] Keep only the city/county-level index entries (drop rows whose two levels are identical)\n",
287 |     "\n",
288 |     "selected_index = []\n",
289 |     "\n",
290 |     "for name in demand_df.index:\n",
291 |     "    if name[0] != name[1]:\n",
292 |     "        selected_index.append((name[0], name[1]))\n",
293 |     "        \n",
294 |     "demand_df = demand_df.loc[selected_index]"
295 |    ]
296 |   },
297 |   {
298 |    "cell_type": "code",
299 |    "execution_count": null,
300 |    "metadata": {},
301 |    "outputs": [],
302 |    "source": [
303 |     "# [Example 3.15] Consolidate the code into a function\n",
304 |     "\n",
305 |     "def demand(price_index, jeonse_index, index_date, time_range):\n",
306 |     "\n",
307 |     "    prev_date = index_date - relativedelta(months=time_range)\n",
308 |     "    prev_date2 = index_date - relativedelta(months=time_range*3)\n",
309 |     "\n",
310 |     "    demand_df = pd.DataFrame()\n",
311 |     "    demand_df['매매증감률'] = (price_index.loc[index_date] - price_index.loc[prev_date])/ price_index.loc[prev_date].replace(0, float('nan'))\n",
312 |     "    demand_df['전세증감률'] = (jeonse_index.loc[index_date] - jeonse_index.loc[prev_date])/jeonse_index.loc[prev_date].replace(0, float('nan'))\n",
313 |     "    demand_df['이전최대값'] = price_index[prev_date2:index_date][:-1].max()\n",
314 |     "    demand_df['최댓값대비증감률'] = (price_index.loc[index_date] - demand_df['이전최대값']) /demand_df['이전최대값'].replace(0, float('nan'))\n",
315 |     "\n",
316 |     "    demand_df['매매가상승'] = demand_df['매매증감률'] > 0.01\n",
317 |     "    demand_df['전세가상승'] = demand_df['전세증감률'] > 0.01\n",
318 |     "    demand_df['더빠른전세상승'] = demand_df['전세증감률'] > demand_df['매매증감률']\n",
319 |     "    demand_df['최댓값대비상승'] = demand_df['최댓값대비증감률'] > 0\n",
320 |     "    demand_df['수요총합'] = demand_df[['매매가상승','전세가상승','더빠른전세상승','최댓값대비상승']].sum(axis=1)\n",
321 |     "\n",
322 |     "    demand_df = demand_df[demand_df['수요총합'] == 4]\n",
323 |     "\n",
324 |     "    selected_index = []\n",
325 |     "\n",
326 |     "    for name in demand_df.index:\n",
327 |     "        if name[0] != name[1]:\n",
328 |     "            selected_index.append((name[0], name[1]))\n",
329 |     "\n",
330 |     "    demand_df = demand_df.loc[selected_index]\n",
331 |     "    \n",
332 |     "    return demand_df"
333 |    ]
334 |   },
335 |   {
336 |    "cell_type": "code",
337 |    "execution_count": null,
338 |    "metadata": {},
339 |    "outputs": [],
340 |    "source": [
341 |     "# [Example 3.16] Example usage of the demand function\n",
342 |     "\n",
343 |     "path = r'directory of the KB Excel file you downloaded\\KB Excel file name.xls'\n",
344 |     "price_index = KBpriceindex_preprocessing(path, '매매apt')\n",
345 |     "jeonse_index = KBpriceindex_preprocessing(path, '전세apt')\n",
346 |     "\n",
347 |     "index_date = datetime(2010, 1, 1)\n",
348 |     "time_range = 12\n",
349 |     "\n",
350 |     "demand_ex = demand(price_index, jeonse_index, index_date, time_range)"
351 |    ]
352 |   },
353 |   {
354 |    "cell_type": "code",
355 |    "execution_count": null,
356 |    "metadata": {},
357 |    "outputs": [],
358 |    "source": [
359 |     "# [Example 3.17] Import matplotlib and set a Korean font\n",
360 |     "\n",
361 |     "import matplotlib.pyplot as plt\n",
362 |     "from matplotlib import font_manager, rc\n",
363 |     "from matplotlib import style\n",
364 |     "style.use('ggplot')\n",
365 |     "%matplotlib inline\n",
366 |     "\n",
367 |     "font_name = font_manager.FontProperties(fname=\"c:/Windows/Fonts/malgun.ttf\").get_name()\n",
368 |     "rc('font', family=font_name)\n",
369 |     "# On macOS, skip the two lines above and use the line below instead\n",
370 |     "# rc('font', family='AppleGothic')\n",
371 |     "plt.rcParams['axes.unicode_minus'] = False"
372 |    ]
373 |   },
374 |   {
375 |    "cell_type": "code",
376 |    "execution_count": null,
377 |    "metadata": {},
378 |    "outputs": [],
379 |    "source": [
380 |     "# [Example 3.18] Plot Busan Jung-gu (부산 중구) around a reference date\n",
381 |     "\n",
382 |     "si = '부산'\n",
383 |     "gu = '중구'\n",
384 |     "index_date = datetime(2010, 1, 1)\n",
385 |     "\n",
386 |     "prev_date = index_date - relativedelta(months=12)\n",
387 |     "prev_date2 = index_date - relativedelta(months=36)\n",
388 |     "graph_start = index_date - relativedelta(years=3)\n",
389 |     "graph_end = index_date + relativedelta(years=3)\n",
390 |     "\n",
391 |     "plt.figure(figsize=(10, 5))\n",
392 |     "plt.title(si + ' - ' + gu)\n",
393 |     "plt.plot(price_index[si][gu][graph_start:graph_end], label='매매가')\n",
394 |     "plt.plot(jeonse_index[si][gu][graph_start:graph_end], label='전세가')\n",
395 |     "plt.axvline(x=index_date, color='lightcoral', linestyle='--')\n",
396 |     "plt.axvline(x=prev_date, color='darkseagreen', linestyle='--')\n",
397 |     "plt.axvline(x=prev_date2, color='darkseagreen', linestyle='--')\n",
398 |     "plt.legend()\n",
399 |     "plt.show()"
400 |    ]
401 |   },
402 |   {
403 |    "cell_type": "code",
404 |    "execution_count": null,
405 |    "metadata": {},
406 |    "outputs": [],
407 |    "source": [
408 |     "# [Example 3.19] Get the region names from the demand strategy's result DataFrame\n",
409 |     "\n",
410 |     "for name in demand_ex.index:\n",
411 |     "    print(name)"
412 |    ]
413 |   },
414 |   {
415 |    "cell_type": "code",
416 |    "execution_count": null,
417 |    "metadata": {},
418 |    "outputs": [],
419 |    "source": [
420 |     "# [Example 3.20] Plot graphs for every region selected by the demand function\n",
421 |     "\n",
422 |     "index_date = datetime(2010, 1, 1)\n",
423 |     "\n",
424 |     "time_range = 12\n",
425 |     "prev_date = index_date - relativedelta(months=time_range)\n",
426 |     "prev_date2 = index_date - relativedelta(months=time_range * 3)\n",
427 |     "graph_start = index_date - relativedelta(months=time_range * 3)\n",
428 |     "graph_end = index_date + relativedelta(months=time_range * 3)\n",
429 |     "\n",
430 |     "num_row = int((len(demand_ex.index)-1)/2)+1\n",
431 |     "\n",
432 |     "plt.figure(figsize=(15, num_row*5))\n",
433 |     "for i, spot in enumerate(demand_ex.index):\n",
434 |     "    plt.subplot(num_row, 2, i+1)\n",
435 |     "    plt.title(spot)\n",
436 |     "    si = spot[0]\n",
437 |     "    gu = spot[1]\n",
438 |     "    plt.plot(price_index[si][gu][graph_start:graph_end], label='매매가')\n",
439 |     "    plt.plot(jeonse_index[si][gu][graph_start:graph_end], label='전세가')\n",
440 |     "    plt.axvline(x=index_date, color='lightcoral', linestyle='--')\n",
441 |     "    plt.axvline(x=prev_date, color='darkseagreen', linestyle='--')\n",
442 |     "    plt.axvline(x=prev_date2, color='darkseagreen', linestyle='--')\n",
443 |     "    plt.legend(loc='lower right')\n",
444 |     "    \n",
445 |     "plt.show()"
446 |    ]
447 |   }
448 |  ],
449 |  "metadata": {
450 |   "kernelspec": {
451 |    "display_name": "Python 3",
452 |    "language": "python",
453 |    "name": "python3"
454 |   },
455 |   "language_info": {
456 |    "codemirror_mode": {
457 |     "name": "ipython",
458 |     "version": 3
459 |    },
460 |    "file_extension": ".py",
461 |    "mimetype": "text/x-python",
462 |    "name": "python",
463 |    "nbconvert_exporter": "python",
464 |    "pygments_lexer": "ipython3",
465 |    "version": "3.6.4"
466 |   }
467 |  },
468 |  "nbformat": 4,
469 |  "nbformat_minor": 2
470 | }
471 | 


--------------------------------------------------------------------------------
/3.2 부동산 공급 (개정).ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "code",
  5 |    "execution_count": null,
  6 |    "metadata": {},
  7 |    "outputs": [],
  8 |    "source": [
  9 |     "# [Example 3.21] Read the housing permit Excel data with read_excel\n",
 10 |     "\n",
 11 |     "import pandas as pd\n",
 12 |     "\n",
 13 |     "# permission_path = r'C:\\Users\\JK\\Desktop\\부동산 데이터\\주택건설인허가실적.xlsx'\n",
 14 |     "permission_path = r'내려받은 인허가 엑셀파일의 위치\\ 인허가 데이터 엑셀파일명.xlsx'\n",
 15 |     "pd.read_excel(permission_path)"
 16 |    ]
 17 |   },
 18 |   {
 19 |    "cell_type": "code",
 20 |    "execution_count": null,
 21 |    "metadata": {},
 22 |    "outputs": [],
 23 |    "source": [
 24 |     "# [Example 3.22] Read the permit Excel data with read_excel (skip 10 rows, first column as index)\n",
 25 |     "\n",
 26 |     "permission_raw = pd.read_excel(permission_path, skiprows=10, index_col=0)"
 27 |    ]
 28 |   },
 29 |   {
 30 |    "cell_type": "code",
 31 |    "execution_count": null,
 32 |    "metadata": {},
 33 |    "outputs": [],
 34 |    "source": [
 35 |     "# [Example 3.23] Swap the rows and columns of permission_raw\n",
 36 |     "\n",
 37 |     "transposed_permission = permission_raw.T"
 38 |    ]
 39 |   },
 40 |   {
 41 |    "cell_type": "code",
 42 |    "execution_count": null,
 43 |    "metadata": {},
 44 |    "outputs": [],
 45 |    "source": [
 46 |     "# [Example 3.24] Convert the index to 'year.month' format\n",
 47 |     "\n",
 48 |     "new_index = []\n",
 49 |     "\n",
 50 |     "for old_date in transposed_permission.index:\n",
 51 |     "    temp_list = old_date.split(' ')\n",
 52 |     "    new_index.append(temp_list[0][:4] + '.' + temp_list[1][:2])"
 53 |    ]
 54 |   },
 55 |   {
 56 |    "cell_type": "code",
 57 |    "execution_count": null,
 58 |    "metadata": {},
 59 |    "outputs": [],
 60 |    "source": [
 61 |     "# [Example 3.25] Set the new index and finish the DataFrame\n",
 62 |     "\n",
 63 |     "transposed_permission.index = pd.to_datetime(new_index)\n",
 64 |     "transposed_permission.columns.name = None"
 65 |    ]
 66 |   },
 67 |   {
 68 |    "cell_type": "code",
 69 |    "execution_count": null,
 70 |    "metadata": {},
 71 |    "outputs": [],
 72 |    "source": [
 73 |     "# [Example 3.26] Define a function that converts the permit data into a DataFrame\n",
 74 |     "\n",
 75 |     "def permission_preprocessing(path):\n",
 76 |     "    permission_raw = pd.read_excel(path, skiprows=10, index_col=0)\n",
 77 |     "    transposed_permission = permission_raw.T\n",
 78 |     "    new_index = []\n",
 79 |     "\n",
 80 |     "    for old_date in transposed_permission.index:\n",
 81 |     "        temp_list = old_date.split(' ')\n",
 82 |     "        new_index.append(temp_list[0][:4] + '.' + temp_list[1][:2])\n",
 83 |     "        \n",
 84 |     "    transposed_permission.index = pd.to_datetime(new_index)\n",
 85 |     "    transposed_permission.columns.name = None\n",
 86 |     "    \n",
 87 |     "    return transposed_permission"
 88 |    ]
 89 |   },
 90 |   {
 91 |    "cell_type": "code",
 92 |    "execution_count": null,
 93 |    "metadata": {},
 94 |    "outputs": [],
 95 |    "source": [
 96 |     "# [Example 3.27] Read the unsold-housing Excel data with read_excel\n",
 97 |     "\n",
 98 |     "# unsold_path = r'C:\\Users\\JK\\Desktop\\부동산 데이터\\시·군·구별+미분양현황_2082_128_20181229151931.xlsx'\n",
 99 |     "unsold_path = r' 내려받은 미분양 데이터의 디렉터리 \\ 미분양 데이터 엑셀파일명.xlsx'\n",
100 |     "unsold_raw = pd.read_excel(unsold_path, skiprows=1, index_col=0)"
101 |    ]
102 |   },
103 |   {
104 |    "cell_type": "code",
105 |    "execution_count": null,
106 |    "metadata": {},
107 |    "outputs": [],
108 |    "source": [
109 |     "# [Example 3.28] Clean up the unsold-housing DataFrame\n",
110 |     "\n",
111 |     "del unsold_raw['시군구']\n",
112 |     "transposed_unsold = unsold_raw.T\n",
113 |     "transposed_unsold.index = pd.to_datetime(transposed_unsold.index)\n",
114 |     "transposed_unsold.columns.name = None"
115 |    ]
116 |   },
117 |   {
118 |    "cell_type": "code",
119 |    "execution_count": null,
120 |    "metadata": {},
121 |    "outputs": [],
122 |    "source": [
123 |     "# [Example 3.29] Define a function that converts the unsold-housing data into a DataFrame\n",
124 |     "\n",
125 |     "def unsold_preprocessing(path):\n",
126 |     "    unsold_raw = pd.read_excel(path, skiprows=1, index_col=0)\n",
127 |     "    \n",
128 |     "    del unsold_raw['시군구']\n",
129 |     "    transposed_unsold = unsold_raw.T\n",
130 |     "    transposed_unsold.index = pd.to_datetime(transposed_unsold.index)\n",
131 |     "    transposed_unsold.columns.name = None\n",
132 |     "    \n",
133 |     "    return transposed_unsold"
134 |    ]
135 |   },
136 |   {
137 |    "cell_type": "code",
138 |    "execution_count": null,
139 |    "metadata": {},
140 |    "outputs": [],
141 |    "source": [
142 |     "# [Example 3.30] Bring in the KBpriceindex_preprocessing function\n",
143 |     "\n",
144 |     "import xlwings as xw\n",
145 |     "\n",
146 |     "def KBpriceindex_preprocessing(path, data_type):\n",
147 |     "    # path : path to the KB Excel data file (string)\n",
148 |     "    # data_type : one of '매매종합', '매매APT', '매매연립', '매매단독', '전세종합', '전세APT', '전세연립', '전세단독'\n",
149 |     "    \n",
150 |     "    wb = xw.Book(path)                \n",
151 |     "    sheet = wb.sheets[data_type]   \n",
152 |     "    row_num = sheet.range(1,1).end('down').end('down').end('down').row  \n",
153 |     "    data_range = 'A2:GE' + str(row_num)\n",
154 |     "    raw_data = sheet[data_range].options(pd.DataFrame, index=False, header=True).value \n",
155 |     "    \n",
156 |     "    bignames = '서울 대구 부산 대전 광주 인천 울산 세종 경기 강원 충북 충남 전북 전남 경북 경남 제주도 6개광역시 5개광역시 수도권 기타지방 구분 전국'\n",
157 |     "    bigname_list = bignames.split(' ')\n",
158 |     "    big_col = list(raw_data.columns)\n",
159 |     "    small_col = list(raw_data.iloc[0])\n",
160 |     "\n",
161 |     "    for num, gu_data in enumerate(small_col):\n",
162 |     "        if gu_data is None:\n",
163 |     "            small_col[num] = big_col[num]\n",
164 |     "\n",
165 |     "        check = num\n",
166 |     "        while True:\n",
167 |     "            if big_col[check] in bigname_list:\n",
168 |     "                big_col[num] = big_col[check]\n",
169 |     "                break\n",
170 |     "            else:\n",
171 |     "                check = check - 1\n",
172 |     "                \n",
173 |     "    big_col[129] = '경기' \n",
174 |     "    big_col[130] = '경기'\n",
175 |     "    small_col[185] = '서귀포'\n",
176 |     "    \n",
177 |     "    raw_data.columns = [big_col, small_col]\n",
178 |     "    new_col_data = raw_data.drop([0,1])\n",
179 |     "    \n",
180 |     "    index_list = list(new_col_data['구분']['구분'])\n",
181 |     "\n",
182 |     "    new_index = []\n",
183 |     "\n",
184 |     "    for num, raw_index in enumerate(index_list):\n",
185 |     "        temp = str(raw_index).split('.')\n",
186 |     "        if int(temp[0]) > 12 :\n",
187 |     "            if len(temp[0]) == 2:\n",
188 |     "                new_index.append('19' + temp[0] + '.' + temp[1])\n",
189 |     "            else:\n",
190 |     "                new_index.append(temp[0] + '.' + temp[1])\n",
191 |     "        else:\n",
192 |     "            new_index.append(new_index[num-1].split('.')[0] + '.' + temp[0])\n",
193 |     "\n",
194 |     "    new_col_data.set_index(pd.to_datetime(new_index), inplace=True)\n",
195 |     "    cleaned_data  = new_col_data.drop(('구분', '구분'), axis=1)\n",
196 |     "    return cleaned_data"
197 |    ]
198 |   },
199 |   {
200 |    "cell_type": "code",
201 |    "execution_count": null,
202 |    "metadata": {},
203 |    "outputs": [],
204 |    "source": [
205 |     "# [Example 3.31] Preprocess the data with the functions defined above and load it into DataFrames\n",
206 |     "\n",
207 |     "permission_path  = r'내려받은 인허가 엑셀파일의 위치\\ 인허가엑셀 파일명.xlsx'\n",
208 |     "permission = permission_preprocessing(permission_path)\n",
209 |     "unsold_path = r' 내려받은 미분양 데이터의 디렉터리 \\ 미분양 데이터 엑셀파일명.xlsx'\n",
210 |     "unsold = unsold_preprocessing(unsold_path)\n",
211 |     "kb_path = r' 여러분이 내려 받은 KB 엑셀파일의 디렉터리를 넣으세요 \\ KB엑셀 파일명.xls'\n",
212 |     "price_index = KBpriceindex_preprocessing(kb_path, '매매종합')\n",
213 |     "jun_index   = KBpriceindex_preprocessing(kb_path, '전세종합')"
214 |    ]
215 |   },
216 |   {
217 |    "cell_type": "code",
218 |    "execution_count": null,
219 |    "metadata": {},
220 |    "outputs": [],
221 |    "source": [
222 |     "# [Example 3.32] Settings for plotting\n",
223 |     "\n",
224 |     "import matplotlib.pyplot as plt\n",
225 |     "from matplotlib import font_manager, rc\n",
226 |     "from matplotlib import style\n",
227 |     "style.use('ggplot')\n",
228 |     "%matplotlib inline\n",
229 |     "\n",
230 |     "font_name = font_manager.FontProperties(fname=\"c:/Windows/Fonts/malgun.ttf\").get_name()\n",
231 |     "rc('font', family=font_name)\n",
232 |     "# On macOS, skip the two lines above and use the line below instead\n",
233 |     "# rc('font', family='AppleGothic')\n",
234 |     "plt.rcParams['axes.unicode_minus'] = False"
235 |    ]
236 |   },
237 |   {
238 |    "cell_type": "code",
239 |    "execution_count": null,
240 |    "metadata": {},
241 |    "outputs": [],
242 |    "source": [
243 |     "# [Example 3.33] Plot the permit counts and the sale price index for Seoul\n",
244 |     "\n",
245 |     "plt.figure(figsize=(10,6))\n",
246 |     "ax = plt.subplot()\n",
247 |     "ax2 = ax.twinx()\n",
248 |     "\n",
249 |     "si = '서울'\n",
250 |     "gu = '서울'\n",
251 |     "\n",
252 |     "plt.title(si + '-' + gu)\n",
253 |     "ln1 = ax.plot(price_index[si][gu]['2009-1':], label='매매가')\n",
254 |     "ln2 = ax2.plot(permission[si]['2009-1':], label='인허가', color='green',marker=\"o\")\n",
255 |     "lns = ln1 +ln2\n",
256 |     "labs = [l.get_label() for l in lns]\n",
257 |     "ax.legend(lns, labs, loc='upper left')\n",
258 |     "    \n",
259 |     "plt.show()"
260 |    ]
261 |   },
262 |   {
263 |    "cell_type": "code",
264 |    "execution_count": null,
265 |    "metadata": {},
266 |    "outputs": [],
267 |    "source": [
268 |     "# [Example 3.34] Aggregate the permit data by year\n",
269 |     "\n",
270 |     "year_permission = permission.groupby(permission.index.year).sum()"
271 |    ]
272 |   },
273 |   {
274 |    "cell_type": "code",
275 |    "execution_count": null,
276 |    "metadata": {},
277 |    "outputs": [],
278 |    "source": [
279 |     "# [Example 3.35] Shift the permit data two years later\n",
280 |     "\n",
281 |     "modified_permission = year_permission.shift(2)\n",
282 |     "temp = []\n",
283 |     "for year in modified_permission.index:\n",
284 |     "    temp.append(str(year) + '-6-1')\n",
285 |     "modified_permission.index = pd.to_datetime(temp)"
286 |    ]
287 |   },
288 |   {
289 |    "cell_type": "code",
290 |    "execution_count": null,
291 |    "metadata": {},
292 |    "outputs": [],
293 |    "source": [
294 |     "# [Example 3.36] Plot the shifted permit data against the sale price index\n",
295 |     "\n",
296 |     "plt.figure(figsize=(10,6))\n",
297 |     "ax = plt.subplot()\n",
298 |     "ax2 = ax.twinx()\n",
299 |     "\n",
300 |     "si = '서울'\n",
301 |     "gu = '서울'\n",
302 |     "\n",
303 |     "plt.title(si + '-' + gu)\n",
304 |     "ln1 = ax.plot(price_index[si][gu]['2009-1':], label='매매가')\n",
305 |     "ln2 = ax2.plot(modified_permission[si]['2009':], label='인허가', color='green',marker=\"o\")\n",
306 |     "lns = ln1 +ln2\n",
307 |     "labs = [l.get_label() for l in lns]\n",
308 |     "ax.legend(lns, labs, loc='upper left')\n",
309 |     "    \n",
310 |     "plt.show()"
311 |    ]
312 |   },
313 |   {
314 |    "cell_type": "code",
315 |    "execution_count": null,
316 |    "metadata": {},
317 |    "outputs": [],
318 |    "source": [
319 |     "# [Example 3.37] Plot with the unsold-housing data added\n",
320 |     "\n",
321 |     "plt.figure(figsize=(10,6))\n",
322 |     "ax = plt.subplot()\n",
323 |     "ax2 = ax.twinx()\n",
324 |     "\n",
325 |     "si = '서울'\n",
326 |     "gu = '서울'\n",
327 |     "\n",
328 |     "plt.title(si + '-' + gu)\n",
329 |     "ln1 = ax.plot(price_index[si][gu]['2009-1':], label='매매가')\n",
330 |     "ln2 = ax.plot(jun_index[si][gu]['2009-1':], label='전세가')\n",
331 |     "ln3 = ax2.plot(modified_permission[si]['2009':]/10, label='인허가', color='lightslategray', ls='--')\n",
332 |     "ln4 = ax2.plot(unsold[si]['2009':], label='미분양', color='y', ls='--')\n",
333 |     "lns = ln1 +ln2 + ln3 + ln4\n",
334 |     "labs = [l.get_label() for l in lns]\n",
335 |     "ax.legend(lns, labs, loc='upper left')\n",
336 |     "    \n",
337 |     "plt.show()"
338 |    ]
339 |   },
340 |   {
341 |    "cell_type": "code",
342 |    "execution_count": null,
343 |    "metadata": {},
344 |    "outputs": [],
345 |    "source": [
346 |     "# [Example 3.38] Plot the rates of change\n",
347 |     "\n",
348 |     "plt.figure(figsize=(10,6))\n",
349 |     "ax = plt.subplot()\n",
350 |     "ax2 = ax.twinx()\n",
351 |     "\n",
352 |     "si = '서울'\n",
353 |     "gu = '서울'\n",
354 |     "\n",
355 |     "plt.title(si + '-' + gu)\n",
356 |     "ln1 = ax.plot(price_index[si][gu]['2009-1':].pct_change(12), label='매매가')\n",
357 |     "ln2 = ax.plot(jun_index[si][gu]['2009-1':].pct_change(12), label='전세가')\n",
358 |     "ln3 = ax2.plot(modified_permission[si]['2009':].pct_change(), label='인허가', color='lightslategray', ls='--')\n",
359 |     "ln4 = ax2.plot(unsold[si]['2009':].pct_change(12), label='미분양', color='y', ls='--')\n",
360 |     "lns = ln1 +ln2 + ln3 + ln4\n",
361 |     "labs = [l.get_label() for l in lns]\n",
362 |     "ax.legend(lns, labs, loc='upper left')\n",
363 |     "    \n",
364 |     "plt.show()"
365 |    ]
366 |   },
367 |   {
368 |    "cell_type": "code",
369 |    "execution_count": null,
370 |    "metadata": {},
371 |    "outputs": [],
372 |    "source": [
373 |     "# [Example 3.39] Run the demand function and store the result\n",
374 |     "\n",
375 |     "from datetime import datetime\n",
376 |     "from dateutil.relativedelta import relativedelta\n",
377 |     "\n",
378 |     "def demand(price_index, jeonse_index, index_date, time_range):\n",
379 |     "\n",
380 |     "    prev_date = index_date - relativedelta(months=time_range)\n",
381 |     "    prev_date2 = index_date - relativedelta(months=time_range*3)\n",
382 |     "\n",
383 |     "    demand_df = pd.DataFrame()\n",
384 |     "    demand_df['매매증감률'] = (price_index.loc[index_date] - price_index.loc[prev_date]) / price_index.loc[prev_date].replace(0, float('nan'))\n",
385 |     "    demand_df['전세증감률'] = (jeonse_index.loc[index_date] - jeonse_index.loc[prev_date]) / jeonse_index.loc[prev_date].replace(0, float('nan'))\n",
386 |     "    demand_df['이전최대값'] = price_index[prev_date2:index_date][:-1].max()\n",
387 |     "    demand_df['최댓값대비증감률'] = (price_index.loc[index_date] - demand_df['이전최대값']) / demand_df['이전최대값'].replace(0, float('nan'))\n",
388 |     "\n",
389 |     "    demand_df['매매가상승'] = demand_df['매매증감률'] > 0.01\n",
390 |     "    demand_df['전세가상승'] = demand_df['전세증감률'] > 0.01\n",
391 |     "    demand_df['더빠른전세상승'] = demand_df['전세증감률'] > demand_df['매매증감률']\n",
392 |     "    demand_df['최댓값대비상승'] = demand_df['최댓값대비증감률'] > 0\n",
393 |     "    demand_df['수요총합'] = demand_df[['매매가상승','전세가상승','더빠른전세상승','최댓값대비상승']].sum(axis=1)\n",
394 |     "\n",
395 |     "    demand_df = demand_df[demand_df['수요총합'] == 4]\n",
396 |     "\n",
397 |     "    selected_index = []\n",
398 |     "\n",
399 |     "    for name in demand_df.index:\n",
400 |     "        if name[0] != name[1]:  # skip rows where both index levels match (e.g. the '서울'-'서울' aggregate)\n",
401 |     "            selected_index.append((name[0], name[1]))\n",
402 |     "\n",
403 |     "    demand_df = demand_df.loc[selected_index]\n",
404 |     "    \n",
405 |     "    return demand_df\n",
406 |     "\n",
407 |     "\n",
408 |     "index_date = datetime(2013, 1, 1)\n",
409 |     "time_range = 12\n",
410 |     "demand_1 = demand(price_index, jun_index, index_date, time_range)"
411 |    ]
412 |   },
413 |   {
414 |    "cell_type": "code",
415 |    "execution_count": null,
416 |    "metadata": {},
417 |    "outputs": [],
418 |    "source": [
419 |     "# [Example 3.40] View the demand function results together with the permit and unsold-housing data\n",
420 |     "\n",
421 |     "prev_date = index_date - relativedelta(months=time_range)\n",
422 |     "prev_date2 = index_date - relativedelta(months=time_range * 3)\n",
423 |     "graph_start = index_date - relativedelta(months=time_range * 3)\n",
424 |     "\n",
425 |     "num_row = int((len(demand_1.index)-1)/2)+1\n",
426 |     "\n",
427 |     "plt.figure(figsize=(15, num_row*5))\n",
428 |     "for i, spot in enumerate(demand_1.index):\n",
429 |     "    ax = plt.subplot(num_row, 2, i+1)\n",
430 |     "    si = spot[0]\n",
431 |     "    gu = spot[1]\n",
432 |     "    plt.title(spot)\n",
433 |     "    ax2 = ax.twinx()\n",
434 |     "    ln1 = ax.plot(price_index[si][gu][graph_start:], label='매매가')\n",
435 |     "    ln2 = ax.plot(jun_index[si][gu][graph_start:], label='전세가')\n",
436 |     "    ln3 = ax2.plot(modified_permission[si][graph_start:]/10,  color='lightslategray', label='인허가')\n",
437 |     "    ln4 = ax2.plot(unsold[si][graph_start:], color='y', label='미분양')\n",
438 |     "    ax.axvline(x=index_date, color='lightcoral', linestyle='--')\n",
439 |     "    ax.axvline(x=prev_date, color='darkseagreen', linestyle='--')\n",
440 |     "    ax.axvline(x=prev_date2, color='darkseagreen', linestyle='--')\n",
441 |     "    lns = ln1 +ln2 + ln3 + ln4\n",
442 |     "    labs = [l.get_label() for l in lns]\n",
443 |     "    ax.legend(lns, labs, loc='lower right')\n",
444 |     "\n",
445 |     "plt.show()\n"
446 |    ]
447 |   }
448 |  ],
449 |  "metadata": {
450 |   "kernelspec": {
451 |    "display_name": "Python 3",
452 |    "language": "python",
453 |    "name": "python3"
454 |   },
455 |   "language_info": {
456 |    "codemirror_mode": {
457 |     "name": "ipython",
458 |     "version": 3
459 |    },
460 |    "file_extension": ".py",
461 |    "mimetype": "text/x-python",
462 |    "name": "python",
463 |    "nbconvert_exporter": "python",
464 |    "pygments_lexer": "ipython3",
465 |    "version": "3.6.4"
466 |   }
467 |  },
468 |  "nbformat": 4,
469 |  "nbformat_minor": 2
470 | }
471 | 


--------------------------------------------------------------------------------
/4.1 학군 (개정).ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "code",
  5 |    "execution_count": null,
  6 |    "metadata": {},
  7 |    "outputs": [],
  8 |    "source": [
  9 |     "# [Example 4.1] Read the '2018년 2차_졸업생의 진로 현황(전체)' Excel file with pandas read_excel\n",
 10 |     "\n",
 11 |     "import pandas as pd\n",
 12 |     "\n",
 13 |     "# graduate_path = r'C:\\Users\\JK\\Desktop\\부동산 데이터\\2018년 2차_졸업생의 진로 현황(전체).xlsx'\n",
 14 |     "graduate_path = r' 졸업생 진로 데이터 엑셀 파일 디렉터리 \\ 졸업생 진로 데이터 엑셀 파일명.xlsx'\n",
 15 |     "raw_graduate = pd.read_excel(graduate_path, sheet_name='2018_졸업생의 진로 현황(중)')"
 16 |    ]
 17 |   },
 18 |   {
 19 |    "cell_type": "code",
 20 |    "execution_count": null,
 21 |    "metadata": {},
 22 |    "outputs": [],
 23 |    "source": [
 24 |     "# [Example 4.2] Organize the nationwide middle school graduate data into a DataFrame\n",
 25 |     "\n",
 26 |     "select_col = raw_graduate[['지역', '학교명','정보공시 \\n 학교코드', '졸업자.2', '(특수목적고)과학고 진학자.2','(특수목적고)외고ㆍ국제고 진학자.2']]\n",
 27 |     "select_col.columns = ['지역', '학교명', '학교코드', '졸업자', '과고', '외고']\n",
 28 |     "graduate_data = select_col.drop(0)\n",
 29 |     "graduate_data['과고'] = pd.to_numeric(graduate_data['과고'])\n",
 30 |     "graduate_data['외고'] =  pd.to_numeric(graduate_data['외고']) \n",
 31 |     "graduate_data['졸업자'] =  pd.to_numeric(graduate_data['졸업자']) \n",
 32 |     "graduate_data['총합'] = graduate_data['과고'] + graduate_data['외고']"
 33 |    ]
 34 |   },
 35 |   {
 36 |    "cell_type": "code",
 37 |    "execution_count": null,
 38 |    "metadata": {},
 39 |    "outputs": [],
 40 |    "source": [
 41 |     "# [Example 4.3] Normalize the region names of middle schools nationwide\n",
 42 |     "\n",
 43 |     "def get_sido(x):\n",
 44 |     "    temp = x.split(' ')[0]\n",
 45 |     "    if len(temp) != 4:\n",
 46 |     "        return temp[:2]\n",
 47 |     "    else:\n",
 48 |     "        return temp[0] + temp[2]\n",
 49 |     "\n",
 50 |     "graduate_data['시도'] = graduate_data['지역'].dropna().apply(get_sido)\n",
 51 |     "graduate_data['구군'] = graduate_data['지역'].dropna().apply(lambda x: x.split(' ')[1])"
 52 |    ]
 53 |   },
 54 |   {
 55 |    "cell_type": "code",
 56 |    "execution_count": null,
 57 |    "metadata": {},
 58 |    "outputs": [],
 59 |    "source": [
 60 |     "# [Example 4.4] Fill in the rows whose region is missing\n",
 61 |     "\n",
 62 |     "graduate_data.at[588,'시도'] = '부산'\n",
 63 |     "graduate_data.at[588,'구군'] = '기장군'\n",
 64 |     "graduate_data.at[3011,'시도'] = '경북'\n",
 65 |     "graduate_data.at[3011,'구군'] = '예천군'"
 66 |    ]
 67 |   },
 68 |   {
 69 |    "cell_type": "code",
 70 |    "execution_count": null,
 71 |    "metadata": {
 72 |     "scrolled": false
 73 |    },
 74 |    "outputs": [],
 75 |    "source": [
 76 |     "# [Example 4.5] Function that builds a DataFrame from the special-purpose high school admission data\n",
 77 |     "\n",
 78 |     "def graduate_preprocrssing(path):\n",
 79 |     "    raw_graduate = pd.read_excel(path, sheet_name='2018_졸업생의 진로 현황(중)')\n",
 80 |     "    select_col = raw_graduate[['지역', '학교명','정보공시 \\n 학교코드', '졸업자.2', '(특수목적고)과학고 진학자.2','(특수목적고)외고ㆍ국제고 진학자.2']]\n",
 81 |     "    select_col.columns = ['지역', '학교명', '학교코드', '졸업자', '과고', '외고']\n",
 82 |     "    graduate_data = select_col.drop(0)\n",
 83 |     "    graduate_data['과고'] = pd.to_numeric(graduate_data['과고'])\n",
 84 |     "    graduate_data['외고'] =  pd.to_numeric(graduate_data['외고']) \n",
 85 |     "    graduate_data['졸업자'] =  pd.to_numeric(graduate_data['졸업자']) \n",
 86 |     "    graduate_data['총합'] = graduate_data['과고'] + graduate_data['외고']\n",
 87 |     "    \n",
 88 |     "    def get_sido(x):\n",
 89 |     "        temp = x.split(' ')[0]\n",
 90 |     "        if len(temp) != 4:\n",
 91 |     "            return temp[:2]\n",
 92 |     "        else:\n",
 93 |     "            return temp[0] + temp[2]\n",
 94 |     "    \n",
 95 |     "    graduate_data['시도'] = graduate_data['지역'].dropna().apply(get_sido)\n",
 96 |     "    graduate_data['구군'] = graduate_data['지역'].dropna().apply(lambda x: x.split(' ')[1])\n",
 97 |     "    \n",
 98 |     "    graduate_data.at[588,'시도'] = '부산'\n",
 99 |     "    graduate_data.at[588,'구군'] = '기장군'\n",
100 |     "    graduate_data.at[3011,'시도'] = '경북'\n",
101 |     "    graduate_data.at[3011,'구군'] = '예천군'\n",
102 |     "    \n",
103 |     "    return graduate_data"
104 |    ]
105 |   },
106 |   {
107 |    "cell_type": "code",
108 |    "execution_count": null,
109 |    "metadata": {},
110 |    "outputs": [],
111 |    "source": [
112 |     "# [Example 4.6] Read the average apartment sale price data\n",
113 |     "\n",
114 |     "# price_path = r'C:\\Users\\JK\\Desktop\\부동산 데이터\\평균매매가격_아파트.xlsx'\n",
115 |     "price_path = r' 평균 아파트 매매가격 데이터 디렉터리 \\ 평균 아파트 매매가격 데이터 엑셀 파일명.xlsx'\n",
116 |     "row_price = pd.read_excel(price_path, skiprows=10)"
117 |    ]
118 |   },
119 |   {
120 |    "cell_type": "code",
121 |    "execution_count": null,
122 |    "metadata": {
123 |     "scrolled": false
124 |    },
125 |    "outputs": [],
126 |    "source": [
127 |     "# [Example 4.7] Set up the regions\n",
128 |     "\n",
129 |     "big_col = []\n",
130 |     "for num, temp in enumerate(row_price['지 역']):\n",
131 |     "    if pd.isna(temp) :\n",
132 |     "        big_col.append(big_col[num-1])\n",
133 |     "    else:\n",
134 |     "        big_col.append(temp)\n",
135 |     "    \n",
136 |     "    \n",
137 |     "small_col = []\n",
138 |     "for num in range(len(row_price)):\n",
139 |     "    temp_list = list(row_price[['지 역', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3']].iloc[num])\n",
140 |     "    for temp in temp_list[3::-1]:\n",
141 |     "        if not pd.isna(temp):\n",
142 |     "            small_col.append(temp)\n",
143 |     "            break\n",
144 |     "            \n",
145 |     "row_price.index = [big_col, small_col]"
146 |    ]
147 |   },
148 |   {
149 |    "cell_type": "code",
150 |    "execution_count": null,
151 |    "metadata": {},
152 |    "outputs": [],
153 |    "source": [
154 |     "# [Example 4.8] Drop the unneeded columns and swap columns and index\n",
155 |     "\n",
156 |     "transposed_price = row_price.drop(['지 역', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3'], axis=1).T"
157 |    ]
158 |   },
159 |   {
160 |    "cell_type": "code",
161 |    "execution_count": null,
162 |    "metadata": {
163 |     "scrolled": false
164 |    },
165 |    "outputs": [],
166 |    "source": [
167 |     "# [Example 4.9] Clean up the date index\n",
168 |     "\n",
169 |     "time_index = []\n",
170 |     "for time in transposed_price.index:\n",
171 |     "    temp = time.split(' ')\n",
172 |     "    time_index.append(temp[0][:-1]+'.'+temp[1][:-1])\n",
173 |     "\n",
174 |     "transposed_price.index = pd.to_datetime(time_index)"
175 |    ]
176 |   },
177 |   {
178 |    "cell_type": "code",
179 |    "execution_count": null,
180 |    "metadata": {},
181 |    "outputs": [],
182 |    "source": [
183 |     "# [Example 4.10] Function that builds a DataFrame from the average apartment sale price data\n",
184 |     "\n",
185 |     "def gamjungwon_price_preprocessing(path):\n",
186 |     "    row_price = pd.read_excel(path, skiprows=10)\n",
187 |     "    \n",
188 |     "    big_col = []\n",
189 |     "    for num, temp in enumerate(row_price['지 역']):\n",
190 |     "        if pd.isna(temp) :\n",
191 |     "            big_col.append(big_col[num-1])\n",
192 |     "        else:\n",
193 |     "            big_col.append(temp)\n",
194 |     "\n",
195 |     "\n",
196 |     "    small_col = []\n",
197 |     "    for num in range(len(row_price)):\n",
198 |     "        temp_list = list(row_price[['지 역', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3']].iloc[num])\n",
199 |     "        for temp in temp_list[3::-1]:\n",
200 |     "            if not pd.isna(temp):\n",
201 |     "                small_col.append(temp)\n",
202 |     "                break\n",
203 |     "\n",
204 |     "    row_price.index = [big_col, small_col]\n",
205 |     "    \n",
206 |     "    transposed_price = row_price.drop(['지 역', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3'], axis=1).T\n",
207 |     "    \n",
208 |     "    time_index = []\n",
209 |     "    for time in transposed_price.index:\n",
210 |     "        temp = time.split(' ')\n",
211 |     "        time_index.append(temp[0][:-1]+'.'+temp[1][:-1])\n",
212 |     "\n",
213 |     "    transposed_price.index = pd.to_datetime(time_index)\n",
214 |     "    \n",
215 |     "    return transposed_price"
216 |    ]
217 |   },
218 |   {
219 |    "cell_type": "code",
220 |    "execution_count": null,
221 |    "metadata": {},
222 |    "outputs": [],
223 |    "source": [
224 |     "# [Example 4.11] Sum the graduate data by province/city using the special-purpose high school DataFrame\n",
225 |     "\n",
226 |     "graduate_path = r' 졸업생 진로 데이터 엑셀 파일 디렉터리 \\ 졸업생 진로 데이터 엑셀 파일명.xlsx'\n",
227 |     "gradu_df = graduate_preprocrssing(graduate_path)\n",
228 |     "\n",
229 |     "gradu_df.groupby('시도').sum()"
230 |    ]
231 |   },
232 |   {
233 |    "cell_type": "code",
234 |    "execution_count": null,
235 |    "metadata": {},
236 |    "outputs": [],
237 |    "source": [
238 |     "# [Example 4.12] Sort to find the province/city that sent the most students to special-purpose high schools\n",
239 |     "\n",
240 |     "gradu_sido = gradu_df.groupby('시도').sum()\n",
241 |     "gradu_sido.sort_values(by='총합', ascending=False)"
242 |    ]
243 |   },
244 |   {
245 |    "cell_type": "code",
246 |    "execution_count": null,
247 |    "metadata": {},
248 |    "outputs": [],
249 |    "source": [
250 |     "# [Example 4.13] Compute the special-purpose high school admission rate and sort by it\n",
251 |     "\n",
252 |     "gradu_sido['진학률'] = gradu_sido['총합'] / gradu_sido['졸업자'] * 100\n",
253 |     "gradu_sido.sort_values(by='진학률', ascending=False)"
254 |    ]
255 |   },
256 |   {
257 |    "cell_type": "code",
258 |    "execution_count": null,
259 |    "metadata": {},
260 |    "outputs": [],
261 |    "source": [
262 |     "# [Example 4.14] Fetch and store the price for each region\n",
263 |     "\n",
264 |     "price_path = r' 평균 아파트 매매가격 데이터 디렉터리 \\ 평균 아파트 매매가격 데이터 엑셀 파일명.xlsx'\n",
265 |     "price_data = gamjungwon_price_preprocessing(price_path)\n",
266 |     "\n",
267 |     "sido_list = []\n",
268 |     "for i in gradu_sido.index:\n",
269 |     "    sido_list.append(price_data.loc['2018-6-1'][i][i])\n",
270 |     "      \n",
271 |     "gradu_sido['평균매매가격'] = sido_list"
272 |    ]
273 |   },
274 |   {
275 |    "cell_type": "code",
276 |    "execution_count": null,
277 |    "metadata": {},
278 |    "outputs": [],
279 |    "source": [
280 |     "# [Example 4.15] Settings for plotting\n",
281 |     "\n",
282 |     "import matplotlib.pyplot as plt\n",
283 |     "from matplotlib import font_manager, rc\n",
284 |     "from matplotlib import style\n",
285 |     "style.use('ggplot')\n",
286 |     "%matplotlib inline\n",
287 |     "\n",
288 |     "\n",
289 |     "font_name = font_manager.FontProperties(fname=\"c:/Windows/Fonts/malgun.ttf\").get_name()\n",
290 |     "rc('font', family=font_name)\n",
291 |     "# On macOS, skip the two lines above and use the line below instead\n",
292 |     "# rc('font', family='AppleGothic')\n",
293 |     "plt.rcParams['axes.unicode_minus'] = False"
294 |    ]
295 |   },
296 |   {
297 |    "cell_type": "code",
298 |    "execution_count": null,
299 |    "metadata": {},
300 |    "outputs": [],
301 |    "source": [
302 |     "# [Example 4.16] Scatter plot of the special-purpose high school admission rate versus average apartment sale price\n",
303 |     "\n",
304 |     "plt.figure(figsize=(10, 7))\n",
305 |     "plt.scatter(gradu_sido['진학률'], gradu_sido['평균매매가격'], color='darkcyan', s=50)\n",
306 |     "plt.xlabel('졸업생 대비 특목고 진학생 비율(%)')\n",
307 |     "plt.ylabel('평균아파트매매가격')\n",
308 |     "for name in gradu_sido.index:\n",
309 |     "    plt.text(gradu_sido['진학률'][name]*1.02, gradu_sido['평균매매가격'][name], name, fontsize=13)\n",
310 |     "        \n",
311 |     "plt.show()"
312 |    ]
313 |   },
314 |   {
315 |    "cell_type": "code",
316 |    "execution_count": null,
317 |    "metadata": {},
318 |    "outputs": [],
319 |    "source": [
320 |     "# [Example 4.17] Add a linear regression line\n",
321 |     "import seaborn as sns\n",
322 |     "\n",
323 |     "plt.figure(figsize=(10, 7))\n",
324 |     "plt.scatter(gradu_sido['진학률'], gradu_sido['평균매매가격'], color='darkcyan', s=50)\n",
325 |     "sns.regplot(x=gradu_sido['진학률'], y=gradu_sido['평균매매가격'], scatter=False, color='darkcyan')\n",
326 |     "plt.xlabel('졸업생 대비 특목고 진학생 비율(%)')\n",
327 |     "plt.ylabel('평균아파트매매가격')\n",
328 |     "for name in gradu_sido.index:\n",
329 |     "    plt.text(gradu_sido['진학률'][name]*1.02, gradu_sido['평균매매가격'][name], name, fontsize=13)\n",
330 |     "        \n",
331 |     "plt.show()"
332 |    ]
333 |   },
334 |   {
335 |    "cell_type": "code",
336 |    "execution_count": null,
337 |    "metadata": {},
338 |    "outputs": [],
339 |    "source": [
340 |     "# [Example 4.18] Special-purpose high school admissions by district (gu) in Seoul\n",
341 |     "\n",
342 |     "local = '서울'\n",
343 |     "\n",
344 |     "gradu_gu = graduate_data[graduate_data['시도'] == local].groupby('구군').sum()\n",
345 |     "gradu_gu['진학률'] = gradu_gu['총합'] / gradu_gu['졸업자'] * 100\n",
346 |     "gradu_gu['평균매매가격'] = price_data.loc['2018-6-1'][local][gradu_gu.index]"
347 |    ]
348 |   },
349 |   {
350 |    "cell_type": "code",
351 |    "execution_count": null,
352 |    "metadata": {},
353 |    "outputs": [],
354 |    "source": [
355 |     "# [Example 4.19] Scatter plot of special-purpose high school admission rate vs. average apartment sale price in Seoul\n",
356 |     "\n",
357 |     "plt.figure(figsize=(12, 7))\n",
358 |     "plt.scatter(gradu_gu['진학률'], gradu_gu['평균매매가격'], color='steelblue', s=50)\n",
359 |     "sns.regplot(x=gradu_gu['진학률'], y=gradu_gu['평균매매가격'], scatter=False, color='steelblue')\n",
360 |     "plt.xlabel('졸업생 대비 특목고 진학생 비율(%)')\n",
361 |     "plt.ylabel('평균아파트매매가격')\n",
362 |     "for name in gradu_gu.index:\n",
363 |     "    plt.text(gradu_gu['진학률'][name]*1.02, gradu_gu['평균매매가격'][name], name, fontsize=13)\n",
364 |     "        \n",
365 |     "plt.show()"
366 |    ]
367 |   },
368 |   {
369 |    "cell_type": "code",
370 |    "execution_count": null,
371 |    "metadata": {},
372 |    "outputs": [],
373 |    "source": [
374 |     "# [Example 4.20] Scatter plot of special-purpose high school admission rate vs. average apartment sale price in Busan\n",
375 |     "\n",
376 |     "local = '부산'\n",
377 |     "\n",
378 |     "gradu_gu = graduate_data[graduate_data['시도'] == local].groupby('구군').sum()\n",
379 |     "gradu_gu['진학률'] = gradu_gu['총합'] / gradu_gu['졸업자'] * 100\n",
380 |     "gradu_gu['평균매매가격'] = price_data.loc['2018-6-1'][local][gradu_gu.index]\n",
381 |     "gradu_gu = gradu_gu.dropna()\n",
382 |     "\n",
383 |     "plt.figure(figsize=(12, 7))\n",
384 |     "plt.scatter(gradu_gu['진학률'], gradu_gu['평균매매가격'], color='steelblue', s=50)\n",
385 |     "sns.regplot(x=gradu_gu['진학률'], y=gradu_gu['평균매매가격'], scatter=False, color='steelblue')\n",
386 |     "plt.xlabel('졸업생 대비 특목고 진학생 비율(%)')\n",
387 |     "plt.ylabel('평균아파트매매가격')\n",
388 |     "for name in gradu_gu.index:\n",
389 |     "    plt.text(gradu_gu['진학률'][name]*1.02, gradu_gu['평균매매가격'][name], name, fontsize=13)\n",
390 |     "        \n",
391 |     "plt.show()"
392 |    ]
393 |   },
394 |   {
395 |    "cell_type": "code",
396 |    "execution_count": null,
397 |    "metadata": {},
398 |    "outputs": [],
399 |    "source": [
400 |     "# [Example 4.21] Loading the private academy data into a DataFrame\n",
401 |     "\n",
402 |     "#aca_path = r'C:\\Users\\JK\\Desktop\\부동산 데이터\\2018년 시도별 행정구역별 사설학원 현황.xlsx'\n",
403 |     "aca_path = r' 사설학원 현황 데이터 디렉터리 \\ 사설학원 현황 엑셀 파일명.xlsx'\n",
404 |     "aca_raw = pd.read_excel(aca_path, skiprows=3)"
405 |    ]
406 |   },
407 |   {
408 |    "cell_type": "code",
409 |    "execution_count": null,
410 |    "metadata": {},
411 |    "outputs": [],
412 |    "source": [
413 |     "# [Example 4.22] Cleaning up the academy DataFrame\n",
414 |     "\n",
415 |     "aca_data = aca_raw[aca_raw['분야'] == '입시검정및보습']\n",
416 |     "aca_data = aca_data[['시도', '행정구역', '학원수', '정원\\n(수강자수)', '강사수', '강의실수','월평균교습시간','월평균교습비(원)' ]]\n",
417 |     "aca_data.columns = ['시도', '구군', '학원수', '수강자수', '강사수', '강의실수','월평균교습시간','월평균교습비']"
418 |    ]
419 |   },
420 |   {
421 |    "cell_type": "code",
422 |    "execution_count": null,
423 |    "metadata": {},
424 |    "outputs": [],
425 |    "source": [
426 |     "# [Example 4.23] Scatter plot of academy count vs. average apartment sale price by province (sido)\n",
427 |     "\n",
428 |     "aca_sido = aca_data.groupby('시도').sum()\n",
429 |     "\n",
430 |     "sido_list = []\n",
431 |     "for i in aca_sido.index:\n",
432 |     "    sido_list.append(price_data.loc['2018-6-1'][i][i])\n",
433 |     "      \n",
434 |     "aca_sido['평균매매가격'] = sido_list\n",
435 |     "\n",
436 |     "plt.figure(figsize=(10, 7))\n",
437 |     "plt.scatter(aca_sido['학원수'], aca_sido['평균매매가격'], color='orange', s=50)\n",
438 |     "sns.regplot(x=aca_sido['학원수'], y=aca_sido['평균매매가격'], scatter=False, color='orange')\n",
439 |     "plt.xlabel('학원수')\n",
440 |     "plt.ylabel('평균아파트매매가격')\n",
441 |     "for name in aca_sido.index:\n",
442 |     "    plt.text(aca_sido['학원수'][name]*1.02, aca_sido['평균매매가격'][name], name, fontsize=13)\n",
443 |     "        \n",
444 |     "plt.show()"
445 |    ]
446 |   },
447 |   {
448 |    "cell_type": "code",
449 |    "execution_count": null,
450 |    "metadata": {},
451 |    "outputs": [],
452 |    "source": [
453 |     "# [Example 4.24] Scatter plot of academy count vs. average apartment sale price in Seoul\n",
454 |     "\n",
455 |     "local = '서울'\n",
456 |     "\n",
457 |     "aca_gu = aca_data[aca_data['시도'] == local].groupby('구군').sum()\n",
458 |     "aca_gu['평균매매가격'] = price_data.loc['2018-6-1'][local][aca_gu.index]\n",
459 |     "aca_gu = aca_gu.dropna()\n",
460 |     "\n",
461 |     "plt.figure(figsize=(12, 7))\n",
462 |     "plt.scatter(aca_gu['학원수'], aca_gu['평균매매가격'], color='orange', s=50)\n",
463 |     "sns.regplot(x=aca_gu['학원수'], y=aca_gu['평균매매가격'], scatter=False, color='orange')\n",
464 |     "plt.xlabel('학원수')\n",
465 |     "plt.ylabel('평균아파트매매가격')\n",
466 |     "for name in aca_gu.index:\n",
467 |     "    plt.text(aca_gu['학원수'][name]*1.02, aca_gu['평균매매가격'][name], name, fontsize=13)\n",
468 |     "        \n",
469 |     "plt.show()"
470 |    ]
471 |   },
472 |   {
473 |    "cell_type": "code",
474 |    "execution_count": null,
475 |    "metadata": {},
476 |    "outputs": [],
477 |    "source": [
478 |     "# [Example 4.25] Merging Seoul's special-purpose high school admission rate and academy count into one DataFrame\n",
479 |     "\n",
480 |     "local = '서울'\n",
481 |     "\n",
482 |     "gradu_gu = graduate_data[graduate_data['시도'] == local].groupby('구군').sum()\n",
483 |     "gradu_gu['진학률'] = gradu_gu['총합'] / gradu_gu['졸업자'] * 100\n",
484 |     "aca_gu = aca_data[aca_data['시도'] == local].groupby('구군').sum()\n",
485 |     "study_df = pd.merge(gradu_gu, aca_gu, how='outer', right_index=True, left_index=True)\n",
486 |     "study_df['평균매매가격'] = pd.to_numeric(price_data.loc['2018-6-1'][local][study_df.index])"
487 |    ]
488 |   },
489 |   {
490 |    "cell_type": "code",
491 |    "execution_count": null,
492 |    "metadata": {},
493 |    "outputs": [],
494 |    "source": [
495 |     "# [Example 4.26] Scatter plot from the merged DataFrame\n",
496 |     "\n",
497 |     "plt.figure(figsize=(12, 6))\n",
498 |     "plt.scatter(study_df['진학률'], study_df['학원수'], c=study_df['평균매매가격'], s=study_df['평균매매가격']*0.001, cmap=\"YlOrRd\", alpha=0.5)\n",
499 |     "sns.regplot(x=study_df['진학률'], y=study_df['학원수'], scatter=False, color='silver')\n",
500 |     "plt.xlabel('특목고 진학률(%)')\n",
501 |     "plt.ylabel('학원수')\n",
502 |     "\n",
503 |     "for name in study_df.index:\n",
504 |     "    plt.text(study_df['진학률'][name]*1.01, study_df['학원수'][name], name)\n",
505 |     "    \n",
506 |     "plt.colorbar()\n",
507 |     "plt.show()"
508 |    ]
509 |   }
510 |  ],
511 |  "metadata": {
512 |   "kernelspec": {
513 |    "display_name": "Python 3",
514 |    "language": "python",
515 |    "name": "python3"
516 |   },
517 |   "language_info": {
518 |    "codemirror_mode": {
519 |     "name": "ipython",
520 |     "version": 3
521 |    },
522 |    "file_extension": ".py",
523 |    "mimetype": "text/x-python",
524 |    "name": "python",
525 |    "nbconvert_exporter": "python",
526 |    "pygments_lexer": "ipython3",
527 |    "version": "3.6.4"
528 |   }
529 |  },
530 |  "nbformat": 4,
531 |  "nbformat_minor": 2
532 | }
533 | 


--------------------------------------------------------------------------------
/4.2 일자리 (개정).ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "code",
  5 |    "execution_count": null,
  6 |    "metadata": {},
  7 |    "outputs": [],
  8 |    "source": [
  9 |     "# [Example 4.27] Reading the regional employment statistics data\n",
 10 |     "\n",
 11 |     "import pandas as pd\n",
 12 |     "\n",
 13 |     "# job_path = r'C:\\Users\\JK\\Desktop\\부동산 데이터\\지역고용통계.xls'\n",
 14 |     "job_path = r' 고용 데이터 파일 디렉터리 \\ 지역고용통계.xls'\n",
 15 |     "job_raw = pd.read_excel(job_path, skiprows=2)"
 16 |    ]
 17 |   },
 18 |   {
 19 |    "cell_type": "code",
 20 |    "execution_count": null,
 21 |    "metadata": {
 22 |     "scrolled": false
 23 |    },
 24 |    "outputs": [],
 25 |    "source": [
 26 |     "# [Example 4.28] Tidying the DataFrame\n",
 27 |     "\n",
 28 |     "job_data = job_raw[job_raw['산업별'] == '전산업']\n",
 29 |     "job_data = job_data[['지역별', '전체종사자 (명)']].copy()\n",
 30 |     "job_data.columns = ['지역명',  '고용자수']"
 31 |    ]
 32 |   },
 33 |   {
 34 |    "cell_type": "code",
 35 |    "execution_count": null,
 36 |    "metadata": {},
 37 |    "outputs": [],
 38 |    "source": [
 39 |     "# [Example 4.29] Splitting the region name (ch04/ 4.2 일자리.ipynb)\n",
 40 |     "\n",
 41 |     "def get_sido(x):\n",
 42 |     "    temp = x.split(' ')[0]\n",
 43 |     "    if len(temp) != 4:\n",
 44 |     "        return temp[:2]\n",
 45 |     "    else:\n",
 46 |     "        return temp[0] + temp[2]\n",
 47 |     "\n",
 48 |     "job_data['시도'] = job_data['지역명'].apply(get_sido)\n",
 49 |     "job_data['구군'] = job_data['지역명'].apply(lambda x: x.split(' ')[1])"
 50 |    ]
 51 |   },
 52 |   {
 53 |    "cell_type": "code",
 54 |    "execution_count": null,
 55 |    "metadata": {},
 56 |    "outputs": [],
 57 |    "source": [
 58 |     "# [Example 4.30] Wrapping the regional employment preprocessing steps into a function\n",
 59 |     "\n",
 60 |     "def job_preprocessing(path):\n",
 61 |     "    job_raw = pd.read_excel(path, skiprows=2)\n",
 62 |     "    job_data = job_raw[job_raw['산업별'] == '전산업']\n",
 63 |     "    job_data = job_data[['지역별', '전체종사자 (명)']].copy()\n",
 64 |     "    job_data.columns = ['지역명',  '고용자수']\n",
 65 |     "    \n",
 66 |     "    def get_sido(x):\n",
 67 |     "        temp = x.split(' ')[0]\n",
 68 |     "        if len(temp) != 4:\n",
 69 |     "            return temp[:2]\n",
 70 |     "        else:\n",
 71 |     "            return temp[0] + temp[2]\n",
 72 |     "\n",
 73 |     "    job_data['시도'] = job_data['지역명'].apply(get_sido)\n",
 74 |     "    job_data['구군'] = job_data['지역명'].apply(lambda x: x.split(' ')[1])\n",
 75 |     "    \n",
 76 |     "    return job_data"
 77 |    ]
 78 |   },
 79 |   {
 80 |    "cell_type": "code",
 81 |    "execution_count": null,
 82 |    "metadata": {},
 83 |    "outputs": [],
 84 |    "source": [
 85 |     "# [Example 4.31] Reading the household-count Excel data\n",
 86 |     "\n",
 87 |     "# house_n_path = r'C:\\Users\\JK\\Desktop\\부동산 데이터\\행정구역_시군구_별_주민등록세대수_20190107134842.xlsx'\n",
 88 |     "house_n_path = r' 세대수 데이터 디렉터리 \\ 세대수 데이터 엑셀파일명.xlsx'\n",
 89 |     "house_n_raw = pd.read_excel(house_n_path)"
 90 |    ]
 91 |   },
 92 |   {
 93 |    "cell_type": "code",
 94 |    "execution_count": null,
 95 |    "metadata": {},
 96 |    "outputs": [],
 97 |    "source": [
 98 |     "# [Example 4.32] Filling in the missing province (sido) names in the household-count DataFrame\n",
 99 |     "\n",
100 |     "house_n_raw.columns = ['시도', '구군', '세대수']\n",
101 |     "\n",
102 |     "big_col = []\n",
103 |     "for num, temp in enumerate(house_n_raw['시도']):\n",
104 |     "    if pd.isna(temp) :\n",
105 |     "        big_col.append(big_col[num-1])\n",
106 |     "    else:\n",
107 |     "        big_col.append(temp)\n",
108 |     "\n",
109 |     "house_n_raw['시도'] = big_col"
110 |    ]
111 |   },
112 |   {
113 |    "cell_type": "code",
114 |    "execution_count": null,
115 |    "metadata": {},
116 |    "outputs": [],
117 |    "source": [
118 |     "# [Example 4.33] Shortening province names to abbreviations in the household-count DataFrame\n",
119 |     "\n",
120 |     "def get_sido(x):\n",
121 |     "    if len(x) != 4:\n",
122 |     "        return x[:2]\n",
123 |     "    else:\n",
124 |     "        return x[0] + x[2]\n",
125 |     "\n",
126 |     "house_n_raw['시도'] = house_n_raw['시도'].apply(get_sido)\n",
127 |     "house_n_data = house_n_raw[house_n_raw['구군'] != '소계']"
128 |    ]
129 |   },
130 |   {
131 |    "cell_type": "code",
132 |    "execution_count": null,
133 |    "metadata": {
134 |     "scrolled": false
135 |    },
136 |    "outputs": [],
137 |    "source": [
138 |     "# [Example 4.34] Wrapping the household-count preprocessing steps into a function (ch04/ 4.2 일자리.ipynb)\n",
139 |     "\n",
140 |     "def house_number_preprocessing(path):\n",
141 |     "    house_n_raw = pd.read_excel(path)\n",
142 |     "    house_n_raw.columns = ['시도', '구군', '세대수']\n",
143 |     "\n",
144 |     "    big_col = []\n",
145 |     "    for num, temp in enumerate(house_n_raw['시도']):\n",
146 |     "        if pd.isna(temp) :\n",
147 |     "            big_col.append(big_col[num-1])\n",
148 |     "        else:\n",
149 |     "            big_col.append(temp)\n",
150 |     "\n",
151 |     "    house_n_raw['시도'] = big_col\n",
152 |     "    \n",
153 |     "    def get_sido(x):\n",
154 |     "        if len(x) != 4:\n",
155 |     "            return x[:2]\n",
156 |     "        else:\n",
157 |     "            return x[0] + x[2]\n",
158 |     "\n",
159 |     "    house_n_raw['시도'] = house_n_raw['시도'].apply(get_sido)\n",
160 |     "    house_n_data = house_n_raw[house_n_raw['구군'] != '소계']\n",
161 |     "    \n",
162 |     "    return house_n_data"
163 |    ]
164 |   },
165 |   {
166 |    "cell_type": "code",
167 |    "execution_count": null,
168 |    "metadata": {},
169 |    "outputs": [],
170 |    "source": [
171 |     "# [Example 4.35] Employee counts by province\n",
172 |     "\n",
173 |     "job_path = r' 고용 데이터 파일 디렉터리 \\ 지역고용통계.xls'\n",
174 |     "job_df = job_preprocessing(job_path)\n",
175 |     "\n",
176 |     "job_sido = job_df.groupby('시도').sum()\n",
177 |     "job_sido = job_sido.sort_values(by='고용자수', ascending=False)"
178 |    ]
179 |   },
180 |   {
181 |    "cell_type": "code",
182 |    "execution_count": null,
183 |    "metadata": {},
184 |    "outputs": [],
185 |    "source": [
186 |     "# [Example 4.36] Bar chart of employee counts by province\n",
187 |     "\n",
188 |     "import matplotlib.pyplot as plt\n",
189 |     "from matplotlib import font_manager, rc\n",
190 |     "from matplotlib import style\n",
191 |     "import seaborn as sns\n",
192 |     "style.use('ggplot')\n",
193 |     "%matplotlib inline\n",
194 |     "\n",
195 |     "font_name = font_manager.FontProperties(fname=\"c:/Windows/Fonts/malgun.ttf\").get_name()\n",
196 |     "rc('font', family=font_name)\n",
197 |     "# On macOS, skip the two lines above and use the line below instead\n",
198 |     "# rc('font', family='AppleGothic')\n",
199 |     "plt.rcParams['axes.unicode_minus'] = False\n",
200 |     "\n",
201 |     "\n",
202 |     "plt.figure(figsize=(12, 4))\n",
203 |     "job_sido['고용자수'].plot(kind='bar', color='darkcyan')\n",
204 |     "plt.axhline(y=job_sido['고용자수'].mean(), color='orange', linewidth=2, ls='--')\n",
205 |     "plt.xticks(rotation=0)\n",
206 |     "plt.xlabel('')\n",
207 |     "plt.ylabel('고용자수')\n",
208 |     "plt.show()"
209 |    ]
210 |   },
211 |   {
212 |    "cell_type": "code",
213 |    "execution_count": null,
214 |    "metadata": {},
215 |    "outputs": [],
216 |    "source": [
217 |     "# [Example 4.37] Adding the household-count data\n",
218 |     "\n",
219 |     "house_n_path = r' 세대수 데이터 디렉터리 \\ 세대수 데이터 엑셀파일명.xlsx'\n",
220 |     "house_n_df = house_number_preprocessing(house_n_path)\n",
221 |     "\n",
222 |     "job_sido['세대수'] = house_n_df.groupby('시도')['세대수'].sum().loc[job_sido.index]\n",
223 |     "job_sido['세대수대비고용'] = job_sido['고용자수']/job_sido['세대수'] * 100"
224 |    ]
225 |   },
226 |   {
227 |    "cell_type": "code",
228 |    "execution_count": null,
229 |    "metadata": {},
230 |    "outputs": [],
231 |    "source": [
232 |     "# [Example 4.38] Bar chart of the employment-to-household ratio by province\n",
233 |     "\n",
234 |     "plt.figure(figsize=(12, 4))\n",
235 |     "job_sido.sort_values(by='세대수대비고용', ascending=False)['세대수대비고용'].plot(kind='bar', color='darkcyan')\n",
236 |     "plt.axhline(y=job_sido['세대수대비고용'].mean(), color='orange', linewidth=2, ls='--')\n",
237 |     "plt.xticks(rotation=0)\n",
238 |     "plt.xlabel('')\n",
239 |     "plt.ylabel('세대수 대비 고용자수 (%)')\n",
240 |     "plt.show()"
241 |    ]
242 |   },
243 |   {
244 |    "cell_type": "code",
245 |    "execution_count": null,
246 |    "metadata": {},
247 |    "outputs": [],
248 |    "source": [
249 |     "# [Example 4.39] Loading average apartment sale prices\n",
250 |     "\n",
251 |     "def gamjungwon_price_preprocessing(path):\n",
252 |     "    row_price = pd.read_excel(path, skiprows=10)\n",
253 |     "    \n",
254 |     "    big_col = []\n",
255 |     "    for num, temp in enumerate(row_price['지 역']):\n",
256 |     "        if pd.isna(temp) :\n",
257 |     "            big_col.append(big_col[num-1])\n",
258 |     "        else:\n",
259 |     "            big_col.append(temp)\n",
260 |     "\n",
261 |     "\n",
262 |     "    small_col = []\n",
263 |     "    for num in range(len(row_price)):\n",
264 |     "        temp_list = list(row_price[['지 역', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3']].iloc[num])\n",
265 |     "        for temp in temp_list[3::-1]:\n",
266 |     "            if not pd.isna(temp):\n",
267 |     "                small_col.append(temp)\n",
268 |     "                break\n",
269 |     "\n",
270 |     "    row_price.index = [big_col, small_col]\n",
271 |     "    \n",
272 |     "    transposed_price = row_price.drop(['지 역', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3'], axis=1).T\n",
273 |     "    \n",
274 |     "    time_index = []\n",
275 |     "    for time in transposed_price.index:\n",
276 |     "        temp = time.split(' ')\n",
277 |     "        time_index.append(temp[0][:-1]+'.'+temp[1][:-1])\n",
278 |     "\n",
279 |     "    transposed_price.index = pd.to_datetime(time_index)\n",
280 |     "    \n",
281 |     "    return transposed_price\n",
282 |     "\n",
283 |     "\n",
284 |     "price_path =  r' 평균 아파트 매매가격 데이터 디렉터리 \\ 평균 아파트 매매가격 데이터 엑셀 파일명.xlsx'\n",
285 |     "price_df = gamjungwon_price_preprocessing(price_path)"
286 |    ]
287 |   },
288 |   {
289 |    "cell_type": "code",
290 |    "execution_count": null,
291 |    "metadata": {},
292 |    "outputs": [],
293 |    "source": [
294 |     "# [Example 4.40] Scatter plot of employee count, household count, and average apartment sale price\n",
295 |     "\n",
296 |     "sido_list = []\n",
297 |     "for i in job_sido.index:\n",
298 |     "    sido_list.append(price_df.loc['2018-6-1'][i][i])\n",
299 |     "      \n",
300 |     "job_sido['평균매매가격'] = sido_list\n",
301 |     "\n",
302 |     "plt.figure(figsize=(10, 6))\n",
303 |     "plt.scatter(job_sido['고용자수'], job_sido['세대수'], c=job_sido['평균매매가격'], s=job_sido['평균매매가격']*0.001, cmap=\"YlOrRd\", alpha=0.5 )\n",
304 |     "plt.xlabel('고용자수')\n",
305 |     "plt.ylabel('세대수')\n",
306 |     "for name in job_sido.index:\n",
307 |     "    plt.text(job_sido['고용자수'][name]*1.01, job_sido['세대수'][name]*1.05, name, fontsize=13)\n",
308 |     "\n",
309 |     "plt.colorbar()\n",
310 |     "plt.show()"
311 |    ]
312 |   },
313 |   {
314 |    "cell_type": "code",
315 |    "execution_count": null,
316 |    "metadata": {},
317 |    "outputs": [],
318 |    "source": [
319 |     "# [Example 4.41] Scatter plot of employment-to-household ratio, household count, and average apartment sale price\n",
320 |     "\n",
321 |     "plt.figure(figsize=(10, 6))\n",
322 |     "plt.scatter(job_sido['세대수대비고용'], job_sido['세대수'], c=job_sido['평균매매가격'], s=job_sido['평균매매가격']*0.001, cmap=\"YlOrRd\", alpha=0.5 )\n",
323 |     "plt.xlabel('세대수 대비 고용자수 (%)')\n",
324 |     "plt.ylabel('세대수')\n",
325 |     "for name in job_sido.index:\n",
326 |     "    plt.text(job_sido['세대수대비고용'][name]*1.01, job_sido['세대수'][name]*1.05, name, fontsize=13)\n",
327 |     "\n",
328 |     "plt.colorbar()\n",
329 |     "plt.show()"
330 |    ]
331 |   },
332 |   {
333 |    "cell_type": "code",
334 |    "execution_count": null,
335 |    "metadata": {},
336 |    "outputs": [],
337 |    "source": [
338 |     "# [Example 4.42] Building a DataFrame of Seoul's employee count, household count, and average sale price\n",
339 |     "\n",
340 |     "local = '서울'\n",
341 |     "\n",
342 |     "job_gugun = job_df[job_df['시도']==local].groupby('구군').sum()\n",
343 |     "job_gugun['세대수'] = house_n_df[house_n_df['시도'] == local].groupby('구군')['세대수'].sum().loc[job_gugun.index]\n",
344 |     "job_gugun['세대수대비고용'] = job_gugun['고용자수']/job_gugun['세대수'] * 100\n",
345 |     "job_gugun['평균매매가격'] = price_df.loc['2018-6-1'][local][job_gugun.index]\n",
346 |     "job_gugun = job_gugun.dropna()"
347 |    ]
348 |   },
349 |   {
350 |    "cell_type": "code",
351 |    "execution_count": null,
352 |    "metadata": {},
353 |    "outputs": [],
354 |    "source": [
355 |     "# [Example 4.43] Bar charts of Seoul's employee count and employment-to-household ratio\n",
356 |     "\n",
357 |     "plt.figure(figsize=(12, 4))\n",
358 |     "job_gugun.sort_values(by='고용자수', ascending=False)['고용자수'].plot(kind='bar', color='darkcyan')\n",
359 |     "plt.axhline(y=job_gugun['고용자수'].mean(), color='orange', linewidth=2, ls='--')\n",
360 |     "plt.xticks(rotation=45)\n",
361 |     "plt.xlabel('')\n",
362 |     "plt.ylabel('고용자수')\n",
363 |     "plt.show()\n",
364 |     "\n",
365 |     "plt.figure(figsize=(12, 4))\n",
366 |     "job_gugun.sort_values(by='세대수대비고용', ascending=False)['세대수대비고용'].plot(kind='bar', color='darkcyan')\n",
367 |     "plt.axhline(y=job_gugun['세대수대비고용'].mean(), color='orange', linewidth=2, ls='--')\n",
368 |     "plt.xticks(rotation=45)\n",
369 |     "plt.xlabel('')\n",
370 |     "plt.ylabel('세대수 대비 고용자수 (%)')\n",
371 |     "plt.show()"
372 |    ]
373 |   },
374 |   {
375 |    "cell_type": "code",
376 |    "execution_count": null,
377 |    "metadata": {},
378 |    "outputs": [],
379 |    "source": [
380 |     "# [Example 4.44] Scatter plot of Seoul's employee count, household count, and average apartment sale price (ch04/ 4.2 일자리.ipynb)\n",
381 |     "\n",
382 |     "plt.figure(figsize=(10, 6))\n",
383 |     "plt.scatter(job_gugun['고용자수'], job_gugun['세대수'], c=job_gugun['평균매매가격'], s=pd.to_numeric(job_gugun['평균매매가격'])*0.001, cmap=\"YlOrRd\", alpha=0.5)\n",
384 |     "plt.xlabel('고용자수')\n",
385 |     "plt.ylabel('세대수')\n",
386 |     "\n",
387 |     "for name in job_gugun.index:\n",
388 |     "    plt.text(job_gugun['고용자수'][name]*1.01, job_gugun['세대수'][name], name)\n",
389 |     "    \n",
390 |     "plt.colorbar()\n",
391 |     "plt.show()"
392 |    ]
393 |   },
394 |   {
395 |    "cell_type": "code",
396 |    "execution_count": null,
397 |    "metadata": {},
398 |    "outputs": [],
399 |    "source": [
400 |     "# [Example 4.45] Scatter plot of Seoul's employment ratio, household count, and average apartment sale price\n",
401 |     "\n",
402 |     "plt.figure(figsize=(10, 6))\n",
403 |     "plt.scatter(job_gugun['세대수대비고용'], job_gugun['세대수'], c=job_gugun['평균매매가격'], s=pd.to_numeric(job_gugun['평균매매가격'])*0.001, cmap=\"YlOrRd\", alpha=0.5)\n",
404 |     "plt.xlabel('세대수 대비 고용자수 (%)')\n",
405 |     "plt.ylabel('세대수')\n",
406 |     "\n",
407 |     "for name in job_gugun.index:\n",
408 |     "    plt.text(job_gugun['세대수대비고용'][name]*1.01, job_gugun['세대수'][name], name)\n",
409 |     "    \n",
410 |     "plt.colorbar()\n",
411 |     "plt.show()"
412 |    ]
413 |   }
414 |  ],
415 |  "metadata": {
416 |   "kernelspec": {
417 |    "display_name": "Python 3",
418 |    "language": "python",
419 |    "name": "python3"
420 |   },
421 |   "language_info": {
422 |    "codemirror_mode": {
423 |     "name": "ipython",
424 |     "version": 3
425 |    },
426 |    "file_extension": ".py",
427 |    "mimetype": "text/x-python",
428 |    "name": "python",
429 |    "nbconvert_exporter": "python",
430 |    "pygments_lexer": "ipython3",
431 |    "version": "3.6.4"
432 |   }
433 |  },
434 |  "nbformat": 4,
435 |  "nbformat_minor": 2
436 | }
437 | 


--------------------------------------------------------------------------------
/5. 지도와 부동산 (개정).ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "code",
  5 |    "execution_count": null,
  6 |    "metadata": {},
  7 |    "outputs": [],
  8 |    "source": [
  9 |     "# [Example 5.1] Importing the folium module and creating a map centered on Seoul City Hall\n",
 10 |     "\n",
 11 |     "import folium\n",
 12 |     "\n",
 13 |     "seoul_map = folium.Map(location=[37.5614378,126.9751701])"
 14 |    ]
 15 |   },
 16 |   {
 17 |    "cell_type": "code",
 18 |    "execution_count": null,
 19 |    "metadata": {},
 20 |    "outputs": [],
 21 |    "source": [
 22 |     "# [Example 5.2] Creating the map zoomed in\n",
 23 |     "\n",
 24 |     "seoul_map = folium.Map(location=[37.5614378,126.9751701], zoom_start=14)"
 25 |    ]
 26 |   },
 27 |   {
 28 |    "cell_type": "code",
 29 |    "execution_count": null,
 30 |    "metadata": {},
 31 |    "outputs": [],
 32 |    "source": [
 33 |     "# [Example 5.3] Marking a specific location on the map\n",
 34 |     "\n",
 35 |     "folium.Marker([37.5614378,126.9751701], popup='서울시청').add_to(seoul_map)"
 36 |    ]
 37 |   },
 38 |   {
 39 |    "cell_type": "code",
 40 |    "execution_count": null,
 41 |    "metadata": {},
 42 |    "outputs": [],
 43 |    "source": [
 44 |     "# [Example 5.4] Loading school information and organizing it into a DataFrame\n",
 45 |     "\n",
 46 |     "import pandas as pd\n",
 47 |     "\n",
 48 |     "#school_path = r'C:\\Users\\JK\\Desktop\\부동산 데이터\\2018년_공시대상학교정보(전체).xlsx'\n",
 49 |     "school_path = r' 학교정보 데이터 디렉터리 \\ 학교정보 데이터 엑셀파일명.xlsx'\n",
 50 |     "school_raw = pd.read_excel(school_path)\n",
 51 |     "school_data = school_raw[['정보공시 \\n 학교코드', '학교명', '위도', '경도' ]]\n",
 52 |     "school_data.columns = ['학교코드', '학교명','위도', '경도']\n",
 53 |     "school_df = school_data.drop_duplicates()"
 54 |    ]
 55 |   },
 56 |   {
 57 |    "cell_type": "code",
 58 |    "execution_count": null,
 59 |    "metadata": {},
 60 |    "outputs": [],
 61 |    "source": [
 62 |     "school_data"
 63 |    ]
 64 |   },
 65 |   {
 66 |    "cell_type": "code",
 67 |    "execution_count": null,
 68 |    "metadata": {},
 69 |    "outputs": [],
 70 |    "source": [
 71 |     "# [Example 5.5] Bringing in the graduate_preprocrssing function and building the DataFrame\n",
 72 |     "\n",
 73 |     "def graduate_preprocrssing(path):\n",
 74 |     "    raw_graduate = pd.read_excel(path, sheet_name='2018_졸업생의 진로 현황(중)')\n",
 75 |     "    select_col = raw_graduate[['지역', '학교명','정보공시 \\n 학교코드', '졸업자.2', '(특수목적고)과학고 진학자.2','(특수목적고)외고ㆍ국제고 진학자.2']]\n",
 76 |     "    select_col.columns = ['지역', '학교명', '학교코드', '졸업자', '과고', '외고']\n",
 77 |     "    graduate_data = select_col.drop(0)\n",
 78 |     "    graduate_data['과고'] = pd.to_numeric(graduate_data['과고'])\n",
 79 |     "    graduate_data['외고'] =  pd.to_numeric(graduate_data['외고']) \n",
 80 |     "    graduate_data['졸업자'] =  pd.to_numeric(graduate_data['졸업자']) \n",
 81 |     "    graduate_data['총합'] = graduate_data['과고'] + graduate_data['외고']\n",
 82 |     "    \n",
 83 |     "    def get_sido(x):\n",
 84 |     "        temp = x.split(' ')[0]\n",
 85 |     "        if len(temp) != 4:\n",
 86 |     "            return temp[:2]\n",
 87 |     "        else:\n",
 88 |     "            return temp[0] + temp[2]\n",
 89 |     "    \n",
 90 |     "    graduate_data['시도'] = graduate_data['지역'].dropna().apply(get_sido)\n",
 91 |     "    graduate_data['구군'] = graduate_data['지역'].dropna().apply(lambda x: x.split(' ')[1])\n",
 92 |     "    \n",
 93 |     "    graduate_data.at[588,'시도'] = '부산'\n",
 94 |     "    graduate_data.at[588,'구군'] = '기장군'\n",
 95 |     "    graduate_data.at[3011,'시도'] = '경북'\n",
 96 |     "    graduate_data.at[3011,'구군'] = '예천군'\n",
 97 |     "    \n",
 98 |     "    return graduate_data\n",
 99 |     "\n",
100 |     "graduate_path = r' 졸업생 진로 데이터 엑셀 파일 디렉터리 \\ 졸업생 진로 데이터 엑셀 파일명.xlsx'\n",
101 |     "gradu_df = graduate_preprocrssing(graduate_path)"
102 |    ]
103 |   },
104 |   {
105 |    "cell_type": "code",
106 |    "execution_count": null,
107 |    "metadata": {},
108 |    "outputs": [],
109 |    "source": [
110 |     "# [Example 5.6] Merging the DataFrames on the school code\n",
111 |     "\n",
112 |     "total_school_df = pd.merge(gradu_df, school_df, how='inner', right_on='학교코드', left_on='학교코드')"
113 |    ]
114 |   },
115 |   {
116 |    "cell_type": "code",
117 |    "execution_count": null,
118 |    "metadata": {},
119 |    "outputs": [],
120 |    "source": [
121 |     "# [Example 5.7] Marking every Seoul middle school that sent students to special-purpose high schools\n",
122 |     "\n",
123 |     "seoul_school = total_school_df[total_school_df['시도'] == '서울'].copy()\n",
124 |     "good_school = seoul_school[seoul_school['총합'] > 0]\n",
125 |     "seoul_map = folium.Map(location=[37.5614378,126.9751701], zoom_start=11)\n",
126 |     "\n",
127 |     "for n in good_school.index:\n",
128 |     "    folium.Marker([ good_school['위도'][n], good_school['경도'][n] ], popup=good_school['학교명_x'][n]).add_to(seoul_map)"
129 |    ]
130 |   },
131 |   {
132 |    "cell_type": "code",
133 |    "execution_count": null,
134 |    "metadata": {},
135 |    "outputs": [],
136 |    "source": [
137 |     "# [Example 5.8] Comparing Seoul schools with high and low special-purpose high school admission rates\n",
138 |     "\n",
139 |     "seoul_school['비율'] = seoul_school['총합'] /seoul_school['졸업자'] * 100\n",
140 |     "good_school = seoul_school[seoul_school['비율'] >= 3]\n",
141 |     "bad_school = seoul_school[seoul_school['비율'] < 3]\n",
142 |     "seoul_map = folium.Map(location=[37.5614378,126.9751701], zoom_start=11)\n",
143 |     "\n",
144 |     "for n in good_school.index:\n",
145 |     "    folium.CircleMarker([ good_school['위도'][n], good_school['경도'][n] ], color='crimson',fill_color='crimson', radius=7).add_to(seoul_map)\n",
146 |     "\n",
147 |     "for n in bad_school.index:\n",
148 |     "    folium.CircleMarker([ bad_school['위도'][n], bad_school['경도'][n] ], color='#3186cc',fill_color='#3186cc',  radius=7).add_to(seoul_map)"
149 |    ]
150 |   },
151 |   {
152 |    "cell_type": "code",
153 |    "execution_count": null,
154 |    "metadata": {},
155 |    "outputs": [],
156 |    "source": [
157 |     "# [Example 5.9] Comparing Daejeon schools with high and low special-purpose high school admission rates\n",
158 |     "\n",
159 |     "sido_school = total_school_df[total_school_df['시도'] == '대전'].copy()\n",
160 |     "sido_school['비율'] = sido_school['총합'] /sido_school['졸업자'] * 100\n",
161 |     "good_school = sido_school[sido_school['비율'] >= 3]\n",
162 |     "bad_school = sido_school[sido_school['비율'] < 3]\n",
163 |     "sido_map = folium.Map(location=[36.350461,127.38263], zoom_start=11)\n",
164 |     "\n",
165 |     "for n in good_school.index:\n",
166 |     "    folium.CircleMarker([ good_school['위도'][n], good_school['경도'][n] ], color='crimson',fill_color='crimson', radius=7).add_to(sido_map )\n",
167 |     "\n",
168 |     "for n in bad_school.index:\n",
169 |     "    folium.CircleMarker([ bad_school['위도'][n], bad_school['경도'][n] ], color='#3186cc',fill_color='#3186cc',  radius=7).add_to(sido_map )"
170 |    ]
171 |   },
172 |   {
173 |    "cell_type": "code",
174 |    "execution_count": null,
175 |    "metadata": {},
176 |    "outputs": [],
177 |    "source": [
178 |     "# [Example 5.10] Cleaning up the legal-dong (법정동) codes\n",
179 |     "\n",
180 |     "# local_code = pd.read_excel(r'C:\\Users\\JK\\Desktop\\부동산 데이터\\KIKcd_B.20190101.xlsx')\n",
181 |     "local_code = pd.read_excel(r' 법정동 코드 데이터 디렉터리 \\ 법정동 코드 엑셀 파일명.xlsx')\n",
182 |     "local_code['법정동코드'] = local_code['법정동코드'].apply(lambda x: str(x)[:5])\n",
183 |     "filtered_code = local_code[['법정동코드', '시도명', '시군구명']].drop_duplicates()\n",
184 |     "filtered_code.dropna(inplace=True)\n",
185 |     "filtered_code.loc[20477] = ('36110', '세종', '세종')"
186 |    ]
187 |   },
188 |   {
189 |    "cell_type": "code",
190 |    "execution_count": null,
191 |    "metadata": {
192 |     "scrolled": false
193 |    },
194 |    "outputs": [],
195 |    "source": [
196 |     "# [Example 5.11] Shortening the province names in the legal-dong codes\n",
197 |     "\n",
198 |     "def get_sido(x):\n",
199 |     "    temp = x.split(' ')[0]\n",
200 |     "    if len(temp) != 4:\n",
201 |     "        return temp[:2]\n",
202 |     "    else:\n",
203 |     "        return temp[0] + temp[2]\n",
204 |     "    \n",
205 |     "filtered_code['시도명'] = filtered_code['시도명'].dropna().apply(get_sido)\n",
206 |     "filtered_code['시군구명'] = filtered_code['시군구명'].dropna().apply(lambda x : x.split(' ')[-1])"
207 |    ]
208 |   },
209 |   {
210 |    "cell_type": "code",
211 |    "execution_count": null,
212 |    "metadata": {},
213 |    "outputs": [],
214 |    "source": [
215 |     "# [Example 5.12] Loading the sale price index from the KB real estate data into a DataFrame\n",
216 |     "\n",
217 |     "import xlwings as xw\n",
218 |     "\n",
219 |     "def KBpriceindex_preprocessing(path, data_type):\n",
220 |     "    # path : path to the KB data Excel file (string)\n",
221 |     "    # data_type : one of '매매종합', '매매APT', '매매연립', '매매단독', '전세종합', '전세APT', '전세연립', '전세단독'\n",
222 |     "    \n",
223 |     "    wb = xw.Book(path)                \n",
224 |     "    sheet = wb.sheets[data_type]   \n",
225 |     "    row_num = sheet.range(1,1).end('down').end('down').end('down').row  \n",
226 |     "    data_range = 'A2:GE' + str(row_num)\n",
227 |     "    raw_data = sheet[data_range].options(pd.DataFrame, index=False, header=True).value \n",
228 |     "    \n",
229 |     "    bignames = '서울 대구 부산 대전 광주 인천 울산 세종 경기 강원 충북 충남 전북 전남 경북 경남 제주도 6개광역시 5개광역시 수도권 기타지방 구분 전국'\n",
230 |     "    bigname_list = bignames.split(' ')\n",
231 |     "    big_col = list(raw_data.columns)\n",
232 |     "    small_col = list(raw_data.iloc[0])\n",
233 |     "\n",
234 |     "    for num, gu_data in enumerate(small_col):\n",
235 |     "        if gu_data is None:\n",
236 |     "            small_col[num] = big_col[num]\n",
237 |     "\n",
238 |     "        check = num\n",
239 |     "        while True:\n",
240 |     "            if big_col[check] in bigname_list:\n",
241 |     "                big_col[num] = big_col[check]\n",
242 |     "                break\n",
243 |     "            else:\n",
244 |     "                check = check - 1\n",
245 |     "                \n",
246 |     "    big_col[129] = '경기' \n",
247 |     "    big_col[130] = '경기'\n",
248 |     "    small_col[185] = '서귀포'\n",
249 |     "    \n",
250 |     "    raw_data.columns = [big_col, small_col]\n",
251 |     "    new_col_data = raw_data.drop([0,1])\n",
252 |     "    \n",
253 |     "    index_list = list(new_col_data['구분']['구분'])\n",
254 |     "\n",
255 |     "    new_index = []\n",
256 |     "\n",
257 |     "    for num, raw_index in enumerate(index_list):\n",
258 |     "        temp = str(raw_index).split('.')\n",
259 |     "        if int(temp[0]) > 12 :\n",
260 |     "            if len(temp[0]) == 2:\n",
261 |     "                new_index.append('19' + temp[0] + '.' + temp[1])\n",
262 |     "            else:\n",
263 |     "                new_index.append(temp[0] + '.' + temp[1])\n",
264 |     "        else:\n",
265 |     "            new_index.append(new_index[num-1].split('.')[0] + '.' + temp[0])\n",
266 |     "\n",
267 |     "    new_col_data.set_index(pd.to_datetime(new_index), inplace=True)\n",
268 |     "    cleaned_data  = new_col_data.drop(('구분', '구분'), axis=1)\n",
269 |     "    return cleaned_data\n",
270 |     "\n",
271 |     "\n",
272 |     "path = r' directory of the KB Excel file you downloaded \\ KB Excel file name.xls'\n",
273 |     "price_index = KBpriceindex_preprocessing(path, '매매APT')"
274 |    ]
275 |   },
276 |   {
277 |    "cell_type": "code",
278 |    "execution_count": null,
279 |    "metadata": {},
280 |    "outputs": [],
281 |    "source": [
282 |     "# [Example 5.13] Computing the percent change of the sale price index and storing it in a DataFrame\n",
283 |     "\n",
284 |     "diff_pct = ((price_index.loc['2018-6-1']/price_index.loc['2017-6-1']) - 1) * 100\n",
285 |     "diff_df = pd.DataFrame({'증감률':diff_pct})"
286 |    ]
287 |   },
288 |   {
289 |    "cell_type": "code",
290 |    "execution_count": null,
291 |    "metadata": {},
292 |    "outputs": [],
293 |    "source": [
294 |     "diff_df"
295 |    ]
296 |   },
297 |   {
298 |    "cell_type": "code",
299 |    "execution_count": null,
300 |    "metadata": {},
301 |    "outputs": [],
302 |    "source": [
303 |     "# [Example 5.14] Adding legal-dong codes to the percent-change DataFrame\n",
304 |     "\n",
305 |     "filtered_code.index = [filtered_code['시도명'], filtered_code['시군구명']]\n",
306 |     "\n",
307 |     "code = []\n",
308 |     "for local in diff_df.index:\n",
309 |     "    if local[0] in filtered_code.index:\n",
310 |     "        temp_df = filtered_code.loc[local[0]]\n",
311 |     "        if local[1] in temp_df.index:\n",
312 |     "            code.append(temp_df.loc[local[1]]['법정동코드'])\n",
313 |     "        elif local[1] + '시' in temp_df.index:\n",
314 |     "            code.append(temp_df.loc[local[1] + '시']['법정동코드'])\n",
315 |     "        elif local[1] == '세종':\n",
316 |     "            code.append('36110')\n",
317 |     "        else:\n",
318 |     "            code.append('')\n",
319 |     "    else:\n",
320 |     "        code.append('')\n",
321 |     "        \n",
322 |     "diff_df['법정동코드'] = code"
323 |    ]
324 |   },
325 |   {
326 |    "cell_type": "code",
327 |    "execution_count": null,
328 |    "metadata": {},
329 |    "outputs": [],
330 |    "source": [
331 |     "# [Example 5.15] Drawing a choropleth map of the sale price index percent change with folium\n",
332 |     "\n",
333 |     "import json\n",
334 |     "\n",
335 |     "# rfile = open(r'C:\\Users\\JK\\Desktop\\부동산 데이터\\TL_SCCO_SIG_WGS84.json', 'r', encoding='utf-8').read()\n",
336 |     "with open(r' JSON file directory \\TL_SCCO_SIG_WGS84.json', 'r', encoding='utf-8') as rfile:\n",
337 |     "    jsonData = json.load(rfile)\n",
338 |     "    \n",
339 |     "korea_map = folium.Map(location=[36, 127], zoom_start=7)\n",
340 |     "    \n",
341 |     "folium.Choropleth(\n",
342 |     "    geo_data=jsonData,\n",
343 |     "    data=diff_df,\n",
344 |     "    columns=['법정동코드', '증감률'],\n",
345 |     "    key_on='feature.properties.SIG_CD',\n",
346 |     "    fill_color='RdYlGn',\n",
347 |     "    fill_opacity=0.7,\n",
348 |     "    line_opacity=0.5,\n",
349 |     "    legend_name='증감률(%)'\n",
350 |     ").add_to(korea_map)"
351 |    ]
352 |   },
353 |   {
354 |    "cell_type": "code",
355 |    "execution_count": null,
356 |    "metadata": {},
357 |    "outputs": [],
358 |    "source": [
359 |     "# Saving the generated map\n",
360 |     "\n",
361 |     "korea_map.save(r'output directory\\output file name.html')"
362 |    ]
363 |   }
364 |  ],
365 |  "metadata": {
366 |   "kernelspec": {
367 |    "display_name": "Python 3",
368 |    "language": "python",
369 |    "name": "python3"
370 |   },
371 |   "language_info": {
372 |    "codemirror_mode": {
373 |     "name": "ipython",
374 |     "version": 3
375 |    },
376 |    "file_extension": ".py",
377 |    "mimetype": "text/x-python",
378 |    "name": "python",
379 |    "nbconvert_exporter": "python",
380 |    "pygments_lexer": "ipython3",
381 |    "version": "3.6.4"
382 |   }
383 |  },
384 |  "nbformat": 4,
385 |  "nbformat_minor": 2
386 | }
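Example 5.11 in the notebook above shortens full 시도 names to two-character abbreviations. The snippet below is a minimal standalone sketch of that rule, reusing the notebook's `get_sido` logic. The sample names are illustrative inputs chosen here, not rows read from the legal-dong Excel file.

```python
def get_sido(name):
    """Shorten a full 시도 name to a two-character abbreviation."""
    first = name.split(' ')[0]
    if len(first) != 4:
        # e.g. '서울특별시' or '경기도': keep the first two characters
        return first[:2]
    # four-character names such as '충청북도': take the 1st and 3rd characters
    return first[0] + first[2]


# Illustrative inputs (assumed, not taken from the data file)
samples = ['서울특별시', '경기도', '충청북도', '경상남도', '세종특별자치시']
short_names = [get_sido(s) for s in samples]
print(short_names)
```

The four-character branch is what lets names such as 충청북도 and 충청남도 map to different abbreviations instead of both collapsing to their first two characters.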
387 | 
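Example 5.12 in the notebook above rebuilds the date index of the KB sheet, where only some rows carry a year and the rest carry just a month. The following is a minimal sketch of that step in isolation, under the assumption that the labels arrive as strings like the ones below. `normalize_kb_index` and the sample labels are hypothetical names and values introduced here for illustration.

```python
def normalize_kb_index(raw_labels):
    """Expand 'YY.M' / 'YYYY.M' / 'M' labels into 'YYYY.M' strings."""
    new_index = []
    for num, raw in enumerate(raw_labels):
        parts = str(raw).split('.')
        if int(parts[0]) > 12:
            # The label carries a year; two-digit years are treated as 19YY
            year = '19' + parts[0] if len(parts[0]) == 2 else parts[0]
            new_index.append(year + '.' + parts[1])
        else:
            # Month-only label: reuse the year of the previous entry
            new_index.append(new_index[num - 1].split('.')[0] + '.' + parts[0])
    return new_index


# Assumed sample labels, not read from the KB Excel file
labels = ['86.1', '2', '3', '2000.1', '2']
normalized = normalize_kb_index(labels)
print(normalized)
```

As in the notebook, the first label must carry a year, because a month-only label has no earlier entry to borrow the year from.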


--------------------------------------------------------------------------------
/데이터/2018년 2차_졸업생의 진로 현황(전체).xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/2018년 2차_졸업생의 진로 현황(전체).xlsx


--------------------------------------------------------------------------------
/데이터/2018년 시도별 행정구역별 사설학원 현황.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/2018년 시도별 행정구역별 사설학원 현황.xlsx


--------------------------------------------------------------------------------
/데이터/2018년_공시대상학교정보(전체).xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/2018년_공시대상학교정보(전체).xlsx


--------------------------------------------------------------------------------
/데이터/KIKcd_B.20190101.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/KIKcd_B.20190101.xlsx


--------------------------------------------------------------------------------
/데이터/SIG_201804.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/SIG_201804.zip


--------------------------------------------------------------------------------
/데이터/XrProjection 변환결과/TL_SCCO_SIG_WGS84.shp:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/XrProjection 변환결과/TL_SCCO_SIG_WGS84.shp


--------------------------------------------------------------------------------
/데이터/XrProjection 변환결과/TL_SCCO_SIG_WGS84.shx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/XrProjection 변환결과/TL_SCCO_SIG_WGS84.shx


--------------------------------------------------------------------------------
/데이터/XrProjection 변환결과/tl_scco_sig_wgs84.dbf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/XrProjection 변환결과/tl_scco_sig_wgs84.dbf


--------------------------------------------------------------------------------
/데이터/XrProjection 설치파일/setup.exe:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/XrProjection 설치파일/setup.exe


--------------------------------------------------------------------------------
/데이터/★(월간)KB주택가격동향_시계열(2019.01)12831994601335062.xls:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/★(월간)KB주택가격동향_시계열(2019.01)12831994601335062.xls


--------------------------------------------------------------------------------
/데이터/시·군·구별+미분양현황_2082_128_20181229151931.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/시·군·구별+미분양현황_2082_128_20181229151931.xlsx


--------------------------------------------------------------------------------
/데이터/주택건설인허가실적.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/주택건설인허가실적.xlsx


--------------------------------------------------------------------------------
/데이터/평균매매가격_아파트.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/평균매매가격_아파트.xlsx


--------------------------------------------------------------------------------
/데이터/평균전세가격_아파트.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/평균전세가격_아파트.xlsx


--------------------------------------------------------------------------------
/데이터/행정구역_시군구_별_주민등록세대수_20190107134842.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/행정구역_시군구_별_주민등록세대수_20190107134842.xlsx


--------------------------------------------------------------------------------
/데이터/행정구역_시도_별_1인당_지역내총생산__지역총소득__개인소득_20180821155737.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/행정구역_시도_별_1인당_지역내총생산__지역총소득__개인소득_20180821155737.xlsx


--------------------------------------------------------------------------------
/데이터/행정구역_시도_별_1인당_지역내총생산__지역총소득__개인소득_20190310191045.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wikibook/python-real-estate/788cf7064b468b05e91b537fb422ce4857be86b1/데이터/행정구역_시도_별_1인당_지역내총생산__지역총소득__개인소득_20190310191045.xlsx


--------------------------------------------------------------------------------