├── Data wrangling with Python 1(Pandas).ipynb
├── Data wrangling with Python 2(Regular expression).ipynb
├── Data wrangling with Python 3(Pandas).ipynb
├── Data wrangling with Python 4(Pandas).ipynb
├── Download raw contents and modify Request Headers.ipynb
├── Python_basic1.ipynb
├── Python_basic2.ipynb
├── Python_basic3.ipynb
├── Scrapping static webpage.ipynb
├── Scrapping text mining papers in arXiv.py
├── Selenium.ipynb
├── Sending an email with SMTP.ipynb
├── Static webpage and Dynamic webpage.ipynb
├── data
    ├── sample_mp3.zip
    └── univs_2014.xlsx
└── os_shutil.ipynb


/Data wrangling with Python 2(Regular expression).ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Regular Expression (regex)\n",
  8 |     "_This material is based on the lecture notes from instructor 안수찬's Work Automation with Python Camp (fast campus)._  \n",
  9 |     "Author: 김보섭  "
 10 |    ]
 11 |   },
 12 |   {
 13 |    "cell_type": "markdown",
 14 |    "metadata": {},
 15 |    "source": [
 16 |     "* Regular Expression (regex)\n",
 17 |     "    - Finds text that matches a specific pattern \n",
 18 |     "    - To learn more, see: https://wikidocs.net/1669 (Jump to Python, regular expression chapter)"
 19 |    ]
 20 |   },
 21 |   {
 22 |    "cell_type": "markdown",
 23 |    "metadata": {},
 24 |    "source": [
 25 |     "### Basic"
 26 |    ]
 27 |   },
 28 |   {
 29 |    "cell_type": "code",
 30 |    "execution_count": 1,
 31 |    "metadata": {
 32 |     "collapsed": true
 33 |    },
 34 |    "outputs": [],
 35 |    "source": [
 36 |     "text = '''저는 4형제가 있습니다.\\\n",
 37 |     " 안수찬, 안성찬, 안서찬, 안순찬 4명의 형제가 사이좋게 지내고 있습니다. 첫째는 안수찬입니다.'''"
 38 |    ]
 39 |   },
 40 |   {
 41 |    "cell_type": "code",
 42 |    "execution_count": 2,
 43 |    "metadata": {},
 44 |    "outputs": [
 45 |     {
 46 |      "name": "stdout",
 47 |      "output_type": "stream",
 48 |      "text": [
 49 |       "저는 4형제가 있습니다. 안수찬, 안성찬, 안서찬, 안순찬 4명의 형제가 사이좋게 지내고 있습니다. 첫째는 안수찬입니다.\n"
 50 |      ]
 51 |     }
 52 |    ],
 53 |    "source": [
 54 |     "print(text)"
 55 |    ]
 56 |   },
 57 |   {
 58 |    "cell_type": "code",
 59 |    "execution_count": 3,
 60 |    "metadata": {},
 61 |    "outputs": [
 62 |     {
 63 |      "data": {
 64 |       "text/plain": [
 65 |        "'안\\\\w찬'"
 66 |       ]
 67 |      },
 68 |      "execution_count": 3,
 69 |      "metadata": {},
 70 |      "output_type": "execute_result"
 71 |     }
 72 |    ],
 73 |    "source": [
 74 |     "# 안*찬\n",
 75 |     "# ['안수찬', '안성찬', '안서찬', '안순찬']\n",
 76 |     "r'안\\w찬' # \\w => one word character\n",
 77 |     "          # \\d => one digit\n",
 78 |     "          # \\s => one whitespace character"
 79 |    ]
 80 |   },
 81 |   {
 82 |    "cell_type": "code",
 83 |    "execution_count": 4,
 84 |    "metadata": {
 85 |     "collapsed": true
 86 |    },
 87 |    "outputs": [],
 88 |    "source": [
 89 |     "import re\n",
 90 |     "brother_pattern = re.compile(r'안\\w찬')"
 91 |    ]
 92 |   },
 93 |   {
 94 |    "cell_type": "code",
 95 |    "execution_count": 5,
 96 |    "metadata": {},
 97 |    "outputs": [
 98 |     {
 99 |      "data": {
100 |       "text/plain": [
101 |        "re.compile(r'안\\w찬', re.UNICODE)"
102 |       ]
103 |      },
104 |      "execution_count": 5,
105 |      "metadata": {},
106 |      "output_type": "execute_result"
107 |     }
108 |    ],
109 |    "source": [
110 |     "brother_pattern"
111 |    ]
112 |   },
113 |   {
114 |    "cell_type": "code",
115 |    "execution_count": 6,
116 |    "metadata": {},
117 |    "outputs": [
118 |     {
119 |      "data": {
120 |       "text/plain": [
121 |        "['안수찬', '안성찬', '안서찬', '안순찬', '안수찬']"
122 |       ]
123 |      },
124 |      "execution_count": 6,
125 |      "metadata": {},
126 |      "output_type": "execute_result"
127 |     }
128 |    ],
129 |    "source": [
130 |     "brothers = brother_pattern.findall(text)\n",
131 |     "brothers"
132 |    ]
133 |   },
134 |   {
135 |    "cell_type": "code",
136 |    "execution_count": 7,
137 |    "metadata": {},
138 |    "outputs": [
139 |     {
140 |      "data": {
141 |       "text/plain": [
142 |        "['안성찬', '안순찬', '안수찬', '안서찬']"
143 |       ]
144 |      },
145 |      "execution_count": 7,
146 |      "metadata": {},
147 |      "output_type": "execute_result"
148 |     }
149 |    ],
150 |    "source": [
151 |     "list(set(brothers))"
152 |    ]
153 |   },
154 |   {
155 |    "cell_type": "code",
156 |    "execution_count": 8,
157 |    "metadata": {},
158 |    "outputs": [
159 |     {
160 |      "data": {
161 |       "text/plain": [
162 |        "['123', '100']"
163 |       ]
164 |      },
165 |      "execution_count": 8,
166 |      "metadata": {},
167 |      "output_type": "execute_result"
168 |     }
169 |    ],
170 |    "source": [
171 |     "# Extract 3-digit numbers\n",
172 |     "text = '123 에서 100 을 빼면 23 이다.'\n",
173 |     "numbers_pattern = re.compile(r'\\d\\d\\d')\n",
174 |     "numbers_pattern.findall(text)"
175 |    ]
176 |   },
177 |   {
178 |    "cell_type": "code",
179 |    "execution_count": 9,
180 |    "metadata": {},
181 |    "outputs": [
182 |     {
183 |      "data": {
184 |       "text/plain": [
185 |        "['123', '100']"
186 |       ]
187 |      },
188 |      "execution_count": 9,
189 |      "metadata": {},
190 |      "output_type": "execute_result"
191 |     }
192 |    ],
193 |    "source": [
194 |     "numbers_pattern = re.compile(r'\\d{3}') # a token matches once by default; {n} matches it exactly n times\n",
195 |     "numbers_pattern.findall(text)"
196 |    ]
197 |   },
198 |   {
199 |    "cell_type": "code",
200 |    "execution_count": 10,
201 |    "metadata": {},
202 |    "outputs": [
203 |     {
204 |      "data": {
205 |       "text/plain": [
206 |        "['123', '100', '23']"
207 |       ]
208 |      },
209 |      "execution_count": 10,
210 |      "metadata": {},
211 |      "output_type": "execute_result"
212 |     }
213 |    ],
214 |    "source": [
215 |     "# Extract 2- and 3-digit numbers\n",
216 |     "numbers_pattern = re.compile(r'\\d{2,3}') # {m,n} -> between m and n repetitions\n",
217 |     "numbers_pattern.findall(text)"
218 |    ]
219 |   },
220 |   {
221 |    "cell_type": "markdown",
222 |    "metadata": {},
223 |    "source": [
224 |     "### Special characters \n",
225 |     "* \"*\" : zero or more repetitions\n",
226 |     "* \"+\" : one or more repetitions\n",
227 |     "* \"?\" : zero or one occurrence"
228 |    ]
229 |   },
230 |   {
231 |    "cell_type": "code",
232 |    "execution_count": 11,
233 |    "metadata": {
234 |     "collapsed": true
235 |    },
236 |    "outputs": [],
237 |    "source": [
238 |     "# abc, abbc, abbbc, abbbbc, ac\n",
239 |     "text = 'this is abc family : abc, dbe, abbc, abbbc, fdsa, abbbbc, ac'"
240 |    ]
241 |   },
242 |   {
243 |    "cell_type": "code",
244 |    "execution_count": 12,
245 |    "metadata": {},
246 |    "outputs": [
247 |     {
248 |      "data": {
249 |       "text/plain": [
250 |        "['abc', 'abc', 'abbc', 'abbbc', 'abbbbc', 'ac']"
251 |       ]
252 |      },
253 |      "execution_count": 12,
254 |      "metadata": {},
255 |      "output_type": "execute_result"
256 |     }
257 |    ],
258 |    "source": [
259 |     "abc_pattern = re.compile('ab*c')\n",
260 |     "abc_pattern.findall(text)"
261 |    ]
262 |   },
263 |   {
264 |    "cell_type": "code",
265 |    "execution_count": 13,
266 |    "metadata": {},
267 |    "outputs": [
268 |     {
269 |      "data": {
270 |       "text/plain": [
271 |        "['happly', 'butterfly', 'soundly']"
272 |       ]
273 |      },
274 |      "execution_count": 13,
275 |      "metadata": {},
276 |      "output_type": "execute_result"
277 |     }
278 |    ],
279 |    "source": [
280 |     "# Regex that extracts every word ending in ly\n",
281 |     "# butterfly, dragonfly, soundly, happly...\n",
282 |     "text = 'I am happly butterfly, on soundly something'\n",
283 |     "ly_pattern = re.compile(r'\\w+ly')\n",
284 |     "ly_pattern.findall(text)"
285 |    ]
286 |   },
287 |   {
288 |    "cell_type": "code",
289 |    "execution_count": 14,
290 |    "metadata": {},
291 |    "outputs": [
292 |     {
293 |      "data": {
294 |       "text/plain": [
295 |        "['happly', 'butterfly', 'soundly']"
296 |       ]
297 |      },
298 |      "execution_count": 14,
299 |      "metadata": {},
300 |      "output_type": "execute_result"
301 |     }
302 |    ],
303 |    "source": [
304 |     "re.findall(r'\\w+ly', text)"
305 |    ]
306 |   },
307 |   {
308 |    "cell_type": "markdown",
309 |    "metadata": {},
310 |    "source": [
311 |     "### Example: extracting phone numbers "
312 |    ]
313 |   },
314 |   {
315 |    "cell_type": "code",
316 |    "execution_count": 15,
317 |    "metadata": {
318 |     "collapsed": true
319 |    },
320 |    "outputs": [],
321 |    "source": [
322 |     "# **-****-****\n",
323 |     "# ***-****\n",
324 |     "# ***-****-****"
325 |    ]
326 |   },
327 |   {
328 |    "cell_type": "code",
329 |    "execution_count": 16,
330 |    "metadata": {},
331 |    "outputs": [],
332 |    "source": [
333 |     "with open('./phonenumbers.txt', 'r', encoding = 'utf8') as io:\n",
334 |     "    data = io.read()"
335 |    ]
336 |   },
337 |   {
338 |    "cell_type": "code",
339 |    "execution_count": 17,
340 |    "metadata": {},
341 |    "outputs": [
342 |     {
343 |      "data": {
344 |       "text/plain": [
345 |        "'01022205736\\n한글\\n010-2220-5736\\n이건 뽑으면 안되는 값\\n02-2220-5736\\n12345\\n02.2220.5736\\n01434239'"
346 |       ]
347 |      },
348 |      "execution_count": 17,
349 |      "metadata": {},
350 |      "output_type": "execute_result"
351 |     }
352 |    ],
353 |    "source": [
354 |     "data"
355 |    ]
356 |   },
357 |   {
358 |    "cell_type": "code",
359 |    "execution_count": 18,
360 |    "metadata": {
361 |     "collapsed": true
362 |    },
363 |    "outputs": [],
364 |    "source": [
365 |     "io = open('./phonenumbers.txt', 'r')\n"
366 |    ]
367 |   },
368 |   {
369 |    "cell_type": "code",
370 |    "execution_count": 19,
371 |    "metadata": {
372 |     "collapsed": true
373 |    },
374 |    "outputs": [],
375 |    "source": [
376 |     "phonenumber_pattern = r'\\d{2,3}[-.]?\\d{4}[-.]?\\d{4}'\n",
377 |     "phonenumber_pattern = re.compile(phonenumber_pattern)"
378 |    ]
379 |   },
380 |   {
381 |    "cell_type": "code",
382 |    "execution_count": 20,
383 |    "metadata": {},
384 |    "outputs": [
385 |     {
386 |      "name": "stdout",
387 |      "output_type": "stream",
388 |      "text": [
389 |       "re.compile('\\\\d{2,3}[-.]?\\\\d{4}[-.]?\\\\d{4}')\n"
390 |      ]
391 |     },
392 |     {
393 |      "data": {
394 |       "text/plain": [
395 |        "['01022205736', '010-2220-5736', '02-2220-5736', '02.2220.5736']"
396 |       ]
397 |      },
398 |      "execution_count": 20,
399 |      "metadata": {},
400 |      "output_type": "execute_result"
401 |     }
402 |    ],
403 |    "source": [
404 |     "print(phonenumber_pattern)\n",
405 |     "result = phonenumber_pattern.findall(data) # returns a list\n",
406 |     "result"
407 |    ]
408 |   },
409 |   {
410 |    "cell_type": "code",
411 |    "execution_count": 21,
412 |    "metadata": {},
413 |    "outputs": [
414 |     {
415 |      "data": {
416 |       "text/plain": [
417 |        "['01022205736', '01022205736', '0222205736', '0222205736']"
418 |       ]
419 |      },
420 |      "execution_count": 21,
421 |      "metadata": {},
422 |      "output_type": "execute_result"
423 |     }
424 |    ],
425 |    "source": [
426 |     "# Strip the separators (- and .) from each match\n",
427 |     "list(map(lambda x : re.sub('[-.]', '', x), result))"
428 |    ]
429 |   },
430 |   {
431 |    "cell_type": "markdown",
432 |    "metadata": {},
433 |    "source": [
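434 |     "###  Grouping\n",
435 |     "Suppose we want a regular expression that checks whether the string ABC repeats consecutively. How would we write it? Wrapping the string in parentheses, as in `(ABC)+`, makes the quantifier apply to the whole group rather than to the last character."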
436 |    ]
437 |   },
438 |   {
439 |    "cell_type": "markdown",
440 |    "metadata": {},
441 |    "source": [
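442 |     "### Example: replacing only part of a string\n",
443 |     "Suppose the full phone numbers of event winners cannot be published, so the middle block is masked and the last four digits stay visible. In short, the transformation looks like this.  \n",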
444 |     " * 010-2220-5736 -> 010-****-5736\n",
445 |     " * 010.2220.5736 -> 010-****-5736\n",
446 |     " * 01022205736 -> 010-****-5736"
447 |    ]
448 |   },
449 |   {
450 |    "cell_type": "code",
451 |    "execution_count": 22,
452 |    "metadata": {
453 |     "collapsed": true
454 |    },
455 |    "outputs": [],
456 |    "source": [
457 |     "# example1 : \n",
458 |     "phonenumbers = '''\n",
459 |     "12숫자\n",
460 |     "123숫자\n",
461 |     "234숫자\n",
462 |     "김보섭\n",
463 |     "'''\n",
464 |     "\n",
465 |     "phonenumber_pattern = r'(?P<first>\\d{2,3})(?P<second>숫자)' # group parts of the pattern and give each group a name\n",
466 |     "phonenumber_pattern = re.compile(phonenumber_pattern)"
467 |    ]
468 |   },
469 |   {
470 |    "cell_type": "code",
471 |    "execution_count": 23,
472 |    "metadata": {},
473 |    "outputs": [
474 |     {
475 |      "data": {
476 |       "text/plain": [
477 |        "[('12', '숫자'), ('123', '숫자'), ('234', '숫자')]"
478 |       ]
479 |      },
480 |      "execution_count": 23,
481 |      "metadata": {},
482 |      "output_type": "execute_result"
483 |     }
484 |    ],
485 |    "source": [
486 |     "phonenumber_pattern.findall(phonenumbers)"
487 |    ]
488 |   },
489 |   {
490 |    "cell_type": "code",
491 |    "execution_count": 24,
492 |    "metadata": {},
493 |    "outputs": [
494 |     {
495 |      "data": {
496 |       "text/plain": [
497 |        "[('010', '-', '2220', '-', '5736'),\n",
498 |        " ('010', '', '2220', '', '5736'),\n",
499 |        " ('010', '.', '2220', '.', '5736')]"
500 |       ]
501 |      },
502 |      "execution_count": 24,
503 |      "metadata": {},
504 |      "output_type": "execute_result"
505 |     }
506 |    ],
507 |    "source": [
508 |     "# example2\n",
509 |     "# Example that creates groups with '()' without naming them\n",
510 |     "phonenumbers = '''\n",
511 |     "010-2220-5736\n",
512 |     "01022205736\n",
513 |     "010.2220.5736\n",
514 |     "'''\n",
515 |     "\n",
516 |     "phonenumber_pattern = r'(\\d{2,3})([.-]?)(\\d{3,4})([.-]?)(\\d{4})'\n",
517 |     "phonenumber_pattern = re.compile(phonenumber_pattern)\n",
518 |     "phonenumber_pattern.findall(phonenumbers)"
519 |    ]
520 |   },
521 |   {
522 |    "cell_type": "code",
523 |    "execution_count": 25,
524 |    "metadata": {},
525 |    "outputs": [
526 |     {
527 |      "data": {
528 |       "text/plain": [
529 |        "[('010', '-', '2220', '-', '5736'),\n",
530 |        " ('010', '', '2220', '', '5736'),\n",
531 |        " ('010', '.', '2220', '.', '5736')]"
532 |       ]
533 |      },
534 |      "execution_count": 25,
535 |      "metadata": {},
536 |      "output_type": "execute_result"
537 |     }
538 |    ],
539 |    "source": [
540 |     "# example3\n",
541 |     "# Example that gives each '()' group a name\n",
542 |     "phonenumbers = '''\n",
543 |     "010-2220-5736\n",
544 |     "01022205736\n",
545 |     "010.2220.5736\n",
546 |     "'''\n",
547 |     "\n",
548 |     "# In the pattern below, ?P<name> inside a '()' group gives the group a name; naming is optional.\n",
549 |     "# The benefit of naming groups is that each part of the match can be referenced by name and processed in one step\n",
550 |     "phonenumber_pattern = r'(?P<first>\\d{2,3})(?P<second>[.-]?)(?P<third>\\d{3,4})(?P<fourth>[.-]?)(?P<fifth>\\d{4})'\n",
551 |     "phonenumber_pattern = re.compile(phonenumber_pattern)\n",
552 |     "phonenumber_pattern.findall(phonenumbers)"
553 |    ]
554 |   },
555 |   {
556 |    "cell_type": "code",
557 |    "execution_count": 26,
558 |    "metadata": {},
559 |    "outputs": [
560 |     {
561 |      "data": {
562 |       "text/plain": [
563 |        "'\\n010-****-5736\\n010-****-5736\\n010-****-5736\\n'"
564 |       ]
565 |      },
566 |      "execution_count": 26,
567 |      "metadata": {},
568 |      "output_type": "execute_result"
569 |     }
570 |    ],
571 |    "source": [
572 |     "# \\g<first> : refers to the text matched by the group named (?P<first>...) in the compiled pattern\n",
573 |     "# \\g<fifth> : refers to the text matched by the group named (?P<fifth>...) in the compiled pattern\n",
574 |     "phonenumber_pattern.sub(r'\\g<first>-****-\\g<fifth>', phonenumbers)"
575 |    ]
576 |   },
577 |   {
578 |    "cell_type": "code",
579 |    "execution_count": 27,
580 |    "metadata": {},
581 |    "outputs": [
582 |     {
583 |      "name": "stdout",
584 |      "output_type": "stream",
585 |      "text": [
586 |       "\n",
587 |       "010-****-5736\n",
588 |       "010-****-5736\n",
589 |       "010-****-5736\n",
590 |       "\n"
591 |      ]
592 |     }
593 |    ],
594 |    "source": [
595 |     "print(phonenumber_pattern.sub(r'\\g<first>-****-\\g<fifth>', phonenumbers))"
596 |    ]
597 |   },
598 |   {
599 |    "cell_type": "code",
600 |    "execution_count": 28,
601 |    "metadata": {
602 |     "collapsed": true
603 |    },
604 |    "outputs": [],
605 |    "source": [
606 |     "# example4\n",
607 |     "text = '''\n",
608 |     "안수찬 900223-1234567\n",
609 |     "안성찬 910223-1234789\n",
610 |     "안서찬 910224-1234098\n",
611 |     "'''"
612 |    ]
613 |   },
614 |   {
615 |    "cell_type": "code",
616 |    "execution_count": 29,
617 |    "metadata": {
618 |     "collapsed": true
619 |    },
620 |    "outputs": [],
621 |    "source": [
622 |     "pattern = re.compile(r'(?P<name>\\w+) (?P<birth>\\d{6})-(?P<secret>\\d{7})')"
623 |    ]
624 |   },
625 |   {
626 |    "cell_type": "code",
627 |    "execution_count": 30,
628 |    "metadata": {},
629 |    "outputs": [
630 |     {
631 |      "data": {
632 |       "text/plain": [
633 |        "'\\n안수찬(900223-*******)\\n안성찬(910223-*******)\\n안서찬(910224-*******)\\n'"
634 |       ]
635 |      },
636 |      "execution_count": 30,
637 |      "metadata": {},
638 |      "output_type": "execute_result"
639 |     }
640 |    ],
641 |    "source": [
642 |     "pattern.sub(r'\\g<name>(\\g<birth>-*******)', text)"
643 |    ]
644 |   },
645 |   {
646 |    "cell_type": "code",
647 |    "execution_count": 31,
648 |    "metadata": {},
649 |    "outputs": [
650 |     {
651 |      "name": "stdout",
652 |      "output_type": "stream",
653 |      "text": [
654 |       "\n",
655 |       "안수찬(900223-*******)\n",
656 |       "안성찬(910223-*******)\n",
657 |       "안서찬(910224-*******)\n",
658 |       "\n"
659 |      ]
660 |     }
661 |    ],
662 |    "source": [
663 |     "print(pattern.sub(r'\\g<name>(\\g<birth>-*******)', text))"
664 |    ]
665 |   }
666 |  ],
667 |  "metadata": {
668 |   "kernelspec": {
669 |    "display_name": "Python 3",
670 |    "language": "python",
671 |    "name": "python3"
672 |   },
673 |   "language_info": {
674 |    "codemirror_mode": {
675 |     "name": "ipython",
676 |     "version": 3
677 |    },
678 |    "file_extension": ".py",
679 |    "mimetype": "text/x-python",
680 |    "name": "python",
681 |    "nbconvert_exporter": "python",
682 |    "pygments_lexer": "ipython3",
683 |    "version": "3.6.1"
684 |   }
685 |  },
686 |  "nbformat": 4,
687 |  "nbformat_minor": 2
688 | }
689 | 
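The Grouping section of the notebook above asks how to test whether the string ABC repeats. A minimal sketch of grouping for that purpose, assuming the goal is to accept strings made up only of one or more consecutive copies of ABC. The helper name `is_repeated_abc` and the sample strings are illustrative and do not appear in the notebook.

```python
import re

# (?:ABC)+ : the parentheses turn ABC into one unit, so + repeats the whole
# string rather than only the final C. ?: makes the group non-capturing.
repeat_pattern = re.compile(r'(?:ABC)+')


def is_repeated_abc(s):
    """Return True when s consists only of one or more copies of ABC (hypothetical helper)."""
    return repeat_pattern.fullmatch(s) is not None


print(is_repeated_abc('ABCABCABC'))  # True
print(is_repeated_abc('ABCAB'))      # False
# Without grouping, the quantifier binds only to the last character:
print(re.fullmatch(r'ABC+', 'ABCCC') is not None)  # True
```

A non-capturing group is used here because only the yes/no answer matters; a capturing group `(ABC)+` would match the same strings.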


--------------------------------------------------------------------------------
/Data wrangling with Python 3(Pandas).ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Working with Excel in Python, Part 3 (with Pandas)\n",
  8 |     "_This material is based on the lecture notes from instructor 안수찬's Work Automation with Python Camp (fast campus)._  \n",
  9 |     "Author: 김보섭  "
 10 |    ]
 11 |   },
 12 |   {
 13 |    "cell_type": "markdown",
 14 |    "metadata": {
 15 |     "collapsed": true
 16 |    },
 17 |    "source": [
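 18 |     "_**We work through the two examples below using real data: a list of universities nationwide, the departments in each university, and the region of each university.**_  \n",
 19 |     "\n",
 20 |     "_(We practice filtering with a boolean mask and grouping with groupby.)_  \n",
 21 |     "\n",
 22 |     "  \n",
 23 |     "* Example 1. Split the data by university, e.g. about 1,500 files such as '경북대학교.xlsx'  \n",
 24 |     "* Example 2. Split by region and organize the files into folders, e.g. '경북/경북대학교.xlsx'"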
 25 |    ]
 26 |   },
 27 |   {
 28 |    "cell_type": "markdown",
 29 |    "metadata": {
 30 |     "collapsed": true
 31 |    },
 32 |    "source": [
 33 |     "### Example 1. Save one Excel file per school inside a folder named 대학교"
 34 |    ]
 35 |   },
 36 |   {
 37 |    "cell_type": "code",
 38 |    "execution_count": 1,
 39 |    "metadata": {
 40 |     "collapsed": true
 41 |    },
 42 |    "outputs": [],
 43 |    "source": [
 44 |     "import os, sys, re\n",
 45 |     "import shutil\n",
 46 |     "import pandas as pd"
 47 |    ]
 48 |   },
 49 |   {
 50 |    "cell_type": "code",
 51 |    "execution_count": 2,
 52 |    "metadata": {
 53 |     "collapsed": true
 54 |    },
 55 |    "outputs": [],
 56 |    "source": [
 57 |     "df = pd.read_excel('./data/univs_2014.xlsx')"
 58 |    ]
 59 |   },
 60 |   {
 61 |    "cell_type": "code",
 62 |    "execution_count": 3,
 63 |    "metadata": {
 64 |     "scrolled": true
 65 |    },
 66 |    "outputs": [
 67 |     {
 68 |      "data": {
 69 |       "text/html": [
 70 |        "<div>\n",
 71 |        "<style>\n",
 72 |        "    .dataframe thead tr:only-child th {\n",
 73 |        "        text-align: right;\n",
 74 |        "    }\n",
 75 |        "\n",
 76 |        "    .dataframe thead th {\n",
 77 |        "        text-align: left;\n",
 78 |        "    }\n",
 79 |        "\n",
 80 |        "    .dataframe tbody tr th {\n",
 81 |        "        vertical-align: top;\n",
 82 |        "    }\n",
 83 |        "</style>\n",
 84 |        "<table border=\"1\" class=\"dataframe\">\n",
 85 |        "  <thead>\n",
 86 |        "    <tr style=\"text-align: right;\">\n",
 87 |        "      <th></th>\n",
 88 |        "      <th>연도</th>\n",
 89 |        "      <th>학제</th>\n",
 90 |        "      <th>시도</th>\n",
 91 |        "      <th>학교명</th>\n",
 92 |        "      <th>본분교</th>\n",
 93 |        "      <th>학교상태</th>\n",
 94 |        "      <th>설립</th>\n",
 95 |        "      <th>우편번호</th>\n",
 96 |        "      <th>주소</th>\n",
 97 |        "      <th>전화번호</th>\n",
 98 |        "      <th>팩스번호</th>\n",
 99 |        "      <th>홈페이지</th>\n",
100 |        "      <th>학과 계열</th>\n",
101 |        "      <th>Unnamed: 13</th>\n",
102 |        "      <th>Unnamed: 14</th>\n",
103 |        "      <th>학과상태</th>\n",
104 |        "      <th>학과명</th>\n",
105 |        "    </tr>\n",
106 |        "  </thead>\n",
107 |        "  <tbody>\n",
108 |        "    <tr>\n",
109 |        "      <th>0</th>\n",
110 |        "      <td>2014</td>\n",
111 |        "      <td>전문대학(3년제)</td>\n",
112 |        "      <td>충남</td>\n",
113 |        "      <td>충남도립청양대학</td>\n",
114 |        "      <td>본교</td>\n",
115 |        "      <td>기존</td>\n",
116 |        "      <td>공립</td>\n",
117 |        "      <td>345702</td>\n",
118 |        "      <td>충청남도 청양군 학사길 55 (청양읍, 청양대학)</td>\n",
119 |        "      <td>041-635-6600</td>\n",
120 |        "      <td>041-635-6633</td>\n",
121 |        "      <td>www.cyc.ac.kr</td>\n",
122 |        "      <td>사회계열</td>\n",
123 |        "      <td>경영ㆍ경제</td>\n",
124 |        "      <td>경영ㆍ경제</td>\n",
125 |        "      <td>기존</td>\n",
126 |        "      <td>토지행정과</td>\n",
127 |        "    </tr>\n",
128 |        "    <tr>\n",
129 |        "      <th>1</th>\n",
130 |        "      <td>2014</td>\n",
131 |        "      <td>전문대학(3년제)</td>\n",
132 |        "      <td>충남</td>\n",
133 |        "      <td>충남도립청양대학</td>\n",
134 |        "      <td>본교</td>\n",
135 |        "      <td>기존</td>\n",
136 |        "      <td>공립</td>\n",
137 |        "      <td>345702</td>\n",
138 |        "      <td>충청남도 청양군 학사길 55 (청양읍, 청양대학)</td>\n",
139 |        "      <td>041-635-6600</td>\n",
140 |        "      <td>041-635-6633</td>\n",
141 |        "      <td>www.cyc.ac.kr</td>\n",
142 |        "      <td>사회계열</td>\n",
143 |        "      <td>사회과학</td>\n",
144 |        "      <td>행정</td>\n",
145 |        "      <td>기존</td>\n",
146 |        "      <td>경찰행정과</td>\n",
147 |        "    </tr>\n",
148 |        "    <tr>\n",
149 |        "      <th>2</th>\n",
150 |        "      <td>2014</td>\n",
151 |        "      <td>전문대학(3년제)</td>\n",
152 |        "      <td>충남</td>\n",
153 |        "      <td>충남도립청양대학</td>\n",
154 |        "      <td>본교</td>\n",
155 |        "      <td>기존</td>\n",
156 |        "      <td>공립</td>\n",
157 |        "      <td>345702</td>\n",
158 |        "      <td>충청남도 청양군 학사길 55 (청양읍, 청양대학)</td>\n",
159 |        "      <td>041-635-6600</td>\n",
160 |        "      <td>041-635-6633</td>\n",
161 |        "      <td>www.cyc.ac.kr</td>\n",
162 |        "      <td>사회계열</td>\n",
163 |        "      <td>사회과학</td>\n",
164 |        "      <td>행정</td>\n",
165 |        "      <td>기존</td>\n",
166 |        "      <td>자치행정과</td>\n",
167 |        "    </tr>\n",
168 |        "    <tr>\n",
169 |        "      <th>3</th>\n",
170 |        "      <td>2014</td>\n",
171 |        "      <td>전문대학(3년제)</td>\n",
172 |        "      <td>충남</td>\n",
173 |        "      <td>충남도립청양대학</td>\n",
174 |        "      <td>본교</td>\n",
175 |        "      <td>기존</td>\n",
176 |        "      <td>공립</td>\n",
177 |        "      <td>345702</td>\n",
178 |        "      <td>충청남도 청양군 학사길 55 (청양읍, 청양대학)</td>\n",
179 |        "      <td>041-635-6600</td>\n",
180 |        "      <td>041-635-6633</td>\n",
181 |        "      <td>www.cyc.ac.kr</td>\n",
182 |        "      <td>사회계열</td>\n",
183 |        "      <td>사회과학</td>\n",
184 |        "      <td>행정</td>\n",
185 |        "      <td>기존</td>\n",
186 |        "      <td>자치행정학과</td>\n",
187 |        "    </tr>\n",
188 |        "    <tr>\n",
189 |        "      <th>4</th>\n",
190 |        "      <td>2014</td>\n",
191 |        "      <td>전문대학(3년제)</td>\n",
192 |        "      <td>충남</td>\n",
193 |        "      <td>충남도립청양대학</td>\n",
194 |        "      <td>본교</td>\n",
195 |        "      <td>기존</td>\n",
196 |        "      <td>공립</td>\n",
197 |        "      <td>345702</td>\n",
198 |        "      <td>충청남도 청양군 학사길 55 (청양읍, 청양대학)</td>\n",
199 |        "      <td>041-635-6600</td>\n",
200 |        "      <td>041-635-6633</td>\n",
201 |        "      <td>www.cyc.ac.kr</td>\n",
202 |        "      <td>공학계열</td>\n",
203 |        "      <td>전기ㆍ전자</td>\n",
204 |        "      <td>전자</td>\n",
205 |        "      <td>기존</td>\n",
206 |        "      <td>전기전자과</td>\n",
207 |        "    </tr>\n",
208 |        "  </tbody>\n",
209 |        "</table>\n",
210 |        "</div>"
211 |       ],
212 |       "text/plain": [
213 |        "     연도         학제  시도       학교명 본분교 학교상태  설립    우편번호  \\\n",
214 |        "0  2014  전문대학(3년제)  충남  충남도립청양대학  본교   기존  공립  345702   \n",
215 |        "1  2014  전문대학(3년제)  충남  충남도립청양대학  본교   기존  공립  345702   \n",
216 |        "2  2014  전문대학(3년제)  충남  충남도립청양대학  본교   기존  공립  345702   \n",
217 |        "3  2014  전문대학(3년제)  충남  충남도립청양대학  본교   기존  공립  345702   \n",
218 |        "4  2014  전문대학(3년제)  충남  충남도립청양대학  본교   기존  공립  345702   \n",
219 |        "\n",
220 |        "                            주소                  전화번호                  팩스번호  \\\n",
221 |        "0  충청남도 청양군 학사길 55 (청양읍, 청양대학)  041-635-6600          041-635-6633           \n",
222 |        "1  충청남도 청양군 학사길 55 (청양읍, 청양대학)  041-635-6600          041-635-6633           \n",
223 |        "2  충청남도 청양군 학사길 55 (청양읍, 청양대학)  041-635-6600          041-635-6633           \n",
224 |        "3  충청남도 청양군 학사길 55 (청양읍, 청양대학)  041-635-6600          041-635-6633           \n",
225 |        "4  충청남도 청양군 학사길 55 (청양읍, 청양대학)  041-635-6600          041-635-6633           \n",
226 |        "\n",
227 |        "            홈페이지 학과 계열 Unnamed: 13 Unnamed: 14 학과상태     학과명  \n",
228 |        "0  www.cyc.ac.kr  사회계열       경영ㆍ경제       경영ㆍ경제   기존   토지행정과  \n",
229 |        "1  www.cyc.ac.kr  사회계열        사회과학          행정   기존   경찰행정과  \n",
230 |        "2  www.cyc.ac.kr  사회계열        사회과학          행정   기존   자치행정과  \n",
231 |        "3  www.cyc.ac.kr  사회계열        사회과학          행정   기존  자치행정학과  \n",
232 |        "4  www.cyc.ac.kr  공학계열       전기ㆍ전자          전자   기존   전기전자과  "
233 |       ]
234 |      },
235 |      "execution_count": 3,
236 |      "metadata": {},
237 |      "output_type": "execute_result"
238 |     }
239 |    ],
240 |    "source": [
241 |     "df.head()"
242 |    ]
243 |   },
244 |   {
245 |    "cell_type": "code",
246 |    "execution_count": 4,
247 |    "metadata": {},
248 |    "outputs": [
249 |     {
250 |      "data": {
251 |       "text/plain": [
252 |        "(20017, 17)"
253 |       ]
254 |      },
255 |      "execution_count": 4,
256 |      "metadata": {},
257 |      "output_type": "execute_result"
258 |     }
259 |    ],
260 |    "source": [
261 |     "df.shape"
262 |    ]
263 |   },
264 |   {
265 |    "cell_type": "code",
266 |    "execution_count": 5,
267 |    "metadata": {},
268 |    "outputs": [
269 |     {
270 |      "data": {
271 |       "text/plain": [
272 |        "1592"
273 |       ]
274 |      },
275 |      "execution_count": 5,
276 |      "metadata": {},
277 |      "output_type": "execute_result"
278 |     }
279 |    ],
280 |    "source": [
281 |     "# How many unique universities are there?\n",
282 |     "unique_univs = list(set(df['학교명']))\n",
283 |     "len(unique_univs)"
284 |    ]
285 |   },
286 |   {
287 |    "cell_type": "code",
288 |    "execution_count": 6,
289 |    "metadata": {},
290 |    "outputs": [
291 |     {
292 |      "name": "stdout",
293 |      "output_type": "stream",
294 |      "text": [
295 |       "홍익대학교 세종캠퍼스 산업대학원\n"
296 |      ]
297 |     },
298 |     {
299 |      "data": {
300 |       "text/plain": [
301 |        "19222    광고홍보커뮤니케이션전공\n",
302 |        "19223          건축설계전공\n",
303 |        "19224     커뮤니케이션디자인전공\n",
304 |        "19225        게임프로듀싱전공\n",
305 |        "19226            색채전공\n",
306 |        "Name: 학과명, dtype: object"
307 |       ]
308 |      },
309 |      "execution_count": 6,
310 |      "metadata": {},
311 |      "output_type": "execute_result"
312 |     }
313 |    ],
314 |    "source": [
315 |     "# Filtering with a boolean mask\n",
316 |     "is_univ = df['학교명'] == unique_univs[1]\n",
317 |     "print(unique_univs[1])\n",
318 |     "df[is_univ]['학과명']"
319 |    ]
320 |   },
321 |   {
322 |    "cell_type": "markdown",
323 |    "metadata": {},
324 |    "source": [
325 |     "#### 대학교/ folder: one Excel file per university"
326 |    ]
327 |   },
328 |   {
329 |    "cell_type": "code",
330 |    "execution_count": 7,
331 |    "metadata": {
332 |     "collapsed": true
333 |    },
334 |    "outputs": [],
335 |    "source": [
336 |     "# 대학교/ folder: one Excel file per university\n",
337 |     "if '대학교' in os.listdir():\n",
338 |     "    shutil.rmtree('대학교')\n",
339 |     "os.makedirs('대학교')\n",
340 |     "\n",
341 |     "unique_univs = list(set(df['학교명']))\n",
342 |     "for univ_name in unique_univs:\n",
343 |     "    is_univ = df['학교명'] == univ_name\n",
344 |     "    univ_df = df[is_univ]\n",
345 |     "    univ_df.to_excel('./대학교/' + univ_name + '.xlsx')"
346 |    ]
347 |   },
348 |   {
349 |    "cell_type": "code",
350 |    "execution_count": 8,
351 |    "metadata": {
352 |     "collapsed": true
353 |    },
354 |    "outputs": [],
355 |    "source": [
356 |     "# Let's make this more efficient\n",
357 |     "# Refactor the list(set(df['학교명'])) part!\n",
358 |     "# Use groupby instead of filtering with a boolean mask.\n",
359 |     "import os, sys\n",
360 |     "if '대학교' in os.listdir():\n",
361 |     "    shutil.rmtree('대학교')\n",
362 |     "os.makedirs('대학교')\n",
363 |     "\n",
364 |     "# Iterate over the groups directly; calling get_group() per name rebuilds the grouping each time\n",
365 |     "for univ_name, univ_df in df.groupby('학교명'):\n",
366 |     "    univ_df.to_excel('./대학교/' + univ_name + '.xlsx')"
367 |    ]
368 |   },
369 |   {
370 |    "cell_type": "markdown",
371 |    "metadata": {},
372 |    "source": [
373 |     "#### Names such as 성균관대학교 경영대학원 (a graduate school listed under its university) also exist, so these cases must be handled as well. The following code does that\n",
374 |     " "
375 |    ]
376 |   },
377 |   {
378 |    "cell_type": "code",
379 |    "execution_count": 9,
380 |    "metadata": {},
381 |    "outputs": [
382 |     {
383 |      "data": {
384 |       "text/plain": [
385 |        "1592"
386 |       ]
387 |      },
388 |      "execution_count": 9,
389 |      "metadata": {},
390 |      "output_type": "execute_result"
391 |     }
392 |    ],
393 |    "source": [
394 |     "univ_names = df['학교명'].unique()\n",
395 |     "len(univ_names)"
396 |    ]
397 |   },
398 |   {
399 |    "cell_type": "code",
400 |    "execution_count": 10,
401 |    "metadata": {
402 |     "collapsed": true
403 |    },
404 |    "outputs": [],
405 |    "source": [
406 |     "univ_total_text = ' '.join(univ_names)"
407 |    ]
408 |   },
409 |   {
410 |    "cell_type": "code",
411 |    "execution_count": 11,
412 |    "metadata": {
413 |     "collapsed": true
414 |    },
415 |    "outputs": [],
416 |    "source": [
417 |     "univ_pattern = re.compile(r'\\w+대학교')\n",
418 |     "real_univ_names = list(set(univ_pattern.findall(univ_total_text)))"
419 |    ]
420 |   },
421 |   {
422 |    "cell_type": "code",
423 |    "execution_count": 12,
424 |    "metadata": {
425 |     "collapsed": true
426 |    },
427 |    "outputs": [],
428 |    "source": [
429 |     "if '대학교' in os.listdir():\n",
430 |     "    shutil.rmtree('대학교')\n",
431 |     "os.makedirs('대학교')\n",
432 |     "\n",
433 |     "for univ_name in real_univ_names:\n",
434 |     "    is_univ = df['학교명'].str.startswith(univ_name)\n",
435 |     "    univ_df = df[is_univ]\n",
436 |     "    univ_df.to_excel('./대학교/' + univ_name + '.xlsx')"
437 |    ]
438 |   },
439 |   {
440 |    "cell_type": "code",
441 |    "execution_count": 13,
442 |    "metadata": {},
443 |    "outputs": [
444 |     {
445 |      "data": {
446 |       "text/plain": [
447 |        "384"
448 |       ]
449 |      },
450 |      "execution_count": 13,
451 |      "metadata": {},
452 |      "output_type": "execute_result"
453 |     }
454 |    ],
455 |    "source": [
456 |     "len(os.listdir('./대학교/'))"
457 |    ]
458 |   },
459 |   {
460 |    "cell_type": "markdown",
461 |    "metadata": {},
462 |    "source": [
463 |     "### Example 2. Create one subfolder per region inside the 대학교 folder and save each region's university Excel files in it"
464 |    ]
465 |   },
466 |   {
467 |    "cell_type": "code",
468 |    "execution_count": 14,
469 |    "metadata": {
470 |     "collapsed": true
471 |    },
472 |    "outputs": [],
473 |    "source": [
474 |     "# Use os.path.join to build the paths.\n",
475 |     "if '대학교' in os.listdir():\n",
476 |     "    shutil.rmtree('대학교')\n",
477 |     "os.makedirs('대학교')\n",
478 |     "\n",
479 |     "city_groups = df.groupby('시도')\n",
480 |     "for city_name in df['시도'].unique():\n",
481 |     "    city_df = city_groups.get_group(city_name)\n",
482 |     "    \n",
483 |     "    if city_name not in os.listdir('대학교'):\n",
484 |     "        os.makedirs(os.path.join('대학교', city_name))\n",
485 |     "        \n",
486 |     "    univ_in_city_groups = city_df.groupby('학교명')\n",
487 |     "    \n",
488 |     "    for univ_name in city_df['학교명'].unique():\n",
489 |     "        univ_in_city_df = univ_in_city_groups.get_group(univ_name)\n",
490 |     "        univ_in_city_df.to_excel(os.path.join('대학교', city_name, univ_name + '.xlsx'))"
491 |    ]
492 |   }
493 |  ],
494 |  "metadata": {
495 |   "kernelspec": {
496 |    "display_name": "Python 3",
497 |    "language": "python",
498 |    "name": "python3"
499 |   },
500 |   "language_info": {
501 |    "codemirror_mode": {
502 |     "name": "ipython",
503 |     "version": 3
504 |    },
505 |    "file_extension": ".py",
506 |    "mimetype": "text/x-python",
507 |    "name": "python",
508 |    "nbconvert_exporter": "python",
509 |    "pygments_lexer": "ipython3",
510 |    "version": "3.6.2"
511 |   },
512 |   "varInspector": {
513 |    "cols": {
514 |     "lenName": 16,
515 |     "lenType": 16,
516 |     "lenVar": 40
517 |    },
518 |    "kernels_config": {
519 |     "python": {
520 |      "delete_cmd_postfix": "",
521 |      "delete_cmd_prefix": "del ",
522 |      "library": "var_list.py",
523 |      "varRefreshCmd": "print(var_dic_list())"
524 |     },
525 |     "r": {
526 |      "delete_cmd_postfix": ") ",
527 |      "delete_cmd_prefix": "rm(",
528 |      "library": "var_list.r",
529 |      "varRefreshCmd": "cat(var_dic_list()) "
530 |     }
531 |    },
532 |    "types_to_exclude": [
533 |     "module",
534 |     "function",
535 |     "builtin_function_or_method",
536 |     "instance",
537 |     "_Feature"
538 |    ],
539 |    "window_display": false
540 |   }
541 |  },
542 |  "nbformat": 4,
543 |  "nbformat_minor": 2
544 | }
545 | 
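
The last cells of the notebook above extract base university names with `\w+대학교` and then collect every row whose name starts with that base name. A minimal, pandas-free sketch of the same idea follows. The `school_names` list is a hypothetical stand-in for `df['학교명'].unique()`, and `groups` is an illustrative name that does not appear in the notebook.

```python
import re
from collections import defaultdict

# Hypothetical sample standing in for df['학교명'].unique()
school_names = [
    '성균관대학교',
    '성균관대학교 경영대학원',
    '서울대학교',
    '서울대학교 행정대학원',
]

# Raw string so that \w reaches the regex engine unchanged
univ_pattern = re.compile(r'\w+대학교')

# Map each base university name to all school names that begin with it
groups = defaultdict(list)
for name in school_names:
    match = univ_pattern.match(name)  # match() anchors at the start of the name
    if match:
        groups[match.group()].append(name)

for univ_name, members in groups.items():
    print(univ_name, members)
```

Matching each name separately avoids joining all names into one string first, at the cost of one regex call per name. Names that do not contain 대학교 are simply skipped in this sketch.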


--------------------------------------------------------------------------------
/Data wrangling with Python 4(Pandas).ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Working with DataFrames (Pandas)\n",
  8 |     "_This material is based on the lecture notes from instructor 안수찬's Work Automation with Python Camp (fast campus)._  \n",
  9 |     "Author: 김보섭  "
 10 |    ]
 11 |   },
 12 |   {
 13 |    "cell_type": "markdown",
 14 |    "metadata": {},
 15 |    "source": [
 16 |     "#### _**This notebook uses Pandas and the MP3 class from the mutagen library, among others.**_ \n"
 17 |    ]
 18 |   },
 19 |   {
 20 |    "cell_type": "markdown",
 21 |    "metadata": {},
 22 |    "source": [
 23 |     "### Working with DataFrames"
 24 |    ]
 25 |   },
 26 |   {
 27 |    "cell_type": "markdown",
 28 |    "metadata": {},
 29 |    "source": [
 30 |     "#### Creating the mp3_df DataFrame"
 31 |    ]
 32 |   },
 33 |   {
 34 |    "cell_type": "code",
 35 |    "execution_count": 1,
 36 |    "metadata": {
 37 |     "collapsed": true
 38 |    },
 39 |    "outputs": [],
 40 |    "source": [
 41 |     "import os, sys\n",
 42 |     "import pandas as pd\n",
 43 |     "import shutil\n",
 44 |     "from mutagen.mp3 import MP3 # import the MP3 class"
 45 |    ]
 46 |   },
 47 |   {
 48 |    "cell_type": "code",
 49 |    "execution_count": 2,
 50 |    "metadata": {},
 51 |    "outputs": [
 52 |     {
 53 |      "data": {
 54 |       "text/plain": [
 55 |        "['.ipynb_checkpoints',\n",
 56 |        " '2HG0Z4C2.mp3',\n",
 57 |        " '2W1KHXPI.mp3',\n",
 58 |        " '3KTIS7NN.mp3',\n",
 59 |        " 'BWD5GM5Q.mp3',\n",
 60 |        " 'DJJUFDQX.mp3',\n",
 61 |        " 'IXMFB7H1.mp3',\n",
 62 |        " 'README.txt',\n",
 63 |        " 'RIS29UDF.mp3',\n",
 64 |        " 'T51NGFL6.mp3',\n",
 65 |        " 'WHYGK9R4.mp3',\n",
 66 |        " 'YUF527VD.mp3']"
 67 |       ]
 68 |      },
 69 |      "execution_count": 2,
 70 |      "metadata": {},
 71 |      "output_type": "execute_result"
 72 |     }
 73 |    ],
 74 |    "source": [
 75 |     "# The listing also contains entries that are not mp3 files (e.g. README.txt).\n",
 76 |     "os.listdir('./sample_mp3/')"
 77 |    ]
 78 |   },
 79 |   {
 80 |    "cell_type": "code",
 81 |    "execution_count": 3,
 82 |    "metadata": {},
 83 |    "outputs": [
 84 |     {
 85 |      "data": {
 86 |       "text/plain": [
 87 |        "['2HG0Z4C2.mp3',\n",
 88 |        " '2W1KHXPI.mp3',\n",
 89 |        " '3KTIS7NN.mp3',\n",
 90 |        " 'BWD5GM5Q.mp3',\n",
 91 |        " 'DJJUFDQX.mp3',\n",
 92 |        " 'IXMFB7H1.mp3',\n",
 93 |        " 'RIS29UDF.mp3',\n",
 94 |        " 'T51NGFL6.mp3',\n",
 95 |        " 'WHYGK9R4.mp3',\n",
 96 |        " 'YUF527VD.mp3']"
 97 |       ]
 98 |      },
 99 |      "execution_count": 3,
100 |      "metadata": {},
101 |      "output_type": "execute_result"
102 |     }
103 |    ],
104 |    "source": [
105 |     "# Use a list comprehension to keep only the mp3 files\n",
106 |     "mp3_filenames = [\n",
107 |     "    filename\n",
108 |     "    for filename\n",
109 |     "    in os.listdir('./sample_mp3/')\n",
110 |     "    if filename.endswith('.mp3') # keep only files with the given extension\n",
111 |     "]\n",
112 |     "mp3_filenames"
113 |    ]
114 |   },
115 |   {
116 |    "cell_type": "code",
117 |    "execution_count": 4,
118 |    "metadata": {},
119 |    "outputs": [
120 |     {
121 |      "data": {
122 |       "text/html": [
123 |        "<div>\n",
124 |        "<style>\n",
125 |        "    .dataframe thead tr:only-child th {\n",
126 |        "        text-align: right;\n",
127 |        "    }\n",
128 |        "\n",
129 |        "    .dataframe thead th {\n",
130 |        "        text-align: left;\n",
131 |        "    }\n",
132 |        "\n",
133 |        "    .dataframe tbody tr th {\n",
134 |        "        vertical-align: top;\n",
135 |        "    }\n",
136 |        "</style>\n",
137 |        "<table border=\"1\" class=\"dataframe\">\n",
138 |        "  <thead>\n",
139 |        "    <tr style=\"text-align: right;\">\n",
140 |        "      <th></th>\n",
141 |        "      <th>filename</th>\n",
142 |        "    </tr>\n",
143 |        "  </thead>\n",
144 |        "  <tbody>\n",
145 |        "    <tr>\n",
146 |        "      <th>0</th>\n",
147 |        "      <td>2HG0Z4C2.mp3</td>\n",
148 |        "    </tr>\n",
149 |        "    <tr>\n",
150 |        "      <th>1</th>\n",
151 |        "      <td>2W1KHXPI.mp3</td>\n",
152 |        "    </tr>\n",
153 |        "    <tr>\n",
154 |        "      <th>2</th>\n",
155 |        "      <td>3KTIS7NN.mp3</td>\n",
156 |        "    </tr>\n",
157 |        "    <tr>\n",
158 |        "      <th>3</th>\n",
159 |        "      <td>BWD5GM5Q.mp3</td>\n",
160 |        "    </tr>\n",
161 |        "    <tr>\n",
162 |        "      <th>4</th>\n",
163 |        "      <td>DJJUFDQX.mp3</td>\n",
164 |        "    </tr>\n",
165 |        "  </tbody>\n",
166 |        "</table>\n",
167 |        "</div>"
168 |       ],
169 |       "text/plain": [
170 |        "       filename\n",
171 |        "0  2HG0Z4C2.mp3\n",
172 |        "1  2W1KHXPI.mp3\n",
173 |        "2  3KTIS7NN.mp3\n",
174 |        "3  BWD5GM5Q.mp3\n",
175 |        "4  DJJUFDQX.mp3"
176 |       ]
177 |      },
178 |      "execution_count": 4,
179 |      "metadata": {},
180 |      "output_type": "execute_result"
181 |     }
182 |    ],
183 |    "source": [
184 |     "# Build a DataFrame from the list above\n",
185 |     "mp3_df = pd.DataFrame(mp3_filenames, columns = ['filename'])\n",
186 |     "mp3_df.head()"
187 |    ]
188 |   },
189 |   {
190 |    "cell_type": "code",
191 |    "execution_count": 5,
192 |    "metadata": {
193 |     "collapsed": true
194 |    },
195 |    "outputs": [],
196 |    "source": [
197 |     "mp3 = MP3('./sample_mp3/2HG0Z4C2.mp3')"
198 |    ]
199 |   },
200 |   {
201 |    "cell_type": "code",
202 |    "execution_count": 6,
203 |    "metadata": {},
204 |    "outputs": [
205 |     {
206 |      "data": {
207 |       "text/plain": [
208 |        "TPE2(encoding=<Encoding.UTF16: 1>, text=['The Weeknd'])"
209 |       ]
210 |      },
211 |      "execution_count": 6,
212 |      "metadata": {},
213 |      "output_type": "execute_result"
214 |     }
215 |    ],
216 |    "source": [
217 |     "mp3.get('TPE2')"
218 |    ]
219 |   },
220 |   {
221 |    "cell_type": "markdown",
222 |    "metadata": {},
223 |    "source": [
224 |     "#### Adding a filepath column to mp3_df"
225 |    ]
226 |   },
227 |   {
228 |    "cell_type": "code",
229 |    "execution_count": 7,
230 |    "metadata": {},
231 |    "outputs": [
232 |     {
233 |      "data": {
234 |       "text/html": [
235 |        "<div>\n",
236 |        "<style>\n",
237 |        "    .dataframe thead tr:only-child th {\n",
238 |        "        text-align: right;\n",
239 |        "    }\n",
240 |        "\n",
241 |        "    .dataframe thead th {\n",
242 |        "        text-align: left;\n",
243 |        "    }\n",
244 |        "\n",
245 |        "    .dataframe tbody tr th {\n",
246 |        "        vertical-align: top;\n",
247 |        "    }\n",
248 |        "</style>\n",
249 |        "<table border=\"1\" class=\"dataframe\">\n",
250 |        "  <thead>\n",
251 |        "    <tr style=\"text-align: right;\">\n",
252 |        "      <th></th>\n",
253 |        "      <th>filename</th>\n",
254 |        "      <th>filepath</th>\n",
255 |        "    </tr>\n",
256 |        "  </thead>\n",
257 |        "  <tbody>\n",
258 |        "    <tr>\n",
259 |        "      <th>0</th>\n",
260 |        "      <td>2HG0Z4C2.mp3</td>\n",
261 |        "      <td>.\\sample_mp3\\2HG0Z4C2.mp3</td>\n",
262 |        "    </tr>\n",
263 |        "    <tr>\n",
264 |        "      <th>1</th>\n",
265 |        "      <td>2W1KHXPI.mp3</td>\n",
266 |        "      <td>.\\sample_mp3\\2W1KHXPI.mp3</td>\n",
267 |        "    </tr>\n",
268 |        "    <tr>\n",
269 |        "      <th>2</th>\n",
270 |        "      <td>3KTIS7NN.mp3</td>\n",
271 |        "      <td>.\\sample_mp3\\3KTIS7NN.mp3</td>\n",
272 |        "    </tr>\n",
273 |        "    <tr>\n",
274 |        "      <th>3</th>\n",
275 |        "      <td>BWD5GM5Q.mp3</td>\n",
276 |        "      <td>.\\sample_mp3\\BWD5GM5Q.mp3</td>\n",
277 |        "    </tr>\n",
278 |        "    <tr>\n",
279 |        "      <th>4</th>\n",
280 |        "      <td>DJJUFDQX.mp3</td>\n",
281 |        "      <td>.\\sample_mp3\\DJJUFDQX.mp3</td>\n",
282 |        "    </tr>\n",
283 |        "  </tbody>\n",
284 |        "</table>\n",
285 |        "</div>"
286 |       ],
287 |       "text/plain": [
288 |        "       filename                   filepath\n",
289 |        "0  2HG0Z4C2.mp3  .\\sample_mp3\\2HG0Z4C2.mp3\n",
290 |        "1  2W1KHXPI.mp3  .\\sample_mp3\\2W1KHXPI.mp3\n",
291 |        "2  3KTIS7NN.mp3  .\\sample_mp3\\3KTIS7NN.mp3\n",
292 |        "3  BWD5GM5Q.mp3  .\\sample_mp3\\BWD5GM5Q.mp3\n",
293 |        "4  DJJUFDQX.mp3  .\\sample_mp3\\DJJUFDQX.mp3"
294 |       ]
295 |      },
296 |      "execution_count": 7,
297 |      "metadata": {},
298 |      "output_type": "execute_result"
299 |     }
300 |    ],
301 |    "source": [
302 |     "# Do the same for every file at once using the DataFrame\n",
303 |     "mp3_df['filepath'] = \\\n",
304 |     "mp3_df.filename.apply(lambda x : os.path.join('.','sample_mp3',x))\n",
305 |     "mp3_df.head()"
306 |    ]
307 |   },
308 |   {
309 |    "cell_type": "code",
310 |    "execution_count": 8,
311 |    "metadata": {},
312 |    "outputs": [
313 |     {
314 |      "data": {
315 |       "text/plain": [
316 |        "TIT2(encoding=<Encoding.UTF16: 1>, text=[\"Can't Feel My Face\"])"
317 |       ]
318 |      },
319 |      "execution_count": 8,
320 |      "metadata": {},
321 |      "output_type": "execute_result"
322 |     }
323 |    ],
324 |    "source": [
325 |     "mp3 = MP3(mp3_df.iloc[0].filepath)\n",
326 |     "mp3.get('TIT2')"
327 |    ]
328 |   },
329 |   {
330 |    "cell_type": "markdown",
331 |    "metadata": {},
332 |    "source": [
333 |     "#### Adding title, new_filename, and new_filepath columns to mp3_df"
334 |    ]
335 |   },
336 |   {
337 |    "cell_type": "code",
338 |    "execution_count": 9,
339 |    "metadata": {},
340 |    "outputs": [
341 |     {
342 |      "data": {
343 |       "text/html": [
344 |        "<div>\n",
345 |        "<style>\n",
346 |        "    .dataframe thead tr:only-child th {\n",
347 |        "        text-align: right;\n",
348 |        "    }\n",
349 |        "\n",
350 |        "    .dataframe thead th {\n",
351 |        "        text-align: left;\n",
352 |        "    }\n",
353 |        "\n",
354 |        "    .dataframe tbody tr th {\n",
355 |        "        vertical-align: top;\n",
356 |        "    }\n",
357 |        "</style>\n",
358 |        "<table border=\"1\" class=\"dataframe\">\n",
359 |        "  <thead>\n",
360 |        "    <tr style=\"text-align: right;\">\n",
361 |        "      <th></th>\n",
362 |        "      <th>filename</th>\n",
363 |        "      <th>filepath</th>\n",
364 |        "      <th>title</th>\n",
365 |        "    </tr>\n",
366 |        "  </thead>\n",
367 |        "  <tbody>\n",
368 |        "    <tr>\n",
369 |        "      <th>0</th>\n",
370 |        "      <td>2HG0Z4C2.mp3</td>\n",
371 |        "      <td>.\\sample_mp3\\2HG0Z4C2.mp3</td>\n",
372 |        "      <td>Can't Feel My Face</td>\n",
373 |        "    </tr>\n",
374 |        "    <tr>\n",
375 |        "      <th>1</th>\n",
376 |        "      <td>2W1KHXPI.mp3</td>\n",
377 |        "      <td>.\\sample_mp3\\2W1KHXPI.mp3</td>\n",
378 |        "      <td>What Do You Mean</td>\n",
379 |        "    </tr>\n",
380 |        "    <tr>\n",
381 |        "      <th>2</th>\n",
382 |        "      <td>3KTIS7NN.mp3</td>\n",
383 |        "      <td>.\\sample_mp3\\3KTIS7NN.mp3</td>\n",
384 |        "      <td>Watch Me</td>\n",
385 |        "    </tr>\n",
386 |        "    <tr>\n",
387 |        "      <th>3</th>\n",
388 |        "      <td>BWD5GM5Q.mp3</td>\n",
389 |        "      <td>.\\sample_mp3\\BWD5GM5Q.mp3</td>\n",
390 |        "      <td>Cheerleader</td>\n",
391 |        "    </tr>\n",
392 |        "    <tr>\n",
393 |        "      <th>4</th>\n",
394 |        "      <td>DJJUFDQX.mp3</td>\n",
395 |        "      <td>.\\sample_mp3\\DJJUFDQX.mp3</td>\n",
396 |        "      <td>Lean on</td>\n",
397 |        "    </tr>\n",
398 |        "  </tbody>\n",
399 |        "</table>\n",
400 |        "</div>"
401 |       ],
402 |       "text/plain": [
403 |        "       filename                   filepath               title\n",
404 |        "0  2HG0Z4C2.mp3  .\\sample_mp3\\2HG0Z4C2.mp3  Can't Feel My Face\n",
405 |        "1  2W1KHXPI.mp3  .\\sample_mp3\\2W1KHXPI.mp3    What Do You Mean\n",
406 |        "2  3KTIS7NN.mp3  .\\sample_mp3\\3KTIS7NN.mp3            Watch Me\n",
407 |        "3  BWD5GM5Q.mp3  .\\sample_mp3\\BWD5GM5Q.mp3         Cheerleader\n",
408 |        "4  DJJUFDQX.mp3  .\\sample_mp3\\DJJUFDQX.mp3             Lean on"
409 |       ]
410 |      },
411 |      "execution_count": 9,
412 |      "metadata": {},
413 |      "output_type": "execute_result"
414 |     }
415 |    ],
416 |    "source": [
417 |     "# Tracks from official music sites => MP3 metadata (artist, license, lyrics, title, ...)\n",
418 |     "# BeautifulSoup (HTML Parser), MP3 (MP3 File Parser), Image Parser\n",
419 |     "# Add all titles at once, then build the new filenames and paths\n",
420 |     "mp3_df['title'] = mp3_df.filepath.apply(lambda x: MP3(x).get('TIT2').text[0])\n",
421 |     "mp3_df.head()"
422 |    ]
423 |   },
424 |   {
425 |    "cell_type": "code",
426 |    "execution_count": 10,
427 |    "metadata": {},
428 |    "outputs": [
429 |     {
430 |      "data": {
431 |       "text/html": [
432 |        "<div>\n",
433 |        "<style>\n",
434 |        "    .dataframe thead tr:only-child th {\n",
435 |        "        text-align: right;\n",
436 |        "    }\n",
437 |        "\n",
438 |        "    .dataframe thead th {\n",
439 |        "        text-align: left;\n",
440 |        "    }\n",
441 |        "\n",
442 |        "    .dataframe tbody tr th {\n",
443 |        "        vertical-align: top;\n",
444 |        "    }\n",
445 |        "</style>\n",
446 |        "<table border=\"1\" class=\"dataframe\">\n",
447 |        "  <thead>\n",
448 |        "    <tr style=\"text-align: right;\">\n",
449 |        "      <th></th>\n",
450 |        "      <th>filename</th>\n",
451 |        "      <th>filepath</th>\n",
452 |        "      <th>title</th>\n",
453 |        "      <th>new_filename</th>\n",
454 |        "    </tr>\n",
455 |        "  </thead>\n",
456 |        "  <tbody>\n",
457 |        "    <tr>\n",
458 |        "      <th>0</th>\n",
459 |        "      <td>2HG0Z4C2.mp3</td>\n",
460 |        "      <td>.\\sample_mp3\\2HG0Z4C2.mp3</td>\n",
461 |        "      <td>Can't Feel My Face</td>\n",
462 |        "      <td>Can't Feel My Face.mp3</td>\n",
463 |        "    </tr>\n",
464 |        "    <tr>\n",
465 |        "      <th>1</th>\n",
466 |        "      <td>2W1KHXPI.mp3</td>\n",
467 |        "      <td>.\\sample_mp3\\2W1KHXPI.mp3</td>\n",
468 |        "      <td>What Do You Mean</td>\n",
469 |        "      <td>What Do You Mean.mp3</td>\n",
470 |        "    </tr>\n",
471 |        "    <tr>\n",
472 |        "      <th>2</th>\n",
473 |        "      <td>3KTIS7NN.mp3</td>\n",
474 |        "      <td>.\\sample_mp3\\3KTIS7NN.mp3</td>\n",
475 |        "      <td>Watch Me</td>\n",
476 |        "      <td>Watch Me.mp3</td>\n",
477 |        "    </tr>\n",
478 |        "    <tr>\n",
479 |        "      <th>3</th>\n",
480 |        "      <td>BWD5GM5Q.mp3</td>\n",
481 |        "      <td>.\\sample_mp3\\BWD5GM5Q.mp3</td>\n",
482 |        "      <td>Cheerleader</td>\n",
483 |        "      <td>Cheerleader.mp3</td>\n",
484 |        "    </tr>\n",
485 |        "    <tr>\n",
486 |        "      <th>4</th>\n",
487 |        "      <td>DJJUFDQX.mp3</td>\n",
488 |        "      <td>.\\sample_mp3\\DJJUFDQX.mp3</td>\n",
489 |        "      <td>Lean on</td>\n",
490 |        "      <td>Lean on.mp3</td>\n",
491 |        "    </tr>\n",
492 |        "  </tbody>\n",
493 |        "</table>\n",
494 |        "</div>"
495 |       ],
496 |       "text/plain": [
497 |        "       filename                   filepath               title  \\\n",
498 |        "0  2HG0Z4C2.mp3  .\\sample_mp3\\2HG0Z4C2.mp3  Can't Feel My Face   \n",
499 |        "1  2W1KHXPI.mp3  .\\sample_mp3\\2W1KHXPI.mp3    What Do You Mean   \n",
500 |        "2  3KTIS7NN.mp3  .\\sample_mp3\\3KTIS7NN.mp3            Watch Me   \n",
501 |        "3  BWD5GM5Q.mp3  .\\sample_mp3\\BWD5GM5Q.mp3         Cheerleader   \n",
502 |        "4  DJJUFDQX.mp3  .\\sample_mp3\\DJJUFDQX.mp3             Lean on   \n",
503 |        "\n",
504 |        "             new_filename  \n",
505 |        "0  Can't Feel My Face.mp3  \n",
506 |        "1    What Do You Mean.mp3  \n",
507 |        "2            Watch Me.mp3  \n",
508 |        "3         Cheerleader.mp3  \n",
509 |        "4             Lean on.mp3  "
510 |       ]
511 |      },
512 |      "execution_count": 10,
513 |      "metadata": {},
514 |      "output_type": "execute_result"
515 |     }
516 |    ],
517 |    "source": [
518 |     "mp3_df['new_filename'] = mp3_df.title.apply(lambda x : x + '.mp3')\n",
519 |     "mp3_df.head()"
520 |    ]
521 |   },
522 |   {
523 |    "cell_type": "code",
524 |    "execution_count": 11,
525 |    "metadata": {},
526 |    "outputs": [
527 |     {
528 |      "data": {
529 |       "text/html": [
530 |        "<div>\n",
531 |        "<style>\n",
532 |        "    .dataframe thead tr:only-child th {\n",
533 |        "        text-align: right;\n",
534 |        "    }\n",
535 |        "\n",
536 |        "    .dataframe thead th {\n",
537 |        "        text-align: left;\n",
538 |        "    }\n",
539 |        "\n",
540 |        "    .dataframe tbody tr th {\n",
541 |        "        vertical-align: top;\n",
542 |        "    }\n",
543 |        "</style>\n",
544 |        "<table border=\"1\" class=\"dataframe\">\n",
545 |        "  <thead>\n",
546 |        "    <tr style=\"text-align: right;\">\n",
547 |        "      <th></th>\n",
548 |        "      <th>filename</th>\n",
549 |        "      <th>filepath</th>\n",
550 |        "      <th>title</th>\n",
551 |        "      <th>new_filename</th>\n",
552 |        "      <th>new_filepath</th>\n",
553 |        "    </tr>\n",
554 |        "  </thead>\n",
555 |        "  <tbody>\n",
556 |        "    <tr>\n",
557 |        "      <th>0</th>\n",
558 |        "      <td>2HG0Z4C2.mp3</td>\n",
559 |        "      <td>.\\sample_mp3\\2HG0Z4C2.mp3</td>\n",
560 |        "      <td>Can't Feel My Face</td>\n",
561 |        "      <td>Can't Feel My Face.mp3</td>\n",
562 |        "      <td>.\\mp3\\Can't Feel My Face.mp3</td>\n",
563 |        "    </tr>\n",
564 |        "    <tr>\n",
565 |        "      <th>1</th>\n",
566 |        "      <td>2W1KHXPI.mp3</td>\n",
567 |        "      <td>.\\sample_mp3\\2W1KHXPI.mp3</td>\n",
568 |        "      <td>What Do You Mean</td>\n",
569 |        "      <td>What Do You Mean.mp3</td>\n",
570 |        "      <td>.\\mp3\\What Do You Mean.mp3</td>\n",
571 |        "    </tr>\n",
572 |        "    <tr>\n",
573 |        "      <th>2</th>\n",
574 |        "      <td>3KTIS7NN.mp3</td>\n",
575 |        "      <td>.\\sample_mp3\\3KTIS7NN.mp3</td>\n",
576 |        "      <td>Watch Me</td>\n",
577 |        "      <td>Watch Me.mp3</td>\n",
578 |        "      <td>.\\mp3\\Watch Me.mp3</td>\n",
579 |        "    </tr>\n",
580 |        "    <tr>\n",
581 |        "      <th>3</th>\n",
582 |        "      <td>BWD5GM5Q.mp3</td>\n",
583 |        "      <td>.\\sample_mp3\\BWD5GM5Q.mp3</td>\n",
584 |        "      <td>Cheerleader</td>\n",
585 |        "      <td>Cheerleader.mp3</td>\n",
586 |        "      <td>.\\mp3\\Cheerleader.mp3</td>\n",
587 |        "    </tr>\n",
588 |        "    <tr>\n",
589 |        "      <th>4</th>\n",
590 |        "      <td>DJJUFDQX.mp3</td>\n",
591 |        "      <td>.\\sample_mp3\\DJJUFDQX.mp3</td>\n",
592 |        "      <td>Lean on</td>\n",
593 |        "      <td>Lean on.mp3</td>\n",
594 |        "      <td>.\\mp3\\Lean on.mp3</td>\n",
595 |        "    </tr>\n",
596 |        "  </tbody>\n",
597 |        "</table>\n",
598 |        "</div>"
599 |       ],
600 |       "text/plain": [
601 |        "       filename                   filepath               title  \\\n",
602 |        "0  2HG0Z4C2.mp3  .\\sample_mp3\\2HG0Z4C2.mp3  Can't Feel My Face   \n",
603 |        "1  2W1KHXPI.mp3  .\\sample_mp3\\2W1KHXPI.mp3    What Do You Mean   \n",
604 |        "2  3KTIS7NN.mp3  .\\sample_mp3\\3KTIS7NN.mp3            Watch Me   \n",
605 |        "3  BWD5GM5Q.mp3  .\\sample_mp3\\BWD5GM5Q.mp3         Cheerleader   \n",
606 |        "4  DJJUFDQX.mp3  .\\sample_mp3\\DJJUFDQX.mp3             Lean on   \n",
607 |        "\n",
608 |        "             new_filename                  new_filepath  \n",
609 |        "0  Can't Feel My Face.mp3  .\\mp3\\Can't Feel My Face.mp3  \n",
610 |        "1    What Do You Mean.mp3    .\\mp3\\What Do You Mean.mp3  \n",
611 |        "2            Watch Me.mp3            .\\mp3\\Watch Me.mp3  \n",
612 |        "3         Cheerleader.mp3         .\\mp3\\Cheerleader.mp3  \n",
613 |        "4             Lean on.mp3             .\\mp3\\Lean on.mp3  "
614 |       ]
615 |      },
616 |      "execution_count": 11,
617 |      "metadata": {},
618 |      "output_type": "execute_result"
619 |     }
620 |    ],
621 |    "source": [
622 |     "mp3_df['new_filepath'] = \\\n",
623 |     "mp3_df.new_filename.apply(lambda x : os.path.join('.', 'mp3',x))\n",
624 |     "mp3_df.head()"
625 |    ]
626 |   },
627 |   {
628 |    "cell_type": "markdown",
629 |    "metadata": {},
630 |    "source": [
631 |     "#### Copying each mp3 file from its filepath to its new_filepath"
632 |    ]
633 |   },
634 |   {
635 |    "cell_type": "code",
636 |    "execution_count": 12,
637 |    "metadata": {},
638 |    "outputs": [],
639 |    "source": [
640 |     "if 'mp3' in os.listdir():\n",
641 |     "    shutil.rmtree('./mp3')\n",
642 |     "os.mkdir('mp3')"
643 |    ]
644 |   },
645 |   {
646 |    "cell_type": "code",
647 |    "execution_count": 13,
648 |    "metadata": {},
649 |    "outputs": [],
650 |    "source": [
651 |     "for index, row in mp3_df.iterrows(): # yields (index, row) pairs, similar to dict.items()\n",
652 |     "    filepath = row['filepath']\n",
653 |     "    new_filepath = row['new_filepath']\n",
654 |     "    shutil.copy2(filepath, new_filepath)"
655 |    ]
656 |   }
657 |  ],
658 |  "metadata": {
659 |   "kernelspec": {
660 |    "display_name": "Python 3",
661 |    "language": "python",
662 |    "name": "python3"
663 |   },
664 |   "language_info": {
665 |    "codemirror_mode": {
666 |     "name": "ipython",
667 |     "version": 3
668 |    },
669 |    "file_extension": ".py",
670 |    "mimetype": "text/x-python",
671 |    "name": "python",
672 |    "nbconvert_exporter": "python",
673 |    "pygments_lexer": "ipython3",
674 |    "version": "3.6.2"
675 |   },
676 |   "varInspector": {
677 |    "cols": {
678 |     "lenName": 16,
679 |     "lenType": 16,
680 |     "lenVar": 40
681 |    },
682 |    "kernels_config": {
683 |     "python": {
684 |      "delete_cmd_postfix": "",
685 |      "delete_cmd_prefix": "del ",
686 |      "library": "var_list.py",
687 |      "varRefreshCmd": "print(var_dic_list())"
688 |     },
689 |     "r": {
690 |      "delete_cmd_postfix": ") ",
691 |      "delete_cmd_prefix": "rm(",
692 |      "library": "var_list.r",
693 |      "varRefreshCmd": "cat(var_dic_list()) "
694 |     }
695 |    },
696 |    "types_to_exclude": [
697 |     "module",
698 |     "function",
699 |     "builtin_function_or_method",
700 |     "instance",
701 |     "_Feature"
702 |    ],
703 |    "window_display": false
704 |   }
705 |  },
706 |  "nbformat": 4,
707 |  "nbformat_minor": 2
708 | }
709 | 
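
The copy step at the end of the notebook above can be tried without mutagen or real audio files. The sketch below is only an illustration under stated assumptions: the `titles` mapping is a hypothetical stand-in for the `filename` and `title` columns of `mp3_df`, the source files are empty placeholders, and everything happens inside a temporary directory rather than `./sample_mp3` and `./mp3`.

```python
import os
import shutil
import tempfile

# Hypothetical filename -> title mapping standing in for mp3_df
titles = {
    '2HG0Z4C2.mp3': "Can't Feel My Face",
    'BWD5GM5Q.mp3': 'Cheerleader',
}

# Work inside a throwaway directory so nothing real is touched
workdir = tempfile.mkdtemp()
src_dir = os.path.join(workdir, 'sample_mp3')
dst_dir = os.path.join(workdir, 'mp3')
os.makedirs(src_dir)
os.makedirs(dst_dir)

# Create empty placeholder files in place of real mp3 files
for filename in titles:
    with open(os.path.join(src_dir, filename), 'wb') as fp:
        fp.write(b'')

# Copy each file to a new path named after its title
for filename, title in titles.items():
    filepath = os.path.join(src_dir, filename)
    new_filepath = os.path.join(dst_dir, title + '.mp3')
    shutil.copy2(filepath, new_filepath)  # copy2 also tries to preserve file metadata

print(sorted(os.listdir(dst_dir)))
```

Titles that contain characters not allowed in filenames (for example `/`) would need to be cleaned first; this sketch does not handle that case.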


--------------------------------------------------------------------------------
/Download raw contents and modify Request Headers.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Downloading files (or images) from the web and modifying Request Headers\n",
  8 |     "_This material is based on the lecture notes from instructor 안수찬's Work Automation with Python Camp (fast campus)._  \n",
  9 |     "Author: 김보섭  "
 10 |    ]
 11 |   },
 12 |   {
 13 |    "cell_type": "markdown",
 14 |    "metadata": {},
 15 |    "source": [
 16 |     "## Setting up a Jupyter development environment (tip)\n",
 17 |     "* Python Shell (python)\n",
 18 |     "* Interactive Shell -> IPython (ipython)\n",
 19 |     "* IPython Notebook (=> a Notebook that can only run Python)\n",
 20 |     "* Jupyter Notebook  \n",
 21 |     "\n",
 22 |     "If you are not setting up the environment with Anaconda, install ipython from cmd with the commands below  \n",
 23 |     "* install ipython : pip install ipython  \n",
 24 |     "* install ipython notebook : pip install ipython[notebook]\n",
 25 |     "    * localhost:8888 (==127.0.0.1)"
 26 |    ]
 27 |   },
 28 |   {
 29 |    "cell_type": "markdown",
 30 |    "metadata": {},
 31 |    "source": [
 32 |     "## Downloading files (or images) from the web"
 33 |    ]
 34 |   },
 35 |   {
 36 |    "cell_type": "code",
 37 |    "execution_count": 1,
 38 |    "metadata": {
 39 |     "collapsed": true
 40 |    },
 41 |    "outputs": [],
 42 |    "source": [
 43 |     "#fp = open('____', '___')\n",
 44 |     "# Download all images on Naver\n",
 45 |     "# .jpg, .png, .svg, ....(?)\n",
 46 |     "image_url = 'http://cdn.www.fastcampus.co.kr/wp-content/uploads/2016/01/fastcampus_logo_345x76.png'"
 47 |    ]
 48 |   },
 49 |   {
 50 |    "cell_type": "code",
 51 |    "execution_count": 2,
 52 |    "metadata": {},
 53 |    "outputs": [
 54 |     {
 55 |      "data": {
 56 |       "text/plain": [
 57 |        "['http:',\n",
 58 |        " '',\n",
 59 |        " 'cdn.www.fastcampus.co.kr',\n",
 60 |        " 'wp-content',\n",
 61 |        " 'uploads',\n",
 62 |        " '2016',\n",
 63 |        " '01',\n",
 64 |        " 'fastcampus_logo_345x76.png']"
 65 |       ]
 66 |      },
 67 |      "execution_count": 2,
 68 |      "metadata": {},
 69 |      "output_type": "execute_result"
 70 |     }
 71 |    ],
 72 |    "source": [
 73 |     "image_url.split('/')"
 74 |    ]
 75 |   },
 76 |   {
 77 |    "cell_type": "code",
 78 |    "execution_count": 3,
 79 |    "metadata": {
 80 |     "collapsed": true
 81 |    },
 82 |    "outputs": [],
 83 |    "source": [
 84 |     "full_filename = image_url.split('/')[-1]"
 85 |    ]
 86 |   },
 87 |   {
 88 |    "cell_type": "code",
 89 |    "execution_count": 4,
 90 |    "metadata": {},
 91 |    "outputs": [
 92 |     {
 93 |      "name": "stdout",
 94 |      "output_type": "stream",
 95 |      "text": [
 96 |       "fastcampus_logo_345x76 png\n"
 97 |      ]
 98 |     }
 99 |    ],
100 |    "source": [
101 |     "filename = full_filename.rsplit('.', 1)[0]\n",
102 |     "file_extension = full_filename.rsplit('.', 1)[1]\n",
103 |     "print(filename, file_extension)"
104 |    ]
105 |   },
106 |   {
107 |    "cell_type": "code",
108 |    "execution_count": 5,
109 |    "metadata": {
110 |     "collapsed": true
111 |    },
112 |    "outputs": [],
113 |    "source": [
114 |     "import os, sys\n",
115 |     "import shutil # High-level File Operator\n",
116 |     "import requests\n",
117 |     "import re\n",
118 |     "from bs4 import BeautifulSoup"
119 |    ]
120 |   },
121 |   {
122 |    "cell_type": "code",
123 |    "execution_count": 6,
124 |    "metadata": {
125 |     "collapsed": true
126 |    },
127 |    "outputs": [],
128 |    "source": [
129 |     "# Download a file (image, document, archive, ...)\n",
130 |     "response = requests.get(image_url, stream = True)\n",
131 |     "with open(full_filename, 'wb') as fp:\n",
132 |     "    shutil.copyfileobj(response.raw, fp)"
133 |    ]
134 |   },
135 |   {
136 |    "cell_type": "code",
137 |    "execution_count": 7,
138 |    "metadata": {},
139 |    "outputs": [
140 |     {
141 |      "name": "stdout",
142 |      "output_type": "stream",
143 |      "text": [
144 |       "[' Download raw contents and modify Request Headers.ipynb', '.ipynb_checkpoints', 'fastcampus_logo_345x76.png', 'os_shutil.ipynb', 'Python_basic1.ipynb', 'Python_basic2.ipynb', 'Python_basic3.ipynb', 'Scrapping static webpage.ipynb', 'Scrapping text mining papers in arXiv.py', 'Selenium.ipynb', 'Static webpage and Dynamic webpage.ipynb', '파이썬']\n"
145 |      ]
146 |     }
147 |    ],
148 |    "source": [
149 |     "# Save the thumbnail images from page 1 (10 posts) of the blog search results, one folder per keyword:\n",
150 |     "# 파이썬/\n",
151 |     "# 노드/\n",
152 |     "# 업무 자동화/\n",
153 |     "print(os.listdir())\n",
154 |     "keywords = ['파이썬', '노드', '업무 자동화']"
155 |    ]
156 |   },
157 |   {
158 |    "cell_type": "code",
159 |    "execution_count": 8,
160 |    "metadata": {
161 |     "collapsed": true
162 |    },
163 |    "outputs": [],
164 |    "source": [
165 |     "def crawl_naver_posts_by(keyword):\n",
166 |     "    print(keyword + ' 크롤링을 시작합니다.')\n",
167 |     "    url = 'https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=1&ie=utf8&query=' + keyword\n",
168 |     "    response = requests.get(url)\n",
169 |     "    dom = BeautifulSoup(response.text, 'html.parser')\n",
170 |     "    post_elements = dom.select('li.sh_blog_top')\n",
171 |     "    \n",
172 |     "    # Create a folder named after the keyword in the working directory, replacing any existing one\n",
173 |     "    if keyword in os.listdir():\n",
174 |     "        shutil.rmtree(keyword)\n",
175 |     "    os.mkdir(keyword)\n",
176 |     "\n",
177 |     "    for post_element in post_elements:\n",
178 |     "        title = post_element.select_one('a.sh_blog_title').attrs.get('title') # title of the blog post\n",
179 |     "        title = re.sub('\\\\W', '', title) # strip non-word characters so the title is usable as a filename\n",
180 |     "        image_url = post_element.select_one('img.sh_blog_thumbnail').attrs.get('src') # link to the thumbnail image\n",
181 |     "        image_response = requests.get(image_url, stream = True) # stream = True keeps the raw body available for copying to a file\n",
182 |     "        filepath = './{keyword}/{title}.jpg'.format(keyword = keyword, title = title) \n",
183 |     "        with open(filepath, 'wb') as fp:\n",
184 |     "            shutil.copyfileobj(image_response.raw, fp)"
185 |    ]
186 |   },
187 |   {
188 |    "cell_type": "code",
189 |    "execution_count": 9,
190 |    "metadata": {},
191 |    "outputs": [
192 |     {
193 |      "name": "stdout",
194 |      "output_type": "stream",
195 |      "text": [
196 |       "파이썬 크롤링을 시작합니다.\n"
197 |      ]
198 |     }
199 |    ],
200 |    "source": [
201 |     "crawl_naver_posts_by('파이썬')"
202 |    ]
203 |   },
204 |   {
205 |    "cell_type": "code",
206 |    "execution_count": 10,
207 |    "metadata": {},
208 |    "outputs": [
209 |     {
210 |      "data": {
211 |       "text/plain": [
212 |        "['코딩파이썬Python001파이썬이뭐지.jpg',\n",
213 |        " '파이썬PythonGUI프로그램PyQt5개발시작하기.jpg',\n",
214 |        " '파이썬기초5편입력출력조건문.jpg',\n",
215 |        " '파이썬으로MQTTPublishSbuscribeClient구현하기.jpg',\n",
216 |        " '파이썬학원기초교육받고SW공부시작하기.jpg']"
217 |       ]
218 |      },
219 |      "execution_count": 10,
220 |      "metadata": {},
221 |      "output_type": "execute_result"
222 |     }
223 |    ],
224 |    "source": [
225 |     "os.listdir('./파이썬')"
226 |    ]
227 |   },
228 |   {
229 |    "cell_type": "markdown",
230 |    "metadata": {},
231 |    "source": [
232 |     "## Example: modifying request headers\n",
233 |     "### Daum Map example\n",
234 |     "Search for Starbucks (스타벅스) on Daum Map (http://map.daum.net) and fetch information about the branches.  \n",
235 |     "(The Host and Referer headers must be set.)"
236 |    ]
237 |   },
238 |   {
239 |    "cell_type": "code",
240 |    "execution_count": 11,
241 |    "metadata": {
242 |     "collapsed": true
243 |    },
244 |    "outputs": [],
245 |    "source": [
246 |     "# 1. Set Host and Referer\n",
247 |     "# 2. Parse the response\n",
248 |     "# Modify the request headers\n",
249 |     "headers = {\n",
250 |     "    'Host' : 'map.search.daum.net',\n",
251 |     "    'Referer' : 'http://map.daum.net/'\n",
252 |     "    \n",
253 |     "}\n",
254 |     "url = 'http://map.search.daum.net/mapsearch/map.daum?callback=jQuery18105775289722381518_1498028927868&q=%EC%8A%A4%ED%83%80%EB%B2%85%EC%8A%A4&msFlag=A&sort=0'\n",
255 |     "response = requests.get(url, headers = headers)"
256 |    ]
257 |   },
258 |   {
259 |    "cell_type": "code",
260 |    "execution_count": 12,
261 |    "metadata": {
262 |     "collapsed": true
263 |    },
264 |    "outputs": [],
265 |    "source": [
266 |     "# 1. Split on '(' and rejoin everything except the first piece (drops the callback name)\n",
267 |     "# 2. Split on ')' and rejoin everything except the last piece (drops the closing ')' and anything after it)\n",
268 |     "# 3. Replace CRLF line breaks with spaces before parsing\n",
269 |     "tmp = '('.join(response.text.split('(')[1:])\n",
270 |     "tmp = ')'.join(tmp.split(')')[:-1])\n",
271 |     "tmp = tmp.replace('\\r\\n', ' ')"
272 |    ]
273 |   },
274 |   {
275 |    "cell_type": "code",
276 |    "execution_count": 13,
277 |    "metadata": {
278 |     "collapsed": true
279 |    },
280 |    "outputs": [],
281 |    "source": [
282 |     "import json\n",
283 |     "results = json.loads(tmp)\n"
284 |    ]
285 |   },
286 |   {
287 |    "cell_type": "code",
288 |    "execution_count": 14,
289 |    "metadata": {},
290 |    "outputs": [
291 |     {
292 |      "data": {
293 |       "text/plain": [
294 |        "dict_keys(['addr_count', 'place_totalcount', 'page_count', 'bus_cnt', 'busStop_cnt', 'bus_recommend', 'bus_recommend_top', 'query', 'org_query', 'guide_message', 'region_depth', 'region_type', 'sort', 'page', 'cate_id', 'mng_type', 'is_franchise', 'is_category', 'highway_yn', 'oil_yn', 'mrank_type', 'exposure_level', 'analysis', 'srcid', 'sn_query', 'samename', 'tile_search', 'target', 'trans_map_type', 'trans_map_str', 'bus', 'address_retry', 'address', 'premium', 'place', 'cateLink', 'category_depth', 'cateGroupList', 'busStop'])"
295 |       ]
296 |      },
297 |      "execution_count": 14,
298 |      "metadata": {},
299 |      "output_type": "execute_result"
300 |     }
301 |    ],
302 |    "source": [
303 |     "results.keys()"
304 |    ]
305 |   },
306 |   {
307 |    "cell_type": "code",
308 |    "execution_count": 15,
309 |    "metadata": {},
310 |    "outputs": [
311 |     {
312 |      "name": "stdout",
313 |      "output_type": "stream",
314 |      "text": [
315 |       "스타벅스 스타벅스|I10080304||||0\n"
316 |      ]
317 |     }
318 |    ],
319 |    "source": [
320 |     "print(results.get('query'), results.get('analysis'))"
321 |    ]
322 |   }
323 |  ],
324 |  "metadata": {
325 |   "kernelspec": {
326 |    "display_name": "Python 3",
327 |    "language": "python",
328 |    "name": "python3"
329 |   },
330 |   "language_info": {
331 |    "codemirror_mode": {
332 |     "name": "ipython",
333 |     "version": 3
334 |    },
335 |    "file_extension": ".py",
336 |    "mimetype": "text/x-python",
337 |    "name": "python",
338 |    "nbconvert_exporter": "python",
339 |    "pygments_lexer": "ipython3",
340 |    "version": "3.6.1"
341 |   }
342 |  },
343 |  "nbformat": 4,
344 |  "nbformat_minor": 2
345 | }
346 | 
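
The JSONP unwrapping done by hand in the Daum Map cell above can also be sketched with a regular expression. This is a minimal sketch, not the notebook's method: `strip_jsonp` is a hypothetical helper name, and the sample payload is made up for illustration rather than taken from a real Daum response.

```python
import json
import re


def strip_jsonp(text):
    """Return the JSON payload inside a JSONP wrapper such as callback({...});"""
    # Greedy match from the first '(' to the last ')', so parentheses
    # that occur inside the payload itself are preserved.
    match = re.search(r'\((.*)\)', text, flags=re.DOTALL)
    if match is None:
        raise ValueError('no JSONP wrapper found')
    return match.group(1)


# Made-up sample payload (assumption), not a real server response.
sample = 'jQuery123_456({"query": "스타벅스", "place": [{"name": "A (B)"}]});'
results = json.loads(strip_jsonp(sample))
print(results['query'], results['place'][0]['name'])
```

Matching the outermost pair of parentheses keeps any parentheses inside place names or addresses intact, which a blanket removal of every `(` and `)` would not.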


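Splitting the URL on `/` and the filename on `.`, as the first cells of the notebook above do, works for the logo used there but can misfire on names with several dots or URLs with a query string. A hedged alternative using only the standard library follows; `split_image_url` is a hypothetical helper, and the example URL is a placeholder, not the one used in the notebook.

```python
import os
from urllib.parse import urlparse


def split_image_url(image_url):
    """Return (filename, extension) for the last path segment of a URL."""
    path = urlparse(image_url).path          # drops any ?query or #fragment
    full_filename = os.path.basename(path)   # last path segment
    filename, file_extension = os.path.splitext(full_filename)
    return filename, file_extension.lstrip('.')


# Placeholder URL (assumption) chosen to exercise the edge cases.
print(split_image_url('https://example.com/img/logo.v2.png?size=large'))
```

`os.path.splitext` splits only on the last dot, so a name such as `logo.v2.png` keeps `logo.v2` as the stem.
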
--------------------------------------------------------------------------------
/Python_basic2.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |  "cells": [
   3 |   {
   4 |    "cell_type": "markdown",
   5 |    "metadata": {},
   6 |    "source": [
   7 |     "# Python basic 2\n",
   8 |     "_This material is based on the lecture notes of instructor 안수찬's course 파이썬을 활용한 업무자동화 Camp (fast campus)._  \n",
   9 |     "Author: 김보섭\n",
  10 |     "\n",
  11 |     "## Summary\n",
  12 |     "* File input and output (1. open, 2. with)\n",
  13 |     "    * read(), readline(), readlines(), ...\n",
  14 |     "    * A function that reads a csv file without knowing its columns in advance\n",
  15 |     "    * A function that generates placeholder column names when the file has no header\n",
  16 |     "* Lambda (anonymous functions)\n",
  17 |     "    * Functional programming   \n",
  18 |     "      (anonymous function lambda + lambda operators: map, reduce, filter)\n",
  19 |     "        * map: applies the same function to every element\n",
  20 |     "        * filter: applies the same function to every element and keeps the elements whose result is true => List\n",
  21 |     "        * reduce: combines the elements step by step until a single value remains\n",
  22 |     "* List Comprehension\n",
  23 |     "    * [i\\_______ for i in elements if \\_________]"
  24 |    ]
  25 |   },
  26 |   {
  27 |    "cell_type": "markdown",
  28 |    "metadata": {},
  29 |    "source": [
  30 |     "## File input and output (1. open, 2. with) \n",
  31 |     "read(), readline(), readlines()"
  32 |    ]
  33 |   },
  34 |   {
  35 |    "cell_type": "code",
  36 |    "execution_count": 1,
  37 |    "metadata": {
  38 |     "collapsed": false
  39 |    },
  40 |    "outputs": [
  41 |     {
  42 |      "name": "stdout",
  43 |      "output_type": "stream",
  44 |      "text": [
  45 |       "D:\\dev\\py-automate\n",
  46 |       "['.ipynb_checkpoints', 'animals.csv', 'animals2.csv', 'fruits.csv', 'hello', 'MyPhotos', 'new_automate', 'os, shutil.ipynb', 'Photos', 'Photos.zip', 'Python_basic1.ipynb', 'Python_basic2.ipynb', 'Python_basic3.ipynb', 'src.txt', 'test.txt']\n"
  47 |      ]
  48 |     }
  49 |    ],
  50 |    "source": [
  51 |     "import os, sys\n",
  52 |     "print(os.getcwd())\n",
  53 |     "print(os.listdir())"
  54 |    ]
  55 |   },
  56 |   {
  57 |    "cell_type": "markdown",
  58 |    "metadata": {},
  59 |    "source": [
  60 |     "### read() "
  61 |    ]
  62 |   },
  63 |   {
  64 |    "cell_type": "code",
  65 |    "execution_count": 2,
  66 |    "metadata": {
  67 |     "collapsed": false
  68 |    },
  69 |    "outputs": [
  70 |     {
  71 |      "data": {
  72 |       "text/plain": [
  73 |        "'강아지,dog\\n고양이,cat\\n물고기,fish\\n원숭이,monkey'"
  74 |       ]
  75 |      },
  76 |      "execution_count": 2,
  77 |      "metadata": {},
  78 |      "output_type": "execute_result"
  79 |     }
  80 |    ],
  81 |    "source": [
  82 |     "# Contents of animals.csv\n",
  83 |     "'''\n",
  84 |     "    강아지,dog\n",
  85 |     "    고양이,cat\n",
  86 |     "    물고기,fish\n",
  87 |     "    원숭이,monkey\n",
  88 |     "'''\n",
  89 |     "# file pointer\n",
  90 |     "# r => read\n",
  91 |     "# . => current directory\n",
  92 |     "# .. => parent directory\n",
  93 |     "# / => path separator (descends into that folder)\n",
  94 |     "fp = open('./animals.csv', mode = 'r', encoding = 'utf-8')\n",
  95 |     "data = fp.read()\n",
  96 |     "fp.close()\n",
  97 |     "data"
  98 |    ]
  99 |   },
 100 |   {
 101 |    "cell_type": "code",
 102 |    "execution_count": 3,
 103 |    "metadata": {
 104 |     "collapsed": false
 105 |    },
 106 |    "outputs": [
 107 |     {
 108 |      "data": {
 109 |       "text/plain": [
 110 |        "'강아지,dog\\n고양이,cat\\n물고기,fish\\n원숭이,monkey'"
 111 |       ]
 112 |      },
 113 |      "execution_count": 3,
 114 |      "metadata": {},
 115 |      "output_type": "execute_result"
 116 |     }
 117 |    ],
 118 |    "source": [
 119 |     "with open('./animals.csv', 'r', encoding = 'utf-8') as fp:\n",
 120 |     "    # fp is open only inside the with block; the file is closed automatically when the block ends\n",
 121 |     "    data = fp.read()\n",
 122 |     "data"
 123 |    ]
 124 |   },
 125 |   {
 126 |    "cell_type": "code",
 127 |    "execution_count": 4,
 128 |    "metadata": {
 129 |     "collapsed": false
 130 |    },
 131 |    "outputs": [
 132 |     {
 133 |      "data": {
 134 |       "text/plain": [
 135 |        "['강아지,dog', '고양이,cat', '물고기,fish', '원숭이,monkey']"
 136 |       ]
 137 |      },
 138 |      "execution_count": 4,
 139 |      "metadata": {},
 140 |      "output_type": "execute_result"
 141 |     }
 142 |    ],
 143 |    "source": [
 144 |     "data.split('\\n')\n",
 145 |     "# split(separator) => splits the string on separator and returns a list"
 146 |    ]
 147 |   },
 148 |   {
 149 |    "cell_type": "code",
 150 |    "execution_count": 5,
 151 |    "metadata": {
 152 |     "collapsed": false
 153 |    },
 154 |    "outputs": [
 155 |     {
 156 |      "data": {
 157 |       "text/plain": [
 158 |        "'강아지,dog\\n고양이,cat\\n물고기,fish\\n원숭이,monkey'"
 159 |       ]
 160 |      },
 161 |      "execution_count": 5,
 162 |      "metadata": {},
 163 |      "output_type": "execute_result"
 164 |     }
 165 |    ],
 166 |    "source": [
 167 |     "# 1. fp.read()\n",
 168 |     "# 2. fp.readline()\n",
 169 |     "# 3. fp.readlines()\n",
 170 |     "'''\n",
 171 |     "    강아지,dog\n",
 172 |     "    고양이,cat\n",
 173 |     "    물고기,fish\n",
 174 |     "    원숭이,monkey\n",
 175 |     "'''\n",
 176 |     "with open('./animals.csv', 'r', encoding = 'utf-8') as fp:\n",
 177 |     "    # fp.read() => returns the whole file as a single string\n",
 178 |     "    data = fp.read() \n",
 179 |     "data"
 180 |    ]
 181 |   },
 182 |   {
 183 |    "cell_type": "markdown",
 184 |    "metadata": {},
 185 |    "source": [
 186 |     "### readline() "
 187 |    ]
 188 |   },
 189 |   {
 190 |    "cell_type": "code",
 191 |    "execution_count": 6,
 192 |    "metadata": {
 193 |     "collapsed": false
 194 |    },
 195 |    "outputs": [
 196 |     {
 197 |      "name": "stdout",
 198 |      "output_type": "stream",
 199 |      "text": [
 200 |       "강아지,dog\n",
 201 |       " 고양이,cat\n",
 202 |       "\n"
 203 |      ]
 204 |     }
 205 |    ],
 206 |    "source": [
 207 |     "with open('./animals.csv', 'r', encoding = 'utf-8') as fp:\n",
 208 |     "    # fp.readline() => reads one line (up to and including the newline character)\n",
 209 |     "    data = fp.readline()\n",
 210 |     "    data2 = fp.readline()\n",
 211 |     "print(data, data2)"
 212 |    ]
 213 |   },
 214 |   {
 215 |    "cell_type": "markdown",
 216 |    "metadata": {},
 217 |    "source": [
 218 |     "### readlines() "
 219 |    ]
 220 |   },
 221 |   {
 222 |    "cell_type": "code",
 223 |    "execution_count": 7,
 224 |    "metadata": {
 225 |     "collapsed": false
 226 |    },
 227 |    "outputs": [
 228 |     {
 229 |      "data": {
 230 |       "text/plain": [
 231 |        "['강아지,dog\\n', '고양이,cat\\n', '물고기,fish\\n', '원숭이,monkey']"
 232 |       ]
 233 |      },
 234 |      "execution_count": 7,
 235 |      "metadata": {},
 236 |      "output_type": "execute_result"
 237 |     }
 238 |    ],
 239 |    "source": [
 240 |     "with open('./animals.csv', 'r', encoding = 'utf-8') as fp:\n",
 241 |     "    # fp.readlines() => returns a list of all lines, split at the newline character\n",
 242 |     "    data = fp.readlines() # each line keeps its trailing \\n, which has to be stripped later\n",
 243 |     "data"
 244 |    ]
 245 |   },
 246 |   {
 247 |    "cell_type": "code",
 248 |    "execution_count": 8,
 249 |    "metadata": {
 250 |     "collapsed": false
 251 |    },
 252 |    "outputs": [
 253 |     {
 254 |      "data": {
 255 |       "text/plain": [
 256 |        "\"\\n    {'Korean' : '강아지', 'English' : 'dog'} # Dict!\\n\""
 257 |       ]
 258 |      },
 259 |      "execution_count": 8,
 260 |      "metadata": {},
 261 |      "output_type": "execute_result"
 262 |     }
 263 |    ],
 264 |    "source": [
 265 |     "# Goal: build the structure below\n",
 266 |     "# A list of dicts\n",
 267 |     "'''\n",
 268 |     "    {'Korean' : '강아지', 'English' : 'dog'} # Dict!\n",
 269 |     "'''"
 270 |    ]
 271 |   },
 272 |   {
 273 |    "cell_type": "code",
 274 |    "execution_count": 9,
 275 |    "metadata": {
 276 |     "collapsed": false
 277 |    },
 278 |    "outputs": [
 279 |     {
 280 |      "data": {
 281 |       "text/plain": [
 282 |        "[{'English name': 'dog', 'Korean name': '강아지'},\n",
 283 |        " {'English name': 'cat', 'Korean name': '고양이'},\n",
 284 |        " {'English name': 'fish', 'Korean name': '물고기'},\n",
 285 |        " {'English name': 'monkey', 'Korean name': '원숭이'}]"
 286 |       ]
 287 |      },
 288 |      "execution_count": 9,
 289 |      "metadata": {},
 290 |      "output_type": "execute_result"
 291 |     }
 292 |    ],
 293 |    "source": [
 294 |     "with open('./animals.csv', 'r', encoding = 'utf-8') as fp:\n",
 295 |     "    data = fp.read()\n",
 296 |     "    rows = data.split('\\n')\n",
 297 |     "    tmp = []\n",
 298 |     "    for row in rows:\n",
 299 |     "        tmp.append({'Korean name' : row.split(',')[0],\n",
 300 |     "                    'English name' : row.split(',')[1]})\n",
 301 |     "tmp"
 302 |    ]
 303 |   },
 304 |   {
 305 |    "cell_type": "code",
 306 |    "execution_count": 10,
 307 |    "metadata": {
 308 |     "collapsed": false
 309 |    },
 310 |    "outputs": [
 311 |     {
 312 |      "data": {
 313 |       "text/html": [
 314 |        "<div>\n",
 315 |        "<table border=\"1\" class=\"dataframe\">\n",
 316 |        "  <thead>\n",
 317 |        "    <tr style=\"text-align: right;\">\n",
 318 |        "      <th></th>\n",
 319 |        "      <th>English name</th>\n",
 320 |        "      <th>Korean name</th>\n",
 321 |        "    </tr>\n",
 322 |        "  </thead>\n",
 323 |        "  <tbody>\n",
 324 |        "    <tr>\n",
 325 |        "      <th>0</th>\n",
 326 |        "      <td>dog</td>\n",
 327 |        "      <td>강아지</td>\n",
 328 |        "    </tr>\n",
 329 |        "    <tr>\n",
 330 |        "      <th>1</th>\n",
 331 |        "      <td>cat</td>\n",
 332 |        "      <td>고양이</td>\n",
 333 |        "    </tr>\n",
 334 |        "    <tr>\n",
 335 |        "      <th>2</th>\n",
 336 |        "      <td>fish</td>\n",
 337 |        "      <td>물고기</td>\n",
 338 |        "    </tr>\n",
 339 |        "    <tr>\n",
 340 |        "      <th>3</th>\n",
 341 |        "      <td>monkey</td>\n",
 342 |        "      <td>원숭이</td>\n",
 343 |        "    </tr>\n",
 344 |        "  </tbody>\n",
 345 |        "</table>\n",
 346 |        "</div>"
 347 |       ],
 348 |       "text/plain": [
 349 |        "  English name Korean name\n",
 350 |        "0          dog         강아지\n",
 351 |        "1          cat         고양이\n",
 352 |        "2         fish         물고기\n",
 353 |        "3       monkey         원숭이"
 354 |       ]
 355 |      },
 356 |      "execution_count": 10,
 357 |      "metadata": {},
 358 |      "output_type": "execute_result"
 359 |     }
 360 |    ],
 361 |    "source": [
 362 |     "import pandas as pd\n",
 363 |     "pd.DataFrame(tmp)"
 364 |    ]
 365 |   },
 366 |   {
 367 |    "cell_type": "markdown",
 368 |    "metadata": {},
 369 |    "source": [
 370 |     "### Column을 모른다고 가정하고 csv 파일을 읽는 함수"
 371 |    ]
 372 |   },
 373 |   {
 374 |    "cell_type": "code",
 375 |    "execution_count": 11,
 376 |    "metadata": {
 377 |     "collapsed": false
 378 |    },
 379 |    "outputs": [
 380 |     {
 381 |      "data": {
 382 |       "text/plain": [
 383 |        "[{'English Name': 'dog', 'Korean Name': '강아지', 'Size': '중형'},\n",
 384 |        " {'English Name': 'cat', 'Korean Name': '고양이', 'Size': '소형'},\n",
 385 |        " {'English Name': 'fish', 'Korean Name': '물고기', 'Size': '소형'},\n",
 386 |        " {'English Name': 'monkey', 'Korean Name': '원숭이', 'Size': '대형'}]"
 387 |       ]
 388 |      },
 389 |      "execution_count": 11,
 390 |      "metadata": {},
 391 |      "output_type": "execute_result"
 392 |     }
 393 |    ],
 394 |    "source": [
 395 |     "# Script => function\n",
 396 |     "# Do not assume the columns (English name, Korean name) are known in advance; read them from the file so that any csv works => function\n",
 397 |     "# Contents of animals2.csv\n",
 398 |     "'''\n",
 399 |     "    English Name,Korean Name,Size\n",
 400 |     "    dog,강아지,중형\n",
 401 |     "    cat,고양이,소형\n",
 402 |     "    fish,물고기,소형\n",
 403 |     "    monkey,원숭이,대형\n",
 404 |     "'''\n",
 405 |     "\n",
 406 |     "with open('./animals2.csv', 'r', encoding = 'utf-8') as fp:\n",
 407 |     "    data = fp.read()\n",
 408 |     "    rows = data.split('\\n')\n",
 409 |     "    result = []\n",
 410 |     "    \n",
 411 |     "    # 1. Read the column names from the header row\n",
 412 |     "    columns = rows[0].split(',')\n",
 413 |     "    \n",
 414 |     "    # 2. Process the data rows\n",
 415 |     "    tmp = rows[1:]\n",
 416 |     "    for row in tmp:\n",
 417 |     "        row_datas = row.split(',')\n",
 418 |     "        row_dict = {}\n",
 419 |     "        \n",
 420 |     "        # Loop over the columns and\n",
 421 |     "        # store each value under its column name.\n",
 422 |     "        \n",
 423 |     "        for column_index in range(len(columns)): # works for any number of columns:\n",
 424 |     "            column = columns[column_index]       # the loop runs once per column\n",
 425 |     "            row_dict[column] = row_datas[column_index]\n",
 426 |     "            \n",
 427 |     "        result.append(row_dict)\n",
 428 |     "result"
 429 |    ]
 430 |   },
 431 |   {
 432 |    "cell_type": "code",
 433 |    "execution_count": 12,
 434 |    "metadata": {
 435 |     "collapsed": false
 436 |    },
 437 |    "outputs": [],
 438 |    "source": [
 439 |     "# Turn the script into a function\n",
 440 |     "# 1. Input => file path\n",
 441 |     "# 2. Output => 'result' (a list of dicts)\n",
 442 |     "def read_csv(filepath, encoding = 'utf-8'):\n",
 443 |     "    with open(filepath, 'r', encoding = encoding) as fp:\n",
 444 |     "        data = fp.read()\n",
 445 |     "        rows = data.split('\\n')\n",
 446 |     "        result = []\n",
 447 |     "\n",
 448 |     "        # 1. Read the column names from the header row\n",
 449 |     "        columns = rows[0].split(',')\n",
 450 |     "\n",
 451 |     "        # 2. Process the data rows\n",
 452 |     "        tmp = rows[1:]\n",
 453 |     "        for row in tmp:\n",
 454 |     "            row_datas = row.split(',')\n",
 455 |     "            row_dict = {}\n",
 456 |     "\n",
 457 |     "            # Loop over the columns and\n",
 458 |     "            # store each value under its column name.\n",
 459 |     "\n",
 460 |     "            for column_index in range(len(columns)): # works for any number of columns:\n",
 461 |     "                column = columns[column_index]       # the loop runs once per column\n",
 462 |     "                row_dict[column] = row_datas[column_index]\n",
 463 |     "\n",
 464 |     "            result.append(row_dict)\n",
 465 |     "    return result"
 466 |    ]
 467 |   },
 468 |   {
 469 |    "cell_type": "code",
 470 |    "execution_count": 13,
 471 |    "metadata": {
 472 |     "collapsed": false
 473 |    },
 474 |    "outputs": [
 475 |     {
 476 |      "data": {
 477 |       "text/plain": [
 478 |        "[{'English Name': 'dog', 'Korean Name': '강아지', 'Size': '중형'},\n",
 479 |        " {'English Name': 'cat', 'Korean Name': '고양이', 'Size': '소형'},\n",
 480 |        " {'English Name': 'fish', 'Korean Name': '물고기', 'Size': '소형'},\n",
 481 |        " {'English Name': 'monkey', 'Korean Name': '원숭이', 'Size': '대형'}]"
 482 |       ]
 483 |      },
 484 |      "execution_count": 13,
 485 |      "metadata": {},
 486 |      "output_type": "execute_result"
 487 |     }
 488 |    ],
 489 |    "source": [
 490 |     "read_csv('./animals2.csv')"
 491 |    ]
 492 |   },
 493 |   {
 494 |    "cell_type": "code",
 495 |    "execution_count": 14,
 496 |    "metadata": {
 497 |     "collapsed": false
 498 |    },
 499 |    "outputs": [
 500 |     {
 501 |      "data": {
 502 |       "text/plain": [
 503 |        "[{'dog': 'cat', '강아지': '고양이'},\n",
 504 |        " {'dog': 'fish', '강아지': '물고기'},\n",
 505 |        " {'dog': 'monkey', '강아지': '원숭이'}]"
 506 |       ]
 507 |      },
 508 |      "execution_count": 14,
 509 |      "metadata": {},
 510 |      "output_type": "execute_result"
 511 |     }
 512 |    ],
 513 |    "source": [
 514 |     "read_csv('./animals.csv')"
 515 |    ]
 516 |   },
 517 |   {
 518 |    "cell_type": "markdown",
 519 |    "metadata": {},
 520 |    "source": [
 521 |     "### Column이 없을 때 임의의 칼럼이름을 만들어서 읽는 함수"
 522 |    ]
 523 |   },
 524 |   {
 525 |    "cell_type": "code",
 526 |    "execution_count": 15,
 527 |    "metadata": {
 528 |     "collapsed": false
 529 |    },
 530 |    "outputs": [],
 531 |    "source": [
 532 |     "# A more general-purpose function\n",
 533 |     "# 1. Input => file path\n",
 534 |     "# 2. Output => 'result' (a list of dicts)\n",
 535 |     "def read_txt(filepath, separator = ',', header = True, encoding = 'utf-8'): # header option added\n",
 536 |     "    if header:\n",
 537 |     "        with open(filepath, 'r', encoding = encoding) as fp:\n",
 538 |     "            data = fp.read()\n",
 539 |     "            rows = data.split('\\n')\n",
 540 |     "            result = []\n",
 541 |     "\n",
 542 |     "            # 1. Read the column names from the header row\n",
 543 |     "            columns = rows[0].split(separator)\n",
 544 |     "\n",
 545 |     "            # 2. Process the data rows\n",
 546 |     "            tmp = rows[1:]\n",
 547 |     "            for row in tmp:\n",
 548 |     "                row_datas = row.split(separator)\n",
 549 |     "                row_dict = {}\n",
 550 |     "\n",
 551 |     "                # Loop over the columns and\n",
 552 |     "                # store each value under its column name.\n",
 553 |     "\n",
 554 |     "                for column_index in range(len(columns)): # works for any number of columns:\n",
 555 |     "                    column = columns[column_index]       # the loop runs once per column\n",
 556 |     "                    row_dict[column] = row_datas[column_index]\n",
 557 |     "\n",
 558 |     "                result.append(row_dict)\n",
 559 |     "        return result\n",
 560 |     "    else:\n",
 561 |     "        with open(filepath, 'r', encoding = encoding) as fp:\n",
 562 |     "            data = fp.read()\n",
 563 |     "            rows = data.split('\\n')\n",
 564 |     "            result = []\n",
 565 |     "\n",
 566 |     "            # 1. No header row: generate placeholder column names V1, V2, ...\n",
 567 |     "            columns = [('V' + str(i + 1)) for i in range(len(rows[0].split(separator)))]\n",
 568 |     "\n",
 569 |     "            # 2. Process the data rows (every row is data because there is no header)\n",
 570 |     "            tmp = rows\n",
 571 |     "            for row in tmp:\n",
 572 |     "                row_datas = row.split(separator)\n",
 573 |     "                row_dict = {}\n",
 574 |     "\n",
 575 |     "                # Loop over the columns and store each value under its column name.\n",
 576 |     "\n",
 577 |     "                for column_index in range(len(columns)): # works for any number of columns:\n",
 578 |     "                    column = columns[column_index]       # the loop runs once per column\n",
 579 |     "                    row_dict[column] = row_datas[column_index]\n",
 580 |     "\n",
 581 |     "                result.append(row_dict)\n",
 582 |     "        return result"
 583 |    ]
 584 |   },
 585 |   {
 586 |    "cell_type": "code",
 587 |    "execution_count": 16,
 588 |    "metadata": {
 589 |     "collapsed": false
 590 |    },
 591 |    "outputs": [
 592 |     {
 593 |      "data": {
 594 |       "text/plain": [
 595 |        "[{'V1': '강아지', 'V2': 'dog'},\n",
 596 |        " {'V1': '고양이', 'V2': 'cat'},\n",
 597 |        " {'V1': '물고기', 'V2': 'fish'},\n",
 598 |        " {'V1': '원숭이', 'V2': 'monkey'}]"
 599 |       ]
 600 |      },
 601 |      "execution_count": 16,
 602 |      "metadata": {},
 603 |      "output_type": "execute_result"
 604 |     }
 605 |    ],
 606 |    "source": [
 607 |     "# Contents of animals.csv\n",
 608 |     "'''\n",
 609 |     "    강아지,dog\n",
 610 |     "    고양이,cat\n",
 611 |     "    물고기,fish\n",
 612 |     "    원숭이,monkey\n",
 613 |     "'''\n",
 614 |     "read_txt('./animals.csv', header = False)"
 615 |    ]
 616 |   },
 617 |   {
 618 |    "cell_type": "code",
 619 |    "execution_count": 17,
 620 |    "metadata": {
 621 |     "collapsed": false
 622 |    },
 623 |    "outputs": [
 624 |     {
 625 |      "data": {
 626 |       "text/plain": [
 627 |        "[{'Name': '사과', 'Size': '중형'}, {'Name': '수박', 'Size': '대형'}]"
 628 |       ]
 629 |      },
 630 |      "execution_count": 17,
 631 |      "metadata": {},
 632 |      "output_type": "execute_result"
 633 |     }
 634 |    ],
 635 |    "source": [
 636 |     "# Contents of fruits.csv\n",
 637 |     "'''\n",
 638 |     "    Name|Size\n",
 639 |     "    사과|중형\n",
 640 |     "    수박|대형\n",
 641 |     "'''\n",
 642 |     "read_txt('./fruits.csv', separator ='|')"
 643 |    ]
 644 |   },
 645 |   {
 646 |    "cell_type": "markdown",
 647 |    "metadata": {},
 648 |    "source": [
 649 |     "## Lambda\n",
 650 |     "* Lambda (an anonymous function, as opposed to a named function) + lambda operators (map, filter, reduce)"
 651 |    ]
 652 |   },
 653 |   {
 654 |    "cell_type": "code",
 655 |    "execution_count": 18,
 656 |    "metadata": {
 657 |     "collapsed": false
 658 |    },
 659 |    "outputs": [
 660 |     {
 661 |      "data": {
 662 |       "text/plain": [
 663 |        "<function __main__.double>"
 664 |       ]
 665 |      },
 666 |      "execution_count": 18,
 667 |      "metadata": {},
 668 |      "output_type": "execute_result"
 669 |     }
 670 |    ],
 671 |    "source": [
 672 |     "# Double\n",
 673 |     "def double(x):\n",
 674 |     "    return 2 * x\n",
 675 |     "double"
 676 |    ]
 677 |   },
 678 |   {
 679 |    "cell_type": "code",
 680 |    "execution_count": 19,
 681 |    "metadata": {
 682 |     "collapsed": false
 683 |    },
 684 |    "outputs": [
 685 |     {
 686 |      "data": {
 687 |       "text/plain": [
 688 |        "<function __main__.<lambda>>"
 689 |       ]
 690 |      },
 691 |      "execution_count": 19,
 692 |      "metadata": {},
 693 |      "output_type": "execute_result"
 694 |     }
 695 |    ],
 696 |    "source": [
 697 |     "lambda x : 2 * x"
 698 |    ]
 699 |   },
 700 |   {
 701 |    "cell_type": "code",
 702 |    "execution_count": 20,
 703 |    "metadata": {
 704 |     "collapsed": false
 705 |    },
 706 |    "outputs": [
 707 |     {
 708 |      "data": {
 709 |       "text/plain": [
 710 |        "200"
 711 |       ]
 712 |      },
 713 |      "execution_count": 20,
 714 |      "metadata": {},
 715 |      "output_type": "execute_result"
 716 |     }
 717 |    ],
 718 |    "source": [
 719 |     "(lambda x : 2 * x)(100)"
 720 |    ]
 721 |   },
 722 |   {
 723 |    "cell_type": "code",
 724 |    "execution_count": 21,
 725 |    "metadata": {
 726 |     "collapsed": false
 727 |    },
 728 |    "outputs": [
 729 |     {
 730 |      "data": {
 731 |       "text/plain": [
 732 |        "[2, 4, 6, 8, 10]"
 733 |       ]
 734 |      },
 735 |      "execution_count": 21,
 736 |      "metadata": {},
 737 |      "output_type": "execute_result"
 738 |     }
 739 |    ],
 740 |    "source": [
 741 |     "# [1, 2, 3, 4, 5] List => [2, 4, 6, 8, 10]\n",
 742 |     "# 1. Loop over the list with a for statement,\n",
 743 |     "# 2. and apply the 'same function' to each element\n",
 744 |     "my_list = [1, 2, 3, 4, 5]\n",
 745 |     "result = []\n",
 746 |     "for element in my_list:\n",
 747 |     "    result.append(double(element))\n",
 748 |     "result"
 749 |    ]
 750 |   },
 751 |   {
 752 |    "cell_type": "markdown",
 753 |    "metadata": {},
 754 |    "source": [
 755 |     "### map(function, list)  => list\n",
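 756 |     "Applies the same function to each element of the list and returns the results; in Python 3, map returns an iterator, so wrap it in list() to get a new list"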
 757 |    ]
 758 |   },
 759 |   {
 760 |    "cell_type": "code",
 761 |    "execution_count": 22,
 762 |    "metadata": {
 763 |     "collapsed": false
 764 |    },
 765 |    "outputs": [
 766 |     {
 767 |      "data": {
 768 |       "text/plain": [
 769 |        "[2, 4, 6, 8, 10]"
 770 |       ]
 771 |      },
 772 |      "execution_count": 22,
 773 |      "metadata": {},
 774 |      "output_type": "execute_result"
 775 |     }
 776 |    ],
 777 |    "source": [
 778 |     "# map (a lambda operator)\n",
 779 |     "# applies the \"same function\" to every element (a lambda fits well here)\n",
 780 |     "list(map(double,[1,2,3,4,5])) # double is simple enough to be written as an anonymous function instead"
 781 |    ]
 782 |   },
 783 |   {
 784 |    "cell_type": "code",
 785 |    "execution_count": 23,
 786 |    "metadata": {
 787 |     "collapsed": false
 788 |    },
 789 |    "outputs": [
 790 |     {
 791 |      "data": {
 792 |       "text/plain": [
 793 |        "[2, 4, 6, 8, 10]"
 794 |       ]
 795 |      },
 796 |      "execution_count": 23,
 797 |      "metadata": {},
 798 |      "output_type": "execute_result"
 799 |     }
 800 |    ],
 801 |    "source": [
 802 |     "list(map(lambda x: 2*x,[1,2,3,4,5]))"
 803 |    ]
 804 |   },
 805 |   {
 806 |    "cell_type": "code",
 807 |    "execution_count": 24,
 808 |    "metadata": {
 809 |     "collapsed": false
 810 |    },
 811 |    "outputs": [
 812 |     {
 813 |      "data": {
 814 |       "text/plain": [
 815 |        "[{'Name': '사과', 'Size': '중형'}, {'Name': '수박', 'Size': '대형'}]"
 816 |       ]
 817 |      },
 818 |      "execution_count": 24,
 819 |      "metadata": {},
 820 |      "output_type": "execute_result"
 821 |     }
 822 |    ],
 823 |    "source": [
 824 |     "read_lsv = lambda filepath : read_txt(filepath, separator = '|')\n",
 825 |     "read_lsv('./fruits.csv')"
 826 |    ]
 827 |   },
 828 |   {
 829 |    "cell_type": "code",
 830 |    "execution_count": 25,
 831 |    "metadata": {
 832 |     "collapsed": false
 833 |    },
 834 |    "outputs": [
 835 |     {
 836 |      "data": {
 837 |       "text/plain": [
 838 |        "[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]"
 839 |       ]
 840 |      },
 841 |      "execution_count": 25,
 842 |      "metadata": {},
 843 |      "output_type": "execute_result"
 844 |     }
 845 |    ],
 846 |    "source": [
 847 |     "list(map(lambda x: x * 2,\n",
 848 |     "         range(10) # any iterable can be passed in\n",
 849 |     "        ))"
 850 |    ]
 851 |   },
 852 |   {
 853 |    "cell_type": "markdown",
 854 |    "metadata": {},
 855 |    "source": [
 856 |     "### filter(function, list) => list\n",
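 857 |     "Applies the same function to each element of the list and keeps only the elements for which the function returns True. In Python 3, filter returns a lazy iterator, so it is wrapped in list() to get a new list.\n"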
 858 |    ]
 859 |   },
 860 |   {
 861 |    "cell_type": "code",
 862 |    "execution_count": 26,
 863 |    "metadata": {
 864 |     "collapsed": false
 865 |    },
 866 |    "outputs": [
 867 |     {
 868 |      "data": {
 869 |       "text/plain": [
 870 |        "[100, 101, 105, 106]"
 871 |       ]
 872 |      },
 873 |      "execution_count": 26,
 874 |      "metadata": {},
 875 |      "output_type": "execute_result"
 876 |     }
 877 |    ],
 878 |    "source": [
 879 |     "# Keep only the numbers in the list that are 100 or greater\n",
 880 |     "list(filter( lambda x : x>=100,\n",
 881 |     "       [1,100,101,3,4,105,106]\n",
 882 |     "      ))"
 883 |    ]
 884 |   },
 885 |   {
 886 |    "cell_type": "code",
 887 |    "execution_count": 27,
 888 |    "metadata": {
 889 |     "collapsed": false
 890 |    },
 891 |    "outputs": [
 892 |     {
 893 |      "data": {
 894 |       "text/plain": [
 895 |        "[64, 81, 100]"
 896 |       ]
 897 |      },
 898 |      "execution_count": 27,
 899 |      "metadata": {},
 900 |      "output_type": "execute_result"
 901 |     }
 902 |    ],
 903 |    "source": [
 904 |     "# Natural numbers 1-10 => list of only the squares that are 50 or greater\n",
 905 |     "# Using a for loop\n",
 906 |     "tmp = []\n",
 907 |     "for number in range(1,11):\n",
 908 |     "    if number ** 2 >= 50:\n",
 909 |     "        tmp.append((number) ** 2)\n",
 910 |     "tmp   "
 911 |    ]
 912 |   },
 913 |   {
 914 |    "cell_type": "code",
 915 |    "execution_count": 28,
 916 |    "metadata": {
 917 |     "collapsed": false
 918 |    },
 919 |    "outputs": [
 920 |     {
 921 |      "data": {
 922 |       "text/plain": [
 923 |        "[64, 81, 100]"
 924 |       ]
 925 |      },
 926 |      "execution_count": 28,
 927 |      "metadata": {},
 928 |      "output_type": "execute_result"
 929 |     }
 930 |    ],
 931 |    "source": [
 932 |     "# map, filter\n",
 933 |     "tmp = list(filter(lambda x : x>=50,\n",
 934 |     "             map(lambda x : x ** 2, range(1, 11))))\n",
 935 |     "tmp"
 936 |    ]
 937 |   },
 938 |   {
 939 |    "cell_type": "markdown",
 940 |    "metadata": {},
 941 |    "source": [
 942 |     "### reduce(function, list) => value"
 943 |    ]
 944 |   },
 945 |   {
 946 |    "cell_type": "code",
 947 |    "execution_count": 29,
 948 |    "metadata": {
 949 |     "collapsed": false
 950 |    },
 951 |    "outputs": [
 952 |     {
 953 |      "data": {
 954 |       "text/plain": [
 955 |        "15"
 956 |       ]
 957 |      },
 958 |      "execution_count": 29,
 959 |      "metadata": {},
 960 |      "output_type": "execute_result"
 961 |     }
 962 |    ],
 963 |    "source": [
 964 |     "# Example: sum of 1 through 5\n",
 965 |     "result = 0\n",
 966 |     "for i in range(1,6):\n",
 967 |     "    result = result + i\n",
 968 |     "result"
 969 |    ]
 970 |   },
 971 |   {
 972 |    "cell_type": "code",
 973 |    "execution_count": 30,
 974 |    "metadata": {
 975 |     "collapsed": false
 976 |    },
 977 |    "outputs": [
 978 |     {
 979 |      "data": {
 980 |       "text/plain": [
 981 |        "15"
 982 |       ]
 983 |      },
 984 |      "execution_count": 30,
 985 |      "metadata": {},
 986 |      "output_type": "execute_result"
 987 |     }
 988 |    ],
 989 |    "source": [
 990 |     "# reduce trace: at each step the first two values are replaced by their sum\n",
 991 |     "# 1, 2, 3, 4, 5\n",
 992 |     "#    3, 3, 4, 5\n",
 993 |     "#       6, 4, 5\n",
 994 |     "#         10, 5\n",
 995 |     "#            15\n",
 996 |     "from functools import reduce\n",
 997 |     "reduce(\n",
 998 |     "    lambda a, b: a+b,\n",
 999 |     "    range(1,6)\n",
1000 |     ")"
1001 |    ]
1002 |   },
1003 |   {
1004 |    "cell_type": "code",
1005 |    "execution_count": 31,
1006 |    "metadata": {
1007 |     "collapsed": false
1008 |    },
1009 |    "outputs": [
1010 |     {
1011 |      "data": {
1012 |       "text/plain": [
1013 |        "105"
1014 |       ]
1015 |      },
1016 |      "execution_count": 31,
1017 |      "metadata": {},
1018 |      "output_type": "execute_result"
1019 |     }
1020 |    ],
1021 |    "source": [
1022 |     "# input : a list\n",
1023 |     "# output : the largest number among the list's elements\n",
1024 |     "# get_max() => filter...\n",
1025 |     "def get_max(elements):\n",
1026 |     "    tmp = elements[0]\n",
1027 |     "    \n",
1028 |     "    for element in elements:\n",
1029 |     "        if element >= tmp:\n",
1030 |     "            tmp = element\n",
1031 |     "    return tmp\n",
1032 |     "get_max([1, 100, 2,3, 4, 105, 6])"
1033 |    ]
1034 |   },
1035 |   {
1036 |    "cell_type": "code",
1037 |    "execution_count": 32,
1038 |    "metadata": {
1039 |     "collapsed": false
1040 |    },
1041 |    "outputs": [
1042 |     {
1043 |      "data": {
1044 |       "text/plain": [
1045 |        "105"
1046 |       ]
1047 |      },
1048 |      "execution_count": 32,
1049 |      "metadata": {},
1050 |      "output_type": "execute_result"
1051 |     }
1052 |    ],
1053 |    "source": [
1054 |     "# 1, 100, 2, 3, 4, 105, 6\n",
1055 |     "# 100, 2, 3, 4, 105, 6\n",
1056 |     "# 100, 3, 4, 105, 6\n",
1057 |     "# 100, 4, 105, 6\n",
1058 |     "# 100, 105, 6\n",
1059 |     "# 105, 6\n",
1060 |     "# 105\n",
1061 |     "reduce(\n",
1062 |     "    lambda a, b: a if a > b else b,\n",
1063 |     "    [1, 100, 2,3, 4, 105, 6]\n",
1064 |     ")"
1065 |    ]
1066 |   },
1067 |   {
1068 |    "cell_type": "code",
1069 |    "execution_count": 33,
1070 |    "metadata": {
1071 |     "collapsed": false
1072 |    },
1073 |    "outputs": [
1074 |     {
1075 |      "data": {
1076 |       "text/plain": [
1077 |        "105"
1078 |       ]
1079 |      },
1080 |      "execution_count": 33,
1081 |      "metadata": {},
1082 |      "output_type": "execute_result"
1083 |     }
1084 |    ],
1085 |    "source": [
1086 |     "get_max = lambda elements: reduce(\n",
1087 |     "    lambda a, b: a if a > b else b,\n",
1088 |     "    elements\n",
1089 |     ")\n",
1090 |     "get_max([1, 100, 2,3, 4, 105, 6])"
1091 |    ]
1092 |   },
1093 |   {
1094 |    "cell_type": "markdown",
1095 |    "metadata": {},
1096 |    "source": [
1097 |     "## List Comprehension \n",
1098 |     "Written like a list literal, but it expresses the same operation as map and filter with a lambda.  \n",
1099 |     "\n",
1100 |     "[i--- for i in elements if ----]"
1101 |    ]
1102 |   },
1103 |   {
1104 |    "cell_type": "code",
1105 |    "execution_count": 34,
1106 |    "metadata": {
1107 |     "collapsed": false
1108 |    },
1109 |    "outputs": [
1110 |     {
1111 |      "data": {
1112 |       "text/plain": [
1113 |        "[1, 4, 9, 16, 25]"
1114 |       ]
1115 |      },
1116 |      "execution_count": 34,
1117 |      "metadata": {},
1118 |      "output_type": "execute_result"
1119 |     }
1120 |    ],
1121 |    "source": [
1122 |     "list(map(\n",
1123 |     "    lambda x: x**2, # 1. lambda function\n",
1124 |     "    range(1,6)      # 2. list\n",
1125 |     "))"
1126 |    ]
1127 |   },
1128 |   {
1129 |    "cell_type": "code",
1130 |    "execution_count": 35,
1131 |    "metadata": {
1132 |     "collapsed": false
1133 |    },
1134 |    "outputs": [
1135 |     {
1136 |      "data": {
1137 |       "text/plain": [
1138 |        "[1, 4, 9, 16, 25]"
1139 |       ]
1140 |      },
1141 |      "execution_count": 35,
1142 |      "metadata": {},
1143 |      "output_type": "execute_result"
1144 |     }
1145 |    ],
1146 |    "source": [
1147 |     "# [i___ for i in elements if ________]\n",
1148 |     "[  \n",
1149 |     "   i**2 # 1. lambda function  \n",
1150 |     "   for i\n",
1151 |     "   in range(1,6) # 2. list\n",
1152 |     "]"
1153 |    ]
1154 |   },
1155 |   {
1156 |    "cell_type": "code",
1157 |    "execution_count": 36,
1158 |    "metadata": {
1159 |     "collapsed": false
1160 |    },
1161 |    "outputs": [
1162 |     {
1163 |      "data": {
1164 |       "text/plain": [
1165 |        "[101, 104, 105]"
1166 |       ]
1167 |      },
1168 |      "execution_count": 36,
1169 |      "metadata": {},
1170 |      "output_type": "execute_result"
1171 |     }
1172 |    ],
1173 |    "source": [
1174 |     "[\n",
1175 |     "    i                               # 1. map lambda\n",
1176 |     "    for i\n",
1177 |     "    in [101, 2, 3, 104, 105]        # 2. list\n",
1178 |     "    if i >= 100                     # 3. filter lambda\n",
1179 |     "    \n",
1180 |     "]"
1181 |    ]
1182 |   },
1183 |   {
1184 |    "cell_type": "code",
1185 |    "execution_count": 37,
1186 |    "metadata": {
1187 |     "collapsed": false
1188 |    },
1189 |    "outputs": [
1190 |     {
1191 |      "data": {
1192 |       "text/plain": [
1193 |        "[64, 81, 100]"
1194 |       ]
1195 |      },
1196 |      "execution_count": 37,
1197 |      "metadata": {},
1198 |      "output_type": "execute_result"
1199 |     }
1200 |    ],
1201 |    "source": [
1202 |     "# Natural numbers 1-10 => list of only the squares that are 50 or greater\n",
1203 |     "[\n",
1204 |     "    i ** 2\n",
1205 |     "    for i \n",
1206 |     "    in range(1,11)\n",
1207 |     "    if i**2 >= 50   \n",
1208 |     "]"
1209 |    ]
1210 |   },
1211 |   {
1212 |    "cell_type": "code",
1213 |    "execution_count": 38,
1214 |    "metadata": {
1215 |     "collapsed": false
1216 |    },
1217 |    "outputs": [
1218 |     {
1219 |      "data": {
1220 |       "text/plain": [
1221 |        "[{'English name': 'dog', 'Korean name': '강아지'},\n",
1222 |        " {'English name': 'cat', 'Korean name': '고양이'},\n",
1223 |        " {'English name': 'fish', 'Korean name': '물고기'},\n",
1224 |        " {'English name': 'monkey', 'Korean name': '원숭이'}]"
1225 |       ]
1226 |      },
1227 |      "execution_count": 38,
1228 |      "metadata": {},
1229 |      "output_type": "execute_result"
1230 |     }
1231 |    ],
1232 |    "source": [
1233 |     "# Build a list of dictionaries with a list comprehension\n",
1234 |     "'''\n",
1235 |     "    강아지,dog\n",
1236 |     "    고양이,cat\n",
1237 |     "    물고기,fish\n",
1238 |     "    원숭이,monkey\n",
1239 |     "'''\n",
1240 |     "\n",
1241 |     "[\n",
1242 |     "    {\n",
1243 |     "        'Korean name' : row.split(',')[0],\n",
1244 |     "        'English name' : row.split(',')[1]\n",
1245 |     "    }\n",
1246 |     "    \n",
1247 |     "    for row\n",
1248 |     "    in open('./animals.csv', encoding = 'utf-8').read().splitlines()\n",
1249 |     "]"
1250 |    ]
1251 |   }
1252 |  ],
1253 |  "metadata": {
1254 |   "kernelspec": {
1255 |    "display_name": "Python 3",
1256 |    "language": "python",
1257 |    "name": "python3"
1258 |   },
1259 |   "language_info": {
1260 |    "codemirror_mode": {
1261 |     "name": "ipython",
1262 |     "version": 3
1263 |    },
1264 |    "file_extension": ".py",
1265 |    "mimetype": "text/x-python",
1266 |    "name": "python",
1267 |    "nbconvert_exporter": "python",
1268 |    "pygments_lexer": "ipython3",
1269 |    "version": "3.6.0"
1270 |   }
1271 |  },
1272 |  "nbformat": 4,
1273 |  "nbformat_minor": 2
1274 | }
1275 | 

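The map, filter, reduce, and list comprehension cells in the notebook above all work on the same kind of data, so they can be compared side by side. The block below is a minimal sketch and not part of the original notebook; the names `squares_mf`, `squares_lc`, and `total` are chosen here purely for illustration.

```python
from functools import reduce

numbers = range(1, 11)

# map + filter with lambdas: square each number, then keep squares >= 50.
# Both return lazy iterators in Python 3, so list() materializes the result.
squares_mf = list(filter(lambda x: x >= 50, map(lambda x: x ** 2, numbers)))

# The same selection written as a list comprehension.
squares_lc = [n ** 2 for n in numbers if n ** 2 >= 50]

# reduce folds the list into a single value; here it accumulates a sum.
total = reduce(lambda a, b: a + b, squares_lc)

print(squares_mf)  # [64, 81, 100]
print(squares_lc)  # [64, 81, 100]
print(total)       # 245
```

The list comprehension is usually the more readable choice when mapping and filtering happen together, while reduce is the one that collapses a sequence into a single value.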

--------------------------------------------------------------------------------
/Python_basic3.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |  "cells": [
   3 |   {
   4 |    "cell_type": "markdown",
   5 |    "metadata": {},
   6 |    "source": [
   7 |     "# Python basic 3\n",
   8 |     "_This material is based on the lecture notes of instructor 안수찬's Work Automation with Python Camp (fast campus)._  \n",
   9 |     "Author : 김보섭  \n",
  10 |     "## Summary\n",
  11 |     "* Function advanced\n",
  12 |     "    * \\*args(tuple), \\*\\*kwargs(dict)\n",
  13 |     "    * Adding options with \\*args and \\*\\*kwargs  \n",
  14 |     "  \n",
  15 |     "  \n",
  16 |     "* Decorator (a function that returns a function)  \n",
  17 |     "\n",
  18 |     "    A function that returns a function: def \\___________(func) : ... return wrapped  \n",
  19 |     "    @decorator_name  \n",
  20 |     "  \n",
  21 |     "  \n",
  22 |     "* Object-oriented programming  \n",
  23 |     "\n",
  24 |     "    Class, Object(Instance)"
  25 |    ]
  26 |   },
  27 |   {
  28 |    "cell_type": "markdown",
  29 |    "metadata": {},
  30 |    "source": [
  31 |     "## Function advanced\n",
  32 |     "* \\*args (any number of positional arguments => tuple)\n",
  33 |     "* \\*\\*kwargs (any number of named arguments, i.e. keyword arguments => dict)\n",
  34 |     "\n",
  35 |     "args : '\\*' in a function definition packs the arguments; '\\*' in a function call unpacks them  \n",
  36 |     "kwargs : '\\*\\*' in a function definition packs the arguments; '\\*\\*' in a function call unpacks them"
  37 |    ]
  38 |   },
  39 |   {
  40 |    "cell_type": "markdown",
  41 |    "metadata": {},
  42 |    "source": [
  43 |     "### args example : pack "
  44 |    ]
  45 |   },
  46 |   {
  47 |    "cell_type": "code",
  48 |    "execution_count": 1,
  49 |    "metadata": {
  50 |     "collapsed": false
  51 |    },
  52 |    "outputs": [
  53 |     {
  54 |      "name": "stdout",
  55 |      "output_type": "stream",
  56 |      "text": [
  57 |       "(1, 2, 3, 4)\n"
  58 |      ]
  59 |     }
  60 |    ],
  61 |    "source": [
  62 |     "# 1. Function definition (+ convenient functions that others have already written...)\n",
  63 |     "#    function arguments ... + optional ones ... (may be passed or omitted. if passed => 0)\n",
  64 |     "# *args, **kwargs\n",
  65 |     "def get_sum(*args):\n",
  66 |     "    print(args)\n",
  67 |     "get_sum(1,2,3,4)"
  68 |    ]
  69 |   },
  70 |   {
  71 |    "cell_type": "code",
  72 |    "execution_count": 2,
  73 |    "metadata": {
  74 |     "collapsed": false
  75 |    },
  76 |    "outputs": [
  77 |     {
  78 |      "data": {
  79 |       "text/plain": [
  80 |        "10"
  81 |       ]
  82 |      },
  83 |      "execution_count": 2,
  84 |      "metadata": {},
  85 |      "output_type": "execute_result"
  86 |     }
  87 |    ],
  88 |    "source": [
  89 |     "def get_sum(*args): # '*' in a function definition => bundles the arguments into one tuple (pack)\n",
  90 |     "    result = 0\n",
  91 |     "    for arg in args:\n",
  92 |     "        result += arg\n",
  93 |     "    return result\n",
  94 |     "get_sum(1,2,3,4)"
  95 |    ]
  96 |   },
  97 |   {
  98 |    "cell_type": "markdown",
  99 |    "metadata": {},
 100 |    "source": [
 101 |     "### args example : unpack "
 102 |    ]
 103 |   },
 104 |   {
 105 |    "cell_type": "code",
 106 |    "execution_count": 3,
 107 |    "metadata": {
 108 |     "collapsed": false
 109 |    },
 110 |    "outputs": [
 111 |     {
 112 |      "name": "stdout",
 113 |      "output_type": "stream",
 114 |      "text": [
 115 |       "([10, 20, 30, 40],)\n"
 116 |      ]
 117 |     }
 118 |    ],
 119 |    "source": [
 120 |     "# * => pack, unpack\n",
 121 |     "def get_sum(*args):\n",
 122 |     "    print(args)\n",
 123 |     "numbers = [10, 20, 30, 40]\n",
 124 |     "get_sum(numbers)"
 125 |    ]
 126 |   },
 127 |   {
 128 |    "cell_type": "code",
 129 |    "execution_count": 4,
 130 |    "metadata": {
 131 |     "collapsed": false
 132 |    },
 133 |    "outputs": [
 134 |     {
 135 |      "ename": "TypeError",
 136 |      "evalue": "unsupported operand type(s) for +=: 'int' and 'list'",
 137 |      "output_type": "error",
 138 |      "traceback": [
 139 |       "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
 140 |       "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
 141 |       "\u001b[0;32m<ipython-input-4-5901ca22deb1>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      4\u001b[0m         \u001b[0mresult\u001b[0m \u001b[1;33m+=\u001b[0m \u001b[0marg\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m      5\u001b[0m     \u001b[1;32mreturn\u001b[0m \u001b[0mresult\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0mget_sum\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mnumbers\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
 142 |       "\u001b[0;32m<ipython-input-4-5901ca22deb1>\u001b[0m in \u001b[0;36mget_sum\u001b[0;34m(*args)\u001b[0m\n\u001b[1;32m      2\u001b[0m     \u001b[0mresult\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m      3\u001b[0m     \u001b[1;32mfor\u001b[0m \u001b[0marg\u001b[0m \u001b[1;32min\u001b[0m \u001b[0margs\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m         \u001b[0mresult\u001b[0m \u001b[1;33m+=\u001b[0m \u001b[0marg\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m      5\u001b[0m     \u001b[1;32mreturn\u001b[0m \u001b[0mresult\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m      6\u001b[0m \u001b[0mget_sum\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mnumbers\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
 143 |       "\u001b[0;31mTypeError\u001b[0m: unsupported operand type(s) for +=: 'int' and 'list'"
 144 |      ]
 145 |     }
 146 |    ],
 147 |    "source": [
 148 |     "def get_sum(*args): # '*' in a function definition => bundles the arguments into one tuple (pack)\n",
 149 |     "    result = 0\n",
 150 |     "    for arg in args:\n",
 151 |     "        result += arg\n",
 152 |     "    return result\n",
 153 |     "get_sum(numbers)"
 154 |    ]
 155 |   },
 156 |   {
 157 |    "cell_type": "code",
 158 |    "execution_count": 5,
 159 |    "metadata": {
 160 |     "collapsed": false
 161 |    },
 162 |    "outputs": [
 163 |     {
 164 |      "data": {
 165 |       "text/plain": [
 166 |        "100"
 167 |       ]
 168 |      },
 169 |      "execution_count": 5,
 170 |      "metadata": {},
 171 |      "output_type": "execute_result"
 172 |     }
 173 |    ],
 174 |    "source": [
 175 |     "get_sum(*numbers) # '*' in a function call => unpacks the bundled values (unpack)"
 176 |    ]
 177 |   },
 178 |   {
 179 |    "cell_type": "markdown",
 180 |    "metadata": {},
 181 |    "source": [
 182 |     "### kwargs example : pack "
 183 |    ]
 184 |   },
 185 |   {
 186 |    "cell_type": "code",
 187 |    "execution_count": 6,
 188 |    "metadata": {
 189 |     "collapsed": false
 190 |    },
 191 |    "outputs": [
 192 |     {
 193 |      "name": "stdout",
 194 |      "output_type": "stream",
 195 |      "text": [
 196 |       "{}\n"
 197 |      ]
 198 |     }
 199 |    ],
 200 |    "source": [
 201 |     "def hello(**kwargs): # pack!\n",
 202 |     "    print(kwargs)\n",
 203 |     "hello()"
 204 |    ]
 205 |   },
 206 |   {
 207 |    "cell_type": "code",
 208 |    "execution_count": 7,
 209 |    "metadata": {
 210 |     "collapsed": false
 211 |    },
 212 |    "outputs": [
 213 |     {
 214 |      "name": "stdout",
 215 |      "output_type": "stream",
 216 |      "text": [
 217 |       "{'name': 'Boseop', 'email': 'svei89@korea.ac.kr'}\n"
 218 |      ]
 219 |     }
 220 |    ],
 221 |    "source": [
 222 |     "hello(name = 'Boseop', email = 'svei89@korea.ac.kr')"
 223 |    ]
 224 |   },
 225 |   {
 226 |    "cell_type": "markdown",
 227 |    "metadata": {},
 228 |    "source": [
 229 |     "### kwargs example : unpack"
 230 |    ]
 231 |   },
 232 |   {
 233 |    "cell_type": "code",
 234 |    "execution_count": 8,
 235 |    "metadata": {
 236 |     "collapsed": false
 237 |    },
 238 |    "outputs": [
 239 |     {
 240 |      "ename": "TypeError",
 241 |      "evalue": "hello() takes 0 positional arguments but 1 was given",
 242 |      "output_type": "error",
 243 |      "traceback": [
 244 |       "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
 245 |       "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
 246 |       "\u001b[0;32m<ipython-input-8-bb7100839a47>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[0mstudent\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;33m{\u001b[0m\u001b[1;34m'name'\u001b[0m \u001b[1;33m:\u001b[0m \u001b[1;34m'Boseop'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'email'\u001b[0m \u001b[1;33m:\u001b[0m \u001b[1;34m'svei89@korea.ac.kr'\u001b[0m\u001b[1;33m}\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mhello\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mstudent\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
 247 |       "\u001b[0;31mTypeError\u001b[0m: hello() takes 0 positional arguments but 1 was given"
 248 |      ]
 249 |     }
 250 |    ],
 251 |    "source": [
 252 |     "student = {'name' : 'Boseop', 'email' : 'svei89@korea.ac.kr'}\n",
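 253 |     "hello(student) # TypeError: the dict is passed as one positional argument; hello(**student) would unpack it into keyword arguments"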
 254 |    ]
 255 |   },
 256 |   {
 257 |    "cell_type": "markdown",
 258 |    "metadata": {},
 259 |    "source": [
 260 |     "### args, kwargs example "
 261 |    ]
 262 |   },
 263 |   {
 264 |    "cell_type": "code",
 265 |    "execution_count": 9,
 266 |    "metadata": {
 267 |     "collapsed": false
 268 |    },
 269 |    "outputs": [
 270 |     {
 271 |      "name": "stdout",
 272 |      "output_type": "stream",
 273 |      "text": [
 274 |       "(1, 2, 3)\n",
 275 |       "{'name': 'Boseop'}\n"
 276 |      ]
 277 |     }
 278 |    ],
 279 |    "source": [
 280 |     "def hello(*args, **kwargs):\n",
 281 |     "    print(args)\n",
 282 |     "    print(kwargs)\n",
 283 |     "hello(1, 2, 3, name = 'Boseop')"
 284 |    ]
 285 |   },
 286 |   {
 287 |    "cell_type": "code",
 288 |    "execution_count": 10,
 289 |    "metadata": {
 290 |     "collapsed": false
 291 |    },
 292 |    "outputs": [
 293 |     {
 294 |      "name": "stdout",
 295 |      "output_type": "stream",
 296 |      "text": [
 297 |       "안녕하세요, boseop 입니다.\n",
 298 |       "이메일은 svei89@korea.ac.kr\n",
 299 |       "----------------\n",
 300 |       "()\n",
 301 |       "{}\n",
 302 |       "----------------\n"
 303 |      ]
 304 |     }
 305 |    ],
 306 |    "source": [
 307 |     "# A function that introduces oneself\n",
 308 |     "# Required info => name, email\n",
 309 |     "# Optional info =>______________\n",
 310 |     "def hello(name, email, *args, **kwargs):\n",
 311 |     "    print('안녕하세요, ' + name + ' 입니다.')\n",
 312 |     "    print('이메일은 ' + email)\n",
 313 |     "    print('----------------')\n",
 314 |     "    print(args)\n",
 315 |     "    print(kwargs)\n",
 316 |     "    print('----------------')\n",
 317 |     "hello(name = 'boseop', email = 'svei89@korea.ac.kr')"
 318 |    ]
 319 |   },
 320 |   {
 321 |    "cell_type": "markdown",
 322 |    "metadata": {},
 323 |    "source": [
 324 |     "## Decorator  \n",
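 325 |     "A decorator is a function that extends the behavior of an already-defined function. First, the examples below review the properties of Python functions.  \n",
 326 |     "_(This section was written with reference to the blogs below.)_  \n",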
 327 |     "\n",
 328 |     "Source : http://jonnung.github.io/python/2015/08/17/python-decorator/  \n",
 329 |     "Source : http://trowind.tistory.com/72  \n",
 330 |     "Source : http://bluese05.tistory.com/30"
 331 |    ]
 332 |   },
 333 |   {
 334 |    "cell_type": "markdown",
 335 |    "metadata": {},
 336 |    "source": [
 337 |     "### Properties of functions "
 338 |    ]
 339 |   },
 340 |   {
 341 |    "cell_type": "markdown",
 342 |    "metadata": {},
 343 |    "source": [
 344 |     "#### 1. A function can be assigned to a variable. "
 345 |    ]
 346 |   },
 347 |   {
 348 |    "cell_type": "code",
 349 |    "execution_count": 11,
 350 |    "metadata": {
 351 |     "collapsed": false
 352 |    },
 353 |    "outputs": [
 354 |     {
 355 |      "data": {
 356 |       "text/plain": [
 357 |        "'Hello boseop'"
 358 |       ]
 359 |      },
 360 |      "execution_count": 11,
 361 |      "metadata": {},
 362 |      "output_type": "execute_result"
 363 |     }
 364 |    ],
 365 |    "source": [
 366 |     "def greet(name):\n",
 367 |     "    return \"Hello {}\".format(name)\n",
 368 |     "\n",
 369 |     "greet_someone = greet\n",
 370 |     "greet_someone(\"boseop\")"
 371 |    ]
 372 |   },
 373 |   {
 374 |    "cell_type": "markdown",
 375 |    "metadata": {},
 376 |    "source": [
 377 |     "#### 2. A function can be defined inside another function. "
 378 |    ]
 379 |   },
 380 |   {
 381 |    "cell_type": "code",
 382 |    "execution_count": 12,
 383 |    "metadata": {
 384 |     "collapsed": false
 385 |    },
 386 |    "outputs": [
 387 |     {
 388 |      "data": {
 389 |       "text/plain": [
 390 |        "'Hello boseop'"
 391 |       ]
 392 |      },
 393 |      "execution_count": 12,
 394 |      "metadata": {},
 395 |      "output_type": "execute_result"
 396 |     }
 397 |    ],
 398 |    "source": [
 399 |     "def greeting(name):\n",
 400 |     "    def greet_message():\n",
 401 |     "        return 'Hello'\n",
 402 |     "    return \"{} {}\".format(greet_message(), name)\n",
 403 |     "\n",
 404 |     "greeting(\"boseop\")"
 405 |    ]
 406 |   },
 407 |   {
 408 |    "cell_type": "markdown",
 409 |    "metadata": {},
 410 |    "source": [
 411 |     "#### 3. A function can be passed as an argument to another function."
 412 |    ]
 413 |   },
 414 |   {
 415 |    "cell_type": "code",
 416 |    "execution_count": 13,
 417 |    "metadata": {
 418 |     "collapsed": false
 419 |    },
 420 |    "outputs": [
 421 |     {
 422 |      "data": {
 423 |       "text/plain": [
 424 |        "'Hello boseop'"
 425 |       ]
 426 |      },
 427 |      "execution_count": 13,
 428 |      "metadata": {},
 429 |      "output_type": "execute_result"
 430 |     }
 431 |    ],
 432 |    "source": [
 433 |     "def change_name_greet(func):\n",
 434 |     "    name = \"boseop\"\n",
 435 |     "    return func(name)\n",
 436 |     "change_name_greet(greet)"
 437 |    ]
 438 |   },
 439 |   {
 440 |    "cell_type": "markdown",
 441 |    "metadata": {},
 442 |    "source": [
 443 |     "#### 4. A function can be the return value of another function. "
 444 |    ]
 445 |   },
 446 |   {
 447 |    "cell_type": "code",
 448 |    "execution_count": 14,
 449 |    "metadata": {
 450 |     "collapsed": false
 451 |    },
 452 |    "outputs": [
 453 |     {
 454 |      "data": {
 455 |       "text/plain": [
 456 |        "'HELLO BOSEOP'"
 457 |       ]
 458 |      },
 459 |      "execution_count": 14,
 460 |      "metadata": {},
 461 |      "output_type": "execute_result"
 462 |     }
 463 |    ],
 464 |    "source": [
 465 |     "def uppercase(func):\n",
 466 |     "    def wrapper(name):\n",
 467 |     "        result = func(name)\n",
 468 |     "        return result.upper()\n",
 469 |     "    return wrapper\n",
 470 |     "\n",
 471 |     "new_greet = uppercase(greet)\n",
 472 |     "new_greet(\"boseop\")"
 473 |    ]
 474 |   },
 475 |   {
 476 |    "cell_type": "markdown",
 477 |    "metadata": {},
 478 |    "source": [
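 479 |     "### Using decorators\n",
 480 |     "The example below works as follows.\n",
 481 |     "  \n",
 482 |     "* First, define the function that acts as the decorator; it takes the function to be decorated as an argument. This relies on the fact that a Python function can receive another function as an argument.  \n",
 483 |     "* Inside the decorator, define another function (a nested function) that performs the additional work (printing the time).  \n",
 484 |     "* Return the nested function."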
 485 |    ]
 486 |   },
 487 |   {
 488 |    "cell_type": "code",
 489 |    "execution_count": 15,
 490 |    "metadata": {
 491 |     "collapsed": false
 492 |    },
 493 |    "outputs": [],
 494 |    "source": [
 495 |     "def main_function():\n",
 496 |     "     print(\"MAIN FUNCTION START\")"
 497 |    ]
 498 |   },
 499 |   {
 500 |    "cell_type": "code",
 501 |    "execution_count": 16,
 502 |    "metadata": {
 503 |     "collapsed": true
 504 |    },
 505 |    "outputs": [],
 506 |    "source": [
 507 |     "import datetime\n",
 508 |     "def datetime_decorator(func):\n",
 509 |     "    def decorated():\n",
 510 |     "        print(datetime.datetime.now())\n",
 511 |     "        func()\n",
 512 |     "        print(datetime.datetime.now())\n",
 513 |     "    return decorated # return the nested function itself, not the result of calling it"
 514 |    ]
 515 |   },
 516 |   {
 517 |    "cell_type": "code",
 518 |    "execution_count": 17,
 519 |    "metadata": {
 520 |     "collapsed": false
 521 |    },
 522 |    "outputs": [
 523 |     {
 524 |      "name": "stdout",
 525 |      "output_type": "stream",
 526 |      "text": [
 527 |       "2017-03-16 23:22:48.072721\n",
 528 |       "Main Function 1 start\n",
 529 |       "2017-03-16 23:22:48.073721\n"
 530 |      ]
 531 |     }
 532 |    ],
 533 |    "source": [
 534 |     "@datetime_decorator\n",
 535 |     "def main_funcion_1():\n",
 536 |     "    print('Main Function 1 start')\n\nmain_funcion_1() # call the decorated function to print the timestamps"
 537 |    ]
 538 |   },
 539 |   {
 540 |    "cell_type": "markdown",
 541 |    "metadata": {},
 542 |    "source": [
 543 |     "## Object-oriented programming\n",
 544 |     "* Class basics (example-driven)\n",
 545 |     "* Class inheritance"
 546 |    ]
 547 |   },
 548 |   {
 549 |    "cell_type": "markdown",
 550 |    "metadata": {},
 551 |    "source": [
 552 |     "### Class basics\n",
 553 |     "#### example : person Class"
 554 |    ]
 555 |   },
 556 |   {
 557 |    "cell_type": "code",
 558 |    "execution_count": 18,
 559 |    "metadata": {
 560 |     "collapsed": false
 561 |    },
 562 |    "outputs": [],
 563 |    "source": [
 564 |     "class Person:\n",
 565 |     "\n",
 566 |     "    def __init__(self, name, age, money, *args, **kwargs): # initializes a newly created instance\n",
 567 |     "        self.name = name\n",
 568 |     "        self.age = age\n",
 569 |     "        self.money = money\n",
 570 |     "    \n",
 571 |     "    def introduce(self):\n",
 572 |     "        print('안녕하세요, 저는 {age}살 {name} 입니다.'.format(age = self.age, name = self.name))\n",
 573 |     "    \n",
 574 |     "    def give(self, partner, amount):\n",
 575 |     "        self.money -= amount\n",
 576 |     "        partner.money += amount\n",
 577 |     "        \n",
 578 |     "    def meet(self, another):\n",
 579 |     "        print('{myname}이 {partner_name} 을 만났습니다!'.format(\n",
 580 |     "            myname = self.name,\n",
 581 |     "            partner_name = another.name\n",
 582 |     "        ))"
 583 |    ]
 584 |   },
 585 |   {
 586 |    "cell_type": "code",
 587 |    "execution_count": 19,
 588 |    "metadata": {
 589 |     "collapsed": false
 590 |    },
 591 |    "outputs": [
 592 |     {
 593 |      "name": "stdout",
 594 |      "output_type": "stream",
 595 |      "text": [
 596 |       "김보섭 29\n"
 597 |      ]
 598 |     }
 599 |    ],
 600 |    "source": [
 601 |     "boseop = Person(name = '김보섭', age = '29', money = 1000)\n",
 602 |     "print(boseop.name, boseop.age)"
 603 |    ]
 604 |   },
 605 |   {
 606 |    "cell_type": "code",
 607 |    "execution_count": 20,
 608 |    "metadata": {
 609 |     "collapsed": false
 610 |    },
 611 |    "outputs": [
 612 |     {
 613 |      "name": "stdout",
 614 |      "output_type": "stream",
 615 |      "text": [
 616 |       "안녕하세요, 저는 29살 김보섭 입니다.\n"
 617 |      ]
 618 |     }
 619 |    ],
 620 |    "source": [
 621 |     "boseop.introduce()"
 622 |    ]
 623 |   },
 624 |   {
 625 |    "cell_type": "code",
 626 |    "execution_count": 21,
 627 |    "metadata": {
 628 |     "collapsed": true
 629 |    },
 630 |    "outputs": [],
 631 |    "source": [
 632 |     "data = [\n",
 633 |     "        {'name' : '사람1', 'age':30, 'money' : 1000},\n",
 634 |     "        {'name' : '사람2', 'age':30, 'money' : 1000},\n",
 635 |     "        {'name' : '사람3', 'age':30, 'money' : 1000}\n",
 636 |     "]"
 637 |    ]
 638 |   },
 639 |   {
 640 |    "cell_type": "code",
 641 |    "execution_count": 22,
 642 |    "metadata": {
 643 |     "collapsed": false
 644 |    },
 645 |    "outputs": [],
 646 |    "source": [
 647 |     "people = [ Person(name = person['name'], age = person['age'], money = person['money']) for person in data]"
 648 |    ]
 649 |   },
 650 |   {
 651 |    "cell_type": "code",
 652 |    "execution_count": 23,
 653 |    "metadata": {
 654 |     "collapsed": false
 655 |    },
 656 |    "outputs": [
 657 |     {
 658 |      "name": "stdout",
 659 |      "output_type": "stream",
 660 |      "text": [
 661 |       "안녕하세요, 저는 30살 사람1 입니다.\n"
 662 |      ]
 663 |     }
 664 |    ],
 665 |    "source": [
 666 |     "people[0].introduce()"
 667 |    ]
 668 |   },
 669 |   {
 670 |    "cell_type": "code",
 671 |    "execution_count": 24,
 672 |    "metadata": {
 673 |     "collapsed": false
 674 |    },
 675 |    "outputs": [
 676 |     {
 677 |      "name": "stdout",
 678 |      "output_type": "stream",
 679 |      "text": [
 680 |       "안녕하세요, 저는 30살 사람2 입니다.\n"
 681 |      ]
 682 |     }
 683 |    ],
 684 |    "source": [
 685 |     "people[1].introduce()"
 686 |    ]
 687 |   },
 688 |   {
 689 |    "cell_type": "code",
 690 |    "execution_count": 25,
 691 |    "metadata": {
 692 |     "collapsed": false
 693 |    },
 694 |    "outputs": [
 695 |     {
 696 |      "name": "stdout",
 697 |      "output_type": "stream",
 698 |      "text": [
 699 |       "안녕하세요, 저는 30살 사람3 입니다.\n"
 700 |      ]
 701 |     }
 702 |    ],
 703 |    "source": [
 704 |     "people[2].introduce()"
 705 |    ]
 706 |   },
 707 |   {
 708 |    "cell_type": "code",
 709 |    "execution_count": 26,
 710 |    "metadata": {
 711 |     "collapsed": false
 712 |    },
 713 |    "outputs": [
 714 |     {
 715 |      "name": "stdout",
 716 |      "output_type": "stream",
 717 |      "text": [
 718 |       "김보섭이 보섭김 을 만났습니다!\n"
 719 |      ]
 720 |     }
 721 |    ],
 722 |    "source": [
 723 |     "seopbo = Person(name = '보섭김', age = '29', money = 5000)\n",
 724 |     "boseop.meet(another = seopbo)"
 725 |    ]
 726 |   },
 727 |   {
 728 |    "cell_type": "code",
 729 |    "execution_count": 27,
 730 |    "metadata": {
 731 |     "collapsed": false
 732 |    },
 733 |    "outputs": [
 734 |     {
 735 |      "name": "stdout",
 736 |      "output_type": "stream",
 737 |      "text": [
 738 |       "3000\n",
 739 |       "3000\n"
 740 |      ]
 741 |     }
 742 |    ],
 743 |    "source": [
 744 |     "seopbo.give(partner = boseop, amount = 2000)\n",
 745 |     "print(seopbo.money)\n",
 746 |     "print(boseop.money)"
 747 |    ]
 748 |   },
 749 |   {
 750 |    "cell_type": "code",
 751 |    "execution_count": 28,
 752 |    "metadata": {
 753 |     "collapsed": false
 754 |    },
 755 |    "outputs": [
 756 |     {
 757 |      "name": "stdout",
 758 |      "output_type": "stream",
 759 |      "text": [
 760 |       "<__main__.Person object at 0x000002BA73404BA8>\n"
 761 |      ]
 762 |     }
 763 |    ],
 764 |    "source": [
 765 |     "# With the Person class above, print(boseop) only shows the default object representation\n",
 766 |     "print(boseop)"
 767 |    ]
 768 |   },
 769 |   {
 770 |    "cell_type": "code",
 771 |    "execution_count": 29,
 772 |    "metadata": {
 773 |     "collapsed": false
 774 |    },
 775 |    "outputs": [],
 776 |    "source": [
 777 |     "# Add the methods below to the class to change how it behaves\n",
 778 |     "# Special methods let us change how built-in operations work on our objects.\n",
 779 |     "class Person:\n",
 780 |     "\n",
 781 |     "    def __init__(self, name, age, money, *args, **kwargs): # initializes a newly created instance\n",
 782 |     "        self.name = name\n",
 783 |     "        self.age = age\n",
 784 |     "        self.money = money\n",
 785 |     "    \n",
 786 |     "    def __str__(self): # customizes what print() and str() show for this class\n",
 787 |     "        return self.name\n",
 788 |     "    \n",
 789 |     "    def __add__(self, partner): # defines the + operator for this class\n",
 790 |     "        print('{name} & {partner_name} | 결혼을 축하합니다.'.format(\n",
 791 |     "            name = self.name,\n",
 792 |     "            partner_name = partner.name\n",
 793 |     "        ))\n",
 794 |     "    \n",
 795 |     "    def introduce(self):\n",
 796 |     "        print('안녕하세요, 저는 {age}살 {name} 입니다.'.format(age = self.age, name = self.name))\n",
 797 |     "    \n",
 798 |     "    def give(self, partner, amount):\n",
 799 |     "        self.money -= amount\n",
 800 |     "        partner.money += amount\n",
 801 |     "        \n",
 802 |     "    def meet(self, another):\n",
 803 |     "        print('{myname}이 {partner_name} 을 만났습니다!'.format(\n",
 804 |     "            myname = self.name,\n",
 805 |     "            partner_name = another.name\n",
 806 |     "        ))"
 807 |    ]
 808 |   },
 809 |   {
 810 |    "cell_type": "code",
 811 |    "execution_count": 30,
 812 |    "metadata": {
 813 |     "collapsed": false
 814 |    },
 815 |    "outputs": [
 816 |     {
 817 |      "name": "stdout",
 818 |      "output_type": "stream",
 819 |      "text": [
 820 |       "김보섭\n"
 821 |      ]
 822 |     }
 823 |    ],
 824 |    "source": [
 825 |     "boseop = Person(name = '김보섭', age = '29', money = 1000)\n",
 826 |     "seopbo = Person(name = '보섭김', age = '29', money = 5000)\n",
 827 |     "print(boseop)"
 828 |    ]
 829 |   },
 830 |   {
 831 |    "cell_type": "code",
 832 |    "execution_count": 31,
 833 |    "metadata": {
 834 |     "collapsed": false
 835 |    },
 836 |    "outputs": [
 837 |     {
 838 |      "name": "stdout",
 839 |      "output_type": "stream",
 840 |      "text": [
 841 |       "김보섭 & 보섭김 | 결혼을 축하합니다.\n"
 842 |      ]
 843 |     }
 844 |    ],
 845 |    "source": [
 846 |     "boseop + seopbo"
 847 |    ]
 848 |   },
 849 |   {
 850 |    "cell_type": "markdown",
 851 |    "metadata": {},
 852 |    "source": [
 853 |     "#### example : Triangle Class\n",
 854 |     "* State : height, width\n",
 855 |     "* Behavior : area (compute the area), is_bigger (compare with another triangle)"
 856 |    ]
 857 |   },
 858 |   {
 859 |    "cell_type": "code",
 860 |    "execution_count": 32,
 861 |    "metadata": {
 862 |     "collapsed": true
 863 |    },
 864 |    "outputs": [],
 865 |    "source": [
 866 |     "class Triangle:\n",
 867 |     "    def __init__(self, height, width):\n",
 868 |     "        self.height = height\n",
 869 |     "        self.width = width\n",
 870 |     "    \n",
 871 |     "    def __str__(self):\n",
 872 |     "        return '({width}, {height}) Triangle'.format(\n",
 873 |     "            width = self.width,\n",
 874 |     "            height = self.height\n",
 875 |     "        )\n",
 876 |     "    \n",
 877 |     "    def area(self):\n",
 878 |     "        # Return the value directly; assigning to self.area would overwrite this method\n",
 879 |     "        return self.height * self.width / 2\n",
 880 |     "    \n",
 881 |     "    def is_bigger(self, another):\n",
 882 |     "        return '큽니다' if self.area() > another.area() else ('작습니다' if self.area() < another.area() else '똑같습니다.')        "
 883 |    ]
 884 |   },
 885 |   {
 886 |    "cell_type": "code",
 887 |    "execution_count": 33,
 888 |    "metadata": {
 889 |     "collapsed": false
 890 |    },
 891 |    "outputs": [
 892 |     {
 893 |      "name": "stdout",
 894 |      "output_type": "stream",
 895 |      "text": [
 896 |       "(20, 10) Triangle\n"
 897 |      ]
 898 |     }
 899 |    ],
 900 |    "source": [
 901 |     "t1 = Triangle(10, 20)\n",
 902 |     "print(t1)"
 903 |    ]
 904 |   },
 905 |   {
 906 |    "cell_type": "code",
 907 |    "execution_count": 34,
 908 |    "metadata": {
 909 |     "collapsed": false
 910 |    },
 911 |    "outputs": [
 912 |     {
 913 |      "name": "stdout",
 914 |      "output_type": "stream",
 915 |      "text": [
 916 |       "(5, 10) Triangle\n"
 917 |      ]
 918 |     }
 919 |    ],
 920 |    "source": [
 921 |     "t2 = Triangle(10, 5)\n",
 922 |     "print(t2)"
 923 |    ]
 924 |   },
 925 |   {
 926 |    "cell_type": "code",
 927 |    "execution_count": 35,
 928 |    "metadata": {
 929 |     "collapsed": false
 930 |    },
 931 |    "outputs": [
 932 |     {
 933 |      "data": {
 934 |       "text/plain": [
 935 |        "'큽니다'"
 936 |       ]
 937 |      },
 938 |      "execution_count": 35,
 939 |      "metadata": {},
 940 |      "output_type": "execute_result"
 941 |     }
 942 |    ],
 943 |    "source": [
 944 |     "t1.is_bigger(t2)"
 945 |    ]
 946 |   },
 947 |   {
 948 |    "cell_type": "markdown",
 949 |    "metadata": {},
 950 |    "source": [
 951 |     "### Class inheritance (Inheritance)\n",
 952 |     "Every class ultimately inherits from the object class.  \n",
 953 |     "class object: methods such as \__init\__, \__str\__ and \__repr\__ are already defined there (operator methods such as \__add\__ are not; a class defines them itself).  \n",
 954 |     "To change that behavior in a new class, redefine the method in the class; redefining an inherited method is called method overriding.\n",
 955 |     "#### simple example\n",
 956 |     "Animal : behaviors (eat, attack)  \n",
 957 |     "Dog : (+ behavior : bark)  \n",
 958 |     "Bird : (+ behavior : fly)"
 959 |    ]
 960 |   },
 961 |   {
 962 |    "cell_type": "code",
 963 |    "execution_count": 36,
 964 |    "metadata": {
 965 |     "collapsed": true
 966 |    },
 967 |    "outputs": [],
 968 |    "source": [
 969 |     "class Animal: # equivalent to class Animal(object): every class implicitly inherits from object\n",
 970 |     "              # object already defines methods such as __init__, __str__ and __repr__\n",
 971 |     "    \n",
 972 |     "    def eat(self):\n",
 973 |     "        print('먹는다!!!')\n",
 974 |     "        \n",
 975 |     "    def attack(self):\n",
 976 |     "        print('공격!!!')"
 977 |    ]
 978 |   },
 979 |   {
 980 |    "cell_type": "code",
 981 |    "execution_count": 37,
 982 |    "metadata": {
 983 |     "collapsed": true
 984 |    },
 985 |    "outputs": [],
 986 |    "source": [
 987 |     "class Dog(Animal):\n",
 988 |     "    def bark(self):\n",
 989 |     "        print('월월!!')"
 990 |    ]
 991 |   },
 992 |   {
 993 |    "cell_type": "code",
 994 |    "execution_count": 38,
 995 |    "metadata": {
 996 |     "collapsed": false
 997 |    },
 998 |    "outputs": [
 999 |     {
1000 |      "name": "stdout",
1001 |      "output_type": "stream",
1002 |      "text": [
1003 |       "먹는다!!!\n"
1004 |      ]
1005 |     }
1006 |    ],
1007 |    "source": [
1008 |     "dog = Dog()\n",
1009 |     "dog.eat()"
1010 |    ]
1011 |   },
1012 |   {
1013 |    "cell_type": "code",
1014 |    "execution_count": 39,
1015 |    "metadata": {
1016 |     "collapsed": false
1017 |    },
1018 |    "outputs": [
1019 |     {
1020 |      "name": "stdout",
1021 |      "output_type": "stream",
1022 |      "text": [
1023 |       "공격!!!\n"
1024 |      ]
1025 |     }
1026 |    ],
1027 |    "source": [
1028 |     "dog.attack()"
1029 |    ]
1030 |   },
1031 |   {
1032 |    "cell_type": "code",
1033 |    "execution_count": 40,
1034 |    "metadata": {
1035 |     "collapsed": false
1036 |    },
1037 |    "outputs": [
1038 |     {
1039 |      "name": "stdout",
1040 |      "output_type": "stream",
1041 |      "text": [
1042 |       "월월!!\n"
1043 |      ]
1044 |     }
1045 |    ],
1046 |    "source": [
1047 |     "dog.bark()"
1048 |    ]
1049 |   },
1050 |   {
1051 |    "cell_type": "code",
1052 |    "execution_count": 41,
1053 |    "metadata": {
1054 |     "collapsed": true
1055 |    },
1056 |    "outputs": [],
1057 |    "source": [
1058 |     "class Bird(Animal):\n",
1059 |     "    def fly(self):\n",
1060 |     "        print('날다')"
1061 |    ]
1062 |   },
1063 |   {
1064 |    "cell_type": "code",
1065 |    "execution_count": 42,
1066 |    "metadata": {
1067 |     "collapsed": false
1068 |    },
1069 |    "outputs": [
1070 |     {
1071 |      "name": "stdout",
1072 |      "output_type": "stream",
1073 |      "text": [
1074 |       "날다\n"
1075 |      ]
1076 |     }
1077 |    ],
1078 |    "source": [
1079 |     "bird = Bird()\n",
1080 |     "bird.fly()"
1081 |    ]
1082 |   },
1083 |   {
1084 |    "cell_type": "code",
1085 |    "execution_count": 43,
1086 |    "metadata": {
1087 |     "collapsed": false
1088 |    },
1089 |    "outputs": [
1090 |     {
1091 |      "ename": "AttributeError",
1092 |      "evalue": "'Dog' object has no attribute 'fly'",
1093 |      "output_type": "error",
1094 |      "traceback": [
1095 |       "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
1096 |       "\u001b[0;31mAttributeError\u001b[0m                            Traceback (most recent call last)",
1097 |       "\u001b[0;32m<ipython-input-43-02aa21288d79>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mdog\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfly\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
1098 |       "\u001b[0;31mAttributeError\u001b[0m: 'Dog' object has no attribute 'fly'"
1099 |      ]
1100 |     }
1101 |    ],
1102 |    "source": [
1103 |     "dog.fly()"
1104 |    ]
1105 |   },
1106 |   {
1107 |    "cell_type": "markdown",
1108 |    "metadata": {},
1109 |    "source": [
1110 |     "#### Method Overriding"
1111 |    ]
1112 |   },
1113 |   {
1114 |    "cell_type": "code",
1115 |    "execution_count": 44,
1116 |    "metadata": {
1117 |     "collapsed": true
1118 |    },
1119 |    "outputs": [],
1120 |    "source": [
1121 |     "class Dog(Animal):\n",
1122 |     "    \n",
1123 |     "    def eat(self):\n",
1124 |     "        print('침을 흘리면서 먹는다!!') # method overriding\n",
1125 |     "    \n",
1126 |     "    def bark(self):\n",
1127 |     "        print('월월!!')"
1128 |    ]
1129 |   },
1130 |   {
1131 |    "cell_type": "code",
1132 |    "execution_count": 45,
1133 |    "metadata": {
1134 |     "collapsed": false
1135 |    },
1136 |    "outputs": [
1137 |     {
1138 |      "name": "stdout",
1139 |      "output_type": "stream",
1140 |      "text": [
1141 |       "침을 흘리면서 먹는다!!\n"
1142 |      ]
1143 |     }
1144 |    ],
1145 |    "source": [
1146 |     "dog = Dog()\n",
1147 |     "dog.eat()"
1148 |    ]
1149 |   },
1150 |   {
1151 |    "cell_type": "markdown",
1152 |    "metadata": {},
1153 |    "source": [
1154 |     "#### Illustration : pandas"
1155 |    ]
1156 |   },
1157 |   {
1158 |    "cell_type": "code",
1159 |    "execution_count": 46,
1160 |    "metadata": {
1161 |     "collapsed": true
1162 |    },
1163 |    "outputs": [],
1164 |    "source": [
1165 |     "import os, sys\n",
1166 |     "import pandas as pd"
1167 |    ]
1168 |   },
1169 |   {
1170 |    "cell_type": "code",
1171 |    "execution_count": 47,
1172 |    "metadata": {
1173 |     "collapsed": true
1174 |    },
1175 |    "outputs": [],
1176 |    "source": [
1177 |     "df = pd.DataFrame() # df is an object, i.e. an instance of the DataFrame class"
1178 |    ]
1179 |   },
1180 |   {
1181 |    "cell_type": "code",
1182 |    "execution_count": 48,
1183 |    "metadata": {
1184 |     "collapsed": false
1185 |    },
1186 |    "outputs": [],
1187 |    "source": [
1188 |     "df = pd.read_csv('./animals.csv')"
1189 |    ]
1190 |   },
1191 |   {
1192 |    "cell_type": "code",
1193 |    "execution_count": 49,
1194 |    "metadata": {
1195 |     "collapsed": false
1196 |    },
1197 |    "outputs": [
1198 |     {
1199 |      "data": {
1200 |       "text/html": [
1201 |        "<div>\n",
1202 |        "<table border=\"1\" class=\"dataframe\">\n",
1203 |        "  <thead>\n",
1204 |        "    <tr style=\"text-align: right;\">\n",
1205 |        "      <th></th>\n",
1206 |        "      <th>강아지</th>\n",
1207 |        "      <th>dog</th>\n",
1208 |        "    </tr>\n",
1209 |        "  </thead>\n",
1210 |        "  <tbody>\n",
1211 |        "    <tr>\n",
1212 |        "      <th>0</th>\n",
1213 |        "      <td>고양이</td>\n",
1214 |        "      <td>cat</td>\n",
1215 |        "    </tr>\n",
1216 |        "    <tr>\n",
1217 |        "      <th>1</th>\n",
1218 |        "      <td>물고기</td>\n",
1219 |        "      <td>fish</td>\n",
1220 |        "    </tr>\n",
1221 |        "    <tr>\n",
1222 |        "      <th>2</th>\n",
1223 |        "      <td>원숭이</td>\n",
1224 |        "      <td>monkey</td>\n",
1225 |        "    </tr>\n",
1226 |        "  </tbody>\n",
1227 |        "</table>\n",
1228 |        "</div>"
1229 |       ],
1230 |       "text/plain": [
1231 |        "   강아지     dog\n",
1232 |        "0  고양이     cat\n",
1233 |        "1  물고기    fish\n",
1234 |        "2  원숭이  monkey"
1235 |       ]
1236 |      },
1237 |      "execution_count": 49,
1238 |      "metadata": {},
1239 |      "output_type": "execute_result"
1240 |     }
1241 |    ],
1242 |    "source": [
1243 |     "df # DataFrame overrides __repr__ (plain text) and provides _repr_html_ for the notebook's table display"
1244 |    ]
1245 |   },
1246 |   {
1247 |    "cell_type": "code",
1248 |    "execution_count": 50,
1249 |    "metadata": {
1250 |     "collapsed": false
1251 |    },
1252 |    "outputs": [
1253 |     {
1254 |      "name": "stdout",
1255 |      "output_type": "stream",
1256 |      "text": [
1257 |       "   강아지     dog\n",
1258 |       "0  고양이     cat\n",
1259 |       "1  물고기    fish\n",
1260 |       "2  원숭이  monkey\n"
1261 |      ]
1262 |     }
1263 |    ],
1264 |    "source": [
1265 |     "print(df) # print() calls str(df); DataFrame supplies its own table-style text instead of the default from object"
1266 |    ]
1267 |   }
1268 |  ],
1269 |  "metadata": {
1270 |   "kernelspec": {
1271 |    "display_name": "Python 3",
1272 |    "language": "python",
1273 |    "name": "python3"
1274 |   },
1275 |   "language_info": {
1276 |    "codemirror_mode": {
1277 |     "name": "ipython",
1278 |     "version": 3
1279 |    },
1280 |    "file_extension": ".py",
1281 |    "mimetype": "text/x-python",
1282 |    "name": "python",
1283 |    "nbconvert_exporter": "python",
1284 |    "pygments_lexer": "ipython3",
1285 |    "version": "3.6.0"
1286 |   }
1287 |  },
1288 |  "nbformat": 4,
1289 |  "nbformat_minor": 2
1290 | }
1291 | 
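
The Triangle example in the notebook above can be sketched in a form that stays safe under repeated calls. This is a minimal sketch, not the course's own code: it assumes `area` should remain a plain method that returns its value (storing the result on `self.area` would replace the method with a number after the first call), and the English return labels `'bigger'`, `'smaller'` and `'same'` are placeholders chosen here for illustration.

```python
class Triangle:
    """Triangle described by its height and width (base)."""

    def __init__(self, height, width):
        self.height = height
        self.width = width

    def __str__(self):
        # Called by print() and str(); overrides the default from object.
        return '({width}, {height}) Triangle'.format(
            width=self.width,
            height=self.height
        )

    def area(self):
        # Return the value rather than assigning it to self.area,
        # so the method is still callable afterwards.
        return self.height * self.width / 2

    def is_bigger(self, another):
        # Compute each area once, then compare.
        mine, theirs = self.area(), another.area()
        if mine > theirs:
            return 'bigger'   # placeholder label (assumption)
        if mine < theirs:
            return 'smaller'  # placeholder label (assumption)
        return 'same'         # placeholder label (assumption)


t1 = Triangle(10, 20)
t2 = Triangle(10, 5)
print(t1)                # (20, 10) Triangle
print(t1.is_bigger(t2))  # bigger
print(t2.is_bigger(t1))  # smaller
```

Because `area` never rebinds an attribute with its own name, `is_bigger` can be called any number of times on the same objects.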


--------------------------------------------------------------------------------
/Scrapping static webpage.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Crawling a static website\n",
  8 |     "_This material is based on the lecture notes from instructor 안수찬's Work Automation with Python Camp (fast campus)._  \n",
  9 |     "Author : 김보섭  \n",
 10 |     "\n",
 11 |     "## Preliminary\n",
 12 |     "* Website (web client) : HTML, CSS, Javascript\n",
 13 |     "* HTML : the structure of the site\n",
 14 |     "* CSS : the style of the site\n",
 15 |     "* Javascript : the dynamic features of the site (animation, loading data from the server)\n",
 16 |     "* Who delivers the data to us\n",
 17 |     "    * Web server => HTML (Server rendering)\n",
 18 |     "    * Web client (Javascript) => HTML (Client Rendering)\n",
 19 |     "\n",
 20 |     "## Summary\n",
 21 |     "* Static website\n",
 22 |     "    * Download the HTML file (Crawling, Scraping)\n",
 23 |     "    * Locate the data we want and extract it (Parsing)"
 24 |    ]
 25 |   },
 26 |   {
 27 |    "cell_type": "markdown",
 28 |    "metadata": {},
 29 |    "source": [
 30 |     "### Naver real-time search keyword data\n",
 31 |     "The ol tag with the id 'realrank' contains the li tags (one per keyword); extract the text inside each li tag.  \n",
 32 |     "\n",
 33 |     "* CSS Selector\n",
 34 |     "    * ol#realrank li\n",
 35 |     "    * id = '#', class ='.'"
 36 |    ]
 37 |   },
 38 |   {
 39 |    "cell_type": "markdown",
 40 |    "metadata": {},
 41 |    "source": [
 42 |     "#### Request with GET"
 43 |    ]
 44 |   },
 45 |   {
 46 |    "cell_type": "code",
 47 |    "execution_count": 1,
 48 |    "metadata": {
 49 |     "collapsed": false
 50 |    },
 51 |    "outputs": [
 52 |     {
 53 |      "data": {
 54 |       "text/plain": [
 55 |        "str"
 56 |       ]
 57 |      },
 58 |      "execution_count": 1,
 59 |      "metadata": {},
 60 |      "output_type": "execute_result"
 61 |     }
 62 |    ],
 63 |    "source": [
 64 |     "# 1. html\n",
 65 |     "import requests\n",
 66 |     "response = requests.get('http://www.naver.com/')\n",
 67 |     "type(response.text)\n",
 68 |     "# response.text (string)"
 69 |    ]
 70 |   },
 71 |   {
 72 |    "cell_type": "markdown",
 73 |    "metadata": {},
 74 |    "source": [
 75 |     "#### Parsing with BeautifulSoup"
 76 |    ]
 77 |   },
 78 |   {
 79 |    "cell_type": "code",
 80 |    "execution_count": 2,
 81 |    "metadata": {
 82 |     "collapsed": false
 83 |    },
 84 |    "outputs": [
 85 |     {
 86 |      "data": {
 87 |       "text/plain": [
 88 |        "bs4.BeautifulSoup"
 89 |       ]
 90 |      },
 91 |      "execution_count": 2,
 92 |      "metadata": {},
 93 |      "output_type": "execute_result"
 94 |     }
 95 |    ],
 96 |    "source": [
 97 |     "# 2. parsing\n",
 98 |     "from bs4 import BeautifulSoup\n",
 99 |     "bs = BeautifulSoup(response.text, 'html.parser')\n",
100 |     "type(bs)\n",
101 |     "# bs: an object that holds the HTML in a structured form so it is easy to parse"
102 |    ]
103 |   },
104 |   {
105 |    "cell_type": "markdown",
106 |    "metadata": {},
107 |    "source": [
108 |     "#### Real-time search keywords"
109 |    ]
110 |   },
111 |   {
112 |    "cell_type": "code",
113 |    "execution_count": 3,
114 |    "metadata": {
115 |     "collapsed": false
116 |    },
117 |    "outputs": [],
118 |    "source": [
119 |     "elements = bs.select('ol#realrank li')"
120 |    ]
121 |   },
122 |   {
123 |    "cell_type": "code",
124 |    "execution_count": 4,
125 |    "metadata": {
126 |     "collapsed": false
127 |    },
128 |    "outputs": [],
129 |    "source": [
130 |     "element = elements[0]"
131 |    ]
132 |   },
133 |   {
134 |    "cell_type": "code",
135 |    "execution_count": 5,
136 |    "metadata": {
137 |     "collapsed": false
138 |    },
139 |    "outputs": [
140 |     {
141 |      "data": {
142 |       "text/plain": [
143 |        "bs4.element.Tag"
144 |       ]
145 |      },
146 |      "execution_count": 5,
147 |      "metadata": {},
148 |      "output_type": "execute_result"
149 |     }
150 |    ],
151 |    "source": [
152 |     "type(element)"
153 |    ]
154 |   },
155 |   {
156 |    "cell_type": "code",
157 |    "execution_count": 6,
158 |    "metadata": {
159 |     "collapsed": false
160 |    },
161 |    "outputs": [
162 |     {
163 |      "data": {
164 |       "text/plain": [
165 |        "<li class=\"up\" value=\"1\"><a href=\"http://search.naver.com/search.naver?where=nexearch&amp;query=%EA%B3%B5%EA%B0%81+%EA%B8%B0%EB%8F%99%EB%8C%80&amp;sm=top_lve&amp;ie=utf8\" title=\"\"><span class=\"ell\">공각 기동대</span><span class=\"tx\">상승</span><span class=\"ic\"></span><span class=\"rk\">93</span></a></li>"
166 |       ]
167 |      },
168 |      "execution_count": 6,
169 |      "metadata": {},
170 |      "output_type": "execute_result"
171 |     }
172 |    ],
173 |    "source": [
174 |     "element"
175 |    ]
176 |   },
177 |   {
178 |    "cell_type": "code",
179 |    "execution_count": 7,
180 |    "metadata": {
181 |     "collapsed": false
182 |    },
183 |    "outputs": [
184 |     {
185 |      "data": {
186 |       "text/plain": [
187 |        "'공각 기동대'"
188 |       ]
189 |      },
190 |      "execution_count": 7,
191 |      "metadata": {},
192 |      "output_type": "execute_result"
193 |     }
194 |    ],
195 |    "source": [
196 |     "element.select_one('span.ell').text"
197 |    ]
198 |   },
199 |   {
200 |    "cell_type": "code",
201 |    "execution_count": 8,
202 |    "metadata": {
203 |     "collapsed": false
204 |    },
205 |    "outputs": [
206 |     {
207 |      "name": "stdout",
208 |      "output_type": "stream",
209 |      "text": [
210 |       "<span class=\"ell\">공각 기동대</span> [<span class=\"ell\">공각 기동대</span>]\n"
211 |      ]
212 |     }
213 |    ],
214 |    "source": [
215 |     "# Note: select_one returns a single bs4.element.Tag (the first match),\n",
216 |     "# while select returns every element matching the CSS selector, so its result is a list of bs4.element.Tag objects\n",
217 |     "print(element.select_one('span.ell'), element.select('span.ell'))"
218 |    ]
219 |   },
220 |   {
221 |    "cell_type": "markdown",
222 |    "metadata": {},
223 |    "source": [
224 |     "#### Real-time search keyword links"
225 |    ]
226 |   },
227 |   {
228 |    "cell_type": "code",
229 |    "execution_count": 9,
230 |    "metadata": {
231 |     "collapsed": false
232 |    },
233 |    "outputs": [
234 |     {
235 |      "data": {
236 |       "text/plain": [
237 |        "<li class=\"up\" value=\"1\"><a href=\"http://search.naver.com/search.naver?where=nexearch&amp;query=%EA%B3%B5%EA%B0%81+%EA%B8%B0%EB%8F%99%EB%8C%80&amp;sm=top_lve&amp;ie=utf8\" title=\"\"><span class=\"ell\">공각 기동대</span><span class=\"tx\">상승</span><span class=\"ic\"></span><span class=\"rk\">93</span></a></li>"
238 |       ]
239 |      },
240 |      "execution_count": 9,
241 |      "metadata": {},
242 |      "output_type": "execute_result"
243 |     }
244 |    ],
245 |    "source": [
246 |     "element"
247 |    ]
248 |   },
249 |   {
250 |    "cell_type": "code",
251 |    "execution_count": 10,
252 |    "metadata": {
253 |     "collapsed": false
254 |    },
255 |    "outputs": [
256 |     {
257 |      "data": {
258 |       "text/plain": [
259 |        "<a href=\"http://search.naver.com/search.naver?where=nexearch&amp;query=%EA%B3%B5%EA%B0%81+%EA%B8%B0%EB%8F%99%EB%8C%80&amp;sm=top_lve&amp;ie=utf8\" title=\"\"><span class=\"ell\">공각 기동대</span><span class=\"tx\">상승</span><span class=\"ic\"></span><span class=\"rk\">93</span></a>"
260 |       ]
261 |      },
262 |      "execution_count": 10,
263 |      "metadata": {},
264 |      "output_type": "execute_result"
265 |     }
266 |    ],
267 |    "source": [
268 |     "element.select_one('a')"
269 |    ]
270 |   },
271 |   {
272 |    "cell_type": "code",
273 |    "execution_count": 11,
274 |    "metadata": {
275 |     "collapsed": false
276 |    },
277 |    "outputs": [
278 |     {
279 |      "name": "stdout",
280 |      "output_type": "stream",
281 |      "text": [
282 |       "{'href': 'http://search.naver.com/search.naver?where=nexearch&query=%EA%B3%B5%EA%B0%81+%EA%B8%B0%EB%8F%99%EB%8C%80&sm=top_lve&ie=utf8', 'title': ''}\n",
283 |       "<class 'dict'>\n"
284 |      ]
285 |     }
286 |    ],
287 |    "source": [
288 |     "print(element.select_one('a').attrs)\n",
289 |     "print(type(element.select_one('a').attrs)) # attrs is a dictionary"
290 |    ]
291 |   },
292 |   {
293 |    "cell_type": "code",
294 |    "execution_count": 12,
295 |    "metadata": {
296 |     "collapsed": false
297 |    },
298 |    "outputs": [
299 |     {
300 |      "data": {
301 |       "text/plain": [
302 |        "'http://search.naver.com/search.naver?where=nexearch&query=%EA%B3%B5%EA%B0%81+%EA%B8%B0%EB%8F%99%EB%8C%80&sm=top_lve&ie=utf8'"
303 |       ]
304 |      },
305 |      "execution_count": 12,
306 |      "metadata": {},
307 |      "output_type": "execute_result"
308 |     }
309 |    ],
310 |    "source": [
311 |     "element.select_one('a').attrs.get('href')"
312 |    ]
313 |   },
314 |   {
315 |    "cell_type": "code",
316 |    "execution_count": 13,
317 |    "metadata": {
318 |     "collapsed": false
319 |    },
320 |    "outputs": [
321 |     {
322 |      "name": "stdout",
323 |      "output_type": "stream",
324 |      "text": [
325 |       "<a href=\"http://search.naver.com/search.naver?where=nexearch&amp;query=%EA%B3%B5%EA%B0%81+%EA%B8%B0%EB%8F%99%EB%8C%80&amp;sm=top_lve&amp;ie=utf8\" title=\"\"><span class=\"ell\">공각 기동대</span><span class=\"tx\">상승</span><span class=\"ic\"></span><span class=\"rk\">93</span></a>\n",
326 |       "[<a href=\"http://search.naver.com/search.naver?where=nexearch&amp;query=%EA%B3%B5%EA%B0%81+%EA%B8%B0%EB%8F%99%EB%8C%80&amp;sm=top_lve&amp;ie=utf8\" title=\"\"><span class=\"ell\">공각 기동대</span><span class=\"tx\">상승</span><span class=\"ic\"></span><span class=\"rk\">93</span></a>]\n"
327 |      ]
328 |     }
329 |    ],
330 |    "source": [
331 |     "# Note: select_one returns a single bs4.element.Tag (the first match),\n",
332 |     "# while select returns every element matching the CSS selector, so its result is a list of bs4.element.Tag objects\n",
333 |     "print(element.select_one('a'))\n",
334 |     "print(element.select('a'))"
335 |    ]
336 |   },
337 |   {
338 |    "cell_type": "markdown",
339 |    "metadata": {},
340 |    "source": [
341 |     "#### Parsing each real-time search keyword and its link"
342 |    ]
343 |   },
344 |   {
345 |    "cell_type": "code",
346 |    "execution_count": 14,
347 |    "metadata": {
348 |     "collapsed": false
349 |    },
350 |    "outputs": [],
351 |    "source": [
352 |     "# As a list of dictionaries\n",
353 |     "data = [{'keyword' : tmp.select_one('span.ell').text,\n",
354 |     "  'address' : tmp.select_one('a').attrs.get('href')} for tmp in bs.select('ol#realrank li')]"
355 |    ]
356 |   },
357 |   {
358 |    "cell_type": "code",
359 |    "execution_count": 15,
360 |    "metadata": {
361 |     "collapsed": false
362 |    },
363 |    "outputs": [
364 |     {
365 |      "name": "stdout",
366 |      "output_type": "stream",
367 |      "text": [
368 |       "[{'keyword': '공각 기동대', 'address': 'http://search.naver.com/search.naver?where=nexearch&query=%EA%B3%B5%EA%B0%81+%EA%B8%B0%EB%8F%99%EB%8C%80&sm=top_lve&ie=utf8'}, {'keyword': '엄석대', 'address': 'http://search.naver.com/search.naver?where=nexearch&query=%EC%97%84%EC%84%9D%EB%8C%80&sm=top_lve&ie=utf8'}, {'keyword': '스칼렛요한슨', 'address': 'http://search.naver.com/search.naver?where=nexearch&query=%EC%8A%A4%EC%B9%BC%EB%A0%9B%EC%9A%94%ED%95%9C%EC%8A%A8&sm=top_lve&ie=utf8'}, {'keyword': '설민석', 'address': 'http://search.naver.com/search.naver?where=nexearch&query=%EC%84%A4%EB%AF%BC%EC%84%9D&sm=top_lve&ie=utf8'}]\n"
369 |      ]
370 |     }
371 |    ],
372 |    "source": [
373 |     "print(data[0:4])"
374 |    ]
375 |   },
376 |   {
377 |    "cell_type": "code",
378 |    "execution_count": 16,
379 |    "metadata": {
380 |     "collapsed": false
381 |    },
382 |    "outputs": [
383 |     {
384 |      "data": {
385 |       "text/html": [
386 |        "<div>\n",
387 |        "<table border=\"1\" class=\"dataframe\">\n",
388 |        "  <thead>\n",
389 |        "    <tr style=\"text-align: right;\">\n",
390 |        "      <th></th>\n",
391 |        "      <th>address</th>\n",
392 |        "      <th>keyword</th>\n",
393 |        "    </tr>\n",
394 |        "  </thead>\n",
395 |        "  <tbody>\n",
396 |        "    <tr>\n",
397 |        "      <th>0</th>\n",
398 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
399 |        "      <td>공각 기동대</td>\n",
400 |        "    </tr>\n",
401 |        "    <tr>\n",
402 |        "      <th>1</th>\n",
403 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
404 |        "      <td>엄석대</td>\n",
405 |        "    </tr>\n",
406 |        "    <tr>\n",
407 |        "      <th>2</th>\n",
408 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
409 |        "      <td>스칼렛요한슨</td>\n",
410 |        "    </tr>\n",
411 |        "    <tr>\n",
412 |        "      <th>3</th>\n",
413 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
414 |        "      <td>설민석</td>\n",
415 |        "    </tr>\n",
416 |        "    <tr>\n",
417 |        "      <th>4</th>\n",
418 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
419 |        "      <td>고등래퍼</td>\n",
420 |        "    </tr>\n",
421 |        "    <tr>\n",
422 |        "      <th>5</th>\n",
423 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
424 |        "      <td>인터파크티켓</td>\n",
425 |        "    </tr>\n",
426 |        "    <tr>\n",
427 |        "      <th>6</th>\n",
428 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
429 |        "      <td>영화순위</td>\n",
430 |        "    </tr>\n",
431 |        "    <tr>\n",
432 |        "      <th>7</th>\n",
433 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
434 |        "      <td>bgf리테일 채용</td>\n",
435 |        "    </tr>\n",
436 |        "    <tr>\n",
437 |        "      <th>8</th>\n",
438 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
439 |        "      <td>나혼자산다</td>\n",
440 |        "    </tr>\n",
441 |        "    <tr>\n",
442 |        "      <th>9</th>\n",
443 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
444 |        "      <td>양홍원</td>\n",
445 |        "    </tr>\n",
446 |        "    <tr>\n",
447 |        "      <th>10</th>\n",
448 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
449 |        "      <td>김어준의 파파이스</td>\n",
450 |        "    </tr>\n",
451 |        "    <tr>\n",
452 |        "      <th>11</th>\n",
453 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
454 |        "      <td>언니들의 슬램덩크 시즌2</td>\n",
455 |        "    </tr>\n",
456 |        "    <tr>\n",
457 |        "      <th>12</th>\n",
458 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
459 |        "      <td>미운우리새끼</td>\n",
460 |        "    </tr>\n",
461 |        "    <tr>\n",
462 |        "      <th>13</th>\n",
463 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
464 |        "      <td>트위치tv</td>\n",
465 |        "    </tr>\n",
466 |        "    <tr>\n",
467 |        "      <th>14</th>\n",
468 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
469 |        "      <td>미녀와 야수</td>\n",
470 |        "    </tr>\n",
471 |        "    <tr>\n",
472 |        "      <th>15</th>\n",
473 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
474 |        "      <td>엠넷</td>\n",
475 |        "    </tr>\n",
476 |        "    <tr>\n",
477 |        "      <th>16</th>\n",
478 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
479 |        "      <td>도봉순</td>\n",
480 |        "    </tr>\n",
481 |        "    <tr>\n",
482 |        "      <th>17</th>\n",
483 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
484 |        "      <td>영화</td>\n",
485 |        "    </tr>\n",
486 |        "    <tr>\n",
487 |        "      <th>18</th>\n",
488 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
489 |        "      <td>문재인</td>\n",
490 |        "    </tr>\n",
491 |        "    <tr>\n",
492 |        "      <th>19</th>\n",
493 |        "      <td>http://search.naver.com/search.naver?where=nex...</td>\n",
494 |        "      <td>미세먼지</td>\n",
495 |        "    </tr>\n",
496 |        "  </tbody>\n",
497 |        "</table>\n",
498 |        "</div>"
499 |       ],
500 |       "text/plain": [
501 |        "                                              address        keyword\n",
502 |        "0   http://search.naver.com/search.naver?where=nex...         공각 기동대\n",
503 |        "1   http://search.naver.com/search.naver?where=nex...            엄석대\n",
504 |        "2   http://search.naver.com/search.naver?where=nex...         스칼렛요한슨\n",
505 |        "3   http://search.naver.com/search.naver?where=nex...            설민석\n",
506 |        "4   http://search.naver.com/search.naver?where=nex...           고등래퍼\n",
507 |        "5   http://search.naver.com/search.naver?where=nex...         인터파크티켓\n",
508 |        "6   http://search.naver.com/search.naver?where=nex...           영화순위\n",
509 |        "7   http://search.naver.com/search.naver?where=nex...      bgf리테일 채용\n",
510 |        "8   http://search.naver.com/search.naver?where=nex...          나혼자산다\n",
511 |        "9   http://search.naver.com/search.naver?where=nex...            양홍원\n",
512 |        "10  http://search.naver.com/search.naver?where=nex...      김어준의 파파이스\n",
513 |        "11  http://search.naver.com/search.naver?where=nex...  언니들의 슬램덩크 시즌2\n",
514 |        "12  http://search.naver.com/search.naver?where=nex...         미운우리새끼\n",
515 |        "13  http://search.naver.com/search.naver?where=nex...          트위치tv\n",
516 |        "14  http://search.naver.com/search.naver?where=nex...         미녀와 야수\n",
517 |        "15  http://search.naver.com/search.naver?where=nex...             엠넷\n",
518 |        "16  http://search.naver.com/search.naver?where=nex...            도봉순\n",
519 |        "17  http://search.naver.com/search.naver?where=nex...             영화\n",
520 |        "18  http://search.naver.com/search.naver?where=nex...            문재인\n",
521 |        "19  http://search.naver.com/search.naver?where=nex...           미세먼지"
522 |       ]
523 |      },
524 |      "execution_count": 16,
525 |      "metadata": {},
526 |      "output_type": "execute_result"
527 |     }
528 |    ],
529 |    "source": [
530 |     "import pandas as pd\n",
531 |     "pd.DataFrame(data)"
532 |    ]
533 |   },
534 |   {
535 |    "cell_type": "markdown",
536 |    "metadata": {},
537 |    "source": [
538 |     "### Fetching Naver blog posts"
539 |    ]
540 |   },
541 |   {
542 |    "cell_type": "code",
543 |    "execution_count": 17,
544 |    "metadata": {
545 |     "collapsed": true
546 |    },
547 |    "outputs": [],
548 |    "source": [
549 |     "response = requests.get('https://search.naver.com/search.naver?where=post&sm=tab_pge&query=%ED%8C%8C%EC%9D%B4%EC%8D%AC&st=sim&date_option=0&date_from=&date_to=&dup_remove=1&post_blogurl=&post_blogurl_without=&srchby=all&nso=&ie=utf8&start=1')\n",
550 |     "bs = BeautifulSoup(response.text, 'html.parser')"
551 |    ]
552 |   },
553 |   {
554 |    "cell_type": "code",
555 |    "execution_count": 18,
556 |    "metadata": {
557 |     "collapsed": false
558 |    },
559 |    "outputs": [],
560 |    "source": [
561 |     "contents = bs.select('ul#elThumbnailResultArea li.sh_blog_top dt')"
562 |    ]
563 |   },
564 |   {
565 |    "cell_type": "markdown",
566 |    "metadata": {},
567 |    "source": [
568 |     "#### Blog post title"
569 |    ]
570 |   },
571 |   {
572 |    "cell_type": "code",
573 |    "execution_count": 19,
574 |    "metadata": {
575 |     "collapsed": false
576 |    },
577 |    "outputs": [
578 |     {
579 |      "data": {
580 |       "text/plain": [
581 |        "[<dt><a class=\"sh_blog_title _sp_each_url _sp_each_title\" href=\"http://rmsep39.tistory.com/1905\" onclick=\"return goOtherCR(this, 'a=blg*t.tit&amp;r=2&amp;i=a00000fa_24235b281218b73aca74f1c5&amp;u='+urlencode(this.href))\" target=\"_blank\" title=\"파이썬프로그래밍 기업에서 선호하는 이유\"><strong class=\"hl\">파이썬</strong>프로그래밍 기업에서 선호하는 이유</a></dt>,\n",
582 |        " <dt><a class=\"sh_blog_title _sp_each_url _sp_each_title\" href=\"http://blog.naver.com/infopub?Redirect=Log&amp;logNo=220945501558\" onclick=\"return goOtherCR(this, 'a=blg*i.tit&amp;r=2&amp;i=90000003_00000000000000337160CD76&amp;u='+urlencode(this.href))\" target=\"_blank\" title=\"초보자를 위한 파이썬 200제\">초보자를 위한 <strong class=\"hl\">파이썬</strong> 200제</a></dt>,\n",
583 |        " <dt><a class=\"sh_blog_title _sp_each_url _sp_each_title\" href=\"http://1984.tistory.com/448\" onclick=\"return goOtherCR(this, 'a=blg*t.tit&amp;r=2&amp;i=a00000fa_127472e44f8c39b566a8fa88&amp;u='+urlencode(this.href))\" target=\"_blank\" title=\"파이썬 프로그래밍 배워야하는 3가지이유!\"><strong class=\"hl\">파이썬</strong> 프로그래밍 배워야하는 3가지이유!</a></dt>]"
584 |       ]
585 |      },
586 |      "execution_count": 19,
587 |      "metadata": {},
588 |      "output_type": "execute_result"
589 |     }
590 |    ],
591 |    "source": [
592 |     "contents[0:3]"
593 |    ]
594 |   },
595 |   {
596 |    "cell_type": "code",
597 |    "execution_count": 20,
598 |    "metadata": {
599 |     "collapsed": false
600 |    },
601 |    "outputs": [
602 |     {
603 |      "data": {
604 |       "text/plain": [
605 |        "<dt><a class=\"sh_blog_title _sp_each_url _sp_each_title\" href=\"http://rmsep39.tistory.com/1905\" onclick=\"return goOtherCR(this, 'a=blg*t.tit&amp;r=2&amp;i=a00000fa_24235b281218b73aca74f1c5&amp;u='+urlencode(this.href))\" target=\"_blank\" title=\"파이썬프로그래밍 기업에서 선호하는 이유\"><strong class=\"hl\">파이썬</strong>프로그래밍 기업에서 선호하는 이유</a></dt>"
606 |       ]
607 |      },
608 |      "execution_count": 20,
609 |      "metadata": {},
610 |      "output_type": "execute_result"
611 |     }
612 |    ],
613 |    "source": [
614 |     "contents[0]"
615 |    ]
616 |   },
617 |   {
618 |    "cell_type": "code",
619 |    "execution_count": 21,
620 |    "metadata": {
621 |     "collapsed": false
622 |    },
623 |    "outputs": [
624 |     {
625 |      "data": {
626 |       "text/plain": [
627 |        "'파이썬프로그래밍 기업에서 선호하는 이유'"
628 |       ]
629 |      },
630 |      "execution_count": 21,
631 |      "metadata": {},
632 |      "output_type": "execute_result"
633 |     }
634 |    ],
635 |    "source": [
636 |     "contents[0].select_one('a').attrs.get('title')"
637 |    ]
638 |   },
639 |   {
640 |    "cell_type": "markdown",
641 |    "metadata": {},
642 |    "source": [
643 |     "#### Blog post address"
644 |    ]
645 |   },
646 |   {
647 |    "cell_type": "code",
648 |    "execution_count": 22,
649 |    "metadata": {
650 |     "collapsed": false
651 |    },
652 |    "outputs": [
653 |     {
654 |      "data": {
655 |       "text/plain": [
656 |        "[<dt><a class=\"sh_blog_title _sp_each_url _sp_each_title\" href=\"http://rmsep39.tistory.com/1905\" onclick=\"return goOtherCR(this, 'a=blg*t.tit&amp;r=2&amp;i=a00000fa_24235b281218b73aca74f1c5&amp;u='+urlencode(this.href))\" target=\"_blank\" title=\"파이썬프로그래밍 기업에서 선호하는 이유\"><strong class=\"hl\">파이썬</strong>프로그래밍 기업에서 선호하는 이유</a></dt>,\n",
657 |        " <dt><a class=\"sh_blog_title _sp_each_url _sp_each_title\" href=\"http://blog.naver.com/infopub?Redirect=Log&amp;logNo=220945501558\" onclick=\"return goOtherCR(this, 'a=blg*i.tit&amp;r=2&amp;i=90000003_00000000000000337160CD76&amp;u='+urlencode(this.href))\" target=\"_blank\" title=\"초보자를 위한 파이썬 200제\">초보자를 위한 <strong class=\"hl\">파이썬</strong> 200제</a></dt>,\n",
658 |        " <dt><a class=\"sh_blog_title _sp_each_url _sp_each_title\" href=\"http://1984.tistory.com/448\" onclick=\"return goOtherCR(this, 'a=blg*t.tit&amp;r=2&amp;i=a00000fa_127472e44f8c39b566a8fa88&amp;u='+urlencode(this.href))\" target=\"_blank\" title=\"파이썬 프로그래밍 배워야하는 3가지이유!\"><strong class=\"hl\">파이썬</strong> 프로그래밍 배워야하는 3가지이유!</a></dt>]"
659 |       ]
660 |      },
661 |      "execution_count": 22,
662 |      "metadata": {},
663 |      "output_type": "execute_result"
664 |     }
665 |    ],
666 |    "source": [
667 |     "contents[0:3]"
668 |    ]
669 |   },
670 |   {
671 |    "cell_type": "code",
672 |    "execution_count": 23,
673 |    "metadata": {
674 |     "collapsed": false
675 |    },
676 |    "outputs": [
677 |     {
678 |      "data": {
679 |       "text/plain": [
680 |        "'http://rmsep39.tistory.com/1905'"
681 |       ]
682 |      },
683 |      "execution_count": 23,
684 |      "metadata": {},
685 |      "output_type": "execute_result"
686 |     }
687 |    ],
688 |    "source": [
689 |     "contents[0].select_one('a').attrs.get('href')"
690 |    ]
691 |   },
692 |   {
693 |    "cell_type": "markdown",
694 |    "metadata": {},
695 |    "source": [
696 |     "#### Parsing blog post titles and addresses"
697 |    ]
698 |   },
699 |   {
700 |    "cell_type": "code",
701 |    "execution_count": 24,
702 |    "metadata": {
703 |     "collapsed": false
704 |    },
705 |    "outputs": [],
706 |    "source": [
707 |     "data = [\n",
708 |     "    {'title': content.select_one('a').attrs.get('title'),\n",
709 |     "    'address' : content.select_one('a').attrs.get('href')}\n",
710 |     "    for content\n",
711 |     "    in contents\n",
712 |     "]"
713 |    ]
714 |   },
715 |   {
716 |    "cell_type": "code",
717 |    "execution_count": 25,
718 |    "metadata": {
719 |     "collapsed": false
720 |    },
721 |    "outputs": [
722 |     {
723 |      "data": {
724 |       "text/plain": [
725 |        "10"
726 |       ]
727 |      },
728 |      "execution_count": 25,
729 |      "metadata": {},
730 |      "output_type": "execute_result"
731 |     }
732 |    ],
733 |    "source": [
734 |     "len(data)"
735 |    ]
736 |   },
737 |   {
738 |    "cell_type": "code",
739 |    "execution_count": 26,
740 |    "metadata": {
741 |     "collapsed": false
742 |    },
743 |    "outputs": [
744 |     {
745 |      "data": {
746 |       "text/plain": [
747 |        "[{'address': 'http://rmsep39.tistory.com/1905',\n",
748 |        "  'title': '파이썬프로그래밍 기업에서 선호하는 이유'},\n",
749 |        " {'address': 'http://blog.naver.com/infopub?Redirect=Log&logNo=220945501558',\n",
750 |        "  'title': '초보자를 위한 파이썬 200제'},\n",
751 |        " {'address': 'http://1984.tistory.com/448', 'title': '파이썬 프로그래밍 배워야하는 3가지이유!'},\n",
752 |        " {'address': 'http://chogar.blog.me/220942149662',\n",
753 |        "  'title': '마인크래프트를 활용한 파이썬 프로그래밍 과정 시작!'},\n",
754 |        " {'address': 'http://edujoa.tistory.com/1389', 'title': '프로그래밍 입문 파이썬부터 시작!!'},\n",
755 |        " {'address': 'http://blog.naver.com/nasu0210?Redirect=Log&logNo=220932224509',\n",
756 |        "  'title': '대구 프로그래밍 학원 파이썬 1개월 완성 주말 과정 개설'},\n",
757 |        " {'address': 'http://blog.alyac.co.kr/985',\n",
758 |        "  'title': '패치되지 않은 파이썬과 자바 취약점, 해커들이 FTP 인젝션을 통해 방화벽 우회하도록 허용해'},\n",
759 |        " {'address': 'http://shaeod.tistory.com/949',\n",
760 |        "  'title': '[개발] 파이썬 다운로드 및 윈도우에 설치하는 방법'},\n",
761 |        " {'address': 'http://rmsep39.tistory.com/2056',\n",
762 |        "  'title': '파이썬 강좌, 비전공자도 배울 수 있다!'},\n",
763 |        " {'address': 'http://blog.fastcampus.co.kr/220942981452',\n",
764 |        "  'title': '파이썬과 장고의 실무 개발 노하우를 전수해드리겠습니다.'}]"
765 |       ]
766 |      },
767 |      "execution_count": 26,
768 |      "metadata": {},
769 |      "output_type": "execute_result"
770 |     }
771 |    ],
772 |    "source": [
773 |     "data"
774 |    ]
775 |   },
776 |   {
777 |    "cell_type": "code",
778 |    "execution_count": 27,
779 |    "metadata": {
780 |     "collapsed": false
781 |    },
782 |    "outputs": [
783 |     {
784 |      "data": {
785 |       "text/html": [
786 |        "<div>\n",
787 |        "<table border=\"1\" class=\"dataframe\">\n",
788 |        "  <thead>\n",
789 |        "    <tr style=\"text-align: right;\">\n",
790 |        "      <th></th>\n",
791 |        "      <th>address</th>\n",
792 |        "      <th>title</th>\n",
793 |        "    </tr>\n",
794 |        "  </thead>\n",
795 |        "  <tbody>\n",
796 |        "    <tr>\n",
797 |        "      <th>0</th>\n",
798 |        "      <td>http://rmsep39.tistory.com/1905</td>\n",
799 |        "      <td>파이썬프로그래밍 기업에서 선호하는 이유</td>\n",
800 |        "    </tr>\n",
801 |        "    <tr>\n",
802 |        "      <th>1</th>\n",
803 |        "      <td>http://blog.naver.com/infopub?Redirect=Log&amp;log...</td>\n",
804 |        "      <td>초보자를 위한 파이썬 200제</td>\n",
805 |        "    </tr>\n",
806 |        "    <tr>\n",
807 |        "      <th>2</th>\n",
808 |        "      <td>http://1984.tistory.com/448</td>\n",
809 |        "      <td>파이썬 프로그래밍 배워야하는 3가지이유!</td>\n",
810 |        "    </tr>\n",
811 |        "    <tr>\n",
812 |        "      <th>3</th>\n",
813 |        "      <td>http://chogar.blog.me/220942149662</td>\n",
814 |        "      <td>마인크래프트를 활용한 파이썬 프로그래밍 과정 시작!</td>\n",
815 |        "    </tr>\n",
816 |        "    <tr>\n",
817 |        "      <th>4</th>\n",
818 |        "      <td>http://edujoa.tistory.com/1389</td>\n",
819 |        "      <td>프로그래밍 입문 파이썬부터 시작!!</td>\n",
820 |        "    </tr>\n",
821 |        "    <tr>\n",
822 |        "      <th>5</th>\n",
823 |        "      <td>http://blog.naver.com/nasu0210?Redirect=Log&amp;lo...</td>\n",
824 |        "      <td>대구 프로그래밍 학원 파이썬 1개월 완성 주말 과정 개설</td>\n",
825 |        "    </tr>\n",
826 |        "    <tr>\n",
827 |        "      <th>6</th>\n",
828 |        "      <td>http://blog.alyac.co.kr/985</td>\n",
829 |        "      <td>패치되지 않은 파이썬과 자바 취약점, 해커들이 FTP 인젝션을 통해 방화벽 우회하도...</td>\n",
830 |        "    </tr>\n",
831 |        "    <tr>\n",
832 |        "      <th>7</th>\n",
833 |        "      <td>http://shaeod.tistory.com/949</td>\n",
834 |        "      <td>[개발] 파이썬 다운로드 및 윈도우에 설치하는 방법</td>\n",
835 |        "    </tr>\n",
836 |        "    <tr>\n",
837 |        "      <th>8</th>\n",
838 |        "      <td>http://rmsep39.tistory.com/2056</td>\n",
839 |        "      <td>파이썬 강좌, 비전공자도 배울 수 있다!</td>\n",
840 |        "    </tr>\n",
841 |        "    <tr>\n",
842 |        "      <th>9</th>\n",
843 |        "      <td>http://blog.fastcampus.co.kr/220942981452</td>\n",
844 |        "      <td>파이썬과 장고의 실무 개발 노하우를 전수해드리겠습니다.</td>\n",
845 |        "    </tr>\n",
846 |        "  </tbody>\n",
847 |        "</table>\n",
848 |        "</div>"
849 |       ],
850 |       "text/plain": [
851 |        "                                             address  \\\n",
852 |        "0                    http://rmsep39.tistory.com/1905   \n",
853 |        "1  http://blog.naver.com/infopub?Redirect=Log&log...   \n",
854 |        "2                        http://1984.tistory.com/448   \n",
855 |        "3                 http://chogar.blog.me/220942149662   \n",
856 |        "4                     http://edujoa.tistory.com/1389   \n",
857 |        "5  http://blog.naver.com/nasu0210?Redirect=Log&lo...   \n",
858 |        "6                        http://blog.alyac.co.kr/985   \n",
859 |        "7                      http://shaeod.tistory.com/949   \n",
860 |        "8                    http://rmsep39.tistory.com/2056   \n",
861 |        "9          http://blog.fastcampus.co.kr/220942981452   \n",
862 |        "\n",
863 |        "                                               title  \n",
864 |        "0                              파이썬프로그래밍 기업에서 선호하는 이유  \n",
865 |        "1                                   초보자를 위한 파이썬 200제  \n",
866 |        "2                             파이썬 프로그래밍 배워야하는 3가지이유!  \n",
867 |        "3                       마인크래프트를 활용한 파이썬 프로그래밍 과정 시작!  \n",
868 |        "4                                프로그래밍 입문 파이썬부터 시작!!  \n",
869 |        "5                    대구 프로그래밍 학원 파이썬 1개월 완성 주말 과정 개설  \n",
870 |        "6  패치되지 않은 파이썬과 자바 취약점, 해커들이 FTP 인젝션을 통해 방화벽 우회하도...  \n",
871 |        "7                       [개발] 파이썬 다운로드 및 윈도우에 설치하는 방법  \n",
872 |        "8                             파이썬 강좌, 비전공자도 배울 수 있다!  \n",
873 |        "9                     파이썬과 장고의 실무 개발 노하우를 전수해드리겠습니다.  "
874 |       ]
875 |      },
876 |      "execution_count": 27,
877 |      "metadata": {},
878 |      "output_type": "execute_result"
879 |     }
880 |    ],
881 |    "source": [
882 |     "pd.DataFrame(data)"
883 |    ]
884 |   }
885 |  ],
886 |  "metadata": {
887 |   "kernelspec": {
888 |    "display_name": "Python 3",
889 |    "language": "python",
890 |    "name": "python3"
891 |   },
892 |   "language_info": {
893 |    "codemirror_mode": {
894 |     "name": "ipython",
895 |     "version": 3
896 |    },
897 |    "file_extension": ".py",
898 |    "mimetype": "text/x-python",
899 |    "name": "python",
900 |    "nbconvert_exporter": "python",
901 |    "pygments_lexer": "ipython3",
902 |    "version": "3.6.0"
903 |   }
904 |  },
905 |  "nbformat": 4,
906 |  "nbformat_minor": 2
907 | }
908 | 
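
The blog-search request in the notebook above hard-codes a long, already percent-encoded URL. As a minimal sketch, the same kind of URL can be assembled from a dict with the standard library. The helper name `build_search_url` and its `page`/`per_page` arguments are assumptions made for illustration (the notebook itself only shows `start=1` and ten results per page), and only a subset of the original query parameters is kept.

```python
from urllib.parse import urlencode

BASE = "https://search.naver.com/search.naver"


def build_search_url(keyword, page=1, per_page=10):
    """Return a blog-search URL for `keyword` (hypothetical helper).

    `page` is 1-based; the `start` parameter is the index of the first
    result on that page, assuming `per_page` results per page.
    """
    params = {
        "where": "post",
        "sm": "tab_pge",
        "query": keyword,  # urlencode percent-encodes this as UTF-8
        "ie": "utf8",
        "start": 1 + (page - 1) * per_page,
    }
    return BASE + "?" + urlencode(params)


url = build_search_url("파이썬", page=2)
print(url)
```

When `requests` is available, passing the same dict as `requests.get(BASE, params=params)` lets the library do the encoding instead of building the string by hand.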


--------------------------------------------------------------------------------
/Scrapping text mining papers in arXiv.py:
--------------------------------------------------------------------------------
 1 | import requests
 2 | import pandas as pd
 3 | import re
 4 | import time
 5 | from bs4 import BeautifulSoup
 6 | 
 7 | # url = 'https://arxiv.org/find/all/1/all:+EXACT+text_mining/0/1/0/all/0/1?skip=0'
 8 | # response = requests.get(url)
 9 | # bs = BeautifulSoup(response.text, 'html.parser')
10 | 
 11 | start = time.perf_counter()  # time.clock() was removed in Python 3.8
 12 | contents = []
 13 | for i, skip in enumerate(list(range(0, 175, 25))):
 14 | 
 15 |     tmp_url = 'https://arxiv.org/find/all/1/all:+EXACT+text_mining/0/1/0/all/0/1?skip=' + str(skip)
 16 |     tmp_response = requests.get(tmp_url)
 17 |     tmp_bs = BeautifulSoup(tmp_response.text, 'html.parser')
 18 |     tmp_contents = tmp_bs.select_one('div#dlpage').select('span.list-identifier')
 19 | 
 20 |     tmp_href = [tmp_href.select_one('a').attrs.get('href') for tmp_href in tmp_contents]
 21 |     #tmp_href = list(map(lambda x : x.select_one('a').attrs.get('href'),tmp_contents))
 22 |     tmp_list = list(map(lambda x : 'https://arxiv.org' + x, tmp_href))
 23 | 
 24 | 
 25 |     for j, tmp_paper in enumerate(tmp_list):
 26 |         tmp_response_paper = requests.get(tmp_paper)
 27 |         tmp_bs_paper = BeautifulSoup(tmp_response_paper.text, 'html.parser')
 28 |         tmp_contents = {'title' : re.sub('\\s+',
 29 |                                          ' ',
 30 |                                          tmp_bs_paper.select_one('h1.title.mathjax').text.replace('Title:\n', '')),
 31 |                          'author' : tmp_bs_paper.select_one('div.authors').text.replace('\n', '').replace('Authors:', ''),
 32 |                          'subject' : tmp_bs_paper.select_one('span.primary-subject').text,
 33 |                          'abstract' : tmp_bs_paper.select_one('blockquote.abstract.mathjax').text.replace('\nAbstract: ', ''),
 34 |                          'meta' : re.sub('\\s+',
 35 |                                          ' ',
 36 |                                          tmp_bs_paper.select_one('div.submission-history').text.split('[v1]')[1].strip())}
 37 |         contents.append(tmp_contents)
 38 |         print(j + 1, '/', len(tmp_list))
 39 | 
 40 |     print(i + 1, '/', len(list(range(0, 175, 25))))
 41 | end = time.perf_counter()
42 | print(end - start)
43 | 
 44 | print(len(contents))
 45 | data = pd.DataFrame(contents)
 46 | print(data.shape)
 47 | print(data.iloc[0:2,:])
48 | 
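
The script above repeats two small patterns: stepping through result pages with a `skip` offset, and stripping a label such as `Title:` before collapsing whitespace with `re.sub`. A minimal sketch of both as standalone helpers follows. The names `page_offsets` and `clean_field` are hypothetical and not part of the original script, and the sample string is made up for illustration.

```python
import re


def page_offsets(total, per_page=25):
    """Return the `skip` offsets needed to cover `total` results."""
    return list(range(0, total, per_page))


def clean_field(raw, label=""):
    """Drop a leading label (e.g. 'Title:') and collapse whitespace runs."""
    text = raw.strip()
    if label and text.startswith(label):
        text = text[len(label):]
    # Replace every run of spaces, tabs, or newlines with a single space.
    return re.sub(r"\s+", " ", text).strip()


print(page_offsets(175))
print(clean_field("Title:\nText   mining\n  survey", "Title:"))
```

Keeping the cleaning step in one function means each field is normalized the same way, instead of mixing `replace` chains and `re.sub` calls per field.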


--------------------------------------------------------------------------------
/Static webpage and Dynamic webpage.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# Static websites and dynamic websites\n",
  8 |     "_This material is based on the lecture notes from instructor 안수찬's Work Automation with Python Camp (fast campus)._  \n",
  9 |     "Author: 김보섭  \n",
 10 |     "\n",
 11 |     "## Preliminary\n",
 12 |     "* Website (web client): HTML, CSS, Javascript\n",
 13 |     "* HTML: the structure of the site\n",
 14 |     "* CSS: the style of the site\n",
 15 |     "* Javascript: the dynamic features of the site (animation, loading data from the server)\n",
 16 |     "* Who delivers the data to us\n",
 17 |     "    * Web server => HTML (Server rendering)\n",
 18 |     "    * Web client (Javascript) => HTML (Client Rendering)\n",
 19 |     "\n",
 20 |     "## Summary\n",
 21 |     "* __Static website__ _(requests.get HTML => BeautifulSoup CSS Selector)_\n",
 22 |     "    * Download the HTML file (Crawling, Scraping)\n",
 23 |     "    * Locate and extract the data we want (Parsing)\n",
 24 |     "        * CSS Selector\n",
 25 |     "  \n",
 26 |     "  \n",
 27 |     "* __Dynamic website__ _(API URL (Headers) => requests.get JSON => Dict)_\n",
 28 |     "    * Find where the data comes from (\\_____________:: API)\n",
 29 |     "    * Load the data\n",
 30 |     "    * Extract the data (parsing)\n",
 31 |     "\n",
 32 |     "## HTTP Method (how a request is sent to the server)\n",
 33 |     "* GET - sends the data (request) through the URL (from the browser to the server).\n",
 34 |     "* POST - sends the data (request) through the HTTP body."
 35 |    ]
 36 |   },
 37 |   {
 38 |    "cell_type": "markdown",
 39 |    "metadata": {},
 40 |    "source": [
 41 |     "## Static website\n",
 42 |     "### Naver real-time search keyword data"
 43 |    ]
 44 |   },
 45 |   {
 46 |    "cell_type": "code",
 47 |    "execution_count": 1,
 48 |    "metadata": {
 49 |     "collapsed": true
 50 |    },
 51 |    "outputs": [],
 52 |    "source": [
 53 |     "import requests\n",
 54 |     "from bs4 import BeautifulSoup"
 55 |    ]
 56 |   },
 57 |   {
 58 |    "cell_type": "code",
 59 |    "execution_count": 2,
 60 |    "metadata": {
 61 |     "collapsed": false
 62 |    },
 63 |    "outputs": [
 64 |     {
 65 |      "data": {
 66 |       "text/plain": [
 67 |        "200"
 68 |       ]
 69 |      },
 70 |      "execution_count": 2,
 71 |      "metadata": {},
 72 |      "output_type": "execute_result"
 73 |     }
 74 |    ],
 75 |    "source": [
 76 |     "response = requests.get('http://www.naver.com/')\n",
 77 |     "response.status_code # 200 means the data was received successfully"
 78 |    ]
 79 |   },
 80 |   {
 81 |    "cell_type": "code",
 82 |    "execution_count": 3,
 83 |    "metadata": {
 84 |     "collapsed": false
 85 |    },
 86 |    "outputs": [
 87 |     {
 88 |      "data": {
 89 |       "text/plain": [
 90 |        "str"
 91 |       ]
 92 |      },
 93 |      "execution_count": 3,
 94 |      "metadata": {},
 95 |      "output_type": "execute_result"
 96 |     }
 97 |    ],
 98 |    "source": [
 99 |     "response.text # From this HTML text, parse the data we want\n",
100 |     "type(response.text)"
101 |    ]
102 |   },
103 |   {
104 |    "cell_type": "code",
105 |    "execution_count": 4,
106 |    "metadata": {
107 |     "collapsed": false
108 |    },
109 |    "outputs": [],
110 |    "source": [
111 |     "# Represent the HTML structure hierarchically (DOM : Document Object Model)\n",
112 |     "# HTML > BODY > DIV > UL > LI ...\n",
113 |     "#             > DIV > OL > LI\n",
114 |     "dom = BeautifulSoup(response.text, 'html.parser') "
115 |    ]
116 |   },
117 |   {
118 |    "cell_type": "code",
119 |    "execution_count": 5,
120 |    "metadata": {
121 |     "collapsed": false
122 |    },
123 |    "outputs": [],
124 |    "source": [
125 |     "rank_elements = dom.select('ol#realrank li.up')"
126 |    ]
127 |   },
128 |   {
129 |    "cell_type": "code",
130 |    "execution_count": 6,
131 |    "metadata": {
132 |     "collapsed": true
133 |    },
134 |    "outputs": [],
135 |    "source": [
136 |     "rank_element = rank_elements[0]"
137 |    ]
138 |   },
139 |   {
140 |    "cell_type": "code",
141 |    "execution_count": 7,
142 |    "metadata": {
143 |     "collapsed": false
144 |    },
145 |    "outputs": [
146 |     {
147 |      "data": {
148 |       "text/plain": [
149 |        "<li class=\"up\" value=\"1\"><a href=\"http://search.naver.com/search.naver?where=nexearch&amp;query=%EC%B1%84%EC%88%98%EB%B9%88&amp;sm=top_lve&amp;ie=utf8\" title=\"\"><span class=\"ell\">채수빈</span><span class=\"tx\">상승</span><span class=\"ic\"></span><span class=\"rk\">171</span></a></li>"
150 |       ]
151 |      },
152 |      "execution_count": 7,
153 |      "metadata": {},
154 |      "output_type": "execute_result"
155 |     }
156 |    ],
157 |    "source": [
158 |     "rank_element"
159 |    ]
160 |   },
161 |   {
162 |    "cell_type": "code",
163 |    "execution_count": 8,
164 |    "metadata": {
165 |     "collapsed": false
166 |    },
167 |    "outputs": [
168 |     {
169 |      "data": {
170 |       "text/plain": [
171 |        "'채수빈'"
172 |       ]
173 |      },
174 |      "execution_count": 8,
175 |      "metadata": {},
176 |      "output_type": "execute_result"
177 |     }
178 |    ],
179 |    "source": [
180 |     "rank_element.select_one('span.ell').text"
181 |    ]
182 |   },
183 |   {
184 |    "cell_type": "code",
185 |    "execution_count": 9,
186 |    "metadata": {
187 |     "collapsed": false
188 |    },
189 |    "outputs": [
190 |     {
191 |      "data": {
192 |       "text/plain": [
193 |        "['채수빈',\n",
194 |        " '역적',\n",
195 |        " '에이미',\n",
196 |        " '역적 ost',\n",
197 |        " '이수민',\n",
198 |        " '한국 중국',\n",
199 |        " '프리스틴',\n",
200 |        " '학교폭력 실태조사',\n",
201 |        " '애플',\n",
202 |        " '그녀는 거짓말을 너무 사랑해',\n",
203 |        " '몬스타엑스',\n",
204 |        " '박근혜',\n",
205 |        " '외부자들',\n",
206 |        " '하이라이트',\n",
207 |        " 'v앱',\n",
208 |        " '이비에스아이',\n",
209 |        " '아프리카tv',\n",
210 |        " '프리즌']"
211 |       ]
212 |      },
213 |      "execution_count": 9,
214 |      "metadata": {},
215 |      "output_type": "execute_result"
216 |     }
217 |    ],
218 |    "source": [
219 |     "ranks = [rank_element.select_one('span.ell').text for rank_element in rank_elements ]\n",
220 |     "ranks"
221 |    ]
222 |   },
223 |   {
224 |    "cell_type": "markdown",
225 |    "metadata": {},
226 |    "source": [
227 |     "### Fetching Naver blog posts\n",
228 |     "* Post title\n",
229 |     "* Post address  "
230 |    ]
231 |   },
232 |   {
233 |    "cell_type": "code",
234 |    "execution_count": 10,
235 |    "metadata": {
236 |     "collapsed": true
237 |    },
238 |    "outputs": [],
239 |    "source": [
240 |     "response = requests.get('https://search.naver.com/search.naver?where=post&sm=tab_jum&ie=utf8&query=%ED%8C%8C%EC%9D%B4%EC%8D%AC')"
241 |    ]
242 |   },
243 |   {
244 |    "cell_type": "code",
245 |    "execution_count": 11,
246 |    "metadata": {
247 |     "collapsed": true
248 |    },
249 |    "outputs": [],
250 |    "source": [
251 |     "dom = BeautifulSoup(response.text, 'html.parser')"
252 |    ]
253 |   },
254 |   {
255 |    "cell_type": "code",
256 |    "execution_count": 12,
257 |    "metadata": {
258 |     "collapsed": false
259 |    },
260 |    "outputs": [],
261 |    "source": [
262 |     "post_elements = dom.select('ul#elThumbnailResultArea li.sh_blog_top')"
263 |    ]
264 |   },
265 |   {
266 |    "cell_type": "code",
267 |    "execution_count": 13,
268 |    "metadata": {
269 |     "collapsed": false
270 |    },
271 |    "outputs": [
272 |     {
273 |      "data": {
274 |       "text/plain": [
275 |        "[{'title': '파이썬프로그래밍 기업에서 선호하는 이유', 'url': 'http://rmsep39.tistory.com/1905'},\n",
276 |        " {'title': '초보자를 위한 파이썬 200제',\n",
277 |        "  'url': 'http://blog.naver.com/infopub?Redirect=Log&logNo=220945501558'},\n",
278 |        " {'title': '파이썬 프로그래밍 배워야하는 3가지이유!', 'url': 'http://1984.tistory.com/448'},\n",
279 |        " {'title': '마인크래프트를 활용한 파이썬 프로그래밍 과정 시작!',\n",
280 |        "  'url': 'http://chogar.blog.me/220942149662'},\n",
281 |        " {'title': '프로그래밍 입문 파이썬부터 시작!!', 'url': 'http://edujoa.tistory.com/1389'},\n",
282 |        " {'title': '오픈소스 언어로 만나는 데이터 분석, ‘파이썬’과 ‘R’',\n",
283 |        "  'url': 'http://blog.lgcns.com/1363'},\n",
284 |        " {'title': '대구 프로그래밍 학원 파이썬 1개월 완성 주말 과정 개설',\n",
285 |        "  'url': 'http://blog.naver.com/nasu0210?Redirect=Log&logNo=220932224509'},\n",
286 |        " {'title': '패치되지 않은 파이썬과 자바 취약점, 해커들이 FTP 인젝션을 통해 방화벽 우회하도록 허용해',\n",
287 |        "  'url': 'http://blog.alyac.co.kr/985'},\n",
288 |        " {'title': '[개발] 파이썬 다운로드 및 윈도우에 설치하는 방법',\n",
289 |        "  'url': 'http://shaeod.tistory.com/949'},\n",
290 |        " {'title': '파이썬 강좌, 비전공자도 배울 수 있다!', 'url': 'http://rmsep39.tistory.com/2056'}]"
291 |       ]
292 |      },
293 |      "execution_count": 13,
294 |      "metadata": {},
295 |      "output_type": "execute_result"
296 |     }
297 |    ],
298 |    "source": [
299 |     "# <a href=\"http://blog.naver.com/__________/123/\"></a>\n",
300 |     "# a (anchor) :: href (hyperlink reference)\n",
301 |     "data = [{'title' : post_element.select_one('a.sh_blog_title').attrs.get('title'),\n",
302 |     "         'url' : post_element.select_one('a.sh_blog_title').attrs.get('href')} for post_element in post_elements]\n",
303 |     "data"
304 |    ]
305 |   },
306 |   {
307 |    "cell_type": "markdown",
308 |    "metadata": {},
309 |    "source": [
310 |     "### Fetching Naver blog posts (paging through the results)\n",
311 |     "With the GET method, the request data is carried in the URL, so in the example below the URL parameter query has a specific value assigned to it.  \n",
312 |     "_(the value after = for the URL parameter query is '파이썬' (Python), percent-encoded)_  \n",
313 |     "\n",
314 |     "example : https://search.naver.com/search.naver?where=post&sm=tab_jum&ie=utf8&query=%ED%8C%8C%EC%9D%B4%EC%8D%AC  \n",
315 |     "example (with page information):  \n",
316 |     "https://search.naver.com/search.naver?where=post&sm=tab_pge&query=%ED%8C%8C%EC%9D%B4%EC%8D%AC&st=sim&date_option=0&dup_remove=1&srchby=all&ie=utf8&start=1\n",
317 |     "  \n",
318 |     "Therefore, code that pages through the Naver blog search results for '파이썬' and collects each post's title and URL can be written as follows."
319 |    ]
320 |   },
321 |   {
322 |    "cell_type": "code",
323 |    "execution_count": 14,
324 |    "metadata": {
325 |     "collapsed": false
326 |    },
327 |    "outputs": [],
328 |    "source": [
329 |     "base_url = 'https://search.naver.com/search.naver?where=post&sm=tab_pge&query=파이썬&start={page}'"
330 |    ]
331 |   },
332 |   {
333 |    "cell_type": "code",
334 |    "execution_count": 15,
335 |    "metadata": {
336 |     "collapsed": false
337 |    },
338 |    "outputs": [],
339 |    "source": [
340 |     "data = []\n",
341 |     "for page in range(1,101,10):\n",
342 |     "    response = requests.get(base_url.format(page = page))\n",
343 |     "    dom = BeautifulSoup(response.text, 'html.parser')\n",
344 |     "    post_elements = dom.select('ul#elThumbnailResultArea li.sh_blog_top')\n",
345 |     "    data.extend([{'title' : post_element.select_one('a.sh_blog_title').attrs.get('title'),\n",
346 |     "             'url' : post_element.select_one('a.sh_blog_title').attrs.get('href')} for post_element in post_elements])"
347 |    ]
348 |   },
349 |   {
350 |    "cell_type": "code",
351 |    "execution_count": 16,
352 |    "metadata": {
353 |     "collapsed": false
354 |    },
355 |    "outputs": [
356 |     {
357 |      "data": {
358 |       "text/plain": [
359 |        "'D:\\\\dev\\\\py-automate'"
360 |       ]
361 |      },
362 |      "execution_count": 16,
363 |      "metadata": {},
364 |      "output_type": "execute_result"
365 |     }
366 |    ],
367 |    "source": [
368 |     "# Assignment\n",
369 |     "# Read a file (keywords.txt),\n",
370 |     "# crawl the URLs and titles of the top-ranked blog posts for each keyword,\n",
371 |     "# and save them to a CSV file named after each keyword (<keyword>.csv)\n",
372 |     "import os\n",
373 |     "os.getcwd()"
374 |    ]
375 |   },
376 |   {
377 |    "cell_type": "markdown",
378 |    "metadata": {},
379 |    "source": [
380 |     "## Dynamic websites\n",
381 |     "* Check where the actual data comes from (example: Zigbang)\n",
382 |     "* Launch a real web browser and open Zigbang (Selenium) (covered in the next notebook file)"
383 |    ]
384 |   },
385 |   {
386 |    "cell_type": "markdown",
387 |    "metadata": {},
388 |    "source": [
389 |     "### Zigbang example (finding the actual URL)\n",
390 |     "* Zigbang site :: Zigbang developers :: Zigbang web client : Zigbang web server (API Server)  \n",
391 |     "* JSON (JSON API) :: JavaScript Object Notation == Python Dict (very similar!)  \n",
392 |     "* requests.get => str(JSON Format String)"
393 |    ]
394 |   },
395 |   {
396 |    "cell_type": "markdown",
397 |    "metadata": {},
398 |    "source": [
399 |     "#### Python Dict => JSON Format String\n",
400 |     "json.dumps()"
401 |    ]
402 |   },
403 |   {
404 |    "cell_type": "code",
405 |    "execution_count": 17,
406 |    "metadata": {
407 |     "collapsed": true
408 |    },
409 |    "outputs": [],
410 |    "source": [
411 |     "# Python Dict ==> JSON Format String\n",
412 |     "import json\n",
413 |     "student = {'name' : 'Boseop Kim', 'email' : 'svei89@korea.ac.kr'}\n",
414 |     "json_text = json.dumps(student)"
415 |    ]
416 |   },
417 |   {
418 |    "cell_type": "code",
419 |    "execution_count": 18,
420 |    "metadata": {
421 |     "collapsed": false
422 |    },
423 |    "outputs": [
424 |     {
425 |      "name": "stdout",
426 |      "output_type": "stream",
427 |      "text": [
428 |       "{\"name\": \"Boseop Kim\", \"email\": \"svei89@korea.ac.kr\"} <class 'str'>\n"
429 |      ]
430 |     }
431 |    ],
432 |    "source": [
433 |     "print(json_text, type(json_text))"
434 |    ]
435 |   },
436 |   {
437 |    "cell_type": "markdown",
438 |    "metadata": {},
439 |    "source": [
440 |     "#### JSON Format String => Python Dict\n",
441 |     "json.loads()"
442 |    ]
443 |   },
444 |   {
445 |    "cell_type": "code",
446 |    "execution_count": 19,
447 |    "metadata": {
448 |     "collapsed": false
449 |    },
450 |    "outputs": [
451 |     {
452 |      "data": {
453 |       "text/plain": [
454 |        "'{\"name\": \"Boseop Kim\", \"email\": \"svei89@korea.ac.kr\"}'"
455 |       ]
456 |      },
457 |      "execution_count": 19,
458 |      "metadata": {},
459 |      "output_type": "execute_result"
460 |     }
461 |    ],
462 |    "source": [
463 |     "json_text"
464 |    ]
465 |   },
466 |   {
467 |    "cell_type": "code",
468 |    "execution_count": 20,
469 |    "metadata": {
470 |     "collapsed": false
471 |    },
472 |    "outputs": [
473 |     {
474 |      "data": {
475 |       "text/plain": [
476 |        "{'email': 'svei89@korea.ac.kr', 'name': 'Boseop Kim'}"
477 |       ]
478 |      },
479 |      "execution_count": 20,
480 |      "metadata": {},
481 |      "output_type": "execute_result"
482 |     }
483 |    ],
484 |    "source": [
485 |     "json.loads(json_text)"
486 |    ]
487 |   },
488 |   {
489 |    "cell_type": "code",
490 |    "execution_count": 21,
491 |    "metadata": {
492 |     "collapsed": false
493 |    },
494 |    "outputs": [
495 |     {
496 |      "data": {
497 |       "text/plain": [
498 |        "dict"
499 |       ]
500 |      },
501 |      "execution_count": 21,
502 |      "metadata": {},
503 |      "output_type": "execute_result"
504 |     }
505 |    ],
506 |    "source": [
507 |     "type(json.loads(json_text))"
508 |    ]
509 |   },
510 |   {
511 |    "cell_type": "markdown",
512 |    "metadata": {},
513 |    "source": [
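514 |     "#### Zigbang example\n",
515 |     "Use Chrome DevTools to find the URL that actually serves the listing data. The string returned from that URL holds the Zigbang listings in JSON format, so it can be parsed directly with the json package.  "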
516 |    ]
517 |   },
518 |   {
519 |    "cell_type": "markdown",
520 |    "metadata": {},
521 |    "source": [
522 |     "#### Getting the deposit and monthly rent"
523 |    ]
524 |   },
525 |   {
526 |    "cell_type": "code",
527 |    "execution_count": 22,
528 |    "metadata": {
529 |     "collapsed": true
530 |    },
531 |    "outputs": [],
532 |    "source": [
533 |     "response = requests.get('https://api.zigbang.com/v3/items?detail=true&item_ids=[7468696,7780221,7747863,7699078,7250207,7779822,7593985,7672971,7557867,7662179,7728008,7748278,7590167,7753835,7684547,7767467,7764519,7703739,7729634,7792521,7747993,7736951,7515215,7577402,7720787,7603913,7721071,7755858,7775742,7755021,7694602,7566668,7722769,7791292,6915486,7542897,7618794,7772538,7690527,7777154,7745980,7710823,7685570,7780189,7775527,7770996,7782847,7750303,7715553,7493782,7736339,7663573,7645807,7652183,7439872,7768911,7584389,7615213,7718616,7720892]')"
534 |    ]
535 |   },
536 |   {
537 |    "cell_type": "code",
538 |    "execution_count": 23,
539 |    "metadata": {
540 |     "collapsed": true
541 |    },
542 |    "outputs": [],
543 |    "source": [
544 |     "zigbang = json.loads(response.text)\n",
545 |     "# zigbang = response.json()  # equivalent to the line above"
546 |    ]
547 |   },
548 |   {
549 |    "cell_type": "code",
550 |    "execution_count": 24,
551 |    "metadata": {
552 |     "collapsed": false
553 |    },
554 |    "outputs": [
555 |     {
556 |      "data": {
557 |       "text/plain": [
558 |        "dict"
559 |       ]
560 |      },
561 |      "execution_count": 24,
562 |      "metadata": {},
563 |      "output_type": "execute_result"
564 |     }
565 |    ],
566 |    "source": [
567 |     "type(zigbang)"
568 |    ]
569 |   },
570 |   {
571 |    "cell_type": "code",
572 |    "execution_count": 25,
573 |    "metadata": {
574 |     "collapsed": false
575 |    },
576 |    "outputs": [
577 |     {
578 |      "data": {
579 |       "text/plain": [
580 |        "68"
581 |       ]
582 |      },
583 |      "execution_count": 25,
584 |      "metadata": {},
585 |      "output_type": "execute_result"
586 |     }
587 |    ],
588 |    "source": [
589 |     "zigbang['items'][0]['item']['rent']"
590 |    ]
591 |   },
592 |   {
593 |    "cell_type": "code",
594 |    "execution_count": 26,
595 |    "metadata": {
596 |     "collapsed": false
597 |    },
598 |    "outputs": [
599 |     {
600 |      "data": {
601 |       "text/plain": [
602 |        "1000"
603 |       ]
604 |      },
605 |      "execution_count": 26,
606 |      "metadata": {},
607 |      "output_type": "execute_result"
608 |     }
609 |    ],
610 |    "source": [
611 |     "zigbang['items'][0]['item']['deposit']"
612 |    ]
613 |   },
614 |   {
615 |    "cell_type": "markdown",
616 |    "metadata": {},
617 |    "source": [
618 |     "#### Extracting monthly rent and deposit from all listings\n",
619 |     "A list of dicts (deposit, rent)"
620 |    ]
621 |   },
622 |   {
623 |    "cell_type": "code",
624 |    "execution_count": 27,
625 |    "metadata": {
626 |     "collapsed": false
627 |    },
628 |    "outputs": [
629 |     {
630 |      "data": {
631 |       "text/plain": [
632 |        "[{'deposit': 1000, 'rent': 68},\n",
633 |        " {'deposit': 2000, 'rent': 75},\n",
634 |        " {'deposit': 2000, 'rent': 30},\n",
635 |        " {'deposit': 500, 'rent': 55},\n",
636 |        " {'deposit': 1000, 'rent': 60},\n",
637 |        " {'deposit': 15000, 'rent': 0},\n",
638 |        " {'deposit': 1000, 'rent': 65},\n",
639 |        " {'deposit': 95, 'rent': 95},\n",
640 |        " {'deposit': 25000, 'rent': 0},\n",
641 |        " {'deposit': 3000, 'rent': 30},\n",
642 |        " {'deposit': 2000, 'rent': 65},\n",
643 |        " {'deposit': 1000, 'rent': 60},\n",
644 |        " {'deposit': 25000, 'rent': 0},\n",
645 |        " {'deposit': 500, 'rent': 50},\n",
646 |        " {'deposit': 5000, 'rent': 65},\n",
647 |        " {'deposit': 22000, 'rent': 0},\n",
648 |        " {'deposit': 2000, 'rent': 80},\n",
649 |        " {'deposit': 14000, 'rent': 0},\n",
650 |        " {'deposit': 1000, 'rent': 45},\n",
651 |        " {'deposit': 1000, 'rent': 65},\n",
652 |        " {'deposit': 500, 'rent': 60},\n",
653 |        " {'deposit': 1000, 'rent': 75},\n",
654 |        " {'deposit': 500, 'rent': 55},\n",
655 |        " {'deposit': 105, 'rent': 105},\n",
656 |        " {'deposit': 100, 'rent': 58},\n",
657 |        " {'deposit': 5000, 'rent': 50},\n",
658 |        " {'deposit': 2000, 'rent': 60},\n",
659 |        " {'deposit': 20000, 'rent': 0},\n",
660 |        " {'deposit': 500, 'rent': 65},\n",
661 |        " {'deposit': 17000, 'rent': 0},\n",
662 |        " {'deposit': 1000, 'rent': 38},\n",
663 |        " {'deposit': 1000, 'rent': 55},\n",
664 |        " {'deposit': 3000, 'rent': 67},\n",
665 |        " {'deposit': 12000, 'rent': 0},\n",
666 |        " {'deposit': 3000, 'rent': 25},\n",
667 |        " {'deposit': 1000, 'rent': 60},\n",
668 |        " {'deposit': 120, 'rent': 120},\n",
669 |        " {'deposit': 100, 'rent': 200},\n",
670 |        " {'deposit': 500, 'rent': 60},\n",
671 |        " {'deposit': 500, 'rent': 45},\n",
672 |        " {'deposit': 80, 'rent': 80},\n",
673 |        " {'deposit': 13000, 'rent': 0},\n",
674 |        " {'deposit': 3000, 'rent': 20},\n",
675 |        " {'deposit': 1000, 'rent': 70},\n",
676 |        " {'deposit': 500, 'rent': 25},\n",
677 |        " {'deposit': 1000, 'rent': 65},\n",
678 |        " {'deposit': 2000, 'rent': 80},\n",
679 |        " {'deposit': 95, 'rent': 95},\n",
680 |        " {'deposit': 3000, 'rent': 95},\n",
681 |        " {'deposit': 500, 'rent': 50},\n",
682 |        " {'deposit': 20000, 'rent': 0},\n",
683 |        " {'deposit': 500, 'rent': 45},\n",
684 |        " {'deposit': 9000, 'rent': 0},\n",
685 |        " {'deposit': 1000, 'rent': 65},\n",
686 |        " {'deposit': 5000, 'rent': 30},\n",
687 |        " {'deposit': 37000, 'rent': 0},\n",
688 |        " {'deposit': 13000, 'rent': 0},\n",
689 |        " {'deposit': 1000, 'rent': 100},\n",
690 |        " {'deposit': 1000, 'rent': 70},\n",
691 |        " {'deposit': 7000, 'rent': 40}]"
692 |       ]
693 |      },
694 |      "execution_count": 27,
695 |      "metadata": {},
696 |      "output_type": "execute_result"
697 |     }
698 |    ],
699 |    "source": [
700 |     "#for item in zigbang.get('items'):\n",
701 |     "#    deposit = item.get('item').get('deposit')\n",
702 |     "#    rent = item.get('item').get('rent')\n",
703 |     "#    print(deposit, rent)\n",
704 |     "[{'deposit': item.get('item').get('deposit'),\n",
705 |     "  'rent': item.get('item').get('rent')} for item in zigbang.get('items')]"
706 |    ]
707 |   },
708 |   {
709 |    "cell_type": "markdown",
710 |    "metadata": {},
711 |    "source": [
712 |     "### Yogiyo example (the data is visible in the response in DevTools, but not when the URL is opened directly)\n",
713 |     "Fetch the names of the 20 restaurants at the link below  \n",
714 |     "링크 : https://www.yogiyo.co.kr/mobile/#/%EC%84%9C%EC%9A%B8/139231/\n",
715 |     "\n",
716 |     "* Referer : Where did the request come from?\n",
717 |     "    * DevTools : yogiyo.co.kr/\n",
718 |     "    * Typing the URL directly : no Referer is set  \n",
719 |     "    \n",
720 |     "    \n",
721 |     "* Request Header (http://docs.python-requests.org/en/master/user/quickstart/)\n",
722 |     "    * Authorization\n",
723 |     "    * X---\n",
724 |     "    * Host"
725 |    ]
726 |   },
727 |   {
728 |    "cell_type": "code",
729 |    "execution_count": 28,
730 |    "metadata": {
731 |     "collapsed": true
732 |    },
733 |    "outputs": [],
734 |    "source": [
735 |     "url = 'https://www.yogiyo.co.kr/api/v1/restaurants-geo/?items=20&order=rank&page=0&search=&zip_code=139231'\n",
736 |     "# The items value in the URL query above can be changed to fetch more entries at once (if the site's API is well designed!)\n",
737 |     "headers = {'X-ApiKey' : 'iphoneap',\n",
738 |     "           'X-ApiSecret': 'fe5183cc3dea12bd0ce299cf110a75a2'}\n",
739 |     "response = requests.get(url, headers = headers)"
740 |    ]
741 |   },
742 |   {
743 |    "cell_type": "code",
744 |    "execution_count": 29,
745 |    "metadata": {
746 |     "collapsed": false
747 |    },
748 |    "outputs": [
749 |     {
750 |      "data": {
751 |       "text/plain": [
752 |        "['7번가피자-상계점',\n",
753 |        " 'BHC-하계점',\n",
754 |        " 'BHC-하계점',\n",
755 |        " '후라이드참잘하는집-중계점',\n",
756 |        " 'BHC-중계점',\n",
757 |        " '치킨레인저-노원기지',\n",
758 |        " '앵그리불닭발',\n",
759 |        " '땡초불닭발동대문엽기떡볶이-월계점',\n",
760 |        " '빨간고추피자-공릉점',\n",
761 |        " '83닭발-노원점',\n",
762 |        " '엄청난쌀국수-본점',\n",
763 |        " \"강'스피자-since2000\",\n",
764 |        " '정성커리&돈카츠-노원점',\n",
765 |        " '바다회포차',\n",
766 |        " '분식명인김라덕선생-노원점',\n",
767 |        " '피자마루-공릉점',\n",
768 |        " '강정구의피자생각-상계점',\n",
769 |        " '호식이두마리치킨-월계1호점',\n",
770 |        " '왕족발칼국수',\n",
771 |        " '마왕족발-노원점']"
772 |       ]
773 |      },
774 |      "execution_count": 29,
775 |      "metadata": {},
776 |      "output_type": "execute_result"
777 |     }
778 |    ],
779 |    "source": [
780 |     "yogiyo = response.json()\n",
781 |     "[restaurant.get('name') for restaurant in yogiyo.get('restaurants')]"
782 |    ]
783 |   }
784 |  ],
785 |  "metadata": {
786 |   "kernelspec": {
787 |    "display_name": "Python 3",
788 |    "language": "python",
789 |    "name": "python3"
790 |   },
791 |   "language_info": {
792 |    "codemirror_mode": {
793 |     "name": "ipython",
794 |     "version": 3
795 |    },
796 |    "file_extension": ".py",
797 |    "mimetype": "text/x-python",
798 |    "name": "python",
799 |    "nbconvert_exporter": "python",
800 |    "pygments_lexer": "ipython3",
801 |    "version": "3.6.0"
802 |   }
803 |  },
804 |  "nbformat": 4,
805 |  "nbformat_minor": 2
806 | }
807 | 


--------------------------------------------------------------------------------
/data/sample_mp3.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seopbo/py-automate/e0de7f9fc2e38f7b6479be7c586bac842ddea239/data/sample_mp3.zip


--------------------------------------------------------------------------------
/data/univs_2014.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/seopbo/py-automate/e0de7f9fc2e38f7b6479be7c586bac842ddea239/data/univs_2014.xlsx


--------------------------------------------------------------------------------
/os_shutil.ipynb:
--------------------------------------------------------------------------------
   1 | {
   2 |  "cells": [
   3 |   {
   4 |    "cell_type": "markdown",
   5 |    "metadata": {},
   6 |    "source": [
   7 |     "# os and shutil modules for managing folders and files\n",
   8 |     "_This material is based on the lecture notes from instructor 안수찬's 파이썬을 활용한 업무자동화 Camp (Work Automation with Python Camp, fast campus)._  \n",
   9 |     "Author : 김보섭  \n",
  10 |     "\n",
  11 |     "## What are packages and modules?\n",
  12 |     "A module is a Python file; a package is a folder that groups Python files  \n",
  13 |     "* Python built-in modules - os (Operating System), shutil (high-level file operations)  \n",
  14 |     "* Python external modules (3rd Party) - must be installed!  \n",
  15 |     "    * requests(HTTP Request), BeautifulSoup(HTML)  \n",
  16 |     "\n",
  17 |     "## summary : os, shutil\n",
  18 |     "* Creating files/folders\n",
  19 |     "* Copying files/folders : shutil.copy2, shutil.copytree\n",
  20 |     "* Deleting files/folders : os.remove, os.removedirs, shutil.rmtree\n",
  21 |     "* Moving files/folders : os.rename, shutil.move\n",
  22 |     "* Compressing files/folders : shutil.make_archive, shutil.unpack_archive"
  23 |    ]
  24 |   },
  25 |   {
  26 |    "cell_type": "markdown",
  27 |    "metadata": {},
  28 |    "source": [
  29 |     "## os  \n",
  30 |     "https://docs.python.org/3/library/os.html\n",
  31 |     "### os.listdir()  \n",
  32 |     "Returns a list of the names of all files and folders in a directory"
  33 |    ]
  34 |   },
  35 |   {
  36 |    "cell_type": "code",
  37 |    "execution_count": 1,
  38 |    "metadata": {
  39 |     "collapsed": true
  40 |    },
  41 |    "outputs": [],
  42 |    "source": [
  43 |     "# from _______________ import ______________\n",
  44 |     "# import ______________"
  45 |    ]
  46 |   },
  47 |   {
  48 |    "cell_type": "code",
  49 |    "execution_count": 2,
  50 |    "metadata": {
  51 |     "collapsed": true
  52 |    },
  53 |    "outputs": [],
  54 |    "source": [
  55 |     "import os\n",
  56 |     "from os import listdir"
  57 |    ]
  58 |   },
  59 |   {
  60 |    "cell_type": "code",
  61 |    "execution_count": 3,
  62 |    "metadata": {
  63 |     "collapsed": false
  64 |    },
  65 |    "outputs": [
  66 |     {
  67 |      "data": {
  68 |       "text/plain": [
  69 |        "['.ipynb_checkpoints',\n",
  70 |        " 'animals.csv',\n",
  71 |        " 'animals2.csv',\n",
  72 |        " 'fruits.csv',\n",
  73 |        " 'os_shutil.ipynb',\n",
  74 |        " 'Python_basic1.ipynb',\n",
  75 |        " 'Python_basic2.ipynb',\n",
  76 |        " 'Python_basic3.ipynb',\n",
  77 |        " 'src.txt',\n",
  78 |        " 'test.txt']"
  79 |       ]
  80 |      },
  81 |      "execution_count": 3,
  82 |      "metadata": {},
  83 |      "output_type": "execute_result"
  84 |     }
  85 |    ],
  86 |    "source": [
  87 |     "listdir()"
  88 |    ]
  89 |   },
  90 |   {
  91 |    "cell_type": "code",
  92 |    "execution_count": 4,
  93 |    "metadata": {
  94 |     "collapsed": false
  95 |    },
  96 |    "outputs": [
  97 |     {
  98 |      "data": {
  99 |       "text/plain": [
 100 |        "['.ipynb_checkpoints',\n",
 101 |        " 'animals.csv',\n",
 102 |        " 'animals2.csv',\n",
 103 |        " 'fruits.csv',\n",
 104 |        " 'os_shutil.ipynb',\n",
 105 |        " 'Python_basic1.ipynb',\n",
 106 |        " 'Python_basic2.ipynb',\n",
 107 |        " 'Python_basic3.ipynb',\n",
 108 |        " 'src.txt',\n",
 109 |        " 'test.txt']"
 110 |       ]
 111 |      },
 112 |      "execution_count": 4,
 113 |      "metadata": {},
 114 |      "output_type": "execute_result"
 115 |     }
 116 |    ],
 117 |    "source": [
 118 |     "os.listdir()"
 119 |    ]
 120 |   },
 121 |   {
 122 |    "cell_type": "code",
 123 |    "execution_count": 5,
 124 |    "metadata": {
 125 |     "collapsed": false
 126 |    },
 127 |    "outputs": [
 128 |     {
 129 |      "data": {
 130 |       "text/plain": [
 131 |        "['.ipynb_checkpoints',\n",
 132 |        " 'DLEL',\n",
 133 |        " 'DLFS',\n",
 134 |        " 'ML-Python',\n",
 135 |        " 'py-automate',\n",
 136 |        " 'Python-Tutorial',\n",
 137 |        " 'R-Tutorial',\n",
 138 |        " 'sample-code-from-cs231n',\n",
 139 |        " 'tmp']"
 140 |       ]
 141 |      },
 142 |      "execution_count": 5,
 143 |      "metadata": {},
 144 |      "output_type": "execute_result"
 145 |     }
 146 |    ],
 147 |    "source": [
 148 |     "# . => Current directory\n",
 149 |     "# .. => Parent directory\n",
 150 |     "# ../ => parent folder\n",
 151 |     "# ../../ => parent of the parent folder\n",
 152 |     "os.listdir('../')"
 153 |    ]
 154 |   },
 155 |   {
 156 |    "cell_type": "markdown",
 157 |    "metadata": {},
 158 |    "source": [
 159 |     "### os.path.join()  \n",
 160 |     "Joins path components into a single path using the separator of the current OS"
 161 |    ]
 162 |   },
 163 |   {
 164 |    "cell_type": "code",
 165 |    "execution_count": 6,
 166 |    "metadata": {
 167 |     "collapsed": false
 168 |    },
 169 |    "outputs": [
 170 |     {
 171 |      "data": {
 172 |       "text/plain": [
 173 |        "'some\\\\path\\\\to\\\\excel.xls'"
 174 |       ]
 175 |      },
 176 |      "execution_count": 6,
 177 |      "metadata": {},
 178 |      "output_type": "execute_result"
 179 |     }
 180 |    ],
 181 |    "source": [
 182 |     "os.path.join('some', 'path', 'to', 'excel.xls')"
 183 |    ]
 184 |   },
 185 |   {
 186 |    "cell_type": "code",
 187 |    "execution_count": 7,
 188 |    "metadata": {
 189 |     "collapsed": false
 190 |    },
 191 |    "outputs": [
 192 |     {
 193 |      "data": {
 194 |       "text/plain": [
 195 |        "['.ipynb_checkpoints',\n",
 196 |        " 'animals.csv',\n",
 197 |        " 'animals2.csv',\n",
 198 |        " 'fruits.csv',\n",
 199 |        " 'os_shutil.ipynb',\n",
 200 |        " 'Python_basic1.ipynb',\n",
 201 |        " 'Python_basic2.ipynb',\n",
 202 |        " 'Python_basic3.ipynb',\n",
 203 |        " 'src.txt',\n",
 204 |        " 'test.txt']"
 205 |       ]
 206 |      },
 207 |      "execution_count": 7,
 208 |      "metadata": {},
 209 |      "output_type": "execute_result"
 210 |     }
 211 |    ],
 212 |    "source": [
 213 |     "os.listdir(os.path.join('./'))"
 214 |    ]
 215 |   },
 216 |   {
 217 |    "cell_type": "code",
 218 |    "execution_count": 8,
 219 |    "metadata": {
 220 |     "collapsed": false
 221 |    },
 222 |    "outputs": [
 223 |     {
 224 |      "data": {
 225 |       "text/plain": [
 226 |        "['src.txt', 'test.txt']"
 227 |       ]
 228 |      },
 229 |      "execution_count": 8,
 230 |      "metadata": {},
 231 |      "output_type": "execute_result"
 232 |     }
 233 |    ],
 234 |    "source": [
 235 |     "# Build a list of only the .txt files\n",
 236 |     "# To follow along, first create any .txt file in this directory.\n",
 237 |     "# Uses a list comprehension\n",
 238 |     "[i for i in os.listdir(os.path.join('./')) if i.endswith('.txt')]"
 239 |    ]
 240 |   },
 241 |   {
 242 |    "cell_type": "code",
 243 |    "execution_count": 9,
 244 |    "metadata": {
 245 |     "collapsed": false
 246 |    },
 247 |    "outputs": [
 248 |     {
 249 |      "data": {
 250 |       "text/plain": [
 251 |        "['src.txt', 'test.txt']"
 252 |       ]
 253 |      },
 254 |      "execution_count": 9,
 255 |      "metadata": {},
 256 |      "output_type": "execute_result"
 257 |     }
 258 |    ],
 259 |    "source": [
 260 |     "# Lambda operator (filter)\n",
 261 |     "list(filter(lambda x : x.endswith('.txt'),\n",
 262 |     "            os.listdir(os.path.join('./'))))"
 263 |    ]
 264 |   },
 265 |   {
 266 |    "cell_type": "markdown",
 267 |    "metadata": {},
 268 |    "source": [
 269 |     "### os.makedirs()\n",
 270 |     "Creates a folder, including any missing intermediate folders"
 271 |    ]
 272 |   },
 273 |   {
 274 |    "cell_type": "code",
 275 |    "execution_count": 10,
 276 |    "metadata": {
 277 |     "collapsed": false
 278 |    },
 279 |    "outputs": [
 280 |     {
 281 |      "data": {
 282 |       "text/plain": [
 283 |        "['.ipynb_checkpoints',\n",
 284 |        " 'animals.csv',\n",
 285 |        " 'animals2.csv',\n",
 286 |        " 'fruits.csv',\n",
 287 |        " 'os_shutil.ipynb',\n",
 288 |        " 'Python_basic1.ipynb',\n",
 289 |        " 'Python_basic2.ipynb',\n",
 290 |        " 'Python_basic3.ipynb',\n",
 291 |        " 'src.txt',\n",
 292 |        " 'test.txt']"
 293 |       ]
 294 |      },
 295 |      "execution_count": 10,
 296 |      "metadata": {},
 297 |      "output_type": "execute_result"
 298 |     }
 299 |    ],
 300 |    "source": [
 301 |     "os.listdir()"
 302 |    ]
 303 |   },
 304 |   {
 305 |    "cell_type": "code",
 306 |    "execution_count": 11,
 307 |    "metadata": {
 308 |     "collapsed": false
 309 |    },
 310 |    "outputs": [],
 311 |    "source": [
 312 |     "os.makedirs('./automate') # create the folder"
 313 |    ]
 314 |   },
 315 |   {
 316 |    "cell_type": "code",
 317 |    "execution_count": 12,
 318 |    "metadata": {
 319 |     "collapsed": false
 320 |    },
 321 |    "outputs": [
 322 |     {
 323 |      "data": {
 324 |       "text/plain": [
 325 |        "['.ipynb_checkpoints',\n",
 326 |        " 'animals.csv',\n",
 327 |        " 'animals2.csv',\n",
 328 |        " 'automate',\n",
 329 |        " 'fruits.csv',\n",
 330 |        " 'os_shutil.ipynb',\n",
 331 |        " 'Python_basic1.ipynb',\n",
 332 |        " 'Python_basic2.ipynb',\n",
 333 |        " 'Python_basic3.ipynb',\n",
 334 |        " 'src.txt',\n",
 335 |        " 'test.txt']"
 336 |       ]
 337 |      },
 338 |      "execution_count": 12,
 339 |      "metadata": {},
 340 |      "output_type": "execute_result"
 341 |     }
 342 |    ],
 343 |    "source": [
 344 |     "os.listdir()"
 345 |    ]
 346 |   },
 347 |   {
 348 |    "cell_type": "markdown",
 349 |    "metadata": {},
 350 |    "source": [
 351 |     "### example : copy file"
 352 |    ]
 353 |   },
 354 |   {
 355 |    "cell_type": "code",
 356 |    "execution_count": 13,
 357 |    "metadata": {
 358 |     "collapsed": true
 359 |    },
 360 |    "outputs": [],
 361 |    "source": [
 362 |     "# Copying a file\n",
 363 |     "# with open('', '') as fp\n",
 364 |     "# Copy -> 1. read the existing file => data => 2. write to a new file\n",
 365 |     "# Cut (move) -> 1. read the existing file => data => 2. write to a new file => 3. delete the existing file"
 366 |    ]
 367 |   },
 368 |   {
 369 |    "cell_type": "code",
 370 |    "execution_count": 14,
 371 |    "metadata": {
 372 |     "collapsed": true
 373 |    },
 374 |    "outputs": [],
 375 |    "source": [
 376 |     "def copy(src_filename, dest_filename):\n",
 377 |     "    # 1. Read the src file\n",
 378 |     "    #       fp = open('_____', '') ... fp.close()\n",
 379 |     "    #       with open('_______', '') as fp:\n",
 380 |     "\n",
 381 |     "    with open(src_filename, 'r') as src_fp:\n",
 382 |     "        data = src_fp.read()\n",
 383 |     "    # 2. Write the data to the dest file.\n",
 384 |     "    with open(dest_filename, 'w') as dest_fp:\n",
 385 |     "        dest_fp.write(data)"
 386 |    ]
 387 |   },
 388 |   {
 389 |    "cell_type": "code",
 390 |    "execution_count": 15,
 391 |    "metadata": {
 392 |     "collapsed": false
 393 |    },
 394 |    "outputs": [],
 395 |    "source": [
 396 |     "copy(src_filename = './src.txt',\n",
 397 |     "     dest_filename = './automate/dest.txt')"
 398 |    ]
 399 |   },
 400 |   {
 401 |    "cell_type": "code",
 402 |    "execution_count": 16,
 403 |    "metadata": {
 404 |     "collapsed": false
 405 |    },
 406 |    "outputs": [
 407 |     {
 408 |      "data": {
 409 |       "text/plain": [
 410 |        "['dest.txt']"
 411 |       ]
 412 |      },
 413 |      "execution_count": 16,
 414 |      "metadata": {},
 415 |      "output_type": "execute_result"
 416 |     }
 417 |    ],
 418 |    "source": [
 419 |     "os.listdir('./automate')"
 420 |    ]
 421 |   },
 422 |   {
 423 |    "cell_type": "code",
 424 |    "execution_count": 17,
 425 |    "metadata": {
 426 |     "collapsed": false
 427 |    },
 428 |    "outputs": [],
 429 |    "source": [
 430 |     "copy(src_filename = os.path.join('src.txt'),\n",
 431 |     "     dest_filename = os.path.join('automate', 'dest.txt')\n",
 432 |     ")"
 433 |    ]
 434 |   },
 435 |   {
 436 |    "cell_type": "code",
 437 |    "execution_count": 18,
 438 |    "metadata": {
 439 |     "collapsed": false
 440 |    },
 441 |    "outputs": [
 442 |     {
 443 |      "data": {
 444 |       "text/plain": [
 445 |        "['dest.txt']"
 446 |       ]
 447 |      },
 448 |      "execution_count": 18,
 449 |      "metadata": {},
 450 |      "output_type": "execute_result"
 451 |     }
 452 |    ],
 453 |    "source": [
 454 |     "os.listdir('./automate/')"
 455 |    ]
 456 |   },
 457 |   {
 458 |    "cell_type": "markdown",
 459 |    "metadata": {},
 460 |    "source": [
 461 |     "## shutil\n",
 462 |     "For more complex file management, shutil builds on the os module with higher-level operations  \n",
 463 |     "https://docs.python.org/3/library/shutil.html"
 464 |    ]
 465 |   },
 466 |   {
 467 |    "cell_type": "markdown",
 468 |    "metadata": {},
 469 |    "source": [
 470 |     "### shutil.copy2(src, dst)\n",
 471 |     "Copies a file, also attempting to preserve its metadata"
 472 |    ]
 473 |   },
 474 |   {
 475 |    "cell_type": "code",
 476 |    "execution_count": 19,
 477 |    "metadata": {
 478 |     "collapsed": false
 479 |    },
 480 |    "outputs": [
 481 |     {
 482 |      "data": {
 483 |       "text/plain": [
 484 |        "'automate\\\\shutil_dest.txt'"
 485 |       ]
 486 |      },
 487 |      "execution_count": 19,
 488 |      "metadata": {},
 489 |      "output_type": "execute_result"
 490 |     }
 491 |    ],
 492 |    "source": [
 493 |     "import shutil\n",
 494 |     "shutil.copy2(\n",
 495 |     "    os.path.join('src.txt'),\n",
 496 |     "    os.path.join('automate', 'shutil_dest.txt')\n",
 497 |     ")"
 498 |    ]
 499 |   },
 500 |   {
 501 |    "cell_type": "code",
 502 |    "execution_count": 20,
 503 |    "metadata": {
 504 |     "collapsed": false
 505 |    },
 506 |    "outputs": [
 507 |     {
 508 |      "data": {
 509 |       "text/plain": [
 510 |        "['dest.txt', 'shutil_dest.txt']"
 511 |       ]
 512 |      },
 513 |      "execution_count": 20,
 514 |      "metadata": {},
 515 |      "output_type": "execute_result"
 516 |     }
 517 |    ],
 518 |    "source": [
 519 |     "os.listdir('./automate/')"
 520 |    ]
 521 |   },
 522 |   {
 523 |    "cell_type": "markdown",
 524 |    "metadata": {},
 525 |    "source": [
 526 |     "### shutil.copytree(src, dst)\n",
 527 |     "Copies a folder recursively, together with its contents"
 528 |    ]
 529 |   },
 530 |   {
 531 |    "cell_type": "code",
 532 |    "execution_count": 21,
 533 |    "metadata": {
 534 |     "collapsed": false
 535 |    },
 536 |    "outputs": [
 537 |     {
 538 |      "data": {
 539 |       "text/plain": [
 540 |        "'./new_automate/'"
 541 |       ]
 542 |      },
 543 |      "execution_count": 21,
 544 |      "metadata": {},
 545 |      "output_type": "execute_result"
 546 |     }
 547 |    ],
 548 |    "source": [
 549 |     "# Copy the folder named automate\n",
 550 |     "shutil.copytree(src = './automate/', dst = './new_automate/')"
 551 |    ]
 552 |   },
 553 |   {
 554 |    "cell_type": "code",
 555 |    "execution_count": 22,
 556 |    "metadata": {
 557 |     "collapsed": false
 558 |    },
 559 |    "outputs": [
 560 |     {
 561 |      "data": {
 562 |       "text/plain": [
 563 |        "['.ipynb_checkpoints',\n",
 564 |        " 'animals.csv',\n",
 565 |        " 'animals2.csv',\n",
 566 |        " 'automate',\n",
 567 |        " 'fruits.csv',\n",
 568 |        " 'new_automate',\n",
 569 |        " 'os_shutil.ipynb',\n",
 570 |        " 'Python_basic1.ipynb',\n",
 571 |        " 'Python_basic2.ipynb',\n",
 572 |        " 'Python_basic3.ipynb',\n",
 573 |        " 'src.txt',\n",
 574 |        " 'test.txt']"
 575 |       ]
 576 |      },
 577 |      "execution_count": 22,
 578 |      "metadata": {},
 579 |      "output_type": "execute_result"
 580 |     }
 581 |    ],
 582 |    "source": [
 583 |     "os.listdir()"
 584 |    ]
 585 |   },
 586 |   {
 587 |    "cell_type": "code",
 588 |    "execution_count": 23,
 589 |    "metadata": {
 590 |     "collapsed": false
 591 |    },
 592 |    "outputs": [
 593 |     {
 594 |      "data": {
 595 |       "text/plain": [
 596 |        "['dest.txt', 'shutil_dest.txt']"
 597 |       ]
 598 |      },
 599 |      "execution_count": 23,
 600 |      "metadata": {},
 601 |      "output_type": "execute_result"
 602 |     }
 603 |    ],
 604 |    "source": [
 605 |     "os.listdir('./new_automate/')"
 606 |    ]
 607 |   },
 608 |   {
 609 |    "cell_type": "code",
 610 |    "execution_count": 24,
 611 |    "metadata": {
 612 |     "collapsed": false
 613 |    },
 614 |    "outputs": [
 615 |     {
 616 |      "data": {
 617 |       "text/plain": [
 618 |        "['dest.txt', 'shutil_dest.txt']"
 619 |       ]
 620 |      },
 621 |      "execution_count": 24,
 622 |      "metadata": {},
 623 |      "output_type": "execute_result"
 624 |     }
 625 |    ],
 626 |    "source": [
 627 |     "os.listdir('./automate/')"
 628 |    ]
 629 |   },
 630 |   {
 631 |    "cell_type": "markdown",
 632 |    "metadata": {},
 633 |    "source": [
 634 |     "### Deleting folders and files\n",
 635 |     "os.remove, os.removedirs, shutil.rmtree"
 636 |    ]
 637 |   },
 638 |   {
 639 |    "cell_type": "code",
 640 |    "execution_count": 25,
 641 |    "metadata": {
 642 |     "collapsed": false
 643 |    },
 644 |    "outputs": [
 645 |     {
 646 |      "data": {
 647 |       "text/plain": [
 648 |        "['dest.txt', 'shutil_dest.txt']"
 649 |       ]
 650 |      },
 651 |      "execution_count": 25,
 652 |      "metadata": {},
 653 |      "output_type": "execute_result"
 654 |     }
 655 |    ],
 656 |    "source": [
 657 |     "os.listdir('./automate/')"
 658 |    ]
 659 |   },
 660 |   {
 661 |    "cell_type": "code",
 662 |    "execution_count": 26,
 663 |    "metadata": {
 664 |     "collapsed": false
 665 |    },
 666 |    "outputs": [],
 667 |    "source": [
 668 |     "os.remove('./automate/dest.txt')"
 669 |    ]
 670 |   },
 671 |   {
 672 |    "cell_type": "code",
 673 |    "execution_count": 27,
 674 |    "metadata": {
 675 |     "collapsed": false
 676 |    },
 677 |    "outputs": [
 678 |     {
 679 |      "data": {
 680 |       "text/plain": [
 681 |        "['shutil_dest.txt']"
 682 |       ]
 683 |      },
 684 |      "execution_count": 27,
 685 |      "metadata": {},
 686 |      "output_type": "execute_result"
 687 |     }
 688 |    ],
 689 |    "source": [
 690 |     "os.listdir('./automate/')"
 691 |    ]
 692 |   },
 693 |   {
 694 |    "cell_type": "code",
 695 |    "execution_count": 28,
 696 |    "metadata": {
 697 |     "collapsed": false
 698 |    },
 699 |    "outputs": [
 700 |     {
 701 |      "ename": "OSError",
 702 |      "evalue": "[WinError 145] The directory is not empty: './automate/'",
 703 |      "output_type": "error",
 704 |      "traceback": [
 705 |       "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
 706 |       "\u001b[0;31mOSError\u001b[0m                                   Traceback (most recent call last)",
 707 |       "\u001b[0;32m<ipython-input-28-1961dc150a83>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mos\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mremovedirs\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'./automate/'\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# the directory must be empty to be removed\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
 708 |       "\u001b[0;32mC:\\ProgramData\\Anaconda3\\lib\\os.py\u001b[0m in \u001b[0;36mremovedirs\u001b[0;34m(name)\u001b[0m\n\u001b[1;32m    236\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m    237\u001b[0m     \"\"\"\n\u001b[0;32m--> 238\u001b[0;31m     \u001b[0mrmdir\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mname\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    239\u001b[0m     \u001b[0mhead\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtail\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mpath\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msplit\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mname\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m    240\u001b[0m     \u001b[1;32mif\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[0mtail\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
 709 |       "\u001b[0;31mOSError\u001b[0m: [WinError 145] The directory is not empty: './automate/'"
 710 |      ]
 711 |     }
 712 |    ],
 713 |    "source": [
 714 |     "os.removedirs('./automate/') # the directory must be empty to be removed"
 715 |    ]
 716 |   },
 717 |   {
 718 |    "cell_type": "code",
 719 |    "execution_count": 29,
 720 |    "metadata": {
 721 |     "collapsed": true
 722 |    },
 723 |    "outputs": [],
 724 |    "source": [
 725 |     "# Can delete a directory even if it is not empty\n",
 726 |     "shutil.rmtree('./automate/')"
 727 |    ]
 728 |   },
 729 |   {
 730 |    "cell_type": "code",
 731 |    "execution_count": 30,
 732 |    "metadata": {
 733 |     "collapsed": false
 734 |    },
 735 |    "outputs": [
 736 |     {
 737 |      "data": {
 738 |       "text/plain": [
 739 |        "['dest.txt', 'shutil_dest.txt']"
 740 |       ]
 741 |      },
 742 |      "execution_count": 30,
 743 |      "metadata": {},
 744 |      "output_type": "execute_result"
 745 |     }
 746 |    ],
 747 |    "source": [
 748 |     "os.listdir('new_automate/')"
 749 |    ]
 750 |   },
 751 |   {
 752 |    "cell_type": "code",
 753 |    "execution_count": 31,
 754 |    "metadata": {
 755 |     "collapsed": false
 756 |    },
 757 |    "outputs": [
 758 |     {
 759 |      "data": {
 760 |       "text/plain": [
 761 |        "['.ipynb_checkpoints',\n",
 762 |        " 'animals.csv',\n",
 763 |        " 'animals2.csv',\n",
 764 |        " 'fruits.csv',\n",
 765 |        " 'new_automate',\n",
 766 |        " 'os_shutil.ipynb',\n",
 767 |        " 'Python_basic1.ipynb',\n",
 768 |        " 'Python_basic2.ipynb',\n",
 769 |        " 'Python_basic3.ipynb',\n",
 770 |        " 'src.txt',\n",
 771 |        " 'test.txt']"
 772 |       ]
 773 |      },
 774 |      "execution_count": 31,
 775 |      "metadata": {},
 776 |      "output_type": "execute_result"
 777 |     }
 778 |    ],
 779 |    "source": [
 780 |     "os.listdir()"
 781 |    ]
 782 |   },
 783 |   {
 784 |    "cell_type": "markdown",
 785 |    "metadata": {},
 786 |    "source": [
 787 |     "### Moving files/folders\n",
 788 |     "os.rename, shutil.move"
 789 |    ]
 790 |   },
 791 |   {
 792 |    "cell_type": "code",
 793 |    "execution_count": 32,
 794 |    "metadata": {
 795 |     "collapsed": false
 796 |    },
 797 |    "outputs": [],
 798 |    "source": [
 799 |     "os.rename(src = './new_automate/dest.txt', dst = './new_automate/copy.txt')"
 800 |    ]
 801 |   },
 802 |   {
 803 |    "cell_type": "code",
 804 |    "execution_count": 33,
 805 |    "metadata": {
 806 |     "collapsed": false
 807 |    },
 808 |    "outputs": [
 809 |     {
 810 |      "data": {
 811 |       "text/plain": [
 812 |        "['copy.txt', 'shutil_dest.txt']"
 813 |       ]
 814 |      },
 815 |      "execution_count": 33,
 816 |      "metadata": {},
 817 |      "output_type": "execute_result"
 818 |     }
 819 |    ],
 820 |    "source": [
 821 |     "os.listdir('new_automate/')"
 822 |    ]
 823 |   },
 824 |   {
 825 |    "cell_type": "code",
 826 |    "execution_count": 34,
 827 |    "metadata": {
 828 |     "collapsed": false
 829 |    },
 830 |    "outputs": [
 831 |     {
 832 |      "data": {
 833 |       "text/plain": [
 834 |        "'./new_automate/move.txt'"
 835 |       ]
 836 |      },
 837 |      "execution_count": 34,
 838 |      "metadata": {},
 839 |      "output_type": "execute_result"
 840 |     }
 841 |    ],
 842 |    "source": [
 843 |     "shutil.move(src = './new_automate/copy.txt', dst = './new_automate/move.txt')"
 844 |    ]
 845 |   },
 846 |   {
 847 |    "cell_type": "code",
 848 |    "execution_count": 35,
 849 |    "metadata": {
 850 |     "collapsed": false
 851 |    },
 852 |    "outputs": [
 853 |     {
 854 |      "data": {
 855 |       "text/plain": [
 856 |        "['move.txt', 'shutil_dest.txt']"
 857 |       ]
 858 |      },
 859 |      "execution_count": 35,
 860 |      "metadata": {},
 861 |      "output_type": "execute_result"
 862 |     }
 863 |    ],
 864 |    "source": [
 865 |     "os.listdir('new_automate/')"
 866 |    ]
 867 |   },
 868 |   {
 869 |    "cell_type": "markdown",
 870 |    "metadata": {},
 871 |    "source": [
 872 |     "### Example: organizing photo files\n",
 873 |     "Organize photo files by year and month"
 874 |    ]
 875 |   },
 876 |   {
 877 |    "cell_type": "code",
 878 |    "execution_count": 36,
 879 |    "metadata": {
 880 |     "collapsed": false
 881 |    },
 882 |    "outputs": [],
 883 |    "source": [
 884 |     "os.mkdir('./hello/')"
 885 |    ]
 886 |   },
 887 |   {
 888 |    "cell_type": "code",
 889 |    "execution_count": 37,
 890 |    "metadata": {
 891 |     "collapsed": true
 892 |    },
 893 |    "outputs": [],
 894 |    "source": [
 895 |     "for year in range(2013, 2016):\n",
 896 |     "    for month in range(1, 12 + 1):\n",
 897 |     "        for day in range(1, 30 + 1):\n",
 898 |     "            filename = 'Screenshot-{year}-{month}-{day}.jpg'.format(\n",
 899 |     "                year = year,\n",
 900 |     "                month = month,\n",
 901 |     "                day = day)\n",
 902 |     "            # Create an empty placeholder file with that name\n",
 903 |     "            with open('./hello/' + filename, 'w'):\n",
 904 |     "                pass"
 905 |    ]
 906 |   },
 907 |   {
 908 |    "cell_type": "code",
 909 |    "execution_count": 38,
 910 |    "metadata": {
 911 |     "collapsed": false
 912 |    },
 913 |    "outputs": [],
 914 |    "source": [
 915 |     "# Split the photos by year/month\n",
 916 |     "# 2009/08/________________________\n",
 917 |     "#         ________________________\n",
 918 |     "# 2009/09/________________________\n",
 919 |     "#         ________________________\n",
 920 |     "if 'Photos' in os.listdir():\n",
 921 |     "    shutil.rmtree('./Photos/')\n",
 922 |     "os.mkdir(path = './Photos/')"
 923 |    ]
 924 |   },
 925 |   {
 926 |    "cell_type": "code",
 927 |    "execution_count": 39,
 928 |    "metadata": {
 929 |     "collapsed": false
 930 |    },
 931 |    "outputs": [
 932 |     {
 933 |      "data": {
 934 |       "text/plain": [
 935 |        "['Screenshot-2013-1-1.jpg',\n",
 936 |        " 'Screenshot-2013-1-10.jpg',\n",
 937 |        " 'Screenshot-2013-1-11.jpg',\n",
 938 |        " 'Screenshot-2013-1-12.jpg']"
 939 |       ]
 940 |      },
 941 |      "execution_count": 39,
 942 |      "metadata": {},
 943 |      "output_type": "execute_result"
 944 |     }
 945 |    ],
 946 |    "source": [
 947 |     "filenames = os.listdir('./hello/')\n",
 948 |     "filenames[0:4]"
 949 |    ]
 950 |   },
 951 |   {
 952 |    "cell_type": "code",
 953 |    "execution_count": 40,
 954 |    "metadata": {
 955 |     "collapsed": false
 956 |    },
 957 |    "outputs": [
 958 |     {
 959 |      "data": {
 960 |       "text/plain": [
 961 |        "'Screenshot-2013-1-1.jpg'"
 962 |       ]
 963 |      },
 964 |      "execution_count": 40,
 965 |      "metadata": {},
 966 |      "output_type": "execute_result"
 967 |     }
 968 |    ],
 969 |    "source": [
 970 |     "filename = filenames[0]\n",
 971 |     "filename"
 972 |    ]
 973 |   },
 974 |   {
 975 |    "cell_type": "code",
 976 |    "execution_count": 41,
 977 |    "metadata": {
 978 |     "collapsed": false
 979 |    },
 980 |    "outputs": [
 981 |     {
 982 |      "data": {
 983 |       "text/plain": [
 984 |        "['Screenshot', '2013', '1', '1']"
 985 |       ]
 986 |      },
 987 |      "execution_count": 41,
 988 |      "metadata": {},
 989 |      "output_type": "execute_result"
 990 |     }
 991 |    ],
 992 |    "source": [
 993 |     "filename.split('.')[0].split('-')"
 994 |    ]
 995 |   },
 996 |   {
 997 |    "cell_type": "code",
 998 |    "execution_count": 42,
 999 |    "metadata": {
1000 |     "collapsed": false
1001 |    },
1002 |    "outputs": [],
1003 |    "source": [
1004 |     "for filename in filenames:\n",
1005 |     "    year = filename.split('.')[0].split('-')[1]\n",
1006 |     "    month = filename.split('.')[0].split('-')[2]\n",
1007 |     "    # Create the year folder if it does not exist yet.\n",
1008 |     "    if year not in os.listdir('./Photos/'):\n",
1009 |     "        os.mkdir('./Photos/{year}/'.format(year = year))\n",
1010 |     "    # Create the month folder if it does not exist yet.\n",
1011 |     "    if month not in os.listdir(os.path.join('Photos', year)):\n",
1012 |     "        os.mkdir('./Photos/{year}/{month}/'.format(year = year, month = month))\n",
1013 |     "        \n",
1014 |     "    \n",
1015 |     "    # Copy the file into its year/month folder with shutil.copy2(src, dst)\n",
1016 |     "    src_filename = os.path.join('.', 'hello', filename)\n",
1017 |     "    dest_filename = os.path.join('Photos', year, month, filename)\n",
1018 |     "    shutil.copy2(src = src_filename, dst = dest_filename)"
1019 |    ]
1020 |   },
1021 |   {
1022 |    "cell_type": "markdown",
1023 |    "metadata": {},
1024 |    "source": [
1025 |     "### Example: creating an archive"
1026 |    ]
1027 |   },
1028 |   {
1029 |    "cell_type": "code",
1030 |    "execution_count": 43,
1031 |    "metadata": {
1032 |     "collapsed": true
1033 |    },
1034 |    "outputs": [],
1035 |    "source": [
1036 |     "?shutil.make_archive"
1037 |    ]
1038 |   },
1039 |   {
1040 |    "cell_type": "code",
1041 |    "execution_count": 44,
1042 |    "metadata": {
1043 |     "collapsed": false
1044 |    },
1045 |    "outputs": [
1046 |     {
1047 |      "data": {
1048 |       "text/plain": [
1049 |        "'D:\\\\dev\\\\py-automate\\\\Photos.zip'"
1050 |       ]
1051 |      },
1052 |      "execution_count": 44,
1053 |      "metadata": {},
1054 |      "output_type": "execute_result"
1055 |     }
1056 |    ],
1057 |    "source": [
1058 |     "# shutil - archive\n",
1059 |     "shutil.make_archive(\n",
1060 |     "        'Photos', # archive base name (creates Photos.zip)\n",
1061 |     "        'zip', # archive format\n",
1062 |     "        './Photos/' # folder to archive\n",
1063 |     ")"
1064 |    ]
1065 |   },
1066 |   {
1067 |    "cell_type": "markdown",
1068 |    "metadata": {},
1069 |    "source": [
1070 |     "### Example: extracting an archive"
1071 |    ]
1072 |   },
1073 |   {
1074 |    "cell_type": "code",
1075 |    "execution_count": 45,
1076 |    "metadata": {
1077 |     "collapsed": false
1078 |    },
1079 |    "outputs": [
1080 |     {
1081 |      "data": {
1082 |       "text/plain": [
1083 |        "['Photos.zip']"
1084 |       ]
1085 |      },
1086 |      "execution_count": 45,
1087 |      "metadata": {},
1088 |      "output_type": "execute_result"
1089 |     }
1090 |    ],
1091 |    "source": [
1092 |     "[filename for filename in os.listdir() if filename.endswith('.zip')]"
1093 |    ]
1094 |   },
1095 |   {
1096 |    "cell_type": "code",
1097 |    "execution_count": 46,
1098 |    "metadata": {
1099 |     "collapsed": true
1100 |    },
1101 |    "outputs": [],
1102 |    "source": [
1103 |     "shutil.unpack_archive('Photos.zip', './MyPhotos/')"
1104 |    ]
1105 |   },
1106 |   {
1107 |    "cell_type": "code",
1108 |    "execution_count": 47,
1109 |    "metadata": {
1110 |     "collapsed": false
1111 |    },
1112 |    "outputs": [
1113 |     {
1114 |      "data": {
1115 |       "text/plain": [
1116 |        "['.ipynb_checkpoints',\n",
1117 |        " 'animals.csv',\n",
1118 |        " 'animals2.csv',\n",
1119 |        " 'fruits.csv',\n",
1120 |        " 'hello',\n",
1121 |        " 'MyPhotos',\n",
1122 |        " 'new_automate',\n",
1123 |        " 'os_shutil.ipynb',\n",
1124 |        " 'Photos',\n",
1125 |        " 'Photos.zip',\n",
1126 |        " 'Python_basic1.ipynb',\n",
1127 |        " 'Python_basic2.ipynb',\n",
1128 |        " 'Python_basic3.ipynb',\n",
1129 |        " 'src.txt',\n",
1130 |        " 'test.txt']"
1131 |       ]
1132 |      },
1133 |      "execution_count": 47,
1134 |      "metadata": {},
1135 |      "output_type": "execute_result"
1136 |     }
1137 |    ],
1138 |    "source": [
1139 |     "os.listdir()"
1140 |    ]
1141 |   }
1142 |  ],
1143 |  "metadata": {
1144 |   "kernelspec": {
1145 |    "display_name": "Python 3",
1146 |    "language": "python",
1147 |    "name": "python3"
1148 |   },
1149 |   "language_info": {
1150 |    "codemirror_mode": {
1151 |     "name": "ipython",
1152 |     "version": 3
1153 |    },
1154 |    "file_extension": ".py",
1155 |    "mimetype": "text/x-python",
1156 |    "name": "python",
1157 |    "nbconvert_exporter": "python",
1158 |    "pygments_lexer": "ipython3",
1159 |    "version": "3.6.0"
1160 |   }
1161 |  },
1162 |  "nbformat": 4,
1163 |  "nbformat_minor": 2
1164 | }
1165 | 


--------------------------------------------------------------------------------