├── .github └── workflows │ ├── first_github_action.yml │ └── python-app.yml ├── LICENSE ├── README.md ├── notebooks ├── 카일-스쿨-10회차-Docker.ipynb ├── 카일-스쿨-1회차-동기부여-및-일-잘하기.ipynb.ipynb ├── 카일-스쿨-2회차-자세와-Pandas-시각화.ipynb ├── 카일-스쿨-3회차-쉘-커맨드.ipynb ├── 카일-스쿨-4회차-쉘-스크립트.ipynb ├── 카일-스쿨-5회차-클라우드와-데이터-엔지니어링.ipynb ├── 카일-스쿨-6회차-영양제와-airflow.ipynb ├── 카일-스쿨-7회차-회복탄력성과-git-github.ipynb ├── 카일-스쿨-8회차-Github-Action.ipynb └── 카일-스쿨-9회차-Test-Code.ipynb ├── script └── mac-setting.sh ├── week1 └── index.html ├── week10 ├── docker_images │ ├── 01-simple_notebook │ │ └── Dockerfile │ ├── 02-simple_notebook_with_entry │ │ └── Dockerfile │ └── 03-simple_notebook_docker_compose │ │ └── docker-compose.yml └── index.html ├── week2 └── index.html ├── week3 └── index.html ├── week4 ├── arg_test.sh ├── awkfile ├── case_test.sh ├── for_loop.sh ├── for_loop2.sh ├── for_loop3.sh ├── if_test.sh ├── index.html ├── ohin.sh ├── set_test.sh ├── single_double_quote.sh ├── trap_slack.sh └── while_loop.sh ├── week5 └── index.html ├── week6 ├── dags │ ├── 01-bash_operator.py │ ├── 02-python_operator.py │ ├── 03-python_operator_with_context.py │ ├── 04-python_operator_with_jinja.py │ └── 05-simple_etl.py ├── data │ ├── bike_data_20200209.csv │ ├── bike_data_20200210.csv │ ├── bike_data_20200211.csv │ ├── bike_data_20200212.csv │ └── bike_schema.json └── index.html ├── week7 └── index.html ├── week8 └── index.html └── week9 ├── calc_class.py ├── calc_func.py ├── index.html ├── iris.csv ├── simple_class.py ├── tests ├── __init__.py ├── test_calc_class.py ├── test_simple_class.py ├── test_utils.py └── test_your_module.py ├── utils.py └── your_module.py /.github/workflows/first_github_action.yml: -------------------------------------------------------------------------------- 1 | # This is a basic workflow to help you get started with Actions 2 | 3 | name: CI 4 | 5 | # Controls when the action will run. 
Triggers the workflow on push or pull request 6 | # events but only for the master branch 7 | on: 8 | push: 9 | branches: [ master ] 10 | pull_request: 11 | branches: [ master ] 12 | 13 | # A workflow run is made up of one or more jobs that can run sequentially or in parallel 14 | jobs: 15 | # This workflow contains a single job called "build" 16 | build: 17 | # The type of runner that the job will run on 18 | runs-on: ubuntu-latest 19 | 20 | # Steps represent a sequence of tasks that will be executed as part of the job 21 | steps: 22 | # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it 23 | - uses: actions/checkout@v2 24 | 25 | # Runs a single command using the runners shell 26 | - name: Run a one-line script 27 | run: echo Hello, world! 28 | 29 | # Runs a set of commands using the runners shell 30 | - name: Run a multi-line script 31 | run: | 32 | echo Add other actions to build, 33 | echo test, and deploy your project. 34 | -------------------------------------------------------------------------------- /.github/workflows/python-app.yml: -------------------------------------------------------------------------------- 1 | # This workflow will install Python dependencies, run tests and lint with a single version of Python 2 | # For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions 3 | 4 | name: Python application 5 | 6 | on: 7 | push: 8 | branches: [ master ] 9 | pull_request: 10 | branches: [ master ] 11 | 12 | jobs: 13 | build: 14 | 15 | runs-on: ubuntu-latest 16 | 17 | steps: 18 | - uses: actions/checkout@v2 19 | - name: Set up Python 3.7 20 | uses: actions/setup-python@v2 21 | with: 22 | python-version: 3.7 23 | - name: Install dependencies 24 | run: | 25 | python -m pip install --upgrade pip 26 | pip install flake8 pytest 27 | if [ -f requirements.txt ]; then pip install -r requirements.txt; fi 28 | - name: Lint with flake8 29 | run: | 30 | # stop the build if 
there are Python syntax errors or undefined names 31 | flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics 32 | # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide 33 | flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics 34 | - name: Test echo 35 | run: | 36 | echo "hi" 37 | - name: Test python -h 38 | run: | 39 | python3 -h 40 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Sung Yun Byeon 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # kyle-school 2 | [![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2Fzzsza%2Fkyle-school)](https://hits.seeyoufarm.com) 3 | 4 | - 쏘카 데이터 그룹 사내 신입/인턴을 대상으로 한 카일 스쿨 5 | - [카일스쿨](http://bit.ly/kyleschool_github) 유튜브를 만들었습니다. 추후 영상도 업로드할 예정입니다 :) 6 | 7 | --- 8 | 9 | ### 생각 10 | - 데이터 그룹의 전반적인 실력 상승을 위해 제 시간을 할애해 알려드릴 예정 11 | - 단, 데이터 분석/머신러닝 방법론은 각자 공부하고(일하면서 하고 있고 요새 좋은 자료는 너무 많음. 그냥 하면 됨) 그 외에 알아두면 좋을 내용 위주로 구성 12 | 13 | --- 14 | 15 | ### 방식 16 | - 2주에 1번씩 1시간씩 진행 17 | - 궁금한 내용, 원하는 내용은 Issues에 올리기 18 | 19 | 20 | --- 21 | 22 | ### 폴더 구조 23 | - 요약하면 발표 자료가 보고싶다면 `https://zzsza.github.io/kyle-school/<주차>` 24 | - 주차는 week1, week2 등으로 표시될 예정 25 | - 발표 자료를 노트북으로 보고 싶다면 notebooks에 저장된 파일 참고 26 | 27 | ``` 28 | ├── LICENSE 29 | ├── README.md 30 | ├── notebooks : 발표 자료 원본 노트북 파일 31 | ├── script : 기타 스크립트 파일 32 | └── week1 : 1주차 자료 33 | └── index.html 34 | ``` 35 | 36 | 37 | --- 38 | 39 | ### 커리큘럼 40 | - (1) 1주차 - [[발표 자료]](https://zzsza.github.io/kyle-school/week1/) 41 | - 우리는 왜 일하는가? 42 | - 어떻게 해야 일을 더 잘할 수 있을까? 43 | - 번아웃 44 | - 맥북 스마트하게 사용하기 45 | - (2) 2주차 - [[발표 자료]](https://zzsza.github.io/kyle-school/week2/) 46 | - 올바른 자세 47 | - Pandas 퀵 리뷰 48 | - 데이터 시각화 49 | - Matplotlib, Seaborn, Cufflinks 50 | - Ipywidgets 51 | - Pydeck 52 | - (3) Shell Command - [[발표 자료]](https://zzsza.github.io/kyle-school/week3/) 53 | - 쉘 관련 용어 정의 및 데이터 직군에서 쉘 사용하는 경우 54 | - 기본 쉘 커맨드 55 | - 데이터 전처리시 사용할 쉘 커맨드 56 | - 종합 실습 : 카카오톡에서 대화 많이 한 사람을 1줄로 추출해보기! 
57 | - 서버에서 주로 사용하는 쉘 커맨드 58 | - (4) Shell Command & Shell Script - [[발표 자료]](https://zzsza.github.io/kyle-school/week4/) 59 | - 질문 잘 하는 방법 60 | - 쉘 커맨드 61 | - awk 62 | - sed 63 | - alias 64 | - xargs 65 | - nohup 66 | - screen 67 | - scp 68 | - pbcopy 69 | - /dev/null이란 70 | - 쉘 스크립트 71 | - 함수 72 | - 변수 73 | - 위치 매개 변수 74 | - for loop 75 | - while loop 76 | - 조건문(if elif, case) 77 | - set 78 | - ''와 ""는 다르다 79 | - trap 80 | - (5) 클라우드 & 데이터 엔지니어링 - [[발표 자료]](https://zzsza.github.io/kyle-school/week5/) 81 | - 연말정산 82 | - 클라우드란? 83 | - 데이터 엔지니어링 퀵 요약 84 | - (6) 영양제 & Airflow - [[발표 자료]](https://zzsza.github.io/kyle-school/week6/) 85 | - 영양제 86 | - Apache Airflow란? 87 | - Apache Airflow를 사용한 간단한 ETL 파이프라인 만들기 88 | - (7) Git, Github - [[발표 자료]](https://zzsza.github.io/kyle-school/week7/) 89 | - 회복 탄력성 90 | - Git 91 | - Github 92 | - Sourcetree 93 | - (8) Github Action - [[발표 자료]](http://zzsza.github.io/kyle-school/week8/) 94 | - Github Action 입문 95 | - YES24의 IT 신간 도서 TOP40을 매일 아침에 Github Issue에 업로드 96 | - [Github Action with Python](https://github.com/zzsza/github-action-with-python) 97 | - (9) Test Code - [[발표 자료]](http://zzsza.github.io/kyle-school/week9/) 98 | - Pytest 사용 방법 99 | - 오픈소스의 Test Code 100 | - 딥러닝 프로젝트의 Test Code 101 | - (10) Docker - [[발표 자료]](https://zzsza.github.io/kyle-school/week10/) 102 | - Docker 사용 방법 103 | - Docker Image 빌드하기 104 | - Docker Container 실행하기 105 | - Dockerfile 106 | 107 | --- 108 | 109 | ### 소재 110 | - MLOps 개론 111 | - MLOps 개론 112 | 113 | 114 | --- 115 | 116 | ### 제게 하는 이야기 117 | - 말을 천천히 하자 118 | - 매주 [링크](https://forms.gle/V21W8MHPq7bAsoQU6)를 통해서 피드백 받기 119 | 120 | -------------------------------------------------------------------------------- /notebooks/카일-스쿨-10회차-Docker.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "## 카일 스쿨 
10회차\n", 12 | "\n", 13 | "[![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=http%3A%2F%2Fzzsza.github.io%2Fkyle-school%2Fweek10)](https://hits.seeyoufarm.com)\n", 14 | "\n", 15 | "- #1. Docker 큰 개념\n", 16 | "- #2. 따라치며 배우는 도커\n", 17 | "- #3. Dockerfile 만들기\n", 18 | " - Dockerhub와 GCR 이해하기\n", 19 | "- #4. 도커 이미지를 사용해 인스턴스 띄우기\n", 20 | "- #5. 도커를 사용해 Superset / Metabase 띄우기" 21 | ] 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": { 26 | "slideshow": { 27 | "slide_type": "slide" 28 | } 29 | }, 30 | "source": [ 31 | "## Docker 큰 개념\n", 32 | "### 오늘 이것만은 꼭!\n", 33 | "- Docker란 무엇인지 이해한다\n", 34 | "- Docker Image Pull하기\n", 35 | "- Docker Container를 실행\n", 36 | " - 로컬에서 Jupyter Notebook을 띄워보기\n", 37 | "- 서버에 Jupyter Notebook 띄우기\n", 38 | "- Dockerfile 작성하기" 39 | ] 40 | }, 41 | { 42 | "cell_type": "markdown", 43 | "metadata": { 44 | "slideshow": { 45 | "slide_type": "subslide" 46 | } 47 | }, 48 | "source": [ 49 | "### Docker란 무엇인가\n", 50 | "- Docker는 환경을 격리해 이미지를 만들고, 실행해줌\n", 51 | " - 피시방에서 재부팅해도 다시 그대로 프로그램이 그대로!\n", 52 | " - 애플리케이션 + 환경 모두 같이 저장\n", 53 | " - 도커 파일을 빌드 => 도커 이미지 : 명령어를 모두 실행한 결과\n", 54 | " - 1년 사이에 특정 도구가 사라져있을 수 있다면? 도커 이미지는 특정 도구까지 품어서 만듬\n", 55 | " - A 서버에선 라이브러리 설치, B에도 설치?\n", 56 | " - GPU 환경설정 등을 쉽게 가능함\n", 57 | " - 내 로컬에 파이썬 안깔고 깔끔하게 유지 가능 " 58 | ] 59 | }, 60 | { 61 | "cell_type": "markdown", 62 | "metadata": { 63 | "slideshow": { 64 | "slide_type": "subslide" 65 | } 66 | }, 67 | "source": [ 68 | "- 도커가 등장하기 전 세상\n", 69 | "- 1) 배포\n", 70 | " - 서버를 운영한다\n", 71 | " - 배포한다!\n", 72 | " - 파일을 ftp 등으로 보내고 인스턴스를 껐다 킴\n", 73 | " - 서버가 50대라면..?" 74 | ] 75 | }, 76 | { 77 | "cell_type": "markdown", 78 | "metadata": { 79 | "slideshow": { 80 | "slide_type": "fragment" 81 | } 82 | }, 83 | "source": [ 84 | "- 2) 환경\n", 85 | " - 우리에겐 pyenv, virtualenv도 있지만.. 리눅스 환경 자체를 원한다면?\n", 86 | " - 인스턴스 실행하고 터미널 명령어 계속 치는 방식?" 
87 | ] 88 | }, 89 | { 90 | "cell_type": "markdown", 91 | "metadata": { 92 | "slideshow": { 93 | "slide_type": "fragment" 94 | } 95 | }, 96 | "source": [ 97 | "- 3) 이런 적 있지 않나요\n", 98 | " - GPU 엔비디아 설치...\n", 99 | " - 뭔가 안된다 => 스택오브플로우 => 음 다시 하래 => 음" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": { 105 | "slideshow": { 106 | "slide_type": "subslide" 107 | } 108 | }, 109 | "source": [ 110 | "- 도커 관련 용어 설명\n", 111 | " - 이미지 : 일종의 템플릿\n", 112 | " - 컨테이너 : 이미지를 가지고 실행\n", 113 | " - 도커 허브 : 템플릿 창고\n", 114 | " - 호스트 : 시스템의 핵심이 되는 PC" 115 | ] 116 | }, 117 | { 118 | "cell_type": "markdown", 119 | "metadata": { 120 | "slideshow": { 121 | "slide_type": "subslide" 122 | } 123 | }, 124 | "source": [ 125 | "- 도커 설치\n", 126 | " - Docker Desktop 설치\n", 127 | " - https://docs.docker.com/get-docker/" 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": { 133 | "slideshow": { 134 | "slide_type": "slide" 135 | } 136 | }, 137 | "source": [ 138 | "### 따라치며 배우는 도커\n", 139 | "- 도커 명령어\n", 140 | " - docker\n", 141 | " - docker image pull\n", 142 | " - docker image ls\n", 143 | " - docker run\n", 144 | " - docker exec\n", 145 | " - docker container\n", 146 | " - docker build" 147 | ] 148 | }, 149 | { 150 | "cell_type": "markdown", 151 | "metadata": { 152 | "slideshow": { 153 | "slide_type": "subslide" 154 | } 155 | }, 156 | "source": [ 157 | "- docker image pull\n", 158 | " - docker image를 땡겨오는 명령어\n", 159 | " \n", 160 | " ```\n", 161 | " docker image pull jupyter/minimal-notebook\n", 162 | " ```" 163 | ] 164 | }, 165 | { 166 | "cell_type": "markdown", 167 | "metadata": { 168 | "slideshow": { 169 | "slide_type": "subslide" 170 | } 171 | }, 172 | "source": [ 173 | "- docker image ls\n", 174 | " - 현재 있는 도커 이미지를 출력\n", 175 | " - `-a` 조건을 주면 전체 이미지 출력\n", 176 | " \n", 177 | " ```\n", 178 | " docker image ls\n", 179 | " ```" 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": { 185 | "slideshow": { 186 | "slide_type": 
"subslide" 187 | } 188 | }, 189 | "source": [ 190 | "- docker run\n", 191 | " - docker image 기반으로 컨테이너 실행\n", 192 | " - 실행할 때 포트 정보를 같이 인자로 넘겨야 함\n", 193 | " \n", 194 | " ```\n", 195 | " docker run -p 8888:8888 jupyter/minimal-notebook\n", 196 | " ```\n", 197 | " \n", 198 | " - -p host_port:container_port\n", 199 | " - `-d` 옵션을 주면 백그라운드에서 실행\n" 200 | ] 201 | }, 202 | { 203 | "cell_type": "markdown", 204 | "metadata": { 205 | "slideshow": { 206 | "slide_type": "subslide" 207 | } 208 | }, 209 | "source": [ 210 | "- docker container ls\n", 211 | " - 현재 실행중인 컨테이너 출력" 212 | ] 213 | }, 214 | { 215 | "cell_type": "markdown", 216 | "metadata": { 217 | "slideshow": { 218 | "slide_type": "subslide" 219 | } 220 | }, 221 | "source": [ 222 | "- docker exec\n", 223 | " - 컨테이너 안에 명령을 날리고 싶은 경우\n", 224 | " - 1) 컨테이너 안에 들어가서 명령을 날려도 되고\n", 225 | " - 2) exec을 써도 됨 : 이 방법\n", 226 | " - docker exec b8314275379e pip install tensorflow " 227 | ] 228 | }, 229 | { 230 | "cell_type": "markdown", 231 | "metadata": { 232 | "slideshow": { 233 | "slide_type": "subslide" 234 | } 235 | }, 236 | "source": [ 237 | "- docker 컨테이너 안으로 들어가기\n", 238 | " - docker exec\n", 239 | " - -it : interactive tty 접속한다는 뜻. 
일단 -it를 많이 쓴다고 알아두어도 좋아요\n", 240 | " \n", 241 | " ```\n", 242 | " docker exec -it container_id /bin/bash\n", 243 | " ```" 244 | ] 245 | }, 246 | { 247 | "cell_type": "markdown", 248 | "metadata": { 249 | "slideshow": { 250 | "slide_type": "subslide" 251 | } 252 | }, 253 | "source": [ 254 | "- docker run 취소\n", 255 | " - docker run 했던 터미널을 취소(command c)하고 다시 실행해봅시다\n", 256 | " - tensorflow를 import하면?\n", 257 | " - 안됩니다\n", 258 | " - docker run은 한번만 일회성으로 띄우고, 끄는 순간 꺼집니다" 259 | ] 260 | }, 261 | { 262 | "cell_type": "markdown", 263 | "metadata": { 264 | "slideshow": { 265 | "slide_type": "subslide" 266 | } 267 | }, 268 | "source": [ 269 | "- Volume mount\n", 270 | " - 호스트와 컨테이너끼리 파일 공유가 안됨 => 삭제됨\n", 271 | " - 볼륨 마운트로 진행\n", 272 | " - v option을 주면 가능함\n", 273 | " \n", 274 | " ```\n", 275 | " docker run -it -p 8888:8888 -v /some/host/folder/for/work:/home/jovyan/workspace jupyter/minimal-notebook\n", 276 | " ```" 277 | ] 278 | }, 279 | { 280 | "cell_type": "markdown", 281 | "metadata": { 282 | "slideshow": { 283 | "slide_type": "subslide" 284 | } 285 | }, 286 | "source": [ 287 | "- 자 여기서 실습\n", 288 | " - jupyter notebook(jupyter/minimal-notebook) 띄우기\n", 289 | " - 컨테이너로 직접 들어가서 라이브러리 설치하기\n", 290 | " - Docker Volume 싱크 확인하기(컨테이너 종료해서 다시 켜볼 때 공유되는 폴더가 있는지?)" 291 | ] 292 | }, 293 | { 294 | "cell_type": "markdown", 295 | "metadata": { 296 | "slideshow": { 297 | "slide_type": "slide" 298 | } 299 | }, 300 | "source": [ 301 | "### Dockerfile\n", 302 | "- Docker 명령어 모음\n", 303 | "- 보통 도커파일을 많이 사용함\n", 304 | "- Dockerfile은 이미지를 만드는데 필요한 모든 명령을 순서대로 포함하는 텍스트 파일\n", 305 | "- 각 명령을 읽어서 이미지를 빌드함\n", 306 | "- FROM, COPY, RUN, EXPOSE, ENV, CMD, ENTRYPOINT, WORKDIR, USER, VOLUME 등을 사용함. 잘 설명된 [링크](https://rampart81.github.io/post/dockerfile_instructions)" 307 | ] 308 | }, 309 | { 310 | "cell_type": "markdown", 311 | "metadata": { 312 | "slideshow": { 313 | "slide_type": "subslide" 314 | } 315 | }, 316 | "source": [ 317 | "- Dockerfile이란 파일을 사용하며, docker build -t dev/dev:v1 . 
이런 식으로 자주 씀\n", 318 | "\n", 319 | "```\n", 320 | "FROM jupyter/minimal-notebook\n", 321 | "RUN pip install tensorflow\n", 322 | "```" 323 | ] 324 | }, 325 | { 326 | "cell_type": "markdown", 327 | "metadata": { 328 | "slideshow": { 329 | "slide_type": "subslide" 330 | } 331 | }, 332 | "source": [ 333 | "- RUN과 CMD, ENTRYPOINT의 차이\n", 334 | " - RUN은 이미지 빌드 과정에서 진행. 컨테이너 이미지에 커밋됨\n", 335 | " - CMD는 빌드된 이미지를 시작할 때 컨테이너가 실행하는 명령. docker run할 때 인자값을 전달해 실행하면 CMD는 무시됨\n", 336 | " - ENTRYPOINT: docker run시 실행되는 명령, 단 한번만 사용. RUN 또는 start 때 사용됨\n", 337 | " " 338 | ] 339 | }, 340 | { 341 | "cell_type": "markdown", 342 | "metadata": { 343 | "slideshow": { 344 | "slide_type": "subslide" 345 | } 346 | }, 347 | "source": [ 348 | "- 아래와 같은 Dockerfile을 만들어봅시다\n", 349 | "\n", 350 | "```\n", 351 | "FROM jupyter/minimal-notebook\n", 352 | "RUN pip install tensorflow\n", 353 | "RUN jupyter notebook --generate-config --allow-root -y \\\n", 354 | " && echo \"c.NotebookApp.password = 'sha1:fee705da7ee3:39094efec15c2bc5f651b88fdd5536685b5fd229'\" >> /home/jovyan/.jupyter/jupyter_notebook_config.py\n", 355 | "\n", 356 | "EXPOSE 8888\n", 357 | "\n", 358 | "ENTRYPOINT jupyter notebook --allow-root --ip=0.0.0.0 --port=8888 --no-browser \n", 359 | "```" 360 | ] 361 | }, 362 | { 363 | "cell_type": "markdown", 364 | "metadata": { 365 | "slideshow": { 366 | "slide_type": "subslide" 367 | } 368 | }, 369 | "source": [ 370 | "- docker image 찾기 팁\n", 371 | " - [Dockerhub](https://hub.docker.com/) 가입\n", 372 | " - 원하는 검색어를 사용하면 여러 도커 이미지가 나옴\n", 373 | " - " 374 | ] 375 | }, 376 | { 377 | "cell_type": "markdown", 378 | "metadata": { 379 | "slideshow": { 380 | "slide_type": "subslide" 381 | } 382 | }, 383 | "source": [ 384 | "- Airflow 이미지 선택\n", 385 | " - " 386 | ] 387 | }, 388 | { 389 | "cell_type": "markdown", 390 | "metadata": { 391 | "slideshow": { 392 | "slide_type": "subslide" 393 | } 394 | }, 395 | "source": [ 396 | "- Tags 확인\n", 397 | " - " 398 | ] 399 | }, 400 | { 401 | "cell_type": 
"markdown", 402 | "metadata": { 403 | "slideshow": { 404 | "slide_type": "subslide" 405 | } 406 | }, 407 | "source": [ 408 | "- Dockerfile을 공개하는 이미지도 있음\n", 409 | " - " 410 | ] 411 | }, 412 | { 413 | "cell_type": "markdown", 414 | "metadata": { 415 | "slideshow": { 416 | "slide_type": "slide" 417 | } 418 | }, 419 | "source": [ 420 | "### GCR\n", 421 | "- Google Container Registry\n", 422 | "- Google Cloud Platform에서 만든 도커 허브라고 생각하면 편함\n", 423 | "- 여기도 Docker 이미지를 push하고 pull할 수 있음\n", 424 | "- sha1:~~~ 은 kyle을 암호화한 것\n", 425 | "\n", 426 | "```\n", 427 | "FROM jupyter/minimal-notebook\n", 428 | "RUN pip install tensorflow\n", 429 | "RUN jupyter notebook --generate-config --allow-root -y \\\n", 430 | " && echo \"c.NotebookApp.password = 'sha1:fee705da7ee3:39094efec15c2bc5f651b88fdd5536685b5fd229'\" >> /home/jovyan/.jupyter/jupyter_notebook_config.py\n", 431 | "\n", 432 | "EXPOSE 8888\n", 433 | "\n", 434 | "ENTRYPOINT jupyter notebook --allow-root --ip=0.0.0.0 --port=8888 --no-browser \n", 435 | "```" 436 | ] 437 | }, 438 | { 439 | "cell_type": "markdown", 440 | "metadata": { 441 | "slideshow": { 442 | "slide_type": "subslide" 443 | } 444 | }, 445 | "source": [ 446 | "- " 447 | ] 448 | }, 449 | { 450 | "cell_type": "markdown", 451 | "metadata": { 452 | "slideshow": { 453 | "slide_type": "subslide" 454 | } 455 | }, 456 | "source": [ 457 | "- Container Registry API 사용 설정\n", 458 | " - " 459 | ] 460 | }, 461 | { 462 | "cell_type": "markdown", 463 | "metadata": { 464 | "slideshow": { 465 | "slide_type": "subslide" 466 | } 467 | }, 468 | "source": [ 469 | "- 빌드\n", 470 | " - gcloud builds submit --tag gcr.io/bigquery-definitive/simple_notebook .\n", 471 | " - bigquery-definitive엔 여러분의 project_id를 넣어주세요\n", 472 | " - 빌드된 이미지 확인\n", 473 | " - " 474 | ] 475 | }, 476 | { 477 | "cell_type": "markdown", 478 | "metadata": { 479 | "slideshow": { 480 | "slide_type": "slide" 481 | } 482 | }, 483 | "source": [ 484 | "- Compute Engine에서 쉽게 사용하는 방법\n", 485 | " - gcr 링크 복사하기 : 
gcr.io/bigquery-definitive/simple_notebook\n", 486 | " - " 487 | ] 488 | }, 489 | { 490 | "cell_type": "markdown", 491 | "metadata": { 492 | "slideshow": { 493 | "slide_type": "subslide" 494 | } 495 | }, 496 | "source": [ 497 | "- VPC 네트워크 - 방화벽 이동\n", 498 | " - 8888 포트를 열어줘야 함\n", 499 | " - " 500 | ] 501 | }, 502 | { 503 | "cell_type": "markdown", 504 | "metadata": { 505 | "slideshow": { 506 | "slide_type": "subslide" 507 | } 508 | }, 509 | "source": [ 510 | "- Compute Engine으로 이동하기\n", 511 | " - VM 인스턴스 만들기 - 이 VM 인스턴스에 컨테이너 이미지를 배포합니다 클릭\n", 512 | " - " 513 | ] 514 | }, 515 | { 516 | "cell_type": "markdown", 517 | "metadata": { 518 | "slideshow": { 519 | "slide_type": "subslide" 520 | } 521 | }, 522 | "source": [ 523 | "- Volume Mount 설정\n", 524 | " - `home/jovyan/workspace`\n", 525 | " - " 526 | ] 527 | }, 528 | { 529 | "cell_type": "markdown", 530 | "metadata": { 531 | "slideshow": { 532 | "slide_type": "subslide" 533 | } 534 | }, 535 | "source": [ 536 | "- 네트워크 태그에 위에서 만든 jupyter 네트워크 방화벽 설정\n", 537 | "- 인스턴스 확인\n", 538 | " - \n", 539 | " " 540 | ] 541 | }, 542 | { 543 | "cell_type": "markdown", 544 | "metadata": { 545 | "slideshow": { 546 | "slide_type": "subslide" 547 | } 548 | }, 549 | "source": [ 550 | "- 인스턴스 ip를 사용해서 노트북 포트(8888)로 이동\n", 551 | " - \n", 552 | "- 만약 도커 이미지에 이상이 있다면 다시 빌드 => 인스턴스 종료 후 다시 시작하면 최신 도커 이미지 사용함" 553 | ] 554 | }, 555 | { 556 | "cell_type": "markdown", 557 | "metadata": { 558 | "slideshow": { 559 | "slide_type": "slide" 560 | } 561 | }, 562 | "source": [ 563 | "### Docker Image 사용시 불편한 점\n", 564 | "- 옵션이 다양함\n", 565 | " - 실행시 작성해야 하는 명령어 옵션이 많고, 귀찮음\n", 566 | " - 미리 정의해둘 수 없을까?\n", 567 | "- 컨테이너의 순서를 제어할 수 없을까?\n", 568 | " - A 컨테이너를 먼저 띄우고, B 컨테이너를 실행해야 하는 경우\n", 569 | " - 예를 들어 Database 컨테이너를 먼저 띄우고 어플리케이션 컨테이너를 띄워야하는 경우" 570 | ] 571 | }, 572 | { 573 | "cell_type": "markdown", 574 | "metadata": { 575 | "slideshow": { 576 | "slide_type": "subslide" 577 | } 578 | }, 579 | "source": [ 580 | "- Docker Compose\n", 581 | " - 위에 나오는 
이슈들을 해결하기 위해 나옴\n", 582 | " - 여러 컨테이너를 한번에 띄울 수 있음\n", 583 | " - 여러 컨테이너의 실행 순서, 의존도를 관리할 수 있음\n", 584 | " - `docker-compose.yml` 파일에 작성함" 585 | ] 586 | }, 587 | { 588 | "cell_type": "markdown", 589 | "metadata": { 590 | "slideshow": { 591 | "slide_type": "subslide" 592 | } 593 | }, 594 | "source": [ 595 | "- simple notebook docker-compose.yml 파일\n", 596 | "\n", 597 | "```\n", 598 | "version: '3' # 파일 규격 버전\n", 599 | "\n", 600 | "services: # 컨테이너들을 정의\n", 601 | " notebook: # notebook 서비스\n", 602 | " image: jupyter/minimal-notebook # notebook 서비스에서 사용할 도커 이미지\n", 603 | " container_name: notebook # 컨테이너 이름\n", 604 | " volumes: # --volume 옵션 사용해서 연결하는 부분\n", 605 | " - ./docker-volume:/home/jovyan/workspace\n", 606 | " ports: # ports 호스트:컨테이너\n", 607 | " - 8888:8888\n", 608 | " command:\n", 609 | " jupyter notebook --allow-root --ip=0.0.0.0 --no-browser\n", 610 | "```" 611 | ] 612 | }, 613 | { 614 | "cell_type": "markdown", 615 | "metadata": { 616 | "slideshow": { 617 | "slide_type": "subslide" 618 | } 619 | }, 620 | "source": [ 621 | "- 백그라운드에서 실행하기(docker run -d와 동일) : docker-compose up -d\n", 622 | "- 서비스 중단(컨테이너, 네트워크 삭제. 볼륨까지 지우려면 -v 옵션 추가) : docker-compose down\n", 623 | "- 실행 중 서비스 확인 : docker-compose ps\n", 624 | "- 로그 확인 : docker-compose logs <서비스명>\n", 625 | "- 참고 지식!\n", 626 | " - docker-compose.yml 파일을 수정했으면 => up을 하면 컨테이너 재생성 후 서비스 재시작함" 627 | ] 628 | }, 629 | { 630 | "cell_type": "markdown", 631 | "metadata": { 632 | "slideshow": { 633 | "slide_type": "subslide" 634 | } 635 | }, 636 | "source": [ 637 | "- [Airflow Dockerfile](https://github.com/puckel/docker-airflow) 띄우기\n", 638 | " \n", 639 | " ```\n", 640 | " git clone https://github.com/puckel/docker-airflow\n", 641 | " cd docker-airflow\n", 642 | " docker-compose -f docker-compose-CeleryExecutor.yml up\n", 643 | " ```\n", 644 | " \n", 645 | "- 혹시 Bind for 0.0.0.0:5555 failed: port is already allocated 에러가 발생하는 경우\n", 646 | " - docker container ls로 실행 중인 컨테이너 중 해당 포트를 사용하는 컨테이너를 확인하고 \n", 647 | " - docker rm -f <컨테이너 id>로 삭제" 648 | 
] 649 | }, 650 | { 651 | "cell_type": "markdown", 652 | "metadata": { 653 | "slideshow": { 654 | "slide_type": "subslide" 655 | } 656 | }, 657 | "source": [ 658 | "- PostgreSQL docker-compose.yml\n", 659 | " - docker-compose up -d\n", 660 | "\n", 661 | " ```\n", 662 | " version: '3' # 파일 규격 버전\n", 663 | "\n", 664 | " services: # 컨테이너들을 정의\n", 665 | " postgresql: # postgresql 서비스\n", 666 | " image: postgres # postgresql 서비스에서 사용할 도커 이미지\n", 667 | " container_name: postgresql # 컨테이너 이름\n", 668 | " volumes: # --volume 옵션 사용해서 연결하는 부분\n", 669 | " - ./postgresql/data:/var/lib/postgresql/data\n", 670 | " ports: # ports 호스트:컨테이너\n", 671 | " - 5432:5432\n", 672 | " environment: # 환경 변수\n", 673 | " POSTGRES_PASSWORD: \"password\"\n", 674 | " TZ: \"Asia/Seoul\"\n", 675 | " ```" 676 | ] 677 | }, 678 | { 679 | "cell_type": "markdown", 680 | "metadata": { 681 | "slideshow": { 682 | "slide_type": "subslide" 683 | } 684 | }, 685 | "source": [ 686 | "### Superset Docker\n", 687 | "- docker-superset\n", 688 | " - localhost:8088에서 실행\n", 689 | " - id, password : admin, admin\n", 690 | "\n", 691 | "```\n", 692 | "# 저장소를 clone한 뒤 docker-files 폴더에서 실행\n", 693 | "git clone https://github.com/abhioncbr/docker-superset\n", 694 | "\n", 695 | "\n", 696 | "cd docker-superset/docker-files\n", 697 | "docker-compose up -d\n", 698 | "```\n", 699 | "\n", 700 | "- 만약 Bind for 127.0.0.1:5432 failed: port is already allocated 에러가 난다면 5432 포트를 이미 사용 중인 프로세스가 있을 수 있음\n", 701 | " - 아마 앞에서 띄운 PostgreSQL 컨테이너일 수 있음\n", 702 | " - docker container ls로 확인 후 \n", 703 | " - docker rm -f <컨테이너 id>\n", 704 | " - 또는 아래 명령어 실행(lsof 출력에서 확인한 PID를 kill에 전달)\n", 705 | "\n", 706 | "```\n", 707 | "docker-compose down\n", 708 | "docker rm -fv $(docker ps -aq)\n", 709 | "sudo lsof -i -P -n | grep 5432\n", 710 | "sudo kill -9 <PID>\n", 711 | "```\n" 712 | ] 713 | }, 714 | { 715 | "cell_type": "markdown", 716 | "metadata": { 717 | "slideshow": { 718 | "slide_type": "subslide" 719 | } 720 | }, 721 | "source": [ 722 | "- proxy: listen tcp 0.0.0.0:6379: bind: address already in use 이 에러가 뜬다면(kill에는 포트 번호가 아니라 lsof 출력의 PID를 전달)\n", 723 | "\n", 724 | 
"```\n", 725 | "docker-compose down\n", 726 | "docker rm -fv $(docker ps -aq)\n", 727 | "sudo lsof -i -P -n | grep 6379\n", 728 | "sudo kill -9 <PID>\n", 729 | "```\n", 730 | "\n", 731 | "- " 732 | ] 733 | }, 734 | { 735 | "cell_type": "markdown", 736 | "metadata": { 737 | "slideshow": { 738 | "slide_type": "subslide" 739 | } 740 | }, 741 | "source": [ 742 | "### Metabase Docker\n", 743 | "```\n", 744 | "docker run -d -p 3000:3000 --name metabase metabase/metabase\n", 745 | "```\n", 746 | "\n", 747 | "- " 748 | ] 749 | }, 750 | { 751 | "cell_type": "markdown", 752 | "metadata": { 753 | "slideshow": { 754 | "slide_type": "subslide" 755 | } 756 | }, 757 | "source": [ 758 | "### Reference\n", 759 | "\n", 760 | "- https://subicura.com/2017/01/19/docker-guide-for-beginners-2.html\n", 761 | "- jupyter notebook docker : https://towardsdatascience.com/docker-jupyter-for-machine-learning-in-1-minute-30e1df969d09\n", 762 | "- http://moducon.kr/2018/wp-content/uploads/sites/2/2018/12/leesangsoo_slide.pdf" 763 | ] 764 | } 765 | ], 766 | "metadata": { 767 | "celltoolbar": "Slideshow", 768 | "kernelspec": { 769 | "display_name": "Python 3", 770 | "language": "python", 771 | "name": "python3" 772 | }, 773 | "language_info": { 774 | "codemirror_mode": { 775 | "name": "ipython", 776 | "version": 3 777 | }, 778 | "file_extension": ".py", 779 | "mimetype": "text/x-python", 780 | "name": "python", 781 | "nbconvert_exporter": "python", 782 | "pygments_lexer": "ipython3", 783 | "version": "3.7.4" 784 | }, 785 | "varInspector": { 786 | "cols": { 787 | "lenName": 16, 788 | "lenType": 16, 789 | "lenVar": 40 790 | }, 791 | "kernels_config": { 792 | "python": { 793 | "delete_cmd_postfix": "", 794 | "delete_cmd_prefix": "del ", 795 | "library": "var_list.py", 796 | "varRefreshCmd": "print(var_dic_list())" 797 | }, 798 | "r": { 799 | "delete_cmd_postfix": ") ", 800 | "delete_cmd_prefix": "rm(", 801 | "library": "var_list.r", 802 | "varRefreshCmd": "cat(var_dic_list()) " 803 | } 804 | }, 805 | 
"types_to_exclude": [ 806 | "module", 807 | "function", 808 | "builtin_function_or_method", 809 | "instance", 810 | "_Feature" 811 | ], 812 | "window_display": false 813 | } 814 | }, 815 | "nbformat": 4, 816 | "nbformat_minor": 2 817 | } 818 | -------------------------------------------------------------------------------- /notebooks/카일-스쿨-5회차-클라우드와-데이터-엔지니어링.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "## 카일 스쿨 5회차\n", 12 | "- [![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fzzsza.github.io%2Fkyle-school%2Fweek5%2F)](https://hits.seeyoufarm.com)\n", 13 | "- #0. 연말정산\n", 14 | "- #1. 클라우드란?\n", 15 | "- #2. 데이터 엔지니어링 퀵 요약" 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": { 21 | "slideshow": { 22 | "slide_type": "slide" 23 | } 24 | }, 25 | "source": [ 26 | "### 연말 정산\n", 27 | "- 연말 정산이 끝났지만, 내년을 위해 한번 정리해보는 시간을 가지려고 함\n", 28 | "- 신입/인턴 관점에서 챙길 내용 위주로!\n", 29 | "- 늦어도 지금 시작하면 내년에 효과를 볼 수 있음!" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": { 35 | "slideshow": { 36 | "slide_type": "subslide" 37 | } 38 | }, 39 | "source": [ 40 | "- 연말 정산의 원리\n", 41 | " - 연말에 하는 정산\n", 42 | " - 국가의 정책 사업을 위해 세금을 냄\n", 43 | " - 회사가 월급을 줄 때 세금을 미리 떼고 줌!\n", 44 | " - 월급에서 4대 보험, 소득세 빼서 나머지 금액을 줌\n", 45 | " - 매달 우리는 세금을 내고 있다\n", 46 | " - 연말이 되서 확인하니, \"너 세금 더 냄. 돈 더 줄게\" or \"너 세금 덜 냄. 더 내\" 이러는 과정\n", 47 | " " 48 | ] 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "metadata": { 53 | "slideshow": { 54 | "slide_type": "subslide" 55 | } 56 | }, 57 | "source": [ 58 | "- 연말정산 시기\n", 59 | " - 1~2월에 진행. 
회사마다 기간은 다름\n", 60 | " - 솔루션이 있는 회사에 외주할 경우 => 홈택스에서 받은 PDF를 업로드하면 끝\n", 61 | " - 직접 연말정산 서류를 작성할 경우 => 직접 작성" 62 | ] 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "metadata": { 67 | "slideshow": { 68 | "slide_type": "subslide" 69 | } 70 | }, 71 | "source": [ 72 | "- 국가에서 세금 매기는 1단계 : 내가 번 돈 만큼 매달 소득세가 정해짐\n", 73 | "- 국가에서 세금 매기는 2단계 : 국가 정책 사업에 기여한 것이 있으면 세금을 깎아줌(소득공제 또는 세액공제)\n", 74 | "- 결정세액 : 1, 2단계를 고려해 결정된 세금\n", 75 | "- 기납부세액 : 회사가 이미 납부해준 세금\n", 76 | "- 결정세액 - 기납부세액으로 환급받을 금액 또는 추가로 낼 금액이 결정됨" 77 | ] 78 | }, 79 | { 80 | "cell_type": "markdown", 81 | "metadata": { 82 | "slideshow": { 83 | "slide_type": "subslide" 84 | } 85 | }, 86 | "source": [ 87 | "\n", 88 | " " 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "metadata": { 94 | "slideshow": { 95 | "slide_type": "subslide" 96 | } 97 | }, 98 | "source": [ 99 | "- 소득공제와 세액공제\n", 100 | " - 소득공제\n", 101 | " - 세금 부과 대상이 되는 소득을 줄여주는 것(간접)\n", 102 | " - 소득 규모가 크면 높은 세율을 적용받으므로, 과세표준이 줄어들 때 세율이 낮은 저소득자보다 소득세 절감액이 많아짐\n", 103 | " - 예시 : 인적공제, 특별소득공제(보험료, 주택자금공제), 조특법상 소득공제(신용카드 등 사용금액, 소기업/소상공인 공제부금 등) 등\n", 104 | " - 세액공제\n", 105 | " - 결정된 세금에서 깎아주는 것(직접)\n", 106 | " - 세금 80만원을 내야 하는데, 세액공제 30만원 받으면 50만원만 냄\n", 107 | " - 예시 : 중소기업 소득세 감면, 월세 세액공제\n", 108 | " - 소득이 많으면 소득공제가 유리하고, 일반 근로자면 보통 세액공제가 더 이득\n", 109 | " " 110 | ] 111 | }, 112 | { 113 | "cell_type": "markdown", 114 | "metadata": { 115 | "slideshow": { 116 | "slide_type": "subslide" 117 | } 118 | }, 119 | "source": [ 120 | "- (참고) 5월 종합소득세 신고\n", 121 | " - 종합소득의 구성\n", 122 | " - 근로소득, 사업소득, 이자소득, 배당소득, 기타소득, 연금소득 등\n", 123 | " - 프리랜서거나 기타 소득이 있는 경우\n", 124 | " - 알바, 심지어 대학원생도 5월 종합소득세 신고에서 세금을 환급받을 수 있음(연구비 인건비)" 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "metadata": { 130 | "slideshow": { 131 | "slide_type": "subslide" 132 | } 133 | }, 134 | "source": [ 135 | "- 처음에 챙기면 좋은 항목들 - 소득공제\n", 136 | " - 2020년 1월 기준이고, 계속 정책이 바뀌니 유심히 찾아보기(보통 회사에서 작년이랑 변경된 것 같이 보내줌)\n", 137 | " - 주택마련저축 소득공제(청약)\n", 138 | " - 무주택이고 세대주일 경우\n", 139 | " - 총급여 7000만원 이하\n", 
140 | " - 불입액의 40% 공제\n", 141 | " - 최대 240만원 한도 => 최대 공제 96만원\n", 142 | " - 최초에 자동 반영안되니 꼭 직접\n", 143 | " - 주택임차차입금 원리금상환액 소득공제\n", 144 | " - 전세자금을 차입한 경우, 원리금 상환액의 40% 공제\n", 145 | " - 카드를 많이 써서 절세받는게 나을까? 고민해보기. 소비를 절제하고 내 돈에서 쓰는 습관이 더 좋을수도 있음\n", 146 | " - 현금영수증은 핸드폰번호 국세청 홈택스에 꼭 신청하기!(최초라면 더 신경)" 147 | ] 148 | }, 149 | { 150 | "cell_type": "markdown", 151 | "metadata": { 152 | "slideshow": { 153 | "slide_type": "subslide" 154 | } 155 | }, 156 | "source": [ 157 | "- 처음에 챙기면 좋은 항목들 - 세액공제\n", 158 | " - 2020년 1월 기준이고, 계속 정책이 바뀌니 유심히 찾아보기(보통 회사에서 작년이랑 변경된 것 같이 보내줌)\n", 159 | " - 중소기업 소득세 감면 90%, 1년에 최대 150만원. 5년\n", 160 | " - 월세\n", 161 | " - 1년 총 급여 5500만원 이하 => 월세 12% 공제\n", 162 | " - 1년 총 급여 5500만원 ~ 7000만원 => 월세 10% 공제\n", 163 | " - 전용면적 85㎡(25.7평) 이하 or 기준시가 3억원 이하 주택 거주\n", 164 | " - 계약서와 주민등록등본 주소지 일치해야 하고, 전입신고 필수\n", 165 | " - 연간 최대 750만원" 166 | ] 167 | }, 168 | { 169 | "cell_type": "markdown", 170 | "metadata": { 171 | "slideshow": { 172 | "slide_type": "subslide" 173 | } 174 | }, 175 | "source": [ 176 | "- 의료비 중 렌즈/안경 구입비\n", 177 | " - 총 급여에서 3% 초과해서 지출해야 함\n", 178 | " - 시력 교정용(도수가 있어야 함)이어야 함\n", 179 | " - 1인 50만원 한도\n", 180 | "- IRP\n", 181 | " - 개인형 퇴직연금\n", 182 | " - 연소득 5500만원 이하 => 16.5%\n", 183 | " - 연소득 5500만원 초과 => 13.2%\n", 184 | " - 최대 연 700만원\n", 185 | " - 단, 요즘같은 경제 상황에 미래 퇴직 연금이 가치가 있을지? 고민해볼 필요가 있음\n", 186 | "- 보험료도 납입액의 12% 세액공제\n", 187 | "- 기부금 15% 세액공제" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": { 193 | "slideshow": { 194 | "slide_type": "subslide" 195 | } 196 | }, 197 | "source": [ 198 | "- 원천징수세액\n", 199 | " - 월급에서 세금을 얼마나 뗄건지?\n", 200 | " - 80 100 120\n", 201 | " - 120으로 하면 소비가 줄어들고 연말정산때 안뱉을 수 있지 않나? 싶었음 \n", 202 | " - 다른 의견 : 80으로 하고 남은 돈을 차라리 활용해라. 
120으로 하면 무이자로 국가에게 빌려주는 격\n", 203 | " - 결국 자신의 선택" 204 | ] 205 | }, 206 | { 207 | "cell_type": "markdown", 208 | "metadata": { 209 | "slideshow": { 210 | "slide_type": "subslide" 211 | } 212 | }, 213 | "source": [ 214 | "### 정리\n", 215 | "" 216 | ] 217 | }, 218 | { 219 | "cell_type": "markdown", 220 | "metadata": { 221 | "slideshow": { 222 | "slide_type": "slide" 223 | } 224 | }, 225 | "source": [ 226 | "### 클라우드\n", 227 | "- 클라우드가 왜 필요한가?\n", 228 | " - 우리가 사업을 한다고 합시다\n", 229 | " - 개인 웹을 만들려고 함\n", 230 | " - 내 컴퓨터에서 작업을 하면?\n", 231 | " - 내 컴퓨터는 항상 끌 수 없음. 메인 컴퓨터가 꺼지면 Request도 못받기 때문" 232 | ] 233 | }, 234 | { 235 | "cell_type": "markdown", 236 | "metadata": { 237 | "slideshow": { 238 | "slide_type": "subslide" 239 | } 240 | }, 241 | "source": [ 242 | "- 그럼 어떻게 할까?\n", 243 | " - 물리적 공간과 확장성을 고려해서 서버실을 만듬(IDC, Internet Data Center)\n", 244 | " - 컴퓨터 자원을 넣을 공간 + 추후 자원을 추가할 때 즉각적인 확장할 수 있는 규모인지\n", 245 | " - 에어컨 등등\n", 246 | " " 247 | ] 248 | }, 249 | { 250 | "cell_type": "markdown", 251 | "metadata": { 252 | "slideshow": { 253 | "slide_type": "subslide" 254 | } 255 | }, 256 | "source": [ 257 | "- 예전에 이런 명목에서 서버실이 있었음\n", 258 | "- 서버실을 운영하는 것보다, 그냥 가져다 쓰는 개념으로 클라우드가 시작했고, 점점 더 성장\n", 259 | "- 개발자들이 모두 해야하는 작업(예 : Docker로 이미지 생성하고, 그 이미지로 컴퓨터를 띄우고, IP 지정하고, 네트워크 설정하고 등등...)\n", 260 | " - 이것도 클라우드에서 해줌 => 개발자의 퍼포먼스 개선\n", 261 | " " 262 | ] 263 | }, 264 | { 265 | "cell_type": "markdown", 266 | "metadata": { 267 | "slideshow": { 268 | "slide_type": "subslide" 269 | } 270 | }, 271 | "source": [ 272 | "- 종류\n", 273 | " - IaaS : Infrastructure as a Service, 인프라 자원 서비스 \n", 274 | " - PaaS : Platform as a Service, 개발에 필요한 환경 서비스\n", 275 | " - SaaS : Software as a Service, 사용자가 원하는 소프트웨어 서비스\n", 276 | " - \n", 277 | " - 이미지 출처 : https://rubygarage.org/blog/iaas-vs-paas-vs-saas" 278 | ] 279 | }, 280 | { 281 | "cell_type": "markdown", 282 | "metadata": { 283 | "slideshow": { 284 | "slide_type": "subslide" 285 | } 286 | }, 287 | "source": [ 288 | "- 클라우드 회사\n", 289 | " - 대표적으로 
AWS(Amazon), Azure(Microsoft), GCP(Google), Alibaba Cloud, Naver Cloud 등\n", 290 | " - 이 클라우드의 차이는?\n", 291 | " - 벤더사가 어디냐, 제품의 차이, 어떤 것에 중점을 두느냐 등이 있음\n", 292 | " - 시장 점유율 1위 AWS, 2위 Azure, 3위 GCP라고 함\n", 293 | " - 처음 접할땐 GCP 추천. 크레딧 300달러를 줌(왠만한거 테스트 가능)" 294 | ] 295 | }, 296 | { 297 | "cell_type": "markdown", 298 | "metadata": { 299 | "slideshow": { 300 | "slide_type": "subslide" 301 | } 302 | }, 303 | "source": [ 304 | "- 제품군\n", 305 | " - 클라우드가 처음이면 자세히 익히는 것보다 큰 그림부터 가져가는 것 추천\n", 306 | " - 클라우드마다 비슷한 제품이 있음\n", 307 | " - " 308 | ] 309 | }, 310 | { 311 | "cell_type": "markdown", 312 | "metadata": { 313 | "slideshow": { 314 | "slide_type": "subslide" 315 | } 316 | }, 317 | "source": [ 318 | "- 이거만 먼저 알고 가세요\n", 319 | " - 컴퓨팅 리소스 : EC2, Compute Engine\n", 320 | " - Object Storage(저장소) : S3, Cloud Storage\n", 321 | " - RDB\n", 322 | " - 데이터 웨어하우스 : BigQuery\n", 323 | " - GCP에만 존재하는 것 : Composer(Airflow), TPU\n", 324 | " - 각 클라우드마다 장단이 있음\n", 325 | " - AWS : 인프라 전반 우수, 강화학습쪽 딥레이서 개발\n", 326 | " - GCP : 머신러닝 AI Platform, 데이터 전처리, Auto ML, TPU 등" 327 | ] 328 | }, 329 | { 330 | "cell_type": "markdown", 331 | "metadata": { 332 | "slideshow": { 333 | "slide_type": "subslide" 334 | } 335 | }, 336 | "source": [ 337 | "- 클라우드를 어떻게 활용하는가?\n", 338 | " - 개발 프로세스에선 Local에서 개발하고, Staging(혹은 Alpha) 서버 / Production 서버로 배포 과정을 거침\n", 339 | " - 배포 : 코드를 최신화하고, 필요시 프로그램을 재기동하고 등등\n", 340 | " - 서버에서 로그를 남길 경우, DB에 데이터가 있기도 하고 + 로그 데이터는 Object Storage에 저장하곤 함\n", 341 | " - 데이터쪽은 RDB 또는 데이터 웨어하우스 주로 사용\n", 342 | " - Jupyter notebook 환경 만들 때 주로 사용" 343 | ] 344 | }, 345 | { 346 | "cell_type": "markdown", 347 | "metadata": { 348 | "slideshow": { 349 | "slide_type": "subslide" 350 | } 351 | }, 352 | "source": [ 353 | "- 우리가 클라우드를 바라볼 관점\n", 354 | " - 컴퓨터를 빌리고, 머신러닝/딥러닝 모델 학습하고, 데이터는 데이터 웨어하우스에서 추출\n", 355 | " - 데이터 웨어하우스(BigQuery)를 사용한 전처리\n", 356 | " - 클라우드가 발전해서 점점 데이터 엔지니어링이 쉬워짐\n", 357 | "- 구글 클라우드 UI로 Go!" 
358 | ] 359 | }, 360 | { 361 | "cell_type": "markdown", 362 | "metadata": { 363 | "slideshow": { 364 | "slide_type": "subslide" 365 | } 366 | }, 367 | "source": [ 368 | "- 어떻게 공부해야 좋을까?\n", 369 | " - 필요한 것을 만들어보기(간단한 웹, 크롤러, 자신의 업무 자동화 등)\n", 370 | " - 개인 프로젝트로 하고 괜찮다면 회사에 적용\n", 371 | " - 인스타그램 자동 좋아요 코드 짜고 클라우드에 배포해서 주기적으로 매일 10시 실행\n", 372 | " - 저는 매달 5만원씩 적금처럼 돈 모아서, 그 돈으로 GCP 사용 중\n", 373 | " - AWS는 Ebay에서 \"aws credit code\" 검색해보시면...\n", 374 | " - GPU를 많이 사용한다면 그래픽카드를 맞추는 것도 좋고, 많이 안쓰면 클라우드도 좋음\n", 375 | " - 백번 책 읽는것보다, 직접 하는게 좋음\n", 376 | " - Jupyter Notebook 작업 환경 만들기\n", 377 | " - 하면서 Docker, Shell 등에 대해 필요성을 느낌\n", 378 | " - 현재 회사에선 어떤 것을 사용하고 있을까? 생각해보고 필요시 생성(단, 데이터 엔지니어링 협조 필수)" 379 | ] 380 | }, 381 | { 382 | "cell_type": "markdown", 383 | "metadata": { 384 | "slideshow": { 385 | "slide_type": "slide" 386 | } 387 | }, 388 | "source": [ 389 | "### 데이터 엔지니어링\n", 390 | "- 데이터 엔지니어링이란?\n", 391 | " - 데이터 엔지니어링은 앱 또는 웹에서 발생하는 데이터들을 파이프라인을 만들어 저장하는 일을 주로 함\n", 392 | " - 즉, 우리가 보고 있는 데이터를 적재해주는 일\n", 393 | " - 데이터 처리를 어떻게 시스템화하는가?\n", 394 | " - 그 외\n", 395 | " - 데이터 분석가들이 더 잘 일할 수 있도록 도와주시고 계심\n", 396 | " - 실시간 API 개발 등\n" 397 | ] 398 | }, 399 | { 400 | "cell_type": "markdown", 401 | "metadata": { 402 | "slideshow": { 403 | "slide_type": "subslide" 404 | } 405 | }, 406 | "source": [ 407 | "### 데이터 웨어하우스, 데이터 마트, 데이터 레이크 용어 정리\n", 408 | "- 데이터 웨어하우스\n", 409 | " - 일반 RDB와 달리 대량의 데이터를 장기 보존\n", 410 | " - RDB나 로그 저장하는 곳을 데이터 소스(Data source)라 부름\n", 411 | " - Raw 데이터를 추출해 데이터 웨어하우스에 넣는 것이 ETL 프로세스" 412 | ] 413 | }, 414 | { 415 | "cell_type": "markdown", 416 | "metadata": { 417 | "slideshow": { 418 | "slide_type": "subslide" 419 | } 420 | }, 421 | "source": [ 422 | "- 데이터 마트\n", 423 | " - 데이터 웨어하우스에서 필요한 데이터만 추출해 구축\n", 424 | " - BI 도구와 조합시키는 형태로 주로 사용됨" 425 | ] 426 | }, 427 | { 428 | "cell_type": "markdown", 429 | "metadata": { 430 | "slideshow": { 431 | "slide_type": "subslide" 432 | } 433 | }, 434 | "source": [ 435 | "- 데이터 레이크\n", 436 | " - 모든 데이터가 데이터 웨어하우스를 
가정해서 만들어지지 않음\n", 437 | " - 따라서 모든 데이터를 원래 형태로 저장하고, 나중에 필요에 따라 가공하는 구조가 필요 => 데이터 레이크\n", 438 | " - 예를 들어 다른 업체에서 받은 텍스트 파일은 데이터 웨어하우스에 넣을 수 없음\n", 439 | " - 보험사에서 엑셀로 들어오는 데이터(매번 형식이 바뀐다면?)\n", 440 | "- 데이터 레이크 중심으로 하는 파이프라인과 데이터 웨어하우스를 기반으로 하는 파이프라인 등이 있음" 441 | ] 442 | }, 443 | { 444 | "cell_type": "markdown", 445 | "metadata": { 446 | "slideshow": { 447 | "slide_type": "subslide" 448 | } 449 | }, 450 | "source": [ 451 | "- 왜 알아야 해요?\n", 452 | " - 데이터 엔지니어링팀과 협업\n", 453 | " - 가능하면 데이터 전처리부터 모델링까지 직접 모두 처리하는 것도 매력적\n", 454 | " - 클라우드의 발전으로 점점 엔지니어링이 쉬워지고 있음\n", 455 | " - 점점 하이브리드로 데이터 엔지니어링 능력이 있는 사람이 각광받을 것으로 예상" 456 | ] 457 | }, 458 | { 459 | "cell_type": "markdown", 460 | "metadata": { 461 | "slideshow": { 462 | "slide_type": "subslide" 463 | } 464 | }, 465 | "source": [ 466 | "- 알아두면 좋은 지식\n", 467 | " - Batch(배치)와 Real Time의 차이\n", 468 | " - 배치성 : 1시간(혹은 30분)마다 작업\n", 469 | " - Real Time은 실시간으로 바로 데이터 적재. 실시간은 네트워크나 데이터 보존 등이 필요해 난이도가 높은 편 \n", 470 | " - ETL\n", 471 | " - Extract : 데이터 소스에서 데이터를 추출하고\n", 472 | " - Transform : 원하는 형태로 변환하고\n", 473 | " - Load : 저장" 474 | ] 475 | }, 476 | { 477 | "cell_type": "markdown", 478 | "metadata": { 479 | "slideshow": { 480 | "slide_type": "subslide" 481 | } 482 | }, 483 | "source": [ 484 | "- 정형 데이터와 비정형 데이터\n", 485 | " - 정형 : Tabular Data\n", 486 | " - 비정형 : 이미지, 텍스트, 음성 등\n", 487 | "- DB 스키마는 자주 변경될 수 있음\n", 488 | " - 새로운 기능이 추가되면 새로운 스키마가 추가됨\n", 489 | " - 이런 것에 대응하기 위해 NoSQL를 사용하기도 함\n", 490 | "- 데이터 엔지니어링을 어렵다고 생각하지 말고, 블럭 쌓기와 비슷다고 생각하면 됨\n", 491 | "- 꼭 스파크를 하지 않아도 됨\n", 492 | "- 분산처리는 매우 어려운 부분" 493 | ] 494 | }, 495 | { 496 | "cell_type": "markdown", 497 | "metadata": { 498 | "slideshow": { 499 | "slide_type": "subslide" 500 | } 501 | }, 502 | "source": [ 503 | "- 파이프라인\n", 504 | " - \n", 505 | " - 핵심\n", 506 | " - 최초에 데이터가 \"어디에\" 저장되는가?\n", 507 | " - 최초에 데이터가 \"어떤 형태로\" 저장되는가?\n", 508 | " - 최종적으로 데이터를 \"어디에\" 저장할 것인가?\n", 509 | " - 최종적으로 데이터를 \"어떤 형태로\" 저장할 것인가?" 
510 | ] 511 | }, 512 | { 513 | "cell_type": "markdown", 514 | "metadata": { 515 | "slideshow": { 516 | "slide_type": "subslide" 517 | } 518 | }, 519 | "source": [ 520 | "- 간단하게 Raw level 아이디어 생각해보기\n", 521 | " - 매일 1시간마다 서버에서 S3에 데이터를 저장한다(json으로)\n", 522 | " - json은 계속 덮어쓰기된다\n", 523 | " - 로그 샘플\n", 524 | " - " 525 | ] 526 | }, 527 | { 528 | "cell_type": "markdown", 529 | "metadata": { 530 | "slideshow": { 531 | "slide_type": "subslide" 532 | } 533 | }, 534 | "source": [ 535 | "- 간단한 접근\n", 536 | " - 1시간마다 json을 BigQuery에 적재한다\n", 537 | " - 정규표현식 등으로 필요한 데이터만 추출\n", 538 | " - crontab에 1시간 간격으로 실행되도록 설정한다\n", 539 | " - crontab에서 1가지 작업만 하면 큰 이슈는 없음" 540 | ] 541 | }, 542 | { 543 | "cell_type": "markdown", 544 | "metadata": { 545 | "slideshow": { 546 | "slide_type": "fragment" 547 | } 548 | }, 549 | "source": [ 550 | "- 그러나 A Table을 저장하고, B Table은 A Table을 가공한 결과라면?\n", 551 | " - A가 제대로 적재되지 않으면, B Table이 연산을 시도해도 올바른 결과가 생기지 않음\n", 552 | " - 일종의 Dependency가 있는격\n", 553 | " - crontab의 단점\n", 554 | " - 앞선 Task가 실패하면 뒤 Task를 멈추지 않고 실행함(구현을 안했다면)\n", 555 | " - ETL 파이프라인 모니터링이 힘듬\n", 556 | " - 이걸 쉽게 하기 위해 Airflow, Luigi, Rundeck 등을 사용함" 557 | ] 558 | }, 559 | { 560 | "cell_type": "markdown", 561 | "metadata": { 562 | "slideshow": { 563 | "slide_type": "subslide" 564 | } 565 | }, 566 | "source": [ 567 | "- " 568 | ] 569 | }, 570 | { 571 | "cell_type": "markdown", 572 | "metadata": { 573 | "slideshow": { 574 | "slide_type": "subslide" 575 | } 576 | }, 577 | "source": [ 578 | "- 레트리카에서 제가 만든 파이프라인\n", 579 | " - 서버 데이터는 Pub/Sub으로 실시간 스트리밍으로 데이터를 받고\n", 580 | " - 앱 데이터는 Firebase에서 데이터 적재가 되면, Batch로 데이터 처리\n", 581 | " - \n", 582 | " " 583 | ] 584 | }, 585 | { 586 | "cell_type": "markdown", 587 | "metadata": { 588 | "slideshow": { 589 | "slide_type": "subslide" 590 | } 591 | }, 592 | "source": [ 593 | "- Uber\n", 594 | " - 모든 데이터를 Kafka로 실시간 스트리밍하고, 데이터를 하둡에 저장함\n", 595 | " - " 596 | ] 597 | }, 598 | { 599 | "cell_type": "markdown", 600 | "metadata": { 601 | "slideshow": { 602 
| "slide_type": "subslide" 603 | } 604 | }, 605 | "source": [ 606 | "### 추천 자료\n", 607 | "- 강대명님의 자료 [Data Engineering 101](https://www.slideshare.net/charsyam2/data-engineering-101)\n", 608 | "- [빅데이터를 지탱하는 기술](http://www.yes24.com/Product/Goods/66277191) : 데이터 엔지니어 신입이라면 꼭 읽어보라고 권함. 기초 내용 다 있어요\n", 609 | " - " 610 | ] 611 | }, 612 | { 613 | "cell_type": "markdown", 614 | "metadata": { 615 | "slideshow": { 616 | "slide_type": "subslide" 617 | } 618 | }, 619 | "source": [ 620 | "### 우리가 다음 주에 실습할 내용\n", 621 | "- 간단한 파이프라인 만들기\n", 622 | " - 클라우드가 발전하면서 점점 덜 신경써도 간단한 파이프라인을 만들 수 있음\n", 623 | " - Airflow Local에 띄우기\n", 624 | " - Python Operator\n", 625 | " - BigQuery Operator\n", 626 | " - BigQuery Public 데이터 활용할 예정\n", 627 | "- 여러분이 하셔야 하는 것\n", 628 | " - 로컬에 Airflow 설치\n", 629 | " - GCP 개인 Project 생성 후, Google Cloud Credential 다운(서비스 계정 생성)\n", 630 | " - [참고 링크](https://github.com/zzsza/bigquery-tutorial/blob/master/tutorials/05-ETC/01.%20GOOGLE_CLOUD_CRENDENTIALS_json_file_setting.ipynb)\n", 631 | " - BigQuery Admin 권한 설정해야 함" 632 | ] 633 | }, 634 | { 635 | "cell_type": "markdown", 636 | "metadata": { 637 | "slideshow": { 638 | "slide_type": "subslide" 639 | } 640 | }, 641 | "source": [ 642 | "- Airflow 설치\n", 643 | "\n", 644 | "```\n", 645 | "pip3 install 'apache-airflow[gcp]'==1.10.3\n", 646 | "```" 647 | ] 648 | } 649 | ], 650 | "metadata": { 651 | "celltoolbar": "Slideshow", 652 | "kernelspec": { 653 | "display_name": "Python 3", 654 | "language": "python", 655 | "name": "python3" 656 | }, 657 | "language_info": { 658 | "codemirror_mode": { 659 | "name": "ipython", 660 | "version": 3 661 | }, 662 | "file_extension": ".py", 663 | "mimetype": "text/x-python", 664 | "name": "python", 665 | "nbconvert_exporter": "python", 666 | "pygments_lexer": "ipython3", 667 | "version": "3.7.4" 668 | }, 669 | "varInspector": { 670 | "cols": { 671 | "lenName": 16, 672 | "lenType": 16, 673 | "lenVar": 40 674 | }, 675 | "kernels_config": { 676 | "python": { 677 | 
"delete_cmd_postfix": "", 678 | "delete_cmd_prefix": "del ", 679 | "library": "var_list.py", 680 | "varRefreshCmd": "print(var_dic_list())" 681 | }, 682 | "r": { 683 | "delete_cmd_postfix": ") ", 684 | "delete_cmd_prefix": "rm(", 685 | "library": "var_list.r", 686 | "varRefreshCmd": "cat(var_dic_list()) " 687 | } 688 | }, 689 | "types_to_exclude": [ 690 | "module", 691 | "function", 692 | "builtin_function_or_method", 693 | "instance", 694 | "_Feature" 695 | ], 696 | "window_display": false 697 | } 698 | }, 699 | "nbformat": 4, 700 | "nbformat_minor": 2 701 | } 702 | -------------------------------------------------------------------------------- /notebooks/카일-스쿨-6회차-영양제와-airflow.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "## 카일 스쿨 6회차\n", 12 | "- [![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fzzsza.github.io%2Fkyle-school%2Fweek6)](https://hits.seeyoufarm.com)\n", 13 | "- #0. 영양제\n", 14 | "- #1. Airflow" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": { 20 | "slideshow": { 21 | "slide_type": "slide" 22 | } 23 | }, 24 | "source": [ 25 | "### 0. 영양제" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "metadata": { 31 | "slideshow": { 32 | "slide_type": "fragment" 33 | } 34 | }, 35 | "source": [ 36 | "- 영양제/비타민을 왜 섭취해야 하는가\n", 37 | " - 최고의 건강 관리 : 균형잡힌 식사와 꾸준한 운동\n", 38 | " - 하지만... 
꾸준한 운동..\n", 39 | " - 과로, 스트레스에 시달리는 경우 영양소 소모가 많음 => 부족한 부분 발생\n", 40 | " - 자취생 => 과일 못먹음 + 배달 => 균형 잡히지 못한 식사\n", 41 | " - 비타민 D => 햇빛을 봐야하는데 => 우린 일을 하네" 42 | ] 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "metadata": { 47 | "slideshow": { 48 | "slide_type": "subslide" 49 | } 50 | }, 51 | "source": [ 52 | "- 어떤 비타민을 먹어야 하는가\n", 53 | "- 비타민B\n", 54 | " - 육체 피로 회복\n", 55 | " - 학업, 취업, 업무 등으로 육체피로가 만성화 => 마그네슘과 같이 섞기\n", 56 | " - 저는 비맥스 메타를 먹고 있는데 효과 좋음" 57 | ] 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "metadata": { 62 | "slideshow": { 63 | "slide_type": "fragment" 64 | } 65 | }, 66 | "source": [ 67 | "- 마그네슘\n", 68 | " - 세포 에너지 생성에 관여\n", 69 | " - 마그네슘 부족시 피로, 근육통, 경련, 수면장애, 우울감 등" 70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "metadata": { 75 | "slideshow": { 76 | "slide_type": "subslide" 77 | } 78 | }, 79 | "source": [ 80 | "- 루테인\n", 81 | " - 시세포가 밀집된 황반의 기능을 유지\n", 82 | "- 비타민C\n", 83 | " - 항산화 효능, 면역력 강화\n", 84 | "- 오메가3\n", 85 | " - 안구건조증에 효과적, 당뇨에 도움, 치매 방지 등\n", 86 | "- 다 먹으라는 것은 아니고 하나씩 찾아보고, 자신에게 맞는 것을 섭취!" 
87 | ] 88 | }, 89 | { 90 | "cell_type": "markdown", 91 | "metadata": { 92 | "slideshow": { 93 | "slide_type": "subslide" 94 | } 95 | }, 96 | "source": [ 97 | "- 비타민을 어떻게 구입할 것인가\n", 98 | " - 아이허브 VS 로켓 쿠팡직구\n", 99 | " - 아이허브가 무조건 저렴한 것은 아니고, 로켓 쿠팡직구가 저렴한 경우도 있음\n", 100 | " - 아이허브는 40달러 이상 구매하면 무료 배송\n", 101 | "- 사실 제일 중요한 것은\n", 102 | " - 꾸준한 운동\n", 103 | " - 좋은 식습관\n", 104 | " - 영양제\n", 105 | " - 3위일체..!\n", 106 | " " 107 | ] 108 | }, 109 | { 110 | "cell_type": "markdown", 111 | "metadata": { 112 | "slideshow": { 113 | "slide_type": "subslide" 114 | } 115 | }, 116 | "source": [ 117 | "- 추천 자료\n", 118 | " - 약사가 들려주는 약이야기(고약사님) [유튜브](https://www.youtube.com/watch?v=TqtSLSsjtZs)\n", 119 | " - 약사가 들려주는 약이야기(고약사님) [인스타](https://www.instagram.com/yakstory119/)\n", 120 | " - 쿠마님 [블로그](https://blog.naver.com/hs_kuma)" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "metadata": { 126 | "slideshow": { 127 | "slide_type": "slide" 128 | } 129 | }, 130 | "source": [ 131 | "### 1. Airflow\n", 132 | "- 오늘 할 이야기\n", 133 | " - Airflow란?\n", 134 | " - Airflow Architecture\n", 135 | " - DAG\n", 136 | " - Airflow BashOperator, PythonOperator 사용하기\n", 137 | " - Jinja Template 사용하기\n", 138 | " - Airflow로 토이 ETL 파이프라인 만들기\n", 139 | "- 사실 더 하고싶지만.. 
1시간은 생각보다 짧기 때문에 이정도만 :)" 140 | ] 141 | }, 142 | { 143 | "cell_type": "markdown", 144 | "metadata": { 145 | "slideshow": { 146 | "slide_type": "subslide" 147 | } 148 | }, 149 | "source": [ 150 | "- Apache Airflow란?\n", 151 | " - 에어비앤비에서 만든 Workflow Management Tool\n", 152 | " - Workflow : 일련의 Task들의 연결\n", 153 | " - 활용할 수 있는 포인트\n", 154 | " - 데이터 엔지니어링 : ETL 파이프라인\n", 155 | " - 데이터를 source에서 가져와서 데이터 마트, 데이터 웨어하우스 등에 저장\n", 156 | " - 머신러닝 엔지니어링\n", 157 | " - 머신러닝 모델 주기적인 학습(1주 간격), 예측(30분 간격)\n", 158 | " - 실시간 API가 아닌 Batch성 예측\n", 159 | " - 간단한 cron 작업\n", 160 | " - crontab에 특정 작업 반복 등을 실행\n", 161 | " - 여러 작업들의 연결성(의존성) 관리\n", 162 | " - 앞의 작업이 성공해야 뒤 작업을 하도록 설정\n", 163 | " - 여러가지 작업을 효율적으로 관리(시각화 등) " 164 | ] 165 | }, 166 | { 167 | "cell_type": "markdown", 168 | "metadata": { 169 | "slideshow": { 170 | "slide_type": "subslide" 171 | } 172 | }, 173 | "source": [ 174 | "- Apache Airflow의 장점\n", 175 | " - Python 기반\n", 176 | " - Scheduling : 특정 간격으로 계속 실행\n", 177 | " - Backfill : 과거 작업 실행\n", 178 | " - 특정 Task 실패시 => Task만 재실행 / DAG 재실행 등 실패 로직도 있음\n", 179 | " - 데이터 엔지니어링에서 많이 사용됨\n", 180 | " - Google Cloud Platform에 있는 대부분의 기능을 지원\n", 181 | " - Google Cloud Platform엔 Managed Service(관리형 서비스)인 Composer 존재" 182 | ] 183 | }, 184 | { 185 | "cell_type": "markdown", 186 | "metadata": { 187 | "slideshow": { 188 | "slide_type": "subslide" 189 | } 190 | }, 191 | "source": [ 192 | "- Airflow UI 설명\n", 193 | " - " 194 | ] 195 | }, 196 | { 197 | "cell_type": "markdown", 198 | "metadata": { 199 | "slideshow": { 200 | "slide_type": "subslide" 201 | } 202 | }, 203 | "source": [ 204 | "- 각종 Task 연결(Graph View)\n", 205 | " - \n", 206 | " - 빨간색 : failed => 앞에 작업들이 실패해서 뒤 작업 run_after_loop이 노란색 upstream_failed 되고 실행되지 않음" 207 | ] 208 | }, 209 | { 210 | "cell_type": "markdown", 211 | "metadata": { 212 | "slideshow": { 213 | "slide_type": "subslide" 214 | } 215 | }, 216 | "source": [ 217 | "- UTC\n", 218 | " - 협정 세계시로 1972년 1월 1일부터 시행된 국제 표준시\n", 219 | " - 서버에서 시간 처리할 땐, 거의 UTC를 사용함\n", 
220 | " - 한국 시간은 **UTC+9hour** \n", 221 | " - Airflow에서 UTC를 사용하기 때문에, CRON 표시할 때 UTC 기준으로 작성\n", 222 | " - 예 : UTC `30 1 * * *` => 한국은 `30 10 * * *` => 한국 오전 10시 30분\n", 223 | " " 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "metadata": { 229 | "slideshow": { 230 | "slide_type": "subslide" 231 | } 232 | }, 233 | "source": [ 234 | "- Airflow 실행\n", 235 | " - airflow webserver와 airflow scheduler 2개 실행해야 함\n", 236 | " - 터미널 1개에 webserver를 띄우고, command+t로 새로운 터미널을 띄워서 scheduler를 띄우기\n", 237 | "\n", 238 | " ```\n", 239 | " airflow webserver\n", 240 | " airflow scheduler\n", 241 | " ```" 242 | ] 243 | }, 244 | { 245 | "cell_type": "markdown", 246 | "metadata": { 247 | "slideshow": { 248 | "slide_type": "subslide" 249 | } 250 | }, 251 | "source": [ 252 | "- Airflow 실행해보기\n", 253 | " - tutorial DAG을 실행(Links 아래에 있는 재생 버튼 클릭)\n", 254 | " - 혹시 ValueError: unknown locale: UTF-8 에러가 날경우 `~/.zshrc` 또는 `~/.bash_profile`에 아래 설정 추가\n", 255 | "\n", 256 | " ```\n", 257 | " export LC_ALL=en_US.UTF-8\n", 258 | " export LANG=en_US.UTF-8\n", 259 | " ```\n", 260 | "\n", 261 | " - 그 후 터미널에서 아래 커맨드 실행하고 webserver 다시 실행\n", 262 | "\n", 263 | " ```\n", 264 | " source ~/.zshrc\n", 265 | " # 또는 source ~/.bash_profile\n", 266 | " ```" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": { 272 | "slideshow": { 273 | "slide_type": "subslide" 274 | } 275 | }, 276 | "source": [ 277 | "- Airflow Architecture\n", 278 | " - \n", 279 | " - Airflow Webserver\n", 280 | " - 웹 UI를 표현하고, workflow 상태 표시하고 실행, 재시작, 수동 조작, 로그 확인 등 가능\n", 281 | " - Airflow Scheduler\n", 282 | " - 작업 기준이 충족되는지 여부를 확인\n", 283 | " - 종속 작업이 성공적으로 완료되었고, 예약 간격이 주어지면 실행할 수 있는 작업인지, 실행 조건이 충족되는지 등\n", 284 | " - 위 충족 여부가 DB에 기록되면, task들이 worker에게 선택되서 작업을 실행함" 285 | ] 286 | }, 287 | { 288 | "cell_type": "markdown", 289 | "metadata": { 290 | "slideshow": { 291 | "slide_type": "subslide" 292 | } 293 | }, 294 | "source": [ 295 | "- DAG\n", 296 | " - \n", 297 | " - Airflow의 DAG으로 모델링됨\n", 298 | " - 
Directed Acyclic Graphs\n", 299 | " - 방향이 있는 비순환 그래프\n", 300 | " - 비순환이기 때문에 마지막 Task가 다시 처음 Task로 이어지지 않음" 301 | ] 302 | }, 303 | { 304 | "cell_type": "markdown", 305 | "metadata": { 306 | "slideshow": { 307 | "slide_type": "subslide" 308 | } 309 | }, 310 | "source": [ 311 | "- 코드로 보는 DAG\n", 312 | " - \n", 313 | " " 314 | ] 315 | }, 316 | { 317 | "cell_type": "markdown", 318 | "metadata": { 319 | "slideshow": { 320 | "slide_type": "subslide" 321 | } 322 | }, 323 | "source": [ 324 | "- 1) Default Argument 정의\n", 325 | " - start_date가 중요! 과거 날짜를 설정하면 그 날부터 실행\n", 326 | " - retries, retry_delay : 실패할 경우 몇분 뒤에 재실행할지?\n", 327 | " - priority_weight : 우선 순위\n", 328 | " - 외에도 다양한 옵션이 있는데, [문서](https://airflow.apache.org/docs/stable/tutorial.html) 참고\n", 329 | " \n", 330 | " ```\n", 331 | " default_args = {\n", 332 | " 'owner': 'your_name',\n", 333 | " 'depends_on_past': False,\n", 334 | " 'start_date': datetime(2018, 12, 1),\n", 335 | " 'email': ['your@mail.com'],\n", 336 | " 'email_on_failure': False,\n", 337 | " 'email_on_retry': False,\n", 338 | " 'retries': 1,\n", 339 | " 'retry_delay': timedelta(minutes=5),\n", 340 | " 'priority_weight': 10,\n", 341 | " 'end_date': datetime(2018, 12, 3),\n", 342 | " # end_date가 없으면 계속 진행함\n", 343 | " }\n", 344 | " ```" 345 | ] 346 | }, 347 | { 348 | "cell_type": "markdown", 349 | "metadata": { 350 | "slideshow": { 351 | "slide_type": "subslide" 352 | } 353 | }, 354 | "source": [ 355 | "- 2) DAG 객체 생성\n", 356 | " - 첫 인자는 dag_id인데 고유한 id 작성\n", 357 | " - default_args는 위에서 정의한 argument를 넣고\n", 358 | " - schedule_interval은 crontab 표현 사용\n", 359 | " - schedule_interval='@once'는 한번만 실행. 
디버깅용으로 자주 사용\n", 360 | " - `5 4 * * *` 같은 표현을 사용\n", 361 | " - 더 궁금하면 [crontab guru](https://crontab.guru/) 참고\n", 362 | " \n", 363 | " ```\n", 364 | " dag = DAG('bash_dag', default_args=default_args, schedule_interval='@once')\n", 365 | " ```" 366 | ] 367 | }, 368 | { 369 | "cell_type": "markdown", 370 | "metadata": { 371 | "slideshow": { 372 | "slide_type": "subslide" 373 | } 374 | }, 375 | "source": [ 376 | "- 3) Operator로 Task 정의\n", 377 | " - Operator가 Instance가 되면 Task라 부름\n", 378 | " - BashOperator : Bash Command 실행\n", 379 | " - PythonOperator : Python 함수 실행\n", 380 | " - BigQueryOperator : BigQuery 쿼리 날린 후 Table 저장\n", 381 | " - 외에도 다양한 operator가 있고, operator마다 옵션이 다름\n", 382 | " - [Airflow Document](https://airflow.apache.org/docs/stable/_api/airflow/operators/index.html), [Integration Operator](https://airflow.apache.org/docs/stable/integration.html) 참고\n", 383 | " - mysql_to_hive 등도 있음\n", 384 | " \n", 385 | " ```\n", 386 | " task1 = BashOperator(\n", 387 | " task_id='print_date',\n", 388 | " bash_command='date',\n", 389 | " dag=dag)\n", 390 | "\n", 391 | " task2 = BashOperator(\n", 392 | " task_id='sleep',\n", 393 | " bash_command='sleep 5',\n", 394 | " retries=2,\n", 395 | " dag=dag)\n", 396 | "\n", 397 | " task3 = BashOperator(\n", 398 | " task_id='pwd',\n", 399 | " bash_command='pwd',\n", 400 | " dag=dag)\n", 401 | " ```" 402 | ] 403 | }, 404 | { 405 | "cell_type": "markdown", 406 | "metadata": { 407 | "slideshow": { 408 | "slide_type": "subslide" 409 | } 410 | }, 411 | "source": [ 412 | "- 4) task 의존 설정\n", 413 | " - task1 후에 task2를 실행하고 싶다면\n", 414 | " - task1.set_downstream(task2)\n", 415 | " - task2.set_upstream(task1)\n", 416 | " - 더 편해지면서 `>>`나 `<<` 사용 가능\n", 417 | " - task1 >> task2로 사용 가능\n", 418 | " - task1 >> [task2, task3]는 task1 후에 task2, task3 병렬 실행을 의미\n", 419 | " \n", 420 | " ```\n", 421 | " task1 >> task2\n", 422 | " task1 >> task3\n", 423 | " ```" 424 | ] 425 | }, 426 | { 427 | "cell_type": "markdown", 428 | "metadata": { 429 | 
"slideshow": { 430 | "slide_type": "subslide" 431 | } 432 | }, 433 | "source": [ 434 | "- 5) DAG 파일을 DAG 폴더에 저장해 실행되는지 확인\n", 435 | " - DAG 폴더에 넣고 바로 Webserver에 반영되진 않고 약간의 시간이 필요함\n", 436 | " - 수정하고 싶으면 `~/airflow/airflow.cfg`에서 dagbag_import_timeout, dag_file_processor_timeout 값을 수정하면 됨" 437 | ] 438 | }, 439 | { 440 | "cell_type": "markdown", 441 | "metadata": { 442 | "slideshow": { 443 | "slide_type": "subslide" 444 | } 445 | }, 446 | "source": [ 447 | "- 6) 디버깅\n", 448 | " - DAG이 실행되는지 확인 => 실행이 안된다면 DAG의 start_date를 확인\n", 449 | " - 실행되서 초록색 불이 들어오길 기도\n", 450 | " - 만약 초록이 아닌 빨간불이면 Task를 클릭해서 View log 클릭\n", 451 | " - " 452 | ] 453 | }, 454 | { 455 | "cell_type": "markdown", 456 | "metadata": { 457 | "slideshow": { 458 | "slide_type": "subslide" 459 | } 460 | }, 461 | "source": [ 462 | "- Airflow BashOperator 사용하기\n", 463 | " - 01-bash_operator.py 참고\n", 464 | " - 앞에서 예제로 보여준 BashOperator 내용을 타이핑해보기 (5분)\n", 465 | " - default_argument에서 start_date는 datetime(2019, 2, 13)\n", 466 | " - DAG의 schedule_interval은 `0 10 * * *` 입력\n", 467 | " - 파일명은 airflow_test.py\n", 468 | " - (따로 설정 안했다면) `~/airflow/dags`에 저장하면 됨\n", 469 | " - dags 폴더가 없다면 생성\n", 470 | " - dags에 airflow_test.py 저장\n", 471 | " - 지금은 간단한 bash command를 사용했지만, bash로 파이썬 파일도 실행할 수 있으니 활용 포인트가 무궁무진함\n", 472 | " - 재실행하고 싶으면 Task 클릭 후 Clear 클릭" 473 | ] 474 | }, 475 | { 476 | "cell_type": "markdown", 477 | "metadata": { 478 | "slideshow": { 479 | "slide_type": "subslide" 480 | } 481 | }, 482 | "source": [ 483 | "- PythonOperator\n", 484 | " - 02-python_operator.py 참고\n", 485 | " - current_date를 받아서 한글로 요일 출력하는 함수 작성하기 (3분)" 486 | ] 487 | }, 488 | { 489 | "cell_type": "markdown", 490 | "metadata": { 491 | "slideshow": { 492 | "slide_type": "skip" 493 | } 494 | }, 495 | "source": [ 496 | "```\n", 497 | "from datetime import datetime\n", 498 | "\n", 499 | "def print_current_date():\n", 500 | " date_kor = [\"월\",\"화\",\"수\",\"목\",\"금\",\"토\",\"일\"]\n", 501 | " date_now = datetime.now().date()\n", 502 | " 
datetime_weeknum = date_now.weekday()\n", 503 | " print(f\"{date_now}는 {date_kor[datetime_weeknum]}요일입니다\")\n", 504 | "```" 505 | ] 506 | }, 507 | { 508 | "cell_type": "markdown", 509 | "metadata": { 510 | "slideshow": { 511 | "slide_type": "subslide" 512 | } 513 | }, 514 | "source": [ 515 | "- PythonOperator(task_id, python_callable, op_args, dag, provide_context, templates_dict)로 사용함\n", 516 | " - task_id는 task의 id(예 : print_current_date)\n", 517 | " - python_callable은 호출할 수 있는 python 함수를 인자로 넣음\n", 518 | " - op_args : callable 함수가 호출될 때 사용할 함수의 인자\n", 519 | " - dag : DAG 정의한 객체 넣으면 됨\n", 520 | " - provide_context : True로 지정하면 Airflow에서 기본적으로 사용되는 keyword arguments 등이 사용 가능하게 됨\n", 521 | " - templates_dict : op_args 등과 비슷하지만 jinja template이 변환됨" 522 | ] 523 | }, 524 | { 525 | "cell_type": "markdown", 526 | "metadata": { 527 | "slideshow": { 528 | "slide_type": "subslide" 529 | } 530 | }, 531 | "source": [ 532 | "- 방금 작성한 PythonOperator의 아쉬운 점\n", 533 | " - 돌아간 Task의 로그를 보면 => 모두 같은 결과가 나옴\n", 534 | " - 언제 실행해도 무조건 datetime.now()를 사용해서 현재 날짜를 사용함\n", 535 | " - 어제 일자에서 이 함수를 실행했다면?\n", 536 | " - 2020-02-13는 목요일입니다가 출력되었을 것\n", 537 | " - 이런 경우 Python Code에서 시간에 대한 컨트롤을 가진 케이스\n", 538 | " - Python Code에서 컨트롤을 가지면 과거 작업을 돌리기 힘듦\n", 539 | " - Airflow에서 Date를 컨트롤하는게 좋음\n", 540 | " - 이럴 때 Airflow에서 제공되는 기본 context 변수 또는 Jinja Template 사용\n", 541 | " - Flask에서 Jinja Template을 사용함" 542 | ] 543 | }, 544 | { 545 | "cell_type": "markdown", 546 | "metadata": { 547 | "slideshow": { 548 | "slide_type": "subslide" 549 | } 550 | }, 551 | "source": [ 552 | "- Airflow의 기본 context 변수 사용하기\n", 553 | " - 03-python_operator_with_context.py 참고\n", 554 | " - PythonOperator에서 provide_context=True일 경우 사용 가능\n", 555 | " - kwargs에 값이 저장됨\n", 556 | " - 예를 들면\n", 557 | " \n", 558 | " ```\n", 559 | " provide_context=True로 지정하면 kwargs에 다양한 값들이 저장됨\n", 560 | " {'dag': ,\n", 561 | " 'ds': '2020-02-10',\n", 562 | " 'next_ds': '2020-02-11',\n", 563 | " 'next_ds_nodash': '20200211',\n", 564 | " 
'prev_ds': '2020-02-09',\n", 565 | " 'prev_ds_nodash': '20200209',\n", 566 | " 'ds_nodash': '20200210',\n", 567 | " 'ts': '2020-02-10T00:30:00+00:00',\n", 568 | " 'ts_nodash': '20200210T003000',\n", 569 | " 'ts_nodash_with_tz': '20200210T003000+0000',\n", 570 | " 'yesterday_ds': '2020-02-09',\n", 571 | " 'yesterday_ds_nodash': '20200209',\n", 572 | " 'tomorrow_ds': '2020-02-11',\n", 573 | " 'tomorrow_ds_nodash': '20200211',\n", 574 | " 'end_date': '2020-02-10',\n", 575 | " 'execution_date': ...}\n", 576 | " ```" 577 | ] 578 | }, 579 | { 580 | "cell_type": "markdown", 581 | "metadata": { 582 | "slideshow": { 583 | "slide_type": "subslide" 584 | } 585 | }, 586 | "source": [ 587 | "- Jinja Template 사용하기\n", 588 | " - 04-python_operator_with_jinja.py 참고\n", 589 | " - `\"{{ ds }}\"` 이런 형태로 사용함 : execution_date\n", 590 | " - PythonOperator는 기본 context 변수 사용이 더 쉽지만, 다른 Operator는 Jinja Template이 편함\n", 591 | " - PythonOperator는 templates_dict에 변수를 넣어서 사용\n", 592 | " - [Macros Default Variables](https://airflow.apache.org/docs/stable/macros.html#default-variables) Document에 정의되어 있음" 593 | ] 594 | }, 595 | { 596 | "cell_type": "markdown", 597 | "metadata": { 598 | "slideshow": { 599 | "slide_type": "subslide" 600 | } 601 | }, 602 | "source": [ 603 | "- Backfill\n", 604 | " - Context Variable이나 Jinja Template을 사용하면 Backfill을 제대로 사용할 수 있음\n", 605 | " - Backfill : 과거 날짜 기준으로 실행\n", 606 | " - airflow backfill -s START_DATE -e END_DATE dag_id\n", 607 | " - 아래 명령어를 입력해보고 Webserver에 가봅시다\n", 608 | " \n", 609 | "```\n", 610 | "airflow backfill -s 2020-01-05 -e 2020-01-10 python_dag_with_jinja\n", 611 | "```" 612 | ] 613 | }, 614 | { 615 | "cell_type": "markdown", 616 | "metadata": { 617 | "slideshow": { 618 | "slide_type": "subslide" 619 | } 620 | }, 621 | "source": [ 622 | "- Airflow로 토이 ETL 파이프라인 만들기\n", 623 | " - 시나리오\n", 624 | " - Google Cloud Storage에 매일 하루에 1번씩 주기적으로 csv 파일이 저장됨\n", 625 | " - csv 파일을 BigQuery에 Load\n", 626 | " - BigQuery에서 쿼리를 돌린 후, 일자별로 사용량 쿼리해서 Table 저장 " 
627 | ] 628 | }, 629 | { 630 | "cell_type": "markdown", 631 | "metadata": { 632 | "slideshow": { 633 | "slide_type": "subslide" 634 | } 635 | }, 636 | "source": [ 637 | "- 설정 확인\n", 638 | " - APIs & Services - Create Credentials - Service account\n", 639 | " - Service account permissions에서 BigQuery Admin, Storage Admin \n", 640 | " - Create key (optional) 밑에 있는 CREATE KEY 클릭\n", 641 | " - JSON 선택하고 CREATE\n", 642 | " - 다운로드된 project_name-123123.json 확인\n", 643 | " - 이 Key는 매우 중요하니 꼭 잘 보관!!!(유출시 피해가 큼. 잘 모르면 그냥 삭제 추천)\n", 644 | " - " 645 | ] 646 | }, 647 | { 648 | "cell_type": "markdown", 649 | "metadata": { 650 | "slideshow": { 651 | "slide_type": "subslide" 652 | } 653 | }, 654 | "source": [ 655 | "- Airflow Webserver - Admin - connection 이동\n", 656 | " - Conn Id가 google_cloud_default 찾기\n", 657 | " - 왼쪽에 있는 연필 버튼 클릭\n", 658 | " - Project Id에 자신의 프로젝트 ID 입력(name 아님!)\n", 659 | " - [Google Cloud Platform Console](https://console.cloud.google.com/)에 있음\n", 660 | " - Keyfile JSON에 아까 위에서 만든 JSON key 내용 통째로 복사해서 붙여넣기\n", 661 | " - Scopes는 https://www.googleapis.com/auth/cloud-platform 입력\n", 662 | " - Save 클릭\n", 663 | " - " 664 | ] 665 | }, 666 | { 667 | "cell_type": "markdown", 668 | "metadata": { 669 | "slideshow": { 670 | "slide_type": "subslide" 671 | } 672 | }, 673 | "source": [ 674 | "- Google Cloud Storage에 데이터 업로드\n", 675 | " - [Google Cloud Storage](https://console.cloud.google.com/storage)로 이동\n", 676 | " - CREATE BUCKET(이미 있다면 그거 사용해도 무방) 클릭\n", 677 | " - 지역은 그냥 Region, us-east1하고 나머지 그냥 다 Continue 클릭\n", 678 | " - 전 kyle-school bucket 만듬\n", 679 | " - https://github.com/zzsza/kyle-school/tree/master/week6/data 데이터 다운!\n", 680 | " - bike_data_20200209 ~ bike_data_20200212.csv\n", 681 | " - 방금 만든 Bucket의 data 폴더 안에 방금 받은 파일 업로드\n", 682 | "- GoogleCloudStorageToBigQueryOperator, BigQueryOperator 사용할 예정" 683 | ] 684 | }, 685 | { 686 | "cell_type": "markdown", 687 | "metadata": { 688 | "slideshow": { 689 | "slide_type": "subslide" 690 | } 691 | }, 692 | 
"source": [ 693 | "- GoogleCloudStorageToBigQueryOperator\n", 694 | " - 05-simple_etl.py 참고\n", 695 | " - file명은 일정한 특징이 있음. bike_data_{date}.csv\n", 696 | " - schema_object : 스키마가 어떤 이름을 갖고, 어떤 타입인지 정의해둔 json \n", 697 | " - bucket : 우리가 만든 bucket\n", 698 | " - source_objects = Source Data\n", 699 | " - destination_project_dataset_table : 저장할 경로\n", 700 | " - 지금 bike_data_{date} Table 형태로 저장하는데, 이 형태는 샤딩으로 저장한 Table" 701 | ] 702 | }, 703 | { 704 | "cell_type": "markdown", 705 | "metadata": { 706 | "slideshow": { 707 | "slide_type": "subslide" 708 | } 709 | }, 710 | "source": [ 711 | "- BigQueryOperator\n", 712 | " - 간단히 생각하면 쿼리를 날려서 Table에 저장\n", 713 | " - agg_query에 간단한 쿼리 작성함\n", 714 | " - BigQueryOperator는 destination_dataset_table만 잘 정의하면 됨" 715 | ] 716 | }, 717 | { 718 | "cell_type": "markdown", 719 | "metadata": { 720 | "slideshow": { 721 | "slide_type": "subslide" 722 | } 723 | }, 724 | "source": [ 725 | "- Webserver에서 이런 오류가 발생한다면\n", 726 | " - No module named 'googleapiclient' \n", 727 | " \n", 728 | " ```\n", 729 | " pip3 install --upgrade google-api-python-client\n", 730 | " ```\n", 731 | "\n", 732 | " - No module named 'airflow.gcp'\n", 733 | " \n", 734 | " ```\n", 735 | " pip3 install 'apache-airflow[gcp]'==1.10.3\n", 736 | " ```\n", 737 | " \n", 738 | " - 아마 웹서버쪽에서 werkzeug 다시 설치해야할 수 있음\n", 739 | " \n", 740 | " ```\n", 741 | " pip3 install werkzeug==0.15.1\n", 742 | " ```" 743 | ] 744 | }, 745 | { 746 | "cell_type": "markdown", 747 | "metadata": { 748 | "slideshow": { 749 | "slide_type": "subslide" 750 | } 751 | }, 752 | "source": [ 753 | "- Airflow Local에서 실행할 경우\n", 754 | " - SequentialExecutor : 동시에 1개만 처리 가능\n", 755 | " - DB : sqlite\n", 756 | " - 병렬로 돌리기 위해 postegre, redis 등을 붙임\n", 757 | " - 설치가 매우 복잡하고 까다롭기 때문에 Docker 등을 활용하면 좋음\n", 758 | " - 보통 회사엔 공용 Airflow가 띄워져 있고, 그걸 활용함\n", 759 | " - 바닥부터 해봤으니, Airflow 단순히 사용하는 것은 꽤 익숙할 것이라 생각\n", 760 | " - Jupyter Notebook으로 DAG 폴더를 연결하면, 친숙하게 사용할 수 있음" 761 | ] 762 | }, 763 | { 764 | "cell_type": 
"markdown", 765 | "metadata": { 766 | "slideshow": { 767 | "slide_type": "subslide" 768 | } 769 | }, 770 | "source": [ 771 | "- Already running on PID XXXX Error가 발생할 경우\n", 772 | " - Webserver가 제대로 종료되지 않은 상황\n", 773 | "\n", 774 | " ```\n", 775 | " kill -9 $(lsof -t -i:8080)\n", 776 | " ```" 777 | ] 778 | }, 779 | { 780 | "cell_type": "markdown", 781 | "metadata": { 782 | "slideshow": { 783 | "slide_type": "subslide" 784 | } 785 | }, 786 | "source": [ 787 | "### 참고 자료\n", 788 | "- [Apache Airflow - Workflow 관리 도구(1)](https://zzsza.github.io/data/2018/01/04/airflow-1/)\n", 789 | "- [Awesome Apache Airflow Github](https://github.com/jghoman/awesome-apache-airflow)\n", 790 | "- [BigQuery non-partition Table을 partition Table로 옮기기](https://zzsza.github.io/gcp/2020/02/11/bigquery_query_to_partition_table/) : 아까 배운 샤딩 말고, 어떻게 하는지 확인해보기" 791 | ] 792 | }, 793 | { 794 | "cell_type": "markdown", 795 | "metadata": {}, 796 | "source": [ 797 | "### 다음 카일 스쿨\n", 798 | "- 각종 개발 도구들 & MLOps 개론\n", 799 | " - 설문을 해주시면, 커리큘럼 개선에 도움이 됩니다\n", 800 | " - 지금까지 받은 설문은 다 받고 고민하고 있습니다\n", 801 | "- 참고 : 3월 13일은 데이터 그룹 워크샵으로 카일 스쿨이 없습니다" 802 | ] 803 | } 804 | ], 805 | "metadata": { 806 | "celltoolbar": "Slideshow", 807 | "kernelspec": { 808 | "display_name": "Python3.6", 809 | "language": "python", 810 | "name": "python3" 811 | }, 812 | "language_info": { 813 | "codemirror_mode": { 814 | "name": "ipython", 815 | "version": 3 816 | }, 817 | "file_extension": ".py", 818 | "mimetype": "text/x-python", 819 | "name": "python", 820 | "nbconvert_exporter": "python", 821 | "pygments_lexer": "ipython3", 822 | "version": "3.6.5" 823 | }, 824 | "varInspector": { 825 | "cols": { 826 | "lenName": 16, 827 | "lenType": 16, 828 | "lenVar": 40 829 | }, 830 | "kernels_config": { 831 | "python": { 832 | "delete_cmd_postfix": "", 833 | "delete_cmd_prefix": "del ", 834 | "library": "var_list.py", 835 | "varRefreshCmd": "print(var_dic_list())" 836 | }, 837 | "r": { 838 | "delete_cmd_postfix": ") ", 839 | 
"delete_cmd_prefix": "rm(", 840 | "library": "var_list.r", 841 | "varRefreshCmd": "cat(var_dic_list()) " 842 | } 843 | }, 844 | "types_to_exclude": [ 845 | "module", 846 | "function", 847 | "builtin_function_or_method", 848 | "instance", 849 | "_Feature" 850 | ], 851 | "window_display": false 852 | } 853 | }, 854 | "nbformat": 4, 855 | "nbformat_minor": 2 856 | } 857 | -------------------------------------------------------------------------------- /notebooks/카일-스쿨-7회차-회복탄력성과-git-github.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "## 카일 스쿨 7회차\n", 12 | "- [![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=http%3A%2F%2Fzzsza.github.io%2Fkyle-school%2Fweek7)](https://hits.seeyoufarm.com)\n", 13 | "- #0. 회복 탄력성\n", 14 | "- #1. Git" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": { 20 | "slideshow": { 21 | "slide_type": "slide" 22 | } 23 | }, 24 | "source": [ 25 | "### 0. 회복 탄력성\n", 26 | "- 회복 탄력성이란?\n", 27 | " - resilience의 번역\n", 28 | " - 심리학, 정신의학, 간호학, 교육학, 유아교육, 사회학 등 다양한 분야에서 연구되는 개념\n", 29 | " - 다양한 역경, 시련, 실패에 대한 인식을 도약의 발판으로 삼아 더 높이 뛰어오르는 마음\n", 30 | " - 실패한 상황을 어떻게 인식하냐에 따라 달라진다는 개념\n", 31 | " " 32 | ] 33 | }, 34 | { 35 | "cell_type": "markdown", 36 | "metadata": { 37 | "slideshow": { 38 | "slide_type": "subslide" 39 | } 40 | }, 41 | "source": [ 42 | "- 회복 탄력성 지수\n", 43 | " - 자기조절력 = 감정조절력 + 충동통제력 + 원인분석력\n", 44 | " - 대인관계력 = 소통 + 공감 + 자아확장\n", 45 | " - 긍정력 = 자아낙관성 + 생활만족도 + 감사\n", 46 | " " 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": { 52 | "slideshow": { 53 | "slide_type": "subslide" 54 | } 55 | }, 56 | "source": [ 57 | "- 앞 부분은 교과서에서 나올 내용이고, 조금 더 현실적인 이야기\n", 58 | " - 인생에선 많은 역경과 고난이 옴\n", 59 | " - 많은 사람들이 실패와 성공을 반복함\n", 60 | " - 그럴 때 자신의 멘탈을 어떻게 지킬 수 있을지? 
This question came up a lot during Kyle Time\n", 61 | " " 62 | ] 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "metadata": { 67 | "slideshow": { 68 | "slide_type": "subslide" 69 | } 70 | }, 71 | "source": [ 72 | "- When was the last time you felt hurt?\n", 73 | " - When feedback on a presentation was poor\n", 74 | " - When I share an article I hope people will read and they don't\n", 75 | "- These moments are hard, but I analyze the cause\n", 76 | " - Bad presentation feedback? Then improve it, OK\n", 77 | " - Why isn't anyone reading what I share? How could I pitch it so more people read it? And if they still don't, so be it~" 78 | ] 79 | }, 80 | { 81 | "cell_type": "markdown", 82 | "metadata": { 83 | "slideshow": { 84 | "slide_type": "subslide" 85 | } 86 | }, 87 | "source": [ 88 | "- Key point\n", 89 | " - Whatever happens, make a promise with yourself in advance! A roly-poly mindset that bounces back quickly\n", 90 | " - Also related to self-esteem and burnout" 91 | ] 92 | }, 93 | { 94 | "cell_type": "markdown", 95 | "metadata": { 96 | "slideshow": { 97 | "slide_type": "subslide" 98 | } 99 | }, 100 | "source": [ 101 | "- In addition\n", 102 | " - Exercise\n", 103 | " - Play games (KartRider)\n", 104 | " - Try things you don't usually do\n", 105 | " - Use varied experiences to find out what you enjoy => then do that\n", 106 | " - Eating good food helps too\n", 107 | " - Everyone has a remedy that fits their own situation" 108 | ] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": { 113 | "slideshow": { 114 | "slide_type": "subslide" 115 | } 116 | }, 117 | "source": [ 118 | "- Understand yourself well, and promise yourself that you will recover well\n", 119 | " - This is our first time living, so everything is unfamiliar and hard\n", 120 | " - It is hard, but let's keep moving forward" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "metadata": { 126 | "slideshow": { 127 | "slide_type": "subslide" 128 | } 129 | }, 130 | "source": [ 131 | "- Recommended reading\n", 132 | " - [회복탄력성 (book)](http://www.yes24.com/Product/Goods/71743513)\n", 133 | " - [개발정글에 떨어진 고슴도치는 어떻게 살고 있을까](https://www.slideshare.net/junekim5030927/ss-138682097)\n", 134 | " " 135 | ] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "metadata": { 140 | "slideshow": { 141 | "slide_type": "slide" 142 | } 143 | }, 144 | "source": [ 145 | "### 1. 
Git\n", 146 | "- Git이란?\n", 147 | " - 언제 쓰는가?\n", 148 | " - 학교에서 팀플에서 많이 겪는 이야기\n", 149 | " - 내가 피피티 앞쪽 수정할게~\n", 150 | " - ㅇㅇㅋ\n", 151 | " - 근데 음성 파일 깨졌네?\n", 152 | " - 발표 오마이갓?\n", 153 | " - 이런 경우 개발에선 Git을 씀\n", 154 | " - 버전관리, 협업을 위한 용도\n", 155 | " " 156 | ] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "metadata": { 161 | "slideshow": { 162 | "slide_type": "subslide" 163 | } 164 | }, 165 | "source": [ 166 | "- Git\n", 167 | " - 특정 시점에 체크인을 하고\n", 168 | " - 원하는 시점으로 순간이동! (체크아웃)\n", 169 | " - USB가 아닌 저장 공간만 있다면 가능!\n", 170 | " - 이거 다 만들었어~ 합치자 \n", 171 | " - 어? 에러나는데 저번꺼로 가자!" 172 | ] 173 | }, 174 | { 175 | "cell_type": "markdown", 176 | "metadata": { 177 | "slideshow": { 178 | "slide_type": "subslide" 179 | } 180 | }, 181 | "source": [ 182 | "- Github란?\n", 183 | " - Git으로 관리하는 프로젝트를 올려둘 수 있는 곳\n", 184 | " - Github, Gitlab, Bitbucket 등\n", 185 | " - 공간의 제약 없이 협업 가능\n", 186 | " - 오픈소스도 대부분 Github나 Gitlab에서 진행\n", 187 | " - Microsoft가 인수하고 매우 좋아지는 중(기존 유료 플랜이 무료로 전환되는 등)" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": { 193 | "slideshow": { 194 | "slide_type": "subslide" 195 | } 196 | }, 197 | "source": [ 198 | "- 정리하면\n", 199 | " - Git은 버전관리를 위한 도구\n", 200 | " - Github는 Git을 호스팅해주는 플랫폼" 201 | ] 202 | }, 203 | { 204 | "cell_type": "markdown", 205 | "metadata": { 206 | "slideshow": { 207 | "slide_type": "subslide" 208 | } 209 | }, 210 | "source": [ 211 | "- 1) Git 설치\n", 212 | " - 맥\n", 213 | " - brew로 설치하면 편함\n", 214 | " - brew가 없다면\n", 215 | " \n", 216 | " ```\n", 217 | " /bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)\"\n", 218 | " ```\n", 219 | " \n", 220 | " - brew 설치 후(있으면 바로)\n", 221 | " \n", 222 | " ```\n", 223 | " brew install git\n", 224 | " ```\n", 225 | " - 윈도우\n", 226 | " - [Git 홈페이지](https://git-scm.com/)에서 다운" 227 | ] 228 | }, 229 | { 230 | "cell_type": "markdown", 231 | "metadata": { 232 | "slideshow": { 233 | "slide_type": "subslide" 234 | } 235 | }, 236 | 
"source": [ 237 | "- 2) Git 초기 설정\n", 238 | " - 내 컴퓨터에서 수정한 후, github에 push하려면 github id, password를 계속 입력해야 함\n", 239 | " - 따라서 설정을 1번 해주면 편함\n", 240 | " \n", 241 | " ```\n", 242 | " git config --global user.name \"kyle\"\n", 243 | " git config --global user.email zzsza@naver.com \n", 244 | " ```\n", 245 | " \n", 246 | " - 만약 프로젝트마다 다른 email을 사용하고 싶으면 `--global` 인자를 제거하면 됨\n", 247 | " - `git config --list`로 설정을 확인할 수 있음" 248 | ] 249 | }, 250 | { 251 | "cell_type": "markdown", 252 | "metadata": { 253 | "slideshow": { 254 | "slide_type": "subslide" 255 | } 256 | }, 257 | "source": [ 258 | "- 3) Git에 SSH 공개키 등록\n", 259 | " - SSH란?\n", 260 | " - Secure Shell Protocol, 네트워크 프로토콜 중 하나\n", 261 | " - 컴퓨터끼리 서로 통신할 때, 보안적으로 안전하게 통신하려고 사용하는 프로토콜\n", 262 | " - Private Key와 Public Key가 있음\n", 263 | " - Private는 암호라고 생각하면 되고, 다른 사람에게 노출되면 안됨\n", 264 | " - SSH는 Private Key와 Public Key 쌍을 통해 컴퓨터와 인증함\n", 265 | " - Public Key를 통해 메세지를 암화하고, Private Key로 암호화된 메세지를 복호화함\n", 266 | " - SSH를 등록하면 유저 메일, 이메일을 물어보지 않고 사용 가능" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": { 272 | "slideshow": { 273 | "slide_type": "subslide" 274 | } 275 | }, 276 | "source": [ 277 | "- 참고로 Github는 SSH, HTTPS 2가지가 있음\n", 278 | " - \n", 279 | "- 우리는 ssh-key를 생성하고, Github에 Public Key를 저장할거임\n", 280 | "- 터미널에서 공개키 생성\n", 281 | " ```\n", 282 | " ssh-keygen\n", 283 | " ```" 284 | ] 285 | }, 286 | { 287 | "cell_type": "markdown", 288 | "metadata": { 289 | "slideshow": { 290 | "slide_type": "subslide" 291 | } 292 | }, 293 | "source": [ 294 | "- 엔터를 3번 눌러주면 공개키가 생성됨. 
You can set a passphrase, but I usually don't\n", 295 | "- The key `~/.ssh/id_rsa.pub` now exists\n", 296 | " - Copy the public key printed by the command below\n", 297 | "\n", 298 | " ```\n", 299 | " cat ~/.ssh/id_rsa.pub\n", 300 | " ```\n", 301 | " " 302 | ] 303 | }, 304 | { 305 | "cell_type": "markdown", 306 | "metadata": { 307 | "slideshow": { 308 | "slide_type": "subslide" 309 | } 310 | }, 311 | "source": [ 312 | "- Save this key under Github - Settings - SSH and GPG Keys\n", 313 | " - " 314 | ] 315 | }, 316 | { 317 | "cell_type": "markdown", 318 | "metadata": { 319 | "slideshow": { 320 | "slide_type": "subslide" 321 | } 322 | }, 323 | "source": [ 324 | "### Workflow for uploading code to Github\n", 325 | "- 1) Create a Repository on Github\n", 326 | "- 2) git clone it to your local machine\n", 327 | "- 3) add the changes you made => the files are now staged for upload\n", 328 | "- 4) commit the added files as one unit and write a message\n", 329 | "- 5) push to Github\n", 330 | "- 6) If needed, create a new branch and switch to it\n", 331 | "- 7) You can also move back to a past commit\n" 332 | ] 333 | }, 334 | { 335 | "cell_type": "markdown", 336 | "metadata": { 337 | "slideshow": { 338 | "slide_type": "subslide" 339 | } 340 | }, 341 | "source": [ 342 | "### Git commands\n", 343 | "- clone\n", 344 | " - Copies the files in a remote repository to your local machine\n", 345 | "\n", 346 | " ```\n", 347 | " git clone https://github.com/zzsza/kyle-school.git\n", 348 | " ```" 349 | ] 350 | }, 351 | { 352 | "cell_type": "markdown", 353 | "metadata": { 354 | "slideshow": { 355 | "slide_type": "subslide" 356 | } 357 | }, 358 | "source": [ 359 | "- Let's edit README.md\n", 360 | "- add\n", 361 | " - Adds changes to the Staging Area\n", 362 | " - git add README.md" 363 | ] 364 | }, 365 | { 366 | "cell_type": "markdown", 367 | "metadata": { 368 | "slideshow": { 369 | "slide_type": "subslide" 370 | } 371 | }, 372 | "source": [ 373 | "- commit\n", 374 | " - Finalize the changes and write a message\n", 375 | " - Split commits into meaningful units, each with its own message!\n", 376 | " - git commit -a skips the add step and commits directly (tracked files only)\n", 377 | " - git commit -m \"[modify] README\"" 378 | ] 379 | }, 380 | { 381 | "cell_type": "markdown", 382 | "metadata": { 383 | "slideshow": { 384 | "slide_type": 
"subslide" 385 | } 386 | }, 387 | "source": [ 388 | "- status\n", 389 | " - Branch와 Stage를 출력\n", 390 | " - git status" 391 | ] 392 | }, 393 | { 394 | "cell_type": "markdown", 395 | "metadata": { 396 | "slideshow": { 397 | "slide_type": "subslide" 398 | } 399 | }, 400 | "source": [ 401 | "- log\n", 402 | " - 커밋 로그를 보여줌\n", 403 | " - git log" 404 | ] 405 | }, 406 | { 407 | "cell_type": "markdown", 408 | "metadata": { 409 | "slideshow": { 410 | "slide_type": "subslide" 411 | } 412 | }, 413 | "source": [ 414 | "- push\n", 415 | " - 원격 저장소에 Push\n", 416 | " - git push origin HEAD:master : 현재 branch의 HEAD Commit까지 변경을 origin라는 원격 저장소의 master branch에 push\n", 417 | " - git push" 418 | ] 419 | }, 420 | { 421 | "cell_type": "markdown", 422 | "metadata": { 423 | "slideshow": { 424 | "slide_type": "subslide" 425 | } 426 | }, 427 | "source": [ 428 | "- branch\n", 429 | " - git branch branchname\n", 430 | " - branchname을 생성함\n", 431 | " - git branch <닉네임>/github_test" 432 | ] 433 | }, 434 | { 435 | "cell_type": "markdown", 436 | "metadata": { 437 | "slideshow": { 438 | "slide_type": "subslide" 439 | } 440 | }, 441 | "source": [ 442 | "- checkout\n", 443 | " - git checkout branchname으로 브랜치에 들어감\n", 444 | " - 브랜치가 없는 경우, 생성하면서 체크아웃하고 싶으면 -b 조건을 붙이면 됨\n", 445 | " - git checkout <닉네임>/github_test" 446 | ] 447 | }, 448 | { 449 | "cell_type": "markdown", 450 | "metadata": { 451 | "slideshow": { 452 | "slide_type": "subslide" 453 | } 454 | }, 455 | "source": [ 456 | "- 다시 README를 수정하고 add - commit - push\n", 457 | "- Master와 <닉네임>/github_test의 차이 확인\n", 458 | "- Github에서 commit 로그 확인하기" 459 | ] 460 | }, 461 | { 462 | "cell_type": "markdown", 463 | "metadata": { 464 | "slideshow": { 465 | "slide_type": "subslide" 466 | } 467 | }, 468 | "source": [ 469 | "### Github 첫 실습 : repo 만들어서 master와 branch에 커밋하기\n", 470 | "- 자신의 repo 생성 : first-git이란 이름을 가지는 repo 생성\n", 471 | "- add, commit, push 진행\n", 472 | "- `kyleschool` branch 만들어서 push\n", 473 | "- 슬랙에 URL 제출" 474 | ] 475 | }, 476 | { 477 
| "cell_type": "markdown", 478 | "metadata": { 479 | "slideshow": { 480 | "slide_type": "subslide" 481 | } 482 | }, 483 | "source": [ 484 | "- 파일의 Lifecycle\n", 485 | " - 일부러 이 내용을 말하지 않음\n", 486 | " - Working directory, Staging, Git Directory(repository)에 대해 알면 좋음(이건 모두 Local 환경 이야기)\n", 487 | " - Sourcetree 등에도 사용되는 단어" 488 | ] 489 | }, 490 | { 491 | "cell_type": "markdown", 492 | "metadata": { 493 | "slideshow": { 494 | "slide_type": "subslide" 495 | } 496 | }, 497 | "source": [ 498 | "- Commit : 커밋된 상태, 안전하게 저장된 상태로 remote(원격) 저장소로 이동할 준비\n", 499 | "- Modified : 수정한 파일을 아직 커밋하지 않은 상태\n", 500 | "- Staged : 수정한 파일을 곧 커밋할 것이라 표시한 상태(add 상태)\n", 501 | "- git Directory : 깃의 repo\n", 502 | "- working directory : 프로젝트의 특정 버전을 checkout\n", 503 | "- staging area : 곧 commit 할 것으로 예상되는 공간\n", 504 | " - 이 공간이 있는 이유는 여러 파일을 add하고 한번에 commit이 가능하도록 저장하는 임시 저장\n", 505 | " \n", 506 | "- " 507 | ] 508 | }, 509 | { 510 | "cell_type": "markdown", 511 | "metadata": { 512 | "slideshow": { 513 | "slide_type": "subslide" 514 | } 515 | }, 516 | "source": [ 517 | "- \n", 518 | "- " 519 | ] 520 | }, 521 | { 522 | "cell_type": "markdown", 523 | "metadata": { 524 | "slideshow": { 525 | "slide_type": "slide" 526 | } 527 | }, 528 | "source": [ 529 | "### 협업하기\n", 530 | "- 협업할 땐 어떻게 사용할까?\n", 531 | " - 1) Repo를 Fork해서 push하고 PR => 해당 레포의 관리 권한이 없는 경우(주로 오픈소스)\n", 532 | " - 2) 자체 Repo에서 새로운 브랜치를 따서 push하고 PR => 해당 레포의 관리 권한이 있는 경우(회사에서 이렇게 진행)\n", 533 | " " 534 | ] 535 | }, 536 | { 537 | "cell_type": "markdown", 538 | "metadata": { 539 | "slideshow": { 540 | "slide_type": "subslide" 541 | } 542 | }, 543 | "source": [ 544 | "- 1) Fork하기\n", 545 | " - Github Repo에서 Fork 클릭\n", 546 | " - \n", 547 | " " 548 | ] 549 | }, 550 | { 551 | "cell_type": "markdown", 552 | "metadata": { 553 | "slideshow": { 554 | "slide_type": "subslide" 555 | } 556 | }, 557 | "source": [ 558 | "- 2) Github Branch 전략(Git Flow)\n", 559 | " - git Flow 전략은 소프트웨어의 소스코드를 관리하고 출시(release)하기 위한 브랜치 관리 전략\n", 560 | " - 데이터쪽은 개발과 
useful when building development-related pieces (such as model production). Notebooks are not managed this way\n", 561 | " - \n", 562 | " " 563 | ] 564 | }, 565 | { 566 | "cell_type": "markdown", 567 | "metadata": { 568 | "slideshow": { 569 | "slide_type": "subslide" 570 | } 571 | }, 572 | "source": [ 573 | "- pull\n", 574 | " - Fetches the latest code from the remote repository and merges it\n", 575 | " - Recommended whenever you switch branches. Also check whether there is anything to pull before you commit\n", 576 | " - git pull\n", 577 | " " 578 | ] 579 | }, 580 | { 581 | "cell_type": "markdown", 582 | "metadata": { 583 | "slideshow": { 584 | "slide_type": "subslide" 585 | } 586 | }, 587 | "source": [ 588 | "- fetch\n", 589 | " - Downloads the latest code from the remote repository (merging is a separate step)\n", 590 | " - Lets you see what changed relative to your current code : git diff HEAD origin/master\n", 591 | " - Lets you see how many commits have been made : git log --decorate --all --oneline\n", 592 | " - After merging, the result is the same as git pull : git merge origin/master\n", 593 | " - git fetch" 594 | ] 595 | }, 596 | { 597 | "cell_type": "markdown", 598 | "metadata": { 599 | "slideshow": { 600 | "slide_type": "subslide" 601 | } 602 | }, 603 | "source": [ 604 | "- stash\n", 605 | " - Sometimes you have to pause what you are doing and switch to another branch\n", 606 | " - Committing unfinished work at that point is not ideal. 
A plain checkout may fail with an error\n", 607 | " - stash is the command that temporarily saves unfinished work on a stack\n", 608 | " - You can bring the unfinished work back later without having committed it\n", 609 | " - git stash list shows the list of stashes\n", 610 | " - git stash apply restores the most recent stash\n", 611 | " - git stash apply [stash name] applies that specific stash" 612 | ] 613 | }, 614 | { 615 | "cell_type": "markdown", 616 | "metadata": { 617 | "slideshow": { 618 | "slide_type": "subslide" 619 | } 620 | }, 621 | "source": [ 622 | "- pull request\n", 623 | " - Why use it?\n", 624 | " - To contribute to an open source project you have no Push permission on\n", 625 | " - To get code reviewed\n", 626 | " - If you are a Collaborator, you can create a branch directly in the repository and Push\n", 627 | " " 628 | ] 629 | }, 630 | { 631 | "cell_type": "markdown", 632 | "metadata": { 633 | "slideshow": { 634 | "slide_type": "subslide" 635 | } 636 | }, 637 | "source": [ 638 | "- Once you have committed and pushed to a branch, open the repo page on Github\n", 639 | "- Click Compare & pull request\n", 640 | "- " 641 | ] 642 | }, 643 | { 644 | "cell_type": "markdown", 645 | "metadata": { 646 | "slideshow": { 647 | "slide_type": "subslide" 648 | } 649 | }, 650 | "source": [ 651 | "- Confirm the source and target branches, then add reviewers and a message\n", 652 | "- " 653 | ] 654 | }, 655 | { 656 | "cell_type": "markdown", 657 | "metadata": { 658 | "slideshow": { 659 | "slide_type": "subslide" 660 | } 661 | }, 662 | "source": [ 663 | "### Exercise : the collaboration process\n", 664 | "- Find a typo in the kyle-school repo\n", 665 | "- Fork the kyle-school repo on Github\n", 666 | "- Fix the typo, commit, push, and send a PR!\n" 667 | ] 668 | }, 669 | { 670 | "cell_type": "markdown", 671 | "metadata": { 672 | "slideshow": { 673 | "slide_type": "slide" 674 | } 675 | }, 676 | "source": [ 677 | "### Sourcetree\n", 678 | "- Working from the CLI terminal is fine when there are only a few files\n", 679 | "- But what if many files change or the branches get complicated? 
=> it gets confusing and hard\n", 680 | "- This is where a GUI is convenient\n", 681 | "- Download from this [link](https://www.sourcetreeapp.com/) and install\n", 682 | "- At Registration, click Bitbucket, sign up for Sourcetree, and log in\n", 683 | "- Register your account under Sourcetree - Preference - Accounts" 684 | ] 685 | }, 686 | { 687 | "cell_type": "markdown", 688 | "metadata": { 689 | "slideshow": { 690 | "slide_type": "subslide" 691 | } 692 | }, 693 | "source": [ 694 | "- Main screen\n", 695 | " - \n", 696 | " " 697 | ] 698 | }, 699 | { 700 | "cell_type": "markdown", 701 | "metadata": { 702 | "slideshow": { 703 | "slide_type": "subslide" 704 | } 705 | }, 706 | "source": [ 707 | "- Git clone\n", 708 | " - Clone from URL\n", 709 | " - " 710 | ] 711 | }, 712 | { 713 | "cell_type": "markdown", 714 | "metadata": { 715 | "slideshow": { 716 | "slide_type": "subslide" 717 | } 718 | }, 719 | "source": [ 720 | "- Enter the repo URL and set the destination path\n", 721 | "- \n", 722 | " " 723 | ] 724 | }, 725 | { 726 | "cell_type": "markdown", 727 | "metadata": { 728 | "slideshow": { 729 | "slide_type": "subslide" 730 | } 731 | }, 732 | "source": [ 733 | "- Or run git clone in the terminal first and then link that repo\n", 734 | " - Add Existing Local Repository - select the folder and confirm\n", 735 | " - " 736 | ] 737 | }, 738 | { 739 | "cell_type": "markdown", 740 | "metadata": { 741 | "slideshow": { 742 | "slide_type": "subslide" 743 | } 744 | }, 745 | "source": [ 746 | "- Committing in Sourcetree\n", 747 | " - Click the commit button - tick the checkboxes and commit (this view is No staging, so no separate add is needed)\n", 748 | " - " 749 | ] 750 | }, 751 | { 752 | "cell_type": "markdown", 753 | "metadata": { 754 | "slideshow": { 755 | "slide_type": "subslide" 756 | } 757 | }, 758 | "source": [ 759 | "- Selecting Split view staging also shows the staging area (that is, the add step)\n", 760 | " - " 761 | ] 762 | }, 763 | { 764 | "cell_type": "markdown", 765 | "metadata": { 766 | "slideshow": { 767 | "slide_type": "subslide" 768 | } 769 | }, 770 | "source": [ 771 | "- push\n", 772 | " - After committing, click the push button\n", 773 | "- pull\n", 774 | " - Click the pull button" 775 | ] 776 | }, 777 | { 778 | "cell_type": "markdown", 779 | 
"metadata": { 780 | "slideshow": { 781 | "slide_type": "subslide" 782 | } 783 | }, 784 | "source": [ 785 | "- branch 생성\n", 786 | " - Branch 버튼 클릭하고 새로운 브랜치 이름 작성\n", 787 | " - " 788 | ] 789 | }, 790 | { 791 | "cell_type": "markdown", 792 | "metadata": { 793 | "slideshow": { 794 | "slide_type": "subslide" 795 | } 796 | }, 797 | "source": [ 798 | "- 새로운 branch에서 커밋 - push\n", 799 | " - \n", 800 | " - 그리고 다시 master branch 클릭" 801 | ] 802 | }, 803 | { 804 | "cell_type": "markdown", 805 | "metadata": { 806 | "slideshow": { 807 | "slide_type": "subslide" 808 | } 809 | }, 810 | "source": [ 811 | "- Branch History 보기\n", 812 | " - History 클릭하면 분기가 보임\n", 813 | " - \n", 814 | " " 815 | ] 816 | }, 817 | { 818 | "cell_type": "markdown", 819 | "metadata": { 820 | "slideshow": { 821 | "slide_type": "slide" 822 | } 823 | }, 824 | "source": [ 825 | "### Github Blog 만들기 실습\n", 826 | "- [카일 블로그](https://github.com/zzsza/zzsza.github.io) Fork\n", 827 | "- Settings - Repository name에 zzsza을 현재 계정명(여기선 socar-kyle)으로 변경 : socar-kyle.github.io\n", 828 | " - 단, 기존에 socar-kyle.github.io가 있으면 삭제하고 진행해야 함\n", 829 | "- 조금 기다리면 socar-kyle.github.io로 접근 가능" 830 | ] 831 | }, 832 | { 833 | "cell_type": "markdown", 834 | "metadata": { 835 | "slideshow": { 836 | "slide_type": "subslide" 837 | } 838 | }, 839 | "source": [ 840 | "- 현재 카일이 작성한 글이랑 테마가 저장됨 => 지워줍시다\n", 841 | "- `_config.yml`, `_data`, `_featured_categories`, `_featured_tags`, `about.md` 내용 수정\n", 842 | "- `favicon.ico`, `tile-wide.png`, `tile.png` 원하는 이미지로 설정" 843 | ] 844 | }, 845 | { 846 | "cell_type": "markdown", 847 | "metadata": { 848 | "slideshow": { 849 | "slide_type": "subslide" 850 | } 851 | }, 852 | "source": [ 853 | "- 사실 더 이야기하고 싶은 내용\n", 854 | " - rebase, merge\n", 855 | " - 꼬였을 때 푸는 방식 등\n", 856 | " - 이건 추후에.. 
\n", 857 | "- 참고 자료\n", 858 | " - [팀 개발을 위한 Git, GitHub 시작하기](http://www.yes24.com/Product/Goods/85382769) 책\n", 859 | " - [git workflow](https://blog.osteele.com/2008/05/my-git-workflow/)\n", 860 | " - [git-a-little-tale](https://papadako.github.io/git-a-little-tale/#)\n", 861 | " - [A successful Git branching model](https://nvie.com/posts/a-successful-git-branching-model/)\n", 862 | "- 다음 주\n", 863 | " - Github Action, Test Code, CI/CD " 864 | ] 865 | } 866 | ], 867 | "metadata": { 868 | "celltoolbar": "Slideshow", 869 | "kernelspec": { 870 | "display_name": "Python 3", 871 | "language": "python", 872 | "name": "python3" 873 | }, 874 | "language_info": { 875 | "codemirror_mode": { 876 | "name": "ipython", 877 | "version": 3 878 | }, 879 | "file_extension": ".py", 880 | "mimetype": "text/x-python", 881 | "name": "python", 882 | "nbconvert_exporter": "python", 883 | "pygments_lexer": "ipython3", 884 | "version": "3.7.4" 885 | }, 886 | "varInspector": { 887 | "cols": { 888 | "lenName": 16, 889 | "lenType": 16, 890 | "lenVar": 40 891 | }, 892 | "kernels_config": { 893 | "python": { 894 | "delete_cmd_postfix": "", 895 | "delete_cmd_prefix": "del ", 896 | "library": "var_list.py", 897 | "varRefreshCmd": "print(var_dic_list())" 898 | }, 899 | "r": { 900 | "delete_cmd_postfix": ") ", 901 | "delete_cmd_prefix": "rm(", 902 | "library": "var_list.r", 903 | "varRefreshCmd": "cat(var_dic_list()) " 904 | } 905 | }, 906 | "types_to_exclude": [ 907 | "module", 908 | "function", 909 | "builtin_function_or_method", 910 | "instance", 911 | "_Feature" 912 | ], 913 | "window_display": false 914 | } 915 | }, 916 | "nbformat": 4, 917 | "nbformat_minor": 2 918 | } 919 | -------------------------------------------------------------------------------- /notebooks/카일-스쿨-8회차-Github-Action.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 
8 | } 9 | }, 10 | "source": [ 11 | "## Kyle School Session 8\n", 12 | "- [![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=http%3A%2F%2Fzzsza.github.io%2Fkyle-school%2Fweek8)](https://hits.seeyoufarm.com)\n", 13 | "- #1. Github Action\n", 14 | " - Your first Github Action\n", 15 | " - A Github Action that crawls the 40 newest IT books on YES24 and posts them to an Issue" 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": { 21 | "slideshow": { 22 | "slide_type": "slide" 23 | } 24 | }, 25 | "source": [ 26 | "### 1. Github Action\n", 27 | "- [Official site](https://github.com/features/actions), [official documentation](https://help.github.com/en/actions) \n", 28 | "- A tool that helps automate software workflows" 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "metadata": { 34 | "slideshow": { 35 | "slide_type": "subslide" 36 | } 37 | }, 38 | "source": [ 39 | "- Typical workflow examples\n", 40 | "- 1) Test Code\n", 41 | " - ex) Is df of type pd.DataFrame?\n", 42 | " - ex) Does value1 hold a specific value?\n", 43 | " - ex) Does the test code for a given function pass?\n", 44 | " - Running a query and checking the data for consistency can also be seen as a kind of test" 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "metadata": { 50 | "slideshow": { 51 | "slide_type": "subslide" 52 | } 53 | }, 54 | "source": [ 55 | "- Typical workflow examples\n", 56 | "- 2) Deployment\n", 57 | " - Where does the code run?\n", 58 | " - On a server!\n", 59 | " - How do you get the code onto the server?\n", 60 | " - Simple approach : send the source code as-is with scp\n", 61 | " - Doing this by hand means => git clone the source, checkout, compress, scp, decompress, restart the running program..\n", 62 | " - Advanced approach : deploy to Kubernetes" 63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": { 68 | "slideshow": { 69 | "slide_type": "subslide" 70 | } 71 | }, 72 | "source": [ 73 | "- Github Action is a tool that helps automate these tasks\n", 74 | "- It is integrated with Github, so it can be configured to run when specific events occur\n", 75 | " - e.g. on a commit, on creating a new branch, on creating a tag\n", 76 | "- For Public repos there is no extra server cost\n", 77 | "- You can browse the available actions in the [Github Marketplace](https://github.com/marketplace?type=actions)\n", 78 | " " 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | 
"metadata": { 84 | "slideshow": { 85 | "slide_type": "subslide" 86 | } 87 | }, 88 | "source": [ 89 | "- 가격\n", 90 | " - Public repo : 무료\n", 91 | " - Private repo : [링크](https://help.github.com/en/github/setting-up-and-managing-billing-and-payments-on-github/about-billing-for-github-actions) 참고, 한달에 500MB 스토리지와 실행 시간 2,000분이 무료로 제공됨(무료 계정의 경우)" 92 | ] 93 | }, 94 | { 95 | "cell_type": "markdown", 96 | "metadata": { 97 | "slideshow": { 98 | "slide_type": "subslide" 99 | } 100 | }, 101 | "source": [ 102 | "- 사용할 수 있는 한도\n", 103 | " - Workflow는 하나의 Repo에 최대 20개까지 등록할 수 있음\n", 104 | " - Workflow 안에 존재하는 Job은 6시간동안 실행될 수 있고, 초과시 자동으로 중지됨\n", 105 | " - 동시에 실행할 수 있는 Job 개수가 정해짐" 106 | ] 107 | }, 108 | { 109 | "cell_type": "markdown", 110 | "metadata": { 111 | "slideshow": { 112 | "slide_type": "subslide" 113 | } 114 | }, 115 | "source": [ 116 | "- Github Action 사용하는 방식\n", 117 | " - 1) 코드 작성\n", 118 | " - 2) 코드 작성 후, 진행할 workflow 정의(Test Code, 배포, 단순 작업 등)\n", 119 | " - 3) 정상 작동하는지 Test" 120 | ] 121 | }, 122 | { 123 | "cell_type": "markdown", 124 | "metadata": { 125 | "slideshow": { 126 | "slide_type": "subslide" 127 | } 128 | }, 129 | "source": [ 130 | "- Workflow 시작하기\n", 131 | " - 기본적인 방법 : `.github/worfklows` 폴더 안에 `.yml` 파일을 생성 => 템플릿이 있어요\n", 132 | " - Github Repo에서 Actions 클릭\n", 133 | " - " 134 | ] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "metadata": { 139 | "slideshow": { 140 | "slide_type": "subslide" 141 | } 142 | }, 143 | "source": [ 144 | "- " 145 | ] 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "metadata": { 150 | "slideshow": { 151 | "slide_type": "subslide" 152 | } 153 | }, 154 | "source": [ 155 | "- Set up this workflow 클릭하면 간단한 workflow 생성할 수 있음\n", 156 | " - " 157 | ] 158 | }, 159 | { 160 | "cell_type": "markdown", 161 | "metadata": { 162 | "slideshow": { 163 | "slide_type": "subslide" 164 | } 165 | }, 166 | "source": [ 167 | "- Workflow 구성\n", 168 | " - 특정 이벤트 발생시 실행\n", 169 | " - Jobs에 Job들을 정의하고, 여러 Step을 정의할 수 있음(마치 Airflow와 비슷한 
느낌)\n", 170 | " - Steps에서 커맨드 실행 또는 이미 만들어진 액션을 사용할 수 있음" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": { 176 | "slideshow": { 177 | "slide_type": "subslide" 178 | } 179 | }, 180 | "source": [ 181 | "- yml 파일 구성\n", 182 | " - name : workflow의 구성\n", 183 | " - on : 언제 실행할 것인지? \n", 184 | " - push - master : master에 push될 경우\n", 185 | " - pul_requests - master : master에 pull requests가 올 경우\n", 186 | " - Jobs 인자 안에 여러 Job들이 존재\n", 187 | " - Jobs를 어디서 실행할 지는 runs-on에 지정\n", 188 | " - Job들은 name, run으로 구성" 189 | ] 190 | }, 191 | { 192 | "cell_type": "markdown", 193 | "metadata": { 194 | "slideshow": { 195 | "slide_type": "subslide" 196 | } 197 | }, 198 | "source": [ 199 | "- 이벤트 트리거는 언제 가능할까?\n", 200 | " - 1) Push, Pull Request\n", 201 | " - 2) Crontab처럼 반복적인 상황\n", 202 | " - 3) REST API를 호출시 실행" 203 | ] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "metadata": { 208 | "slideshow": { 209 | "slide_type": "slide" 210 | } 211 | }, 212 | "source": [ 213 | "### 첫 Github Action\n", 214 | "- Github Repo 생성! 
Any name will do\n", 215 | "- Push a hello.py file\n", 216 | " - Contents : print(\"hello world\")" 217 | ] 218 | }, 219 | { 220 | "cell_type": "markdown", 221 | "metadata": { 222 | "slideshow": { 223 | "slide_type": "subslide" 224 | } 225 | }, 226 | "source": [ 227 | "- Click Actions - New workflow - show more under Continuous integration workflows - Python application - Set up the workflow\n", 228 | "- " 229 | ] 230 | }, 231 | { 232 | "cell_type": "markdown", 233 | "metadata": { 234 | "slideshow": { 235 | "slide_type": "subslide" 236 | } 237 | }, 238 | "source": [ 239 | "- Remove the pip install and pytest parts below, and change it to run python3 hello.py\n", 240 | " - Only a small change to the template code\n", 241 | " - \n", 242 | " - python3 hello.py runs on several Python versions" 243 | ] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "metadata": { 248 | "slideshow": { 249 | "slide_type": "subslide" 250 | } 251 | }, 252 | "source": [ 253 | "- Debugging\n", 254 | " - At the time this was written, Github Action had no run-now feature\n", 255 | " - So test with a trigger on push to the Master branch first, then switch the definition to Cron\n", 256 | " - Note that scheduled (Crontab) runs seem to start with a slight delay" 257 | ] 258 | }, 259 | { 260 | "cell_type": "markdown", 261 | "metadata": { 262 | "slideshow": { 263 | "slide_type": "subslide" 264 | } 265 | }, 266 | "source": [ 267 | "- push\n", 268 | " - A yellow indicator means the job is currently running\n", 269 | " - " 270 | ] 271 | }, 272 | { 273 | "cell_type": "markdown", 274 | "metadata": { 275 | "slideshow": { 276 | "slide_type": "subslide" 277 | } 278 | }, 279 | "source": [ 280 | "- Click the yellow icon (or the green check) to see the job status\n", 281 | " - " 282 | ] 283 | }, 284 | { 285 | "cell_type": "markdown", 286 | "metadata": { 287 | "slideshow": { 288 | "slide_type": "subslide" 289 | } 290 | }, 291 | "source": [ 292 | "- Click Details\n", 293 | " - Shows the detailed job output\n", 294 | " - " 295 | ] 296 | }, 297 | { 298 | "cell_type": "markdown", 299 | "metadata": { 300 | "slideshow": { 301 | "slide_type": "slide" 302 | } 303 | }, 304 | "source": [ 305 | "### A Github Action that crawls the 40 newest IT books on YES24 and posts them to an Issue\n", 306 | "- " 307 | ] 308 | }, 309 | { 310 | 
"cell_type": "markdown", 311 | "metadata": { 312 | "slideshow": { 313 | "slide_type": "subslide" 314 | } 315 | }, 316 | "source": [ 317 | "- " 318 | ] 319 | }, 320 | { 321 | "cell_type": "markdown", 322 | "metadata": { 323 | "slideshow": { 324 | "slide_type": "subslide" 325 | } 326 | }, 327 | "source": [ 328 | "- (복습) Github Action 사용하는 방식\n", 329 | " - 1) 코드 작성\n", 330 | " - 2) 코드 작성 후, 진행할 workflow 정의(Test Code, 배포, 단순 작업 등)\n", 331 | " - 3) 정상 작동하는지 Test" 332 | ] 333 | }, 334 | { 335 | "cell_type": "markdown", 336 | "metadata": { 337 | "slideshow": { 338 | "slide_type": "subslide" 339 | } 340 | }, 341 | "source": [ 342 | "- YES24 Github Action 사용하는 방식\n", 343 | " - 1) 코드 작성 : 크롤링 코드 + Issue 업로드 코드 작성\n", 344 | " - 2) 코드 작성 후, 진행할 workflow 정의 : 크롤링 코드 실행\n", 345 | " - 3) 정상 작동하는지 Test\n", 346 | " - 추가 작업 : Github Issue에 글 올리기 위해 Key가 필요함" 347 | ] 348 | }, 349 | { 350 | "cell_type": "markdown", 351 | "metadata": { 352 | "slideshow": { 353 | "slide_type": "subslide" 354 | } 355 | }, 356 | "source": [ 357 | "- 1) 크롤링 코드 설명" 358 | ] 359 | }, 360 | { 361 | "cell_type": "markdown", 362 | "metadata": { 363 | "slideshow": { 364 | "slide_type": "subslide" 365 | } 366 | }, 367 | "source": [ 368 | "- 2) workflow 정의\n", 369 | " ```\n", 370 | " name: yes24_crawler\n", 371 | "\n", 372 | " on:\n", 373 | " schedule:\n", 374 | " - cron: '0 0 * * *'\n", 375 | "\n", 376 | " jobs:\n", 377 | " build:\n", 378 | " runs-on: ubuntu-latest\n", 379 | " steps:\n", 380 | " - uses: actions/checkout@v2\n", 381 | " - name: Set up Python\n", 382 | " uses: actions/setup-python@v2\n", 383 | " with:\n", 384 | " python-version: 3.7\n", 385 | " - name: Install dependencies\n", 386 | " run: |\n", 387 | " python -m pip install --upgrade pip\n", 388 | " if [ -f requirements.txt ]; then pip install -r requirements.txt; fi\n", 389 | " - name: Run main.py\n", 390 | " run: |\n", 391 | " python main.py\n", 392 | " env:\n", 393 | " MY_GITHUB_TOKEN: ${{ secrets.MY_GITHUB_TOKEN }}\n", 394 | " ```\n", 395 | " " 
396 | ] 397 | }, 398 | { 399 | "cell_type": "markdown", 400 | "metadata": { 401 | "slideshow": { 402 | "slide_type": "subslide" 403 | } 404 | }, 405 | "source": [ 406 | "- What is MY_GITHUB_TOKEN?\n", 407 | " - Stored as an encrypted secret so that the key is not exposed\n", 408 | " - Used when posting to a Github Issue" 409 | ] 410 | }, 411 | { 412 | "cell_type": "markdown", 413 | "metadata": { 414 | "slideshow": { 415 | "slide_type": "subslide" 416 | } 417 | }, 418 | "source": [ 419 | "- How to register secrets.key\n", 420 | " - [Documentation](https://help.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets)\n", 421 | " - 1) Create the Key\n", 422 | " - Click Settings - Developer settings - Personal access tokens - Generate new token\n", 423 | " - Check the repo and workflow checkboxes, then save\n", 424 | " - Copy the Value\n", 425 | " - 2) Register the Secret Key\n", 426 | " - In the Repository's Settings, click Secrets on the left\n", 427 | " - Click New secret and enter the Name and Value\n", 428 | " - Under Repository access, you can set the access policy\n", 429 | " - In the workflow, reference it as ${{ secrets.NAME }}; in python, read it with os.getenv(variable name)\n", 430 | "- Push to the master branch and test" 431 | ] 432 | } 433 | ], 434 | "metadata": { 435 | "celltoolbar": "Slideshow", 436 | "kernelspec": { 437 | "display_name": "Python 3", 438 | "language": "python", 439 | "name": "python3" 440 | }, 441 | "language_info": { 442 | "codemirror_mode": { 443 | "name": "ipython", 444 | "version": 3 445 | }, 446 | "file_extension": ".py", 447 | "mimetype": "text/x-python", 448 | "name": "python", 449 | "nbconvert_exporter": "python", 450 | "pygments_lexer": "ipython3", 451 | "version": "3.7.4" 452 | }, 453 | "varInspector": { 454 | "cols": { 455 | "lenName": 16, 456 | "lenType": 16, 457 | "lenVar": 40 458 | }, 459 | "kernels_config": { 460 | "python": { 461 | "delete_cmd_postfix": "", 462 | "delete_cmd_prefix": "del ", 463 | "library": "var_list.py", 464 | "varRefreshCmd": "print(var_dic_list())" 465 | }, 466 | "r": { 467 | "delete_cmd_postfix": ") ", 468 | "delete_cmd_prefix": "rm(", 469 | 
"library": "var_list.r", 470 | "varRefreshCmd": "cat(var_dic_list()) " 471 | } 472 | }, 473 | "types_to_exclude": [ 474 | "module", 475 | "function", 476 | "builtin_function_or_method", 477 | "instance", 478 | "_Feature" 479 | ], 480 | "window_display": false 481 | } 482 | }, 483 | "nbformat": 4, 484 | "nbformat_minor": 2 485 | } 486 | -------------------------------------------------------------------------------- /script/mac-setting.sh: -------------------------------------------------------------------------------- 1 | ``` 2 | #!/bin/bash 3 | 4 | ## Install Homebrew & Cask 5 | /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" 6 | brew tap homebrew/cask-versions 7 | 8 | ## Update Homebrew 9 | brew update 10 | 11 | ## Install Python3 12 | brew install python3 13 | 14 | 15 | ## Install Mac Command Line Tools 16 | xcode-select --install 17 | 18 | ## Install Browsers (Google Chrome, Firefox) 19 | brew cask install google-chrome 20 | brew cask install firefox 21 | 22 | ## Install HashiCorp Tools 23 | # brew install terraform 24 | 25 | ## Install Virtualization Tools (Docker) 26 | brew cask install docker 27 | 28 | 29 | ## Install IDEs (Intellij, Pycharm, Visual Studio Code) 30 | brew cask install visual-studio-code 31 | brew cask install intellij-idea 32 | brew cask install pycharm-ce 33 | 34 | # Install important Visual Studio Code Extensions 35 | cat vscode-extensions.txt | xargs -L1 code --install-extension 36 | 37 | ## Install AWS Tools (AWS CLI & SAM CLI) 38 | pip3 --version 39 | curl -O https://bootstrap.pypa.io/get-pip.py 40 | python3 get-pip.py --user 41 | pip3 install awscli --upgrade --user 42 | aws --version 43 | rm get-pip.py 44 | 45 | brew tap aws/tap 46 | brew install aws-sam-cli 47 | sam --version 48 | 49 | ## Install GCP Tools (gcloud) 50 | brew cask install google-cloud-sdk 51 | 52 | 53 | ## Install Developer utilities (Spectacle, Tree, httpie) 54 | brew cask install spectacle 55 | brew install tree 56 
| brew install httpie 57 | 58 | 59 | ## Install Productivity Tools (Slack) 60 | brew install --cask slack 61 | 62 | ## Global Git Config 63 | git config --global push.default current 64 | git config --global core.excludesfile ~/.gitignore 65 | git config --global user.name "" 66 | git config --global user.email 67 | git config --global color.branch auto 68 | git config --global color.diff auto 69 | git config --global color.interactive auto 70 | git config --global color.status auto 71 | git config --global alias.st status 72 | git config --global alias.ci commit 73 | git config --global alias.co checkout 74 | git config --global alias.br branch 75 | git config --global alias.lg "log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit --date=relative" 76 | 77 | 78 | 79 | ## Source zshrc 80 | source ~/.zshrc 81 | ``` 82 | -------------------------------------------------------------------------------- /week10/docker_images/01-simple_notebook/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM jupyter/minimal-notebook 2 | RUN pip install tensorflow 3 | -------------------------------------------------------------------------------- /week10/docker_images/02-simple_notebook_with_entry/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM jupyter/minimal-notebook 2 | RUN pip install tensorflow 3 | 4 | RUN jupyter notebook --generate-config --allow-root -y \ 5 | && echo "c.NotebookApp.password = 'sha1:fee705da7ee3:39094efec15c2bc5f651b88fdd5536685b5fd229'" >> /home/jovyan/.jupyter/jupyter_notebook_config.py 6 | 7 | EXPOSE 8888 8 | 9 | ENTRYPOINT jupyter notebook --allow-root --ip=0.0.0.0 --port=8888 --no-browser 10 | -------------------------------------------------------------------------------- /week10/docker_images/03-simple_notebook_docker_compose/docker-compose.yml: 
-------------------------------------------------------------------------------- 1 | version: '3' # compose file format version 2 | 3 | services: # define the containers 4 | notebook: # the notebook service 5 | image: jupyter/minimal-notebook # Docker image used by the notebook service 6 | container_name: notebook # container name 7 | volumes: # mounts, equivalent to the --volume option 8 | - ./docker-volume:/home/jovyan/workspace 9 | ports: # ports host:container 10 | - 8888:8888 11 | command: 12 | jupyter notebook --allow-root --ip=0.0.0.0 --no-browser -------------------------------------------------------------------------------- /week4/arg_test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | echo "file name is" $0 4 | echo "first arg is" $1 5 | echo "second arg is" $2 6 | echo "all arg: " $@ 7 | 8 | echo "len of arg : " $# -------------------------------------------------------------------------------- /week4/awkfile: -------------------------------------------------------------------------------- 1 | 홍 길동 3324 5/11/96 50354 2 | 임 꺽정 5246 15/9/66 287650 3 | 이 성계 87654 6/20/58 60000 4 | 정 약용 908683 9/40/48 365000 -------------------------------------------------------------------------------- /week4/case_test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | for string in "hi" "HELLO" "WORLD" "hello" "world" "wow" "awesome" "start" "end" "etc"; do 4 | case ${string} in 5 | hello|HELLO) 6 | echo "hello or HELLO : ${string}" ;; 7 | 8 | wo*) 9 | echo "word starting with wo : ${string}" 10 | ;; 11 | a*|end) 12 | echo "word starting with a, or end : ${string}" 13 | ;; 14 | *) 15 | echo "other : ${string}" 16 | ;; 17 | esac 18 | done 19 | -------------------------------------------------------------------------------- /week4/for_loop.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | SEQUENCE=$(seq 0 9) 3 | for i in $SEQUENCE 4 | do 5 | echo "Running loop seq ${i}" 6 | done 7 | 8 | 
-------------------------------------------------------------------------------- /week4/for_loop2.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | ORDER="5 6 7 8 9 4 3 2 1 0" 3 | for i in $ORDER 4 | do 5 | echo "Running loop ${i}" 6 | done 7 | 8 | -------------------------------------------------------------------------------- /week4/for_loop3.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | for ((i=0;i<=9;i++)) 3 | do 4 | echo "Running loop "$i 5 | done 6 | -------------------------------------------------------------------------------- /week4/if_test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | string1=$1 3 | string2=$2 4 | string3="awesome" 5 | 6 | if [ "${string1}" == "${string2}" ]; then 7 | echo "string1 and string2 are the same" 8 | elif [ "${string1}" == "${string3}" ]; then 9 | echo "string1 is awesome" 10 | else 11 | echo "else condition" 12 | fi 13 | -------------------------------------------------------------------------------- /week4/ohin.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | echo "file name is" $0 4 | echo "first arg is" $1 5 | echo "second arg is" $2 6 | echo "all arg: " $@ 7 | 8 | echo "len of arg : " $# 9 | -------------------------------------------------------------------------------- /week4/set_test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -euo pipefail 4 | echo "hi" 5 | 6 | echi hihihi 7 | 8 | echo "hello" 9 | 10 | -------------------------------------------------------------------------------- /week4/single_double_quote.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | BASH_VAR="Bash Script is Amazing" 4 | 5 | # single quote 6 | echo '${BASH_VAR}' 7 | 8 | # double quotes 9 | echo 
"${BASH_VAR}" 10 | -------------------------------------------------------------------------------- /week4/trap_slack.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -euo pipefail 4 | # 종료가 되거나 끝나면 메세지 전송 5 | trap '[ "$?" -eq 0 ] || send_error ${LINENO} ${FUNCNAME}' EXIT 6 | 7 | function send_error { 8 | local lineno=$1 9 | local funname=$2 10 | local current_date=$(date '+%Y-%m-%d %H:%M:%S') 11 | curl 'https://hooks.slack.com/services/yourkey' -d "payload={\"text\": \"[CRON][ERROR] LineNum=${lineno} FunName=${funname}\"}" 12 | } 13 | 14 | function slack_test2 { 15 | echo "now sleep..." 16 | echo "command + c" 17 | sleep 5 18 | } 19 | 20 | slack_test2 -------------------------------------------------------------------------------- /week4/while_loop.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | count=0 4 | while [ ${count} -le 5 ]; do 5 | echo ${count} 6 | count=$(( ${count}+1 )) 7 | done 8 | 9 | -------------------------------------------------------------------------------- /week6/dags/01-bash_operator.py: -------------------------------------------------------------------------------- 1 | from airflow import DAG 2 | from airflow.operators.bash_operator import BashOperator 3 | from datetime import datetime, timedelta 4 | 5 | default_args = { 6 | 'owner': 'kyle', 7 | 'depends_on_past': False, 8 | 'start_date': datetime(2020, 2, 10), 9 | 'email': ['your@mail.com'], 10 | 'email_on_failure': False, 11 | 'email_on_retry': True, 12 | 'retries': 1, 13 | 'retry_delay': timedelta(minutes=5), 14 | } 15 | 16 | dag = DAG('bash_dag', 17 | default_args=default_args, 18 | schedule_interval='@once') 19 | 20 | task1 = BashOperator( 21 | task_id='print_date', 22 | bash_command='date', 23 | dag=dag) 24 | 25 | task2 = BashOperator( 26 | task_id='sleep', 27 | bash_command='sleep 5', 28 | retries=2, 29 | dag=dag) 30 | 31 | task3 = BashOperator( 32 | 
task_id='pwd', 33 | bash_command='pwd', 34 | dag=dag) 35 | 36 | 37 | task1 >> task2 38 | task1 >> task3 39 | -------------------------------------------------------------------------------- /week6/dags/02-python_operator.py: -------------------------------------------------------------------------------- 1 | from airflow import DAG 2 | from airflow.operators.python_operator import PythonOperator 3 | from datetime import datetime, timedelta 4 | 5 | default_args = { 6 | 'owner': 'kyle', 7 | 'depends_on_past': False, 8 | 'start_date': datetime(2020, 2, 10), 9 | 'email': ['your@mail.com'], 10 | 'email_on_failure': False, 11 | 'email_on_retry': True, 12 | 'retries': 1, 13 | 'retry_delay': timedelta(minutes=1), 14 | } 15 | 16 | dag = DAG('python_dag1', 17 | default_args=default_args, 18 | schedule_interval='30 0 * * *') 19 | 20 | 21 | def print_current_date(): 22 | date_kor = ["월", "화", "수", "목", "금", "토", "일"] 23 | date_now = datetime.now().date() 24 | datetime_weeknum = date_now.weekday() 25 | print(f"{date_now}는 {date_kor[datetime_weeknum]}요일입니다") 26 | 27 | 28 | python_task = PythonOperator( 29 | task_id='print_current_date', 30 | python_callable=print_current_date, 31 | dag=dag, 32 | ) 33 | 34 | python_task 35 | 36 | -------------------------------------------------------------------------------- /week6/dags/03-python_operator_with_context.py: -------------------------------------------------------------------------------- 1 | from airflow import DAG 2 | from airflow.operators.python_operator import PythonOperator 3 | from datetime import datetime, timedelta 4 | 5 | default_args = { 6 | 'owner': 'kyle', 7 | 'depends_on_past': False, 8 | 'start_date': datetime(2020, 2, 10), 9 | 'email': ['your@mail.com'], 10 | 'email_on_failure': False, 11 | 'email_on_retry': True, 12 | 'retries': 1, 13 | 'retry_delay': timedelta(minutes=1), 14 | } 15 | 16 | dag = DAG('python_dag_with_context', 17 | default_args=default_args, 18 | schedule_interval='30 0 * * *') 19 | 20 | 21 | def 
print_current_date_provide_context(*args, **kwargs): 22 | """ 23 | With provide_context=True, kwargs is populated with various values 24 | {'dag': <DAG: python_dag_with_context>, 25 | 'ds': '2020-02-10', 26 | 'next_ds': '2020-02-11', 27 | 'next_ds_nodash': '20200211', 28 | 'prev_ds': '2020-02-09', 29 | 'prev_ds_nodash': '20200209', 30 | 'ds_nodash': '20200210', 31 | 'ts': '2020-02-10T00:30:00+00:00', 32 | 'ts_nodash': '20200210T003000', 33 | 'ts_nodash_with_tz': '20200210T003000+0000', 34 | 'yesterday_ds': '2020-02-09', 35 | 'yesterday_ds_nodash': '20200209', 36 | 'tomorrow_ds': '2020-02-11', 37 | 'tomorrow_ds_nodash': '20200211', 38 | 'end_date': '2020-02-10', 39 | 'execution_date': ...} 40 | """ 41 | print(f"kwargs :{kwargs}") 42 | execution_date = kwargs['ds'] 43 | execution_date = datetime.strptime(execution_date, "%Y-%m-%d").date() 44 | date_kor = ["월", "화", "수", "목", "금", "토", "일"] 45 | datetime_weeknum = execution_date.weekday() 46 | print(f"{execution_date}는 {date_kor[datetime_weeknum]}요일입니다") 47 | 48 | 49 | 50 | python_task_context = PythonOperator( 51 | task_id='print_current_date_with_context_variable', 52 | python_callable=print_current_date_provide_context, 53 | provide_context=True, 54 | dag=dag, 55 | ) 56 | 57 | 58 | python_task_context 59 | -------------------------------------------------------------------------------- /week6/dags/04-python_operator_with_jinja.py: -------------------------------------------------------------------------------- 1 | from airflow import DAG 2 | from airflow.operators.python_operator import PythonOperator 3 | from datetime import datetime, timedelta 4 | 5 | default_args = { 6 | 'owner': 'kyle', 7 | 'depends_on_past': False, 8 | 'start_date': datetime(2020, 2, 10), 9 | 'email': ['your@mail.com'], 10 | 'email_on_failure': False, 11 | 'email_on_retry': True, 12 | 'retries': 1, 13 | 'retry_delay': timedelta(minutes=1), 14 | } 15 | 16 | dag = DAG('python_dag_with_jinja', 17 | default_args=default_args, 18 | schedule_interval='30 0 * * *') 19 | 20 | 21 | def 
print_current_date_jinja(*args, **kwargs): 22 | """ 23 | The jinja template (today) is stored in templates_dict, so it can be read from kwargs 24 | """ 25 | execution_date = kwargs.get('templates_dict').get('today', None) 26 | execution_date = datetime.strptime(execution_date, "%Y-%m-%d").date() 27 | date_kor = ["월", "화", "수", "목", "금", "토", "일"] 28 | datetime_weeknum = execution_date.weekday() 29 | print(f"{execution_date}는 {date_kor[datetime_weeknum]}요일입니다") 30 | 31 | 32 | today = "{{ ds }}" 33 | 34 | python_task_jinja = PythonOperator( 35 | task_id='print_current_date_with_jinja', 36 | python_callable=print_current_date_jinja, 37 | provide_context=True, 38 | templates_dict={ 39 | 'today': today, 40 | 41 | }, 42 | dag=dag, 43 | ) 44 | 45 | 46 | python_task_jinja 47 | 48 | -------------------------------------------------------------------------------- /week6/dags/05-simple_etl.py: -------------------------------------------------------------------------------- 1 | from airflow import DAG 2 | from datetime import datetime, timedelta 3 | from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator 4 | from airflow.contrib.operators.bigquery_operator import BigQueryOperator 5 | 6 | # Scenario 7 | # A csv file is saved to Google Cloud Storage once a day 8 | # - Load the csv file into BigQuery 9 | # - Run a query in BigQuery that aggregates usage by date, and save the result to a table 10 | 11 | default_args = { 12 | 'owner': 'kyle', 13 | 'depends_on_past': False, 14 | 'start_date': datetime(2020, 2, 9), 15 | 'email': ['your@mail.com'], 16 | 'email_on_failure': False, 17 | 'email_on_retry': True, 18 | 'retries': 0, 19 | 'retry_delay': timedelta(minutes=1), 20 | 'end_date': datetime(2020, 2, 13), 21 | 'project_id': 'my-project-1541645429744' 22 | } 23 | 24 | dag = DAG('simple_etl_storage_to_bigquery', 25 | default_args=default_args, 26 | schedule_interval='30 0 * * *') 27 | 28 | execution_date = '{{ ds_nodash }}' 29 | 30 | 31 | storage_to_bigquery_task = GoogleCloudStorageToBigQueryOperator( 32 | dag=dag, 33 | 
google_cloud_storage_conn_id='google_cloud_default', 34 | bigquery_conn_id='google_cloud_default', 35 | task_id='storage_to_bigquery', 36 | schema_object='data/bike_schema.json', 37 | bucket='kyle-school', # replace with the name of the bucket you created 38 | source_objects=[f"data/bike_data_{execution_date}.csv"], 39 | source_format='CSV', 40 | destination_project_dataset_table=f'my-project-1541645429744.temp.bike_{execution_date}', # change the leading project_id to your own 41 | write_disposition='WRITE_TRUNCATE', 42 | skip_leading_rows=1 43 | ) 44 | 45 | agg_query = f""" 46 | SELECT 47 | dummy_date, start_station_id, end_station_id, COUNT(bikeid) as cnt 48 | FROM `my-project-1541645429744.temp.bike_{execution_date}` 49 | GROUP BY dummy_date, start_station_id, end_station_id 50 | """ 51 | 52 | query_task = BigQueryOperator( 53 | dag=dag, 54 | task_id="query_to_table", 55 | bigquery_conn_id='google_cloud_default', 56 | sql=agg_query, 57 | use_legacy_sql=False, 58 | write_disposition='WRITE_TRUNCATE', 59 | destination_dataset_table=f"temp.bike_agg_{execution_date}" 60 | ) 61 | 62 | storage_to_bigquery_task >> query_task 63 | -------------------------------------------------------------------------------- /week6/data/bike_data_20200209.csv: -------------------------------------------------------------------------------- 1 | trip_id,subscriber_type,bikeid,start_time,start_station_id,start_station_name,end_station_id,end_station_name,duration_minutes,dummy_date 2 | 21300987,Explorer,647,2019-11-25 13:37:48 UTC,4061,Lakeshore/Austin Hostel,4061,Lakeshore/Austin Hostel,3,2020-02-09 3 | 21300048,Local365,386,2019-11-25 11:43:00 UTC,4057,"6th/Chalmers ",4047,8th/Lavaca,12,2020-02-09 4 | 21300259,Pay-as-you-ride,278,2019-11-25 12:05:23 UTC,4059,Nash Hernandez/East @ RBJ South,4059,Nash Hernandez/East @ RBJ South,54,2020-02-09 5 | 21300571,24 Hour Walk Up Pass,647,2019-11-25 12:47:27 UTC,4061,Lakeshore/Austin Hostel,4061,Lakeshore/Austin Hostel,46,2020-02-09 6 | 21298475,Local365,397,2019-11-25 02:17:26 
UTC,3635,13th/San Antonio,3635,13th/San Antonio,1,2020-02-09 7 | 21301635,Single Trip (Pay-as-you-ride),278,2019-11-25 14:57:55 UTC,4059,Nash Hernandez/East @ RBJ South,4059,Nash Hernandez/East @ RBJ South,28,2020-02-09 8 | 21303009,Single Trip (Pay-as-you-ride),647,2019-11-25 18:10:42 UTC,4061,Lakeshore/Austin Hostel,4061,Lakeshore/Austin Hostel,18,2020-02-09 9 | 21300315,Pay-as-you-ride,226,2019-11-25 12:12:43 UTC,4059,Nash Hernandez/East @ RBJ South,4059,Nash Hernandez/East @ RBJ South,47,2020-02-09 10 | 21300318,Pay-as-you-ride,1932,2019-11-25 12:13:01 UTC,4059,Nash Hernandez/East @ RBJ South,4059,Nash Hernandez/East @ RBJ South,46,2020-02-09 11 | 21301646,Single Trip (Pay-as-you-ride),1932,2019-11-25 14:59:19 UTC,4059,Nash Hernandez/East @ RBJ South,4059,Nash Hernandez/East @ RBJ South,30,2020-02-09 12 | 21299447,Local365,201,2019-11-25 09:55:13 UTC,2569,East 11th/San Marcos,4051,10th/Red River,4,2020-02-09 13 | 21302513,Local365,14264,2019-11-25 16:54:02 UTC,3619,6th/Congress,4048,South Congress @ Bouldin Creek,5,2020-02-09 14 | 21301276,Single Trip (Pay-as-you-ride),152,2019-11-25 14:12:30 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3294,6th/Lavaca,33,2020-02-09 15 | 21301341,Single Trip (Pay-as-you-ride),387,2019-11-25 14:17:42 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3294,6th/Lavaca,28,2020-02-09 16 | 21301329,Single Trip (Pay-as-you-ride),1850,2019-11-25 14:16:46 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3294,6th/Lavaca,29,2020-02-09 17 | 21303308,Single Trip (Pay-as-you-ride),272G,2019-11-25 19:28:00 UTC,3684,Cesar Chavez/Congress,3294,6th/Lavaca,9,2020-02-09 18 | 21302854,Local30,969,2019-11-25 17:42:10 UTC,3686,Sterzing/Barton Springs,4058,Hollow Creek/Barton Hills,6,2020-02-09 19 | 21300042,U.T. 
Student Membership,1803,2019-11-25 11:41:44 UTC,3687,Boardwalk West,4062,Lakeshore/Pleasant Valley,148,2020-02-09 20 | 21302658,Local365,709,2019-11-25 17:14:49 UTC,2711,Barton Springs/Kinney,4058,Hollow Creek/Barton Hills,19,2020-02-09 21 | 21300892,Local365,328G,2019-11-25 13:24:12 UTC,2495,4th/Congress,4047,8th/Lavaca,15,2020-02-09 22 | 21301098,24 Hour Walk Up Pass,311,2019-11-25 13:50:30 UTC,2501,5th/Bowie,3660,East 6th/Medina,50,2020-02-09 23 | 21301083,Pay-as-you-ride,180,2019-11-25 13:48:50 UTC,2501,5th/Bowie,3660,East 6th/Medina,52,2020-02-09 24 | 21298474,Local365,2326,2019-11-25 02:13:27 UTC,2547,21st/Guadalupe,3635,13th/San Antonio,4,2020-02-09 25 | 21303443,Local365,113G,2019-11-25 20:08:26 UTC,2547,21st/Guadalupe,3635,13th/San Antonio,13,2020-02-09 26 | 21300402,Local365,148G,2019-11-25 12:23:49 UTC,2552,3rd/West,4050,5th/Campbell,11,2020-02-09 27 | 21301492,Single Trip (Pay-as-you-ride),298,2019-11-25 14:35:53 UTC,2552,3rd/West,3294,6th/Lavaca,10,2020-02-09 28 | 21300943,Explorer,272G,2019-11-25 13:31:50 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2494,2nd/Congress,12,2020-02-09 29 | 21300940,Explorer,375G,2019-11-25 13:31:40 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2494,2nd/Congress,12,2020-02-09 30 | 21298682,Local365,160,2019-11-25 07:42:31 UTC,3621,3rd/Nueces,2494,2nd/Congress,3,2020-02-09 31 | 21301015,Single Trip (Pay-as-you-ride),588,2019-11-25 13:40:55 UTC,4062,Lakeshore/Pleasant Valley,2495,4th/Congress,38,2020-02-09 32 | 21301034,Single Trip (Pay-as-you-ride),428,2019-11-25 13:42:23 UTC,4062,Lakeshore/Pleasant Valley,2495,4th/Congress,36,2020-02-09 33 | 21300613,Local365,229G,2019-11-25 12:53:05 UTC,2495,4th/Congress,2495,4th/Congress,23,2020-02-09 34 | 21300534,Local365,328G,2019-11-25 12:42:18 UTC,2501,5th/Bowie,2495,4th/Congress,42,2020-02-09 35 | 21301123,Local365,1439,2019-11-25 13:53:59 UTC,2540,17th/Guadalupe,2496,8th/Congress,24,2020-02-09 36 | 21299267,Explorer,138G,2019-11-25 09:24:42 UTC,2499,2nd/Lavaca @ City 
Hall,2499,2nd/Lavaca @ City Hall,1,2020-02-09 37 | 21303028,Local365,14264,2019-11-25 18:17:27 UTC,4048,South Congress @ Bouldin Creek,2501,5th/Bowie,11,2020-02-09 38 | 21300495,Local365,180,2019-11-25 12:34:41 UTC,4050,5th/Campbell,2501,5th/Bowie,7,2020-02-09 39 | 21303112,Local365,273,2019-11-25 18:33:09 UTC,3798,21st/Speedway @ PCL,2501,5th/Bowie,48,2020-02-09 40 | 21298573,Local365,104,2019-11-25 06:54:23 UTC,2549,South 1st/Riverside @ Long Center,2501,5th/Bowie,6,2020-02-09 41 | 21300066,Pay-as-you-ride,453,2019-11-25 11:45:16 UTC,2504,South Congress/Elizabeth,2504,South Congress/Elizabeth,4,2020-02-09 42 | 21299202,Local365+Guest Pass,28,2019-11-25 09:14:42 UTC,2707,Rainey/Cummings,2539,3rd/Trinity @ The Convention Center,6,2020-02-09 43 | 21302589,Local365,440,2019-11-25 17:06:21 UTC,2496,8th/Congress,2539,3rd/Trinity @ The Convention Center,6,2020-02-09 44 | 21302998,Local365,113G,2019-11-25 18:06:46 UTC,3794,"Dean Keeton/Speedway ",2540,17th/Guadalupe,20,2020-02-09 45 | 21298687,Local30,277,2019-11-25 07:43:41 UTC,3798,21st/Speedway @ PCL,2540,17th/Guadalupe,6,2020-02-09 46 | 21300129,Local365,1439,2019-11-25 11:52:07 UTC,2547,21st/Guadalupe,2540,17th/Guadalupe,17,2020-02-09 47 | 21303021,Local365,1277,2019-11-25 18:14:34 UTC,3293,East 2nd/Pedernales,2542,Plaza Saltillo,6,2020-02-09 48 | 21303482,Local365,113G,2019-11-25 20:22:17 UTC,3635,13th/San Antonio,2547,21st/Guadalupe,6,2020-02-09 49 | 21302770,U.T. Student Membership,460,2019-11-25 17:29:08 UTC,2571,8th/Red River,2547,21st/Guadalupe,17,2020-02-09 50 | 21300508,Pay-as-you-ride,460,2019-11-25 12:37:27 UTC,2497,11th/Congress @ The Texas Capitol,2547,21st/Guadalupe,15,2020-02-09 51 | 21300498,Pay-as-you-ride,881,2019-11-25 12:34:55 UTC,2497,11th/Congress @ The Texas Capitol,2547,21st/Guadalupe,18,2020-02-09 52 | 21300605,Local365,4,2019-11-25 12:52:15 UTC,3793,28th/Rio Grande,2547,21st/Guadalupe,4,2020-02-09 53 | 21303147,U.T. 
Student Membership,2351,2019-11-25 18:42:55 UTC,3793,28th/Rio Grande,2547,21st/Guadalupe,7,2020-02-09 54 | 21303027,Local30,881,2019-11-25 18:16:57 UTC,3794,"Dean Keeton/Speedway ",2547,21st/Guadalupe,6,2020-02-09 55 | 21303173,Local365,113G,2019-11-25 18:51:55 UTC,3798,21st/Speedway @ PCL,2547,21st/Guadalupe,27,2020-02-09 56 | 21303568,Local365,113G,2019-11-25 20:48:52 UTC,3798,21st/Speedway @ PCL,2547,21st/Guadalupe,5,2020-02-09 57 | 21303383,Local365,113G,2019-11-25 19:49:11 UTC,3799,23rd/San Jacinto @ DKR Stadium,2547,21st/Guadalupe,13,2020-02-09 58 | 21300024,Local365,1439,2019-11-25 11:35:56 UTC,3799,23rd/San Jacinto @ DKR Stadium,2547,21st/Guadalupe,11,2020-02-09 59 | 21299727,U.T. Student Membership,2077,2019-11-25 10:50:07 UTC,3841,23rd/Rio Grande,2548,Guadalupe/West Mall @ University Co-op,2,2020-02-09 60 | 21302921,Local365,2887,2019-11-25 17:51:53 UTC,3792,22nd/Pearl,2548,Guadalupe/West Mall @ University Co-op,4,2020-02-09 61 | 21302067,Local365,2351,2019-11-25 15:54:27 UTC,3794,"Dean Keeton/Speedway ",2548,Guadalupe/West Mall @ University Co-op,4,2020-02-09 62 | 21301032,U.T. 
Student Membership,936,2019-11-25 13:42:16 UTC,3799,23rd/San Jacinto @ DKR Stadium,2548,Guadalupe/West Mall @ University Co-op,7,2020-02-09 63 | 21301250,Local365,105G,2019-11-25 14:09:39 UTC,3838,26th/Nueces,2548,Guadalupe/West Mall @ University Co-op,47,2020-02-09 64 | 21303852,Local365,113G,2019-11-25 22:51:31 UTC,3838,26th/Nueces,2548,Guadalupe/West Mall @ University Co-op,11,2020-02-09 65 | 21298853,Local365,2288,2019-11-25 08:21:54 UTC,3838,26th/Nueces,2548,Guadalupe/West Mall @ University Co-op,5,2020-02-09 66 | 21298476,Local365,2326,2019-11-25 02:18:50 UTC,3635,13th/San Antonio,2549,South 1st/Riverside @ Long Center,10,2020-02-09 67 | 21302638,Local365,646,2019-11-25 17:12:21 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,2549,South 1st/Riverside @ Long Center,6,2020-02-09 68 | 21300564,Local365,815,2019-11-25 12:46:39 UTC,2575,Riverside/South Lamar,2549,South 1st/Riverside @ Long Center,4,2020-02-09 69 | 21300966,Local365,133,2019-11-25 13:34:22 UTC,3841,23rd/Rio Grande,2552,3rd/West,12,2020-02-09 70 | 21301289,Single Trip (Pay-as-you-ride),922,2019-11-25 14:13:42 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,2552,3rd/West,22,2020-02-09 71 | 21302071,Local365,110G,2019-11-25 15:55:14 UTC,2711,Barton Springs/Kinney,2552,3rd/West,11,2020-02-09 72 | 21302755,Local365,014G,2019-11-25 17:27:11 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,2552,3rd/West,10,2020-02-09 73 | 21300645,Local365,160,2019-11-25 12:57:26 UTC,2494,2nd/Congress,2552,3rd/West,4,2020-02-09 74 | 21299097,Local30,298,2019-11-25 09:01:40 UTC,2496,8th/Congress,2552,3rd/West,6,2020-02-09 75 | 21303116,Local365,668,2019-11-25 18:33:37 UTC,3798,21st/Speedway @ PCL,2552,3rd/West,26,2020-02-09 76 | 21300289,Local365,309G,2019-11-25 12:09:26 UTC,2540,17th/Guadalupe,2552,3rd/West,12,2020-02-09 77 | 21302937,Local365,133,2019-11-25 17:54:13 UTC,2552,3rd/West,2552,3rd/West,18,2020-02-09 78 | 21301072,U.T. 
Student Membership,12811,2019-11-25 13:47:06 UTC,3798,21st/Speedway @ PCL,2562,8th/San Jacinto,8,2020-02-09 79 | 21300857,Single Trip (Pay-as-you-ride),1434,2019-11-25 13:19:21 UTC,2563,Rainey/Davis,2563,Rainey/Davis,34,2020-02-09 80 | 21301977,Explorer,866,2019-11-25 15:41:54 UTC,2572,Barton Springs Pool,2563,Rainey/Davis,36,2020-02-09 81 | 21301980,Explorer,1524,2019-11-25 15:41:58 UTC,2572,Barton Springs Pool,2563,Rainey/Davis,36,2020-02-09 82 | 21302141,Local365,004G,2019-11-25 16:05:44 UTC,3621,3rd/Nueces,2565,6th/Trinity,8,2020-02-09 83 | 21300713,Local365,309G,2019-11-25 13:03:37 UTC,2552,3rd/West,2565,6th/Trinity,8,2020-02-09 84 | 21301887,Local365,326,2019-11-25 15:31:33 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,4,2020-02-09 85 | 21303098,Local365,1949,2019-11-25 18:30:40 UTC,3619,6th/Congress,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,5,2020-02-09 86 | 21298854,Local365,1834,2019-11-25 08:22:08 UTC,2549,South 1st/Riverside @ Long Center,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,5,2020-02-09 87 | 21299194,Local365,1834,2019-11-25 09:12:45 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,2567,Barton Springs/Bouldin @ Palmer Auditorium,5,2020-02-09 88 | 21302694,Local365,326,2019-11-25 17:19:43 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,2567,Barton Springs/Bouldin @ Palmer Auditorium,5,2020-02-09 89 | 21301742,Local365,369G,2019-11-25 15:13:35 UTC,2572,Barton Springs Pool,2567,Barton Springs/Bouldin @ Palmer Auditorium,12,2020-02-09 90 | 21298516,Local365,272G,2019-11-25 06:12:32 UTC,3687,Boardwalk West,2567,Barton Springs/Bouldin @ Palmer Auditorium,9,2020-02-09 91 | 21300701,Explorer,369G,2019-11-25 13:01:58 UTC,2711,Barton Springs/Kinney,2567,Barton Springs/Bouldin @ Palmer Auditorium,8,2020-02-09 92 | 21300702,Explorer,375G,2019-11-25 13:01:58 UTC,2711,Barton Springs/Kinney,2567,Barton Springs/Bouldin @ Palmer 
Auditorium,8,2020-02-09 93 | 21301620,Local365,326,2019-11-25 14:55:03 UTC,2711,Barton Springs/Kinney,2567,Barton Springs/Bouldin @ Palmer Auditorium,3,2020-02-09 94 | 21301790,Local365,920,2019-11-25 15:19:13 UTC,2544,East 6th/Pedernales,2567,Barton Springs/Bouldin @ Palmer Auditorium,18,2020-02-09 95 | 21303152,Local365,2283,2019-11-25 18:43:36 UTC,3291,11th/San Jacinto,2569,East 11th/San Marcos,4,2020-02-09 96 | 21301516,Single Trip (Pay-as-you-ride),696,2019-11-25 14:38:13 UTC,3684,Cesar Chavez/Congress,2569,East 11th/San Marcos,13,2020-02-09 97 | 21301520,Single Trip (Pay-as-you-ride),975,2019-11-25 14:38:44 UTC,3684,Cesar Chavez/Congress,2569,East 11th/San Marcos,13,2020-02-09 98 | 21302476,U.T. Student Membership,460,2019-11-25 16:47:30 UTC,2547,21st/Guadalupe,2571,8th/Red River,17,2020-02-09 99 | 21299730,Explorer,866,2019-11-25 10:50:20 UTC,2563,Rainey/Davis,2572,Barton Springs Pool,35,2020-02-09 100 | 21299760,Explorer,1524,2019-11-25 10:53:31 UTC,2563,Rainey/Davis,2572,Barton Springs Pool,32,2020-02-09 101 | 21301374,Local365,369G,2019-11-25 14:21:05 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2572,Barton Springs Pool,12,2020-02-09 102 | 21300700,Explorer,335G,2019-11-25 13:01:57 UTC,2572,Barton Springs Pool,2572,Barton Springs Pool,51,2020-02-09 103 | 21301221,Explorer,1524,2019-11-25 14:05:24 UTC,2572,Barton Springs Pool,2572,Barton Springs Pool,34,2020-02-09 104 | 21301247,Pay-as-you-ride,2274,2019-11-25 14:09:13 UTC,2572,Barton Springs Pool,2572,Barton Springs Pool,85,2020-02-09 105 | 21301220,Explorer,866,2019-11-25 14:04:53 UTC,2572,Barton Springs Pool,2572,Barton Springs Pool,35,2020-02-09 106 | 21301245,Pay-as-you-ride,335G,2019-11-25 14:08:52 UTC,2572,Barton Springs Pool,2572,Barton Springs Pool,86,2020-02-09 107 | 21301605,Local365,2828,2019-11-25 14:51:44 UTC,2552,3rd/West,2572,Barton Springs Pool,9,2020-02-09 108 | 21302807,Single Trip (Pay-as-you-ride),3541,2019-11-25 17:34:30 UTC,2575,Riverside/South Lamar,2575,Riverside/South 
Lamar,24,2020-02-09 109 | 21300200,24 Hour Walk Up Pass,610,2019-11-25 11:58:04 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,79,2020-02-09 110 | 21302799,Single Trip (Pay-as-you-ride),198,2019-11-25 17:33:35 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,25,2020-02-09 111 | 21300219,24 Hour Walk Up Pass,3541,2019-11-25 11:59:59 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,78,2020-02-09 112 | 21303880,24 Hour Walk Up Pass,1555,2019-11-25 23:04:43 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,46,2020-02-09 113 | 21299755,Single Trip (Pay-as-you-ride),1555,2019-11-25 10:52:46 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,29,2020-02-09 114 | 21299325,Single Trip (Pay-as-you-ride),3541,2019-11-25 09:35:52 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,30,2020-02-09 115 | 21299763,Single Trip (Pay-as-you-ride),430,2019-11-25 10:53:37 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,28,2020-02-09 116 | 21303633,Pay-as-you-ride,229G,2019-11-25 21:10:40 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,32,2020-02-09 117 | 21299319,Single Trip (Pay-as-you-ride),14,2019-11-25 09:35:12 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,29,2020-02-09 118 | 21302811,Single Trip (Pay-as-you-ride),14,2019-11-25 17:35:33 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,23,2020-02-09 119 | 21301550,Single Trip (Pay-as-you-ride),255G,2019-11-25 14:42:57 UTC,2495,4th/Congress,2575,Riverside/South Lamar,13,2020-02-09 120 | 21301540,Single Trip (Pay-as-you-ride),229G,2019-11-25 14:41:40 UTC,2495,4th/Congress,2575,Riverside/South Lamar,14,2020-02-09 121 | 21300991,Local365,1958,2019-11-25 13:38:29 UTC,2549,South 1st/Riverside @ Long Center,2575,Riverside/South Lamar,5,2020-02-09 122 | 21302145,Single Trip (Pay-as-you-ride),14148,2019-11-25 16:05:56 UTC,3684,Cesar Chavez/Congress,2707,Rainey/Cummings,10,2020-02-09 123 | 21301227,Local365+Guest Pass,869,2019-11-25 14:06:03 
UTC,2552,3rd/West,2707,Rainey/Cummings,11,2020-02-09 124 | 21300825,Local365,1834,2019-11-25 13:16:08 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2711,Barton Springs/Kinney,3,2020-02-09 125 | 21301856,Local365,2828,2019-11-25 15:27:57 UTC,2572,Barton Springs Pool,2711,Barton Springs/Kinney,7,2020-02-09 126 | 21299269,Explorer,244G,2019-11-25 09:25:41 UTC,2494,2nd/Congress,2711,Barton Springs/Kinney,19,2020-02-09 127 | 21299277,Explorer,270G,2019-11-25 09:27:43 UTC,2494,2nd/Congress,2711,Barton Springs/Kinney,17,2020-02-09 128 | 21301021,Explorer,548,2019-11-25 13:41:26 UTC,4061,Lakeshore/Austin Hostel,3292,East 4th/Chicon,249,2020-02-09 129 | 21300997,Explorer,80,2019-11-25 13:39:21 UTC,4061,Lakeshore/Austin Hostel,3292,East 4th/Chicon,33,2020-02-09 130 | 21299593,Single Trip (Pay-as-you-ride),152,2019-11-25 10:22:30 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3377,Veterans/Atlanta @ MoPac Ped Bridge,110,2020-02-09 131 | 21299595,Single Trip (Pay-as-you-ride),412,2019-11-25 10:23:36 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3377,Veterans/Atlanta @ MoPac Ped Bridge,110,2020-02-09 132 | 21299615,Single Trip (Pay-as-you-ride),387,2019-11-25 10:28:24 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3377,Veterans/Atlanta @ MoPac Ped Bridge,104,2020-02-09 133 | 21299597,Single Trip (Pay-as-you-ride),922,2019-11-25 10:24:20 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3377,Veterans/Atlanta @ MoPac Ped Bridge,3,2020-02-09 134 | 21300804,Pay-as-you-ride,240,2019-11-25 13:13:37 UTC,4058,Hollow Creek/Barton Hills,3513,South Congress/Barton Springs @ The Austin American-Statesman,37,2020-02-09 135 | 21300810,Pay-as-you-ride,554,2019-11-25 13:14:01 UTC,4058,Hollow Creek/Barton Hills,3513,South Congress/Barton Springs @ The Austin American-Statesman,35,2020-02-09 136 | 21302235,Single Trip (Pay-as-you-ride),014G,2019-11-25 16:16:13 UTC,2707,Rainey/Cummings,3513,South Congress/Barton Springs @ The Austin American-Statesman,54,2020-02-09 137 | 
21302301,Local365,865,2019-11-25 16:25:07 UTC,2707,Rainey/Cummings,3513,South Congress/Barton Springs @ The Austin American-Statesman,8,2020-02-09 138 | 21302106,Single Trip (Pay-as-you-ride),375G,2019-11-25 16:00:35 UTC,2494,2nd/Congress,3513,South Congress/Barton Springs @ The Austin American-Statesman,70,2020-02-09 139 | 21303356,Single Trip (Pay-as-you-ride),1850,2019-11-25 19:38:45 UTC,3294,6th/Lavaca,3619,6th/Congress,396,2020-02-09 140 | 21303534,Local365,64,2019-11-25 20:36:11 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,3619,6th/Congress,8,2020-02-09 141 | 21298738,Local365,134,2019-11-25 07:56:04 UTC,2707,Rainey/Cummings,3619,6th/Congress,9,2020-02-09 142 | 21301686,Pay-as-you-ride,849,2019-11-25 15:06:36 UTC,2496,8th/Congress,3619,6th/Congress,22,2020-02-09 143 | 21300165,Local365,14264,2019-11-25 11:54:18 UTC,2501,5th/Bowie,3619,6th/Congress,6,2020-02-09 144 | 21298652,Local365,2048,2019-11-25 07:33:05 UTC,2501,5th/Bowie,3619,6th/Congress,8,2020-02-09 145 | 21301031,Local365,004G,2019-11-25 13:42:13 UTC,4047,8th/Lavaca,3621,3rd/Nueces,33,2020-02-09 146 | 21300382,Local365,863,2019-11-25 12:21:01 UTC,2496,8th/Congress,3621,3rd/Nueces,5,2020-02-09 147 | 21300723,Single Trip (Pay-as-you-ride),696,2019-11-25 13:04:41 UTC,2569,East 11th/San Marcos,3684,Cesar Chavez/Congress,14,2020-02-09 148 | 21300739,Single Trip (Pay-as-you-ride),975,2019-11-25 13:05:32 UTC,2569,East 11th/San Marcos,3684,Cesar Chavez/Congress,13,2020-02-09 149 | 21302101,Local365,12802,2019-11-25 15:59:55 UTC,2570,South Congress/Academy,3684,Cesar Chavez/Congress,8,2020-02-09 150 | 21301228,24 Hour Walk Up Pass,610,2019-11-25 14:06:05 UTC,2575,Riverside/South Lamar,3684,Cesar Chavez/Congress,168,2020-02-09 151 | 21302109,Single Trip (Pay-as-you-ride),272G,2019-11-25 16:01:16 UTC,2494,2nd/Congress,3684,Cesar Chavez/Congress,4,2020-02-09 152 | 21302990,Local365,255G,2019-11-25 18:05:09 UTC,2575,Riverside/South Lamar,3686,Sterzing/Barton Springs,10,2020-02-09 153 | 21301035,Single Trip 
(Pay-as-you-ride),507,2019-11-25 13:42:47 UTC,3686,Sterzing/Barton Springs,3686,Sterzing/Barton Springs,35,2020-02-09 154 | 21301971,24 Hour Walk Up Pass,873,2019-11-25 15:41:15 UTC,3686,Sterzing/Barton Springs,3686,Sterzing/Barton Springs,85,2020-02-09 155 | 21301030,Single Trip (Pay-as-you-ride),969,2019-11-25 13:42:08 UTC,3686,Sterzing/Barton Springs,3686,Sterzing/Barton Springs,33,2020-02-09 156 | 21302009,24 Hour Walk Up Pass,2021,2019-11-25 15:46:01 UTC,3686,Sterzing/Barton Springs,3686,Sterzing/Barton Springs,81,2020-02-09 157 | 21301020,Single Trip (Pay-as-you-ride),1423,2019-11-25 13:41:24 UTC,3686,Sterzing/Barton Springs,3686,Sterzing/Barton Springs,35,2020-02-09 158 | 21301990,24 Hour Walk Up Pass,969,2019-11-25 15:43:48 UTC,3686,Sterzing/Barton Springs,3686,Sterzing/Barton Springs,84,2020-02-09 159 | 21300054,Pay-as-you-ride,969,2019-11-25 11:43:56 UTC,2504,South Congress/Elizabeth,3686,Sterzing/Barton Springs,27,2020-02-09 160 | 21300115,Pay-as-you-ride,1423,2019-11-25 11:51:11 UTC,2504,South Congress/Elizabeth,3686,Sterzing/Barton Springs,19,2020-02-09 161 | 21300577,Pay-as-you-ride,871,2019-11-25 12:48:04 UTC,2575,Riverside/South Lamar,3687,Boardwalk West,10,2020-02-09 162 | 21300578,Pay-as-you-ride,220,2019-11-25 12:48:26 UTC,2575,Riverside/South Lamar,3687,Boardwalk West,10,2020-02-09 163 | 21300414,Single Trip (Pay-as-you-ride),2122,2019-11-25 12:25:41 UTC,3687,Boardwalk West,3687,Boardwalk West,1,2020-02-09 164 | 21300426,Single Trip (Pay-as-you-ride),432,2019-11-25 12:27:06 UTC,3687,Boardwalk West,3687,Boardwalk West,27,2020-02-09 165 | 21300412,Single Trip (Pay-as-you-ride),759,2019-11-25 12:24:54 UTC,3687,Boardwalk West,3687,Boardwalk West,25,2020-02-09 166 | 21300435,Single Trip (Pay-as-you-ride),2122,2019-11-25 12:27:46 UTC,3687,Boardwalk West,3687,Boardwalk West,28,2020-02-09 167 | 21303165,Local365,2304,2019-11-25 18:48:51 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,3687,Boardwalk West,5,2020-02-09 168 | 
21298482,U.T. Student Membership,153,2019-11-25 02:26:53 UTC,3795,Dean Keeton/Whitis,3792,22nd/Pearl,4,2020-02-09 169 | 21301145,Local365,2887,2019-11-25 13:55:58 UTC,3795,Dean Keeton/Whitis,3792,22nd/Pearl,7,2020-02-09 170 | 21303890,U.T. Student Membership,057G,2019-11-25 23:16:34 UTC,3798,21st/Speedway @ PCL,3792,22nd/Pearl,5,2020-02-09 171 | 21302914,U.T. Student Membership,156,2019-11-25 17:51:08 UTC,3841,23rd/Rio Grande,3793,28th/Rio Grande,4,2020-02-09 172 | 21301593,Local365,621,2019-11-25 14:49:52 UTC,3793,28th/Rio Grande,3793,28th/Rio Grande,1,2020-02-09 173 | 21300882,Local365,621,2019-11-25 13:22:49 UTC,3793,28th/Rio Grande,3793,28th/Rio Grande,1,2020-02-09 174 | 21303942,U.T. Student Membership,164,2019-11-25 23:56:21 UTC,3793,28th/Rio Grande,3793,28th/Rio Grande,2,2020-02-09 175 | 21301640,Local365,277,2019-11-25 14:58:40 UTC,3794,"Dean Keeton/Speedway ",3793,28th/Rio Grande,6,2020-02-09 176 | 21302778,Local365,164,2019-11-25 17:31:04 UTC,3798,21st/Speedway @ PCL,3793,28th/Rio Grande,9,2020-02-09 177 | 21302450,U.T. Student Membership,2351,2019-11-25 16:43:11 UTC,2548,Guadalupe/West Mall @ University Co-op,3793,28th/Rio Grande,4,2020-02-09 178 | 21300657,Local365,907,2019-11-25 12:58:50 UTC,3793,28th/Rio Grande,3794,"Dean Keeton/Speedway ",6,2020-02-09 179 | 21303013,U.T. Student Membership,895,2019-11-25 18:11:41 UTC,3794,"Dean Keeton/Speedway ",3794,"Dean Keeton/Speedway ",10,2020-02-09 180 | 21302907,U.T. Student Membership,153,2019-11-25 17:50:42 UTC,3794,"Dean Keeton/Speedway ",3794,"Dean Keeton/Speedway ",6,2020-02-09 181 | 21302615,U.T. Student Membership,113G,2019-11-25 17:09:37 UTC,3794,"Dean Keeton/Speedway ",3794,"Dean Keeton/Speedway ",1,2020-02-09 182 | 21303050,U.T. Student Membership,110,2019-11-25 18:22:18 UTC,3794,"Dean Keeton/Speedway ",3794,"Dean Keeton/Speedway ",42,2020-02-09 183 | 21303785,U.T. 
Student Membership,110,2019-11-25 22:18:39 UTC,3794,"Dean Keeton/Speedway ",3794,"Dean Keeton/Speedway ",18,2020-02-09 184 | 21301120,U.T. Student Membership,113G,2019-11-25 13:53:45 UTC,3795,Dean Keeton/Whitis,3794,"Dean Keeton/Speedway ",2,2020-02-09 185 | 21303485,U.T. Student Membership,110,2019-11-25 20:22:36 UTC,3797,21st/University,3794,"Dean Keeton/Speedway ",3,2020-02-09 186 | 21302777,U.T. Student Membership,153,2019-11-25 17:30:45 UTC,3797,21st/University,3794,"Dean Keeton/Speedway ",5,2020-02-09 187 | 21299982,Local365,105G,2019-11-25 11:28:30 UTC,3798,21st/Speedway @ PCL,3794,"Dean Keeton/Speedway ",5,2020-02-09 188 | 21301613,Local365,057G,2019-11-25 14:53:39 UTC,3798,21st/Speedway @ PCL,3794,"Dean Keeton/Speedway ",5,2020-02-09 189 | 21299609,Local365,2288,2019-11-25 10:26:14 UTC,3798,21st/Speedway @ PCL,3794,"Dean Keeton/Speedway ",3,2020-02-09 190 | 21301604,Local365,2351,2019-11-25 14:51:40 UTC,3798,21st/Speedway @ PCL,3794,"Dean Keeton/Speedway ",4,2020-02-09 191 | 21300762,U.T. Student Membership,153,2019-11-25 13:08:54 UTC,3798,21st/Speedway @ PCL,3794,"Dean Keeton/Speedway ",3,2020-02-09 192 | 21300985,Local365,895,2019-11-25 13:37:40 UTC,3798,21st/Speedway @ PCL,3794,"Dean Keeton/Speedway ",4,2020-02-09 193 | 21299187,Local30,277,2019-11-25 09:11:29 UTC,2540,17th/Guadalupe,3794,"Dean Keeton/Speedway ",9,2020-02-09 194 | 21302833,Local365,881,2019-11-25 17:39:05 UTC,2547,21st/Guadalupe,3794,"Dean Keeton/Speedway ",27,2020-02-09 195 | 21303608,Local365,113G,2019-11-25 21:00:58 UTC,2547,21st/Guadalupe,3794,"Dean Keeton/Speedway ",43,2020-02-09 196 | 21300570,U.T. Student Membership,2887,2019-11-25 12:47:26 UTC,3793,28th/Rio Grande,3795,Dean Keeton/Whitis,6,2020-02-09 197 | 21300802,U.T. Student Membership,19,2019-11-25 13:13:27 UTC,3794,"Dean Keeton/Speedway ",3795,Dean Keeton/Whitis,2,2020-02-09 198 | 21300638,U.T. 
Student Membership,113G,2019-11-25 12:56:45 UTC,3798,21st/Speedway @ PCL,3795,Dean Keeton/Whitis,5,2020-02-09 199 | 21299773,Local365,895,2019-11-25 10:54:36 UTC,3798,21st/Speedway @ PCL,3795,Dean Keeton/Whitis,10,2020-02-09 200 | 21303829,U.T. Student Membership,936,2019-11-25 22:40:03 UTC,3798,21st/Speedway @ PCL,3795,Dean Keeton/Whitis,6,2020-02-09 201 | 21303451,U.T. Student Membership,110,2019-11-25 20:10:35 UTC,3794,"Dean Keeton/Speedway ",3797,21st/University,6,2020-02-09 202 | 21302618,U.T. Student Membership,153,2019-11-25 17:10:04 UTC,3794,"Dean Keeton/Speedway ",3797,21st/University,5,2020-02-09 203 | 21303430,U.T. Student Membership,895,2019-11-25 20:05:00 UTC,3794,"Dean Keeton/Speedway ",3797,21st/University,4,2020-02-09 204 | 21301857,Local365,040G,2019-11-25 15:28:01 UTC,3841,23rd/Rio Grande,3798,21st/Speedway @ PCL,6,2020-02-09 205 | 21303749,U.T. Student Membership,040G,2019-11-25 21:58:59 UTC,3841,23rd/Rio Grande,3798,21st/Speedway @ PCL,5,2020-02-09 206 | 21301323,Local365,057G,2019-11-25 14:16:10 UTC,3621,3rd/Nueces,3798,21st/Speedway @ PCL,30,2020-02-09 207 | 21299308,U.T. Student Membership,153,2019-11-25 09:33:46 UTC,3792,22nd/Pearl,3798,21st/Speedway @ PCL,4,2020-02-09 208 | 21298453,Local365,668,2019-11-25 00:35:43 UTC,3793,28th/Rio Grande,3798,21st/Speedway @ PCL,11,2020-02-09 209 | 21300451,U.T. Student Membership,105G,2019-11-25 12:29:07 UTC,3794,"Dean Keeton/Speedway ",3798,21st/Speedway @ PCL,4,2020-02-09 210 | 21299802,U.T. Student Membership,2288,2019-11-25 10:57:59 UTC,3794,"Dean Keeton/Speedway ",3798,21st/Speedway @ PCL,5,2020-02-09 211 | 21302072,Local365,057G,2019-11-25 15:55:24 UTC,3794,"Dean Keeton/Speedway ",3798,21st/Speedway @ PCL,3,2020-02-09 212 | 21302961,U.T. Student Membership,153,2019-11-25 17:59:38 UTC,3794,"Dean Keeton/Speedway ",3798,21st/Speedway @ PCL,4,2020-02-09 213 | 21300127,U.T. 
Student Membership,113G,2019-11-25 11:51:56 UTC,3794,"Dean Keeton/Speedway ",3798,21st/Speedway @ PCL,4,2020-02-09 214 | 21302958,U.T. Student Membership,907,2019-11-25 17:59:03 UTC,3794,"Dean Keeton/Speedway ",3798,21st/Speedway @ PCL,2,2020-02-09 215 | 21301008,U.T. Student Membership,2351,2019-11-25 13:40:29 UTC,3794,"Dean Keeton/Speedway ",3798,21st/Speedway @ PCL,3,2020-02-09 216 | 21299916,Local365,895,2019-11-25 11:17:05 UTC,3795,Dean Keeton/Whitis,3798,21st/Speedway @ PCL,4,2020-02-09 217 | 21301027,Local365,668,2019-11-25 13:41:57 UTC,3798,21st/Speedway @ PCL,3798,21st/Speedway @ PCL,20,2020-02-09 218 | 21302706,Local365,273,2019-11-25 17:20:56 UTC,3798,21st/Speedway @ PCL,3798,21st/Speedway @ PCL,1,2020-02-09 219 | 21303079,Local365,113G,2019-11-25 18:26:31 UTC,2540,17th/Guadalupe,3798,21st/Speedway @ PCL,25,2020-02-09 220 | 21303530,Local365,113G,2019-11-25 20:35:59 UTC,2547,21st/Guadalupe,3798,21st/Speedway @ PCL,10,2020-02-09 221 | 21299432,Local365,2288,2019-11-25 09:51:54 UTC,2548,Guadalupe/West Mall @ University Co-op,3798,21st/Speedway @ PCL,4,2020-02-09 222 | 21302989,Local365,2887,2019-11-25 18:05:00 UTC,2548,Guadalupe/West Mall @ University Co-op,3798,21st/Speedway @ PCL,3,2020-02-09 223 | 21303043,U.T. Student Membership,936,2019-11-25 18:21:26 UTC,2548,Guadalupe/West Mall @ University Co-op,3798,21st/Speedway @ PCL,4,2020-02-09 224 | 21303087,U.T. Student Membership,105G,2019-11-25 18:28:41 UTC,3838,26th/Nueces,3798,21st/Speedway @ PCL,8,2020-02-09 225 | 21302974,U.T. Student Membership,2077,2019-11-25 18:02:12 UTC,3841,23rd/Rio Grande,3799,23rd/San Jacinto @ DKR Stadium,8,2020-02-09 226 | 21300959,U.T. Student Membership,2722,2019-11-25 13:33:29 UTC,3798,21st/Speedway @ PCL,3799,23rd/San Jacinto @ DKR Stadium,4,2020-02-09 227 | 21303283,Local365,113G,2019-11-25 19:18:26 UTC,2547,21st/Guadalupe,3799,23rd/San Jacinto @ DKR Stadium,31,2020-02-09 228 | 21303731,U.T. 
Student Membership,113G,2019-11-25 21:49:59 UTC,3794,"Dean Keeton/Speedway ",3838,26th/Nueces,4,2020-02-09 229 | 21301189,Local365,105G,2019-11-25 14:01:23 UTC,3798,21st/Speedway @ PCL,3838,26th/Nueces,8,2020-02-09 230 | 21298477,Local365,2288,2019-11-25 02:22:15 UTC,3798,21st/Speedway @ PCL,3838,26th/Nueces,12,2020-02-09 231 | 21301840,Local365,105G,2019-11-25 15:26:15 UTC,2548,Guadalupe/West Mall @ University Co-op,3838,26th/Nueces,5,2020-02-09 232 | 21303912,Local365,113G,2019-11-25 23:36:49 UTC,2548,Guadalupe/West Mall @ University Co-op,3838,26th/Nueces,7,2020-02-09 233 | 21300059,U.T. Student Membership,040G,2019-11-25 11:44:41 UTC,3841,23rd/Rio Grande,3841,23rd/Rio Grande,1,2020-02-09 234 | 21302972,U.T. Student Membership,040G,2019-11-25 18:01:46 UTC,3841,23rd/Rio Grande,3841,23rd/Rio Grande,1,2020-02-09 235 | 21303294,U.T. Student Membership,156,2019-11-25 19:22:53 UTC,3793,28th/Rio Grande,3841,23rd/Rio Grande,4,2020-02-09 236 | 21302898,Local365,277,2019-11-25 17:48:47 UTC,3793,28th/Rio Grande,3841,23rd/Rio Grande,4,2020-02-09 237 | 21301214,U.T. Student Membership,19,2019-11-25 14:03:55 UTC,3795,Dean Keeton/Whitis,3841,23rd/Rio Grande,1254,2020-02-09 238 | 21303778,U.T. Student Membership,895,2019-11-25 22:11:55 UTC,3797,21st/University,3841,23rd/Rio Grande,5,2020-02-09 239 | 21302899,U.T. Student Membership,040G,2019-11-25 17:49:05 UTC,3798,21st/Speedway @ PCL,3841,23rd/Rio Grande,6,2020-02-09 240 | 21303415,U.T. Student Membership,460,2019-11-25 20:00:15 UTC,2547,21st/Guadalupe,3841,23rd/Rio Grande,4,2020-02-09 241 | 21300309,U.T. 
Student Membership,2077,2019-11-25 12:11:47 UTC,2548,Guadalupe/West Mall @ University Co-op,3841,23rd/Rio Grande,2,2020-02-09 242 | -------------------------------------------------------------------------------- /week6/data/bike_data_20200210.csv: -------------------------------------------------------------------------------- 1 | trip_id,subscriber_type,bikeid,start_time,start_station_id,start_station_name,end_station_id,end_station_name,duration_minutes,dummy_date 2 | 21306385,U.T. Student Membership,862,2019-11-26 15:49:47 UTC,3790,Lake Austin Blvd/Deep Eddy,3790,Lake Austin Blvd/Deep Eddy,22,2020-02-10 3 | 21304367,Single Trip (Pay-as-you-ride),647,2019-11-26 08:44:39 UTC,4061,Lakeshore/Austin Hostel,4061,Lakeshore/Austin Hostel,22,2020-02-10 4 | 21304359,Single Trip (Pay-as-you-ride),882,2019-11-26 08:42:15 UTC,4061,Lakeshore/Austin Hostel,4061,Lakeshore/Austin Hostel,23,2020-02-10 5 | 21304362,Single Trip (Pay-as-you-ride),420,2019-11-26 08:43:43 UTC,4061,Lakeshore/Austin Hostel,4061,Lakeshore/Austin Hostel,26,2020-02-10 6 | 21305911,Local365,230,2019-11-26 14:20:05 UTC,4047,8th/Lavaca,4057,"6th/Chalmers ",9,2020-02-10 7 | 21307093,U.T. Student Membership,62,2019-11-26 18:16:41 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,4058,Hollow Creek/Barton Hills,913,2020-02-10 8 | 21305821,Explorer,745,2019-11-26 14:06:35 UTC,2822,East 6th/Robert T. Martinez,2544,East 6th/Pedernales,154,2020-02-10 9 | 21306424,Local365,2048,2019-11-26 15:55:31 UTC,3619,6th/Congress,4048,South Congress @ Bouldin Creek,7,2020-02-10 10 | 21305421,U.T. 
Student Membership,827,2019-11-26 12:54:02 UTC,3621,3rd/Nueces,4061,Lakeshore/Austin Hostel,25,2020-02-10 11 | 21306185,Single Trip (Pay-as-you-ride),650,2019-11-26 15:11:27 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,4048,South Congress @ Bouldin Creek,168,2020-02-10 12 | 21306178,Single Trip (Pay-as-you-ride),1612,2019-11-26 15:10:40 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,4048,South Congress @ Bouldin Creek,168,2020-02-10 13 | 21304842,Explorer,213,2019-11-26 10:55:34 UTC,2494,2nd/Congress,4061,Lakeshore/Austin Hostel,52,2020-02-10 14 | 21305026,Local365,2143,2019-11-26 11:38:28 UTC,3799,23rd/San Jacinto @ DKR Stadium,3294,6th/Lavaca,32,2020-02-10 15 | 21305890,Explorer,342,2019-11-26 14:16:36 UTC,3292,East 4th/Chicon,2544,East 6th/Pedernales,144,2020-02-10 16 | 21304209,Explorer,329,2019-11-26 07:42:30 UTC,3292,East 4th/Chicon,2568,East 11th/Victory Grill,15,2020-02-10 17 | 21304203,Explorer,898,2019-11-26 07:41:33 UTC,3292,East 4th/Chicon,2568,East 11th/Victory Grill,16,2020-02-10 18 | 21307741,Single Trip (Pay-as-you-ride),200,2019-11-26 22:50:32 UTC,2540,17th/Guadalupe,3660,East 6th/Medina,21,2020-02-10 19 | 21304693,U.T. 
Student Membership,881,2019-11-26 10:20:45 UTC,2547,21st/Guadalupe,3294,6th/Lavaca,7,2020-02-10 20 | 21305065,Explorer,894,2019-11-26 11:48:08 UTC,4061,Lakeshore/Austin Hostel,2494,2nd/Congress,41,2020-02-10 21 | 21307501,Local365,043G,2019-11-26 20:46:28 UTC,4048,South Congress @ Bouldin Creek,2494,2nd/Congress,5,2020-02-10 22 | 21305724,24 Hour Walk Up Pass,507,2019-11-26 13:51:21 UTC,3686,Sterzing/Barton Springs,2494,2nd/Congress,20,2020-02-10 23 | 21306317,Single Trip (Pay-as-you-ride),75,2019-11-26 15:37:08 UTC,2494,2nd/Congress,2494,2nd/Congress,22,2020-02-10 24 | 21306311,Single Trip (Pay-as-you-ride),100,2019-11-26 15:35:53 UTC,2494,2nd/Congress,2494,2nd/Congress,25,2020-02-10 25 | 21306524,24 Hour Walk Up Pass,894,2019-11-26 16:11:31 UTC,2494,2nd/Congress,2494,2nd/Congress,62,2020-02-10 26 | 21306323,Single Trip (Pay-as-you-ride),894,2019-11-26 15:37:59 UTC,2494,2nd/Congress,2494,2nd/Congress,22,2020-02-10 27 | 21306529,24 Hour Walk Up Pass,100,2019-11-26 16:12:10 UTC,2494,2nd/Congress,2494,2nd/Congress,61,2020-02-10 28 | 21306059,Explorer,113G,2019-11-26 14:46:48 UTC,2547,21st/Guadalupe,2494,2nd/Congress,17,2020-02-10 29 | 21306062,Explorer,014G,2019-11-26 14:47:07 UTC,2547,21st/Guadalupe,2494,2nd/Congress,16,2020-02-10 30 | 21304177,Local365,014G,2019-11-26 07:30:52 UTC,2552,3rd/West,2494,2nd/Congress,31,2020-02-10 31 | 21305110,Local365,465,2019-11-26 11:56:03 UTC,3621,3rd/Nueces,2495,4th/Congress,4,2020-02-10 32 | 21305506,Local365,074G,2019-11-26 13:11:41 UTC,2539,3rd/Trinity @ The Convention Center,2495,4th/Congress,46,2020-02-10 33 | 21304201,Local365,14264,2019-11-26 07:41:07 UTC,2501,5th/Bowie,2496,8th/Congress,9,2020-02-10 34 | 21304221,Local365,800,2019-11-26 07:48:51 UTC,2542,Plaza Saltillo,2496,8th/Congress,11,2020-02-10 35 | 21305567,Pay-as-you-ride,902,2019-11-26 13:22:40 UTC,2563,Rainey/Davis,2499,2nd/Lavaca @ City Hall,19,2020-02-10 36 | 21305601,Single Trip (Pay-as-you-ride),1524,2019-11-26 13:28:33 UTC,2563,Rainey/Davis,2499,2nd/Lavaca @ 
City Hall,13,2020-02-10 37 | 21307100,Local365,2048,2019-11-26 18:19:06 UTC,4048,South Congress @ Bouldin Creek,2501,5th/Bowie,9,2020-02-10 38 | 21304920,Local365,64,2019-11-26 11:13:32 UTC,3619,6th/Congress,2501,5th/Bowie,7,2020-02-10 39 | 21304370,Local365,369G,2019-11-26 08:45:15 UTC,2542,Plaza Saltillo,2501,5th/Bowie,14,2020-02-10 40 | 21306792,Local365,2095,2019-11-26 17:02:40 UTC,2561,12th/San Jacinto @ State Capitol Visitors Garage,2503,South Congress/James,16,2020-02-10 41 | 21307404,Single Trip (Pay-as-you-ride),1277,2019-11-26 20:04:59 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,2503,South Congress/James,10,2020-02-10 42 | 21307398,Single Trip (Pay-as-you-ride),864,2019-11-26 20:03:50 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,2503,South Congress/James,11,2020-02-10 43 | 21304987,24 Hour Walk Up Pass,1551,2019-11-26 11:32:27 UTC,4058,Hollow Creek/Barton Hills,2504,South Congress/Elizabeth,40,2020-02-10 44 | 21304969,24 Hour Walk Up Pass,2122B,2019-11-26 11:30:37 UTC,4058,Hollow Creek/Barton Hills,2504,South Congress/Elizabeth,42,2020-02-10 45 | 21305227,Local365,272G,2019-11-26 12:16:48 UTC,3294,6th/Lavaca,2539,3rd/Trinity @ The Convention Center,44,2020-02-10 46 | 21306040,Local365,14148,2019-11-26 14:43:48 UTC,2707,Rainey/Cummings,2539,3rd/Trinity @ The Convention Center,6,2020-02-10 47 | 21306774,Local365,800,2019-11-26 16:58:11 UTC,2496,8th/Congress,2539,3rd/Trinity @ The Convention Center,6,2020-02-10 48 | 21307503,Local365,650,2019-11-26 20:46:40 UTC,4048,South Congress @ Bouldin Creek,2542,Plaza Saltillo,13,2020-02-10 49 | 21303981,Local365,369G,2019-11-26 00:36:35 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2542,Plaza Saltillo,16,2020-02-10 50 | 21307511,Local365,894,2019-11-26 20:51:44 UTC,2494,2nd/Congress,2542,Plaza Saltillo,8,2020-02-10 51 | 21304563,Local365,864,2019-11-26 09:39:08 UTC,2501,5th/Bowie,2542,Plaza Saltillo,15,2020-02-10 52 | 21307128,Local365,460,2019-11-26 18:25:05 
UTC,3841,23rd/Rio Grande,2547,21st/Guadalupe,18,2020-02-10 53 | 21304929,U.T. Student Membership,115,2019-11-26 11:17:20 UTC,3792,22nd/Pearl,2547,21st/Guadalupe,3,2020-02-10 54 | 21304953,24 Hour Walk Up Pass,2204,2019-11-26 11:28:01 UTC,3797,21st/University,2547,21st/Guadalupe,146,2020-02-10 55 | 21304944,24 Hour Walk Up Pass,014G,2019-11-26 11:25:46 UTC,3797,21st/University,2547,21st/Guadalupe,149,2020-02-10 56 | 21304949,24 Hour Walk Up Pass,113G,2019-11-26 11:26:48 UTC,3797,21st/University,2547,21st/Guadalupe,148,2020-02-10 57 | 21305023,Explorer,4,2019-11-26 11:38:00 UTC,2547,21st/Guadalupe,2547,21st/Guadalupe,133,2020-02-10 58 | 21305029,Explorer,2351,2019-11-26 11:40:02 UTC,2547,21st/Guadalupe,2547,21st/Guadalupe,134,2020-02-10 59 | 21307594,Single Trip (Pay-as-you-ride),571,2019-11-26 21:27:18 UTC,4047,8th/Lavaca,2548,Guadalupe/West Mall @ University Co-op,22,2020-02-10 60 | 21307597,Single Trip (Pay-as-you-ride),303G,2019-11-26 21:28:38 UTC,4047,8th/Lavaca,2548,Guadalupe/West Mall @ University Co-op,21,2020-02-10 61 | 21306639,Local365,115,2019-11-26 16:30:56 UTC,3793,28th/Rio Grande,2548,Guadalupe/West Mall @ University Co-op,5,2020-02-10 62 | 21304782,Local365,2887,2019-11-26 10:43:47 UTC,3798,21st/Speedway @ PCL,2548,Guadalupe/West Mall @ University Co-op,4,2020-02-10 63 | 21307409,Local365,057G,2019-11-26 20:07:35 UTC,3798,21st/Speedway @ PCL,2548,Guadalupe/West Mall @ University Co-op,7,2020-02-10 64 | 21306791,U.T. 
Student Membership,2077,2019-11-26 17:02:19 UTC,3799,23rd/San Jacinto @ DKR Stadium,2548,Guadalupe/West Mall @ University Co-op,6,2020-02-10 65 | 21304060,Local365,2304,2019-11-26 06:27:18 UTC,3687,Boardwalk West,2549,South 1st/Riverside @ Long Center,7,2020-02-10 66 | 21304614,Local30,1439,2019-11-26 09:55:50 UTC,2496,8th/Congress,2552,3rd/West,7,2020-02-10 67 | 21307390,Local365,460,2019-11-26 20:01:19 UTC,3794,"Dean Keeton/Speedway ",2563,Rainey/Davis,23,2020-02-10 68 | 21306726,Pay-as-you-ride,882,2019-11-26 16:49:37 UTC,4061,Lakeshore/Austin Hostel,2565,6th/Trinity,43,2020-02-10 69 | 21306733,Pay-as-you-ride,1830,2019-11-26 16:50:26 UTC,4061,Lakeshore/Austin Hostel,2565,6th/Trinity,42,2020-02-10 70 | 21307007,Pay-as-you-ride,1949,2019-11-26 17:53:04 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,29,2020-02-10 71 | 21304753,Local365,22,2019-11-26 10:39:16 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,5,2020-02-10 72 | 21307106,Single Trip (Pay-as-you-ride),198,2019-11-26 18:19:54 UTC,2575,Riverside/South Lamar,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,4,2020-02-10 73 | 21306205,Single Trip (Pay-as-you-ride),2355,2019-11-26 15:15:59 UTC,3684,Cesar Chavez/Congress,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,93,2020-02-10 74 | 21306207,Single Trip (Pay-as-you-ride),963,2019-11-26 15:16:36 UTC,3684,Cesar Chavez/Congress,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,92,2020-02-10 75 | 21306212,Single Trip (Pay-as-you-ride),610,2019-11-26 15:17:15 UTC,3684,Cesar Chavez/Congress,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,91,2020-02-10 76 | 21306217,Single Trip (Pay-as-you-ride),815,2019-11-26 15:17:52 UTC,3684,Cesar Chavez/Congress,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,91,2020-02-10 77 | 21306609,Single Trip (Pay-as-you-ride),3,2019-11-26 
16:25:39 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2567,Barton Springs/Bouldin @ Palmer Auditorium,23,2020-02-10 78 | 21306620,Single Trip (Pay-as-you-ride),326,2019-11-26 16:27:12 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2567,Barton Springs/Bouldin @ Palmer Auditorium,20,2020-02-10 79 | 21304578,Local365,22,2019-11-26 09:44:19 UTC,2544,East 6th/Pedernales,2567,Barton Springs/Bouldin @ Palmer Auditorium,16,2020-02-10 80 | 21307739,Single Trip (Pay-as-you-ride),352,2019-11-26 22:45:33 UTC,2568,East 11th/Victory Grill,2571,8th/Red River,5,2020-02-10 81 | 21306752,Local365,640,2019-11-26 16:53:36 UTC,2572,Barton Springs Pool,2572,Barton Springs Pool,2,2020-02-10 82 | 21306762,Local365,640,2019-11-26 16:55:59 UTC,2572,Barton Springs Pool,2572,Barton Springs Pool,2,2020-02-10 83 | 21304580,24 Hour Walk Up Pass,861,2019-11-26 09:44:22 UTC,2574,Zilker Park,2574,Zilker Park,1829,2020-02-10 84 | 21304560,24 Hour Walk Up Pass,873,2019-11-26 09:38:25 UTC,3686,Sterzing/Barton Springs,2574,Zilker Park,4,2020-02-10 85 | 21307122,Single Trip (Pay-as-you-ride),815,2019-11-26 18:23:54 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,2575,Riverside/South Lamar,18,2020-02-10 86 | 21305503,Single Trip (Pay-as-you-ride),198,2019-11-26 13:10:17 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,3,2020-02-10 87 | 21307272,Local365,229G,2019-11-26 19:17:22 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,1,2020-02-10 88 | 21307275,Local365,815,2019-11-26 19:18:48 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,13,2020-02-10 89 | 21305846,Single Trip (Pay-as-you-ride),3541,2019-11-26 14:10:13 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,102,2020-02-10 90 | 21305505,24 Hour Walk Up Pass,430,2019-11-26 13:11:37 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,3,2020-02-10 91 | 21307102,Single Trip (Pay-as-you-ride),1555,2019-11-26 18:19:17 UTC,2575,Riverside/South Lamar,2575,Riverside/South 
Lamar,22,2020-02-10 92 | 21304671,24 Hour Walk Up Pass,963,2019-11-26 10:13:06 UTC,3684,Cesar Chavez/Congress,2575,Riverside/South Lamar,31,2020-02-10 93 | 21304665,24 Hour Walk Up Pass,12802,2019-11-26 10:11:24 UTC,3684,Cesar Chavez/Congress,2575,Riverside/South Lamar,29,2020-02-10 94 | 21304669,24 Hour Walk Up Pass,610,2019-11-26 10:12:23 UTC,3684,Cesar Chavez/Congress,2575,Riverside/South Lamar,30,2020-02-10 95 | 21305129,24 Hour Walk Up Pass,869,2019-11-26 11:58:37 UTC,2707,Rainey/Cummings,2575,Riverside/South Lamar,104,2020-02-10 96 | 21305132,24 Hour Walk Up Pass,551,2019-11-26 11:59:29 UTC,2707,Rainey/Cummings,2575,Riverside/South Lamar,104,2020-02-10 97 | 21305152,24 Hour Walk Up Pass,362,2019-11-26 12:04:15 UTC,2707,Rainey/Cummings,2575,Riverside/South Lamar,98,2020-02-10 98 | 21304699,24 Hour Walk Up Pass,646,2019-11-26 10:23:59 UTC,2549,South 1st/Riverside @ Long Center,2575,Riverside/South Lamar,22,2020-02-10 99 | 21304701,24 Hour Walk Up Pass,815,2019-11-26 10:24:55 UTC,2549,South 1st/Riverside @ Long Center,2575,Riverside/South Lamar,22,2020-02-10 100 | 21305946,Single Trip (Pay-as-you-ride),220,2019-11-26 14:24:28 UTC,3687,Boardwalk West,2707,Rainey/Cummings,45,2020-02-10 101 | 21305940,Single Trip (Pay-as-you-ride),432,2019-11-26 14:23:27 UTC,3687,Boardwalk West,2707,Rainey/Cummings,46,2020-02-10 102 | 21305954,Single Trip (Pay-as-you-ride),871,2019-11-26 14:26:05 UTC,3687,Boardwalk West,2707,Rainey/Cummings,43,2020-02-10 103 | 21304191,Local365,1434,2019-11-26 07:36:08 UTC,2563,Rainey/Davis,3390,6th/Brazos,6,2020-02-10 104 | 21306780,Pay-as-you-ride,2274,2019-11-26 16:59:44 UTC,2572,Barton Springs Pool,3513,South Congress/Barton Springs @ The Austin American-Statesman,56,2020-02-10 105 | 21306779,Local365,640,2019-11-26 16:59:38 UTC,2572,Barton Springs Pool,3513,South Congress/Barton Springs @ The Austin American-Statesman,56,2020-02-10 106 | 21306982,Local365,1277,2019-11-26 17:48:13 UTC,2542,Plaza Saltillo,3513,South Congress/Barton Springs @ The 
Austin American-Statesman,13,2020-02-10 107 | 21306979,Local365,864,2019-11-26 17:48:08 UTC,2542,Plaza Saltillo,3513,South Congress/Barton Springs @ The Austin American-Statesman,13,2020-02-10 108 | 21305109,Local365,64,2019-11-26 11:55:44 UTC,2501,5th/Bowie,3619,6th/Congress,7,2020-02-10 109 | 21306921,Local30,663,2019-11-26 17:30:47 UTC,2552,3rd/West,3619,6th/Congress,9,2020-02-10 110 | 21305195,Local365,64,2019-11-26 12:11:02 UTC,3619,6th/Congress,3621,3rd/Nueces,3,2020-02-10 111 | 21305217,Local365,503,2019-11-26 12:13:51 UTC,2494,2nd/Congress,3621,3rd/Nueces,5,2020-02-10 112 | 21305539,Local365,135,2019-11-26 13:18:44 UTC,2495,4th/Congress,3621,3rd/Nueces,6,2020-02-10 113 | 21306002,Local365,860,2019-11-26 14:36:01 UTC,2501,5th/Bowie,3621,3rd/Nueces,4,2020-02-10 114 | 21304793,24 Hour Walk Up Pass,815,2019-11-26 10:46:34 UTC,2575,Riverside/South Lamar,3684,Cesar Chavez/Congress,69,2020-02-10 115 | 21304775,24 Hour Walk Up Pass,2355,2019-11-26 10:42:28 UTC,2575,Riverside/South Lamar,3684,Cesar Chavez/Congress,73,2020-02-10 116 | 21304789,24 Hour Walk Up Pass,646,2019-11-26 10:45:49 UTC,2575,Riverside/South Lamar,3684,Cesar Chavez/Congress,70,2020-02-10 117 | 21304783,24 Hour Walk Up Pass,610,2019-11-26 10:43:58 UTC,2575,Riverside/South Lamar,3684,Cesar Chavez/Congress,72,2020-02-10 118 | 21304786,24 Hour Walk Up Pass,963,2019-11-26 10:44:45 UTC,2575,Riverside/South Lamar,3684,Cesar Chavez/Congress,72,2020-02-10 119 | 21304799,24 Hour Walk Up Pass,315,2019-11-26 10:49:00 UTC,2574,Zilker Park,3686,Sterzing/Barton Springs,2,2020-02-10 120 | 21305254,24 Hour Walk Up Pass,461,2019-11-26 12:20:21 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,3686,Sterzing/Barton Springs,27,2020-02-10 121 | 21305241,24 Hour Walk Up Pass,479,2019-11-26 12:18:25 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,3686,Sterzing/Barton Springs,29,2020-02-10 122 | 21306944,U.T. 
Student Membership,040G,2019-11-26 17:39:04 UTC,3841,23rd/Rio Grande,3792,22nd/Pearl,2,2020-02-10 123 | 21306314,Local365,088G,2019-11-26 15:36:42 UTC,3793,28th/Rio Grande,3793,28th/Rio Grande,6,2020-02-10 124 | 21304374,U.T. Student Membership,621,2019-11-26 08:46:45 UTC,3793,28th/Rio Grande,3793,28th/Rio Grande,1,2020-02-10 125 | 21304501,Local365,936,2019-11-26 09:21:47 UTC,3795,Dean Keeton/Whitis,3793,28th/Rio Grande,5,2020-02-10 126 | 21304508,U.T. Student Membership,113G,2019-11-26 09:23:02 UTC,3797,21st/University,3793,28th/Rio Grande,11,2020-02-10 127 | 21305514,U.T. Student Membership,057G,2019-11-26 13:13:00 UTC,3798,21st/Speedway @ PCL,3793,28th/Rio Grande,9,2020-02-10 128 | 21306776,Local365,388,2019-11-26 16:58:47 UTC,3798,21st/Speedway @ PCL,3793,28th/Rio Grande,11,2020-02-10 129 | 21305747,Local365,115,2019-11-26 13:56:17 UTC,2547,21st/Guadalupe,3793,28th/Rio Grande,8,2020-02-10 130 | 21305122,U.T. Student Membership,2887,2019-11-26 11:57:46 UTC,2548,Guadalupe/West Mall @ University Co-op,3793,28th/Rio Grande,4,2020-02-10 131 | 21307581,Local365,460,2019-11-26 21:19:26 UTC,3841,23rd/Rio Grande,3794,"Dean Keeton/Speedway ",4,2020-02-10 132 | 21304541,Local365,544,2019-11-26 09:30:42 UTC,3794,"Dean Keeton/Speedway ",3794,"Dean Keeton/Speedway ",1,2020-02-10 133 | 21307203,Local365,460,2019-11-26 18:58:26 UTC,3794,"Dean Keeton/Speedway ",3794,"Dean Keeton/Speedway ",25,2020-02-10 134 | 21304825,Local365,400,2019-11-26 10:51:32 UTC,3798,21st/Speedway @ PCL,3794,"Dean Keeton/Speedway ",4,2020-02-10 135 | 21305842,Local365,936,2019-11-26 14:09:56 UTC,3798,21st/Speedway @ PCL,3794,"Dean Keeton/Speedway ",4,2020-02-10 136 | 21305668,U.T. Student Membership,854,2019-11-26 13:41:32 UTC,3798,21st/Speedway @ PCL,3794,"Dean Keeton/Speedway ",31,2020-02-10 137 | 21307173,Local365,460,2019-11-26 18:49:19 UTC,2547,21st/Guadalupe,3794,"Dean Keeton/Speedway ",9,2020-02-10 138 | 21305203,U.T. 
Student Membership,400,2019-11-26 12:12:27 UTC,3838,26th/Nueces,3794,"Dean Keeton/Speedway ",120,2020-02-10 139 | 21305721,U.T. Student Membership,156,2019-11-26 13:50:43 UTC,3841,23rd/Rio Grande,3795,Dean Keeton/Whitis,4,2020-02-10 140 | 21304839,U.T. Student Membership,382,2019-11-26 10:55:04 UTC,3792,22nd/Pearl,3795,Dean Keeton/Whitis,5,2020-02-10 141 | 21307773,U.T. Student Membership,2288,2019-11-26 23:20:08 UTC,3798,21st/Speedway @ PCL,3795,Dean Keeton/Whitis,6,2020-02-10 142 | 21307334,Local365,4,2019-11-26 19:37:05 UTC,3798,21st/Speedway @ PCL,3795,Dean Keeton/Whitis,5,2020-02-10 143 | 21305627,Local365,2722,2019-11-26 13:33:15 UTC,3799,23rd/San Jacinto @ DKR Stadium,3795,Dean Keeton/Whitis,4,2020-02-10 144 | 21306065,Local365,4,2019-11-26 14:47:22 UTC,2547,21st/Guadalupe,3795,Dean Keeton/Whitis,5,2020-02-10 145 | 21306800,Local365,58,2019-11-26 17:03:34 UTC,3838,26th/Nueces,3795,Dean Keeton/Whitis,3,2020-02-10 146 | 21305245,U.T. Student Membership,895,2019-11-26 12:18:59 UTC,3841,23rd/Rio Grande,3797,21st/University,4,2020-02-10 147 | 21304605,Explorer,2204,2019-11-26 09:52:56 UTC,2494,2nd/Congress,3797,21st/University,18,2020-02-10 148 | 21304604,Explorer,014G,2019-11-26 09:52:52 UTC,2494,2nd/Congress,3797,21st/University,18,2020-02-10 149 | 21304898,Local365,113G,2019-11-26 11:08:34 UTC,3793,28th/Rio Grande,3797,21st/University,6,2020-02-10 150 | 21307250,U.T. Student Membership,115,2019-11-26 19:11:36 UTC,2548,Guadalupe/West Mall @ University Co-op,3797,21st/University,3,2020-02-10 151 | 21304211,U.T. Student Membership,113G,2019-11-26 07:45:16 UTC,3838,26th/Nueces,3797,21st/University,5,2020-02-10 152 | 21306181,U.T. Student Membership,460,2019-11-26 15:11:02 UTC,3841,23rd/Rio Grande,3798,21st/Speedway @ PCL,8,2020-02-10 153 | 21305397,U.T. Student Membership,057G,2019-11-26 12:49:07 UTC,3792,22nd/Pearl,3798,21st/Speedway @ PCL,6,2020-02-10 154 | 21306519,U.T. 
Student Membership,388,2019-11-26 16:10:46 UTC,3792,22nd/Pearl,3798,21st/Speedway @ PCL,4,2020-02-10 155 | 21305049,U.T. Student Membership,936,2019-11-26 11:46:08 UTC,3793,28th/Rio Grande,3798,21st/Speedway @ PCL,7,2020-02-10 156 | 21306454,Local365,088G,2019-11-26 16:00:06 UTC,3793,28th/Rio Grande,3798,21st/Speedway @ PCL,13,2020-02-10 157 | 21307232,Local365,057G,2019-11-26 19:06:38 UTC,3793,28th/Rio Grande,3798,21st/Speedway @ PCL,9,2020-02-10 158 | 21304467,Local365,400,2019-11-26 09:13:53 UTC,3794,"Dean Keeton/Speedway ",3798,21st/Speedway @ PCL,4,2020-02-10 159 | 21307119,Local365,4,2019-11-26 18:23:07 UTC,3795,Dean Keeton/Whitis,3798,21st/Speedway @ PCL,4,2020-02-10 160 | 21307771,U.T. Student Membership,571,2019-11-26 23:15:30 UTC,2548,Guadalupe/West Mall @ University Co-op,3798,21st/Speedway @ PCL,6,2020-02-10 161 | 21304394,U.T. Student Membership,110,2019-11-26 08:51:55 UTC,3838,26th/Nueces,3798,21st/Speedway @ PCL,5,2020-02-10 162 | 21306704,U.T. Student Membership,078G,2019-11-26 16:43:35 UTC,3838,26th/Nueces,3798,21st/Speedway @ PCL,6,2020-02-10 163 | 21304252,U.T. Student Membership,164,2019-11-26 08:01:09 UTC,3793,28th/Rio Grande,3799,23rd/San Jacinto @ DKR Stadium,7,2020-02-10 164 | 21305753,Local365,078G,2019-11-26 13:57:43 UTC,2495,4th/Congress,3838,26th/Nueces,38,2020-02-10 165 | 21304918,U.T. Student Membership,400,2019-11-26 11:12:46 UTC,3794,"Dean Keeton/Speedway ",3838,26th/Nueces,4,2020-02-10 166 | 21303996,U.T. 
Student Membership,110,2019-11-26 01:03:45 UTC,3794,"Dean Keeton/Speedway ",3838,26th/Nueces,3,2020-02-10 167 | 21307187,Local365,58,2019-11-26 18:51:55 UTC,3795,Dean Keeton/Whitis,3838,26th/Nueces,2,2020-02-10 168 | 21306086,Local365,078G,2019-11-26 14:50:14 UTC,3838,26th/Nueces,3838,26th/Nueces,15,2020-02-10 169 | 21307470,Local365,460,2019-11-26 20:33:05 UTC,2563,Rainey/Davis,3841,23rd/Rio Grande,32,2020-02-10 170 | 21305869,24 Hour Walk Up Pass,335G,2019-11-26 14:14:12 UTC,2572,Barton Springs Pool,3841,23rd/Rio Grande,44,2020-02-10 171 | 21305874,24 Hour Walk Up Pass,716,2019-11-26 14:15:05 UTC,2572,Barton Springs Pool,3841,23rd/Rio Grande,43,2020-02-10 172 | 21306359,U.T. Student Membership,349,2019-11-26 15:44:47 UTC,3792,22nd/Pearl,3841,23rd/Rio Grande,2,2020-02-10 173 | 21306858,Local365,2887,2019-11-26 17:17:53 UTC,3793,28th/Rio Grande,3841,23rd/Rio Grande,19,2020-02-10 174 | 21306214,U.T. Student Membership,936,2019-11-26 15:17:18 UTC,3794,"Dean Keeton/Speedway ",3841,23rd/Rio Grande,4,2020-02-10 175 | 21307160,U.T. Student Membership,400,2019-11-26 18:40:32 UTC,3794,"Dean Keeton/Speedway ",3841,23rd/Rio Grande,7,2020-02-10 176 | 21306782,U.T. Student Membership,156,2019-11-26 17:00:02 UTC,3795,Dean Keeton/Whitis,3841,23rd/Rio Grande,4,2020-02-10 177 | 21306617,Local365,460,2019-11-26 16:26:33 UTC,3798,21st/Speedway @ PCL,3841,23rd/Rio Grande,8,2020-02-10 178 | 21306622,U.T. Student Membership,040G,2019-11-26 16:27:21 UTC,3798,21st/Speedway @ PCL,3841,23rd/Rio Grande,7,2020-02-10 179 | -------------------------------------------------------------------------------- /week6/data/bike_data_20200211.csv: -------------------------------------------------------------------------------- 1 | trip_id,subscriber_type,bikeid,start_time,start_station_id,start_station_name,end_station_id,end_station_name,duration_minutes,dummy_date 2 | 21308544,U.T. 
Student Membership,213,2019-11-27 11:17:18 UTC,4061,Lakeshore/Austin Hostel,4062,Lakeshore/Pleasant Valley,2,2020-02-11 3 | 21310203,Local365,2130,2019-11-27 18:27:12 UTC,4057,"6th/Chalmers ",2544,East 6th/Pedernales,4,2020-02-11 4 | 21309658,Explorer,430,2019-11-27 15:34:01 UTC,3791,Lake Austin/Enfield,3790,Lake Austin Blvd/Deep Eddy,8,2020-02-11 5 | 21309227,Explorer,963,2019-11-27 13:52:45 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,3791,Lake Austin/Enfield,30,2020-02-11 6 | 21309284,Explorer,430,2019-11-27 14:03:57 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3791,Lake Austin/Enfield,20,2020-02-11 7 | 21308183,Explorer,1551,2019-11-27 09:13:24 UTC,2504,South Congress/Elizabeth,4060,Red River/Cesar Chavez @ The Fairmont,14,2020-02-11 8 | 21308175,Explorer,2122B,2019-11-27 09:11:35 UTC,2504,South Congress/Elizabeth,4060,Red River/Cesar Chavez @ The Fairmont,16,2020-02-11 9 | 21309393,Explorer,1531,2019-11-27 14:31:39 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2494,2nd/Congress,13,2020-02-11 10 | 21309390,Explorer,920,2019-11-27 14:31:20 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2494,2nd/Congress,13,2020-02-11 11 | 21307883,Local365,865,2019-11-27 06:46:19 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,2494,2nd/Congress,4,2020-02-11 12 | 21307961,Local365,160,2019-11-27 07:35:14 UTC,2552,3rd/West,2494,2nd/Congress,3,2020-02-11 13 | 21307897,Local365,110G,2019-11-27 06:50:30 UTC,2552,3rd/West,2495,4th/Congress,4,2020-02-11 14 | 21308222,Local365,109,2019-11-27 09:29:39 UTC,2552,3rd/West,2495,4th/Congress,4,2020-02-11 15 | 21307983,Local365,650,2019-11-27 07:48:46 UTC,2542,Plaza Saltillo,2496,8th/Congress,11,2020-02-11 16 | 21308989,Local365+Guest Pass,432,2019-11-27 12:57:53 UTC,2707,Rainey/Cummings,2501,5th/Bowie,13,2020-02-11 17 | 21309070,Single Trip (Pay-as-you-ride),893,2019-11-27 13:13:19 UTC,2574,Zilker Park,2503,South Congress/James,29,2020-02-11 18 | 21309849,Explorer,963,2019-11-27 16:19:03 
UTC,2574,Zilker Park,2503,South Congress/James,39,2020-02-11 19 | 21309066,Single Trip (Pay-as-you-ride),873,2019-11-27 13:12:40 UTC,2574,Zilker Park,2503,South Congress/James,30,2020-02-11 20 | 21309850,Explorer,861,2019-11-27 16:19:27 UTC,2574,Zilker Park,2503,South Congress/James,39,2020-02-11 21 | 21309730,Explorer,252,2019-11-27 15:48:28 UTC,2503,South Congress/James,2503,South Congress/James,1,2020-02-11 22 | 21309738,Explorer,873,2019-11-27 15:49:58 UTC,2503,South Congress/James,2503,South Congress/James,3,2020-02-11 23 | 21309122,Explorer,1585,2019-11-27 13:21:47 UTC,3619,6th/Congress,2504,South Congress/Elizabeth,57,2020-02-11 24 | 21309118,Explorer,849,2019-11-27 13:21:21 UTC,3619,6th/Congress,2504,South Congress/Elizabeth,57,2020-02-11 25 | 21309117,Explorer,134,2019-11-27 13:20:57 UTC,3619,6th/Congress,2504,South Congress/Elizabeth,58,2020-02-11 26 | 21308178,Explorer,453,2019-11-27 09:12:04 UTC,2504,South Congress/Elizabeth,2504,South Congress/Elizabeth,1,2020-02-11 27 | 21310395,Local365,663,2019-11-27 20:14:09 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,2537,6th/West,3,2020-02-11 28 | 21308790,Local365,650,2019-11-27 12:10:14 UTC,2496,8th/Congress,2539,3rd/Trinity @ The Convention Center,7,2020-02-11 29 | 21309879,Explorer,349G,2019-11-27 16:26:21 UTC,2574,Zilker Park,2540,17th/Guadalupe,62,2020-02-11 30 | 21309881,Explorer,88,2019-11-27 16:26:35 UTC,2574,Zilker Park,2540,17th/Guadalupe,62,2020-02-11 31 | 21309884,Explorer,529,2019-11-27 16:27:09 UTC,2574,Zilker Park,2540,17th/Guadalupe,1066,2020-02-11 32 | 21308043,Local30,110,2019-11-27 08:17:45 UTC,3798,21st/Speedway @ PCL,2540,17th/Guadalupe,6,2020-02-11 33 | 21310130,Local365,745,2019-11-27 17:53:17 UTC,2544,East 6th/Pedernales,2542,Plaza Saltillo,5,2020-02-11 34 | 21309973,Local365,2351,2019-11-27 16:49:20 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,2547,21st/Guadalupe,17,2020-02-11 35 | 21310237,Local365,166,2019-11-27 18:40:43 UTC,3799,23rd/San Jacinto 
@ DKR Stadium,2547,21st/Guadalupe,8,2020-02-11 36 | 21310409,Local365,2351,2019-11-27 20:22:22 UTC,2552,3rd/West,2547,21st/Guadalupe,18,2020-02-11 37 | 21308764,Local365,460,2019-11-27 12:02:50 UTC,3838,26th/Nueces,2547,21st/Guadalupe,7,2020-02-11 38 | 21308955,Local365,572,2019-11-27 12:49:31 UTC,3838,26th/Nueces,2547,21st/Guadalupe,5,2020-02-11 39 | 21308138,Single Trip (Pay-as-you-ride),472,2019-11-27 08:57:24 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,2549,South 1st/Riverside @ Long Center,19,2020-02-11 40 | 21309181,Local365,2351,2019-11-27 13:40:41 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,2552,3rd/West,16,2020-02-11 41 | 21310369,Local365,014G,2019-11-27 19:58:39 UTC,2494,2nd/Congress,2552,3rd/West,42,2020-02-11 42 | 21308605,Local365,865,2019-11-27 11:30:32 UTC,2494,2nd/Congress,2552,3rd/West,4,2020-02-11 43 | 21310368,Local365,160,2019-11-27 19:57:55 UTC,2494,2nd/Congress,2552,3rd/West,8,2020-02-11 44 | 21308263,Local30,14264,2019-11-27 09:47:45 UTC,2496,8th/Congress,2552,3rd/West,11,2020-02-11 45 | 21310250,Local365,2351,2019-11-27 18:48:20 UTC,2547,21st/Guadalupe,2552,3rd/West,18,2020-02-11 46 | 21308806,Local365,2351,2019-11-27 12:15:01 UTC,2547,21st/Guadalupe,2552,3rd/West,15,2020-02-11 47 | 21310104,Local365,133,2019-11-27 17:37:51 UTC,2552,3rd/West,2552,3rd/West,14,2020-02-11 48 | 21310290,Local365,2351,2019-11-27 19:08:00 UTC,2552,3rd/West,2552,3rd/West,13,2020-02-11 49 | 21310317,Local365,2351,2019-11-27 19:21:52 UTC,2552,3rd/West,2552,3rd/West,41,2020-02-11 50 | 21309867,Single Trip (Pay-as-you-ride),827,2019-11-27 16:24:17 UTC,4061,Lakeshore/Austin Hostel,2563,Rainey/Davis,30,2020-02-11 51 | 21309784,24 Hour Walk Up Pass,871,2019-11-27 16:01:08 UTC,2707,Rainey/Cummings,2563,Rainey/Davis,53,2020-02-11 52 | 21309798,24 Hour Walk Up Pass,220,2019-11-27 16:04:53 UTC,2707,Rainey/Cummings,2563,Rainey/Davis,50,2020-02-11 53 | 21308193,Pay-as-you-ride,280,2019-11-27 09:15:45 UTC,2569,East 11th/San 
Marcos,2565,6th/Trinity,13,2020-02-11 54 | 21309042,Local365,2351,2019-11-27 13:09:29 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,31,2020-02-11 55 | 21309420,Local365,1555,2019-11-27 14:37:43 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,6,2020-02-11 56 | 21309109,Explorer,663,2019-11-27 13:20:32 UTC,3292,East 4th/Chicon,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,31,2020-02-11 57 | 21309114,Explorer,024G,2019-11-27 13:20:46 UTC,3292,East 4th/Chicon,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,31,2020-02-11 58 | 21308028,Local365,305G,2019-11-27 08:11:56 UTC,2549,South 1st/Riverside @ Long Center,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,6,2020-02-11 59 | 21308872,Local365,2351,2019-11-27 12:34:54 UTC,2552,3rd/West,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,18,2020-02-11 60 | 21309330,Local365,2351,2019-11-27 14:15:54 UTC,2552,3rd/West,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,45,2020-02-11 61 | 21309168,Local365,1531,2019-11-27 13:36:33 UTC,2563,Rainey/Davis,2567,Barton Springs/Bouldin @ Palmer Auditorium,18,2020-02-11 62 | 21309579,Local365,1555,2019-11-27 15:17:04 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,2567,Barton Springs/Bouldin @ Palmer Auditorium,5,2020-02-11 63 | 21309337,Local365,1555,2019-11-27 14:17:04 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,2567,Barton Springs/Bouldin @ Palmer Auditorium,11,2020-02-11 64 | 21309445,Local365,3,2019-11-27 14:44:56 UTC,2711,Barton Springs/Kinney,2567,Barton Springs/Bouldin @ Palmer Auditorium,4,2020-02-11 65 | 21308465,Local365,440,2019-11-27 10:57:01 UTC,2539,3rd/Trinity @ The Convention Center,2569,East 11th/San Marcos,12,2020-02-11 66 | 21310011,Local365,3,2019-11-27 17:00:14 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2571,8th/Red River,17,2020-02-11 67 | 
21309434,Pay-as-you-ride,208,2019-11-27 14:40:45 UTC,2572,Barton Springs Pool,2572,Barton Springs Pool,21,2020-02-11 68 | 21309694,Explorer,529,2019-11-27 15:42:56 UTC,3790,Lake Austin Blvd/Deep Eddy,2574,Zilker Park,30,2020-02-11 69 | 21309656,Explorer,963,2019-11-27 15:33:50 UTC,3791,Lake Austin/Enfield,2574,Zilker Park,39,2020-02-11 70 | 21309542,Single Trip (Pay-as-you-ride),080G,2019-11-27 15:09:07 UTC,2570,South Congress/Academy,2574,Zilker Park,23,2020-02-11 71 | 21309547,Single Trip (Pay-as-you-ride),349G,2019-11-27 15:09:42 UTC,2570,South Congress/Academy,2574,Zilker Park,23,2020-02-11 72 | 21308571,Explorer,88,2019-11-27 11:24:35 UTC,2574,Zilker Park,2574,Zilker Park,5,2020-02-11 73 | 21308562,Explorer,893,2019-11-27 11:23:04 UTC,2574,Zilker Park,2574,Zilker Park,6,2020-02-11 74 | 21308471,Explorer,113G,2019-11-27 10:58:50 UTC,2494,2nd/Congress,2574,Zilker Park,24,2020-02-11 75 | 21308470,Explorer,043G,2019-11-27 10:58:50 UTC,2494,2nd/Congress,2574,Zilker Park,24,2020-02-11 76 | 21309751,Explorer,928,2019-11-27 15:52:50 UTC,2503,South Congress/James,2574,Zilker Park,32,2020-02-11 77 | 21309727,Explorer,2095,2019-11-27 15:48:17 UTC,2503,South Congress/James,2574,Zilker Park,36,2020-02-11 78 | 21309735,Explorer,893,2019-11-27 15:49:23 UTC,2503,South Congress/James,2574,Zilker Park,35,2020-02-11 79 | 21309929,Single Trip (Pay-as-you-ride),815,2019-11-27 16:37:25 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,22,2020-02-11 80 | 21309142,Pay-as-you-ride,871,2019-11-27 13:28:35 UTC,2707,Rainey/Cummings,2707,Rainey/Cummings,33,2020-02-11 81 | 21308980,Local365,3,2019-11-27 12:54:19 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,2711,Barton Springs/Kinney,3,2020-02-11 82 | 21309002,Explorer,024G,2019-11-27 13:01:29 UTC,3619,6th/Congress,3292,East 4th/Chicon,19,2020-02-11 83 | 21309000,Explorer,663,2019-11-27 13:01:16 UTC,3619,6th/Congress,3292,East 4th/Chicon,19,2020-02-11 84 | 21309229,Explorer,610,2019-11-27 13:53:01 UTC,2566,Electric 
Drive/Sandra Muraida Way @ Pfluger Ped Bridge,3377,Veterans/Atlanta @ MoPac Ped Bridge,9,2020-02-11 85 | 21308748,Single Trip (Pay-as-you-ride),430,2019-11-27 11:59:16 UTC,2575,Riverside/South Lamar,3377,Veterans/Atlanta @ MoPac Ped Bridge,15,2020-02-11 86 | 21308736,Single Trip (Pay-as-you-ride),1555,2019-11-27 11:58:31 UTC,2575,Riverside/South Lamar,3377,Veterans/Atlanta @ MoPac Ped Bridge,16,2020-02-11 87 | 21310024,Local365,922,2019-11-27 17:05:20 UTC,2552,3rd/West,3390,6th/Brazos,8,2020-02-11 88 | 21309253,Explorer,1471,2019-11-27 13:57:53 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,3513,South Congress/Barton Springs @ The Austin American-Statesman,52,2020-02-11 89 | 21309268,Explorer,554,2019-11-27 14:00:27 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,3513,South Congress/Barton Springs @ The Austin American-Statesman,49,2020-02-11 90 | 21309273,Explorer,240,2019-11-27 14:01:53 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,3513,South Congress/Barton Springs @ The Austin American-Statesman,48,2020-02-11 91 | 21309614,Local365,1531,2019-11-27 15:24:15 UTC,2494,2nd/Congress,3513,South Congress/Barton Springs @ The Austin American-Statesman,3,2020-02-11 92 | 21309851,Single Trip (Pay-as-you-ride),75,2019-11-27 16:19:53 UTC,2494,2nd/Congress,3513,South Congress/Barton Springs @ The Austin American-Statesman,31,2020-02-11 93 | 21309854,Single Trip (Pay-as-you-ride),920,2019-11-27 16:20:37 UTC,2494,2nd/Congress,3513,South Congress/Barton Springs @ The Austin American-Statesman,30,2020-02-11 94 | 21310315,Local365,1555,2019-11-27 19:21:23 UTC,2567,Barton Springs/Bouldin @ Palmer Auditorium,3619,6th/Congress,9,2020-02-11 95 | 21310248,Local365,661,2019-11-27 18:47:13 UTC,3390,6th/Brazos,3621,3rd/Nueces,6,2020-02-11 96 | 21309385,Single Trip (Pay-as-you-ride),2274,2019-11-27 14:30:26 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,3687,Boardwalk West,6,2020-02-11 97 | 
21310364,Local365,1545,2019-11-27 19:52:29 UTC,2549,South 1st/Riverside @ Long Center,3687,Boardwalk West,7,2020-02-11 98 | 21310558,U.T. Student Membership,303G,2019-11-27 21:44:18 UTC,3798,21st/Speedway @ PCL,3792,22nd/Pearl,6,2020-02-11 99 | 21308541,24 Hour Walk Up Pass,683,2019-11-27 11:16:10 UTC,4052,Rosewood/Angelina,3794,"Dean Keeton/Speedway ",33,2020-02-11 100 | 21308543,24 Hour Walk Up Pass,034G,2019-11-27 11:17:13 UTC,4052,Rosewood/Angelina,3794,"Dean Keeton/Speedway ",32,2020-02-11 101 | 21309130,Local365,772,2019-11-27 13:25:23 UTC,3798,21st/Speedway @ PCL,3794,"Dean Keeton/Speedway ",5,2020-02-11 102 | 21308610,Local365,2197,2019-11-27 11:31:15 UTC,3799,23rd/San Jacinto @ DKR Stadium,3794,"Dean Keeton/Speedway ",6,2020-02-11 103 | 21310445,Local365,2351,2019-11-27 20:40:22 UTC,2547,21st/Guadalupe,3794,"Dean Keeton/Speedway ",13,2020-02-11 104 | 21309010,Single Trip (Pay-as-you-ride),511,2019-11-27 13:03:49 UTC,3795,Dean Keeton/Whitis,3795,Dean Keeton/Whitis,60,2020-02-11 105 | 21309196,U.T. Student Membership,057G,2019-11-27 13:43:37 UTC,3798,21st/Speedway @ PCL,3795,Dean Keeton/Whitis,43,2020-02-11 106 | 21309201,Single Trip (Pay-as-you-ride),153,2019-11-27 13:44:55 UTC,3798,21st/Speedway @ PCL,3795,Dean Keeton/Whitis,19,2020-02-11 107 | 21309205,Single Trip (Pay-as-you-ride),907,2019-11-27 13:45:37 UTC,3798,21st/Speedway @ PCL,3795,Dean Keeton/Whitis,18,2020-02-11 108 | 21307799,U.T. Student Membership,040G,2019-11-27 00:21:41 UTC,3792,22nd/Pearl,3798,21st/Speedway @ PCL,4,2020-02-11 109 | 21308719,Local365,854,2019-11-27 11:55:59 UTC,3794,"Dean Keeton/Speedway ",3798,21st/Speedway @ PCL,5,2020-02-11 110 | 21309043,Single Trip (Pay-as-you-ride),382,2019-11-27 13:10:03 UTC,3795,Dean Keeton/Whitis,3798,21st/Speedway @ PCL,33,2020-02-11 111 | 21309048,Single Trip (Pay-as-you-ride),3471,2019-11-27 13:10:49 UTC,3795,Dean Keeton/Whitis,3798,21st/Speedway @ PCL,33,2020-02-11 112 | 21308160,U.T. 
Student Membership,057G,2019-11-27 09:04:16 UTC,2548,Guadalupe/West Mall @ University Co-op,3798,21st/Speedway @ PCL,3,2020-02-11 113 | 21310534,U.T. Student Membership,303G,2019-11-27 21:34:14 UTC,2548,Guadalupe/West Mall @ University Co-op,3798,21st/Speedway @ PCL,3,2020-02-11 114 | 21307839,U.T. Student Membership,772,2019-11-27 05:12:49 UTC,3838,26th/Nueces,3798,21st/Speedway @ PCL,5,2020-02-11 115 | 21308554,U.T. Student Membership,58,2019-11-27 11:20:19 UTC,3838,26th/Nueces,3798,21st/Speedway @ PCL,5,2020-02-11 116 | 21310491,Local365,2351,2019-11-27 21:01:50 UTC,3794,"Dean Keeton/Speedway ",3799,23rd/San Jacinto @ DKR Stadium,4,2020-02-11 117 | 21308607,Local365,14151,2019-11-27 11:30:50 UTC,3799,23rd/San Jacinto @ DKR Stadium,3799,23rd/San Jacinto @ DKR Stadium,1,2020-02-11 118 | 21308639,Local365,460,2019-11-27 11:37:22 UTC,3794,"Dean Keeton/Speedway ",3838,26th/Nueces,25,2020-02-11 119 | 21309206,Local365,549,2019-11-27 13:45:38 UTC,3793,28th/Rio Grande,3841,23rd/Rio Grande,3,2020-02-11 120 | 21308705,Local365,388,2019-11-27 11:52:53 UTC,3793,28th/Rio Grande,3841,23rd/Rio Grande,4,2020-02-11 121 | -------------------------------------------------------------------------------- /week6/data/bike_data_20200212.csv: -------------------------------------------------------------------------------- 1 | trip_id,subscriber_type,bikeid,start_time,start_station_id,start_station_name,end_station_id,end_station_name,duration_minutes,dummy_date 2 | 21312064,Explorer,995,2019-11-28 16:40:40 UTC,4061,Lakeshore/Austin Hostel,4050,5th/Campbell,43,2020-02-12 3 | 21312062,24 Hour Walk Up Pass,420,2019-11-28 16:40:27 UTC,4061,Lakeshore/Austin Hostel,4050,5th/Campbell,43,2020-02-12 4 | 21311288,Single Trip (Pay-as-you-ride),066G,2019-11-28 11:34:17 UTC,2823,East 5th/Broadway @ Capital Metro HQ,2823,East 5th/Broadway @ Capital Metro HQ,71,2020-02-12 5 | 21312061,Explorer,683,2019-11-28 16:40:11 UTC,4061,Lakeshore/Austin Hostel,4050,5th/Campbell,43,2020-02-12 6 | 
21312063,Explorer,12811,2019-11-28 16:40:29 UTC,4061,Lakeshore/Austin Hostel,4050,5th/Campbell,43,2020-02-12 7 | 21312068,Explorer,034G,2019-11-28 16:41:37 UTC,4061,Lakeshore/Austin Hostel,4050,5th/Campbell,42,2020-02-12 8 | 21311990,Explorer,12811,2019-11-28 16:02:10 UTC,2562,8th/San Jacinto,4061,Lakeshore/Austin Hostel,37,2020-02-12 9 | 21311993,Explorer,995,2019-11-28 16:02:47 UTC,2562,8th/San Jacinto,4061,Lakeshore/Austin Hostel,37,2020-02-12 10 | 21311438,Single Trip (Pay-as-you-ride),2355,2019-11-28 12:40:15 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,3790,Lake Austin Blvd/Deep Eddy,52,2020-02-12 11 | 21311434,Single Trip (Pay-as-you-ride),024G,2019-11-28 12:39:14 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,3790,Lake Austin Blvd/Deep Eddy,53,2020-02-12 12 | 21311430,Single Trip (Pay-as-you-ride),198,2019-11-28 12:38:09 UTC,2566,Electric Drive/Sandra Muraida Way @ Pfluger Ped Bridge,3790,Lake Austin Blvd/Deep Eddy,54,2020-02-12 13 | 21311182,Local365,229G,2019-11-28 10:49:16 UTC,2575,Riverside/South Lamar,4058,Hollow Creek/Barton Hills,19,2020-02-12 14 | 21311869,24 Hour Walk Up Pass,106G,2019-11-28 15:20:10 UTC,3684,Cesar Chavez/Congress,4062,Lakeshore/Pleasant Valley,20,2020-02-12 15 | 21311872,24 Hour Walk Up Pass,134,2019-11-28 15:20:42 UTC,3684,Cesar Chavez/Congress,4062,Lakeshore/Pleasant Valley,20,2020-02-12 16 | 21311871,24 Hour Walk Up Pass,849,2019-11-28 15:20:26 UTC,3684,Cesar Chavez/Congress,4062,Lakeshore/Pleasant Valley,20,2020-02-12 17 | 21312203,24 Hour Walk Up Pass,375G,2019-11-28 18:48:10 UTC,2494,2nd/Congress,3660,East 6th/Medina,32,2020-02-12 18 | 21311924,Explorer,683,2019-11-28 15:42:05 UTC,3794,"Dean Keeton/Speedway ",4061,Lakeshore/Austin Hostel,57,2020-02-12 19 | 21311925,Explorer,034G,2019-11-28 15:42:17 UTC,3794,"Dean Keeton/Speedway ",4061,Lakeshore/Austin Hostel,59,2020-02-12 20 | 21311951,24 Hour Walk Up Pass,088G,2019-11-28 15:48:19 UTC,3798,21st/Speedway @ PCL,4061,Lakeshore/Austin 
Hostel,51,2020-02-12 21 | 21311266,24 Hour Walk Up Pass,1531,2019-11-28 11:24:19 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,2494,2nd/Congress,40,2020-02-12 22 | 21312153,24 Hour Walk Up Pass,375G,2019-11-28 17:43:13 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,2494,2nd/Congress,25,2020-02-12 23 | 21311268,24 Hour Walk Up Pass,1471,2019-11-28 11:24:53 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,2494,2nd/Congress,40,2020-02-12 24 | 21311565,Explorer,922,2019-11-28 13:21:48 UTC,3390,6th/Brazos,2495,4th/Congress,130,2020-02-12 25 | 21311606,24 Hour Walk Up Pass,387,2019-11-28 13:36:48 UTC,3294,6th/Lavaca,2503,South Congress/James,21,2020-02-12 26 | 21311604,24 Hour Walk Up Pass,578,2019-11-28 13:36:17 UTC,3294,6th/Lavaca,2503,South Congress/James,21,2020-02-12 27 | 21311607,24 Hour Walk Up Pass,152,2019-11-28 13:37:17 UTC,3294,6th/Lavaca,2503,South Congress/James,20,2020-02-12 28 | 21311378,Pay-as-you-ride,75,2019-11-28 12:14:39 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,2539,3rd/Trinity @ The Convention Center,6,2020-02-12 29 | 21311366,Local365,2351,2019-11-28 12:10:42 UTC,2552,3rd/West,2539,3rd/Trinity @ The Convention Center,26,2020-02-12 30 | 21312251,HT Ram Membership,078G,2019-11-28 19:41:24 UTC,2548,Guadalupe/West Mall @ University Co-op,2540,17th/Guadalupe,902,2020-02-12 31 | 21311973,Single Trip (Pay-as-you-ride),461,2019-11-28 15:56:32 UTC,3684,Cesar Chavez/Congress,2547,21st/Guadalupe,16,2020-02-12 32 | 21311966,Single Trip (Pay-as-you-ride),1277,2019-11-28 15:54:43 UTC,3684,Cesar Chavez/Congress,2547,21st/Guadalupe,18,2020-02-12 33 | 21311963,Annual Membership,670,2019-11-28 14:53:49 UTC,3684,Cesar Chavez/Congress,2547,21st/Guadalupe,19,2020-02-12 34 | 21311971,Annual Membership,646,2019-11-28 14:55:52 UTC,3684,Cesar Chavez/Congress,2547,21st/Guadalupe,17,2020-02-12 35 | 21311433,Local365,2351,2019-11-28 12:39:05 UTC,2539,3rd/Trinity @ The 
Convention Center,2547,21st/Guadalupe,19,2020-02-12 36 | 21312189,HT Ram Membership,078G,2019-11-28 18:31:48 UTC,3798,21st/Speedway @ PCL,2548,Guadalupe/West Mall @ University Co-op,9,2020-02-12 37 | 21312202,HT Ram Membership,078G,2019-11-28 18:47:59 UTC,2548,Guadalupe/West Mall @ University Co-op,2548,Guadalupe/West Mall @ University Co-op,53,2020-02-12 38 | 21312267,24 Hour Walk Up Pass,452,2019-11-28 19:48:56 UTC,2570,South Congress/Academy,2549,South 1st/Riverside @ Long Center,38,2020-02-12 39 | 21311041,Single Trip (Pay-as-you-ride),815,2019-11-28 09:47:17 UTC,2575,Riverside/South Lamar,2549,South 1st/Riverside @ Long Center,17,2020-02-12 40 | 21311689,Local365,2351,2019-11-28 14:01:27 UTC,3793,28th/Rio Grande,2549,South 1st/Riverside @ Long Center,42,2020-02-12 41 | 21311667,Pay-as-you-ride,75,2019-11-28 13:56:28 UTC,2539,3rd/Trinity @ The Convention Center,2549,South 1st/Riverside @ Long Center,27,2020-02-12 42 | 21311277,Local365,2351,2019-11-28 11:30:03 UTC,3799,23rd/San Jacinto @ DKR Stadium,2552,3rd/West,21,2020-02-12 43 | 21311926,Explorer,772,2019-11-28 15:42:27 UTC,3794,"Dean Keeton/Speedway ",2562,8th/San Jacinto,19,2020-02-12 44 | 21311917,Explorer,2197,2019-11-28 15:40:52 UTC,3794,"Dean Keeton/Speedway ",2562,8th/San Jacinto,21,2020-02-12 45 | 21312237,24 Hour Walk Up Pass,1984,2019-11-28 19:23:54 UTC,3660,East 6th/Medina,2570,South Congress/Academy,20,2020-02-12 46 | 21311422,Single Trip (Pay-as-you-ride),113G,2019-11-28 12:36:26 UTC,2574,Zilker Park,2574,Zilker Park,31,2020-02-12 47 | 21311419,Single Trip (Pay-as-you-ride),2095,2019-11-28 12:35:02 UTC,2574,Zilker Park,2574,Zilker Park,32,2020-02-12 48 | 21311467,Single Trip (Pay-as-you-ride),043G,2019-11-28 12:51:36 UTC,2574,Zilker Park,2574,Zilker Park,1,2020-02-12 49 | 21311406,24 Hour Walk Up Pass,3201,2019-11-28 12:31:07 UTC,4062,Lakeshore/Pleasant Valley,2575,Riverside/South Lamar,30,2020-02-12 50 | 21311409,24 Hour Walk Up Pass,1803,2019-11-28 12:32:05 UTC,4062,Lakeshore/Pleasant 
Valley,2575,Riverside/South Lamar,29,2020-02-12 51 | 21310976,Local365,301,2019-11-28 09:14:16 UTC,4058,Hollow Creek/Barton Hills,2575,Riverside/South Lamar,7,2020-02-12 52 | 21311404,24 Hour Walk Up Pass,2126,2019-11-28 12:30:16 UTC,4062,Lakeshore/Pleasant Valley,2575,Riverside/South Lamar,31,2020-02-12 53 | 21311056,Pay-as-you-ride,14,2019-11-28 09:59:03 UTC,2575,Riverside/South Lamar,2575,Riverside/South Lamar,42,2020-02-12 54 | 21311538,Explorer,1851,2019-11-28 13:17:57 UTC,3390,6th/Brazos,2711,Barton Springs/Kinney,36,2020-02-12 55 | 21312088,Local365,3201,2019-11-28 16:55:05 UTC,2575,Riverside/South Lamar,3377,Veterans/Atlanta @ MoPac Ped Bridge,2639,2020-02-12 56 | 21311712,24 Hour Walk Up Pass,966,2019-11-28 14:12:08 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3377,Veterans/Atlanta @ MoPac Ped Bridge,70,2020-02-12 57 | 21311707,24 Hour Walk Up Pass,610,2019-11-28 14:11:17 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3377,Veterans/Atlanta @ MoPac Ped Bridge,71,2020-02-12 58 | 21311362,Pay-as-you-ride,966,2019-11-28 12:09:00 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3377,Veterans/Atlanta @ MoPac Ped Bridge,42,2020-02-12 59 | 21311365,Pay-as-you-ride,610,2019-11-28 12:10:19 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3377,Veterans/Atlanta @ MoPac Ped Bridge,39,2020-02-12 60 | 21311694,Local365,966,2019-11-28 14:03:04 UTC,3377,Veterans/Atlanta @ MoPac Ped Bridge,3377,Veterans/Atlanta @ MoPac Ped Bridge,8,2020-02-12 61 | 21310838,Single Trip (Pay-as-you-ride),072G,2019-11-28 07:39:00 UTC,4052,Rosewood/Angelina,3390,6th/Brazos,10,2020-02-12 62 | 21311528,24 Hour Walk Up Pass,507,2019-11-28 13:13:16 UTC,2494,2nd/Congress,3390,6th/Brazos,43,2020-02-12 63 | 21311076,Explorer,520,2019-11-28 10:14:18 UTC,2495,4th/Congress,3390,6th/Brazos,830,2020-02-12 64 | 21311931,24 Hour Walk Up Pass,554,2019-11-28 15:43:09 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,3513,South Congress/Barton Springs @ The Austin American-Statesman,32,2020-02-12 65 
| 21311934,24 Hour Walk Up Pass,920,2019-11-28 15:44:15 UTC,3513,South Congress/Barton Springs @ The Austin American-Statesman,3513,South Congress/Barton Springs @ The Austin American-Statesman,31,2020-02-12 66 | 21312026,24 Hour Walk Up Pass,2077,2019-11-28 16:22:51 UTC,2548,Guadalupe/West Mall @ University Co-op,3513,South Congress/Barton Springs @ The Austin American-Statesman,33,2020-02-12 67 | 21311475,Single Trip (Pay-as-you-ride),893,2019-11-28 12:54:04 UTC,2574,Zilker Park,3621,3rd/Nueces,36,2020-02-12 68 | 21311710,24 Hour Walk Up Pass,106G,2019-11-28 14:11:42 UTC,2504,South Congress/Elizabeth,3684,Cesar Chavez/Congress,11,2020-02-12 69 | 21311711,24 Hour Walk Up Pass,849,2019-11-28 14:12:00 UTC,2504,South Congress/Elizabeth,3684,Cesar Chavez/Congress,10,2020-02-12 70 | 21311713,24 Hour Walk Up Pass,134,2019-11-28 14:12:14 UTC,2504,South Congress/Elizabeth,3684,Cesar Chavez/Congress,10,2020-02-12 71 | 21311780,Pay-as-you-ride,871,2019-11-28 14:45:35 UTC,2563,Rainey/Davis,3686,Sterzing/Barton Springs,34,2020-02-12 72 | 21312385,Pay-as-you-ride,072G,2019-11-28 22:05:50 UTC,3390,6th/Brazos,3792,22nd/Pearl,2228,2020-02-12 73 | 21311593,Local365,2351,2019-11-28 13:32:14 UTC,3793,28th/Rio Grande,3793,28th/Rio Grande,21,2020-02-12 74 | 21311491,Local365,2351,2019-11-28 12:58:12 UTC,2547,21st/Guadalupe,3793,28th/Rio Grande,27,2020-02-12 75 | 21311922,Explorer,544,2019-11-28 15:41:44 UTC,3794,"Dean Keeton/Speedway ",3794,"Dean Keeton/Speedway ",1,2020-02-12 76 | 21312304,24 Hour Walk Up Pass,1909,2019-11-28 20:27:30 UTC,2549,South 1st/Riverside @ Long Center,3794,"Dean Keeton/Speedway ",26,2020-02-12 77 | 21312127,Explorer,995,2019-11-28 17:24:06 UTC,4050,5th/Campbell,3795,Dean Keeton/Whitis,1094,2020-02-12 78 | 21312125,Explorer,034G,2019-11-28 17:23:57 UTC,4050,5th/Campbell,3795,Dean Keeton/Whitis,1095,2020-02-12 79 | 21312121,Explorer,683,2019-11-28 17:23:46 UTC,4050,5th/Campbell,3795,Dean Keeton/Whitis,1095,2020-02-12 80 | 21312124,24 Hour Walk Up 
Pass,1558,2019-11-28 17:23:57 UTC,4050,5th/Campbell,3795,Dean Keeton/Whitis,1095,2020-02-12 81 | 21312120,Explorer,12811,2019-11-28 17:23:38 UTC,4050,5th/Campbell,3795,Dean Keeton/Whitis,1095,2020-02-12 82 | 21312330,Local365,040G,2019-11-28 20:54:24 UTC,3798,21st/Speedway @ PCL,3798,21st/Speedway @ PCL,1,2020-02-12 83 | 21311988,Explorer,712,2019-11-28 16:01:53 UTC,3798,21st/Speedway @ PCL,3798,21st/Speedway @ PCL,1,2020-02-12 84 | -------------------------------------------------------------------------------- /week6/data/bike_schema.json: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "mode": "NULLABLE", 4 | "name": "trip_id", 5 | "type": "INTEGER" 6 | }, 7 | { 8 | "mode": "NULLABLE", 9 | "name": "subscriber_type", 10 | "type": "STRING" 11 | }, 12 | { 13 | "mode": "NULLABLE", 14 | "name": "bikeid", 15 | "type": "STRING" 16 | }, 17 | { 18 | "mode": "NULLABLE", 19 | "name": "start_time", 20 | "type": "TIMESTAMP" 21 | }, 22 | { 23 | "mode": "NULLABLE", 24 | "name": "start_station_id", 25 | "type": "INTEGER" 26 | }, 27 | { 28 | "mode": "NULLABLE", 29 | "name": "start_station_name", 30 | "type": "STRING" 31 | }, 32 | { 33 | "mode": "NULLABLE", 34 | "name": "end_station_id", 35 | "type": "STRING" 36 | }, 37 | { 38 | "mode": "NULLABLE", 39 | "name": "end_station_name", 40 | "type": "STRING" 41 | }, 42 | { 43 | "mode": "NULLABLE", 44 | "name": "duration_minutes", 45 | "type": "INTEGER" 46 | }, 47 | { 48 | "mode": "NULLABLE", 49 | "name": "dummy_date", 50 | "type": "DATE" 51 | } 52 | ] 53 | -------------------------------------------------------------------------------- /week9/calc_class.py: -------------------------------------------------------------------------------- 1 | from calc_func import * 2 | 3 | 4 | class Calculator: 5 | def __init__(self): 6 | self._last_answer = 0.0 7 | 8 | @property 9 | def last_answer(self): 10 | return self._last_answer 11 | 12 | def _do_math(self, a, b, func): 13 | self._last_answer = 
func(a, b) 14 | return self.last_answer 15 | 16 | def add(self, a, b): 17 | return self._do_math(a, b, add) 18 | 19 | def subtract(self, a, b): 20 | return self._do_math(a, b, subtract) 21 | 22 | def multiply(self, a, b): 23 | return self._do_math(a, b, multiply) 24 | 25 | def divide(self, a, b): 26 | return self._do_math(a, b, divide) 27 | 28 | def maximum(self, a, b): 29 | return self._do_math(a, b, maximum) 30 | 31 | def minimum(self, a, b): 32 | return self._do_math(a, b, minimum) 33 | -------------------------------------------------------------------------------- /week9/calc_func.py: -------------------------------------------------------------------------------- 1 | def add(a, b): 2 | return a + b 3 | 4 | 5 | def subtract(a, b): 6 | return a - b 7 | 8 | 9 | def multiply(a, b): 10 | return a * b 11 | 12 | 13 | def divide(a, b): 14 | # automatically raises ZeroDivisionError 15 | return a * 1.0 / b 16 | 17 | 18 | def maximum(a, b): 19 | return a if a >= b else b 20 | 21 | 22 | def minimum(a, b): 23 | return a if a <= b else b 24 | 25 | -------------------------------------------------------------------------------- /week9/iris.csv: -------------------------------------------------------------------------------- 1 | "sepal.length","sepal.width","petal.length","petal.width","variety" 2 | 5.1,3.5,1.4,.2,"Setosa" 3 | 4.9,3,1.4,.2,"Setosa" 4 | 4.7,3.2,1.3,.2,"Setosa" 5 | 4.6,3.1,1.5,.2,"Setosa" 6 | 5,3.6,1.4,.2,"Setosa" 7 | 5.4,3.9,1.7,.4,"Setosa" 8 | 4.6,3.4,1.4,.3,"Setosa" 9 | 5,3.4,1.5,.2,"Setosa" 10 | 4.4,2.9,1.4,.2,"Setosa" 11 | 4.9,3.1,1.5,.1,"Setosa" 12 | 5.4,3.7,1.5,.2,"Setosa" 13 | 4.8,3.4,1.6,.2,"Setosa" 14 | 4.8,3,1.4,.1,"Setosa" 15 | 4.3,3,1.1,.1,"Setosa" 16 | 5.8,4,1.2,.2,"Setosa" 17 | 5.7,4.4,1.5,.4,"Setosa" 18 | 5.4,3.9,1.3,.4,"Setosa" 19 | 5.1,3.5,1.4,.3,"Setosa" 20 | 5.7,3.8,1.7,.3,"Setosa" 21 | 5.1,3.8,1.5,.3,"Setosa" 22 | 5.4,3.4,1.7,.2,"Setosa" 23 | 5.1,3.7,1.5,.4,"Setosa" 24 | 4.6,3.6,1,.2,"Setosa" 25 | 5.1,3.3,1.7,.5,"Setosa" 26 | 
4.8,3.4,1.9,.2,"Setosa" 27 | 5,3,1.6,.2,"Setosa" 28 | 5,3.4,1.6,.4,"Setosa" 29 | 5.2,3.5,1.5,.2,"Setosa" 30 | 5.2,3.4,1.4,.2,"Setosa" 31 | 4.7,3.2,1.6,.2,"Setosa" 32 | 4.8,3.1,1.6,.2,"Setosa" 33 | 5.4,3.4,1.5,.4,"Setosa" 34 | 5.2,4.1,1.5,.1,"Setosa" 35 | 5.5,4.2,1.4,.2,"Setosa" 36 | 4.9,3.1,1.5,.2,"Setosa" 37 | 5,3.2,1.2,.2,"Setosa" 38 | 5.5,3.5,1.3,.2,"Setosa" 39 | 4.9,3.6,1.4,.1,"Setosa" 40 | 4.4,3,1.3,.2,"Setosa" 41 | 5.1,3.4,1.5,.2,"Setosa" 42 | 5,3.5,1.3,.3,"Setosa" 43 | 4.5,2.3,1.3,.3,"Setosa" 44 | 4.4,3.2,1.3,.2,"Setosa" 45 | 5,3.5,1.6,.6,"Setosa" 46 | 5.1,3.8,1.9,.4,"Setosa" 47 | 4.8,3,1.4,.3,"Setosa" 48 | 5.1,3.8,1.6,.2,"Setosa" 49 | 4.6,3.2,1.4,.2,"Setosa" 50 | 5.3,3.7,1.5,.2,"Setosa" 51 | 5,3.3,1.4,.2,"Setosa" 52 | 7,3.2,4.7,1.4,"Versicolor" 53 | 6.4,3.2,4.5,1.5,"Versicolor" 54 | 6.9,3.1,4.9,1.5,"Versicolor" 55 | 5.5,2.3,4,1.3,"Versicolor" 56 | 6.5,2.8,4.6,1.5,"Versicolor" 57 | 5.7,2.8,4.5,1.3,"Versicolor" 58 | 6.3,3.3,4.7,1.6,"Versicolor" 59 | 4.9,2.4,3.3,1,"Versicolor" 60 | 6.6,2.9,4.6,1.3,"Versicolor" 61 | 5.2,2.7,3.9,1.4,"Versicolor" 62 | 5,2,3.5,1,"Versicolor" 63 | 5.9,3,4.2,1.5,"Versicolor" 64 | 6,2.2,4,1,"Versicolor" 65 | 6.1,2.9,4.7,1.4,"Versicolor" 66 | 5.6,2.9,3.6,1.3,"Versicolor" 67 | 6.7,3.1,4.4,1.4,"Versicolor" 68 | 5.6,3,4.5,1.5,"Versicolor" 69 | 5.8,2.7,4.1,1,"Versicolor" 70 | 6.2,2.2,4.5,1.5,"Versicolor" 71 | 5.6,2.5,3.9,1.1,"Versicolor" 72 | 5.9,3.2,4.8,1.8,"Versicolor" 73 | 6.1,2.8,4,1.3,"Versicolor" 74 | 6.3,2.5,4.9,1.5,"Versicolor" 75 | 6.1,2.8,4.7,1.2,"Versicolor" 76 | 6.4,2.9,4.3,1.3,"Versicolor" 77 | 6.6,3,4.4,1.4,"Versicolor" 78 | 6.8,2.8,4.8,1.4,"Versicolor" 79 | 6.7,3,5,1.7,"Versicolor" 80 | 6,2.9,4.5,1.5,"Versicolor" 81 | 5.7,2.6,3.5,1,"Versicolor" 82 | 5.5,2.4,3.8,1.1,"Versicolor" 83 | 5.5,2.4,3.7,1,"Versicolor" 84 | 5.8,2.7,3.9,1.2,"Versicolor" 85 | 6,2.7,5.1,1.6,"Versicolor" 86 | 5.4,3,4.5,1.5,"Versicolor" 87 | 6,3.4,4.5,1.6,"Versicolor" 88 | 6.7,3.1,4.7,1.5,"Versicolor" 89 | 6.3,2.3,4.4,1.3,"Versicolor" 90 | 
5.6,3,4.1,1.3,"Versicolor" 91 | 5.5,2.5,4,1.3,"Versicolor" 92 | 5.5,2.6,4.4,1.2,"Versicolor" 93 | 6.1,3,4.6,1.4,"Versicolor" 94 | 5.8,2.6,4,1.2,"Versicolor" 95 | 5,2.3,3.3,1,"Versicolor" 96 | 5.6,2.7,4.2,1.3,"Versicolor" 97 | 5.7,3,4.2,1.2,"Versicolor" 98 | 5.7,2.9,4.2,1.3,"Versicolor" 99 | 6.2,2.9,4.3,1.3,"Versicolor" 100 | 5.1,2.5,3,1.1,"Versicolor" 101 | 5.7,2.8,4.1,1.3,"Versicolor" 102 | 6.3,3.3,6,2.5,"Virginica" 103 | 5.8,2.7,5.1,1.9,"Virginica" 104 | 7.1,3,5.9,2.1,"Virginica" 105 | 6.3,2.9,5.6,1.8,"Virginica" 106 | 6.5,3,5.8,2.2,"Virginica" 107 | 7.6,3,6.6,2.1,"Virginica" 108 | 4.9,2.5,4.5,1.7,"Virginica" 109 | 7.3,2.9,6.3,1.8,"Virginica" 110 | 6.7,2.5,5.8,1.8,"Virginica" 111 | 7.2,3.6,6.1,2.5,"Virginica" 112 | 6.5,3.2,5.1,2,"Virginica" 113 | 6.4,2.7,5.3,1.9,"Virginica" 114 | 6.8,3,5.5,2.1,"Virginica" 115 | 5.7,2.5,5,2,"Virginica" 116 | 5.8,2.8,5.1,2.4,"Virginica" 117 | 6.4,3.2,5.3,2.3,"Virginica" 118 | 6.5,3,5.5,1.8,"Virginica" 119 | 7.7,3.8,6.7,2.2,"Virginica" 120 | 7.7,2.6,6.9,2.3,"Virginica" 121 | 6,2.2,5,1.5,"Virginica" 122 | 6.9,3.2,5.7,2.3,"Virginica" 123 | 5.6,2.8,4.9,2,"Virginica" 124 | 7.7,2.8,6.7,2,"Virginica" 125 | 6.3,2.7,4.9,1.8,"Virginica" 126 | 6.7,3.3,5.7,2.1,"Virginica" 127 | 7.2,3.2,6,1.8,"Virginica" 128 | 6.2,2.8,4.8,1.8,"Virginica" 129 | 6.1,3,4.9,1.8,"Virginica" 130 | 6.4,2.8,5.6,2.1,"Virginica" 131 | 7.2,3,5.8,1.6,"Virginica" 132 | 7.4,2.8,6.1,1.9,"Virginica" 133 | 7.9,3.8,6.4,2,"Virginica" 134 | 6.4,2.8,5.6,2.2,"Virginica" 135 | 6.3,2.8,5.1,1.5,"Virginica" 136 | 6.1,2.6,5.6,1.4,"Virginica" 137 | 7.7,3,6.1,2.3,"Virginica" 138 | 6.3,3.4,5.6,2.4,"Virginica" 139 | 6.4,3.1,5.5,1.8,"Virginica" 140 | 6,3,4.8,1.8,"Virginica" 141 | 6.9,3.1,5.4,2.1,"Virginica" 142 | 6.7,3.1,5.6,2.4,"Virginica" 143 | 6.9,3.1,5.1,2.3,"Virginica" 144 | 5.8,2.7,5.1,1.9,"Virginica" 145 | 6.8,3.2,5.9,2.3,"Virginica" 146 | 6.7,3.3,5.7,2.5,"Virginica" 147 | 6.7,3,5.2,2.3,"Virginica" 148 | 6.3,2.5,5,1.9,"Virginica" 149 | 6.5,3,5.2,2,"Virginica" 150 | 
6.2,3.4,5.4,2.3,"Virginica" 151 | 5.9,3,5.1,1.8,"Virginica" -------------------------------------------------------------------------------- /week9/simple_class.py: -------------------------------------------------------------------------------- 1 | class Queue: 2 | def __init__(self): 3 | self.items = [] 4 | 5 | def add_item(self, item): 6 | self.items.append(item) 7 | 8 | def first(self): 9 | return self.items[0] 10 | 11 | def last(self): 12 | return self.items[-1] 13 | 14 | def length(self): 15 | return len(self.items) 16 | 17 | -------------------------------------------------------------------------------- /week9/tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/zzsza/kyle-school/2b61aa015c97eae2eacdffab48f3aead9a412b3e/week9/tests/__init__.py -------------------------------------------------------------------------------- /week9/tests/test_calc_class.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | from calc_class import Calculator 3 | 4 | # Constants 5 | NUMBER_1 = 3.0 6 | NUMBER_2 = 2.0 7 | 8 | # Fixtures 9 | @pytest.fixture 10 | def calculator(): 11 | return Calculator() 12 | 13 | def verify_answer(expected, answer, last_answer): 14 | assert expected == answer 15 | assert expected == last_answer 16 | 17 | 18 | # ====== Test Cases Start ====== 19 | def test_last_answer_init(calculator): 20 | # TODO : Test Code 21 | 22 | 23 | def test_add(calculator): 24 | # TODO: Test using NUMBER_1 and NUMBER_2 25 | 26 | 27 | def test_subtract(calculator): 28 | # TODO: Test using NUMBER_1 and NUMBER_2 29 | 30 | 31 | def test_subtract_negative(calculator): 32 | # TODO: Test using NUMBER_1 and NUMBER_2 33 | 34 | 35 | def test_multiply(calculator): 36 | # TODO: Test using NUMBER_1 and NUMBER_2 37 | 38 | 39 | def test_divide(calculator): 40 | # TODO: Test using NUMBER_1 and NUMBER_2 41 | 42 | 43 | def test_divide_by_zero(calculator): 44 | # TODO : 
Test that ZeroDivisionError is raised 45 | 46 | 47 | @pytest.mark.parametrize("a,b,expected", [ 48 | (NUMBER_1, NUMBER_2, NUMBER_1), 49 | (NUMBER_2, NUMBER_1, NUMBER_1), 50 | (NUMBER_1, NUMBER_1, NUMBER_1), 51 | ]) 52 | def test_maximum(calculator, a, b, expected): 53 | # TODO : Inject the parameters using parametrize 54 | 55 | 56 | @pytest.mark.parametrize("a,b,expected", [ 57 | (NUMBER_1, NUMBER_2, NUMBER_2), 58 | (NUMBER_2, NUMBER_1, NUMBER_2), 59 | (NUMBER_2, NUMBER_2, NUMBER_2), 60 | ]) 61 | def test_minimum(calculator, a, b, expected): 62 | # TODO : Inject the parameters using parametrize 63 | 64 | 65 | 66 | 67 | 68 | 69 | -------------------------------------------------------------------------------- /week9/tests/test_simple_class.py: -------------------------------------------------------------------------------- 1 | from simple_class import Queue 2 | 3 | def test_firstlast(): 4 | q = Queue() 5 | 6 | q.add_item(5) 7 | q.add_item(17) 8 | q.add_item("hello") 9 | 10 | assert q.first() == 5 11 | assert q.last() == "hello" 12 | 13 | def test_len(): 14 | q = Queue() 15 | 16 | assert q.length() == 0 17 | 18 | q.add_item(1) 19 | 20 | assert q.length() == 1 21 | 22 | for i in range(10): 23 | q.add_item(i) 24 | 25 | assert q.length() == 11 26 | 27 | -------------------------------------------------------------------------------- /week9/tests/test_utils.py: -------------------------------------------------------------------------------- 1 | # Overwrite test_utils.py with the contents below (without the -a option!) 
2 | import pytest 3 | import pandas as pd 4 | import datetime 5 | from utils import is_working_day, load_data 6 | 7 | def test_is_working_day(): 8 | assert is_working_day(datetime.date(2020,7,5)) == False 9 | assert is_working_day(datetime.date(2020,7,4)) == False 10 | assert is_working_day(datetime.date(2020,7,6)) == True 11 | 12 | 13 | @pytest.fixture(scope="session") 14 | def result_fixture(): 15 | result = load_data() 16 | return result 17 | 18 | 19 | def test_len(result_fixture): 20 | assert len(result_fixture) == 150 21 | 22 | 23 | def test_object_type(result_fixture): 24 | assert isinstance(result_fixture, pd.DataFrame) 25 | 26 | 27 | 28 | -------------------------------------------------------------------------------- /week9/tests/test_your_module.py: -------------------------------------------------------------------------------- 1 | from your_module import * 2 | 3 | def test_multiply_by_two(): 4 | assert multiply_by_two(2) == 4 5 | assert multiply_by_two(3.6) == 7.2 6 | 7 | -------------------------------------------------------------------------------- /week9/utils.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | import datetime 3 | 4 | def is_working_day(date: datetime.date): 5 | """ 6 | Takes a date and checks whether it is a working day. 7 | Holidays are not considered: Saturday and Sunday are not working days, Monday through Friday are. 8 | """ 9 | weekday = date.weekday() 10 | if weekday in {5, 6}: 11 | return False 12 | else: 13 | return True 14 | # Running this saves the file to utils.py 15 | 16 | def load_data(): 17 | df = pd.read_csv("iris.csv") 18 | return df 19 | 20 | 21 | -------------------------------------------------------------------------------- /week9/your_module.py: -------------------------------------------------------------------------------- 1 | 2 | def multiply_by_two(x): 3 | return x * 2 4 | --------------------------------------------------------------------------------
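The TODO stubs in /week9/tests/test_calc_class.py leave the test bodies open. One possible way to fill in the divide, divide-by-zero, and maximum cases is sketched below. This is not the repository's solution: the `check_*` function names are hypothetical, the calculator code is a condensed stand-in for calc_func.py and calc_class.py repeated only so the sketch runs on its own, and plain `assert` plus `try`/`except` is used so that nothing beyond the standard library is needed.

```python
# Condensed stand-ins for week9/calc_func.py and week9/calc_class.py,
# repeated here only so that this sketch is self-contained.
def divide(a, b):
    return a * 1.0 / b  # raises ZeroDivisionError when b == 0


def maximum(a, b):
    return a if a >= b else b


class Calculator:
    def __init__(self):
        self._last_answer = 0.0

    @property
    def last_answer(self):
        return self._last_answer

    def _do_math(self, a, b, func):
        self._last_answer = func(a, b)
        return self.last_answer

    def divide(self, a, b):
        return self._do_math(a, b, divide)

    def maximum(self, a, b):
        return self._do_math(a, b, maximum)


NUMBER_1 = 3.0
NUMBER_2 = 2.0


def verify_answer(expected, answer, last_answer):
    # Same helper as in test_calc_class.py: the returned value and the
    # stored last_answer must both match the expectation.
    assert expected == answer
    assert expected == last_answer


def check_divide():
    # Hypothetical body for test_divide.
    calculator = Calculator()
    answer = calculator.divide(NUMBER_1, NUMBER_2)
    verify_answer(NUMBER_1 / NUMBER_2, answer, calculator.last_answer)


def check_divide_by_zero():
    # Hypothetical body for test_divide_by_zero: returns True only if
    # dividing by zero raises ZeroDivisionError.
    calculator = Calculator()
    try:
        calculator.divide(NUMBER_1, 0)
    except ZeroDivisionError:
        return True
    return False


def check_maximum():
    # The same (a, b, expected) triples that the parametrize decorator lists.
    cases = [
        (NUMBER_1, NUMBER_2, NUMBER_1),
        (NUMBER_2, NUMBER_1, NUMBER_1),
        (NUMBER_1, NUMBER_1, NUMBER_1),
    ]
    for a, b, expected in cases:
        calculator = Calculator()
        answer = calculator.maximum(a, b)
        verify_answer(expected, answer, calculator.last_answer)
```

Inside the actual pytest file, the `calculator` fixture would replace the explicit `Calculator()` calls, `with pytest.raises(ZeroDivisionError):` would replace the `try`/`except`, and the loop in `check_maximum` would disappear because `@pytest.mark.parametrize` already supplies `a`, `b`, and `expected` one case at a time.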