├── Proyecto final ├── Dockerfile ├── Makefile ├── README.md ├── airflow.cfg ├── credentials.env ├── dag-screenshot.png ├── dags │ ├── config.json │ └── final-project-dag.py ├── db-init-scripts │ └── create_table.sql ├── docker-compose.yml └── webserver_config.py ├── README.md ├── Semana 1 ├── Ejercicios_Python_DataEngineer-main │ ├── Ejercicio1_Python.py │ ├── Ejercicio2_Python.py │ ├── Ejercicio3_Python.py │ ├── Ejercicio4_Python.py │ ├── Ejercicio5_Python.py │ └── Ejercicio6_Python.py └── Ejercicios_SQL_DataEngineer-main │ ├── Ejercicio1_SQL.sql │ ├── Ejercicio2_SQL.sql │ ├── Ejercicio3_SQL.sql │ ├── Ejercicio4_SQL.sql │ ├── Ejercicio5_SQL.sql │ ├── Ejercicio6_SQL.sql │ ├── Ejercicio7_SQL.sql │ ├── agents.csv │ ├── calls.csv │ └── customers.csv ├── Semana 10 ├── Data_pipeline_sencillo │ ├── Mensaje_exito_cargar_data_parte2.txt │ ├── Mensaje_exito_transformar_data_parte1.txt │ ├── booking.csv │ ├── client.csv │ ├── docker-compose.yaml │ ├── file │ └── hotel.csv ├── Ejemplo en vivo ETL Bitcoin.rar ├── Mensaje_exito_tarea1.txt ├── Mensaje_exito_tarea2.txt ├── Video 2_ Lanzando Airflow con Docker │ └── docker-compose.yaml ├── Video 3_ Tasks y Operators │ └── primer_dag_v3.py ├── Video 4_ DAGs │ ├── attrib │ ├── primer_dag.py │ ├── primer_dag_v2.py │ ├── primer_dag_v3.py │ └── primer_tag_python_operator.py ├── Video 6_ Context │ ├── attrib │ └── ejemplo_template.py ├── dag_con_backfilling.py ├── dag_con_catchup.py ├── dag_pipeline_sencillo.py ├── dag_postgres_database.py ├── docker-compose.yaml └── primer_dag_v4.py ├── Semana 11 ├── Actividad_XCOMS.py ├── Microdesafio_Semana11.py ├── Video 1_ Parametros DAG │ ├── attrib │ └── ejemplo_params.py ├── Video 2_ Sensors │ ├── dag_sensors.py │ └── dag_sensors2.py ├── Video 3_ XCOMS │ ├── dag_con_xcom.py │ └── dag_con_xcom2.py ├── Video 4_ TaskGroups y Depencias_SubDAGs │ └── subdags.py ├── Video 5_ Airflow.cfg │ ├── airflow.cfg │ └── attrib ├── dag_smtp_email.py └── docker-compose.yaml ├── Semana 2 ├── Ejemplo_en_vivo.sql ├── agents.csv ├── calls.csv └── customers.csv ├── Semana 3 ├── Creacion_tablas_microdesafio.sql ├── Ejemplo_en_vivo_MongoDB.sh ├── Ejemplo_en_vivo_SQLServer.sql ├── Insumo_para_crear_procedimiento_microdesafio.sql └── Microdesafio.sql ├── Semana 4 ├── 1Forma_normal_solucion.sql ├── 2Forma_normal_solucion.sql ├── 3Forma_normal_solucion.sql ├── Condiciones_Microdesafio_ETL_desastres.sql ├── ETL_Desastres_Microdesafio.sql ├── data_base.png ├── data_base_1NF.png ├── data_base_2NF.png ├── data_base_3NF.png └── main ├── Semana 5 ├── Consulta_Actividad_Semana5.sql ├── Consulta_Actividad_Semana5_Lenta.sql ├── Extraccion_datos_API_Semana5_Ejemplo_en_vivo.ipynb ├── Query_Lenta_Larga.sql ├── Query_Optimizada.sql ├── agents.csv ├── calls.csv └── customers.csv ├── Semana 6 ├── Actividad_Semana6.ipynb ├── Lectura de archivos JSON y SQL.ipynb ├── Lectura_APIs.py ├── Lectura_csv.py ├── Lectura_github.py ├── Lectura_txt.py ├── Lectura_xlsx.py ├── Microdesafio_Semana6.ipynb ├── Notebook_Integridad_DFS_DBU.ipynb ├── defaultoutput.xlsx ├── nba_salary.sqlite ├── nested_json.json ├── pokemon_data.txt └── winequality-red.csv ├── Semana 7 ├── MicroDesafio_Semana7.ipynb ├── Semana7_Psyco2pg_DE_Ejemplo_en_vivo2.ipynb ├── Semana7_SqlAlchemy_DE_Actividad_Colaborativa.ipynb ├── Semana7_SqlAlchemy_DE_Ejemplo_en_vivo1.ipynb └── Tablas │ ├── countryregioncurrency.csv │ ├── currencyrate.csv │ ├── product.csv │ ├── productcategory.csv │ ├── productdescription.csv │ ├── productmodelproductdescriptionculture.csv │ ├── productreview.csv │ ├── productsubcategory.csv 
│ ├── salesorderdetail.csv │ ├── salesorderheader.csv │ ├── salesperson.csv │ └── salesterritory.csv ├── Semana 8 ├── Datos_Microdesafio_Semana8_DE.csv ├── Ejemplo_anonimizacion_columnas_DF.ipynb ├── Ejemplo_en_vivo_Visualizacion_permisos_Redshift.sql ├── Ejemplo_en_vivo_Visualizacion_permisos_completos_Redshift.sql ├── Microdesafio_Semana8_DE.ipynb ├── Seguridad_basica_Redshift.sql ├── Seguridad_columnas_Redshift.sql └── airbnb_nyc.rar └── Semana 9 ├── EJEMPLO1_DOCKER.rar ├── EJEMPLO2_DOCKER.rar └── web-page.rar /Proyecto final/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM apache/airflow:2.3.3 2 | 3 | ADD webserver_config.py /opt/airflow/webserver_config.py 4 | 5 | USER root 6 | RUN apt-get update \ 7 | && apt-get install -y --no-install-recommends \ 8 | vim \ 9 | && apt-get autoremove -yqq --purge \ 10 | && apt-get clean \ 11 | && rm -rf /var/lib/apt/lists/* 12 | USER airflow 13 | 14 | RUN pip install yfinance 15 | RUN pip install psycopg2-binary 16 | RUN pip install sendgrid 17 | -------------------------------------------------------------------------------- /Proyecto final/Makefile: -------------------------------------------------------------------------------- 1 | ## ---------------------------------------------------------------------- 2 | ## Welcome to the Coderhouse's Data Engineering example project 3 | ## ---------------------------------------------------------------------- 4 | 5 | help: ## show this help. 6 | @sed -ne '/@sed/!s/## //p' $(MAKEFILE_LIST) 7 | 8 | build: ## build the solution 9 | echo "Building Airflow locally using the LocalExecutor" 10 | docker-compose -f docker-compose.yml build --progress=plain --no-cache 11 | 12 | run: ## run the solution 13 | echo "Running Airflow locally using the LocalExecutor" 14 | docker-compose -f docker-compose.yml up -d 15 | 16 | stop: ## stop running every container 17 | echo "Stopping all containers" 18 | docker-compose -f docker-compose.yml down -v --remove-orphans 19 | 20 | enter-warehouse: ## enter the postgres DB with SQL 21 | docker exec -it coderhouse-final-project-demo_data_warehouse_1 psql -U airflow_dw --dbname dw 22 | 23 | get-admin-password: ## get the admin's password 24 | docker exec -it coderhouse-final-project-demo_webserver_1 cat standalone_admin_password.txt 25 | 26 | bash: ## enter the airflow container with bash 27 | docker exec -it coderhouse-final-project-demo_webserver_1 bash 28 | -------------------------------------------------------------------------------- /Proyecto final/README.md: -------------------------------------------------------------------------------- 1 | # Coderhouse's Data Engineering Example Final Project 2 | ### Developed by Axel Furlan 3 | 4 | ## Requirements 5 | - Have Docker 6 | - If you want email alerts to work, create a [SendGrid](https://sendgrid.com) account. Don't worry, it's free! 7 | 8 | ## Description 9 | This code gives you all the tools to run the specific DAG called `get_stocks_data_and_alert`. 10 | 11 | What this DAG does is: 12 | 13 | 1. Pulls data from the [Yahoo Finance Python SDK](https://pypi.org/project/yfinance/) 14 | 2. Saves that data into a Postgres Database (simulating a Data Warehouse). 15 | 3. Checks against a config file (`config.json`) if the values of the stocks surpass the limits. If they do, it sends an email to whatever address you have configured alerting about the anomaly. 
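For a sense of what step 1 boils down to, here is a minimal sketch (assuming only that the `yfinance` package installed by the Dockerfile is available; the real task in `dags/final-project-dag.py` loops over the tickers listed in `config.json` and collects every result into a single DataFrame before writing a CSV):

```python
import yfinance as yf

# Fetch the latest daily bar (Open/High/Low/Close/Volume plus dividends and splits) for one ticker.
ticker = yf.Ticker("MSFT")
df = ticker.history(period="1d").reset_index()
df["Stock"] = "MSFT"

print(df[["Date", "Open", "High", "Low", "Close", "Volume"]])
```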
16 | 17 | 18 | ## Configuring your credentials 19 | On the `credentials.env` file, input your `SENDGRID_API_KEY` that you got from your SendGrid account. To get one, go [here](https://app.sendgrid.com/settings/api_keys). 20 | 21 | Set `EMAIL_FROM` to the email you configured as the sender on SendGrid. Set `EMAIL_TO` to whatever address you want the emails to go to (I suggest the same as `EMAIL_FROM` to avoid getting flagged). 22 | 23 | Remember: emails may go to **SPAM**. Check that folder. 24 | 25 | 26 | ## Usage 27 | It's easy, just do: 28 | 29 | 1. `make build` 30 | 2. `make run` 31 | 3. `make get-admin-password` to get the password. 32 | 4. Enter `localhost:8080` in whatever browser you want. 33 | 5. Input `admin` as the user and the password you got in step 3 (without the `%` char). 34 | 6. Once inside, activate the DAG, wait for it to turn dark green and voila! The pipeline ran. 35 | 7. To kill everything, you can `make stop` 36 | 37 | 38 | ![DAG Screenshot](dag-screenshot.png "DAG Screenshot") 39 | 40 | 41 | ## HELP! 42 | Run `make help`. 43 | -------------------------------------------------------------------------------- /Proyecto final/credentials.env: -------------------------------------------------------------------------------- 1 | WAREHOUSE_HOST=data_warehouse 2 | WAREHOUSE_DBNAME=dw 3 | WAREHOUSE_USER=airflow_dw 4 | WAREHOUSE_PASSWORD=airflow_dw 5 | SENDGRID_API_KEY= 6 | EMAIL_TO= 7 | EMAIL_FROM= 8 | -------------------------------------------------------------------------------- /Proyecto final/dag-screenshot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Proyecto final/dag-screenshot.png -------------------------------------------------------------------------------- /Proyecto final/dags/config.json: -------------------------------------------------------------------------------- 1 | { 2 | "stocks": ["MSFT", "AAPL", "QQQ", "XLE"], 3 | "thresholds": { 4 | "MSFT": {"min": 240, "max": 290}, 5 | "AAPL": {"min": 240, "max": 290}, 6 | "QQQ": {"min": 240, "max": 290}, 7 | "XLE": {"min": 240, "max": 290} 8 | } 9 | } 10 | -------------------------------------------------------------------------------- /Proyecto final/dags/final-project-dag.py: -------------------------------------------------------------------------------- 1 | """ 2 | Run first: 3 | create table stocks_data (Open float, 4 | High float, 5 | Low float, 6 | Close float, 7 | Volume float, 8 | Dividends float, 9 | Stock_splits float, 10 | Stock varchar, 11 | Date timestamp, 12 | fifty_two_week_low float, 13 | fifty_two_week_high float 14 | ); 15 | """ 16 | 17 | import pandas as pd 18 | import os 19 | import json 20 | import logging 21 | import psycopg2 22 | 23 | import yfinance as yf 24 | 25 | from airflow import DAG 26 | from airflow.macros import ds_add 27 | from airflow.operators.python_operator import PythonOperator 28 | from datetime import datetime, timedelta 29 | from sqlalchemy import create_engine 30 | from sendgrid import SendGridAPIClient 31 | from sendgrid.helpers.mail import Mail 32 | 33 | default_args = { 34 | 'owner': 'axelfurlan', 35 | 'depends_on_past': False, 36 | 'start_date': datetime(2022, 7, 26), 37 | 'retries': 1, 38 | 'retry_delay': timedelta(minutes=5), 39 | } 40 | 41 | WAREHOUSE_HOST = os.environ.get('WAREHOUSE_HOST') 42 | WAREHOUSE_DBNAME = os.environ.get('WAREHOUSE_DBNAME') 43 | WAREHOUSE_USER = os.environ.get('WAREHOUSE_USER') 44 |
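# Note: the WAREHOUSE_* connection settings (and the SendGrid / e-mail variables used further below) are read from
# plain environment variables; they are defined in credentials.env and injected into the Airflow container through
# the env_file entry of docker-compose.yml.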
WAREHOUSE_PASSWORD = os.environ.get('WAREHOUSE_PASSWORD') 45 | 46 | CONN_STRING = f"host='{WAREHOUSE_HOST}' dbname='{WAREHOUSE_DBNAME}' user='{WAREHOUSE_USER}' password='{WAREHOUSE_PASSWORD}'" 47 | 48 | 49 | def retrieve_data_from_api(**context): 50 | df_final = pd.DataFrame(columns=['Open', 'High', 'Low', 'Close', 'Volume', 'Dividends', 'Stock_splits']) 51 | with open('dags/config.json', 'r') as json_config: 52 | config = json.load(json_config) 53 | 54 | for stock in config['stocks']: 55 | logging.info(f"Retrieving info for {stock}") 56 | stock_data = yf.Ticker(stock) 57 | df = stock_data.history(period="1d") 58 | df = df.reset_index(level=0) 59 | df['Stock'] = stock.upper() 60 | df.rename(columns={"Stock Splits": "Stock_splits"}, inplace=True) 61 | df_final = df_final.append(df) 62 | 63 | csv_filename = f"{context['ds']}_stocks_data.csv" 64 | df_final.columns = df_final.columns.str.lower() 65 | df_final.to_csv(csv_filename, index=False) 66 | 67 | return csv_filename 68 | 69 | 70 | def save_data_to_dw(**context): 71 | csv_filename = context['ti'].xcom_pull(task_ids='get_data') 72 | df = pd.read_csv(csv_filename) 73 | 74 | conn = psycopg2.connect(CONN_STRING) 75 | with conn: 76 | with conn.cursor() as cur: 77 | cur.execute(f"DELETE FROM stocks_data WHERE date = '{ds_add(context['ds'], 1)}';") 78 | conn.close() 79 | 80 | engine = create_engine(f'postgresql://{WAREHOUSE_USER}:{WAREHOUSE_PASSWORD}@{WAREHOUSE_HOST}:5432/{WAREHOUSE_DBNAME}') 81 | df.to_sql('stocks_data', engine, if_exists="append", index=False) 82 | 83 | 84 | def verify_threshold_and_verify(**context): 85 | csv_filename = context['ti'].xcom_pull(task_ids='get_data') 86 | df = pd.read_csv(csv_filename) 87 | 88 | with open('dags/config.json', 'r') as json_config: 89 | config = json.load(json_config) 90 | 91 | for stock in config["stocks"]: 92 | min_t = config["thresholds"][stock].get("min") 93 | max_t = config["thresholds"][stock].get("max") 94 | 95 | current_date = ds_add(context['ds'], 1) 96 | close_value = df.loc[(df.stock == stock) & (df.date == current_date), 'close'].values[0] 97 | print(f"Close value for stock {stock} is {close_value}. thresholds are: between {min_t} and {max_t}") 98 | if min_t > close_value or max_t < close_value: 99 | 100 | if min_t > close_value: 101 | subject = f"Stock {stock} is under the threshold" 102 | else: 103 | subject = f"Stock {stock} is over the threshold" 104 | 105 | body = f""" 106 | Close value for stock {stock} is {close_value}. 
107 | Thresholds values are: between {min_t} and {max_t} 108 | """ 109 | 110 | message = Mail( 111 | from_email=os.environ['EMAIL_FROM'], 112 | to_emails=os.environ['EMAIL_TO'], 113 | subject=subject, 114 | html_content=body) 115 | 116 | sg = SendGridAPIClient(os.environ.get('SENDGRID_API_KEY')) 117 | response = sg.send(message) 118 | print(response.status_code) 119 | 120 | 121 | def update_aggregations(**context): 122 | conn = psycopg2.connect(CONN_STRING) 123 | query = """ 124 | UPDATE stocks_data s 125 | SET fifty_two_week_low=sub.fifty_two_week_low, 126 | fifty_two_week_high=sub.fifty_two_week_high 127 | FROM ( 128 | SELECT 129 | stock 130 | , date 131 | , min(Low) over (partition by Stock order by date rows between 364 preceding and current row) as fifty_two_week_low 132 | , max(High) over (partition by Stock order by date rows between 364 preceding and current row) as fifty_two_week_high 133 | FROM stocks_data 134 | ) sub 135 | WHERE 136 | s.stock = sub.stock 137 | AND s.date = sub.date 138 | """ 139 | with conn: 140 | with conn.cursor() as cur: 141 | cur.execute(query) 142 | conn.close() 143 | 144 | 145 | with DAG('get_stocks_data_and_alert', 146 | description='DAG that retrieves data from API and saves it into a table in a Data Warehouse', 147 | schedule_interval='0 12 * * *', 148 | catchup=False, 149 | default_args=default_args) as dag: 150 | 151 | get_data = PythonOperator(task_id='get_data', python_callable=retrieve_data_from_api, dag=dag, provide_context=True) 152 | 153 | save_data = PythonOperator(task_id='save_data', python_callable=save_data_to_dw, dag=dag, provide_context=True) 154 | 155 | set_aggregations = PythonOperator(task_id='set_aggregations', python_callable=update_aggregations, dag=dag, provide_context=True) 156 | 157 | send_email_if_anomaly = PythonOperator(task_id='send_email_if_anomaly', python_callable=verify_threshold_and_verify, dag=dag, provide_context=True) 158 | 159 | get_data >> save_data >> set_aggregations 160 | get_data >> send_email_if_anomaly 161 | -------------------------------------------------------------------------------- /Proyecto final/db-init-scripts/create_table.sql: -------------------------------------------------------------------------------- 1 | create table stocks_data ( 2 | Open float, 3 | High float, 4 | Low float, 5 | Close float, 6 | Volume float, 7 | Dividends float, 8 | Stock_splits float, 9 | Stock varchar, 10 | Date timestamp, 11 | fifty_two_week_low float, 12 | fifty_two_week_high float 13 | ); 14 | -------------------------------------------------------------------------------- /Proyecto final/docker-compose.yml: -------------------------------------------------------------------------------- 1 | version: '2.1' 2 | services: 3 | postgres: 4 | image: postgres:13 5 | environment: 6 | # THESE DEFAULTS WILL BE OVERWRITTEN IN PRD DEPLOY 7 | - POSTGRES_USER=airflow 8 | - POSTGRES_PASSWORD=airflow 9 | - POSTGRES_DB=airflow 10 | 11 | data_warehouse: 12 | image: postgres:13 13 | volumes: 14 | - ./db-init-scripts:/docker-entrypoint-initdb.d 15 | environment: 16 | # THESE DEFAULTS WILL BE OVERWRITTEN IN PRD DEPLOY 17 | - POSTGRES_USER=airflow_dw 18 | - POSTGRES_PASSWORD=airflow_dw 19 | - POSTGRES_DB=dw 20 | 21 | webserver: 22 | build: 23 | context: . 
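# The webserver image is built from the local Dockerfile, which extends apache/airflow:2.3.3 and installs
# yfinance, psycopg2-binary and sendgrid so the DAG's tasks can run inside the container.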
24 | dockerfile: Dockerfile 25 | restart: always 26 | depends_on: 27 | - postgres 28 | environment: 29 | - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho= 30 | - AIRFLOW__CORE__EXECUTOR=LocalExecutor 31 | - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres:5432/airflow 32 | # THESE DEFAULTS WILL BE OVERWRITTEN IN PRD DEPLOY 33 | # - POSTGRES_USER=airflow 34 | # - POSTGRES_PASSWORD=airflow 35 | # - POSTGRES_DB=airflow 36 | # - REDIS_PASSWORD=redispass 37 | env_file: 38 | - ./credentials.env 39 | volumes: 40 | - ./dags:/opt/airflow/dags 41 | - ./webserver_config.py:/opt/airflow/webserver_config.py 42 | - ${HOME}/.aws:/root/.aws # copy aws credentials from host to container 43 | ports: 44 | - "8080:8080" 45 | command: > 46 | bash -c "airflow standalone" 47 | healthcheck: 48 | test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"] 49 | interval: 30s 50 | timeout: 30s 51 | retries: 3 52 | -------------------------------------------------------------------------------- /Proyecto final/webserver_config.py: -------------------------------------------------------------------------------- 1 | # 2 | # Licensed to the Apache Software Foundation (ASF) under one 3 | # or more contributor license agreements. See the NOTICE file 4 | # distributed with this work for additional information 5 | # regarding copyright ownership. The ASF licenses this file 6 | # to you under the Apache License, Version 2.0 (the 7 | # "License"); you may not use this file except in compliance 8 | # with the License. You may obtain a copy of the License at 9 | # 10 | # http://www.apache.org/licenses/LICENSE-2.0 11 | # 12 | # Unless required by applicable law or agreed to in writing, 13 | # software distributed under the License is distributed on an 14 | # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 15 | # KIND, either express or implied. See the License for the 16 | # specific language governing permissions and limitations 17 | # under the License. 18 | """Default configuration for the Airflow webserver""" 19 | import os 20 | 21 | from airflow.www.fab_security.manager import AUTH_DB 22 | 23 | # from airflow.www.fab_security.manager import AUTH_LDAP 24 | # from airflow.www.fab_security.manager import AUTH_OAUTH 25 | # from airflow.www.fab_security.manager import AUTH_OID 26 | # from airflow.www.fab_security.manager import AUTH_REMOTE_USER 27 | 28 | 29 | basedir = os.path.abspath(os.path.dirname(__file__)) 30 | 31 | # Flask-WTF flag for CSRF 32 | WTF_CSRF_ENABLED = True 33 | 34 | # ---------------------------------------------------- 35 | # AUTHENTICATION CONFIG 36 | # ---------------------------------------------------- 37 | # For details on how to set up each of the following authentication, see 38 | # http://flask-appbuilder.readthedocs.io/en/latest/security.html# authentication-methods 39 | # for details. 
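# Note for this demo setup: AUTH_ROLE_PUBLIC is set to 'Admin' further below, so the web UI can be used without
# logging in (unauthenticated visitors are granted the Admin role). Handy locally, but not something to keep in a
# shared or production deployment.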
40 | 41 | # The authentication type 42 | # AUTH_OID : Is for OpenID 43 | # AUTH_DB : Is for database 44 | # AUTH_LDAP : Is for LDAP 45 | # AUTH_REMOTE_USER : Is for using REMOTE_USER from web server 46 | # AUTH_OAUTH : Is for OAuth 47 | AUTH_TYPE = AUTH_DB 48 | 49 | # Uncomment to setup Full admin role name 50 | AUTH_ROLE_ADMIN = 'Admin' 51 | 52 | # Uncomment and set to desired role to enable access without authentication 53 | AUTH_ROLE_PUBLIC = 'Admin' 54 | 55 | # Will allow user self registration 56 | # AUTH_USER_REGISTRATION = True 57 | 58 | # The recaptcha it's automatically enabled for user self registration is active and the keys are necessary 59 | # RECAPTCHA_PRIVATE_KEY = PRIVATE_KEY 60 | # RECAPTCHA_PUBLIC_KEY = PUBLIC_KEY 61 | 62 | # Config for Flask-Mail necessary for user self registration 63 | # MAIL_SERVER = 'smtp.gmail.com' 64 | # MAIL_USE_TLS = True 65 | # MAIL_USERNAME = 'yourappemail@gmail.com' 66 | # MAIL_PASSWORD = 'passwordformail' 67 | # MAIL_DEFAULT_SENDER = 'sender@gmail.com' 68 | 69 | # The default user self registration role 70 | # AUTH_USER_REGISTRATION_ROLE = "Public" 71 | 72 | # When using OAuth Auth, uncomment to setup provider(s) info 73 | # Google OAuth example: 74 | # OAUTH_PROVIDERS = [{ 75 | # 'name':'google', 76 | # 'token_key':'access_token', 77 | # 'icon':'fa-google', 78 | # 'remote_app': { 79 | # 'api_base_url':'https://www.googleapis.com/oauth2/v2/', 80 | # 'client_kwargs':{ 81 | # 'scope': 'email profile' 82 | # }, 83 | # 'access_token_url':'https://accounts.google.com/o/oauth2/token', 84 | # 'authorize_url':'https://accounts.google.com/o/oauth2/auth', 85 | # 'request_token_url': None, 86 | # 'client_id': GOOGLE_KEY, 87 | # 'client_secret': GOOGLE_SECRET_KEY, 88 | # } 89 | # }] 90 | 91 | # When using LDAP Auth, setup the ldap server 92 | # AUTH_LDAP_SERVER = "ldap://ldapserver.new" 93 | 94 | # When using OpenID Auth, uncomment to setup OpenID providers. 95 | # example for OpenID authentication 96 | # OPENID_PROVIDERS = [ 97 | # { 'name': 'Yahoo', 'url': 'https://me.yahoo.com' }, 98 | # { 'name': 'AOL', 'url': 'http://openid.aol.com/' }, 99 | # { 'name': 'Flickr', 'url': 'http://www.flickr.com/' }, 100 | # { 'name': 'MyOpenID', 'url': 'https://www.myopenid.com' }] 101 | 102 | # ---------------------------------------------------- 103 | # Theme CONFIG 104 | # ---------------------------------------------------- 105 | # Flask App Builder comes up with a number of predefined themes 106 | # that you can use for Apache Airflow. 107 | # http://flask-appbuilder.readthedocs.io/en/latest/customizing.html#changing-themes 108 | # Please make sure to remove "navbar_color" configuration from airflow.cfg 109 | # in order to fully utilize the theme. 
(or use that property in conjunction with theme) 110 | # APP_THEME = "bootstrap-theme.css" # default bootstrap 111 | # APP_THEME = "amelia.css" 112 | # APP_THEME = "cerulean.css" 113 | # APP_THEME = "cosmo.css" 114 | # APP_THEME = "cyborg.css" 115 | # APP_THEME = "darkly.css" 116 | # APP_THEME = "flatly.css" 117 | # APP_THEME = "journal.css" 118 | # APP_THEME = "lumen.css" 119 | # APP_THEME = "paper.css" 120 | # APP_THEME = "readable.css" 121 | # APP_THEME = "sandstone.css" 122 | # APP_THEME = "simplex.css" 123 | # APP_THEME = "slate.css" 124 | # APP_THEME = "solar.css" 125 | # APP_THEME = "spacelab.css" 126 | # APP_THEME = "superhero.css" 127 | # APP_THEME = "united.css" 128 | # APP_THEME = "yeti.css" 129 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Data-Engineering 2 | Repositorio para curso Data Engineering, modalidad Flex / 3 | © Coderhouse 2022 Todos los derechos reservados / 4 | Repositorio exclusivo para usos didácticos de Coderhouse, se prohíbe expresamente toda forma de reproducción tanto en forma digital como física / 5 | Amparado por la ley internacional de derechos de autor. 6 | -------------------------------------------------------------------------------- /Semana 1/Ejercicios_Python_DataEngineer-main/Ejercicio1_Python.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Escribí un programa que lea un número impar por teclado. Si el usuario no 3 | introduce un número impar, debe repetirse el proceso hasta que lo introduzca 4 | correctamente. 5 | ''' 6 | while True: 7 | if int(input('Introduce un numero impar:'))% 2==0: 8 | print('Incorrecto introduce un numero impar') 9 | else: 10 | print('Ciclo finalizado') 11 | break 12 | -------------------------------------------------------------------------------- /Semana 1/Ejercicios_Python_DataEngineer-main/Ejercicio2_Python.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Escribí un programa que pida al usuario cuantos números quiere introducir. 
Luego lee 3 | todos los números y realiza una media aritmética: 4 | ''' 5 | cantidad=int(input('Introduce una cantidad de numeros para sacar la media:')) 6 | lista=[] 7 | for i in range(cantidad): 8 | a=float(input('Introduce el numero {}:'.format(i+1))) 9 | lista.append(a) 10 | print('La media de los numeros es:', sum(lista)/len(lista)) -------------------------------------------------------------------------------- /Semana 1/Ejercicios_Python_DataEngineer-main/Ejercicio3_Python.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Utilizando la función range() y la conversión a listas genera las siguientes listas 3 | dinámicamente: 4 | 5 | ● Todos los números del 0 al 10 [0, 1, 2, ..., 10] 6 | ● Todos los números del -10 al 0 [-10, -9, -8, ..., 0] 7 | ● Todos los números pares del 0 al 20 [0, 2, 4, ..., 20] 8 | ● Todos los números impares entre -20 y 0 [-19, -17, -15, ..., -1] 9 | ● Todos los números múltiples de 5 del 0 al 50 [0, 5, 10, ..., 50] 10 | ''' 11 | print(list(range(0,10+1,1))) 12 | print([x for x in range(-10, 1)]) 13 | print([x for x in range(0,20+1,1) if x%2==0]) 14 | print([x for x in range(-19, 0, 2)]) 15 | print([x for x in range(0,50+1,1) if x%5==0]) 16 | 17 | 18 | -------------------------------------------------------------------------------- /Semana 1/Ejercicios_Python_DataEngineer-main/Ejercicio4_Python.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Dadas dos listas, debes generar una tercera con todos los elementos que se 3 | repitan en ellas, pero no debe repetirse ningún elemento en la nueva lista: 4 | ''' 5 | lista_1 = ["h",'o','l','a',' ', 'm','u','n','d','o'] 6 | lista_2 = ["h",'o','l','a',' ', 'l','u','n','a'] 7 | nueva_lista = [] 8 | for element in lista_2: 9 | if element in lista_1: 10 | nueva_lista.append(element) 11 | print([*set(nueva_lista)]) -------------------------------------------------------------------------------- /Semana 1/Ejercicios_Python_DataEngineer-main/Ejercicio5_Python.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Escribí un programa que sume todos los números enteros impares desde el 0 hasta 3 | el 100: 4 | ''' 5 | lista_v=[] 6 | for i in range(1,100+1,1): 7 | if i %2 ==0: 8 | lista_v.append(0) 9 | else: 10 | lista_v.append(i) 11 | print(sum(lista_v)) -------------------------------------------------------------------------------- /Semana 1/Ejercicios_Python_DataEngineer-main/Ejercicio6_Python.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Contar cuantas veces aparece un elemento en una lista 3 | ''' 4 | def conteo(lista, elemento): 5 | contador = 0 6 | for item in lista: 7 | if item == elemento: 8 | contador = contador + 1 9 | return contador 10 | 11 | lt = [8, 6, 8, 10, 8, 20, 10, 8, 8] 12 | x = 8 #elemento 13 | print('{} aparece {} veces'.format(x, conteo(lt, x))) -------------------------------------------------------------------------------- /Semana 1/Ejercicios_SQL_DataEngineer-main/Ejercicio1_SQL.sql: -------------------------------------------------------------------------------- 1 | /* 2 | Extraer agentes cuyo nombre empiecen por M o terminen en O 3 | */ 4 | select * from agents 5 | where name like 'M%' or name like '%o' -------------------------------------------------------------------------------- /Semana 1/Ejercicios_SQL_DataEngineer-main/Ejercicio2_SQL.sql:
-------------------------------------------------------------------------------- 1 | /* 2 | Escriba una consulta que produzca una lista, en orden alfabético, 3 | de todas las distintas ocupaciones en la tabla Customer que contengan la palabra 4 | "Engineer". 5 | */ 6 | SELECT DISTINCT Occupation 7 | FROM customers 8 | WHERE Occupation LIKE '%Engineer%' 9 | ORDER BY Occupation -------------------------------------------------------------------------------- /Semana 1/Ejercicios_SQL_DataEngineer-main/Ejercicio3_SQL.sql: -------------------------------------------------------------------------------- 1 | /* 2 | Escriba una consulta que devuelva el ID del cliente, su nombre y una columna 3 | Mayor30 que contenga "Sí "si el cliente tiene más de 30 años y "No" en caso contrario. 4 | */ 5 | SELECT CustomerID, Name, 6 | CASE 7 | WHEN Age >= 30 THEN 'Yes' 8 | WHEN Age < 30 THEN 'No' 9 | ELSE 'Missing Data' 10 | END AS Over30 11 | FROM customers 12 | ORDER BY Name DESC -------------------------------------------------------------------------------- /Semana 1/Ejercicios_SQL_DataEngineer-main/Ejercicio4_SQL.sql: -------------------------------------------------------------------------------- 1 | /* 2 | Escriba una consulta que devuelva todas las llamadas realizadas a clientes de la 3 | profesión de ingeniería y muestre si son mayores o menores de 30, así como si 4 | terminaron comprando el producto de esa llamada. 5 | */ 6 | SELECT CallID, Cu.CustomerID, Name, ProductSold, 7 | CASE 8 | WHEN Age >= 30 THEN 'Yes' 9 | WHEN Age < 30 THEN 'No' 10 | ELSE 'Missing Data' 11 | END AS Over30 12 | FROM customers Cu 13 | JOIN calls Ca ON Ca.CustomerID = Cu.CustomerID 14 | WHERE Occupation LIKE '%Engineer%' 15 | ORDER BY Name DESC -------------------------------------------------------------------------------- /Semana 1/Ejercicios_SQL_DataEngineer-main/Ejercicio5_SQL.sql: -------------------------------------------------------------------------------- 1 | /* 2 | Escriba dos consultas: una que calcule las ventas totales y las llamadas totales 3 | realizadas a los clientes de la profesión de ingeniería y otra que calcule las mismas 4 | métricas para toda la base de clientes 5 | */ 6 | SELECT SUM(ProductSold) AS TotalSales, COUNT(*) AS NCalls 7 | FROM customers Cu 8 | JOIN calls Ca ON Ca.CustomerID = Cu.CustomerID 9 | WHERE Occupation LIKE '%Engineer%' -------------------------------------------------------------------------------- /Semana 1/Ejercicios_SQL_DataEngineer-main/Ejercicio6_SQL.sql: -------------------------------------------------------------------------------- 1 | /* 2 | Escriba una consulta que devuelva, para cada agente, el nombre del agente, la cantidad de llamadas, 3 | las llamadas más largas y más cortas, la duración promedio de las llamadas y la cantidad total de 4 | productos vendidos. Nombra las columnas AgentName, NCalls, Shortest, Longest, AvgDuration y TotalSales 5 | Luego ordena la tabla por AgentName en orden alfabético. 6 | (Asegúrese de incluir la cláusula WHERE PickedUp = 1 para calcular solo el promedio de todas las 7 | llamadas que fueron atendidas (de lo contrario, ¡todas las duraciones mínimas serán 0)!) 
8 | */ 9 | SELECT Name AS AgentName, COUNT(*) AS NCalls, MIN(Duration) AS Shortest, MAX(Duration) AS Longest, ROUND(AVG(Duration),2) AS AvgDuration, SUM(ProductSold) AS TotalSales 10 | FROM calls C 11 | JOIN agents A ON C.AgentID = A.AgentID 12 | WHERE PickeDup = 1 13 | GROUP BY Name 14 | ORDER BY Name -------------------------------------------------------------------------------- /Semana 1/Ejercicios_SQL_DataEngineer-main/Ejercicio7_SQL.sql: -------------------------------------------------------------------------------- 1 | /* 2 | Dos métricas del desempeño de los agentes de ventas que le interesan a su empresa son: 3 | 1) para cada agente, cuántos segundos en promedio les toma vender un producto cuando tienen éxito; 4 | y 2) para cada agente, cuántos segundos en promedio permanecen en el teléfono antes de darse por 5 | vencidos cuando no tienen éxito. Escribe una consulta que calcule esto 6 | */ 7 | SELECT a.name, 8 | SUM( 9 | CASE 10 | WHEN productsold = 0 THEN duration 11 | ELSE 0 12 | END)/SUM( 13 | CASE 14 | WHEN productsold = 0 THEN 1 15 | ELSE 0 16 | END) 17 | AS avgWhenNotSold , 18 | SUM( 19 | CASE 20 | WHEN productsold = 1 THEN duration 21 | ELSE 0 22 | END)/SUM( 23 | CASE WHEN productsold = 1 THEN 1 24 | ELSE 0 25 | END) 26 | AS avgWhenSold 27 | FROM calls c 28 | JOIN agents a ON c.agentid = a.agentid 29 | GROUP BY a.name 30 | ORDER BY 1 -------------------------------------------------------------------------------- /Semana 1/Ejercicios_SQL_DataEngineer-main/agents.csv: -------------------------------------------------------------------------------- 1 | agentid,name 2 | 0,Michele Williams 3 | 1,Jocelyn Parker 4 | 2,Christopher Moreno 5 | 3,Todd Morrow 6 | 4,Randy Moore 7 | 5,Paul Nunez 8 | 6,Gloria Singh 9 | 7,Angel Briggs 10 | 8,Lisa Cordova 11 | 9,Dana Hardy 12 | 10,Agent X 13 | -------------------------------------------------------------------------------- /Semana 10/Data_pipeline_sencillo/Mensaje_exito_cargar_data_parte2.txt: -------------------------------------------------------------------------------- 1 | *** Reading local file: /opt/airflow/logs/dag_id=ingestion_data/run_id=scheduled__2022-09-05T18:50:11.403809+00:00/task_id=load_data/attempt=1.log 2 | [2022-09-05, 19:50:27 UTC] {taskinstance.py:1179} INFO - Dependencies all met for 3 | [2022-09-05, 19:50:27 UTC] {taskinstance.py:1179} INFO - Dependencies all met for 4 | [2022-09-05, 19:50:27 UTC] {taskinstance.py:1376} INFO - 5 | -------------------------------------------------------------------------------- 6 | [2022-09-05, 19:50:27 UTC] {taskinstance.py:1377} INFO - Starting attempt 1 of 1 7 | [2022-09-05, 19:50:27 UTC] {taskinstance.py:1378} INFO - 8 | -------------------------------------------------------------------------------- 9 | [2022-09-05, 19:50:27 UTC] {taskinstance.py:1397} INFO - Executing on 2022-09-05 18:50:11.403809+00:00 10 | [2022-09-05, 19:50:27 UTC] {standard_task_runner.py:52} INFO - Started process 11070 to run task 11 | [2022-09-05, 19:50:27 UTC] {standard_task_runner.py:79} INFO - Running: ['***', 'tasks', 'run', 'ingestion_data', 'load_data', 'scheduled__2022-09-05T18:50:11.403809+00:00', '--job-id', '75', '--raw', '--subdir', 'DAGS_FOLDER/dag_pipeline_sencillo.py', '--cfg-path', '/tmp/tmp4tyd3aj0', '--error-file', '/tmp/tmp9mcdrke9'] 12 | [2022-09-05, 19:50:27 UTC] {standard_task_runner.py:80} INFO - Job 75: Subtask load_data 13 | [2022-09-05, 19:50:28 UTC] {task_command.py:371} INFO - Running on host 7122e948edb0 14 | [2022-09-05, 19:50:28 UTC] {logging_mixin.py:115} WARNING - 
/home/***/.local/lib/python3.7/site-packages/***/utils/context.py:202 AirflowContextDeprecationWarning: Accessing 'execution_date' from the template is deprecated and will be removed in a future version. Please use 'data_interval_start' or 'logical_date' instead. 15 | [2022-09-05, 19:50:28 UTC] {taskinstance.py:1591} INFO - Exporting the following env vars: 16 | AIRFLOW_CTX_DAG_OWNER=*** 17 | AIRFLOW_CTX_DAG_ID=ingestion_data 18 | AIRFLOW_CTX_TASK_ID=load_data 19 | AIRFLOW_CTX_EXECUTION_DATE=2022-09-05T18:50:11.403809+00:00 20 | AIRFLOW_CTX_TRY_NUMBER=1 21 | AIRFLOW_CTX_DAG_RUN_ID=scheduled__2022-09-05T18:50:11.403809+00:00 22 | [2022-09-05, 19:50:28 UTC] {logging_mixin.py:115} INFO - Cargando la data para la fecha: 2022-09-05 18 23 | [2022-09-05, 19:50:29 UTC] {python.py:173} INFO - Done. Returned value was: None 24 | [2022-09-05, 19:50:29 UTC] {taskinstance.py:1420} INFO - Marking task as SUCCESS. dag_id=ingestion_data, task_id=load_data, execution_date=20220905T185011, start_date=20220905T195027, end_date=20220905T195029 25 | [2022-09-05, 19:50:29 UTC] {local_task_job.py:156} INFO - Task exited with return code 0 26 | [2022-09-05, 19:50:29 UTC] {local_task_job.py:273} INFO - 0 downstream tasks scheduled from follow-on schedule check 27 | -------------------------------------------------------------------------------- /Semana 10/Data_pipeline_sencillo/Mensaje_exito_transformar_data_parte1.txt: -------------------------------------------------------------------------------- 1 | *** Reading local file: /opt/airflow/logs/dag_id=ingestion_data/run_id=scheduled__2022-09-05T18:50:11.403809+00:00/task_id=transformar_data/attempt=1.log 2 | [2022-09-05, 19:50:21 UTC] {taskinstance.py:1179} INFO - Dependencies all met for 3 | [2022-09-05, 19:50:21 UTC] {taskinstance.py:1179} INFO - Dependencies all met for 4 | [2022-09-05, 19:50:21 UTC] {taskinstance.py:1376} INFO - 5 | -------------------------------------------------------------------------------- 6 | [2022-09-05, 19:50:21 UTC] {taskinstance.py:1377} INFO - Starting attempt 1 of 1 7 | [2022-09-05, 19:50:21 UTC] {taskinstance.py:1378} INFO - 8 | -------------------------------------------------------------------------------- 9 | [2022-09-05, 19:50:21 UTC] {taskinstance.py:1397} INFO - Executing on 2022-09-05 18:50:11.403809+00:00 10 | [2022-09-05, 19:50:21 UTC] {standard_task_runner.py:52} INFO - Started process 11018 to run task 11 | [2022-09-05, 19:50:21 UTC] {standard_task_runner.py:79} INFO - Running: ['***', 'tasks', 'run', 'ingestion_data', 'transformar_data', 'scheduled__2022-09-05T18:50:11.403809+00:00', '--job-id', '74', '--raw', '--subdir', 'DAGS_FOLDER/dag_pipeline_sencillo.py', '--cfg-path', '/tmp/tmpzpd86657', '--error-file', '/tmp/tmpuudd19nk'] 12 | [2022-09-05, 19:50:21 UTC] {standard_task_runner.py:80} INFO - Job 74: Subtask transformar_data 13 | [2022-09-05, 19:50:21 UTC] {task_command.py:371} INFO - Running on host 7122e948edb0 14 | [2022-09-05, 19:50:22 UTC] {logging_mixin.py:115} WARNING - /home/***/.local/lib/python3.7/site-packages/***/utils/context.py:202 AirflowContextDeprecationWarning: Accessing 'execution_date' from the template is deprecated and will be removed in a future version. Please use 'data_interval_start' or 'logical_date' instead. 
15 | [2022-09-05, 19:50:22 UTC] {taskinstance.py:1591} INFO - Exporting the following env vars: 16 | AIRFLOW_CTX_DAG_OWNER=*** 17 | AIRFLOW_CTX_DAG_ID=ingestion_data 18 | AIRFLOW_CTX_TASK_ID=transformar_data 19 | AIRFLOW_CTX_EXECUTION_DATE=2022-09-05T18:50:11.403809+00:00 20 | AIRFLOW_CTX_TRY_NUMBER=1 21 | AIRFLOW_CTX_DAG_RUN_ID=scheduled__2022-09-05T18:50:11.403809+00:00 22 | [2022-09-05, 19:50:22 UTC] {logging_mixin.py:115} INFO - Adquiriendo data para la fecha: 2022-09-05 18 23 | [2022-09-05, 19:50:22 UTC] {logging_mixin.py:115} WARNING - /opt/***/dags/dag_pipeline_sencillo.py:36 FutureWarning: In a future version of pandas all arguments of DataFrame.drop except for the argument 'labels' will be keyword-only 24 | [2022-09-05, 19:50:22 UTC] {python.py:173} INFO - Done. Returned value was: None 25 | [2022-09-05, 19:50:22 UTC] {taskinstance.py:1420} INFO - Marking task as SUCCESS. dag_id=ingestion_data, task_id=transformar_data, execution_date=20220905T185011, start_date=20220905T195021, end_date=20220905T195022 26 | [2022-09-05, 19:50:22 UTC] {local_task_job.py:156} INFO - Task exited with return code 0 27 | [2022-09-05, 19:50:22 UTC] {local_task_job.py:273} INFO - 1 downstream tasks scheduled from follow-on schedule check 28 | -------------------------------------------------------------------------------- /Semana 10/Data_pipeline_sencillo/booking.csv: -------------------------------------------------------------------------------- 1 | client_id,booking_date,room_type,hotel_id,booking_cost,currency 2 | 2,2021/08/07,standard_1_bed,1,2910.0,GBP 3 | 4,2017/04/12,standard_1_bed,2,2910.0,GBP 4 | 5,2018/06/15,standard_1_bed,4,2910.0,EUR 5 | 1,2016/03/10,standard_1_bed,4,2910.0,EUR 6 | 2,2016/03/15,first_class_1_bed,4,2910.0,EUR 7 | -------------------------------------------------------------------------------- /Semana 10/Data_pipeline_sencillo/client.csv: -------------------------------------------------------------------------------- 1 | client_id,age,name,type 2 | 1,34.0,Ann,standard 3 | 2,38.0,Ben,standard 4 | 3,30.0,Tom,standard 5 | 4,43.0,Bianca,VIP 6 | 5,49.0,Caroline,standard 7 | 6,28.0,Kate,VIP 8 | -------------------------------------------------------------------------------- /Semana 10/Data_pipeline_sencillo/docker-compose.yaml: -------------------------------------------------------------------------------- 1 | # Licensed to the Apache Software Foundation (ASF) under one 2 | # or more contributor license agreements. See the NOTICE file 3 | # distributed with this work for additional information 4 | # regarding copyright ownership. The ASF licenses this file 5 | # to you under the Apache License, Version 2.0 (the 6 | # "License"); you may not use this file except in compliance 7 | # with the License. You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, 12 | # software distributed under the License is distributed on an 13 | # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | # KIND, either express or implied. See the License for the 15 | # specific language governing permissions and limitations 16 | # under the License. 17 | # 18 | 19 | # Basic Airflow cluster configuration for CeleryExecutor with Redis and PostgreSQL. 20 | # 21 | # WARNING: This configuration is for local development. Do not use it in a production deployment. 
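# Note: although the stock header above mentions CeleryExecutor with Redis, this trimmed-down file actually runs
# the LocalExecutor (see AIRFLOW__CORE__EXECUTOR below) and only starts Postgres, the webserver, the scheduler and
# a one-off init service.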
22 | # 23 | # This configuration supports basic configuration using environment variables or an .env file 24 | # The following variables are supported: 25 | # 26 | # AIRFLOW_IMAGE_NAME - Docker image name used to run Airflow. 27 | # Default: apache/airflow:2.3.3 28 | # AIRFLOW_UID - User ID in Airflow containers 29 | # Default: 50000 30 | # Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode 31 | # 32 | # _AIRFLOW_WWW_USER_USERNAME - Username for the administrator account (if requested). 33 | # Default: airflow 34 | # _AIRFLOW_WWW_USER_PASSWORD - Password for the administrator account (if requested). 35 | # Default: airflow 36 | # _PIP_ADDITIONAL_REQUIREMENTS - Additional PIP requirements to add when starting all containers. 37 | # Default: '' 38 | # 39 | # Feel free to modify this file to suit your needs. 40 | --- 41 | version: '3' 42 | x-airflow-common: 43 | &airflow-common 44 | # In order to add custom dependencies or upgrade provider packages you can use your extended image. 45 | # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml 46 | # and uncomment the "build" line below, Then run `docker-compose build` to build the images. 47 | image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.3.3} 48 | # build: . 49 | environment: 50 | &airflow-common-env 51 | AIRFLOW__CORE__EXECUTOR: LocalExecutor 52 | AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow 53 | # For backward compatibility, with Airflow <2.3 54 | AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow 55 | AIRFLOW__CORE__FERNET_KEY: '' 56 | AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true' 57 | AIRFLOW__CORE__LOAD_EXAMPLES: 'false' 58 | AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL: 10 59 | volumes: 60 | - ./dags:/opt/airflow/dags 61 | - ./logs:/opt/airflow/logs 62 | - ./plugins:/opt/airflow/plugins 63 | - ./db:/usr/local/airflow/db 64 | - ./raw_data:/opt/airflow/raw_data 65 | - ./processed_data:/opt/airflow/processed_data 66 | user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}" 67 | depends_on: 68 | postgres: 69 | condition: service_healthy 70 | 71 | services: 72 | postgres: 73 | image: postgres:13 74 | environment: 75 | POSTGRES_USER: airflow 76 | POSTGRES_PASSWORD: airflow 77 | POSTGRES_DB: airflow 78 | volumes: 79 | - postgres-db-volume:/var/lib/postgresql/data 80 | ports: 81 | - 5432:5432 82 | healthcheck: 83 | test: ["CMD", "pg_isready", "-U", "airflow"] 84 | interval: 5s 85 | retries: 5 86 | restart: always 87 | 88 | airflow-webserver: 89 | <<: *airflow-common 90 | command: webserver 91 | ports: 92 | - 8080:8080 93 | healthcheck: 94 | test: ["CMD", "curl", "--fail", "http://localhost:8080/health"] 95 | interval: 10s 96 | timeout: 10s 97 | retries: 5 98 | restart: always 99 | 100 | airflow-scheduler: 101 | <<: *airflow-common 102 | command: scheduler 103 | restart: always 104 | 105 | airflow-init: 106 | <<: *airflow-common 107 | # yamllint disable rule:line-length 108 | command: version 109 | environment: 110 | <<: *airflow-common-env 111 | _AIRFLOW_DB_UPGRADE: 'true' 112 | _AIRFLOW_WWW_USER_CREATE: 'true' 113 | _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow} 114 | _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow} 115 | 116 | volumes: 117 | postgres-db-volume: 118 | -------------------------------------------------------------------------------- /Semana 10/Data_pipeline_sencillo/file: 
-------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Semana 10/Data_pipeline_sencillo/hotel.csv: -------------------------------------------------------------------------------- 1 | hotel_id,name,address 2 | 1,Astro Resort,address1 3 | 2,Dream Connect,address2 4 | 3,Green Acres,address3 5 | 4,Millennium Times Square,address5 6 | 5,The Clift Royal,address5 7 | 6,The New View,address6 8 | -------------------------------------------------------------------------------- /Semana 10/Ejemplo en vivo ETL Bitcoin.rar: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 10/Ejemplo en vivo ETL Bitcoin.rar -------------------------------------------------------------------------------- /Semana 10/Mensaje_exito_tarea1.txt: -------------------------------------------------------------------------------- 1 | *** Reading local file: /opt/airflow/logs/dag_id=dag_con_conexion_postgres/run_id=manual__2022-09-04T20:01:37.601817+00:00/task_id=crear_tabla_postgres/attempt=1.log 2 | [2022-09-04, 20:01:38 UTC] {taskinstance.py:1179} INFO - Dependencies all met for 3 | [2022-09-04, 20:01:38 UTC] {taskinstance.py:1179} INFO - Dependencies all met for 4 | [2022-09-04, 20:01:38 UTC] {taskinstance.py:1376} INFO - 5 | -------------------------------------------------------------------------------- 6 | [2022-09-04, 20:01:38 UTC] {taskinstance.py:1377} INFO - Starting attempt 1 of 6 7 | [2022-09-04, 20:01:38 UTC] {taskinstance.py:1378} INFO - 8 | -------------------------------------------------------------------------------- 9 | [2022-09-04, 20:01:38 UTC] {taskinstance.py:1397} INFO - Executing on 2022-09-04 20:01:37.601817+00:00 10 | [2022-09-04, 20:01:38 UTC] {standard_task_runner.py:52} INFO - Started process 1614 to run task 11 | [2022-09-04, 20:01:38 UTC] {standard_task_runner.py:79} INFO - Running: ['***', 'tasks', 'run', 'dag_con_conexion_postgres', 'crear_tabla_postgres', 'manual__2022-09-04T20:01:37.601817+00:00', '--job-id', '35', '--raw', '--subdir', 'DAGS_FOLDER/dag_postgres_database.py', '--cfg-path', '/tmp/tmpefrikk0_', '--error-file', '/tmp/tmpq6m4hu2v'] 12 | [2022-09-04, 20:01:38 UTC] {standard_task_runner.py:80} INFO - Job 35: Subtask crear_tabla_postgres 13 | [2022-09-04, 20:01:38 UTC] {task_command.py:371} INFO - Running on host 6076183c9d93 14 | [2022-09-04, 20:01:39 UTC] {taskinstance.py:1591} INFO - Exporting the following env vars: 15 | AIRFLOW_CTX_DAG_OWNER=DavidBU 16 | AIRFLOW_CTX_DAG_ID=dag_con_conexion_postgres 17 | AIRFLOW_CTX_TASK_ID=crear_tabla_postgres 18 | AIRFLOW_CTX_EXECUTION_DATE=2022-09-04T20:01:37.601817+00:00 19 | AIRFLOW_CTX_TRY_NUMBER=1 20 | AIRFLOW_CTX_DAG_RUN_ID=manual__2022-09-04T20:01:37.601817+00:00 21 | [2022-09-04, 20:01:39 UTC] {base.py:68} INFO - Using connection ID 'postgres_localhost' for task execution. 22 | [2022-09-04, 20:01:39 UTC] {dbapi.py:231} INFO - Running statement: 23 | create table if not exists fin_mundo( 24 | dt date, 25 | pais varchar(30) 26 | ) 27 | , parameters: None 28 | [2022-09-04, 20:01:39 UTC] {taskinstance.py:1420} INFO - Marking task as SUCCESS. 
dag_id=dag_con_conexion_postgres, task_id=crear_tabla_postgres, execution_date=20220904T200137, start_date=20220904T200138, end_date=20220904T200139 29 | [2022-09-04, 20:01:39 UTC] {local_task_job.py:156} INFO - Task exited with return code 0 30 | [2022-09-04, 20:01:39 UTC] {local_task_job.py:273} INFO - 1 downstream tasks scheduled from follow-on schedule check 31 | -------------------------------------------------------------------------------- /Semana 10/Mensaje_exito_tarea2.txt: -------------------------------------------------------------------------------- 1 | *** Reading local file: /opt/airflow/logs/dag_id=dag_con_conexion_postgres/run_id=manual__2022-09-04T20:01:37.601817+00:00/task_id=insertar_en_tabla/attempt=1.log 2 | [2022-09-04, 20:01:39 UTC] {taskinstance.py:1179} INFO - Dependencies all met for 3 | [2022-09-04, 20:01:39 UTC] {taskinstance.py:1179} INFO - Dependencies all met for 4 | [2022-09-04, 20:01:39 UTC] {taskinstance.py:1376} INFO - 5 | -------------------------------------------------------------------------------- 6 | [2022-09-04, 20:01:39 UTC] {taskinstance.py:1377} INFO - Starting attempt 1 of 6 7 | [2022-09-04, 20:01:39 UTC] {taskinstance.py:1378} INFO - 8 | -------------------------------------------------------------------------------- 9 | [2022-09-04, 20:01:39 UTC] {taskinstance.py:1397} INFO - Executing on 2022-09-04 20:01:37.601817+00:00 10 | [2022-09-04, 20:01:39 UTC] {standard_task_runner.py:52} INFO - Started process 1636 to run task 11 | [2022-09-04, 20:01:39 UTC] {standard_task_runner.py:79} INFO - Running: ['***', 'tasks', 'run', 'dag_con_conexion_postgres', 'insertar_en_tabla', 'manual__2022-09-04T20:01:37.601817+00:00', '--job-id', '36', '--raw', '--subdir', 'DAGS_FOLDER/dag_postgres_database.py', '--cfg-path', '/tmp/tmpukbt6rtz', '--error-file', '/tmp/tmpd8yg252j'] 12 | [2022-09-04, 20:01:39 UTC] {standard_task_runner.py:80} INFO - Job 36: Subtask insertar_en_tabla 13 | [2022-09-04, 20:01:40 UTC] {task_command.py:371} INFO - Running on host 6076183c9d93 14 | [2022-09-04, 20:01:40 UTC] {taskinstance.py:1591} INFO - Exporting the following env vars: 15 | AIRFLOW_CTX_DAG_OWNER=DavidBU 16 | AIRFLOW_CTX_DAG_ID=dag_con_conexion_postgres 17 | AIRFLOW_CTX_TASK_ID=insertar_en_tabla 18 | AIRFLOW_CTX_EXECUTION_DATE=2022-09-04T20:01:37.601817+00:00 19 | AIRFLOW_CTX_TRY_NUMBER=1 20 | AIRFLOW_CTX_DAG_RUN_ID=manual__2022-09-04T20:01:37.601817+00:00 21 | [2022-09-04, 20:01:40 UTC] {base.py:68} INFO - Using connection ID 'postgres_localhost' for task execution. 22 | [2022-09-04, 20:01:40 UTC] {dbapi.py:231} INFO - Running statement: 23 | insert into fin_mundo (dt,pais) values ('12-12-2025','Colombia'); 24 | insert into fin_mundo (dt,pais) values ('15-08-2035','Brasil'); 25 | insert into fin_mundo (dt,pais) values ('21-09-2030','Argentina'); 26 | insert into fin_mundo (dt,pais) values ('13-07-2045','Chile'); 27 | insert into fin_mundo (dt,pais) values ('17-11-2028','Ecuador'); 28 | insert into fin_mundo (dt,pais) values ('19-03-2032','Peru'); 29 | insert into fin_mundo (dt,pais) values ('18-08-2026','Uruguay'); 30 | insert into fin_mundo (dt,pais) values ('22-05-2037','Paraguay'); 31 | insert into fin_mundo (dt,pais) values ('12-12-2080','Venezuela'); 32 | insert into fin_mundo (dt,pais) values ('12-12-2071','Mexico'); 33 | , parameters: None 34 | [2022-09-04, 20:01:40 UTC] {dbapi.py:239} INFO - Rows affected: 1 35 | [2022-09-04, 20:01:40 UTC] {taskinstance.py:1420} INFO - Marking task as SUCCESS. 
dag_id=dag_con_conexion_postgres, task_id=insertar_en_tabla, execution_date=20220904T200137, start_date=20220904T200139, end_date=20220904T200140 36 | [2022-09-04, 20:01:40 UTC] {local_task_job.py:156} INFO - Task exited with return code 0 37 | [2022-09-04, 20:01:40 UTC] {local_task_job.py:273} INFO - 0 downstream tasks scheduled from follow-on schedule check 38 | -------------------------------------------------------------------------------- /Semana 10/Video 2_ Lanzando Airflow con Docker/docker-compose.yaml: -------------------------------------------------------------------------------- 1 | # Licensed to the Apache Software Foundation (ASF) under one 2 | # or more contributor license agreements. See the NOTICE file 3 | # distributed with this work for additional information 4 | # regarding copyright ownership. The ASF licenses this file 5 | # to you under the Apache License, Version 2.0 (the 6 | # "License"); you may not use this file except in compliance 7 | # with the License. You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, 12 | # software distributed under the License is distributed on an 13 | # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | # KIND, either express or implied. See the License for the 15 | # specific language governing permissions and limitations 16 | # under the License. 17 | # 18 | 19 | # Basic Airflow cluster configuration for CeleryExecutor with Redis and PostgreSQL. 20 | # 21 | # WARNING: This configuration is for local development. Do not use it in a production deployment. 22 | # 23 | # This configuration supports basic configuration using environment variables or an .env file 24 | # The following variables are supported: 25 | # 26 | # AIRFLOW_IMAGE_NAME - Docker image name used to run Airflow. 27 | # Default: apache/airflow:2.3.3 28 | # AIRFLOW_UID - User ID in Airflow containers 29 | # Default: 50000 30 | # Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode 31 | # 32 | # _AIRFLOW_WWW_USER_USERNAME - Username for the administrator account (if requested). 33 | # Default: airflow 34 | # _AIRFLOW_WWW_USER_PASSWORD - Password for the administrator account (if requested). 35 | # Default: airflow 36 | # _PIP_ADDITIONAL_REQUIREMENTS - Additional PIP requirements to add when starting all containers. 37 | # Default: '' 38 | # 39 | # Feel free to modify this file to suit your needs. 40 | --- 41 | version: '3' 42 | x-airflow-common: 43 | &airflow-common 44 | # In order to add custom dependencies or upgrade provider packages you can use your extended image. 45 | # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml 46 | # and uncomment the "build" line below, Then run `docker-compose build` to build the images. 47 | image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.3.3} 48 | # build: . 
49 | environment: 50 | &airflow-common-env 51 | AIRFLOW__CORE__EXECUTOR: LocalExecutor 52 | AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow 53 | # For backward compatibility, with Airflow <2.3 54 | AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow 55 | AIRFLOW__CORE__FERNET_KEY: '' 56 | AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true' 57 | AIRFLOW__CORE__LOAD_EXAMPLES: 'false' 58 | AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL: 10 59 | volumes: 60 | - ./dags:/opt/airflow/dags 61 | - ./logs:/opt/airflow/logs 62 | - ./plugins:/opt/airflow/plugins 63 | user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}" 64 | depends_on: 65 | postgres: 66 | condition: service_healthy 67 | 68 | services: 69 | postgres: 70 | image: postgres:13 71 | environment: 72 | POSTGRES_USER: airflow 73 | POSTGRES_PASSWORD: airflow 74 | POSTGRES_DB: airflow 75 | volumes: 76 | - postgres-db-volume:/var/lib/postgresql/data 77 | ports: 78 | - 5432:5432 79 | healthcheck: 80 | test: ["CMD", "pg_isready", "-U", "airflow"] 81 | interval: 5s 82 | retries: 5 83 | restart: always 84 | 85 | airflow-webserver: 86 | <<: *airflow-common 87 | command: webserver 88 | ports: 89 | - 8080:8080 90 | healthcheck: 91 | test: ["CMD", "curl", "--fail", "http://localhost:8080/health"] 92 | interval: 10s 93 | timeout: 10s 94 | retries: 5 95 | restart: always 96 | 97 | airflow-scheduler: 98 | <<: *airflow-common 99 | command: scheduler 100 | restart: always 101 | 102 | airflow-init: 103 | <<: *airflow-common 104 | # yamllint disable rule:line-length 105 | command: version 106 | environment: 107 | <<: *airflow-common-env 108 | _AIRFLOW_DB_UPGRADE: 'true' 109 | _AIRFLOW_WWW_USER_CREATE: 'true' 110 | _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow} 111 | _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow} 112 | 113 | volumes: 114 | postgres-db-volume: 115 | -------------------------------------------------------------------------------- /Semana 10/Video 3_ Tasks y Operators/primer_dag_v3.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from airflow import DAG 3 | from airflow.operators.bash import BashOperator 4 | 5 | default_args={ 6 | 'owner': 'DavidBU', 7 | 'retries': 5, 8 | 'retry_delay': timedelta(minutes=2) # 2 min de espera antes de cualquier re intento 9 | } 10 | 11 | with DAG( 12 | dag_id="mi_primer_dag_v3", 13 | default_args= default_args, 14 | description="Este es el primer DAG que creamos", 15 | start_date=datetime(2022,8,1,2),# esto dice que debemos iniciar el 1-Ago-2022 y a un intervalo diario 16 | schedule_interval='@daily' ) as dag: 17 | task1= BashOperator(task_id='primera_tarea', 18 | bash_command='echo hola mundo, esta es nuestra primera tarea!' 
19 | ) 20 | 21 | task2 =BashOperator( 22 | task_id='segunda_tarea', 23 | bash_command="echo hola, soy la tarea 2 y sere corrida luego de la Tarea 1" 24 | ) 25 | 26 | task3 =BashOperator( 27 | task_id= 'tercera_tarea', 28 | bash_command='echo hola, soy la tarea 3 y sere corrida luego de Tarea 1 al mismo tiempo que Tarea 2' 29 | ) 30 | 31 | task1.set_downstream(task2) 32 | task1.set_downstream(task3) 33 | -------------------------------------------------------------------------------- /Semana 10/Video 4_ DAGs/attrib: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Semana 10/Video 4_ DAGs/primer_dag.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from airflow import DAG 3 | from airflow.operators.bash import BashOperator 4 | 5 | default_args={ 6 | 'owner': 'DavidBU', 7 | 'retries': 5, 8 | 'retry_delay': timedelta(minutes=2) # 2 min de espera antes de cualquier re intento 9 | } 10 | 11 | with DAG( 12 | dag_id="mi_primer_dag", 13 | default_args= default_args, 14 | description="Este es el primer DAG que creamos", 15 | start_date=datetime(2022,8,1,2),# esto dice que debemos iniciar el 1-Ago-2022 y a un intervalo diario 16 | schedule_interval='@daily' ) as dag: 17 | task1= BashOperator(task_id='primer_task', 18 | bash_command='echo hola mundo, esta es nuestra primera tarea!' 19 | ) 20 | 21 | task1 -------------------------------------------------------------------------------- /Semana 10/Video 4_ DAGs/primer_dag_v2.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from airflow import DAG 3 | from airflow.operators.bash import BashOperator 4 | 5 | default_args={ 6 | 'owner': 'DavidBU', 7 | 'retries': 5, 8 | 'retry_delay': timedelta(minutes=2) # 2 min de espera antes de cualquier re intento 9 | } 10 | 11 | with DAG( 12 | dag_id="mi_primer_dag_v2", 13 | default_args= default_args, 14 | description="Este es el primer DAG que creamos", 15 | start_date=datetime(2022,8,1,2),# esto dice que debemos iniciar el 1-Ago-2022 y a un intervalo diario 16 | schedule_interval='@daily' ) as dag: 17 | task1= BashOperator(task_id='primer_task', 18 | bash_command='echo hola mundo, esta es nuestra primera tarea!' 19 | ) 20 | 21 | task2 =BashOperator( 22 | task_id='segunda_tarea', 23 | bash_command="echo hola, soy la tarea 2 y sere corrida luego de la Tarea 1" 24 | ) 25 | 26 | task1.set_downstream(task2) -------------------------------------------------------------------------------- /Semana 10/Video 4_ DAGs/primer_dag_v3.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from airflow import DAG 3 | from airflow.operators.bash import BashOperator 4 | 5 | default_args={ 6 | 'owner': 'DavidBU', 7 | 'retries': 5, 8 | 'retry_delay': timedelta(minutes=2) # 2 min de espera antes de cualquier re intento 9 | } 10 | 11 | with DAG( 12 | dag_id="mi_primer_dag_v3", 13 | default_args= default_args, 14 | description="Este es el primer DAG que creamos", 15 | start_date=datetime(2022,8,1,2),# esto dice que debemos iniciar el 1-Ago-2022 y a un intervalo diario 16 | schedule_interval='@daily' ) as dag: 17 | task1= BashOperator(task_id='primer_task', 18 | bash_command='echo hola mundo, esta es nuestra primera tarea!' 
19 | ) 20 | 21 | task2 =BashOperator( 22 | task_id='segunda_tarea', 23 | bash_command="echo hola, soy la tarea 2 y sere corrida luego de la Tarea 1" 24 | ) 25 | 26 | task3 =BashOperator( 27 | task_id= 'tercera_tarea', 28 | bash_command='echo hola, soy la tarea 3 y sere corrida luego de Tarea 1 al mismo tiempo que Tarea 2' 29 | ) 30 | 31 | task1.set_downstream(task2) 32 | task1.set_downstream(task3) -------------------------------------------------------------------------------- /Semana 10/Video 4_ DAGs/primer_tag_python_operator.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from airflow import DAG 3 | from airflow.operators.python import PythonOperator 4 | 5 | default_args={ 6 | 'owner': 'DavidBU', 7 | 'retries':5, 8 | 'retry_delay': timedelta(minutes=3) 9 | } 10 | 11 | def aloha_david(): 12 | print("Aloha Mundo soy yo!") 13 | 14 | with DAG( 15 | default_args=default_args, 16 | dag_id='mi_primer_dar_con_PythonOperator', 17 | description= 'Nuestro primer dag usando python Operator', 18 | start_date=datetime(2022,8,1,2), 19 | schedule_interval='@daily' 20 | ) as dag: 21 | task1= PythonOperator( 22 | task_id='aloha_david', 23 | python_callable= aloha_david, 24 | ) 25 | 26 | task1 27 | -------------------------------------------------------------------------------- /Semana 10/Video 6_ Context/attrib: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Semana 10/Video 6_ Context/ejemplo_template.py: -------------------------------------------------------------------------------- 1 | import airflow.utils.dates 2 | from airflow import DAG 3 | from airflow.operators.python import PythonOperator 4 | 5 | dag= DAG ( 6 | dag_id= "dag_print_context", 7 | start_date=airflow.utils.dates.days_ago(3), 8 | schedule_interval="@daily", 9 | tags=['Ejemplo',"David"]# estos apareceran debajo del DAG en la plataforma 10 | ) 11 | 12 | def _print_context(**context): 13 | print(context) # el context es un diccionario con todas las variables por default 14 | start=context["execution_date"], 15 | end=context['next_execution_date'], 16 | print(f"Inicio:{start}, Fin: {end}") 17 | 18 | print_context= PythonOperator( 19 | task_id="print_context", 20 | python_callable=_print_context, #nombre de la funcion 21 | dag=dag 22 | ) -------------------------------------------------------------------------------- /Semana 10/dag_con_backfilling.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from airflow import DAG 3 | from airflow.operators.bash import BashOperator 4 | 5 | default_args={ 6 | 'owner':'DavidBU', 7 | 'retries':5, 8 | 'retry_delay':timedelta(minutes=5) 9 | 10 | } 11 | 12 | with DAG( 13 | default_args=default_args, 14 | dag_id='dag_con_backfilling', 15 | description= 'Dag con catchup', 16 | start_date=datetime(2022,9,1), 17 | schedule_interval='@daily', 18 | catchup=False 19 | ) as dag: 20 | task1=BashOperator( 21 | task_id='tarea1', 22 | bash_command='echo Esto es un DAG con catchup' 23 | ) 24 | task1 -------------------------------------------------------------------------------- /Semana 10/dag_con_catchup.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from airflow import DAG 3 | from airflow.operators.bash import BashOperator 4 | 5 | 
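# Quick reference: with catchup=True (set below), the scheduler creates one run per
# missed interval between start_date and the current date once the DAG is unpaused.
# The companion dag_con_backfilling.py uses catchup=False, so its past intervals only
# run when launched by hand; a minimal sketch with illustrative dates:
#   airflow dags backfill dag_con_backfilling -s 2022-09-01 -e 2022-09-05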
default_args={ 6 | 'owner':'DavidBU', 7 | 'retries':5, 8 | 'retry_delay':timedelta(minutes=5) 9 | 10 | } 11 | 12 | with DAG( 13 | default_args=default_args, 14 | dag_id='dag_con_catchup', 15 | description= 'Dag con catchup', 16 | start_date=datetime(2022,9,1), 17 | schedule_interval='@daily', 18 | catchup=True 19 | ) as dag: 20 | task1=BashOperator( 21 | task_id='tarea1', 22 | bash_command='echo Esto es un DAG con catchup' 23 | ) 24 | task1 -------------------------------------------------------------------------------- /Semana 10/dag_pipeline_sencillo.py: -------------------------------------------------------------------------------- 1 | from datetime import timedelta,datetime 2 | from pathlib import Path 3 | from airflow import DAG 4 | # Operadores 5 | from airflow.operators.python_operator import PythonOperator 6 | # funcion days_ago 7 | from airflow.utils.dates import days_ago 8 | import pandas as pd 9 | import sqlite3 10 | import os 11 | # Obtener el dag del directorio 12 | dag_path = os.getcwd() 13 | 14 | # funcion de transformacion de datos 15 | def transformar_data(exec_date): 16 | try: 17 | print(f"Adquiriendo data para la fecha: {exec_date}") 18 | date = datetime.strptime(exec_date, '%Y-%m-%d %H') 19 | file_date_path = f"{date.strftime('%Y-%m-%d')}/{date.hour}" 20 | # Leer la data 21 | booking = pd.read_csv(f"{dag_path}/raw_data/booking.csv", low_memory=False) 22 | client = pd.read_csv(f"{dag_path}/raw_data/client.csv", low_memory=False) 23 | hotel = pd.read_csv(f"{dag_path}/raw_data/hotel.csv", low_memory=False) 24 | # Hacer el merge de booking con client 25 | data = pd.merge(booking, client, on='client_id') 26 | data.rename(columns={'name': 'client_name', 'type': 'client_type'}, inplace=True) 27 | # hacer merge de booking, client & hotel 28 | data = pd.merge(data, hotel, on='hotel_id') 29 | data.rename(columns={'name': 'hotel_name'}, inplace=True) 30 | # convertir a formato fecha 31 | data.booking_date = pd.to_datetime(data.booking_date, infer_datetime_format=True) 32 | # Convertir todo a GBP currency con filtros 33 | data.loc[data.currency == 'EUR', ['booking_cost']] = data.booking_cost * 0.8 34 | data.currency.replace("EUR", "GBP", inplace=True) 35 | # remover columnas innecesarias 36 | data = data.drop('address', 1) 37 | # cargar la data procesada 38 | output_dir = Path(f'{dag_path}/processed_data/{file_date_path}') 39 | output_dir.mkdir(parents=True, exist_ok=True) 40 | data.to_csv(output_dir / f"{file_date_path}.csv".replace("/", "_"), index=False, mode='a') 41 | #data.to_csv(f"{dag_path}/processed_data/processed_data.csv", index=False) 42 | except ValueError as e: 43 | print("Formato datetime deberia ser %Y-%m-%d %H", e) 44 | raise e 45 | 46 | # funcion de carga de datos en base de datos 47 | def cargar_data(exec_date): 48 | print(f"Cargando la data para la fecha: {exec_date}") 49 | date = datetime.strptime(exec_date, '%Y-%m-%d %H') 50 | file_date_path = f"{date.strftime('%Y-%m-%d')}/{date.hour}" 51 | conn = sqlite3.connect("/usr/local/airflow/db/datascience.db") 52 | c = conn.cursor() 53 | c.execute(''' 54 | CREATE TABLE IF NOT EXISTS booking_record ( 55 | client_id INTEGER NOT NULL, 56 | booking_date TEXT NOT NULL, 57 | room_type TEXT(512) NOT NULL, 58 | hotel_id INTEGER NOT NULL, 59 | booking_cost NUMERIC, 60 | currency TEXT, 61 | age INTEGER, 62 | client_name TEXT(512), 63 | client_type TEXT(512), 64 | hotel_name TEXT(512) 65 | ); 66 | ''') 67 | #records = pd.read_csv(f"{dag_path}/processed_data/processed_data.csv") #leer tabla creada 68 | processed_file = 
f"{dag_path}/processed_data/{file_date_path}/{file_date_path.replace('/', '_')}.csv" 69 | records = pd.read_csv(processed_file) 70 | records.to_sql('booking_record', conn, index=False, if_exists='append') 71 | #records.to_sql('booking_record', conn, if_exists='replace', index=False) # mandarla a la base de datos 72 | 73 | 74 | # argumentos por defecto para el DAG 75 | default_args = { 76 | 'owner': 'DavidBU', 77 | 'start_date': days_ago(5) 78 | } 79 | 80 | ingestion_dag = DAG( 81 | dag_id='ingestion_data', 82 | default_args=default_args, 83 | description='Agrega records de reservas para analisis', 84 | schedule_interval=timedelta(hours=1), 85 | catchup=False 86 | ) 87 | 88 | task_1 = PythonOperator( 89 | task_id='transformar_data', 90 | python_callable=transformar_data, 91 | op_args=["{{ ds }} {{ execution_date.hour }}"], 92 | dag=ingestion_dag, 93 | ) 94 | 95 | task_2 = PythonOperator( 96 | task_id='load_data', 97 | python_callable=cargar_data, 98 | op_args=["{{ ds }} {{ execution_date.hour }}"], 99 | dag=ingestion_dag, 100 | ) 101 | 102 | 103 | task_1 >> task_2 104 | -------------------------------------------------------------------------------- /Semana 10/dag_postgres_database.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from email.policy import default 3 | from airflow import DAG 4 | from airflow.providers.postgres.operators.postgres import PostgresOperator 5 | 6 | default_args={ 7 | 'owner': 'DavidBU', 8 | 'retries':5, 9 | 'retry_delay': timedelta(minutes=5) 10 | } 11 | 12 | with DAG( 13 | default_args=default_args, 14 | dag_id='dag_con_conexion_postgres', 15 | description= 'Nuestro primer dag usando python Operator', 16 | start_date=datetime(2022,9,3), 17 | schedule_interval='0 0 * * *' 18 | ) as dag: 19 | task1= PostgresOperator( 20 | task_id='crear_tabla_postgres', 21 | postgres_conn_id= 'postgres_localhost', 22 | sql=""" 23 | create table if not exists fin_mundo( 24 | dt date, 25 | pais varchar(30) 26 | ) 27 | """ 28 | ) 29 | task2 =PostgresOperator( 30 | task_id='insertar_en_tabla', 31 | postgres_conn_id= 'postgres_localhost', 32 | sql=""" 33 | insert into fin_mundo (dt,pais) values ('12-12-2025','Colombia'); 34 | insert into fin_mundo (dt,pais) values ('15-08-2035','Brasil'); 35 | insert into fin_mundo (dt,pais) values ('21-09-2030','Argentina'); 36 | insert into fin_mundo (dt,pais) values ('13-07-2045','Chile'); 37 | insert into fin_mundo (dt,pais) values ('17-11-2028','Ecuador'); 38 | insert into fin_mundo (dt,pais) values ('19-03-2032','Peru'); 39 | insert into fin_mundo (dt,pais) values ('18-08-2026','Uruguay'); 40 | insert into fin_mundo (dt,pais) values ('22-05-2037','Paraguay'); 41 | insert into fin_mundo (dt,pais) values ('12-12-2080','Venezuela'); 42 | insert into fin_mundo (dt,pais) values ('12-12-2071','Mexico'); 43 | """ 44 | ) 45 | task1 >> task2 46 | -------------------------------------------------------------------------------- /Semana 10/docker-compose.yaml: -------------------------------------------------------------------------------- 1 | # Licensed to the Apache Software Foundation (ASF) under one 2 | # or more contributor license agreements. See the NOTICE file 3 | # distributed with this work for additional information 4 | # regarding copyright ownership. The ASF licenses this file 5 | # to you under the Apache License, Version 2.0 (the 6 | # "License"); you may not use this file except in compliance 7 | # with the License. 
You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, 12 | # software distributed under the License is distributed on an 13 | # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | # KIND, either express or implied. See the License for the 15 | # specific language governing permissions and limitations 16 | # under the License. 17 | # 18 | 19 | # Basic Airflow cluster configuration for CeleryExecutor with Redis and PostgreSQL. 20 | # 21 | # WARNING: This configuration is for local development. Do not use it in a production deployment. 22 | # 23 | # This configuration supports basic configuration using environment variables or an .env file 24 | # The following variables are supported: 25 | # 26 | # AIRFLOW_IMAGE_NAME - Docker image name used to run Airflow. 27 | # Default: apache/airflow:2.3.3 28 | # AIRFLOW_UID - User ID in Airflow containers 29 | # Default: 50000 30 | # Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode 31 | # 32 | # _AIRFLOW_WWW_USER_USERNAME - Username for the administrator account (if requested). 33 | # Default: airflow 34 | # _AIRFLOW_WWW_USER_PASSWORD - Password for the administrator account (if requested). 35 | # Default: airflow 36 | # _PIP_ADDITIONAL_REQUIREMENTS - Additional PIP requirements to add when starting all containers. 37 | # Default: '' 38 | # 39 | # Feel free to modify this file to suit your needs. 40 | --- 41 | version: '3' 42 | x-airflow-common: 43 | &airflow-common 44 | # In order to add custom dependencies or upgrade provider packages you can use your extended image. 45 | # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml 46 | # and uncomment the "build" line below, Then run `docker-compose build` to build the images. 47 | image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.3.3} 48 | # build: . 
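# The dag_con_conexion_postgres DAG in this folder expects an Airflow connection named
# postgres_localhost. One way to create it against the postgres service defined below
# (a sketch; adjust host and credentials to your own setup):
#
#   docker-compose exec airflow-webserver airflow connections add postgres_localhost \
#     --conn-type postgres --conn-host postgres --conn-login airflow \
#     --conn-password airflow --conn-schema airflow --conn-port 5432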
49 | environment: 50 | &airflow-common-env 51 | AIRFLOW__CORE__EXECUTOR: LocalExecutor 52 | AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow 53 | # For backward compatibility, with Airflow <2.3 54 | AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow 55 | AIRFLOW__CORE__FERNET_KEY: '' 56 | AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true' 57 | AIRFLOW__CORE__LOAD_EXAMPLES: 'false' 58 | AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL: 10 59 | volumes: 60 | - ./dags:/opt/airflow/dags 61 | - ./logs:/opt/airflow/logs 62 | - ./plugins:/opt/airflow/plugins 63 | user: "${AIRFLOW_UID:-50000}:${AIRFLOW_GID:-50000}" 64 | depends_on: 65 | postgres: 66 | condition: service_healthy 67 | 68 | services: 69 | postgres: 70 | image: postgres:13 71 | environment: 72 | POSTGRES_USER: airflow 73 | POSTGRES_PASSWORD: airflow 74 | POSTGRES_DB: airflow 75 | volumes: 76 | - postgres-db-volume:/var/lib/postgresql/data 77 | ports: 78 | - 5432:5432 79 | healthcheck: 80 | test: ["CMD", "pg_isready", "-U", "airflow"] 81 | interval: 5s 82 | retries: 5 83 | restart: always 84 | 85 | airflow-webserver: 86 | <<: *airflow-common 87 | command: webserver 88 | ports: 89 | - 8080:8080 90 | healthcheck: 91 | test: ["CMD", "curl", "--fail", "http://localhost:8080/health"] 92 | interval: 10s 93 | timeout: 10s 94 | retries: 5 95 | restart: always 96 | 97 | airflow-scheduler: 98 | <<: *airflow-common 99 | command: scheduler 100 | restart: always 101 | 102 | airflow-init: 103 | <<: *airflow-common 104 | # yamllint disable rule:line-length 105 | command: version 106 | environment: 107 | <<: *airflow-common-env 108 | _AIRFLOW_DB_UPGRADE: 'true' 109 | _AIRFLOW_WWW_USER_CREATE: 'true' 110 | _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow} 111 | _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow} 112 | 113 | volumes: 114 | postgres-db-volume: 115 | -------------------------------------------------------------------------------- /Semana 10/primer_dag_v4.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from airflow import DAG 3 | from airflow.operators.bash import BashOperator 4 | 5 | default_args={ 6 | 'owner': 'DavidBU', 7 | 'retries': 5, 8 | 'retry_delay': timedelta(minutes=2) # 2 min de espera antes de cualquier re intento 9 | } 10 | 11 | with DAG( 12 | dag_id="mi_primer_dag_v4", 13 | default_args= default_args, 14 | description="DAG de ejemplo para imprimir en logs", 15 | start_date=datetime(2022,9,3,2),# esto dice que debemos iniciar el 1-Ago-2022 y a un intervalo diario 16 | schedule_interval='@daily' ) as dag: 17 | task1= BashOperator(task_id='primera_tarea', 18 | bash_command='echo Chile' 19 | ) 20 | 21 | task2 =BashOperator( 22 | task_id='segunda_tarea', 23 | bash_command="echo David" 24 | ) 25 | 26 | task3 =BashOperator( 27 | task_id= 'tercera_tarea', 28 | bash_command='echo Bustos Usta' 29 | ) 30 | task1 >> task2 >> task3 -------------------------------------------------------------------------------- /Semana 11/Actividad_XCOMS.py: -------------------------------------------------------------------------------- 1 | from airflow import DAG 2 | from airflow.operators.python import PythonOperator, BranchPythonOperator 3 | from datetime import datetime 4 | from random import randint 5 | 6 | def _performance(): 7 | return randint(1,10) 8 | 9 | def _elegir_mejor_valor(ti): 10 | #esta funcion trea los tres valores de las tareas creadas 11 | 
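# The branching logic in this function assumes that xcom_pull(task_ids=[...]) returns
# the values in the same order as the task_ids list (empleado_A, empleado_B,
# empleado_C), since it compares against performance[0], performance[1] and the
# remaining case.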
performance=ti.xcom_pull(task_ids=['empleado_A','empleado_B','empleado_C']) 12 | mejor_performance=max(performance) # elige el valor mas grande de la lista 13 | # condicional para elegir el valor mas grande 14 | if mejor_performance == performance[0]: 15 | [ti.xcom_push(key='mejor_algo',value='empleado_A'),ti.xcom_push(key='mejor_performance',value=mejor_performance)] 16 | elif mejor_performance == performance[1]: 17 | [ti.xcom_push(key='mejor_algo',value='empleado_B'),ti.xcom_push(key='mejor_performance',value=mejor_performance)] 18 | else: 19 | [ti.xcom_push(key='mejor_algo',value='empleado_C'),ti.xcom_push(key='mejor_performance',value=mejor_performance)] 20 | return 'usar_algo' # devuelve una tarea en este caso 21 | 22 | def _usar_tarea(ti): 23 | nombre= ti.xcom_pull(key='mejor_algo',task_ids='elegir_mejor_valor') 24 | performance_tarea= ti.xcom_pull(key='mejor_performance', task_ids='elegir_mejor_valor') 25 | print('Usando', nombre + ' con valor de:', performance_tarea) 26 | 27 | 28 | default_args={ 29 | 'owner': 'DavidBU', 30 | 'start_date': datetime(2022,9,7), 31 | 'end_date': datetime(2022,12,20), 32 | 'depends_on_past': False, 33 | 'email': ['davidbu@gcp.com'], 34 | 'email_on_failure': False 35 | } 36 | with DAG( 37 | dag_id ='dag_xcoms_facil', 38 | default_args=default_args, 39 | schedule_interval='@daily', 40 | catchup =False ) as dag: 41 | empleado_A= PythonOperator( 42 | task_id= 'empleado_A', 43 | python_callable= _performance 44 | ) 45 | empleado_B= PythonOperator( 46 | task_id= 'empleado_B', 47 | python_callable= _performance 48 | ) 49 | empleado_C= PythonOperator( 50 | task_id= 'empleado_C', 51 | python_callable= _performance 52 | ) 53 | elegir_mejor =BranchPythonOperator( 54 | task_id= 'elegir_mejor_valor', 55 | python_callable=_elegir_mejor_valor 56 | ) 57 | usar_tarea= PythonOperator( 58 | task_id='usar_algo', 59 | python_callable=_usar_tarea 60 | ) 61 | 62 | [empleado_A,empleado_B,empleado_C] >> elegir_mejor >> usar_tarea 63 | -------------------------------------------------------------------------------- /Semana 11/Microdesafio_Semana11.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | from email import message 3 | from airflow.models import DAG, Variable 4 | from airflow.operators.python_operator import PythonOperator 5 | 6 | import smtplib 7 | 8 | pais=['Argentina','Brasil','Colombia','Chile','Paraguay','Uruguay','Venezuela','Peru','Ecuador','Bolivia','México'] 9 | acronimo= ['AR','BR','CO','CL','PY','UR','VE','PE','EC','BO','MX'] 10 | lista_fin_mundo=[2040,2080,2095,2100,2089,2093,2054,2078,2079,2083,2071] 11 | 12 | texto=[] 13 | 14 | for i in range(len(pais)): 15 | string='Pais {} ({}), Fecha fin mundo estimada: {}'.format(pais[i], acronimo[i],lista_fin_mundo[i]) 16 | texto.append(string) 17 | 18 | final = '\n'.join(texto) 19 | print(final) 20 | 21 | def enviar(): 22 | try: 23 | x=smtplib.SMTP('smtp.gmail.com',587) 24 | x.starttls() 25 | x.login('tu_email@gmail.com','NO_TE_DIRE_MI_CONTRASEÑA') # Cambia tu contraseña !!!!!!!! 
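# A safer variant (sketch): since airflow.models.Variable is already imported at the
# top of this file, the credential could be stored as an Airflow Variable and read
# here instead of being hard-coded, e.g.
#   x.login(Variable.get("smtp_user"), Variable.get("smtp_password"))
# where "smtp_user" and "smtp_password" are illustrative Variable names.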
26 | subject='Fechas fin del mundo' 27 | body_text=final 28 | message='Subject: {}\n\n{}'.format(subject,body_text) 29 | x.sendmail('tu_email@gmail.com','destinatario@gmail.com',message) 30 | print('Exito') 31 | except Exception as exception: 32 | print(exception) 33 | print('Failure') 34 | 35 | default_args={ 36 | 'owner': 'DavidBU', 37 | 'start_date': datetime(2022,9,7) 38 | } 39 | 40 | with DAG( 41 | dag_id='dag_smtp_email_fin_mundo', 42 | default_args=default_args, 43 | schedule_interval='@daily') as dag: 44 | 45 | tarea_1=PythonOperator( 46 | task_id='dag_envio_fin_mundo', 47 | python_callable=enviar 48 | ) 49 | 50 | tarea_1 51 | -------------------------------------------------------------------------------- /Semana 11/Video 1_ Parametros DAG/attrib: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Semana 11/Video 1_ Parametros DAG/ejemplo_params.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from airflow import DAG 3 | from airflow.operators.bash import BashOperator 4 | 5 | # Aqui creamos los default arguments 6 | default_args={ 7 | 'owner': 'DavidBU', 8 | 'depends_on_past': True, 9 | 'email': ['dafbustosus@unal.edu.co'], 10 | 'email_on_retry':False, 11 | 'email_on_failure': False, 12 | 'retries':5, 13 | 'retry_delay': timedelta(minutes=1) 14 | } 15 | 16 | # Creamos el objeto DAG 17 | with DAG( 18 | dag_id="dag_para_explicar_atributos", 19 | default_args= default_args, 20 | description="En este DAG explicamos atributos importantes de los DAGs", 21 | start_date=datetime(2022,8,3,2), 22 | #end_date=datetime(2022,8,1,10), 23 | tags=['Ejemplo',"David" ,'Params'], # estos apareceran debajo del DAG en la plataforma 24 | schedule_interval="@daily") as dag: 25 | task1= BashOperator(task_id='primera_tarea', 26 | bash_command='echo Primer tarea mi nombre es David!' 
27 | ) 28 | 29 | task2 =BashOperator( 30 | task_id='segunda_tarea', 31 | bash_command="echo Segunda tarea completada soy Ingeniero" 32 | ) 33 | 34 | task3 =BashOperator( 35 | task_id= 'tercera_tarea', 36 | bash_command='echo Tercera tarea completada me gusta programar' 37 | ) 38 | 39 | task1 >> task2 >> task3 -------------------------------------------------------------------------------- /Semana 11/Video 2_ Sensors/dag_sensors.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from airflow import DAG 3 | import airflow.utils.dates 4 | from airflow.operators.python import PythonOperator 5 | from airflow.contrib.sensors.file_sensor import FileSensor 6 | from airflow.operators.dummy_operator import DummyOperator 7 | from airflow.sensors.python import PythonSensor 8 | from pathlib import Path 9 | 10 | default_args={ 11 | 'owner': 'DavidBU', 12 | 'depends_on_past': False, 13 | 'email': ['dafbustosus@unal.edu.co'], 14 | 'email_on_retry':False, 15 | 'email_on_failure': False, 16 | 'retries':10, 17 | 'retry_delay': timedelta(minutes=1) 18 | } 19 | 20 | dag = DAG( 21 | dag_id='data_sensors_DBU', 22 | start_date=airflow.utils.dates.days_ago(3), 23 | schedule_interval='@daily', 24 | default_args=default_args 25 | ) 26 | 27 | def _wait_for_supermarket(compania_id): 28 | compania_path = Path("/data/" + compania_id) 29 | data_files = compania_path.glob("data-*.csv") 30 | success_file = compania_path / "_SUCCESS" 31 | return data_files and success_file.exists() 32 | 33 | wait_for_supermarket_1 = PythonSensor( 34 | task_id="esperando_por_compania_1", 35 | python_callable=_wait_for_supermarket, 36 | op_kwargs={"compania_id": "compania1"}, 37 | dag=dag, 38 | ) -------------------------------------------------------------------------------- /Semana 11/Video 2_ Sensors/dag_sensors2.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime, timedelta 2 | from airflow import DAG 3 | import airflow.utils.dates 4 | from airflow.operators.python import PythonOperator 5 | from airflow.contrib.sensors.file_sensor import FileSensor 6 | 7 | default_args={ 8 | 'owner': 'DavidBU', 9 | 'depends_on_past': False, 10 | 'email': ['dafbustosus@unal.edu.co'], 11 | 'email_on_retry':False, 12 | 'email_on_failure': False, 13 | 'retries':10, 14 | 'retry_delay': timedelta(minutes=1) 15 | } 16 | 17 | dag = DAG( 18 | dag_id='data_sensors_DBU', 19 | start_date=airflow.utils.dates.days_ago(3), 20 | schedule_interval='@daily', 21 | default_args=default_args 22 | ) 23 | 24 | 25 | def print_message(): 26 | print("Llegó el archivo!") 27 | 28 | 29 | file_sensor = FileSensor( 30 | task_id="sensar_archivo", 31 | poke_interval=60, 32 | timeout=60 * 30, 33 | filepath='/opt/airflow/data/compania1/data-*.csv', 34 | ) 35 | 36 | 37 | imprimir = PythonOperator( 38 | task_id="print_message", 39 | dag=dag, 40 | python_callable=print_message 41 | ) 42 | 43 | 44 | file_sensor >> imprimir 45 | -------------------------------------------------------------------------------- /Semana 11/Video 3_ XCOMS/dag_con_xcom.py: -------------------------------------------------------------------------------- 1 | from airflow import DAG 2 | from airflow.operators.bash import BashOperator 3 | from airflow.operators.python import PythonOperator 4 | from random import uniform 5 | from datetime import datetime 6 | 7 | default_args = { 8 | 'start_date': datetime(2022, 8, 4) 9 | } 10 | 11 | def _entrenamiento_modelo(ti): # 
ticorresponde al objeto task instance 12 | accuracy = uniform(0.01, 1.0) 13 | print(f'Accuracy de modelo\'s: {accuracy}') 14 | ti.xcom_push(key='accuracy_modelo', value=accuracy) 15 | 16 | def _elegir_mejor_modelo(ti): 17 | print('elegir el mejor modelo') 18 | accuracies = ti.xcom_pull(key='accuracy_modelo', task_ids=['entrenando_modelo_A', 'entrenando_modelo_B', 'entrenando_modelo_C']) 19 | print(accuracies) 20 | 21 | with DAG('xcom_dag', schedule_interval='@daily', default_args=default_args, catchup=False) as dag: 22 | # primera tarea: descargar data... simulacion 23 | descargando_data = BashOperator( 24 | task_id='descargando_data', 25 | bash_command='sleep 3', #esperar por 3 segundos 26 | do_xcom_push=False 27 | ) 28 | # segunda, tercera y cuarta tarea: entrenar modelos..... simulacion 29 | tarea_entrenamiento = [ 30 | PythonOperator( 31 | task_id=f'entrenando_modelo_{task}', 32 | python_callable=_entrenamiento_modelo 33 | ) for task in ['A', 'B', 'C']] 34 | 35 | elegir_modelo = PythonOperator( 36 | task_id='elegir_modelo', 37 | python_callable=_elegir_mejor_modelo 38 | ) 39 | 40 | descargando_data >> tarea_entrenamiento >> elegir_modelo 41 | -------------------------------------------------------------------------------- /Semana 11/Video 3_ XCOMS/dag_con_xcom2.py: -------------------------------------------------------------------------------- 1 | import airflow 2 | from airflow import DAG 3 | from airflow.operators.bash import BashOperator 4 | from airflow.operators.python import PythonOperator 5 | from random import uniform 6 | from datetime import datetime 7 | import requests 8 | import json 9 | from datetime import timedelta 10 | 11 | default_args = { 12 | 'owner': 'David BU', 13 | 'depends_on_past': False, 14 | 'email_on_failure': False, 15 | 'email_on_retry': False, 16 | 'retries': 1, 17 | 'retry_delay': timedelta(minutes=5) 18 | } 19 | 20 | def conexion_api(ti): 21 | url = 'https://fakestoreapi.com' 22 | res = requests.get(f"{url}/products") 23 | data=json.loads(res.content) 24 | promedio=sum([x['price'] for x in data])/len(data) 25 | ti.xcom_push(key='descargando_data',value=promedio) 26 | 27 | def analizando_data(ti): 28 | promedio_venta= ti.xcom_pull(key='descargando_data', 29 | task_ids='conexion_api') 30 | print('El promedio de precios es:', promedio_venta) 31 | 32 | with DAG( 33 | dag_id='xcom_dag2', 34 | schedule_interval='@daily', 35 | start_date=datetime(2022, 8, 20), 36 | catchup=False 37 | ) as dag: 38 | obtener_data = PythonOperator( 39 | task_id='descargando_data', 40 | python_callable=conexion_api, 41 | #do_xcom_push=True 42 | ) 43 | 44 | hacer_analisis = PythonOperator( 45 | task_id='obtener_promedio', 46 | python_callable=analizando_data, 47 | #do_xcom_push=True 48 | ) 49 | 50 | obtener_data >> hacer_analisis 51 | -------------------------------------------------------------------------------- /Semana 11/Video 4_ TaskGroups y Depencias_SubDAGs/subdags.py: -------------------------------------------------------------------------------- 1 | from airflow import DAG 2 | from airflow.operators.bash import BashOperator 3 | from airflow.utils.task_group import TaskGroup 4 | from datetime import datetime, timedelta 5 | 6 | 7 | # Aqui creamos los default arguments 8 | default_args={ 9 | 'owner': 'DavidBU', 10 | 'depends_on_past': True, 11 | 'email': ['dafbustosus@unal.edu.co'], 12 | 'email_on_retry':False, 13 | 'email_on_failure': False, 14 | 'retries':5, 15 | 'retry_delay': timedelta(minutes=1), 16 | 'start_date': datetime(2022,8,4) 17 | } 18 | 19 | with DAG( 20 | 
dag_id='dag_paralelo', schedule_interval='@daily',default_args=default_args, catchup=False) as dag: 21 | tarea_1= BashOperator( 22 | task_id='tarea_1', 23 | bash_command='sleep 3' 24 | ) 25 | 26 | with TaskGroup('procesando_tareas') as procesando_tareas: 27 | tarea_2=BashOperator( 28 | task_id='tarea_2', 29 | bash_command='sleep 3' 30 | ) 31 | 32 | tarea_3= BashOperator( 33 | task_id='tarea_3', 34 | bash_command='sleep 3' 35 | ) 36 | 37 | tarea_4= BashOperator( 38 | task_id='tarea_4', 39 | bash_command='sleep 3' 40 | ) 41 | 42 | tarea_1 >> procesando_tareas >> tarea_4 43 | -------------------------------------------------------------------------------- /Semana 11/Video 5_ Airflow.cfg/attrib: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Semana 11/dag_smtp_email.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | from email import message 3 | from airflow.models import DAG, Variable 4 | from airflow.operators.python_operator import PythonOperator 5 | 6 | import smtplib 7 | 8 | def enviar(): 9 | try: 10 | x=smtplib.SMTP('smtp.gmail.com',587) 11 | x.starttls() 12 | x.login('cuenta_remitente@gmail.com','password') 13 | subject='Ganaste un premio' 14 | body_text='Has ganado un premio fantastico!!!!' 15 | message='Subject: {}\n\n{}'.format(subject,body_text) 16 | x.sendmail('cuenta_remitente@gmail.com','cuenta_destinatario@gmail.com',message) 17 | print('Exito') 18 | except Exception as exception: 19 | print(exception) 20 | print('Failure') 21 | 22 | default_args={ 23 | 'owner': 'DavidBU', 24 | 'start_date': datetime(2022,9,6) 25 | } 26 | 27 | with DAG( 28 | dag_id='dag_smtp_email_automatico', 29 | default_args=default_args, 30 | schedule_interval='@daily') as dag: 31 | 32 | tarea_1=PythonOperator( 33 | task_id='dag_envio', 34 | python_callable=enviar 35 | ) 36 | 37 | tarea_1 38 | -------------------------------------------------------------------------------- /Semana 11/docker-compose.yaml: -------------------------------------------------------------------------------- 1 | # Licensed to the Apache Software Foundation (ASF) under one 2 | # or more contributor license agreements. See the NOTICE file 3 | # distributed with this work for additional information 4 | # regarding copyright ownership. The ASF licenses this file 5 | # to you under the Apache License, Version 2.0 (the 6 | # "License"); you may not use this file except in compliance 7 | # with the License. You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, 12 | # software distributed under the License is distributed on an 13 | # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | # KIND, either express or implied. See the License for the 15 | # specific language governing permissions and limitations 16 | # under the License. 17 | # 18 | 19 | # Basic Airflow cluster configuration for CeleryExecutor with Redis and PostgreSQL. 20 | # 21 | # WARNING: This configuration is for local development. Do not use it in a production deployment. 22 | # 23 | # This configuration supports basic configuration using environment variables or an .env file 24 | # The following variables are supported: 25 | # 26 | # AIRFLOW_IMAGE_NAME - Docker image name used to run Airflow. 
27 | # Default: apache/airflow:2.2.3 28 | # AIRFLOW_UID - User ID in Airflow containers 29 | # Default: 50000 30 | # Those configurations are useful mostly in case of standalone testing/running Airflow in test/try-out mode 31 | # 32 | # _AIRFLOW_WWW_USER_USERNAME - Username for the administrator account (if requested). 33 | # Default: airflow 34 | # _AIRFLOW_WWW_USER_PASSWORD - Password for the administrator account (if requested). 35 | # Default: airflow 36 | # _PIP_ADDITIONAL_REQUIREMENTS - Additional PIP requirements to add when starting all containers. 37 | # Default: '' 38 | # 39 | # Feel free to modify this file to suit your needs. 40 | --- 41 | version: '3' 42 | x-airflow-common: 43 | &airflow-common 44 | # In order to add custom dependencies or upgrade provider packages you can use your extended image. 45 | # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml 46 | # and uncomment the "build" line below, Then run `docker-compose build` to build the images. 47 | image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.2.3} 48 | # build: . 49 | environment: 50 | &airflow-common-env 51 | AIRFLOW__CORE__EXECUTOR: CeleryExecutor 52 | AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow 53 | AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow 54 | AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0 55 | AIRFLOW__CORE__FERNET_KEY: '' 56 | AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true' 57 | AIRFLOW__CORE__LOAD_EXAMPLES: 'true' 58 | AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth' 59 | _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-} 60 | volumes: 61 | - ./dags:/opt/airflow/dags 62 | - ./logs:/opt/airflow/logs 63 | - ./plugins:/opt/airflow/plugins 64 | - ./input_files:/opt/airflow/input_files 65 | - ./output_files:/opt/airflow/output_files 66 | - ./config:/opt/airflow 67 | user: "${AIRFLOW_UID:-50000}:0" 68 | depends_on: 69 | &airflow-common-depends-on 70 | redis: 71 | condition: service_healthy 72 | postgres: 73 | condition: service_healthy 74 | 75 | services: 76 | postgres: 77 | image: postgres:13 78 | environment: 79 | POSTGRES_USER: airflow 80 | POSTGRES_PASSWORD: airflow 81 | POSTGRES_DB: airflow 82 | volumes: 83 | - postgres-db-volume:/var/lib/postgresql/data 84 | healthcheck: 85 | test: ["CMD", "pg_isready", "-U", "airflow"] 86 | interval: 5s 87 | retries: 5 88 | restart: always 89 | 90 | redis: 91 | image: redis:latest 92 | expose: 93 | - 6379 94 | healthcheck: 95 | test: ["CMD", "redis-cli", "ping"] 96 | interval: 5s 97 | timeout: 30s 98 | retries: 50 99 | restart: always 100 | 101 | airflow-webserver: 102 | <<: *airflow-common 103 | command: webserver 104 | ports: 105 | - 8080:8080 106 | healthcheck: 107 | test: ["CMD", "curl", "--fail", "http://localhost:8080/health"] 108 | interval: 10s 109 | timeout: 10s 110 | retries: 5 111 | restart: always 112 | depends_on: 113 | <<: *airflow-common-depends-on 114 | airflow-init: 115 | condition: service_completed_successfully 116 | 117 | airflow-scheduler: 118 | <<: *airflow-common 119 | command: scheduler 120 | healthcheck: 121 | test: ["CMD-SHELL", 'airflow jobs check --job-type SchedulerJob --hostname "$${HOSTNAME}"'] 122 | interval: 10s 123 | timeout: 10s 124 | retries: 5 125 | restart: always 126 | depends_on: 127 | <<: *airflow-common-depends-on 128 | airflow-init: 129 | condition: service_completed_successfully 130 | 131 | airflow-worker: 132 | <<: *airflow-common 133 | command: celery worker 
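# With CeleryExecutor, additional copies of this worker service can be started if a
# single worker is not enough; a sketch (the replica count is illustrative):
#   docker-compose up -d --scale airflow-worker=2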
134 | healthcheck: 135 | test: 136 | - "CMD-SHELL" 137 | - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"' 138 | interval: 10s 139 | timeout: 10s 140 | retries: 5 141 | environment: 142 | <<: *airflow-common-env 143 | # Required to handle warm shutdown of the celery workers properly 144 | # See https://airflow.apache.org/docs/docker-stack/entrypoint.html#signal-propagation 145 | DUMB_INIT_SETSID: "0" 146 | restart: always 147 | depends_on: 148 | <<: *airflow-common-depends-on 149 | airflow-init: 150 | condition: service_completed_successfully 151 | 152 | airflow-triggerer: 153 | <<: *airflow-common 154 | command: triggerer 155 | healthcheck: 156 | test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${HOSTNAME}"'] 157 | interval: 10s 158 | timeout: 10s 159 | retries: 5 160 | restart: always 161 | depends_on: 162 | <<: *airflow-common-depends-on 163 | airflow-init: 164 | condition: service_completed_successfully 165 | 166 | airflow-init: 167 | <<: *airflow-common 168 | entrypoint: /bin/bash 169 | # yamllint disable rule:line-length 170 | command: 171 | - -c 172 | - | 173 | function ver() { 174 | printf "%04d%04d%04d%04d" $${1//./ } 175 | } 176 | airflow_version=$$(gosu airflow airflow version) 177 | airflow_version_comparable=$$(ver $${airflow_version}) 178 | min_airflow_version=2.2.0 179 | min_airflow_version_comparable=$$(ver $${min_airflow_version}) 180 | if (( airflow_version_comparable < min_airflow_version_comparable )); then 181 | echo 182 | echo -e "\033[1;31mERROR!!!: Too old Airflow version $${airflow_version}!\e[0m" 183 | echo "The minimum Airflow version supported: $${min_airflow_version}. Only use this or higher!" 184 | echo 185 | exit 1 186 | fi 187 | if [[ -z "${AIRFLOW_UID}" ]]; then 188 | echo 189 | echo -e "\033[1;33mWARNING!!!: AIRFLOW_UID not set!\e[0m" 190 | echo "If you are on Linux, you SHOULD follow the instructions below to set " 191 | echo "AIRFLOW_UID environment variable, otherwise files will be owned by root." 192 | echo "For other operating systems you can get rid of the warning with manually created .env file:" 193 | echo " See: https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#setting-the-right-airflow-user" 194 | echo 195 | fi 196 | one_meg=1048576 197 | mem_available=$$(($$(getconf _PHYS_PAGES) * $$(getconf PAGE_SIZE) / one_meg)) 198 | cpus_available=$$(grep -cE 'cpu[0-9]+' /proc/stat) 199 | disk_available=$$(df / | tail -1 | awk '{print $$4}') 200 | warning_resources="false" 201 | if (( mem_available < 4000 )) ; then 202 | echo 203 | echo -e "\033[1;33mWARNING!!!: Not enough memory available for Docker.\e[0m" 204 | echo "At least 4GB of memory required. You have $$(numfmt --to iec $$((mem_available * one_meg)))" 205 | echo 206 | warning_resources="true" 207 | fi 208 | if (( cpus_available < 2 )); then 209 | echo 210 | echo -e "\033[1;33mWARNING!!!: Not enough CPUS available for Docker.\e[0m" 211 | echo "At least 2 CPUs recommended. You have $${cpus_available}" 212 | echo 213 | warning_resources="true" 214 | fi 215 | if (( disk_available < one_meg * 10 )); then 216 | echo 217 | echo -e "\033[1;33mWARNING!!!: Not enough Disk space available for Docker.\e[0m" 218 | echo "At least 10 GBs recommended. 
You have $$(numfmt --to iec $$((disk_available * 1024 )))" 219 | echo 220 | warning_resources="true" 221 | fi 222 | if [[ $${warning_resources} == "true" ]]; then 223 | echo 224 | echo -e "\033[1;33mWARNING!!!: You have not enough resources to run Airflow (see above)!\e[0m" 225 | echo "Please follow the instructions to increase amount of resources available:" 226 | echo " https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html#before-you-begin" 227 | echo 228 | fi 229 | mkdir -p /sources/logs /sources/dags /sources/plugins 230 | chown -R "${AIRFLOW_UID}:0" /sources/{logs,dags,plugins} 231 | exec /entrypoint airflow version 232 | # yamllint enable rule:line-length 233 | environment: 234 | <<: *airflow-common-env 235 | _AIRFLOW_DB_UPGRADE: 'true' 236 | _AIRFLOW_WWW_USER_CREATE: 'true' 237 | _AIRFLOW_WWW_USER_USERNAME: ${_AIRFLOW_WWW_USER_USERNAME:-airflow} 238 | _AIRFLOW_WWW_USER_PASSWORD: ${_AIRFLOW_WWW_USER_PASSWORD:-airflow} 239 | user: "0:0" 240 | volumes: 241 | - .:/sources 242 | 243 | airflow-cli: 244 | <<: *airflow-common 245 | profiles: 246 | - debug 247 | environment: 248 | <<: *airflow-common-env 249 | CONNECTION_CHECK_MAX_COUNT: "0" 250 | # Workaround for entrypoint issue. See: https://github.com/apache/airflow/issues/16252 251 | command: 252 | - bash 253 | - -c 254 | - airflow 255 | 256 | flower: 257 | <<: *airflow-common 258 | command: celery flower 259 | ports: 260 | - 5555:5555 261 | healthcheck: 262 | test: ["CMD", "curl", "--fail", "http://localhost:5555/"] 263 | interval: 10s 264 | timeout: 10s 265 | retries: 5 266 | restart: always 267 | depends_on: 268 | <<: *airflow-common-depends-on 269 | airflow-init: 270 | condition: service_completed_successfully 271 | 272 | volumes: 273 | postgres-db-volume: 274 | -------------------------------------------------------------------------------- /Semana 2/Ejemplo_en_vivo.sql: -------------------------------------------------------------------------------- 1 | -- Momento 1 (creacion de tablas) 2 | CREATE TABLE customers( 3 | customerid INT primary key, 4 | name VARCHAR(50), 5 | occupation VARCHAR(50), 6 | email VARCHAR(50), 7 | company VARCHAR(50), 8 | phonenumber VARCHAR(20), 9 | age INT 10 | ); 11 | 12 | CREATE TABLE agents( 13 | agentid INT primary key, 14 | name VARCHAR(50) 15 | ); 16 | 17 | CREATE TABLE calls( 18 | callid INT primary key, 19 | agentid INT, 20 | customerid INT, 21 | pickedup SMALLINT, 22 | duration INT, 23 | productsold SMALLINT 24 | ); 25 | 26 | -- Momento 2 (Insercion de registros - analogo- OLTP) 27 | SET STATISTICS TIME ON; 28 | INSERT INTO dbo.calls VALUES (10000, 4,6, 1, 130, 1); 29 | INSERT INTO dbo.calls VALUES (10001, 5,7, 1, 131, 0); 30 | INSERT INTO dbo.calls VALUES (10002, 10,260, 0, 0, 0); 31 | INSERT INTO dbo.calls VALUES (10003, 3,5, 1, 60, 1); 32 | INSERT INTO dbo.calls VALUES (10004, 10,731, 1, 90, 0); 33 | INSERT INTO dbo.calls VALUES (10005, 4,415, 0, 0, 0); 34 | SET STATISTICS TIME OFF; 35 | GO 36 | 37 | 38 | -- Momento 3 (Generacion de consulta analoga a OLAP) 39 | SET STATISTICS TIME ON; 40 | SELECT a.name AS AgentName, cu.name AS CustomerName, x.duration 41 | FROM 42 | ( 43 | SELECT ca.agentid, ca.duration, max(customerid) AS cid 44 | FROM 45 | ( 46 | SELECT agentid, min(duration) as fastestcall 47 | FROM calls 48 | WHERE productsold = 1 49 | GROUP BY agentid 50 | ) min 51 | JOIN calls ca ON ca.agentid = min.agentid AND ca.duration = min.fastestcall 52 | WHERE productsold = 1 53 | GROUP BY ca.agentid, ca.duration 54 | ) x 55 | JOIN agents a ON x.agentid = a.agentid 56 | JOIN 
customers cu ON cu.customerid = x.cid 57 | SET STATISTICS TIME OFF; 58 | GO 59 | -------------------------------------------------------------------------------- /Semana 2/agents.csv: -------------------------------------------------------------------------------- 1 | agentid,name 2 | 0,Michele Williams 3 | 1,Jocelyn Parker 4 | 2,Christopher Moreno 5 | 3,Todd Morrow 6 | 4,Randy Moore 7 | 5,Paul Nunez 8 | 6,Gloria Singh 9 | 7,Angel Briggs 10 | 8,Lisa Cordova 11 | 9,Dana Hardy 12 | 10,Agent X 13 | -------------------------------------------------------------------------------- /Semana 3/Creacion_tablas_microdesafio.sql: -------------------------------------------------------------------------------- 1 | CREATE TABLE articulos.titulos 2 | (titulo_id char(6) NOT NULL, 3 | titulo varchar(80) NOT NULL, 4 | tipo char(20) NOT NULL); 5 | GO 6 | 7 | INSERT INTO articulos.titulos VALUES ('1', 'Consultas SQL','bbdd'); 8 | INSERT INTO articulos.titulos VALUES ('3', 'Grupo recursos Azure','administracion'); 9 | INSERT INTO articulos.titulos VALUES ('4', '.NET Framework 4.5','programacion'); 10 | INSERT INTO articulos.titulos VALUES ('5', 'Programacion C#','dev'); 11 | INSERT INTO articulos.titulos VALUES ('7', 'Power BI','BI'); 12 | INSERT INTO articulos.titulos VALUES ('8', 'Administracion Sql server','administracion'); 13 | 14 | CREATE TABLE articulos.autores 15 | (TituloId char(6) NOT NULL, 16 | NombreAutor nVarchar(100) NOT NULL, 17 | ApellidosAutor nVarchar(100) NOT NULL, 18 | TelefonoAutor nVarChar(25) 19 | ); 20 | INSERT INTO articulos.autores VALUES ('3', 'David', 'Saenz', '99897867'); 21 | INSERT INTO articulos.autores VALUES ('8', 'Ana', 'Ruiz', '99897466'); 22 | INSERT INTO articulos.autores VALUES ('2', 'Julian', 'Perez', '99897174'); 23 | INSERT INTO articulos.autores VALUES ('1', 'Andres', 'Calamaro', '99876869'); 24 | INSERT INTO articulos.autores VALUES ('4', 'Cidys', 'Castillo', '998987453'); 25 | INSERT INTO articulos.autores VALUES ('5', 'Pedro', 'Molina', '99891768'); 26 | -------------------------------------------------------------------------------- /Semana 3/Ejemplo_en_vivo_MongoDB.sh: -------------------------------------------------------------------------------- 1 | -- Version mongo 2 | mongo 3 | -- Ver databases disponibles 4 | show dbs 5 | -- seleccionar una BD 6 | use datos 7 | -- verificar BD seleccionada 8 | db 9 | -- Crear usuarios 10 | db.createUser({"user":"brad", pwd:"david123",roles:["readWrite","dbAdmin"]}) 11 | -- Crear colecciones 12 | db.createCollection('clientes'); 13 | show collections; 14 | -- Insert 15 | db.clientes.insert({nombres:"David", apellido:"Bustos Usta"}); 16 | -- Insertar datos 17 | db.clientes.insert({nombres:"David", apellido:"Bustos Usta"}); 18 | -- ver registros de coleccion 19 | db.clientes.find(); 20 | -- ver de mejor manera 21 | db.clientes.find().pretty(); 22 | -------------------------------------------------------------------------------- /Semana 3/Ejemplo_en_vivo_SQLServer.sql: -------------------------------------------------------------------------------- 1 | -- DDL 2 | -- CREATE 3 | CREATE TABLE dbo.sales 4 | ( 5 | ID int, 6 | Description varchar (8000), 7 | CustomerID int, 8 | Price decimal(8,2) 9 | ) 10 | -- ALTER 11 | ALTER TABLE dbo.sales ADD taxid int NULL; 12 | -- DROP 13 | Drop table dbo.sales 14 | -- DML 15 | -- SELECT 16 | Select ID,Description 17 | From dbo.sales 18 | -- INSERT 19 | Insert into dbo.sales values(1,’HP Product’,3,1233) 20 | -- UPDATE 21 | UPDATE dbo.sales 22 | SET Description=’HP Product v2’ 23 | Where 
Description =’HP Product’ 24 | -- DELETE 25 | DELETE FROM dbo.sales 26 | Where ID=1 27 | -------------------------------------------------------------------------------- /Semana 3/Insumo_para_crear_procedimiento_microdesafio.sql: -------------------------------------------------------------------------------- 1 | SELECT 2 | TituloId =t.titulo_id, 3 | TituloNombre =CAST(t.titulo as nVarChar(100)), 4 | TituloTipo =CASE CAST(t.tipo as nVarchar(100)) 5 | WHEN 'bbdd' THEN 'Base de datos, Transact-SQL' 6 | WHEN 'BI' THEN 'Base de datos, BI' 7 | WHEN 'administracion' THEN 'Base de datos, Administración' 8 | WHEN 'dev' THEN 'Desarrollo' 9 | WHEN 'programacion' THEN 'Desarrollo' 10 | END, 11 | NombreCompleto =a.NombreAutor + ' ' +a.ApellidosAutor, 12 | a.TelefonoAutor 13 | FROM BDE.articulos.titulos as t 14 | JOIN BDE.articulos.autores as a ON t.titulo_id =a.TituloId 15 | -------------------------------------------------------------------------------- /Semana 3/Microdesafio.sql: -------------------------------------------------------------------------------- 1 | -- 1.crear base de datos 2 | CREATE DATABASE BDE; 3 | GO 4 | -- 5 | USE BDE 6 | GO 7 | -- 2. Crear un esquema 8 | CREATE SCHEMA articulos AUTHORIZATION dbo; 9 | GO 10 | 11 | -- 3. crear tabla titulso 12 | CREATE TABLE articulos.titulos 13 | (titulo_id char(6) NOT NULL, 14 | titulo varchar(80) NOT NULL, 15 | tipo char(20) NOT NULL); 16 | GO 17 | 18 | -- Insertar valores manualmente(OJO esto se puede hacer con COPY o con un asistente) 19 | INSERT INTO articulos.titulos VALUES ('1', 'Consultas SQL','bbdd'); 20 | INSERT INTO articulos.titulos VALUES ('3', 'Grupo recursos Azure','administracion'); 21 | INSERT INTO articulos.titulos VALUES ('4', '.NET Framework 4.5','programacion'); 22 | INSERT INTO articulos.titulos VALUES ('5', 'Programacion C#','dev'); 23 | INSERT INTO articulos.titulos VALUES ('7', 'Power BI','BI'); 24 | INSERT INTO articulos.titulos VALUES ('8', 'Administracion Sql server','administracion'); 25 | 26 | 27 | -- 4. crear tabla autores 28 | CREATE TABLE articulos.autores 29 | (TituloId char(6) NOT NULL, 30 | NombreAutor nVarchar(100) NOT NULL, 31 | ApellidosAutor nVarchar(100) NOT NULL, 32 | TelefonoAutor nVarChar(25) 33 | ); 34 | 35 | -- Insertar en la tabla autores en essquema articulos 36 | INSERT INTO articulos.autores VALUES ('3', 'David', 'Saenz', '99897867'); 37 | INSERT INTO articulos.autores VALUES ('8', 'Ana', 'Ruiz', '99897466'); 38 | INSERT INTO articulos.autores VALUES ('2', 'Julian', 'Perez', '99897174'); 39 | INSERT INTO articulos.autores VALUES ('1', 'Andres', 'Calamaro', '99876869'); 40 | INSERT INTO articulos.autores VALUES ('4', 'Cidys', 'Castillo', '998987453'); 41 | INSERT INTO articulos.autores VALUES ('5', 'Pedro', 'Molina', '99891768'); 42 | 43 | --5. crear database BDE_DW (DataWarehouse) 44 | CREATE DATABASE BDE_DW; 45 | GO 46 | 47 | --6. crear la tabla DimTitulo para informes 48 | USE BDE_DW 49 | GO 50 | CREATE TABLE dbo.DimTitulos 51 | (TituloId char(6) NOT NULL, 52 | TituloNombre nVarChar(100) NOT NULL, 53 | TituloTipo nVarChar(100) NOT NULL, 54 | NombreCompleto nVarChar(200), 55 | TelefonoAutor nVarchar(25)); 56 | GO 57 | 58 | --7. 
Crear un procedimiento almacenado para el ETL 59 | USE BDE 60 | GO 61 | --- es la tecnica mas sencilla (aclarado y rellenado) pero no es la unica tecnica (e.g actualizacion) 62 | CREATE PROCEDURE pETL_Insertar_DimTitulo 63 | AS 64 | DELETE FROM BDE_DW.dbo.DimTitulos; 65 | INSERT INTO BDE_DW.dbo.DimTitulos 66 | SELECT 67 | TituloId =t.titulo_id, 68 | TituloNombre =CAST(t.titulo as nVarChar(100)), 69 | TituloTipo =CASE CAST(t.tipo as nVarchar(100)) 70 | WHEN 'bbdd' THEN 'Base de datos, Transact-SQL' 71 | WHEN 'BI' THEN 'Base de datos, BI' 72 | WHEN 'administracion' THEN 'Base de datos, Administraci�n' 73 | WHEN 'dev' THEN 'Desarrollo' 74 | WHEN 'programacion' THEN 'Desarrollo' 75 | END, 76 | NombreCompleto =a.NombreAutor + ' ' +a.ApellidosAutor, 77 | a.TelefonoAutor 78 | FROM BDE.articulos.titulos as t 79 | JOIN BDE.articulos.autores as a ON t.titulo_id =a.TituloId 80 | GO 81 | 82 | -- ir a Programatically>> Stores Procedures y verificar que se creo el procedimeinto 83 | 84 | --8. Executar procedimeinto 85 | EXECUTE pETL_Insertar_DimTitulo; 86 | GO 87 | 88 | -- 9. Verificar que se tiene el resultado 89 | USE BDE_DW 90 | 91 | SELECT * FROM dbo.DimTitulos 92 | GO 93 | -------------------------------------------------------------------------------- /Semana 4/1Forma_normal_solucion.sql: -------------------------------------------------------------------------------- 1 | -- Comando original para crear la tabla 2 | CREATE TABLE customers ( 3 | name VARCHAR(255), 4 | industry VARCHAR(255), 5 | project1_id INT(6), 6 | project1_feedback TEXT, 7 | project2_id INT(6), 8 | project2_feedback TEXT, 9 | contact_person_id INT(6), 10 | contact_person_and_role VARCHAR(300), 11 | phone_number VARCHAR(12), 12 | address VARCHAR(255), 13 | city VARCHAR(255), 14 | zip VARCHAR(5) 15 | ); 16 | -- SOLUCION 1NF 17 | -- Agregar llave primaria 18 | ALTER TABLE customers 19 | ADD COLUMN id INT(6) AUTO_INCREMENT PRIMARY KEY FIRST; 20 | -- Separar la columna contact_person_and_role 21 | ALTER TABLE customers 22 | CHANGE COLUMN contact_person_and_role contact_person VARCHAR(300); 23 | 24 | ALTER TABLE customers 25 | ADD COLUMN contact_person_role VARCHAR(300) AFTER contact_person; 26 | 27 | -- Mover las columnas project_ids y project_feedbacks a una nueva tabla project_feddbacks 28 | ALTER TABLE customers 29 | DROP COLUMN project1_id, 30 | DROP COLUMN project1_feedback, 31 | DROP COLUMN project2_id, 32 | DROP COLUMN project2_feedback; 33 | 34 | CREATE TABLE project_feedbacks ( 35 | id INT(6) AUTO_INCREMENT PRIMARY KEY, 36 | project_id INT(6), 37 | customer_id INT(6), 38 | project_feedback TEXT 39 | ); 40 | 41 | 42 | 43 | -------------------------------------------------------------------------------- /Semana 4/2Forma_normal_solucion.sql: -------------------------------------------------------------------------------- 1 | -- Mover esas columnas a una tabla que contenga la información de contacto de las personas 2 | ALTER TABLE customers 3 | DROP COLUMN contact_person, 4 | DROP COLUMN contact_person_role, 5 | DROP COLUMN phone_number; 6 | 7 | -- crear la tabla contact_persons con un id respectivo 8 | CREATE TABLE contact_persons ( 9 | id INT(6) PRIMARY KEY, 10 | name VARCHAR(300), 11 | role VARCHAR(300), 12 | phone_number VARCHAR(15) 13 | ); 14 | -------------------------------------------------------------------------------- /Semana 4/3Forma_normal_solucion.sql: -------------------------------------------------------------------------------- 1 | -- Eliminar la columna ciudad de customers y crear una nueva tabla zips para 
almacenar esto 2 | ALTER TABLE customers 3 | DROP COLUMN city; 4 | 5 | CREATE TABLE zips ( 6 | zip VARCHAR(5) PRIMARY KEY, 7 | city VARCHAR(255) 8 | ); 9 | -------------------------------------------------------------------------------- /Semana 4/Condiciones_Microdesafio_ETL_desastres.sql: -------------------------------------------------------------------------------- 1 | -- 1.crear base de datos 2 | CREATE DATABASE DESASTRES; 3 | GO 4 | -- 5 | USE DESASTRES 6 | GO 7 | 8 | 9 | -- 2. crear tabla clima futuro global 10 | CREATE TABLE clima 11 | (año INT NOT NULL PRIMARY KEY, 12 | Temperatura FLOAT NOT NULL, 13 | Oxigeno FLOAT NOT NULL); 14 | GO 15 | 16 | -- Insertar valores manualmente 17 | INSERT INTO clima VALUES (2023, 22.5,230); 18 | INSERT INTO clima VALUES (2024, 22.7,228.6); 19 | INSERT INTO clima VALUES (2025, 22.9,227.5); 20 | INSERT INTO clima VALUES (2026, 23.1,226.7); 21 | INSERT INTO clima VALUES (2027, 23.2,226.4); 22 | INSERT INTO clima VALUES (2028, 23.4,226.2); 23 | INSERT INTO clima VALUES (2029, 23.6,226.1); 24 | INSERT INTO clima VALUES (2030, 23.8,225.1); 25 | 26 | -- 3. crear tabla desastres proyectados globales 27 | CREATE TABLE desastres 28 | (año INT NOT NULL PRIMARY KEY, 29 | Tsunamis INT NOT NULL, 30 | Olas_Calor INT NOT NULL, 31 | Terremotos INT NOT NULL, 32 | Erupciones INT NOT NULL, 33 | Incendios INT NOT NULL); 34 | GO 35 | -- Insertar valores manualmente 36 | INSERT INTO desastres VALUES (2023, 2,15, 6,7,50); 37 | INSERT INTO desastres VALUES (2024, 1,12, 8,9,46); 38 | INSERT INTO desastres VALUES (2025, 3,16, 5,6,47); 39 | INSERT INTO desastres VALUES (2026, 4,12, 10,13,52); 40 | INSERT INTO desastres VALUES (2027, 5,12, 6,5,41); 41 | INSERT INTO desastres VALUES (2028, 4,18, 3,2,39); 42 | INSERT INTO desastres VALUES (2029, 2,19, 5,6,49); 43 | INSERT INTO desastres VALUES (2030, 4,20, 6,7,50); 44 | 45 | -- 4. crear tabla muertes proyectadas por rangos de edad 46 | CREATE TABLE muertes 47 | (año INT NOT NULL PRIMARY KEY, 48 | R_Menor15 INT NOT NULL, 49 | R_15_a_30 INT NOT NULL, 50 | R_30_a_45 INT NOT NULL, 51 | R_45_a_60 INT NOT NULL, 52 | R_M_a_60 INT NOT NULL); 53 | GO 54 | -- Insertar valores manualmente 55 | INSERT INTO muertes VALUES (2023, 1000,1300, 1200,1150,1500); 56 | INSERT INTO muertes VALUES (2024, 1200,1250, 1260,1678,1940); 57 | INSERT INTO muertes VALUES (2025, 987,1130, 1160,1245,1200); 58 | INSERT INTO muertes VALUES (2026, 1560,1578, 1856,1988,1245); 59 | INSERT INTO muertes VALUES (2027, 1002,943, 1345,1232,986); 60 | INSERT INTO muertes VALUES (2028, 957,987, 1856,1567,1756); 61 | INSERT INTO muertes VALUES (2029, 1285,1376, 1465,1432,1236); 62 | INSERT INTO muertes VALUES (2030, 1145,1456, 1345,1654,1877); 63 | 64 | -- 5. 
Crear base de datos para alojar resumenes de estadisticas 65 | CREATE DATABASE DESASTRES_BDE; 66 | GO 67 | 68 | USE DESASTRES_BDE 69 | GO 70 | 71 | CREATE TABLE DESASTRES_FINAL 72 | (Cuatrenio varchar(20) NOT NULL PRIMARY KEY, 73 | Temp_AVG FLOAT NOT NULL, Oxi_AVG FLOAT NOT NULL, 74 | T_Tsunamis INT NOT NULL, T_OlasCalor INT NOT NULL, 75 | T_Terremotos INT NOT NULL, T_Erupciones INT NOT NULL, 76 | T_Incendios INT NOT NULL,M_Jovenes_AVG FLOAT NOT NULL, 77 | M_Adutos_AVG FLOAT NOT NULL,M_Ancianos_AVG FLOAT NOT NULL); 78 | GO 79 | -------------------------------------------------------------------------------- /Semana 4/ETL_Desastres_Microdesafio.sql: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 4/ETL_Desastres_Microdesafio.sql -------------------------------------------------------------------------------- /Semana 4/data_base.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 4/data_base.png -------------------------------------------------------------------------------- /Semana 4/data_base_1NF.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 4/data_base_1NF.png -------------------------------------------------------------------------------- /Semana 4/data_base_2NF.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 4/data_base_2NF.png -------------------------------------------------------------------------------- /Semana 4/data_base_3NF.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 4/data_base_3NF.png -------------------------------------------------------------------------------- /Semana 4/main: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Semana 5/Consulta_Actividad_Semana5.sql: -------------------------------------------------------------------------------- 1 | SELECT a.name AS AgentName, cu.name AS CustomerName, x.duration 2 | FROM 3 | ( 4 | SELECT ca.agentid, ca.duration, max(customerid) AS cid 5 | FROM 6 | ( 7 | SELECT agentid, min(duration) as fastestcall 8 | FROM calls 9 | WHERE productsold = 1 10 | GROUP BY agentid 11 | ) min 12 | JOIN calls ca ON ca.agentid = min.agentid AND ca.duration = min.fastestcall 13 | WHERE productsold = 1 14 | GROUP BY ca.agentid, ca.duration 15 | ) x 16 | JOIN agents a ON x.agentid = a.agentid 17 | JOIN customers cu ON cu.customerid = x.cid 18 | -------------------------------------------------------------------------------- /Semana 5/Consulta_Actividad_Semana5_Lenta.sql: -------------------------------------------------------------------------------- 1 | SELECT * FROM 2 | (SELECT ca.agentid, ca.duration, max(customerid) AS customerid 3 | FROM 4 | ( 5 | SELECT agentid, min(duration) as fastestcall 6 | FROM calls 7 | WHERE productsold = 1 8 | GROUP BY agentid 9 | ) min 10 | JOIN calls ca 
ON ca.agentid = min.agentid AND ca.duration = min.fastestcall 11 | JOIN agents a ON ca.agentid = a.agentid 12 | WHERE productsold = 1 13 | GROUP BY ca.agentid, ca.duration) as x 14 | LEFT JOIN customers cu on x.customerid= cu.customerid 15 | -------------------------------------------------------------------------------- /Semana 5/Query_Lenta_Larga.sql: -------------------------------------------------------------------------------- 1 | SET STATISTICS TIME ON 2 | GO 3 | SELECT TOP 25 4 | Product.ProductID, 5 | Product.Name AS ProductName, 6 | Product.ProductNumber, 7 | CostMeasure.UnitMeasureCode, 8 | CostMeasure.Name AS CostMeasureName, 9 | ProductVendor.AverageLeadTime, 10 | ProductVendor.StandardPrice, 11 | ProductReview.ReviewerName, 12 | ProductReview.Rating, 13 | ProductCategory.Name AS CategoryName, 14 | ProductSubCategory.Name AS SubCategoryName 15 | FROM Production.Product 16 | INNER JOIN Production.ProductSubCategory 17 | ON ProductSubCategory.ProductSubcategoryID = Product.ProductSubcategoryID 18 | INNER JOIN Production.ProductCategory 19 | ON ProductCategory.ProductCategoryID = ProductSubCategory.ProductCategoryID 20 | INNER JOIN Production.UnitMeasure SizeUnitMeasureCode 21 | ON Product.SizeUnitMeasureCode = SizeUnitMeasureCode.UnitMeasureCode 22 | INNER JOIN Production.UnitMeasure WeightUnitMeasureCode 23 | ON Product.WeightUnitMeasureCode = WeightUnitMeasureCode.UnitMeasureCode 24 | INNER JOIN Production.ProductModel 25 | ON ProductModel.ProductModelID = Product.ProductModelID 26 | LEFT JOIN Production.ProductModelIllustration 27 | ON ProductModel.ProductModelID = ProductModelIllustration.ProductModelID 28 | LEFT JOIN Production.ProductModelProductDescriptionCulture 29 | ON ProductModelProductDescriptionCulture.ProductModelID = ProductModel.ProductModelID 30 | LEFT JOIN Production.ProductDescription 31 | ON ProductDescription.ProductDescriptionID = ProductModelProductDescriptionCulture.ProductDescriptionID 32 | LEFT JOIN Production.ProductReview 33 | ON ProductReview.ProductID = Product.ProductID 34 | LEFT JOIN Purchasing.ProductVendor 35 | ON ProductVendor.ProductID = Product.ProductID 36 | LEFT JOIN Production.UnitMeasure CostMeasure 37 | ON ProductVendor.UnitMeasureCode = CostMeasure.UnitMeasureCode 38 | ORDER BY Product.ProductID DESC; 39 | SET STATISTICS TIME OFF; 40 | GO -------------------------------------------------------------------------------- /Semana 5/Query_Optimizada.sql: -------------------------------------------------------------------------------- 1 | SET STATISTICS TIME ON 2 | GO 3 | SELECT TOP 25 4 | Product.ProductID, 5 | Product.Name AS ProductName, 6 | Product.ProductNumber, 7 | ProductCategory.Name AS ProductCategory, 8 | ProductSubCategory.Name AS ProductSubCategory, 9 | Product.ProductModelID 10 | INTO #Product 11 | FROM Production.Product 12 | INNER JOIN Production.ProductSubCategory 13 | ON ProductSubCategory.ProductSubcategoryID = Product.ProductSubcategoryID 14 | INNER JOIN Production.ProductCategory 15 | ON ProductCategory.ProductCategoryID = ProductSubCategory.ProductCategoryID 16 | ORDER BY Product.ModifiedDate DESC; 17 | 18 | SELECT 19 | Product.ProductID, 20 | Product.ProductName, 21 | Product.ProductNumber, 22 | CostMeasure.UnitMeasureCode, 23 | CostMeasure.Name AS CostMeasureName, 24 | ProductVendor.AverageLeadTime, 25 | ProductVendor.StandardPrice, 26 | ProductReview.ReviewerName, 27 | ProductReview.Rating, 28 | Product.ProductCategory, 29 | Product.ProductSubCategory 30 | FROM #Product Product 31 | INNER JOIN Production.ProductModel 
32 | ON ProductModel.ProductModelID = Product.ProductModelID 33 | LEFT JOIN Production.ProductReview 34 | ON ProductReview.ProductID = Product.ProductID 35 | LEFT JOIN Purchasing.ProductVendor 36 | ON ProductVendor.ProductID = Product.ProductID 37 | LEFT JOIN Production.UnitMeasure CostMeasure 38 | ON ProductVendor.UnitMeasureCode = CostMeasure.UnitMeasureCode; 39 | 40 | DROP TABLE #Product; 41 | SET STATISTICS TIME OFF; 42 | GO -------------------------------------------------------------------------------- /Semana 5/agents.csv: -------------------------------------------------------------------------------- 1 | agentid,name 2 | 0,Michele Williams 3 | 1,Jocelyn Parker 4 | 2,Christopher Moreno 5 | 3,Todd Morrow 6 | 4,Randy Moore 7 | 5,Paul Nunez 8 | 6,Gloria Singh 9 | 7,Angel Briggs 10 | 8,Lisa Cordova 11 | 9,Dana Hardy 12 | 10,Agent X 13 | -------------------------------------------------------------------------------- /Semana 6/Lectura_APIs.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | # Traer archivo 3 | !wget -O cars_clus.csv https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/cars_clus.csv 4 | # Ponerlo en el entorno de trabajo 5 | filename = 'cars_clus.csv' 6 | #Lectura del archivo 7 | pdf = pd.read_csv(filename) 8 | # Traer propiedades como shape 9 | print("Shape: ", pdf.shape) 10 | # Mostrar primeras 5 filas 11 | print(pdf.head(5)) 12 | -------------------------------------------------------------------------------- /Semana 6/Lectura_csv.py: -------------------------------------------------------------------------------- 1 | # De ser necesario se puede usar colab 2 | from google.colab import drive 3 | import os 4 | drive.mount('/content/gdrive') 5 | ################################## 6 | 7 | %cd '/content/gdrive/MyDrive' 8 | 9 | import pandas as pd 10 | ### Lectura de archivo 11 | df= pd.read_csv('winequality-red.csv',sep=',') 12 | # Mostrar primeras 5 filas 13 | df.head() 14 | # Elección de columnas a mostrar 15 | print(df[['density','pH','sulphates','alcohol','quality']].head()) 16 | -------------------------------------------------------------------------------- /Semana 6/Lectura_github.py: -------------------------------------------------------------------------------- 1 | # Importar librería 2 | import pandas as pd 3 | # Definir url importante que sea formato raw 4 | url = 'https://raw.githubusercontent.com/JJTorresDS/stocks-ds-edu/main/stocks.csv' 5 | # Lectura de archivo 6 | df = pd.read_csv(url, index_col=0) 7 | # Subset de columnas de interés 8 | print(df[['AMZN','MCD','SBUX','GOOG','MSFT']].head(5).round(1)) 9 | -------------------------------------------------------------------------------- /Semana 6/Lectura_txt.py: -------------------------------------------------------------------------------- 1 | # De ser necesario usar colab 2 | from google.colab import drive 3 | import os 4 | drive.mount('/content/gdrive') 5 | # Cambiar ruta 6 | %cd '/content/gdrive/MyDrive' 7 | import pandas as pd 8 | # Lectura de archivo 9 | df= pd.read_csv('pokemon_data.txt',delimiter='\t') 10 | # Mostrar ultimas 5 filas 11 | df.tail() 12 | # Mostrar ciertas columnas 13 | print(df[['Name','Type 1','HP','Attack','Defense']].head()) 14 | -------------------------------------------------------------------------------- /Semana 6/Lectura_xlsx.py: -------------------------------------------------------------------------------- 1 | # De ser necesario usar colab 2 | from google.colab import 
drive 3 | import os 4 | drive.mount('/content/gdrive') 5 | # Cambiar ruta de acceso 6 | %cd '/content/gdrive/MyDrive' 7 | import pandas as pd 8 | # Lectura de archivo 9 | df= pd.read_excel('defaultoutput.xlsx') 10 | # Mostrar las priemra 5 filas 11 | df.head() 12 | # Elegir columnas de interés 13 | print(df[['index','ID','Year_Birth','Education','Income']].head()) 14 | -------------------------------------------------------------------------------- /Semana 6/defaultoutput.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 6/defaultoutput.xlsx -------------------------------------------------------------------------------- /Semana 6/nba_salary.sqlite: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 6/nba_salary.sqlite -------------------------------------------------------------------------------- /Semana 6/nested_json.json: -------------------------------------------------------------------------------- 1 | { 2 | "school_name": "ABC primary school", 3 | "class": "Year 1", 4 | "students": [ 5 | { 6 | "id": "A001", 7 | "name": "Tom", 8 | "math": 60, 9 | "physics": 66, 10 | "chemistry": 61 11 | }, 12 | { 13 | "id": "A002", 14 | "name": "James", 15 | "math": 89, 16 | "physics": 76, 17 | "chemistry": 51 18 | }, 19 | { 20 | "id": "A003", 21 | "name": "Jenny", 22 | "math": 79, 23 | "physics": 90, 24 | "chemistry": 78 25 | }] 26 | } 27 | -------------------------------------------------------------------------------- /Semana 7/Semana7_Psyco2pg_DE_Ejemplo_en_vivo2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "id": "215acb78", 7 | "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "!pip install psycopg2" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": 1, 16 | "id": "6fead696", 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "import pandas as pd\n", 21 | "pd.options.display.max_rows = 10\n", 22 | "hostname= 'localhost'\n", 23 | "database= 'Semana6_DE'\n", 24 | "username= 'postgres'\n", 25 | "pwd='david9.25.38'\n", 26 | "port_id= '5432'\n", 27 | "import psycopg2" 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": 2, 33 | "id": "90095acc", 34 | "metadata": {}, 35 | "outputs": [], 36 | "source": [ 37 | "# Creamos la conexion (local)\n", 38 | "conn = psycopg2.connect(host=hostname, dbname=database, user=username, password=pwd, port=5432)\n", 39 | "# Conexion a redshift\n", 40 | "#conn = psycopg2.connect(host=hostname, dbname=database, user=username, password=pwd, port=5439)" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 3, 46 | "id": "89618006", 47 | "metadata": {}, 48 | "outputs": [], 49 | "source": [ 50 | "def execute_read_query(connection, query):\n", 51 | " cursor = connection.cursor()\n", 52 | " result = None\n", 53 | " try:\n", 54 | " cursor.execute(query)\n", 55 | " result = cursor.fetchall()\n", 56 | " return result\n", 57 | " except Error as e:\n", 58 | " print(f\"Error '{e}' ha ocurrido\")" 59 | ] 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "id": "416f729f", 64 | "metadata": {}, 65 | "source": [ 66 | "**Pregunta 1**\n", 67 | "\n", 68 | "Encuentre los cinco vendedores con mejor 
desempeño usando la columna `salesytd` (Sales, year-to-date). (Solo necesitamos conocer el `businessentityid` de cada vendedor, ya que esto identifica de forma única a cada uno). ¿Por qué podría ser escéptico con estos números en este momento?" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": 4, 74 | "id": "c10cf07f", 75 | "metadata": {}, 76 | "outputs": [ 77 | { 78 | "data": { 79 | "text/plain": [ 80 | "[(276, 4251368.5497),\n", 81 | " (289, 4116871.2277),\n", 82 | " (275, 3763178.1787),\n", 83 | " (277, 3189418.3662),\n", 84 | " (290, 3121616.3202)]" 85 | ] 86 | }, 87 | "execution_count": 4, 88 | "metadata": {}, 89 | "output_type": "execute_result" 90 | } 91 | ], 92 | "source": [ 93 | "query5=\"\"\"SELECT BusinessEntityID, SalesYTD FROM SalesPerson ORDER BY SalesYTD DESC LIMIT 5;\"\"\"\n", 94 | "execute_read_query(conn, query5)" 95 | ] 96 | }, 97 | { 98 | "cell_type": "markdown", 99 | "id": "3d95ce2c", 100 | "metadata": {}, 101 | "source": [ 102 | "Una version mejorada de la funcion sería" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": 13, 108 | "id": "c8166dee", 109 | "metadata": {}, 110 | "outputs": [ 111 | { 112 | "name": "stdout", 113 | "output_type": "stream", 114 | "text": [ 115 | "['businessentityid', 'salesytd']\n" 116 | ] 117 | }, 118 | { 119 | "data": { 120 | "text/html": [ 121 | "
\n", 122 | "\n", 135 | "\n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | " \n", 164 | " \n", 165 | " \n", 166 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | "
businessentityidsalesytd
02764.251369e+06
12894.116871e+06
22753.763178e+06
32773.189418e+06
42903.121616e+06
\n", 171 | "
" 172 | ], 173 | "text/plain": [ 174 | " businessentityid salesytd\n", 175 | "0 276 4.251369e+06\n", 176 | "1 289 4.116871e+06\n", 177 | "2 275 3.763178e+06\n", 178 | "3 277 3.189418e+06\n", 179 | "4 290 3.121616e+06" 180 | ] 181 | }, 182 | "execution_count": 13, 183 | "metadata": {}, 184 | "output_type": "execute_result" 185 | } 186 | ], 187 | "source": [ 188 | "cursor = conn.cursor()\n", 189 | "cursor.execute(query5)\n", 190 | "columnas = [description[0] for description in cursor.description]\n", 191 | "cursor.fetchall()\n", 192 | "print(columnas)\n", 193 | "execute_read_query(conn, query5)\n", 194 | "pd.DataFrame(execute_read_query(conn, query5),columns=columnas)" 195 | ] 196 | }, 197 | { 198 | "cell_type": "markdown", 199 | "id": "1ff99896", 200 | "metadata": {}, 201 | "source": [ 202 | "Los números están codificados en esta tabla, en lugar de calcularse dinámicamente a partir de cada registro de ventas. Actualmente, no sabemos cómo se actualiza este número o mucho al respecto, por lo que es bueno permanecer escéptico." 203 | ] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "id": "cdb0645c", 208 | "metadata": {}, 209 | "source": [ 210 | "**Pregunta 2**\n", 211 | "\n", 212 | "Usando ```salesorderheader```, busque los 5 mejores vendedores que hicieron la mayor cantidad de ventas **en el año más reciente** (2014). (Hay una columna llamada `subtotal`; úsela). Las ventas que no tienen un vendedor asociado deben excluirse de sus cálculos y producción final. Se deben incluir todos los pedidos que se realizaron dentro del año calendario 2014.\n", 213 | "\n", 214 | "**Pista:** Puedes usar la sintaxis `'1970-01-01'` para generar un punto de comparacion en el tiempo" 215 | ] 216 | }, 217 | { 218 | "cell_type": "code", 219 | "execution_count": 14, 220 | "id": "1caca742", 221 | "metadata": {}, 222 | "outputs": [ 223 | { 224 | "data": { 225 | "text/plain": [ 226 | "[(289.0, 1382997.0),\n", 227 | " (276.0, 1271089.0),\n", 228 | " (275.0, 1057247.0),\n", 229 | " (282.0, 1044811.0),\n", 230 | " (277.0, 1040093.0)]" 231 | ] 232 | }, 233 | "execution_count": 14, 234 | "metadata": {}, 235 | "output_type": "execute_result" 236 | } 237 | ], 238 | "source": [ 239 | "query6=\"\"\"\n", 240 | "SELECT salespersonid, round(SUM(subtotal)) AS totalsales\n", 241 | "FROM salesorderheader soh\n", 242 | "WHERE soh.orderdate >= '2014-01-01'\n", 243 | "AND soh.SalesPersonID is not NULL\n", 244 | "GROUP BY SalesPersonID\n", 245 | "ORDER BY TotalSales DESC\n", 246 | "LIMIT 5;\n", 247 | "\"\"\"\n", 248 | "execute_read_query(conn, query6)" 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": 15, 254 | "id": "975c9d98", 255 | "metadata": {}, 256 | "outputs": [ 257 | { 258 | "name": "stdout", 259 | "output_type": "stream", 260 | "text": [ 261 | "['salespersonid', 'totalsales']\n" 262 | ] 263 | }, 264 | { 265 | "data": { 266 | "text/html": [ 267 | "
\n", 268 | "\n", 281 | "\n", 282 | " \n", 283 | " \n", 284 | " \n", 285 | " \n", 286 | " \n", 287 | " \n", 288 | " \n", 289 | " \n", 290 | " \n", 291 | " \n", 292 | " \n", 293 | " \n", 294 | " \n", 295 | " \n", 296 | " \n", 297 | " \n", 298 | " \n", 299 | " \n", 300 | " \n", 301 | " \n", 302 | " \n", 303 | " \n", 304 | " \n", 305 | " \n", 306 | " \n", 307 | " \n", 308 | " \n", 309 | " \n", 310 | " \n", 311 | " \n", 312 | " \n", 313 | " \n", 314 | " \n", 315 | " \n", 316 | "
salespersonidtotalsales
0289.01382997.0
1276.01271089.0
2275.01057247.0
3282.01044811.0
4277.01040093.0
\n", 317 | "
" 318 | ], 319 | "text/plain": [ 320 | " salespersonid totalsales\n", 321 | "0 289.0 1382997.0\n", 322 | "1 276.0 1271089.0\n", 323 | "2 275.0 1057247.0\n", 324 | "3 282.0 1044811.0\n", 325 | "4 277.0 1040093.0" 326 | ] 327 | }, 328 | "execution_count": 15, 329 | "metadata": {}, 330 | "output_type": "execute_result" 331 | } 332 | ], 333 | "source": [ 334 | "cursor = conn.cursor()\n", 335 | "cursor.execute(query6)\n", 336 | "columnas = [description[0] for description in cursor.description]\n", 337 | "cursor.fetchall()\n", 338 | "print(columnas)\n", 339 | "pd.DataFrame(execute_read_query(conn, query6),columns=columnas)" 340 | ] 341 | } 342 | ], 343 | "metadata": { 344 | "kernelspec": { 345 | "display_name": "Python 3 (ipykernel)", 346 | "language": "python", 347 | "name": "python3" 348 | }, 349 | "language_info": { 350 | "codemirror_mode": { 351 | "name": "ipython", 352 | "version": 3 353 | }, 354 | "file_extension": ".py", 355 | "mimetype": "text/x-python", 356 | "name": "python", 357 | "nbconvert_exporter": "python", 358 | "pygments_lexer": "ipython3", 359 | "version": "3.9.12" 360 | } 361 | }, 362 | "nbformat": 4, 363 | "nbformat_minor": 5 364 | } 365 | -------------------------------------------------------------------------------- /Semana 7/Semana7_SqlAlchemy_DE_Actividad_Colaborativa.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 2, 6 | "id": "234f3df3", 7 | "metadata": {}, 8 | "outputs": [ 9 | { 10 | "name": "stdout", 11 | "output_type": "stream", 12 | "text": [ 13 | "Collecting psycopg2\n", 14 | " Downloading psycopg2-2.9.3-cp39-cp39-win_amd64.whl (1.2 MB)\n", 15 | "Installing collected packages: psycopg2\n", 16 | "Successfully installed psycopg2-2.9.3\n" 17 | ] 18 | } 19 | ], 20 | "source": [ 21 | "!pip install psycopg2" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 1, 27 | "id": "d4a1f62e", 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "import pandas as pd\n", 32 | "from sqlalchemy import create_engine, text\n", 33 | "import psycopg2\n", 34 | "pd.options.display.max_rows = 10\n", 35 | "hostname= 'localhost'\n", 36 | "database= 'Semana6_DE'\n", 37 | "username= 'postgres'\n", 38 | "pwd='david9.25.38'\n", 39 | "port_id= '5432'\n", 40 | "import psycopg2" 41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "id": "e793858b", 46 | "metadata": {}, 47 | "source": [ 48 | "# SQLalchemy" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 2, 54 | "id": "6b050ef0", 55 | "metadata": {}, 56 | "outputs": [], 57 | "source": [ 58 | "engine= create_engine(\"postgresql://postgres:david9.25.38@localhost:5432/Semana6_DE\")\n", 59 | "#engine= create_engine(string)\n", 60 | "# Caso redshift\n", 61 | "#engine= conn = create_engine('postgresql://username:password@yoururl.com:5439/yourdatabase')\n", 62 | "ruta_archivos='C:/Users/Windows/Downloads/Caso_practico_SQL_repsuestas/Caso_practico_SQL_respuestas/data/csvs/'\n", 63 | "df = pd.read_csv(ruta_archivos+'product.csv').to_sql('product', engine, if_exists='replace', index=False)\n", 64 | "df = pd.read_csv(ruta_archivos+'productreview.csv').to_sql('productreview', engine, if_exists='replace', index=False)\n", 65 | "df = pd.read_csv(ruta_archivos+'productmodelproductdescriptionculture.csv').to_sql('productmodelproductdescriptionculture', engine, if_exists='replace', index=False)\n", 66 | "df = pd.read_csv(ruta_archivos+'productdescription.csv').to_sql('productdescription', 
engine, if_exists='replace', index=False)\n", 67 | "df = pd.read_csv(ruta_archivos+'salesorderdetail.csv').to_sql('salesorderdetail', engine, if_exists='replace', index=False)\n", 68 | "df = pd.read_csv(ruta_archivos+'productcategory.csv').to_sql('productcategory', engine, if_exists='replace', index=False)\n", 69 | "df = pd.read_csv(ruta_archivos+'productsubcategory.csv').to_sql('productsubcategory', engine, if_exists='replace', index=False)\n", 70 | "df = pd.read_csv(ruta_archivos+'salesperson.csv').to_sql('salesperson', engine, if_exists='replace', index=False)\n", 71 | "df = pd.read_csv(ruta_archivos+'salesorderheader.csv').to_sql('salesorderheader', engine, if_exists='replace', index=False)\n", 72 | "df = pd.read_csv(ruta_archivos+'salesterritory.csv').to_sql('salesterritory', engine, if_exists='replace', index=False)\n", 73 | "df = pd.read_csv(ruta_archivos+'countryregioncurrency.csv').to_sql('countryregioncurrency', engine, if_exists='replace', index=False)\n", 74 | "df = pd.read_csv(ruta_archivos+'currencyrate.csv').to_sql('currencyrate', engine, if_exists='replace', index=False)\n", 75 | "# funcion para ejecutar comandos en python\n", 76 | "def runQuery(sql):\n", 77 | " result = engine.connect().execute((text(sql)))\n", 78 | " return pd.DataFrame(result.fetchall(), columns=result.keys())" 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "id": "80d8c978", 84 | "metadata": {}, 85 | "source": [ 86 | "**Pregunta 1**\n", 87 | "\n", 88 | "Encuentre los cinco vendedores con mejor desempeño usando la columna `salesytd` (Sales, year-to-date). (Solo necesitamos conocer el `businessentityid` de cada vendedor, ya que esto identifica de forma única a cada uno). ¿Por qué podría ser escéptico con estos números en este momento?" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 3, 94 | "id": "2735c72d", 95 | "metadata": {}, 96 | "outputs": [ 97 | { 98 | "data": { 99 | "text/html": [ 100 | "
\n", 101 | "\n", 114 | "\n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | "
businessentityidsalesytd
02764.251369e+06
12894.116871e+06
22753.763178e+06
32773.189418e+06
42903.121616e+06
\n", 150 | "
" 151 | ], 152 | "text/plain": [ 153 | " businessentityid salesytd\n", 154 | "0 276 4.251369e+06\n", 155 | "1 289 4.116871e+06\n", 156 | "2 275 3.763178e+06\n", 157 | "3 277 3.189418e+06\n", 158 | "4 290 3.121616e+06" 159 | ] 160 | }, 161 | "execution_count": 3, 162 | "metadata": {}, 163 | "output_type": "execute_result" 164 | } 165 | ], 166 | "source": [ 167 | "query5=\"\"\"SELECT BusinessEntityID, SalesYTD FROM SalesPerson ORDER BY SalesYTD DESC LIMIT 5;\"\"\"\n", 168 | "runQuery(query5)" 169 | ] 170 | }, 171 | { 172 | "cell_type": "markdown", 173 | "id": "b2b19091", 174 | "metadata": {}, 175 | "source": [ 176 | "Los números están codificados en esta tabla, en lugar de calcularse dinámicamente a partir de cada registro de ventas. Actualmente, no sabemos cómo se actualiza este número o mucho al respecto, por lo que es bueno permanecer escéptico." 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "id": "4a6487f5", 182 | "metadata": {}, 183 | "source": [ 184 | "**Pregunta 2**\n", 185 | "\n", 186 | "Usando ```salesorderheader```, busque los 5 mejores vendedores que hicieron la mayor cantidad de ventas **en el año más reciente** (2014). (Hay una columna llamada `subtotal`; úsela). Las ventas que no tienen un vendedor asociado deben excluirse de sus cálculos y producción final. Se deben incluir todos los pedidos que se realizaron dentro del año calendario 2014.\n", 187 | "\n", 188 | "**Pista:** Puedes usar la sintaxis `'1970-01-01'` para generar un punto de comparacion en el tiempo" 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "execution_count": 4, 194 | "id": "a6834536", 195 | "metadata": {}, 196 | "outputs": [ 197 | { 198 | "data": { 199 | "text/html": [ 200 | "
\n", 201 | "\n", 214 | "\n", 215 | " \n", 216 | " \n", 217 | " \n", 218 | " \n", 219 | " \n", 220 | " \n", 221 | " \n", 222 | " \n", 223 | " \n", 224 | " \n", 225 | " \n", 226 | " \n", 227 | " \n", 228 | " \n", 229 | " \n", 230 | " \n", 231 | " \n", 232 | " \n", 233 | " \n", 234 | " \n", 235 | " \n", 236 | " \n", 237 | " \n", 238 | " \n", 239 | " \n", 240 | " \n", 241 | " \n", 242 | " \n", 243 | " \n", 244 | " \n", 245 | " \n", 246 | " \n", 247 | " \n", 248 | " \n", 249 | "
salespersonidtotalsales
0289.01382997.0
1276.01271089.0
2275.01057247.0
3282.01044811.0
4277.01040093.0
\n", 250 | "
" 251 | ], 252 | "text/plain": [ 253 | " salespersonid totalsales\n", 254 | "0 289.0 1382997.0\n", 255 | "1 276.0 1271089.0\n", 256 | "2 275.0 1057247.0\n", 257 | "3 282.0 1044811.0\n", 258 | "4 277.0 1040093.0" 259 | ] 260 | }, 261 | "execution_count": 4, 262 | "metadata": {}, 263 | "output_type": "execute_result" 264 | } 265 | ], 266 | "source": [ 267 | "query6=\"\"\"\n", 268 | "SELECT salespersonid, round(SUM(subtotal)) AS totalsales\n", 269 | "FROM salesorderheader soh\n", 270 | "WHERE soh.orderdate >= '2014-01-01'\n", 271 | "AND soh.SalesPersonID is not NULL\n", 272 | "GROUP BY SalesPersonID\n", 273 | "ORDER BY TotalSales DESC\n", 274 | "LIMIT 5;\n", 275 | "\"\"\"\n", 276 | "runQuery(query6)" 277 | ] 278 | } 279 | ], 280 | "metadata": { 281 | "kernelspec": { 282 | "display_name": "Python 3 (ipykernel)", 283 | "language": "python", 284 | "name": "python3" 285 | }, 286 | "language_info": { 287 | "codemirror_mode": { 288 | "name": "ipython", 289 | "version": 3 290 | }, 291 | "file_extension": ".py", 292 | "mimetype": "text/x-python", 293 | "name": "python", 294 | "nbconvert_exporter": "python", 295 | "pygments_lexer": "ipython3", 296 | "version": "3.9.12" 297 | } 298 | }, 299 | "nbformat": 4, 300 | "nbformat_minor": 5 301 | } 302 | -------------------------------------------------------------------------------- /Semana 7/Semana7_SqlAlchemy_DE_Ejemplo_en_vivo1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 2, 6 | "id": "ed383200", 7 | "metadata": {}, 8 | "outputs": [ 9 | { 10 | "name": "stdout", 11 | "output_type": "stream", 12 | "text": [ 13 | "Collecting psycopg2\n", 14 | " Downloading psycopg2-2.9.3-cp39-cp39-win_amd64.whl (1.2 MB)\n", 15 | "Installing collected packages: psycopg2\n", 16 | "Successfully installed psycopg2-2.9.3\n" 17 | ] 18 | } 19 | ], 20 | "source": [ 21 | "!pip install psycopg2" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 31, 27 | "id": "0c1da9bd", 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "import pandas as pd\n", 32 | "from sqlalchemy import create_engine, text\n", 33 | "import psycopg2\n", 34 | "pd.options.display.max_rows = 10\n", 35 | "hostname= 'localhost'\n", 36 | "database= 'Semana6_DE'\n", 37 | "username= 'postgres'\n", 38 | "pwd='david9.25.38'\n", 39 | "port_id= '5432'\n", 40 | "import psycopg2" 41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "id": "94fb9636", 46 | "metadata": {}, 47 | "source": [ 48 | "# SQLalchemy" 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 32, 54 | "id": "46898885", 55 | "metadata": {}, 56 | "outputs": [], 57 | "source": [ 58 | "engine= create_engine(\"postgresql://postgres:david9.25.38@localhost:5432/Semana6_DE\")\n", 59 | "#engine= create_engine(string)\n", 60 | "# Caso redshift\n", 61 | "#engine= conn = create_engine('postgresql://username:password@yoururl.com:5439/yourdatabase')\n", 62 | "ruta_archivos='C:/Users/Windows/Downloads/Caso_practico_SQL_repsuestas/Caso_practico_SQL_respuestas/data/csvs/'\n", 63 | "df = pd.read_csv(ruta_archivos+'product.csv').to_sql('product', engine, if_exists='replace', index=False)\n", 64 | "df = pd.read_csv(ruta_archivos+'productreview.csv').to_sql('productreview', engine, if_exists='replace', index=False)\n", 65 | "df = pd.read_csv(ruta_archivos+'productmodelproductdescriptionculture.csv').to_sql('productmodelproductdescriptionculture', engine, if_exists='replace', index=False)\n", 66 | "df = 
pd.read_csv(ruta_archivos+'productdescription.csv').to_sql('productdescription', engine, if_exists='replace', index=False)\n", 67 | "df = pd.read_csv(ruta_archivos+'salesorderdetail.csv').to_sql('salesorderdetail', engine, if_exists='replace', index=False)\n", 68 | "df = pd.read_csv(ruta_archivos+'productcategory.csv').to_sql('productcategory', engine, if_exists='replace', index=False)\n", 69 | "df = pd.read_csv(ruta_archivos+'productsubcategory.csv').to_sql('productsubcategory', engine, if_exists='replace', index=False)\n", 70 | "df = pd.read_csv(ruta_archivos+'salesperson.csv').to_sql('salesperson', engine, if_exists='replace', index=False)\n", 71 | "df = pd.read_csv(ruta_archivos+'salesorderheader.csv').to_sql('salesorderheader', engine, if_exists='replace', index=False)\n", 72 | "df = pd.read_csv(ruta_archivos+'salesterritory.csv').to_sql('salesterritory', engine, if_exists='replace', index=False)\n", 73 | "df = pd.read_csv(ruta_archivos+'countryregioncurrency.csv').to_sql('countryregioncurrency', engine, if_exists='replace', index=False)\n", 74 | "df = pd.read_csv(ruta_archivos+'currencyrate.csv').to_sql('currencyrate', engine, if_exists='replace', index=False)\n", 75 | "# funcion para ejecutar comandos en python\n", 76 | "def runQuery(sql):\n", 77 | " result = engine.connect().execute((text(sql)))\n", 78 | " return pd.DataFrame(result.fetchall(), columns=result.keys())" 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "id": "e2255a16", 84 | "metadata": {}, 85 | "source": [ 86 | "**Pregunta 1**\n", 87 | "\n", 88 | "Usando las tablas ```product``` y ```productreview```, JOIN y clasifica los productos de acuerdo con su calificación promedio de revisión. ¿Cuáles son los nombres y las identificaciones de los 5 productos principales" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 34, 94 | "id": "93c64bcf", 95 | "metadata": {}, 96 | "outputs": [ 97 | { 98 | "data": { 99 | "text/html": [ 100 | "
\n", 101 | "\n", 114 | "\n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | "
productidnameavgratingnum_ratings
0798Road-550-W Yellow, 405.001
1709Mountain Bike Socks, M5.001
2937HL Mountain Pedal3.002
\n", 148 | "
" 149 | ], 150 | "text/plain": [ 151 | " productid name avgrating num_ratings\n", 152 | "0 798 Road-550-W Yellow, 40 5.00 1\n", 153 | "1 709 Mountain Bike Socks, M 5.00 1\n", 154 | "2 937 HL Mountain Pedal 3.00 2" 155 | ] 156 | }, 157 | "execution_count": 34, 158 | "metadata": {}, 159 | "output_type": "execute_result" 160 | } 161 | ], 162 | "source": [ 163 | "query1=\"\"\"SELECT product.productid, name, round(avg(rating), 2) as avgrating, count(*) as num_ratings\n", 164 | "FROM product inner join productreview\n", 165 | "ON productreview.productid = product.productid\n", 166 | "GROUP BY product.productid, name\n", 167 | "ORDER BY avgrating DESC\"\"\"\n", 168 | "runQuery(query1)" 169 | ] 170 | }, 171 | { 172 | "cell_type": "markdown", 173 | "id": "843cd004", 174 | "metadata": {}, 175 | "source": [ 176 | "**Pregunta 2**\n", 177 | "\n", 178 | "Para su decepción, ¡solo hay tres productos con calificaciones y solo cuatro reseñas en total! Esto no es lo suficientemente cerca como para realizar un análisis de la correlación entre las revisiones y las ventas totales.\n", 179 | "\n", 180 | "Sin embargo, su gerente quiere la descripción en inglés de estos productos para una próxima venta. ¡Utilice la documentación proporcionada anteriormente si necesita ayuda para navegar por la estructura para extraer esto!" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": 35, 186 | "id": "9f9ff9d5", 187 | "metadata": {}, 188 | "outputs": [ 189 | { 190 | "data": { 191 | "text/html": [ 192 | "
\n", 193 | "\n", 206 | "\n", 207 | " \n", 208 | " \n", 209 | " \n", 210 | " \n", 211 | " \n", 212 | " \n", 213 | " \n", 214 | " \n", 215 | " \n", 216 | " \n", 217 | " \n", 218 | " \n", 219 | " \n", 220 | " \n", 221 | " \n", 222 | " \n", 223 | " \n", 224 | " \n", 225 | " \n", 226 | " \n", 227 | " \n", 228 | " \n", 229 | " \n", 230 | " \n", 231 | "
namedescription
0Road-550-W Yellow, 40Same technology as all of our Road series bike...
1HL Mountain PedalStainless steel; designed to shed mud easily.
2Mountain Bike Socks, MCombination of natural and synthetic fibers st...
\n", 232 | "
" 233 | ], 234 | "text/plain": [ 235 | " name description\n", 236 | "0 Road-550-W Yellow, 40 Same technology as all of our Road series bike...\n", 237 | "1 HL Mountain Pedal Stainless steel; designed to shed mud easily.\n", 238 | "2 Mountain Bike Socks, M Combination of natural and synthetic fibers st..." 239 | ] 240 | }, 241 | "execution_count": 35, 242 | "metadata": {}, 243 | "output_type": "execute_result" 244 | } 245 | ], 246 | "source": [ 247 | "query2=\"\"\"\n", 248 | "SELECT \"name\",\n", 249 | " description\n", 250 | "FROM productdescription pd\n", 251 | "INNER JOIN productmodelproductdescriptionculture pm ON pm.productdescriptionid=pd.productdescriptionid\n", 252 | "INNER JOIN product ON product.productmodelid = pm.productmodelid\n", 253 | "WHERE productid IN (798,709,937)\n", 254 | " AND cultureid = 'en'\n", 255 | "\"\"\"\n", 256 | "runQuery(query2)" 257 | ] 258 | } 259 | ], 260 | "metadata": { 261 | "kernelspec": { 262 | "display_name": "Python 3 (ipykernel)", 263 | "language": "python", 264 | "name": "python3" 265 | }, 266 | "language_info": { 267 | "codemirror_mode": { 268 | "name": "ipython", 269 | "version": 3 270 | }, 271 | "file_extension": ".py", 272 | "mimetype": "text/x-python", 273 | "name": "python", 274 | "nbconvert_exporter": "python", 275 | "pygments_lexer": "ipython3", 276 | "version": "3.9.12" 277 | } 278 | }, 279 | "nbformat": 4, 280 | "nbformat_minor": 5 281 | } 282 | -------------------------------------------------------------------------------- /Semana 7/Tablas/countryregioncurrency.csv: -------------------------------------------------------------------------------- 1 | countryregioncode,currencycode,modifieddate 2 | AE,AED,2014-02-08 10:17:21.51 3 | AR,ARS,2014-02-08 10:17:21.51 4 | AT,ATS,2014-02-08 10:17:21.51 5 | AT,EUR,2008-04-30 00:00:00 6 | AU,AUD,2014-02-08 10:17:21.51 7 | BB,BBD,2014-02-08 10:17:21.51 8 | BD,BDT,2014-02-08 10:17:21.51 9 | BE,BEF,2014-02-08 10:17:21.51 10 | BE,EUR,2008-04-30 00:00:00 11 | BG,BGN,2014-02-08 10:17:21.51 12 | BH,BHD,2014-02-08 10:17:21.51 13 | BN,BND,2014-02-08 10:17:21.51 14 | BO,BOB,2014-02-08 10:17:21.51 15 | BR,BRL,2014-02-08 10:17:21.51 16 | BS,BSD,2014-02-08 10:17:21.51 17 | BT,BTN,2014-02-08 10:17:21.51 18 | CA,CAD,2014-02-08 10:17:21.51 19 | CH,CHF,2014-02-08 10:17:21.51 20 | CL,CLP,2014-02-08 10:17:21.51 21 | CN,CNY,2014-02-08 10:17:21.51 22 | CO,COP,2014-02-08 10:17:21.51 23 | CR,CRC,2014-02-08 10:17:21.51 24 | CY,CYP,2014-02-08 10:17:21.51 25 | CZ,CZK,2014-02-08 10:17:21.51 26 | DE,DEM,2014-02-08 10:17:21.51 27 | DE,EUR,2014-02-08 10:17:21.51 28 | DK,DKK,2014-02-08 10:17:21.51 29 | DO,DOP,2014-02-08 10:17:21.51 30 | DZ,DZD,2014-02-08 10:17:21.51 31 | EC,USD,2014-02-08 10:17:21.51 32 | EE,EEK,2014-02-08 10:17:21.51 33 | EG,EGP,2014-02-08 10:17:21.51 34 | ES,ESP,2014-02-08 10:17:21.51 35 | ES,EUR,2008-04-30 00:00:00 36 | FI,EUR,2008-04-30 00:00:00 37 | FI,FIM,2014-02-08 10:17:21.51 38 | FJ,FJD,2014-02-08 10:17:21.51 39 | FR,EUR,2014-02-08 10:17:21.51 40 | FR,FRF,2014-02-08 10:17:21.51 41 | GB,GBP,2014-02-08 10:17:21.51 42 | GH,GHC,2014-02-08 10:17:21.51 43 | GR,EUR,2008-04-30 00:00:00 44 | GR,GRD,2014-02-08 10:17:21.51 45 | GT,GTQ,2014-02-08 10:17:21.51 46 | HK,HKD,2014-02-08 10:17:21.51 47 | HR,HRK,2014-02-08 10:17:21.51 48 | HU,HUF,2014-02-08 10:17:21.51 49 | ID,IDR,2014-02-08 10:17:21.51 50 | IE,EUR,2008-04-30 00:00:00 51 | IE,IEP,2014-02-08 10:17:21.51 52 | IL,ILS,2014-02-08 10:17:21.51 53 | IN,INR,2014-02-08 10:17:21.51 54 | IS,ISK,2014-02-08 10:17:21.51 55 | IT,EUR,2008-04-30 00:00:00 56 | 
IT,ITL,2014-02-08 10:17:21.51 57 | JM,JMD,2014-02-08 10:17:21.51 58 | JO,JOD,2014-02-08 10:17:21.51 59 | JP,JPY,2014-02-08 10:17:21.51 60 | KE,KES,2014-02-08 10:17:21.51 61 | KR,KRW,2014-02-08 10:17:21.51 62 | KW,KWD,2014-02-08 10:17:21.51 63 | LB,LBP,2014-02-08 10:17:21.51 64 | LK,LKR,2014-02-08 10:17:21.51 65 | LT,LTL,2014-02-08 10:17:21.51 66 | LU,EUR,2008-04-30 00:00:00 67 | LV,LVL,2014-02-08 10:17:21.51 68 | MA,MAD,2014-02-08 10:17:21.51 69 | MT,MTL,2014-02-08 10:17:21.51 70 | MU,MUR,2014-02-08 10:17:21.51 71 | MV,MVR,2014-02-08 10:17:21.51 72 | MX,MXN,2014-02-08 10:17:21.51 73 | MY,MYR,2014-02-08 10:17:21.51 74 | NA,NAD,2014-02-08 10:17:21.51 75 | NG,NGN,2014-02-08 10:17:21.51 76 | NL,EUR,2008-04-30 00:00:00 77 | NL,NLG,2014-02-08 10:17:21.51 78 | NO,NOK,2014-02-08 10:17:21.51 79 | NP,NPR,2014-02-08 10:17:21.51 80 | NZ,NZD,2014-02-08 10:17:21.51 81 | OM,OMR,2014-02-08 10:17:21.51 82 | PA,PAB,2014-02-08 10:17:21.51 83 | PE,PEN,2014-02-08 10:17:21.51 84 | PH,PHP,2014-02-08 10:17:21.51 85 | PK,PKR,2014-02-08 10:17:21.51 86 | PL,PLN,2014-02-08 10:17:21.51 87 | PL,PLZ,2014-02-08 10:17:21.51 88 | PT,EUR,2008-04-30 00:00:00 89 | PT,PTE,2014-02-08 10:17:21.51 90 | PY,PYG,2014-02-08 10:17:21.51 91 | RO,ROL,2014-02-08 10:17:21.51 92 | RU,RUB,2014-02-08 10:17:21.51 93 | RU,RUR,2014-02-08 10:17:21.51 94 | SA,SAR,2014-02-08 10:17:21.51 95 | SE,SEK,2014-02-08 10:17:21.51 96 | SG,SGD,2014-02-08 10:17:21.51 97 | SI,SIT,2014-02-08 10:17:21.51 98 | SK,SKK,2014-02-08 10:17:21.51 99 | SV,SVC,2014-02-08 10:17:21.51 100 | TH,THB,2014-02-08 10:17:21.51 101 | TN,TND,2014-02-08 10:17:21.51 102 | TR,TRL,2014-02-08 10:17:21.51 103 | TT,TTD,2014-02-08 10:17:21.51 104 | TW,TWD,2014-02-08 10:17:21.51 105 | US,USD,2014-02-08 10:17:21.51 106 | UY,UYU,2014-02-08 10:17:21.51 107 | VE,VEB,2014-02-08 10:17:21.51 108 | VN,VND,2014-02-08 10:17:21.51 109 | ZA,ZAR,2014-02-08 10:17:21.51 110 | ZW,ZWD,2014-02-08 10:17:21.51 111 | -------------------------------------------------------------------------------- /Semana 7/Tablas/productcategory.csv: -------------------------------------------------------------------------------- 1 | productcategoryid,name,rowguid,modifieddate 2 | 1,Bikes,cfbda25c-df71-47a7-b81b-64ee161aa37c,2008-04-30 00:00:00 3 | 2,Components,c657828d-d808-4aba-91a3-af2ce02300e9,2008-04-30 00:00:00 4 | 3,Clothing,10a7c342-ca82-48d4-8a38-46a2eb089b74,2008-04-30 00:00:00 5 | 4,Accessories,2be3be36-d9a2-4eee-b593-ed895d97c2a6,2008-04-30 00:00:00 6 | -------------------------------------------------------------------------------- /Semana 7/Tablas/productmodelproductdescriptionculture.csv: -------------------------------------------------------------------------------- 1 | ,productmodelid,productdescriptionid,cultureid,modifieddate 2 | 0,1,1199,en,2013-04-30 00:00:00 3 | 1,1,1467,ar,2013-04-30 00:00:00 4 | 2,1,1589,fr,2013-04-30 00:00:00 5 | 3,1,1712,th,2013-04-30 00:00:00 6 | 4,1,1838,he,2013-04-30 00:00:00 7 | 5,1,1965,zh-cht,2013-04-30 00:00:00 8 | 6,2,1210,en,2013-04-30 00:00:00 9 | 7,2,1476,ar,2013-04-30 00:00:00 10 | 8,2,1598,fr,2013-04-30 00:00:00 11 | 9,2,1721,th,2013-04-30 00:00:00 12 | 10,2,1847,he,2013-04-30 00:00:00 13 | 11,2,1974,zh-cht,2013-04-30 00:00:00 14 | 12,3,1195,en,2013-04-30 00:00:00 15 | 13,3,1464,ar,2013-04-30 00:00:00 16 | 14,3,1586,fr,2013-04-30 00:00:00 17 | 15,3,1709,th,2013-04-30 00:00:00 18 | 16,3,1835,he,2013-04-30 00:00:00 19 | 17,3,1961,zh-cht,2013-04-30 00:00:00 20 | 18,4,1194,en,2013-04-30 00:00:00 21 | 19,4,1463,ar,2013-04-30 00:00:00 22 | 20,4,1585,fr,2013-04-30 00:00:00 23 | 
21,4,1708,th,2013-04-30 00:00:00 24 | 22,4,1834,he,2013-04-30 00:00:00 25 | 23,4,1960,zh-cht,2013-04-30 00:00:00 26 | 24,5,647,en,2013-04-30 00:00:00 27 | 25,5,1400,ar,2013-04-30 00:00:00 28 | 26,5,1605,fr,2013-04-30 00:00:00 29 | 27,5,1639,th,2013-04-30 00:00:00 30 | 28,5,1768,he,2013-04-30 00:00:00 31 | 29,5,1889,zh-cht,2013-04-30 00:00:00 32 | 30,6,1090,en,2013-04-30 00:00:00 33 | 31,6,1451,ar,2013-04-30 00:00:00 34 | 32,6,1573,fr,2013-04-30 00:00:00 35 | 33,6,1696,th,2013-04-30 00:00:00 36 | 34,6,1827,he,2013-04-30 00:00:00 37 | 35,6,1948,zh-cht,2013-04-30 00:00:00 38 | 36,7,1182,en,2013-04-30 00:00:00 39 | 37,7,1453,ar,2013-04-30 00:00:00 40 | 38,7,1575,fr,2013-04-30 00:00:00 41 | 39,7,1698,th,2013-04-30 00:00:00 42 | 40,7,1829,he,2013-04-30 00:00:00 43 | 41,7,1950,zh-cht,2013-04-30 00:00:00 44 | 42,8,637,en,2013-04-30 00:00:00 45 | 43,8,1397,ar,2013-04-30 00:00:00 46 | 44,8,1514,fr,2013-04-30 00:00:00 47 | 45,8,1636,th,2013-04-30 00:00:00 48 | 46,8,1759,he,2013-04-30 00:00:00 49 | 47,8,1886,zh-cht,2013-04-30 00:00:00 50 | 48,9,1020,en,2013-04-30 00:00:00 51 | 49,9,1449,ar,2013-04-30 00:00:00 52 | 50,9,1571,fr,2013-04-30 00:00:00 53 | 51,9,1694,th,2013-04-30 00:00:00 54 | 52,9,1825,he,2013-04-30 00:00:00 55 | 53,9,1946,zh-cht,2013-04-30 00:00:00 56 | 54,10,1146,en,2013-04-30 00:00:00 57 | 55,10,1452,ar,2013-04-30 00:00:00 58 | 56,10,1574,fr,2013-04-30 00:00:00 59 | 57,10,1697,th,2013-04-30 00:00:00 60 | 58,10,1828,he,2013-04-30 00:00:00 61 | 59,10,1949,zh-cht,2013-04-30 00:00:00 62 | 60,11,1211,en,2013-04-30 00:00:00 63 | 61,11,1477,ar,2013-04-30 00:00:00 64 | 62,11,1599,fr,2013-04-30 00:00:00 65 | 63,11,1722,th,2013-04-30 00:00:00 66 | 64,11,1848,he,2013-04-30 00:00:00 67 | 65,11,1975,zh-cht,2013-04-30 00:00:00 68 | 66,12,1192,en,2013-04-30 00:00:00 69 | 67,12,1461,ar,2013-04-30 00:00:00 70 | 68,12,1583,fr,2013-04-30 00:00:00 71 | 69,12,1706,th,2013-04-30 00:00:00 72 | 70,12,1832,he,2013-04-30 00:00:00 73 | 71,12,1958,zh-cht,2013-04-30 00:00:00 74 | 72,13,1200,en,2013-04-30 00:00:00 75 | 73,13,1468,ar,2013-04-30 00:00:00 76 | 74,13,1590,fr,2013-04-30 00:00:00 77 | 75,13,1713,th,2013-04-30 00:00:00 78 | 76,13,1839,he,2013-04-30 00:00:00 79 | 77,13,1966,zh-cht,2013-04-30 00:00:00 80 | 78,14,644,en,2013-04-30 00:00:00 81 | 79,14,1399,ar,2013-04-30 00:00:00 82 | 80,14,1516,fr,2013-04-30 00:00:00 83 | 81,14,1638,th,2013-04-30 00:00:00 84 | 82,14,1761,he,2013-04-30 00:00:00 85 | 83,14,1888,zh-cht,2013-04-30 00:00:00 86 | 84,15,642,en,2013-04-30 00:00:00 87 | 85,15,1398,ar,2013-04-30 00:00:00 88 | 86,15,1515,fr,2013-04-30 00:00:00 89 | 87,15,1637,th,2013-04-30 00:00:00 90 | 88,15,1760,he,2013-04-30 00:00:00 91 | 89,15,1887,zh-cht,2013-04-30 00:00:00 92 | 90,16,1062,en,2013-04-30 00:00:00 93 | 91,16,1450,ar,2013-04-30 00:00:00 94 | 92,16,1572,fr,2013-04-30 00:00:00 95 | 93,16,1695,th,2013-04-30 00:00:00 96 | 94,16,1826,he,2013-04-30 00:00:00 97 | 95,16,1947,zh-cht,2013-04-30 00:00:00 98 | 96,17,661,en,2013-04-30 00:00:00 99 | 97,17,1401,ar,2013-04-30 00:00:00 100 | 98,17,1517,fr,2013-04-30 00:00:00 101 | 99,17,1640,th,2013-04-30 00:00:00 102 | 100,17,1770,he,2013-04-30 00:00:00 103 | 101,17,1890,zh-cht,2013-04-30 00:00:00 104 | 102,18,1189,en,2013-04-30 00:00:00 105 | 103,18,1459,ar,2013-04-30 00:00:00 106 | 104,18,1581,fr,2013-04-30 00:00:00 107 | 105,18,1704,th,2013-04-30 00:00:00 108 | 106,18,1830,he,2013-04-30 00:00:00 109 | 107,18,1956,zh-cht,2013-04-30 00:00:00 110 | 108,19,168,en,2013-04-30 00:00:00 111 | 109,19,1367,ar,2013-04-30 00:00:00 112 | 110,19,1491,fr,2013-04-30 00:00:00 113 
| 111,19,1613,th,2013-04-30 00:00:00 114 | 112,19,1735,he,2013-04-30 00:00:00 115 | 113,19,1862,zh-cht,2013-04-30 00:00:00 116 | 114,20,128,en,2013-04-30 00:00:00 117 | 115,20,1366,ar,2013-04-30 00:00:00 118 | 116,20,1490,fr,2013-04-30 00:00:00 119 | 117,20,1612,th,2013-04-30 00:00:00 120 | 118,20,1734,he,2013-04-30 00:00:00 121 | 119,20,1861,zh-cht,2013-04-30 00:00:00 122 | 120,21,88,en,2013-04-30 00:00:00 123 | 121,21,1365,ar,2013-04-30 00:00:00 124 | 122,21,1489,fr,2013-04-30 00:00:00 125 | 123,21,1611,th,2013-04-30 00:00:00 126 | 124,21,1733,he,2013-04-30 00:00:00 127 | 125,21,1860,zh-cht,2013-04-30 00:00:00 128 | 126,22,64,en,2013-04-30 00:00:00 129 | 127,22,1364,ar,2013-04-30 00:00:00 130 | 128,22,1488,fr,2013-04-30 00:00:00 131 | 129,22,1610,th,2013-04-30 00:00:00 132 | 130,22,1732,he,2013-04-30 00:00:00 133 | 131,22,1859,zh-cht,2013-04-30 00:00:00 134 | 132,23,8,en,2013-04-30 00:00:00 135 | 133,23,1363,ar,2013-04-30 00:00:00 136 | 134,23,1487,fr,2013-04-30 00:00:00 137 | 135,23,1609,th,2013-04-30 00:00:00 138 | 136,23,1731,he,2013-04-30 00:00:00 139 | 137,23,1858,zh-cht,2013-04-30 00:00:00 140 | 138,24,1208,en,2013-04-30 00:00:00 141 | 139,24,1474,ar,2013-04-30 00:00:00 142 | 140,24,1596,fr,2013-04-30 00:00:00 143 | 141,24,1719,th,2013-04-30 00:00:00 144 | 142,24,1845,he,2013-04-30 00:00:00 145 | 143,24,1972,zh-cht,2013-04-30 00:00:00 146 | 144,25,457,en,2013-04-30 00:00:00 147 | 145,25,1377,ar,2013-04-30 00:00:00 148 | 146,25,1501,fr,2013-04-30 00:00:00 149 | 147,25,1623,th,2013-04-30 00:00:00 150 | 148,25,1745,he,2013-04-30 00:00:00 151 | 149,25,1872,zh-cht,2013-04-30 00:00:00 152 | 150,26,409,en,2013-04-30 00:00:00 153 | 151,26,1376,ar,2013-04-30 00:00:00 154 | 152,26,1500,fr,2013-04-30 00:00:00 155 | 153,26,1622,th,2013-04-30 00:00:00 156 | 154,26,1744,he,2013-04-30 00:00:00 157 | 155,26,1871,zh-cht,2013-04-30 00:00:00 158 | 156,27,376,en,2013-04-30 00:00:00 159 | 157,27,1375,ar,2013-04-30 00:00:00 160 | 158,27,1499,fr,2013-04-30 00:00:00 161 | 159,27,1621,th,2013-04-30 00:00:00 162 | 160,27,1743,he,2013-04-30 00:00:00 163 | 161,27,1870,zh-cht,2013-04-30 00:00:00 164 | 162,28,337,en,2013-04-30 00:00:00 165 | 163,28,1373,ar,2013-04-30 00:00:00 166 | 164,28,1497,fr,2013-04-30 00:00:00 167 | 165,28,1619,th,2013-04-30 00:00:00 168 | 166,28,1741,he,2013-04-30 00:00:00 169 | 167,28,1868,zh-cht,2013-04-30 00:00:00 170 | 168,29,320,en,2013-04-30 00:00:00 171 | 169,29,1371,ar,2013-04-30 00:00:00 172 | 170,29,1495,fr,2013-04-30 00:00:00 173 | 171,29,1617,th,2013-04-30 00:00:00 174 | 172,29,1739,he,2013-04-30 00:00:00 175 | 173,29,1866,zh-cht,2013-04-30 00:00:00 176 | 174,30,249,en,2013-04-30 00:00:00 177 | 175,30,1370,ar,2013-04-30 00:00:00 178 | 176,30,1494,fr,2013-04-30 00:00:00 179 | 177,30,1616,th,2013-04-30 00:00:00 180 | 178,30,1738,he,2013-04-30 00:00:00 181 | 179,30,1865,zh-cht,2013-04-30 00:00:00 182 | 180,31,209,en,2013-04-30 00:00:00 183 | 181,31,1369,ar,2013-04-30 00:00:00 184 | 182,31,1493,fr,2013-04-30 00:00:00 185 | 183,31,1615,th,2013-04-30 00:00:00 186 | 184,31,1737,he,2013-04-30 00:00:00 187 | 185,31,1864,zh-cht,2013-04-30 00:00:00 188 | 186,32,1205,en,2013-04-30 00:00:00 189 | 187,32,1472,ar,2013-04-30 00:00:00 190 | 188,32,1594,fr,2013-04-30 00:00:00 191 | 189,32,1717,th,2013-04-30 00:00:00 192 | 190,32,1843,he,2013-04-30 00:00:00 193 | 191,32,1970,zh-cht,2013-04-30 00:00:00 194 | 192,33,1212,en,2013-04-30 00:00:00 195 | 193,33,1478,ar,2013-04-30 00:00:00 196 | 194,33,1600,fr,2013-04-30 00:00:00 197 | 195,33,1723,th,2013-04-30 00:00:00 198 | 196,33,1850,he,2013-04-30 
00:00:00 199 | 197,33,1976,zh-cht,2013-04-30 00:00:00 200 | 198,34,594,en,2013-04-30 00:00:00 201 | 199,34,1387,ar,2013-04-30 00:00:00 202 | 200,34,1504,fr,2013-04-30 00:00:00 203 | 201,34,1626,th,2013-04-30 00:00:00 204 | 202,34,1749,he,2013-04-30 00:00:00 205 | 203,34,1875,zh-cht,2013-04-30 00:00:00 206 | 204,35,554,en,2013-04-30 00:00:00 207 | 205,35,1379,ar,2013-04-30 00:00:00 208 | 206,35,1503,fr,2013-04-30 00:00:00 209 | 207,35,1625,th,2013-04-30 00:00:00 210 | 208,35,1748,he,2013-04-30 00:00:00 211 | 209,35,1874,zh-cht,2013-04-30 00:00:00 212 | 210,36,513,en,2013-04-30 00:00:00 213 | 211,36,1378,ar,2013-04-30 00:00:00 214 | 212,36,1502,fr,2013-04-30 00:00:00 215 | 213,36,1624,th,2013-04-30 00:00:00 216 | 214,36,1747,he,2013-04-30 00:00:00 217 | 215,36,1873,zh-cht,2013-04-30 00:00:00 218 | 216,37,1196,en,2013-04-30 00:00:00 219 | 217,37,1465,ar,2013-04-30 00:00:00 220 | 218,37,1587,fr,2013-04-30 00:00:00 221 | 219,37,1710,th,2013-04-30 00:00:00 222 | 220,37,1836,he,2013-04-30 00:00:00 223 | 221,37,1963,zh-cht,2013-04-30 00:00:00 224 | 222,38,1214,en,2013-04-30 00:00:00 225 | 223,38,1480,ar,2013-04-30 00:00:00 226 | 224,38,1602,fr,2013-04-30 00:00:00 227 | 225,38,1725,th,2013-04-30 00:00:00 228 | 226,38,1852,he,2013-04-30 00:00:00 229 | 227,38,1978,zh-cht,2013-04-30 00:00:00 230 | 228,39,170,en,2013-04-30 00:00:00 231 | 229,39,1368,ar,2013-04-30 00:00:00 232 | 230,39,1492,fr,2013-04-30 00:00:00 233 | 231,39,1614,th,2013-04-30 00:00:00 234 | 232,39,1736,he,2013-04-30 00:00:00 235 | 233,39,1863,zh-cht,2013-04-30 00:00:00 236 | 234,40,321,en,2013-04-30 00:00:00 237 | 235,40,1372,ar,2013-04-30 00:00:00 238 | 236,40,1496,fr,2013-04-30 00:00:00 239 | 237,40,1618,th,2013-04-30 00:00:00 240 | 238,40,1740,he,2013-04-30 00:00:00 241 | 239,40,1867,zh-cht,2013-04-30 00:00:00 242 | 240,41,375,en,2013-04-30 00:00:00 243 | 241,41,1374,ar,2013-04-30 00:00:00 244 | 242,41,1498,fr,2013-04-30 00:00:00 245 | 243,41,1620,th,2013-04-30 00:00:00 246 | 244,41,1742,he,2013-04-30 00:00:00 247 | 245,41,1869,zh-cht,2013-04-30 00:00:00 248 | 246,42,686,en,2013-04-30 00:00:00 249 | 247,42,1402,ar,2013-04-30 00:00:00 250 | 248,42,1518,fr,2013-04-30 00:00:00 251 | 249,42,1641,th,2013-04-30 00:00:00 252 | 250,42,1771,he,2013-04-30 00:00:00 253 | 251,42,1891,zh-cht,2013-04-30 00:00:00 254 | 252,43,873,en,2013-04-30 00:00:00 255 | 253,43,1385,ar,2013-04-30 00:00:00 256 | 254,43,1551,fr,2013-04-30 00:00:00 257 | 255,43,1674,th,2013-04-30 00:00:00 258 | 256,43,1805,he,2013-04-30 00:00:00 259 | 257,43,1926,zh-cht,2013-04-30 00:00:00 260 | 258,44,692,en,2013-04-30 00:00:00 261 | 259,44,1408,ar,2013-04-30 00:00:00 262 | 260,44,1524,fr,2013-04-30 00:00:00 263 | 261,44,1647,th,2013-04-30 00:00:00 264 | 262,44,1777,he,2013-04-30 00:00:00 265 | 263,44,1897,zh-cht,2013-04-30 00:00:00 266 | 264,45,687,en,2013-04-30 00:00:00 267 | 265,45,1403,ar,2013-04-30 00:00:00 268 | 266,45,1519,fr,2013-04-30 00:00:00 269 | 267,45,1642,th,2013-04-30 00:00:00 270 | 268,45,1772,he,2013-04-30 00:00:00 271 | 269,45,1892,zh-cht,2013-04-30 00:00:00 272 | 270,46,688,en,2013-04-30 00:00:00 273 | 271,46,1404,ar,2013-04-30 00:00:00 274 | 272,46,1520,fr,2013-04-30 00:00:00 275 | 273,46,1643,th,2013-04-30 00:00:00 276 | 274,46,1773,he,2013-04-30 00:00:00 277 | 275,46,1893,zh-cht,2013-04-30 00:00:00 278 | 276,47,703,en,2013-04-30 00:00:00 279 | 277,47,1415,ar,2013-04-30 00:00:00 280 | 278,47,1531,fr,2013-04-30 00:00:00 281 | 279,47,1654,th,2013-04-30 00:00:00 282 | 280,47,1784,he,2013-04-30 00:00:00 283 | 281,47,1905,zh-cht,2013-04-30 00:00:00 284 | 
282,48,704,en,2013-04-30 00:00:00 285 | 283,48,1416,ar,2013-04-30 00:00:00 286 | 284,48,1532,fr,2013-04-30 00:00:00 287 | 285,48,1655,th,2013-04-30 00:00:00 288 | 286,48,1785,he,2013-04-30 00:00:00 289 | 287,48,1906,zh-cht,2013-04-30 00:00:00 290 | 288,49,689,en,2013-04-30 00:00:00 291 | 289,49,1405,ar,2013-04-30 00:00:00 292 | 290,49,1521,fr,2013-04-30 00:00:00 293 | 291,49,1644,th,2013-04-30 00:00:00 294 | 292,49,1774,he,2013-04-30 00:00:00 295 | 293,49,1894,zh-cht,2013-04-30 00:00:00 296 | 294,50,690,en,2013-04-30 00:00:00 297 | 295,50,1406,ar,2013-04-30 00:00:00 298 | 296,50,1522,fr,2013-04-30 00:00:00 299 | 297,50,1645,th,2013-04-30 00:00:00 300 | 298,50,1775,he,2013-04-30 00:00:00 301 | 299,50,1895,zh-cht,2013-04-30 00:00:00 302 | 300,51,691,en,2013-04-30 00:00:00 303 | 301,51,1407,ar,2013-04-30 00:00:00 304 | 302,51,1523,fr,2013-04-30 00:00:00 305 | 303,51,1646,th,2013-04-30 00:00:00 306 | 304,51,1776,he,2013-04-30 00:00:00 307 | 305,51,1896,zh-cht,2013-04-30 00:00:00 308 | 306,52,697,en,2013-04-30 00:00:00 309 | 307,52,1409,ar,2013-04-30 00:00:00 310 | 308,52,1525,fr,2013-04-30 00:00:00 311 | 309,52,1648,th,2013-04-30 00:00:00 312 | 310,52,1778,he,2013-04-30 00:00:00 313 | 311,52,1898,zh-cht,2013-04-30 00:00:00 314 | 312,53,853,en,2013-04-30 00:00:00 315 | 313,53,1426,ar,2013-04-30 00:00:00 316 | 314,53,1542,fr,2013-04-30 00:00:00 317 | 315,53,1665,th,2013-04-30 00:00:00 318 | 316,53,1795,he,2013-04-30 00:00:00 319 | 317,53,1916,zh-cht,2013-04-30 00:00:00 320 | 318,54,698,en,2013-04-30 00:00:00 321 | 319,54,1410,ar,2013-04-30 00:00:00 322 | 320,54,1526,fr,2013-04-30 00:00:00 323 | 321,54,1649,th,2013-04-30 00:00:00 324 | 322,54,1779,he,2013-04-30 00:00:00 325 | 323,54,1899,zh-cht,2013-04-30 00:00:00 326 | 324,55,699,en,2013-04-30 00:00:00 327 | 325,55,1411,ar,2013-04-30 00:00:00 328 | 326,55,1527,fr,2013-04-30 00:00:00 329 | 327,55,1650,th,2013-04-30 00:00:00 330 | 328,55,1780,he,2013-04-30 00:00:00 331 | 329,55,1900,zh-cht,2013-04-30 00:00:00 332 | 330,56,700,en,2013-04-30 00:00:00 333 | 331,56,1412,ar,2013-04-30 00:00:00 334 | 332,56,1528,fr,2013-04-30 00:00:00 335 | 333,56,1651,th,2013-04-30 00:00:00 336 | 334,56,1781,he,2013-04-30 00:00:00 337 | 335,56,1901,zh-cht,2013-04-30 00:00:00 338 | 336,57,701,en,2013-04-30 00:00:00 339 | 337,57,1413,ar,2013-04-30 00:00:00 340 | 338,57,1529,fr,2013-04-30 00:00:00 341 | 339,57,1652,th,2013-04-30 00:00:00 342 | 340,57,1782,he,2013-04-30 00:00:00 343 | 341,57,1903,zh-cht,2013-04-30 00:00:00 344 | 342,58,702,en,2013-04-30 00:00:00 345 | 343,58,1414,ar,2013-04-30 00:00:00 346 | 344,58,1530,fr,2013-04-30 00:00:00 347 | 345,58,1653,th,2013-04-30 00:00:00 348 | 346,58,1783,he,2013-04-30 00:00:00 349 | 347,58,1904,zh-cht,2013-04-30 00:00:00 350 | 348,59,744,en,2013-04-30 00:00:00 351 | 349,59,1417,ar,2013-04-30 00:00:00 352 | 350,59,1533,fr,2013-04-30 00:00:00 353 | 351,59,1656,th,2013-04-30 00:00:00 354 | 352,59,1786,he,2013-04-30 00:00:00 355 | 353,59,1907,zh-cht,2013-04-30 00:00:00 356 | 354,60,745,en,2013-04-30 00:00:00 357 | 355,60,1418,ar,2013-04-30 00:00:00 358 | 356,60,1534,fr,2013-04-30 00:00:00 359 | 357,60,1657,th,2013-04-30 00:00:00 360 | 358,60,1787,he,2013-04-30 00:00:00 361 | 359,60,1908,zh-cht,2013-04-30 00:00:00 362 | 360,61,746,en,2013-04-30 00:00:00 363 | 361,61,1419,ar,2013-04-30 00:00:00 364 | 362,61,1535,fr,2013-04-30 00:00:00 365 | 363,61,1658,th,2013-04-30 00:00:00 366 | 364,61,1788,he,2013-04-30 00:00:00 367 | 365,61,1909,zh-cht,2013-04-30 00:00:00 368 | 366,62,847,en,2013-04-30 00:00:00 369 | 367,62,1420,ar,2013-04-30 
00:00:00 370 | 368,62,1536,fr,2013-04-30 00:00:00 371 | 369,62,1659,th,2013-04-30 00:00:00 372 | 370,62,1789,he,2013-04-30 00:00:00 373 | 371,62,1910,zh-cht,2013-04-30 00:00:00 374 | 372,63,848,en,2013-04-30 00:00:00 375 | 373,63,1421,ar,2013-04-30 00:00:00 376 | 374,63,1537,fr,2013-04-30 00:00:00 377 | 375,63,1660,th,2013-04-30 00:00:00 378 | 376,63,1790,he,2013-04-30 00:00:00 379 | 377,63,1911,zh-cht,2013-04-30 00:00:00 380 | 378,64,849,en,2013-04-30 00:00:00 381 | 379,64,1422,ar,2013-04-30 00:00:00 382 | 380,64,1538,fr,2013-04-30 00:00:00 383 | 381,64,1661,th,2013-04-30 00:00:00 384 | 382,64,1791,he,2013-04-30 00:00:00 385 | 383,64,1912,zh-cht,2013-04-30 00:00:00 386 | 384,65,892,en,2013-04-30 00:00:00 387 | 385,65,1437,ar,2013-04-30 00:00:00 388 | 386,65,1559,fr,2013-04-30 00:00:00 389 | 387,65,1682,th,2013-04-30 00:00:00 390 | 388,65,1813,he,2013-04-30 00:00:00 391 | 389,65,1934,zh-cht,2013-04-30 00:00:00 392 | 390,66,891,en,2013-04-30 00:00:00 393 | 391,66,1436,ar,2013-04-30 00:00:00 394 | 392,66,1558,fr,2013-04-30 00:00:00 395 | 393,66,1681,th,2013-04-30 00:00:00 396 | 394,66,1812,he,2013-04-30 00:00:00 397 | 395,66,1933,zh-cht,2013-04-30 00:00:00 398 | 396,67,893,en,2013-04-30 00:00:00 399 | 397,67,1438,ar,2013-04-30 00:00:00 400 | 398,67,1560,fr,2013-04-30 00:00:00 401 | 399,67,1683,th,2013-04-30 00:00:00 402 | 400,67,1814,he,2013-04-30 00:00:00 403 | 401,67,1935,zh-cht,2013-04-30 00:00:00 404 | 402,68,850,en,2013-04-30 00:00:00 405 | 403,68,1423,ar,2013-04-30 00:00:00 406 | 404,68,1539,fr,2013-04-30 00:00:00 407 | 405,68,1662,th,2013-04-30 00:00:00 408 | 406,68,1792,he,2013-04-30 00:00:00 409 | 407,68,1913,zh-cht,2013-04-30 00:00:00 410 | 408,69,851,en,2013-04-30 00:00:00 411 | 409,69,1424,ar,2013-04-30 00:00:00 412 | 410,69,1540,fr,2013-04-30 00:00:00 413 | 411,69,1663,th,2013-04-30 00:00:00 414 | 412,69,1793,he,2013-04-30 00:00:00 415 | 413,69,1914,zh-cht,2013-04-30 00:00:00 416 | 414,70,852,en,2013-04-30 00:00:00 417 | 415,70,1425,ar,2013-04-30 00:00:00 418 | 416,70,1541,fr,2013-04-30 00:00:00 419 | 417,70,1664,th,2013-04-30 00:00:00 420 | 418,70,1794,he,2013-04-30 00:00:00 421 | 419,70,1915,zh-cht,2013-04-30 00:00:00 422 | 420,71,856,en,2013-04-30 00:00:00 423 | 421,71,1427,ar,2013-04-30 00:00:00 424 | 422,71,1543,fr,2013-04-30 00:00:00 425 | 423,71,1666,th,2013-04-30 00:00:00 426 | 424,71,1796,he,2013-04-30 00:00:00 427 | 425,71,1918,zh-cht,2013-04-30 00:00:00 428 | 426,72,858,en,2013-04-30 00:00:00 429 | 427,72,1429,ar,2013-04-30 00:00:00 430 | 428,72,1544,fr,2013-04-30 00:00:00 431 | 429,72,1667,th,2013-04-30 00:00:00 432 | 430,72,1797,he,2013-04-30 00:00:00 433 | 431,72,1919,zh-cht,2013-04-30 00:00:00 434 | 432,73,867,en,2013-04-30 00:00:00 435 | 433,73,1430,ar,2013-04-30 00:00:00 436 | 434,73,1545,fr,2013-04-30 00:00:00 437 | 435,73,1668,th,2013-04-30 00:00:00 438 | 436,73,1799,he,2013-04-30 00:00:00 439 | 437,73,1920,zh-cht,2013-04-30 00:00:00 440 | 438,74,868,en,2013-04-30 00:00:00 441 | 439,74,1380,ar,2013-04-30 00:00:00 442 | 440,74,1546,fr,2013-04-30 00:00:00 443 | 441,74,1669,th,2013-04-30 00:00:00 444 | 442,74,1800,he,2013-04-30 00:00:00 445 | 443,74,1921,zh-cht,2013-04-30 00:00:00 446 | 444,75,869,en,2013-04-30 00:00:00 447 | 445,75,1381,ar,2013-04-30 00:00:00 448 | 446,75,1547,fr,2013-04-30 00:00:00 449 | 447,75,1670,th,2013-04-30 00:00:00 450 | 448,75,1801,he,2013-04-30 00:00:00 451 | 449,75,1922,zh-cht,2013-04-30 00:00:00 452 | 450,76,870,en,2013-04-30 00:00:00 453 | 451,76,1382,ar,2013-04-30 00:00:00 454 | 452,76,1548,fr,2013-04-30 00:00:00 455 | 
453,76,1671,th,2013-04-30 00:00:00 456 | 454,76,1802,he,2013-04-30 00:00:00 457 | 455,76,1923,zh-cht,2013-04-30 00:00:00 458 | 456,77,871,en,2013-04-30 00:00:00 459 | 457,77,1383,ar,2013-04-30 00:00:00 460 | 458,77,1549,fr,2013-04-30 00:00:00 461 | 459,77,1672,th,2013-04-30 00:00:00 462 | 460,77,1803,he,2013-04-30 00:00:00 463 | 461,77,1924,zh-cht,2013-04-30 00:00:00 464 | 462,78,872,en,2013-04-30 00:00:00 465 | 463,78,1384,ar,2013-04-30 00:00:00 466 | 464,78,1550,fr,2013-04-30 00:00:00 467 | 465,78,1673,th,2013-04-30 00:00:00 468 | 466,78,1804,he,2013-04-30 00:00:00 469 | 467,78,1925,zh-cht,2013-04-30 00:00:00 470 | 468,79,885,en,2013-04-30 00:00:00 471 | 469,79,1386,ar,2013-04-30 00:00:00 472 | 470,79,1552,fr,2013-04-30 00:00:00 473 | 471,79,1675,th,2013-04-30 00:00:00 474 | 472,79,1806,he,2013-04-30 00:00:00 475 | 473,79,1927,zh-cht,2013-04-30 00:00:00 476 | 474,80,886,en,2013-04-30 00:00:00 477 | 475,80,1431,ar,2013-04-30 00:00:00 478 | 476,80,1553,fr,2013-04-30 00:00:00 479 | 477,80,1676,th,2013-04-30 00:00:00 480 | 478,80,1807,he,2013-04-30 00:00:00 481 | 479,80,1928,zh-cht,2013-04-30 00:00:00 482 | 480,81,887,en,2013-04-30 00:00:00 483 | 481,81,1432,ar,2013-04-30 00:00:00 484 | 482,81,1554,fr,2013-04-30 00:00:00 485 | 483,81,1677,th,2013-04-30 00:00:00 486 | 484,81,1808,he,2013-04-30 00:00:00 487 | 485,81,1929,zh-cht,2013-04-30 00:00:00 488 | 486,82,888,en,2013-04-30 00:00:00 489 | 487,82,1433,ar,2013-04-30 00:00:00 490 | 488,82,1555,fr,2013-04-30 00:00:00 491 | 489,82,1678,th,2013-04-30 00:00:00 492 | 490,82,1809,he,2013-04-30 00:00:00 493 | 491,82,1930,zh-cht,2013-04-30 00:00:00 494 | 492,83,889,en,2013-04-30 00:00:00 495 | 493,83,1434,ar,2013-04-30 00:00:00 496 | 494,83,1556,fr,2013-04-30 00:00:00 497 | 495,83,1679,th,2013-04-30 00:00:00 498 | 496,83,1810,he,2013-04-30 00:00:00 499 | 497,83,1931,zh-cht,2013-04-30 00:00:00 500 | 498,84,890,en,2013-04-30 00:00:00 501 | 499,84,1435,ar,2013-04-30 00:00:00 502 | 500,84,1557,fr,2013-04-30 00:00:00 503 | 501,84,1680,th,2013-04-30 00:00:00 504 | 502,84,1811,he,2013-04-30 00:00:00 505 | 503,84,1932,zh-cht,2013-04-30 00:00:00 506 | 504,85,903,en,2013-04-30 00:00:00 507 | 505,85,1439,ar,2013-04-30 00:00:00 508 | 506,85,1561,fr,2013-04-30 00:00:00 509 | 507,85,1684,th,2013-04-30 00:00:00 510 | 508,85,1815,he,2013-04-30 00:00:00 511 | 509,85,1936,zh-cht,2013-04-30 00:00:00 512 | 510,86,904,en,2013-04-30 00:00:00 513 | 511,86,1440,ar,2013-04-30 00:00:00 514 | 512,86,1562,fr,2013-04-30 00:00:00 515 | 513,86,1685,th,2013-04-30 00:00:00 516 | 514,86,1816,he,2013-04-30 00:00:00 517 | 515,86,1937,zh-cht,2013-04-30 00:00:00 518 | 516,87,905,en,2013-04-30 00:00:00 519 | 517,87,1441,ar,2013-04-30 00:00:00 520 | 518,87,1563,fr,2013-04-30 00:00:00 521 | 519,87,1686,th,2013-04-30 00:00:00 522 | 520,87,1817,he,2013-04-30 00:00:00 523 | 521,87,1938,zh-cht,2013-04-30 00:00:00 524 | 522,88,906,en,2013-04-30 00:00:00 525 | 523,88,1442,ar,2013-04-30 00:00:00 526 | 524,88,1564,fr,2013-04-30 00:00:00 527 | 525,88,1687,th,2013-04-30 00:00:00 528 | 526,88,1818,he,2013-04-30 00:00:00 529 | 527,88,1939,zh-cht,2013-04-30 00:00:00 530 | 528,89,907,en,2013-04-30 00:00:00 531 | 529,89,1443,ar,2013-04-30 00:00:00 532 | 530,89,1565,fr,2013-04-30 00:00:00 533 | 531,89,1688,th,2013-04-30 00:00:00 534 | 532,89,1819,he,2013-04-30 00:00:00 535 | 533,89,1940,zh-cht,2013-04-30 00:00:00 536 | 534,90,908,en,2013-04-30 00:00:00 537 | 535,90,1444,ar,2013-04-30 00:00:00 538 | 536,90,1566,fr,2013-04-30 00:00:00 539 | 537,90,1689,th,2013-04-30 00:00:00 540 | 538,90,1820,he,2013-04-30 
00:00:00 541 | 539,90,1941,zh-cht,2013-04-30 00:00:00 542 | 540,91,909,en,2013-04-30 00:00:00 543 | 541,91,1445,ar,2013-04-30 00:00:00 544 | 542,91,1567,fr,2013-04-30 00:00:00 545 | 543,91,1690,th,2013-04-30 00:00:00 546 | 544,91,1821,he,2013-04-30 00:00:00 547 | 545,91,1942,zh-cht,2013-04-30 00:00:00 548 | 546,92,912,en,2013-04-30 00:00:00 549 | 547,92,1446,ar,2013-04-30 00:00:00 550 | 548,92,1568,fr,2013-04-30 00:00:00 551 | 549,92,1691,th,2013-04-30 00:00:00 552 | 550,92,1822,he,2013-04-30 00:00:00 553 | 551,92,1943,zh-cht,2013-04-30 00:00:00 554 | 552,93,913,en,2013-04-30 00:00:00 555 | 553,93,1447,ar,2013-04-30 00:00:00 556 | 554,93,1569,fr,2013-04-30 00:00:00 557 | 555,93,1692,th,2013-04-30 00:00:00 558 | 556,93,1823,he,2013-04-30 00:00:00 559 | 557,93,1944,zh-cht,2013-04-30 00:00:00 560 | 558,94,914,en,2013-04-30 00:00:00 561 | 559,94,1448,ar,2013-04-30 00:00:00 562 | 560,94,1570,fr,2013-04-30 00:00:00 563 | 561,94,1693,th,2013-04-30 00:00:00 564 | 562,94,1824,he,2013-04-30 00:00:00 565 | 563,94,1945,zh-cht,2013-04-30 00:00:00 566 | 564,95,3,en,2013-04-30 00:00:00 567 | 565,95,1360,ar,2013-04-30 00:00:00 568 | 566,95,1484,fr,2013-04-30 00:00:00 569 | 567,95,1606,th,2013-04-30 00:00:00 570 | 568,95,1728,he,2013-04-30 00:00:00 571 | 569,95,1855,zh-cht,2013-04-30 00:00:00 572 | 570,96,4,en,2013-04-30 00:00:00 573 | 571,96,1361,ar,2013-04-30 00:00:00 574 | 572,96,1485,fr,2013-04-30 00:00:00 575 | 573,96,1607,th,2013-04-30 00:00:00 576 | 574,96,1729,he,2013-04-30 00:00:00 577 | 575,96,1856,zh-cht,2013-04-30 00:00:00 578 | 576,97,5,en,2013-04-30 00:00:00 579 | 577,97,1362,ar,2013-04-30 00:00:00 580 | 578,97,1486,fr,2013-04-30 00:00:00 581 | 579,97,1608,th,2013-04-30 00:00:00 582 | 580,97,1730,he,2013-04-30 00:00:00 583 | 581,97,1857,zh-cht,2013-04-30 00:00:00 584 | 582,98,613,en,2013-04-30 00:00:00 585 | 583,98,1388,ar,2013-04-30 00:00:00 586 | 584,98,1505,fr,2013-04-30 00:00:00 587 | 585,98,1627,th,2013-04-30 00:00:00 588 | 586,98,1750,he,2013-04-30 00:00:00 589 | 587,98,1876,zh-cht,2013-04-30 00:00:00 590 | 588,99,618,en,2013-04-30 00:00:00 591 | 589,99,1389,ar,2013-04-30 00:00:00 592 | 590,99,1506,fr,2013-04-30 00:00:00 593 | 591,99,1628,th,2013-04-30 00:00:00 594 | 592,99,1751,he,2013-04-30 00:00:00 595 | 593,99,1877,zh-cht,2013-04-30 00:00:00 596 | 594,100,619,en,2013-04-30 00:00:00 597 | 595,100,1390,ar,2013-04-30 00:00:00 598 | 596,100,1507,fr,2013-04-30 00:00:00 599 | 597,100,1629,th,2013-04-30 00:00:00 600 | 598,100,1752,he,2013-04-30 00:00:00 601 | 599,100,1878,zh-cht,2013-04-30 00:00:00 602 | 600,101,620,en,2013-04-30 00:00:00 603 | 601,101,1391,ar,2013-04-30 00:00:00 604 | 602,101,1508,fr,2013-04-30 00:00:00 605 | 603,101,1630,th,2013-04-30 00:00:00 606 | 604,101,1753,he,2013-04-30 00:00:00 607 | 605,101,1879,zh-cht,2013-04-30 00:00:00 608 | 606,102,627,en,2013-04-30 00:00:00 609 | 607,102,1392,ar,2013-04-30 00:00:00 610 | 608,102,1509,fr,2013-04-30 00:00:00 611 | 609,102,1631,th,2013-04-30 00:00:00 612 | 610,102,1754,he,2013-04-30 00:00:00 613 | 611,102,1880,zh-cht,2013-04-30 00:00:00 614 | 612,103,630,en,2013-04-30 00:00:00 615 | 613,103,1393,ar,2013-04-30 00:00:00 616 | 614,103,1510,fr,2013-04-30 00:00:00 617 | 615,103,1632,th,2013-04-30 00:00:00 618 | 616,103,1755,he,2013-04-30 00:00:00 619 | 617,103,1882,zh-cht,2013-04-30 00:00:00 620 | 618,104,633,en,2013-04-30 00:00:00 621 | 619,104,1394,ar,2013-04-30 00:00:00 622 | 620,104,1511,fr,2013-04-30 00:00:00 623 | 621,104,1633,th,2013-04-30 00:00:00 624 | 622,104,1756,he,2013-04-30 00:00:00 625 | 623,104,1883,zh-cht,2013-04-30 
00:00:00 626 | 624,105,634,en,2013-04-30 00:00:00 627 | 625,105,1395,ar,2013-04-30 00:00:00 628 | 626,105,1512,fr,2013-04-30 00:00:00 629 | 627,105,1634,th,2013-04-30 00:00:00 630 | 628,105,1757,he,2013-04-30 00:00:00 631 | 629,105,1884,zh-cht,2013-04-30 00:00:00 632 | 630,106,635,en,2013-04-30 00:00:00 633 | 631,106,1396,ar,2013-04-30 00:00:00 634 | 632,106,1513,fr,2013-04-30 00:00:00 635 | 633,106,1635,th,2013-04-30 00:00:00 636 | 634,106,1758,he,2013-04-30 00:00:00 637 | 635,106,1885,zh-cht,2013-04-30 00:00:00 638 | 636,107,1213,en,2013-04-30 00:00:00 639 | 637,107,1479,ar,2013-04-30 00:00:00 640 | 638,107,1601,fr,2013-04-30 00:00:00 641 | 639,107,1724,th,2013-04-30 00:00:00 642 | 640,107,1851,he,2013-04-30 00:00:00 643 | 641,107,1977,zh-cht,2013-04-30 00:00:00 644 | 642,108,1183,en,2013-04-30 00:00:00 645 | 643,108,1454,ar,2013-04-30 00:00:00 646 | 644,108,1576,fr,2013-04-30 00:00:00 647 | 645,108,1699,th,2013-04-30 00:00:00 648 | 646,108,1763,he,2013-04-30 00:00:00 649 | 647,108,1951,zh-cht,2013-04-30 00:00:00 650 | 648,109,1202,en,2013-04-30 00:00:00 651 | 649,109,1470,ar,2013-04-30 00:00:00 652 | 650,109,1592,fr,2013-04-30 00:00:00 653 | 651,109,1715,th,2013-04-30 00:00:00 654 | 652,109,1841,he,2013-04-30 00:00:00 655 | 653,109,1968,zh-cht,2013-04-30 00:00:00 656 | 654,110,1203,en,2013-04-30 00:00:00 657 | 655,110,1471,ar,2013-04-30 00:00:00 658 | 656,110,1593,fr,2013-04-30 00:00:00 659 | 657,110,1716,th,2013-04-30 00:00:00 660 | 658,110,1842,he,2013-04-30 00:00:00 661 | 659,110,1969,zh-cht,2013-04-30 00:00:00 662 | 660,111,1186,en,2013-04-30 00:00:00 663 | 661,111,1456,ar,2013-04-30 00:00:00 664 | 662,111,1578,fr,2013-04-30 00:00:00 665 | 663,111,1701,th,2013-04-30 00:00:00 666 | 664,111,1765,he,2013-04-30 00:00:00 667 | 665,111,1953,zh-cht,2013-04-30 00:00:00 668 | 666,112,1209,en,2013-04-30 00:00:00 669 | 667,112,1475,ar,2013-04-30 00:00:00 670 | 668,112,1597,fr,2013-04-30 00:00:00 671 | 669,112,1720,th,2013-04-30 00:00:00 672 | 670,112,1846,he,2013-04-30 00:00:00 673 | 671,112,1973,zh-cht,2013-04-30 00:00:00 674 | 672,113,1185,en,2013-04-30 00:00:00 675 | 673,113,1455,ar,2013-04-30 00:00:00 676 | 674,113,1577,fr,2013-04-30 00:00:00 677 | 675,113,1700,th,2013-04-30 00:00:00 678 | 676,113,1764,he,2013-04-30 00:00:00 679 | 677,113,1952,zh-cht,2013-04-30 00:00:00 680 | 678,114,1197,en,2013-04-30 00:00:00 681 | 679,114,1466,ar,2013-04-30 00:00:00 682 | 680,114,1588,fr,2013-04-30 00:00:00 683 | 681,114,1711,th,2013-04-30 00:00:00 684 | 682,114,1837,he,2013-04-30 00:00:00 685 | 683,114,1964,zh-cht,2013-04-30 00:00:00 686 | 684,115,1216,en,2013-04-30 00:00:00 687 | 685,115,1482,ar,2013-04-30 00:00:00 688 | 686,115,1604,fr,2013-04-30 00:00:00 689 | 687,115,1727,th,2013-04-30 00:00:00 690 | 688,115,1854,he,2013-04-30 00:00:00 691 | 689,115,1980,zh-cht,2013-04-30 00:00:00 692 | 690,116,1191,en,2013-04-30 00:00:00 693 | 691,116,1460,ar,2013-04-30 00:00:00 694 | 692,116,1582,fr,2013-04-30 00:00:00 695 | 693,116,1705,th,2013-04-30 00:00:00 696 | 694,116,1831,he,2013-04-30 00:00:00 697 | 695,116,1957,zh-cht,2013-04-30 00:00:00 698 | 696,117,1206,en,2013-04-30 00:00:00 699 | 697,117,1473,ar,2013-04-30 00:00:00 700 | 698,117,1595,fr,2013-04-30 00:00:00 701 | 699,117,1718,th,2013-04-30 00:00:00 702 | 700,117,1844,he,2013-04-30 00:00:00 703 | 701,117,1971,zh-cht,2013-04-30 00:00:00 704 | 702,118,1187,en,2013-04-30 00:00:00 705 | 703,118,1457,ar,2013-04-30 00:00:00 706 | 704,118,1579,fr,2013-04-30 00:00:00 707 | 705,118,1702,th,2013-04-30 00:00:00 708 | 706,118,1766,he,2013-04-30 00:00:00 709 | 
707,118,1954,zh-cht,2013-04-30 00:00:00 710 | 708,119,1215,en,2013-04-30 00:00:00 711 | 709,119,1481,ar,2013-04-30 00:00:00 712 | 710,119,1603,fr,2013-04-30 00:00:00 713 | 711,119,1726,th,2013-04-30 00:00:00 714 | 712,119,1853,he,2013-04-30 00:00:00 715 | 713,119,1979,zh-cht,2013-04-30 00:00:00 716 | 714,120,1193,en,2013-04-30 00:00:00 717 | 715,120,1462,ar,2013-04-30 00:00:00 718 | 716,120,1584,fr,2013-04-30 00:00:00 719 | 717,120,1707,th,2013-04-30 00:00:00 720 | 718,120,1833,he,2013-04-30 00:00:00 721 | 719,120,1959,zh-cht,2013-04-30 00:00:00 722 | 720,121,1188,en,2013-04-30 00:00:00 723 | 721,121,1458,ar,2013-04-30 00:00:00 724 | 722,121,1580,fr,2013-04-30 00:00:00 725 | 723,121,1703,th,2013-04-30 00:00:00 726 | 724,121,1767,he,2013-04-30 00:00:00 727 | 725,121,1955,zh-cht,2013-04-30 00:00:00 728 | 726,122,1201,en,2013-04-30 00:00:00 729 | 727,122,1469,ar,2013-04-30 00:00:00 730 | 728,122,1591,fr,2013-04-30 00:00:00 731 | 729,122,1714,th,2013-04-30 00:00:00 732 | 730,122,1840,he,2013-04-30 00:00:00 733 | 731,122,1967,zh-cht,2013-04-30 00:00:00 734 | 732,123,1981,en,2013-04-30 00:00:00 735 | 733,123,1982,ar,2013-04-30 00:00:00 736 | 734,123,1983,fr,2013-04-30 00:00:00 737 | 735,123,1984,th,2013-04-30 00:00:00 738 | 736,123,1985,he,2013-04-30 00:00:00 739 | 737,123,1986,zh-cht,2013-04-30 00:00:00 740 | 738,124,1987,en,2013-04-30 00:00:00 741 | 739,124,1988,ar,2013-04-30 00:00:00 742 | 740,124,1989,fr,2013-04-30 00:00:00 743 | 741,124,1990,th,2013-04-30 00:00:00 744 | 742,124,1991,he,2013-04-30 00:00:00 745 | 743,124,1992,zh-cht,2013-04-30 00:00:00 746 | 744,125,1993,en,2013-04-30 00:00:00 747 | 745,125,1994,ar,2013-04-30 00:00:00 748 | 746,125,1995,fr,2013-04-30 00:00:00 749 | 747,125,1996,th,2013-04-30 00:00:00 750 | 748,125,1997,he,2013-04-30 00:00:00 751 | 749,125,1998,zh-cht,2013-04-30 00:00:00 752 | 750,126,1999,en,2013-04-30 00:00:00 753 | 751,126,2000,ar,2013-04-30 00:00:00 754 | 752,126,2001,fr,2013-04-30 00:00:00 755 | 753,126,2002,th,2013-04-30 00:00:00 756 | 754,126,2003,he,2013-04-30 00:00:00 757 | 755,126,2004,zh-cht,2013-04-30 00:00:00 758 | 756,127,2005,en,2013-04-30 00:00:00 759 | 757,127,2006,ar,2013-04-30 00:00:00 760 | 758,127,2007,fr,2013-04-30 00:00:00 761 | 759,127,2008,th,2013-04-30 00:00:00 762 | 760,127,2009,he,2013-04-30 00:00:00 763 | 761,127,2010,zh-cht,2013-04-30 00:00:00 764 | -------------------------------------------------------------------------------- /Semana 7/Tablas/productreview.csv: -------------------------------------------------------------------------------- 1 | productreviewid,productid,reviewername,reviewdate,emailaddress,rating,comments,modifieddate 2 | 1,709,John Smith,2013-09-18 00:00:00,john@fourthcoffee.com,5,"I can't believe I'm singing the praises of a pair of socks, but I just came back from a grueling 3 | 3-day ride and these socks really helped make the trip a blast. They're lightweight yet really cushioned my feet all day. 4 | The reinforced toe is nearly bullet-proof and I didn't experience any problems with rubbing or blisters like I have with 5 | other brands. I know it sounds silly, but it's always the little stuff (like comfortable feet) that makes or breaks a long trip. 6 | I won't go on another trip without them!",2013-09-18 00:00:00 7 | 2,937,David,2013-11-13 00:00:00,david@graphicdesigninstitute.com,4,"A little on the heavy side, but overall the entry/exit is easy in all conditions. I've used these pedals for 8 | more than 3 years and I've never had a problem. Cleanup is easy. Mud and sand don't get trapped. 
I would like 9 | them even better if there was a weight reduction. Maybe in the next design. Still, I would recommend them to a friend.",2013-11-13 00:00:00 10 | 3,937,Jill,2013-11-15 00:00:00,jill@margiestravel.com,2,"Maybe it's just because I'm new to mountain biking, but I had a terrible time getting use 11 | to these pedals. In my first outing, I wiped out trying to release my foot. Any suggestions on 12 | ways I can adjust the pedals, or is it just a learning curve thing?",2013-11-15 00:00:00 13 | 4,798,Laura Norman,2013-11-15 00:00:00,laura@treyresearch.net,5,"The Road-550-W from Adventure Works Cycles is everything it's advertised to be. Finally, a quality bike that 14 | is actually built for a woman and provides control and comfort in one neat package. The top tube is shorter, the suspension is weight-tuned and there's a much shorter reach to the brake 15 | levers. All this adds up to a great mountain bike that is sure to accommodate any woman's anatomy. In addition to getting the size right, the saddle is incredibly comfortable. 16 | Attention to detail is apparent in every aspect from the frame finish to the careful design of each component. Each component is a solid performer without any fluff. 17 | The designers clearly did their homework and thought about size, weight, and funtionality throughout. And at less than 19 pounds, the bike is manageable for even the most petite cyclist. 18 | 19 | We had 5 riders take the bike out for a spin and really put it to the test. The results were consistent and very positive. Our testers loved the manuverability 20 | and control they had with the redesigned frame on the 550-W. A definite improvement over the 2012 design. Four out of five testers listed quick handling 21 | and responsivness were the key elements they noticed. Technical climbing and on the flats, the bike just cruises through the rough. Tight corners and obstacles were handled effortlessly. The fifth tester was more impressed with the smooth ride. The heavy-duty shocks absorbed even the worst bumps and provided a soft ride on all but the 22 | nastiest trails and biggest drops. The shifting was rated superb and typical of what we've come to expect from Adventure Works Cycles. On descents, the bike handled flawlessly and tracked very well. The bike is well balanced front-to-rear and frame flex was minimal. In particular, the testers 23 | noted that the brake system had a unique combination of power and modulation. While some brake setups can be overly touchy, these brakes had a good 24 | amount of power, but also a good feel that allows you to apply as little or as much braking power as is needed. Second is their short break-in period. We found that they tend to break-in well before 25 | the end of the first ride; while others take two to three rides (or more) to come to full power. 26 | 27 | On the negative side, the pedals were not quite up to our tester's standards. 28 | Just for fun, we experimented with routine maintenance tasks. Overall we found most operations to be straight forward and easy to complete. The only exception was replacing the front wheel. The maintenance manual that comes 29 | with the bike say to install the front wheel with the axle quick release or bolt, then compress the fork a few times before fastening and tightening the two quick-release mechanisms on the bottom of the dropouts. 
This is to seat the axle in the dropouts, and if you do not 30 | do this, the axle will become seated after you tightened the two bottom quick releases, which will then become loose. It's better to test the tightness carefully or you may notice that the two bottom quick releases have come loose enough to fall completely open. And that's something you don't want to experience 31 | while out on the road! 32 | 33 | The Road-550-W frame is available in a variety of sizes and colors and has the same durable, high-quality aluminum that AWC is known for. At a MSRP of just under $1125.00, it's comparable in price to its closest competitors and 34 | we think that after a test drive you'l find the quality and performance above and beyond . You'll have a grin on your face and be itching to get out on the road for more. While designed for serious road racing, the Road-550-W would be an excellent choice for just about any terrain and 35 | any level of experience. It's a huge step in the right direction for female cyclists and well worth your consideration and hard-earned money.",2013-11-15 00:00:00 36 | -------------------------------------------------------------------------------- /Semana 7/Tablas/productsubcategory.csv: -------------------------------------------------------------------------------- 1 | productsubcategoryid,productcategoryid,name,rowguid,modifieddate 2 | 1,1,Mountain Bikes,2d364ade-264a-433c-b092-4fcbf3804e01,2008-04-30 00:00:00 3 | 2,1,Road Bikes,000310c0-bcc8-42c4-b0c3-45ae611af06b,2008-04-30 00:00:00 4 | 3,1,Touring Bikes,02c5061d-ecdc-4274-b5f1-e91d76bc3f37,2008-04-30 00:00:00 5 | 4,2,Handlebars,3ef2c725-7135-4c85-9ae6-ae9a3bdd9283,2008-04-30 00:00:00 6 | 5,2,Bottom Brackets,a9e54089-8a1e-4cf5-8646-e3801f685934,2008-04-30 00:00:00 7 | 6,2,Brakes,d43ba4a3-ef0d-426b-90eb-4be4547dd30c,2008-04-30 00:00:00 8 | 7,2,Chains,e93a7231-f16c-4b0f-8c41-c73fdec62da0,2008-04-30 00:00:00 9 | 8,2,Cranksets,4f644521-422b-4f19-974a-e3df6102567e,2008-04-30 00:00:00 10 | 9,2,Derailleurs,1830d70c-aa2a-40c0-a271-5ba86f38f8bf,2008-04-30 00:00:00 11 | 10,2,Forks,b5f9ba42-b69b-4fdd-b2ec-57fb7b42e3cf,2008-04-30 00:00:00 12 | 11,2,Headsets,7c782bbe-5a16-495a-aa50-10afe5a84af2,2008-04-30 00:00:00 13 | 12,2,Mountain Frames,61b21b65-e16a-4be7-9300-4d8e9db861be,2008-04-30 00:00:00 14 | 13,2,Pedals,6d24ac07-7a84-4849-864a-865a14125bc9,2008-04-30 00:00:00 15 | 14,2,Road Frames,5515f857-075b-4f9a-87b7-43b4997077b3,2008-04-30 00:00:00 16 | 15,2,Saddles,049fffa3-9d30-46df-82f7-f20730ec02b3,2008-04-30 00:00:00 17 | 16,2,Touring Frames,d2e3f1a8-56c4-4f36-b29d-5659fc0d2789,2008-04-30 00:00:00 18 | 17,2,Wheels,43521287-4b0b-438e-b80e-d82d9ad7c9f0,2008-04-30 00:00:00 19 | 18,3,Bib-Shorts,67b58d2b-5798-4a90-8c6c-5ddacf057171,2008-04-30 00:00:00 20 | 19,3,Caps,430dd6a8-a755-4b23-bb05-52520107da5f,2008-04-30 00:00:00 21 | 20,3,Gloves,92d5657b-0032-4e49-bad5-41a441a70942,2008-04-30 00:00:00 22 | 21,3,Jerseys,09e91437-ba4f-4b1a-8215-74184fd95db8,2008-04-30 00:00:00 23 | 22,3,Shorts,1a5ba5b3-03c3-457c-b11e-4fa85ede87da,2008-04-30 00:00:00 24 | 23,3,Socks,701019c3-09fe-4949-8386-c6ce686474e5,2008-04-30 00:00:00 25 | 24,3,Tights,5deb3e55-9897-4416-b18a-515e970bc2d1,2008-04-30 00:00:00 26 | 25,3,Vests,9ad7fe93-5ba0-4736-b578-ff80a2071297,2008-04-30 00:00:00 27 | 26,4,Bike Racks,4624b5ce-66d6-496b-9201-c053df3556cc,2008-04-30 00:00:00 28 | 27,4,Bike Stands,43b445c8-b820-424e-a1d5-90d81da0b46f,2008-04-30 00:00:00 29 | 28,4,Bottles and Cages,9b7dff41-9fa3-4776-8def-2c9a48c8b779,2008-04-30 00:00:00 30 | 
29,4,Cleaners,9ad3bcf0-244d-4ec4-a6a0-fb701351c6a3,2008-04-30 00:00:00 31 | 30,4,Fenders,1697f8a2-0a08-4883-b7dd-d19117b4e9a7,2008-04-30 00:00:00 32 | 31,4,Helmets,f5e07a33-c9e0-439c-b5f3-9f25fb65becc,2008-04-30 00:00:00 33 | 32,4,Hydration Packs,646a8906-fc87-4267-a443-9c6d791e6693,2008-04-30 00:00:00 34 | 33,4,Lights,954178ba-624f-42db-95f6-ca035f36d130,2008-04-30 00:00:00 35 | 34,4,Locks,19646983-3fa0-4773-9a0c-f34c49df9bc8,2008-04-30 00:00:00 36 | 35,4,Panniers,3002a5d5-fec3-464b-bef3-e0f81d35f431,2008-04-30 00:00:00 37 | 36,4,Pumps,fe4d46f2-c87c-48c5-a4a1-3f55712d80b1,2008-04-30 00:00:00 38 | 37,4,Tires and Tubes,3c17c9ae-e906-48b4-bdd3-60e28d47dcdf,2008-04-30 00:00:00 39 | -------------------------------------------------------------------------------- /Semana 7/Tablas/salesperson.csv: -------------------------------------------------------------------------------- 1 | businessentityid,territoryid,salesquota,bonus,commissionpct,salesytd,saleslastyear,rowguid,modifieddate 2 | 274,,,0,0,559697.5639,0,48754992-9ee0-4c0e-8c94-9451604e3e02,2010-12-28 00:00:00 3 | 275,2,300000,4100,0.012,3763178.1787,1750406.4785,1e0a7274-3064-4f58-88ee-4c6586c87169,2011-05-24 00:00:00 4 | 276,4,250000,2000,0.015,4251368.5497,1439156.0291,4dd9eee4-8e81-4f8c-af97-683394c1f7c0,2011-05-24 00:00:00 5 | 277,3,250000,2500,0.015,3189418.3662,1997186.2037,39012928-bfec-4242-874d-423162c3f567,2011-05-24 00:00:00 6 | 278,6,250000,500,0.01,1453719.4653,1620276.8966,7a0ae1ab-b283-40f9-91d1-167abf06d720,2011-05-24 00:00:00 7 | 279,5,300000,6700,0.01,2315185.611,1849640.9418,52a5179d-3239-4157-ae29-17e868296dc0,2011-05-24 00:00:00 8 | 280,1,250000,5000,0.01,1352577.1325,1927059.178,be941a4a-fb50-4947-bda4-bb8972365b08,2011-05-24 00:00:00 9 | 281,4,250000,3550,0.01,2458535.6169,2073505.9999,35326ddb-7278-4fef-b3ba-ea137b69094e,2011-05-24 00:00:00 10 | 282,6,250000,5000,0.015,2604540.7172,2038234.6549,31fd7fc1-dc84-4f05-b9a0-762519eacacc,2011-05-24 00:00:00 11 | 283,1,250000,3500,0.012,1573012.9383,1371635.3158,6bac15b2-8ffb-45a9-b6d5-040e16c2073f,2011-05-24 00:00:00 12 | 284,1,300000,3900,0.019,1576562.1966,0,ac94ec04-a2dc-43e3-8654-dd0c546abc17,2012-09-23 00:00:00 13 | 285,,,0,0,172524.4512,0,cfdbef27-b1f7-4a56-a878-0221c73bae67,2013-03-07 00:00:00 14 | 286,9,250000,5650,0.018,1421810.9242,2278548.9776,9b968777-75dc-45bd-a8df-9cdaa72839e1,2013-05-23 00:00:00 15 | 287,,,0,0,519905.932,0,1dd1f689-df74-4149-8600-59555eef154b,2012-04-09 00:00:00 16 | 288,8,250000,75,0.018,1827066.7118,1307949.7917,224bb25a-62e3-493e-acaf-4f8f5c72396a,2013-05-23 00:00:00 17 | 289,10,250000,5150,0.02,4116871.2277,1635823.3967,25f6838d-9db4-4833-9ddc-7a24283af1ba,2012-05-23 00:00:00 18 | 290,7,250000,985,0.016,3121616.3202,2396539.7601,f509e3d4-76c8-42aa-b353-90b7b8db08de,2012-05-23 00:00:00 19 | -------------------------------------------------------------------------------- /Semana 7/Tablas/salesterritory.csv: -------------------------------------------------------------------------------- 1 | territoryid,name,countryregioncode,group,salesytd,saleslastyear,costytd,costlastyear,rowguid,modifieddate 2 | 1,Northwest,US,North America,7887186.7882,3298694.4938,0,0,43689a10-e30b-497f-b0de-11de20267ff7,2008-04-30 00:00:00 3 | 2,Northeast,US,North America,2402176.8476,3607148.9371,0,0,00fb7309-96cc-49e2-8363-0a1ba72486f2,2008-04-30 00:00:00 4 | 3,Central,US,North America,3072175.118,3205014.0767,0,0,df6e7fd8-1a8d-468c-b103-ed8addb452c1,2008-04-30 00:00:00 5 | 4,Southwest,US,North 
America,10510853.8739,5366575.7098,0,0,dc3e9ea0-7950-4431-9428-99dbcbc33865,2008-04-30 00:00:00
6 | 5,Southeast,US,North America,2538667.2515,3925071.4318,0,0,6dc4165a-5e4c-42d2-809d-4344e0ac75e7,2008-04-30 00:00:00
7 | 6,Canada,CA,North America,6771829.1376,5693988.86,0,0,06b4af8a-1639-476e-9266-110461d66b00,2008-04-30 00:00:00
8 | 7,France,FR,Europe,4772398.3078,2396539.7601,0,0,bf806804-9b4c-4b07-9d19-706f2e689552,2008-04-30 00:00:00
9 | 8,Germany,DE,Europe,3805202.3478,1307949.7917,0,0,6d2450db-8159-414f-a917-e73ee91c38a9,2008-04-30 00:00:00
10 | 9,Australia,AU,Pacific,5977814.9154,2278548.9776,0,0,602e612e-dfe9-41d9-b894-27e489747885,2008-04-30 00:00:00
11 | 10,United Kingdom,GB,Europe,5012905.3656,1635823.3967,0,0,05fc7e1f-2dea-414e-9ecd-09d150516fb5,2008-04-30 00:00:00
12 |
--------------------------------------------------------------------------------
/Semana 8/Datos_Microdesafio_Semana8_DE.csv:
--------------------------------------------------------------------------------
Pais ;Comisionado;Reduccion_CO2;Incrmento_P;Inversion_arboles;Fecha;Telefono
Argentina;Carlos Veroes;Si;Si;Si;08/07/2022;23467698
Colombia;Sofia Andrade;No;No;Si;12/06/2022;76587899
Chile;Cristina Valdivia;Si;Si;Si;15/07/2022;76593749
Bolivia;Pedro Carlos;No;No;No;01/09/2022;65746474
Paraguay;Juan Paraguas;No;Si;Si;04/06/2022;83447474
Venezuela;Andres Calamaro;Si;Si;No;03/05/2022;76847383
Uruguay;Fernando Tatial;No;No;Si;04/04/2022;13434453
Brasil;Gabriel Toeras;Si;Si;No;12/06/2022;37374344
Ecuador;Juan Vera;No;No;No;14/05/2022;47477654
Peru;Andres Porto;No;No;No;13/05/2022;57737373
EEUU;Jennifer Laurence;Si;Si;Si;01/04/2022;84574737
Canada;John white;No;Si;Si;04/01/2022;83873734
Mexico ;Andres Fernandez;Si;Si;No;08/05/2022;64634746
Costa Rica;Pedro Urrutia;No;No;Si;09/06/2022;14544643
Jamaica ;Michael Sophit;No;No;No;26/07/2022;75638664
--------------------------------------------------------------------------------
/Semana 8/Ejemplo_en_vivo_Visualizacion_permisos_Redshift.sql:
--------------------------------------------------------------------------------
SELECT
    u.usename,
    s.schemaname,
    has_schema_privilege(u.usename,s.schemaname,'create') AS user_has_create_permission,
    has_schema_privilege(u.usename,s.schemaname,'usage') AS user_has_usage_permission
FROM
    pg_user u
CROSS JOIN
    (SELECT DISTINCT schemaname FROM pg_tables) s
WHERE
    u.usename = 'nombre_de_usuario_x'
    AND s.schemaname = 'nombre_de_esquema'
--------------------------------------------------------------------------------
/Semana 8/Ejemplo_en_vivo_Visualizacion_permisos_completos_Redshift.sql:
--------------------------------------------------------------------------------
SELECT
    u.usename,
    t.schemaname||'.'||t.tablename,
    has_table_privilege(u.usename,t.tablename,'select') AS user_has_select_permission,
    has_table_privilege(u.usename,t.tablename,'insert') AS user_has_insert_permission,
    has_table_privilege(u.usename,t.tablename,'update') AS user_has_update_permission,
    has_table_privilege(u.usename,t.tablename,'delete') AS user_has_delete_permission,
    has_table_privilege(u.usename,t.tablename,'references') AS user_has_references_permission
FROM
    pg_user u
CROSS JOIN
    pg_tables t
WHERE
    u.usename = 'nombre_de_usuario'
    AND t.tablename = 'nombre_de_tabla'
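
The two inspection queries above only report whether the placeholder user already holds each privilege on the placeholder schema and table. As a rough companion sketch (reusing the same placeholder names, which are stand-ins rather than real objects), the privileges they test for would typically be granted in Redshift like this:

-- Sketch only: grants matching the privileges checked by the two queries above.
-- 'nombre_de_usuario', 'nombre_de_esquema' and 'nombre_de_tabla' are placeholders, not real objects.
GRANT CREATE, USAGE ON SCHEMA nombre_de_esquema TO nombre_de_usuario;
GRANT SELECT, INSERT, UPDATE, DELETE, REFERENCES
    ON nombre_de_esquema.nombre_de_tabla TO nombre_de_usuario;

After such grants, re-running both inspection queries should return true in every *_permission column for that user.
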
--------------------------------------------------------------------------------
/Semana 8/Seguridad_basica_Redshift.sql:
--------------------------------------------------------------------------------
-- Step 1
CREATE SCHEMA my_secure_schema;

CREATE TABLE my_secure_schema.my_secure_table (
    name VARCHAR(30),
    dob TIMESTAMP SORTKEY,
    zip INTEGER,
    ssn VARCHAR(9)
)
diststyle all;

-- Step 2
CREATE USER data_scientist PASSWORD 'Test1234';

CREATE GROUP ds_prod WITH USER data_scientist;

-- Step 3
SELECT * FROM my_secure_schema.my_secure_table;

--------------------------------------------------------------------------------
/Semana 8/Seguridad_columnas_Redshift.sql:
--------------------------------------------------------------------------------
-- Step 1: Grant the user access to every column except the Social Security Number (ssn)
GRANT ALL ON SCHEMA my_secure_schema TO data_scientist;
GRANT SELECT(name, dob, zip) ON my_secure_schema.my_secure_table TO data_scientist;

-- Step 2: Connect to the cluster as the data_scientist user.
-- Run a SELECT on the table with and without the Social Security Number and note the difference.
SELECT name, dob, zip
FROM my_secure_schema.my_secure_table;

SELECT name, dob, zip, ssn
FROM my_secure_schema.my_secure_table;

-- The first query should return 0 rows (the table is still empty), while the second should fail with a permission error on the ssn column.

--------------------------------------------------------------------------------
/Semana 8/airbnb_nyc.rar:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 8/airbnb_nyc.rar
--------------------------------------------------------------------------------
/Semana 9/EJEMPLO1_DOCKER.rar:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 9/EJEMPLO1_DOCKER.rar
--------------------------------------------------------------------------------
/Semana 9/EJEMPLO2_DOCKER.rar:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 9/EJEMPLO2_DOCKER.rar
--------------------------------------------------------------------------------
/Semana 9/web-page.rar:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CoderContenidos/Data.Engineering/1f01330a3c29c7060be02199d7563d78fe58d270/Semana 9/web-page.rar
--------------------------------------------------------------------------------
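
The Seguridad_columnas_Redshift.sql script above restricts access with a column-level GRANT. A minimal alternative sketch, assuming the my_secure_schema objects and the data_scientist user created in Seguridad_basica_Redshift.sql (the view name below is illustrative only), is to expose just the non-sensitive columns through a view:

-- Sketch only: hide ssn behind a view instead of a column-level GRANT.
-- Assumes my_secure_schema, my_secure_table and data_scientist from Seguridad_basica_Redshift.sql;
-- the view name my_secure_table_public is hypothetical.
CREATE VIEW my_secure_schema.my_secure_table_public AS
SELECT name, dob, zip
FROM my_secure_schema.my_secure_table;

GRANT USAGE ON SCHEMA my_secure_schema TO data_scientist;
GRANT SELECT ON my_secure_schema.my_secure_table_public TO data_scientist;

Queried through the view, data_scientist never touches the ssn column, and the exposed column list stays fixed even if the base table later gains more sensitive fields.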