├── .gitignore ├── .ipynb_checkpoints ├── 1. Práctica en el uso de la librería yfinance-checkpoint.ipynb ├── 2. Prácticas de WebScraping, read_html y pandas-checkpoint.ipynb ├── 3. Tarea de extracción de datos mediante Webscraping-checkpoint.ipynb ├── 4. Tarea Final-checkpoint.ipynb ├── Extraer montar y graficar datos financieros-checkpoint-Toshiba-Manuel.ipynb └── Extraer montar y graficar datos financieros-checkpoint.ipynb ├── 1. Práctica en el uso de la librería yfinance.ipynb ├── 2. Prácticas de WebScraping, read_html y pandas.ipynb ├── 3. Tarea de extracción de datos mediante Webscraping.ipynb ├── 4. Tarea Final.ipynb └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | *.docx -------------------------------------------------------------------------------- /.ipynb_checkpoints/1. Práctica en el uso de la librería yfinance-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "

Extracting stock market data with Python's 'yfinance' library

\n" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "Suppose we work for an investment fund manager; our job is to identify any stock trading activity so that the extracted data can be analyzed later. In this lab we extract stock data with the yfinance library, which lets us pull stock data and company information from Yahoo Finance.\n" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "

Contents

\n", 22 | "
\n", 23 | " \n", 29 | "
\n", 30 | "\n", 31 | "
\n" 32 | ] 33 | }, 34 | { 35 | "cell_type": "markdown", 36 | "metadata": {}, 37 | "source": [ 38 | "### Installing the yfinance library\n", 39 | "This library downloads historical financial and economic data from Yahoo Finance and gives simple access to a wide range of data. More information at https://pypi.org/project/yfinance/" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 2, 45 | "metadata": {}, 46 | "outputs": [ 47 | { 48 | "name": "stdout", 49 | "output_type": "stream", 50 | "text": [ 51 | "Collecting yfinance\n", 52 | " Downloading yfinance-0.1.63.tar.gz (26 kB)\n", 53 | "Requirement already satisfied: pandas>=0.24 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from yfinance) (1.1.3)\n", 54 | "Requirement already satisfied: numpy>=1.15 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from yfinance) (1.19.2)\n", 55 | "Requirement already satisfied: requests>=2.20 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from yfinance) (2.24.0)\n", 56 | "Collecting multitasking>=0.0.7\n", 57 | " Downloading multitasking-0.0.9.tar.gz (8.1 kB)\n", 58 | "Requirement already satisfied: lxml>=4.5.1 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from yfinance) (4.6.1)\n", 59 | "Requirement already satisfied: python-dateutil>=2.7.3 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from pandas>=0.24->yfinance) (2.8.1)\n", 60 | "Requirement already satisfied: pytz>=2017.2 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from pandas>=0.24->yfinance) (2020.1)\n", 61 | "Requirement already satisfied: chardet<4,>=3.0.2 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from requests>=2.20->yfinance) (3.0.4)\n", 62 | "Requirement already satisfied: certifi>=2017.4.17 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from requests>=2.20->yfinance) (2020.6.20)\n", 63 | "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in 
c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from requests>=2.20->yfinance) (1.25.11)\n", 64 | "Requirement already satisfied: idna<3,>=2.5 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from requests>=2.20->yfinance) (2.10)\n", 65 | "Requirement already satisfied: six>=1.5 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from python-dateutil>=2.7.3->pandas>=0.24->yfinance) (1.15.0)\n", 66 | "Building wheels for collected packages: yfinance, multitasking\n", 67 | " Building wheel for yfinance (setup.py): started\n", 68 | " Building wheel for yfinance (setup.py): finished with status 'done'\n", 69 | " Created wheel for yfinance: filename=yfinance-0.1.63-py2.py3-none-any.whl size=23914 sha256=00cfb7ca5adf87a773f2043f7c775b8cbd1f46fd7a909510934db16be79d6a1d\n", 70 | " Stored in directory: c:\\users\\mrsanchez\\appdata\\local\\pip\\cache\\wheels\\ec\\cc\\c1\\32da8ee853d742d5d7cbd11ee04421222eb354672020b57297\n", 71 | " Building wheel for multitasking (setup.py): started\n", 72 | " Building wheel for multitasking (setup.py): finished with status 'done'\n", 73 | " Created wheel for multitasking: filename=multitasking-0.0.9-py3-none-any.whl size=8372 sha256=6710745adb7b243b00c3493ad96e60e86be0d4c6e5ca2e27177fb014bb109092\n", 74 | " Stored in directory: c:\\users\\mrsanchez\\appdata\\local\\pip\\cache\\wheels\\57\\6d\\a3\\a39b839cc75274d2acfb1c58bfead2f726c6577fe8c4723f13\n", 75 | "Successfully built yfinance multitasking\n", 76 | "Installing collected packages: multitasking, yfinance\n", 77 | "Successfully installed multitasking-0.0.9 yfinance-0.1.63\n" 78 | ] 79 | } 80 | ], 81 | "source": [ 82 | "!pip install yfinance\n", 83 | "#!pip install pandas" 84 | ] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "metadata": {}, 89 | "source": [ 90 | "### Importing the \"yfinance\" and \"pandas\" libraries" 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": 3, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "import 
yfinance as yf\n", 100 | "import pandas as pd" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "metadata": {}, 106 | "source": [ 107 | "## 1- Using the yfinance library to extract stock data\n" 108 | ] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": {}, 113 | "source": [ 114 | "Ticker() takes the ticker symbol under which a company is listed on the stock exchange and returns an object that gives access to its stock information. As an example, we retrieve the stock data for Apple, whose ticker symbol is AAPL.\n" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": 4, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "apple = yf.Ticker(\"AAPL\")" 124 | ] 125 | }, 126 | { 127 | "cell_type": "markdown", 128 | "metadata": {}, 129 | "source": [ 130 | "### Stock information\n", 131 | "The info attribute returns information about Apple's stock as a Python dictionary." 132 | ] 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": 6, 137 | "metadata": {}, 138 | "outputs": [ 139 | { 140 | "data": { 141 | "text/plain": [ 142 | "{'zip': '95014',\n", 143 | " 'sector': 'Technology',\n", 144 | " 'fullTimeEmployees': 147000,\n", 145 | " 'longBusinessSummary': 'Apple Inc. designs, manufactures, and markets smartphones, personal computers, tablets, wearables, and accessories worldwide. It also sells various related services. The company offers iPhone, a line of smartphones; Mac, a line of personal computers; iPad, a line of multi-purpose tablets; and wearables, home, and accessories comprising AirPods, Apple TV, Apple Watch, Beats products, HomePod, iPod touch, and other Apple-branded and third-party accessories. 
It also provides AppleCare support services; cloud services store services; and operates various platforms, including the App Store, that allow customers to discover and download applications and digital content, such as books, music, video, games, and podcasts. In addition, the company offers various services, such as Apple Arcade, a game subscription service; Apple Music, which offers users a curated listening experience with on-demand radio stations; Apple News+, a subscription news and magazine service; Apple TV+, which offers exclusive original content; Apple Card, a co-branded credit card; and Apple Pay, a cashless payment service, as well as licenses its intellectual property. The company serves consumers, and small and mid-sized businesses; and the education, enterprise, and government markets. It sells and delivers third-party applications for its products through the App Store. The company also sells its products through its retail and online stores, and direct sales force; and third-party cellular network carriers, wholesalers, retailers, and resellers. Apple Inc. 
was founded in 1977 and is headquartered in Cupertino, California.',\n", 146 | " 'city': 'Cupertino',\n", 147 | " 'phone': '408-996-1010',\n", 148 | " 'state': 'CA',\n", 149 | " 'country': 'United States',\n", 150 | " 'companyOfficers': [],\n", 151 | " 'website': 'http://www.apple.com',\n", 152 | " 'maxAge': 1,\n", 153 | " 'address1': 'One Apple Park Way',\n", 154 | " 'industry': 'Consumer Electronics',\n", 155 | " 'ebitdaMargins': 0.31955,\n", 156 | " 'profitMargins': 0.25004,\n", 157 | " 'grossMargins': 0.41005,\n", 158 | " 'operatingCashflow': 104414003200,\n", 159 | " 'revenueGrowth': 0.364,\n", 160 | " 'operatingMargins': 0.28788,\n", 161 | " 'ebitda': 110934999040,\n", 162 | " 'targetLowPrice': 125,\n", 163 | " 'recommendationKey': 'buy',\n", 164 | " 'grossProfits': 104956000000,\n", 165 | " 'freeCashflow': 80625876992,\n", 166 | " 'targetMedianPrice': 160,\n", 167 | " 'currentPrice': 147.2,\n", 168 | " 'earningsGrowth': 1,\n", 169 | " 'currentRatio': 1.062,\n", 170 | " 'returnOnAssets': 0.19302,\n", 171 | " 'numberOfAnalystOpinions': 39,\n", 172 | " 'targetMeanPrice': 159.34,\n", 173 | " 'debtToEquity': 210.782,\n", 174 | " 'returnOnEquity': 1.27125,\n", 175 | " 'targetHighPrice': 185,\n", 176 | " 'totalCash': 61696000000,\n", 177 | " 'totalDebt': 135491002368,\n", 178 | " 'totalRevenue': 347155005440,\n", 179 | " 'totalCashPerShare': 3.732,\n", 180 | " 'financialCurrency': 'USD',\n", 181 | " 'revenuePerShare': 20.61,\n", 182 | " 'quickRatio': 0.887,\n", 183 | " 'recommendationMean': 2,\n", 184 | " 'exchange': 'NMS',\n", 185 | " 'shortName': 'Apple Inc.',\n", 186 | " 'longName': 'Apple Inc.',\n", 187 | " 'exchangeTimezoneName': 'America/New_York',\n", 188 | " 'exchangeTimezoneShortName': 'EDT',\n", 189 | " 'isEsgPopulated': False,\n", 190 | " 'gmtOffSetMilliseconds': '-14400000',\n", 191 | " 'quoteType': 'EQUITY',\n", 192 | " 'symbol': 'AAPL',\n", 193 | " 'messageBoardId': 'finmb_24937',\n", 194 | " 'market': 'us_market',\n", 195 | " 
'annualHoldingsTurnover': None,\n", 196 | " 'enterpriseToRevenue': 7.21,\n", 197 | " 'beta3Year': None,\n", 198 | " 'enterpriseToEbitda': 22.562,\n", 199 | " '52WeekChange': 0.29013848,\n", 200 | " 'morningStarRiskRating': None,\n", 201 | " 'forwardEps': 5.34,\n", 202 | " 'revenueQuarterlyGrowth': None,\n", 203 | " 'sharesOutstanding': 16530199552,\n", 204 | " 'fundInceptionDate': None,\n", 205 | " 'annualReportExpenseRatio': None,\n", 206 | " 'totalAssets': None,\n", 207 | " 'bookValue': 3.882,\n", 208 | " 'sharesShort': 96355309,\n", 209 | " 'sharesPercentSharesOut': 0.0058,\n", 210 | " 'fundFamily': None,\n", 211 | " 'lastFiscalYearEnd': 1601078400,\n", 212 | " 'heldPercentInstitutions': 0.59102,\n", 213 | " 'netIncomeToCommon': 86801997824,\n", 214 | " 'trailingEps': 5.108,\n", 215 | " 'lastDividendValue': 0.22,\n", 216 | " 'SandP52WeekChange': 0.31455648,\n", 217 | " 'priceToBook': 37.9186,\n", 218 | " 'heldPercentInsiders': 0.00068999996,\n", 219 | " 'nextFiscalYearEnd': 1664150400,\n", 220 | " 'yield': None,\n", 221 | " 'mostRecentQuarter': 1624665600,\n", 222 | " 'shortRatio': 1.14,\n", 223 | " 'sharesShortPreviousMonthDate': 1623715200,\n", 224 | " 'floatShares': 16513139929,\n", 225 | " 'beta': 1.202797,\n", 226 | " 'enterpriseValue': 2502902939648,\n", 227 | " 'priceHint': 2,\n", 228 | " 'threeYearAverageReturn': None,\n", 229 | " 'lastSplitDate': 1598832000,\n", 230 | " 'lastSplitFactor': '4:1',\n", 231 | " 'legalType': None,\n", 232 | " 'lastDividendDate': 1620345600,\n", 233 | " 'morningStarOverallRating': None,\n", 234 | " 'earningsQuarterlyGrowth': 0.932,\n", 235 | " 'priceToSalesTrailing12Months': 7.0091033,\n", 236 | " 'dateShortInterest': 1626307200,\n", 237 | " 'pegRatio': 1.53,\n", 238 | " 'ytdReturn': None,\n", 239 | " 'forwardPE': 27.565542,\n", 240 | " 'lastCapGain': None,\n", 241 | " 'shortPercentOfFloat': 0.0058,\n", 242 | " 'sharesShortPriorMonth': 108937943,\n", 243 | " 'impliedSharesOutstanding': None,\n", 244 | " 'category': None,\n", 
245 | " 'fiveYearAverageReturn': None,\n", 246 | " 'previousClose': 146.95,\n", 247 | " 'regularMarketOpen': 146.98,\n", 248 | " 'twoHundredDayAverage': 131.78065,\n", 249 | " 'trailingAnnualDividendYield': 0.005682205,\n", 250 | " 'payoutRatio': 0.16309999,\n", 251 | " 'volume24Hr': None,\n", 252 | " 'regularMarketDayHigh': 147.84,\n", 253 | " 'navPrice': None,\n", 254 | " 'averageDailyVolume10Day': 75906475,\n", 255 | " 'regularMarketPreviousClose': 146.95,\n", 256 | " 'fiftyDayAverage': 141.56372,\n", 257 | " 'trailingAnnualDividendRate': 0.835,\n", 258 | " 'open': 146.98,\n", 259 | " 'toCurrency': None,\n", 260 | " 'averageVolume10days': 75906475,\n", 261 | " 'expireDate': None,\n", 262 | " 'algorithm': None,\n", 263 | " 'dividendRate': 0.88,\n", 264 | " 'exDividendDate': 1628208000,\n", 265 | " 'circulatingSupply': None,\n", 266 | " 'startDate': None,\n", 267 | " 'regularMarketDayLow': 146.17,\n", 268 | " 'currency': 'USD',\n", 269 | " 'trailingPE': 28.817541,\n", 270 | " 'regularMarketVolume': 29498682,\n", 271 | " 'lastMarket': None,\n", 272 | " 'maxSupply': None,\n", 273 | " 'openInterest': None,\n", 274 | " 'marketCap': 2433245249536,\n", 275 | " 'volumeAllCurrencies': None,\n", 276 | " 'strikePrice': None,\n", 277 | " 'averageVolume': 81387312,\n", 278 | " 'dayLow': 146.17,\n", 279 | " 'ask': 147.21,\n", 280 | " 'askSize': 1400,\n", 281 | " 'volume': 29498682,\n", 282 | " 'fiftyTwoWeekHigh': 150,\n", 283 | " 'fromCurrency': None,\n", 284 | " 'fiveYearAvgDividendYield': 1.29,\n", 285 | " 'fiftyTwoWeekLow': 103.1,\n", 286 | " 'bid': 147.2,\n", 287 | " 'tradeable': False,\n", 288 | " 'dividendYield': 0.006,\n", 289 | " 'bidSize': 1300,\n", 290 | " 'dayHigh': 147.84,\n", 291 | " 'regularMarketPrice': 147.2,\n", 292 | " 'logo_url': 'https://logo.clearbit.com/apple.com'}" 293 | ] 294 | }, 295 | "execution_count": 6, 296 | "metadata": {}, 297 | "output_type": "execute_result" 298 | } 299 | ], 300 | "source": [ 301 | "apple_info=apple.info\n", 302 | "apple_info 
# show the contents of the variable" 303 | ] 304 | }, 305 | { 306 | "cell_type": "markdown", 307 | "metadata": {}, 308 | "source": [ 309 | "Since we now have a dictionary, we can look up any value by its key. For example, we extract the country with the key 'country'.\n" 310 | ] 311 | }, 312 | { 313 | "cell_type": "code", 314 | "execution_count": 7, 315 | "metadata": {}, 316 | "outputs": [ 317 | { 318 | "data": { 319 | "text/plain": [ 320 | "'United States'" 321 | ] 322 | }, 323 | "execution_count": 7, 324 | "metadata": {}, 325 | "output_type": "execute_result" 326 | } 327 | ], 328 | "source": [ 329 | "apple_info['country']" 330 | ] 331 | }, 332 | { 333 | "cell_type": "markdown", 334 | "metadata": {}, 335 | "source": [ 336 | "## 2 - Using yfinance to extract historical share price data\n", 337 | "To run an analysis we need the historical evolution of the main indicators (opening price, high, low, closing price, etc.). For this we use the history() method. Its period parameter sets the time span to retrieve. The valid periods are 1 day (1d), 5d, 1 month (1mo), 3mo, 6mo, 1 year (1y), 2y, 5y, 10y, ytd and max; the last one (max) returns the complete history since the company's first day of trading." 338 | ] 339 | }, 340 | { 341 | "cell_type": "code", 342 | "execution_count": 19, 343 | "metadata": {}, 344 | "outputs": [ 345 | { 346 | "name": "stdout", 347 | "output_type": "stream", 348 | "text": [ 349 | " Open High Low Close Volume \\\n", 350 | "Date \n", 351 | "1980-12-12 0.100751 0.101189 0.100751 0.100751 469033600 \n", 352 | "1980-12-15 0.095933 0.095933 0.095495 0.095495 175884800 \n", 353 | "1980-12-16 0.088923 0.088923 0.088485 0.088485 105728000 \n", 354 | "1980-12-17 0.090676 0.091114 0.090676 0.090676 86441600 \n", 355 | "1980-12-18 0.093304 0.093742 0.093304 0.093304 73449600 \n", 356 | "... ... ... ... ... ... 
\n", 357 | "2021-07-30 144.380005 146.330002 144.110001 145.860001 70382000 \n", 358 | "2021-08-02 146.360001 146.949997 145.250000 145.520004 62880000 \n", 359 | "2021-08-03 145.809998 148.039993 145.179993 147.360001 64660800 \n", 360 | "2021-08-04 147.270004 147.789993 146.279999 146.949997 56319800 \n", 361 | "2021-08-05 146.979996 147.839996 146.169998 147.301498 31752985 \n", 362 | "\n", 363 | " Dividends Stock Splits \n", 364 | "Date \n", 365 | "1980-12-12 0.0 0.0 \n", 366 | "1980-12-15 0.0 0.0 \n", 367 | "1980-12-16 0.0 0.0 \n", 368 | "1980-12-17 0.0 0.0 \n", 369 | "1980-12-18 0.0 0.0 \n", 370 | "... ... ... \n", 371 | "2021-07-30 0.0 0.0 \n", 372 | "2021-08-02 0.0 0.0 \n", 373 | "2021-08-03 0.0 0.0 \n", 374 | "2021-08-04 0.0 0.0 \n", 375 | "2021-08-05 0.0 0.0 \n", 376 | "\n", 377 | "[10249 rows x 7 columns]\n" 378 | ] 379 | } 380 | ], 381 | "source": [ 382 | "historico_apple = apple.history(period=\"max\")\n", 383 | "print(historico_apple)" 384 | ] 385 | }, 386 | { 387 | "cell_type": "markdown", 388 | "metadata": {}, 389 | "source": [ 390 | "The data is returned as a pandas DataFrame indexed by date ('Date'), with the columns 'Open', 'High', 'Low', 'Close', 'Volume', 'Dividends' and 'Stock Splits' for each trading day.\n", 391 | "The head() and tail() methods show the first or the last rows respectively. Both take the number of rows to display as a parameter; if it is omitted, five rows are shown." 392 | ] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "execution_count": 20, 397 | "metadata": {}, 398 | "outputs": [ 399 | { 400 | "data": { 401 | "text/html": [ 402 | "
\n", 403 | "\n", 416 | "\n", 417 | " \n", 418 | " \n", 419 | " \n", 420 | " \n", 421 | " \n", 422 | " \n", 423 | " \n", 424 | " \n", 425 | " \n", 426 | " \n", 427 | " \n", 428 | " \n", 429 | " \n", 430 | " \n", 431 | " \n", 432 | " \n", 433 | " \n", 434 | " \n", 435 | " \n", 436 | " \n", 437 | " \n", 438 | " \n", 439 | " \n", 440 | " \n", 441 | " \n", 442 | " \n", 443 | " \n", 444 | " \n", 445 | " \n", 446 | " \n", 447 | " \n", 448 | " \n", 449 | " \n", 450 | " \n", 451 | " \n", 452 | " \n", 453 | " \n", 454 | " \n", 455 | " \n", 456 | " \n", 457 | " \n", 458 | " \n", 459 | " \n", 460 | " \n", 461 | " \n", 462 | " \n", 463 | " \n", 464 | " \n", 465 | " \n", 466 | " \n", 467 | " \n", 468 | " \n", 469 | " \n", 470 | " \n", 471 | " \n", 472 | " \n", 473 | " \n", 474 | " \n", 475 | " \n", 476 | " \n", 477 | " \n", 478 | " \n", 479 | " \n", 480 | " \n", 481 | " \n", 482 | " \n", 483 | " \n", 484 | " \n", 485 | " \n", 486 | " \n", 487 | " \n", 488 | " \n", 489 | " \n", 490 | " \n", 491 | "
Date        Open      High      Low       Close     Volume     Dividends  Stock Splits
1980-12-12  0.100751  0.101189  0.100751  0.100751  469033600  0.0        0.0
1980-12-15  0.095933  0.095933  0.095495  0.095495  175884800  0.0        0.0
1980-12-16  0.088923  0.088923  0.088485  0.088485  105728000  0.0        0.0
1980-12-17  0.090676  0.091114  0.090676  0.090676  86441600   0.0        0.0
1980-12-18  0.093304  0.093742  0.093304  0.093304  73449600   0.0        0.0
\n", 492 | "
" 493 | ], 494 | "text/plain": [ 495 | " Open High Low Close Volume Dividends \\\n", 496 | "Date \n", 497 | "1980-12-12 0.100751 0.101189 0.100751 0.100751 469033600 0.0 \n", 498 | "1980-12-15 0.095933 0.095933 0.095495 0.095495 175884800 0.0 \n", 499 | "1980-12-16 0.088923 0.088923 0.088485 0.088485 105728000 0.0 \n", 500 | "1980-12-17 0.090676 0.091114 0.090676 0.090676 86441600 0.0 \n", 501 | "1980-12-18 0.093304 0.093742 0.093304 0.093304 73449600 0.0 \n", 502 | "\n", 503 | " Stock Splits \n", 504 | "Date \n", 505 | "1980-12-12 0.0 \n", 506 | "1980-12-15 0.0 \n", 507 | "1980-12-16 0.0 \n", 508 | "1980-12-17 0.0 \n", 509 | "1980-12-18 0.0 " 510 | ] 511 | }, 512 | "execution_count": 20, 513 | "metadata": {}, 514 | "output_type": "execute_result" 515 | } 516 | ], 517 | "source": [ 518 | "historico_apple.head()" 519 | ] 520 | }, 521 | { 522 | "cell_type": "markdown", 523 | "metadata": {}, 524 | "source": [ 525 | "We can also show the first ten records of a single column, for example the share price at the close of each trading day:" 526 | ] 527 | }, 528 | { 529 | "cell_type": "code", 530 | "execution_count": 21, 531 | "metadata": {}, 532 | "outputs": [ 533 | { 534 | "name": "stdout", 535 | "output_type": "stream", 536 | "text": [ 537 | "Date\n", 538 | "1980-12-12 0.100751\n", 539 | "1980-12-15 0.095495\n", 540 | "1980-12-16 0.088485\n", 541 | "1980-12-17 0.090676\n", 542 | "1980-12-18 0.093304\n", 543 | "1980-12-19 0.098999\n", 544 | "1980-12-22 0.103817\n", 545 | "1980-12-23 0.108198\n", 546 | "1980-12-24 0.113892\n", 547 | "1980-12-26 0.124405\n", 548 | "Name: Close, dtype: float64\n" 549 | ] 550 | } 551 | ], 552 | "source": [ 553 | "print(historico_apple[\"Close\"].head(10))" 554 | ] 555 | }, 556 | { 557 | "cell_type": "markdown", 558 | "metadata": {}, 559 | "source": [ 560 | "We can reset the DataFrame's index with the reset_index method, which turns the 'Date' index back into a regular column. 
We also set the inplace parameter to True so that the change is applied to the DataFrame itself.\n" 561 | ] 562 | }, 563 | { 564 | "cell_type": "code", 565 | "execution_count": 22, 566 | "metadata": {}, 567 | "outputs": [], 568 | "source": [ 569 | "historico_apple.reset_index(inplace=True)" 570 | ] 571 | }, 572 | { 573 | "cell_type": "markdown", 574 | "metadata": {}, 575 | "source": [ 576 | "We plot the closing price by date, using the `Close` and `Date` columns:" 577 | ] 578 | }, 579 | { 580 | "cell_type": "code", 581 | "execution_count": 23, 582 | "metadata": {}, 583 | "outputs": [ 584 | { 585 | "data": { 586 | "text/plain": [ 587 | "" 588 | ] 589 | }, 590 | "execution_count": 23, 591 | "metadata": {}, 592 | "output_type": "execute_result" 593 | }, 594 | { 595 | "data": { 596 | "image/png": "\n", 597 | "text/plain": [ 598 | "
" 599 | ] 600 | }, 601 | "metadata": { 602 | "needs_background": "light" 603 | }, 604 | "output_type": "display_data" 605 | } 606 | ], 607 | "source": [ 608 | "historico_apple.plot(x=\"Date\", y=\"Close\")" 609 | ] 610 | }, 611 | { 612 | "cell_type": "markdown", 613 | "metadata": {}, 614 | "source": [ 615 | "## 3 - Using yfinance to extract historical dividend data\n" 616 | ] 617 | }, 618 | { 619 | "cell_type": "markdown", 620 | "metadata": {}, 621 | "source": [ 622 | "Dividends are the distribution of a company's profits to its shareholders. Here they are expressed as the amount of money paid out per share an investor holds. The 'dividends' attribute returns this data as a pandas Series. The period covered is the one defined in the call to \"history\".\n" 623 | ] 624 | }, 625 | { 626 | "cell_type": "code", 627 | "execution_count": 24, 628 | "metadata": {}, 629 | "outputs": [ 630 | { 631 | "data": { 632 | "text/plain": [ 633 | "Date\n", 634 | "1987-05-11 0.000536\n", 635 | "1987-08-10 0.000536\n", 636 | "1987-11-17 0.000714\n", 637 | "1988-02-12 0.000714\n", 638 | "1988-05-16 0.000714\n", 639 | " ... 
\n", 640 | "2020-05-08 0.205000\n", 641 | "2020-08-07 0.205000\n", 642 | "2020-11-06 0.205000\n", 643 | "2021-02-05 0.205000\n", 644 | "2021-05-07 0.220000\n", 645 | "Name: Dividends, Length: 71, dtype: float64" 646 | ] 647 | }, 648 | "execution_count": 24, 649 | "metadata": {}, 650 | "output_type": "execute_result" 651 | } 652 | ], 653 | "source": [ 654 | "apple.dividends" 655 | ] 656 | }, 657 | { 658 | "cell_type": "markdown", 659 | "metadata": {}, 660 | "source": [ 661 | "We can plot the dividends over time:" 662 | ] 663 | }, 664 | { 665 | "cell_type": "code", 666 | "execution_count": 25, 667 | "metadata": {}, 668 | "outputs": [ 669 | { 670 | "data": { 671 | "text/plain": [ 672 | "" 673 | ] 674 | }, 675 | "execution_count": 25, 676 | "metadata": {}, 677 | "output_type": "execute_result" 678 | }, 679 | { 680 | "data": { 681 | "image/png": "\n", 682 | "text/plain": [ 683 | "
" 684 | ] 685 | }, 686 | "metadata": { 687 | "needs_background": "light" 688 | }, 689 | "output_type": "display_data" 690 | } 691 | ], 692 | "source": [ 693 | "apple.dividends.plot()" 694 | ] 695 | }, 696 | { 697 | "cell_type": "markdown", 698 | "metadata": {}, 699 | "source": [ 700 | "## 4- Exercise\n" 701 | ] 702 | }, 703 | { 704 | "cell_type": "markdown", 705 | "metadata": {}, 706 | "source": [ 707 | "Applying everything covered above, we extract the data for a well-known Spanish company, display it, and plot how its share price has evolved since it was first listed on the stock exchange.\n" 708 | ] 709 | }, 710 | { 711 | "cell_type": "code", 712 | "execution_count": 27, 713 | "metadata": {}, 714 | "outputs": [], 715 | "source": [ 716 | "inditex = yf.Ticker(\"ITX.MC\")" 717 | ] 718 | }, 719 | { 720 | "cell_type": "markdown", 721 | "metadata": {}, 722 | "source": [ 723 | "Question 1 Show the company's country" 724 | ] 725 | }, 726 | { 727 | "cell_type": "code", 728 | "execution_count": 29, 729 | "metadata": {}, 730 | "outputs": [ 731 | { 732 | "data": { 733 | "text/plain": [ 734 | "'Spain'" 735 | ] 736 | }, 737 | "execution_count": 29, 738 | "metadata": {}, 739 | "output_type": "execute_result" 740 | } 741 | ], 742 | "source": [ 743 | "inditex_info=inditex.info\n", 744 | "inditex_info['country']" 745 | ] 746 | }, 747 | { 748 | "cell_type": "markdown", 749 | "metadata": {}, 750 | "source": [ 751 | "Question 2 Show the sector the company belongs to\n" 752 | ] 753 | }, 754 | { 755 | "cell_type": "code", 756 | "execution_count": 30, 757 | "metadata": {}, 758 | "outputs": [ 759 | { 760 | "data": { 761 | "text/plain": [ 762 | "'Consumer Cyclical'" 763 | ] 764 | }, 765 | "execution_count": 30, 766 | "metadata": {}, 767 | "output_type": "execute_result" 768 | } 769 | ], 770 | "source": [ 771 | "inditex_info['sector']" 772 | ] 773 | }, 774 | { 775 | "cell_type": "markdown", 776 | "metadata": {}, 777 | "source": [ 778 | "Question 3 Extract the price history since the company began 
trading on the stock exchange and plot it" 779 | ] 780 | }, 781 | { 782 | "cell_type": "code", 783 | "execution_count": 33, 784 | "metadata": {}, 785 | "outputs": [ 786 | { 787 | "name": "stdout", 788 | "output_type": "stream", 789 | "text": [ 790 | " Open High Low Close Volume Dividends \\\n", 791 | "Date \n", 792 | "2001-05-24 -0.137869 -0.138635 -0.134423 -0.138176 216270100 0.0 \n", 793 | "2001-05-25 -0.137869 -0.140780 -0.137103 -0.137946 50448300 0.0 \n", 794 | "2001-05-28 -0.136337 -0.138022 -0.135725 -0.137103 26118945 0.0 \n", 795 | "2001-05-29 -0.136414 -0.138865 -0.136414 -0.138405 26910070 0.0 \n", 796 | "2001-05-30 -0.138099 -0.139708 -0.137946 -0.138635 48229995 0.0 \n", 797 | "... ... ... ... ... ... ... \n", 798 | "2021-07-30 28.500000 28.719999 28.270000 28.590000 2524564 0.0 \n", 799 | "2021-08-02 28.809999 29.530001 28.750000 29.100000 13035296 0.0 \n", 800 | "2021-08-03 29.129999 29.290001 28.730000 28.870001 1425722 0.0 \n", 801 | "2021-08-04 29.040001 29.049999 28.580000 28.650000 1260747 0.0 \n", 802 | "2021-08-05 28.690001 28.879999 28.549999 28.780001 1035664 0.0 \n", 803 | "\n", 804 | " Stock Splits \n", 805 | "Date \n", 806 | "2001-05-24 0.0 \n", 807 | "2001-05-25 0.0 \n", 808 | "2001-05-28 0.0 \n", 809 | "2001-05-29 0.0 \n", 810 | "2001-05-30 0.0 \n", 811 | "... ... 
\n", 812 | "2021-07-30 0.0 \n", 813 | "2021-08-02 0.0 \n", 814 | "2021-08-03 0.0 \n", 815 | "2021-08-04 0.0 \n", 816 | "2021-08-05 0.0 \n", 817 | "\n", 818 | "[5137 rows x 7 columns]\n" 819 | ] 820 | } 821 | ], 822 | "source": [ 823 | "historico_inditex=inditex.history(period=\"max\")\n", 824 | "print(historico_inditex)" 825 | ] 826 | }, 827 | { 828 | "cell_type": "code", 829 | "execution_count": 35, 830 | "metadata": {}, 831 | "outputs": [], 832 | "source": [ 833 | "historico_inditex.reset_index(inplace=True)" 834 | ] 835 | }, 836 | { 837 | "cell_type": "code", 838 | "execution_count": 36, 839 | "metadata": {}, 840 | "outputs": [ 841 | { 842 | "data": { 843 | "text/plain": [ 844 | "" 845 | ] 846 | }, 847 | "execution_count": 36, 848 | "metadata": {}, 849 | "output_type": "execute_result" 850 | }, 851 | { 852 | "data": { 853 | "image/png": "\n", 854 | "text/plain": [ 855 | "
" 856 | ] 857 | }, 858 | "metadata": { 859 | "needs_background": "light" 860 | }, 861 | "output_type": "display_data" 862 | } 863 | ], 864 | "source": [ 865 | "historico_inditex.plot(x=\"Date\", y=\"Close\")" 866 | ] 867 | }, 868 | { 869 | "cell_type": "code", 870 | "execution_count": null, 871 | "metadata": {}, 872 | "outputs": [], 873 | "source": [] 874 | } 875 | ], 876 | "metadata": { 877 | "kernelspec": { 878 | "display_name": "Python 3", 879 | "language": "python", 880 | "name": "python3" 881 | }, 882 | "language_info": { 883 | "codemirror_mode": { 884 | "name": "ipython", 885 | "version": 3 886 | }, 887 | "file_extension": ".py", 888 | "mimetype": "text/x-python", 889 | "name": "python", 890 | "nbconvert_exporter": "python", 891 | "pygments_lexer": "ipython3", 892 | "version": "3.8.5" 893 | } 894 | }, 895 | "nbformat": 4, 896 | "nbformat_minor": 4 897 | } 898 | -------------------------------------------------------------------------------- /.ipynb_checkpoints/3. Tarea de extracción de datos mediante Webscraping-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "

Extracting stock data using web scraping

\n" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "Not all stock data is available through an API; in this assignment we use web scraping to obtain the financial data. Using 'BeautifulSoup' and 'requests', we extract historical stock data from a web page." 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "
\n", 22 | "
    \n", 23 | "
  • Downloading the web page with the requests library
  • \n", 24 | "
  • Parsing the web page's HTML with BeautifulSoup
  • \n", 25 | "
  • Extracting the data and building a DataFrame
  • \n", 26 | "
\n", 27 | "
\n", 28 | "\n", 29 | "
\n" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": 1, 35 | "metadata": {}, 36 | "outputs": [ 37 | { 38 | "name": "stdout", 39 | "output_type": "stream", 40 | "text": [ 41 | "Requirement already satisfied: bs4 in c:\\users\\cyber\\anaconda3\\lib\\site-packages (0.0.1)\n", 42 | "Requirement already satisfied: beautifulsoup4 in c:\\users\\cyber\\anaconda3\\lib\\site-packages (from bs4) (4.9.3)\n", 43 | "Requirement already satisfied: soupsieve>1.2; python_version >= \"3.0\" in c:\\users\\cyber\\anaconda3\\lib\\site-packages (from beautifulsoup4->bs4) (2.0.1)\n", 44 | "Collecting plotly\n", 45 | " Downloading plotly-4.14.3-py2.py3-none-any.whl (13.2 MB)\n", 46 | "Collecting retrying>=1.3.3\n", 47 | " Downloading retrying-1.3.3.tar.gz (10 kB)\n", 48 | "Requirement already satisfied: six in c:\\users\\cyber\\appdata\\roaming\\python\\python38\\site-packages (from plotly) (1.14.0)\n", 49 | "Building wheels for collected packages: retrying\n", 50 | " Building wheel for retrying (setup.py): started\n", 51 | " Building wheel for retrying (setup.py): finished with status 'done'\n", 52 | " Created wheel for retrying: filename=retrying-1.3.3-py3-none-any.whl size=11434 sha256=03a9cb7700431c3918069a7139163cc15327df3edfcc702354a4f00f13c30489\n", 53 | " Stored in directory: c:\\users\\cyber\\appdata\\local\\pip\\cache\\wheels\\c4\\a7\\48\\0a434133f6d56e878ca511c0e6c38326907c0792f67b476e56\n", 54 | "Successfully built retrying\n", 55 | "Installing collected packages: retrying, plotly\n", 56 | "Successfully installed plotly-4.14.3 retrying-1.3.3\n" 57 | ] 58 | } 59 | ], 60 | "source": [ 61 | "!pip install pandas\n", 62 | "!pip install requests\n", 63 | "!pip install bs4\n", 64 | "!pip install plotly" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": 2, 70 | "metadata": {}, 71 | "outputs": [], 72 | "source": [ 73 | "import pandas as pd\n", 74 | "import requests\n", 75 | "from bs4 import BeautifulSoup" 76 | ] 77 | }, 78 | { 79 | 
"cell_type": "markdown", 80 | "metadata": {}, 81 | "source": [ 82 | "## Extracting stock data with web scraping\n" 83 | ] 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "metadata": {}, 88 | "source": [ 89 | "Using the `requests` library, we download the web page [https://finance.yahoo.com/quote/AMZN/history?period1=1451606400&period2=1612137600&interval=1mo&filter=history&frequency=1mo&includeAdjustedClose=true](https://finance.yahoo.com/quote/AMZN/history?period1=1451606400&period2=1612137600&interval=1mo&filter=history&frequency=1mo&includeAdjustedClose=true). We save the response text in a variable named `html_data`.\n" 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": 3, 95 | "metadata": {}, 96 | "outputs": [], 97 | "source": [ 98 | "url=\"https://finance.yahoo.com/quote/AMZN/history?period1=1451606400&period2=1612137600&interval=1mo&filter=history&frequency=1mo&includeAdjustedClose=true\"\n", 99 | "html_data=requests.get(url).text" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": {}, 105 | "source": [ 106 | "We parse the HTML data with `BeautifulSoup`.\n" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": 4, 112 | "metadata": {}, 113 | "outputs": [], 114 | "source": [ 115 | "soup = BeautifulSoup(html_data,\"html5lib\")" 116 | ] 117 | }, 118 | { 119 | "cell_type": "markdown", 120 | "metadata": {}, 121 | "source": [ 122 | "To get the contents of the `title` tag:\n" 123 | ] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "execution_count": 6, 128 | "metadata": {}, 129 | "outputs": [ 130 | { 131 | "data": { 132 | "text/plain": [ 133 | "Amazon.com, Inc. 
(AMZN) Stock Historical Prices & Data - Yahoo Finance" 134 | ] 135 | }, 136 | "execution_count": 6, 137 | "metadata": {}, 138 | "output_type": "execute_result" 139 | } 140 | ], 141 | "source": [ 142 | "soup.title" 143 | ] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "metadata": {}, 148 | "source": [ 149 | "Using BeautifulSoup, we extract the table of historical stock prices and store it in a DataFrame named `amazon_data`. The DataFrame should have the columns Date, Open, High, Low, Close, Adj Close and Volume.\n" 150 | ] 151 | }, 152 | { 153 | "cell_type": "markdown", 154 | "metadata": {}, 155 | "source": [ 156 | "### Building the DataFrame" 157 | ] 158 | }, 159 | { 160 | "cell_type": "code", 161 | "execution_count": 9, 162 | "metadata": {}, 163 | "outputs": [ 164 | { 165 | "data": { 166 | "text/html": [ 167 | "
\n", 308 | "
" 309 | ], 310 | "text/plain": [ 311 | " Date Open High Low Close Volume \\\n", 312 | "0 Jan 01, 2021 3,270.00 3,363.89 3,086.00 3,206.20 71,528,900 \n", 313 | "1 Dec 01, 2020 3,188.50 3,350.65 3,072.82 3,256.93 77,567,800 \n", 314 | "2 Nov 01, 2020 3,061.74 3,366.80 2,950.12 3,168.04 90,810,500 \n", 315 | "3 Oct 01, 2020 3,208.00 3,496.24 3,019.00 3,036.15 116,242,300 \n", 316 | "4 Sep 01, 2020 3,489.58 3,552.25 2,871.00 3,148.73 115,943,500 \n", 317 | ".. ... ... ... ... ... ... \n", 318 | "56 May 01, 2016 663.92 724.23 656.00 722.79 90,614,500 \n", 319 | "57 Apr 01, 2016 590.49 669.98 585.25 659.59 78,464,200 \n", 320 | "58 Mar 01, 2016 556.29 603.24 538.58 593.64 94,009,500 \n", 321 | "59 Feb 01, 2016 578.15 581.80 474.00 552.52 124,144,800 \n", 322 | "60 Jan 01, 2016 656.29 657.72 547.18 587.00 130,200,900 \n", 323 | "\n", 324 | " Adj Close \n", 325 | "0 3,206.20 \n", 326 | "1 3,256.93 \n", 327 | "2 3,168.04 \n", 328 | "3 3,036.15 \n", 329 | "4 3,148.73 \n", 330 | ".. ... 
\n", 331 | "56 722.79 \n", 332 | "57 659.59 \n", 333 | "58 593.64 \n", 334 | "59 552.52 \n", 335 | "60 587.00 \n", 336 | "\n", 337 | "[61 rows x 7 columns]" 338 | ] 339 | }, 340 | "execution_count": 9, 341 | "metadata": {}, 342 | "output_type": "execute_result" 343 | } 344 | ], 345 | "source": [ 346 | "# Collect one dict per table row, then build the DataFrame in a single step\n", 347 | "rows = []\n", 348 | "for row in soup.find(\"tbody\").find_all(\"tr\"):\n", 349 | "    col = row.find_all(\"td\")\n", 350 | "    date = col[0].text\n", 351 | "    Open = col[1].text\n", 352 | "    high = col[2].text\n", 353 | "    low = col[3].text\n", 354 | "    close = col[4].text\n", 355 | "    adj_close = col[5].text\n", 356 | "    volume = col[6].text\n", 357 | "    # DataFrame.append was removed in pandas 2.0, so rows are accumulated in a list\n", 358 | "    rows.append({\"Date\":date, \"Open\":Open, \"High\":high, \"Low\":low, \"Close\":close, \"Adj Close\":adj_close, \"Volume\":volume})\n", 359 | "amazon_data = pd.DataFrame(rows, columns=[\"Date\", \"Open\", \"High\", \"Low\", \"Close\", \"Volume\", \"Adj Close\"])\n", 360 | "amazon_data" 361 | ] 362 | }, 363 | { 364 | 
"cell_type": "markdown", 365 | "metadata": {}, 366 | "source": [ 367 | "We print the first five rows of the `amazon_data` DataFrame.\n" 368 | ] 369 | }, 370 | { 371 | "cell_type": "code", 372 | "execution_count": 18, 373 | "metadata": {}, 374 | "outputs": [ 375 | { 376 | "data": { 377 | "text/html": [ 378 | "
\n", 458 | "
" 459 | ], 460 | "text/plain": [ 461 | " Date Open High Low Close Volume Adj Close\n", 462 | "0 Jan 01, 2021 3,270.00 3,363.89 3,086.00 3,206.20 71,528,900 3,206.20\n", 463 | "1 Dec 01, 2020 3,188.50 3,350.65 3,072.82 3,256.93 77,567,800 3,256.93\n", 464 | "2 Nov 01, 2020 3,061.74 3,366.80 2,950.12 3,168.04 90,810,500 3,168.04\n", 465 | "3 Oct 01, 2020 3,208.00 3,496.24 3,019.00 3,036.15 116,242,300 3,036.15\n", 466 | "4 Sep 01, 2020 3,489.58 3,552.25 2,871.00 3,148.73 115,943,500 3,148.73" 467 | ] 468 | }, 469 | "execution_count": 18, 470 | "metadata": {}, 471 | "output_type": "execute_result" 472 | } 473 | ], 474 | "source": [ 475 | "amazon_data.head(5)" 476 | ] 477 | }, 478 | { 479 | "cell_type": "markdown", 480 | "metadata": {}, 481 | "source": [ 482 | "To get the column names of the DataFrame:\n" 483 | ] 484 | }, 485 | { 486 | "cell_type": "code", 487 | "execution_count": 20, 488 | "metadata": {}, 489 | "outputs": [ 490 | { 491 | "data": { 492 | "text/plain": [ 493 | "array(['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'],\n", 494 | " dtype=object)" 495 | ] 496 | }, 497 | "execution_count": 20, 498 | "metadata": {}, 499 | "output_type": "execute_result" 500 | } 501 | ], 502 | "source": [ 503 | "amazon_data.columns.values" 504 | ] 505 | }, 506 | { 507 | "cell_type": "markdown", 508 | "metadata": {}, 509 | "source": [ 510 | "To get the stock's opening price for June 1, 2019, we filter the DataFrame on the `Date` column: " 511 | ] 512 | }, 513 | { 514 | "cell_type": "code", 515 | "execution_count": 57, 516 | "metadata": {}, 517 | "outputs": [ 518 | { 519 | "data": { 520 | "text/plain": [ 521 | "19 1,760.01\n", 522 | "Name: Open, dtype: object" 523 | ] 524 | }, 525 | "execution_count": 57, 526 | "metadata": {}, 527 | "output_type": "execute_result" 528 | } 529 | ], 530 | "source": [ 531 | "amazon_data.loc[(amazon_data.Date=='Jun 01, 2019'),'Open']" 532 | ] 533 | } 534 | ], 535 | "metadata": { 536 | "kernelspec": { 537 | 
"display_name": "Python 3", 538 | "language": "python", 539 | "name": "python3" 540 | }, 541 | "language_info": { 542 | "codemirror_mode": { 543 | "name": "ipython", 544 | "version": 3 545 | }, 546 | "file_extension": ".py", 547 | "mimetype": "text/x-python", 548 | "name": "python", 549 | "nbconvert_exporter": "python", 550 | "pygments_lexer": "ipython3", 551 | "version": "3.8.5" 552 | } 553 | }, 554 | "nbformat": 4, 555 | "nbformat_minor": 4 556 | } 557 | -------------------------------------------------------------------------------- /1. Práctica en el uso de la librería yfinance.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "

Extracting stock data with Python's 'yfinance' library

\n" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "Suppose we work for an investment fund manager; our job is to identify any stock trading activity so that the extracted data can be analyzed later. In this lab we extract stock data with the yfinance library, which lets us pull stock data and information from Yahoo Finance.\n" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "

Contents

\n", 22 | "
\n", 23 | "
    \n", 24 | "
  • 1 - Using yfinance to extract stock information
  • \n", 25 | "
  • 2 - Using yfinance to extract historical stock price data
  • \n", 26 | "
  • 3 - Using yfinance to extract historical dividend data
  • \n", 27 | "
  • 4 - Exercise
  • \n", 28 | "
\n", 29 | "
\n", 30 | "\n", 31 | "
\n" 32 | ] 33 | }, 34 | { 35 | "cell_type": "markdown", 36 | "metadata": {}, 37 | "source": [ 38 | "### Installing the yfinance library\n", 39 | "This library downloads historical financial and market data from Yahoo Finance and gives simple access to a wide range of data. More information at https://pypi.org/project/yfinance/" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 2, 45 | "metadata": {}, 46 | "outputs": [ 47 | { 48 | "name": "stdout", 49 | "output_type": "stream", 50 | "text": [ 51 | "Collecting yfinance\n", 52 | " Downloading yfinance-0.1.63.tar.gz (26 kB)\n", 53 | "Requirement already satisfied: pandas>=0.24 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from yfinance) (1.1.3)\n", 54 | "Requirement already satisfied: numpy>=1.15 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from yfinance) (1.19.2)\n", 55 | "Requirement already satisfied: requests>=2.20 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from yfinance) (2.24.0)\n", 56 | "Collecting multitasking>=0.0.7\n", 57 | " Downloading multitasking-0.0.9.tar.gz (8.1 kB)\n", 58 | "Requirement already satisfied: lxml>=4.5.1 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from yfinance) (4.6.1)\n", 59 | "Requirement already satisfied: python-dateutil>=2.7.3 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from pandas>=0.24->yfinance) (2.8.1)\n", 60 | "Requirement already satisfied: pytz>=2017.2 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from pandas>=0.24->yfinance) (2020.1)\n", 61 | "Requirement already satisfied: chardet<4,>=3.0.2 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from requests>=2.20->yfinance) (3.0.4)\n", 62 | "Requirement already satisfied: certifi>=2017.4.17 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from requests>=2.20->yfinance) (2020.6.20)\n", 63 | "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in 
c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from requests>=2.20->yfinance) (1.25.11)\n", 64 | "Requirement already satisfied: idna<3,>=2.5 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from requests>=2.20->yfinance) (2.10)\n", 65 | "Requirement already satisfied: six>=1.5 in c:\\users\\mrsanchez\\anaconda3\\lib\\site-packages (from python-dateutil>=2.7.3->pandas>=0.24->yfinance) (1.15.0)\n", 66 | "Building wheels for collected packages: yfinance, multitasking\n", 67 | " Building wheel for yfinance (setup.py): started\n", 68 | " Building wheel for yfinance (setup.py): finished with status 'done'\n", 69 | " Created wheel for yfinance: filename=yfinance-0.1.63-py2.py3-none-any.whl size=23914 sha256=00cfb7ca5adf87a773f2043f7c775b8cbd1f46fd7a909510934db16be79d6a1d\n", 70 | " Stored in directory: c:\\users\\mrsanchez\\appdata\\local\\pip\\cache\\wheels\\ec\\cc\\c1\\32da8ee853d742d5d7cbd11ee04421222eb354672020b57297\n", 71 | " Building wheel for multitasking (setup.py): started\n", 72 | " Building wheel for multitasking (setup.py): finished with status 'done'\n", 73 | " Created wheel for multitasking: filename=multitasking-0.0.9-py3-none-any.whl size=8372 sha256=6710745adb7b243b00c3493ad96e60e86be0d4c6e5ca2e27177fb014bb109092\n", 74 | " Stored in directory: c:\\users\\mrsanchez\\appdata\\local\\pip\\cache\\wheels\\57\\6d\\a3\\a39b839cc75274d2acfb1c58bfead2f726c6577fe8c4723f13\n", 75 | "Successfully built yfinance multitasking\n", 76 | "Installing collected packages: multitasking, yfinance\n", 77 | "Successfully installed multitasking-0.0.9 yfinance-0.1.63\n" 78 | ] 79 | } 80 | ], 81 | "source": [ 82 | "!pip install yfinance\n", 83 | "#!pip install pandas" 84 | ] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "metadata": {}, 89 | "source": [ 90 | "### Importing the \"yfinance\" and \"pandas\" libraries" 91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": 3, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "import 
yfinance as yf\n", 100 | "import pandas as pd" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "metadata": {}, 106 | "source": [ 107 | "## 1 - Using yfinance to extract stock information\n" 108 | ] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": {}, 113 | "source": [ 114 | "With the `Ticker` class we can retrieve market information by passing the ticker symbol that identifies the company on the stock exchange. As an example, we look up Apple, whose ticker symbol is AAPL.\n" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": 4, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "apple = yf.Ticker(\"AAPL\")" 124 | ] 125 | }, 126 | { 127 | "cell_type": "markdown", 128 | "metadata": {}, 129 | "source": [ 130 | "### Stock information\n", 131 | "Using the `info` attribute, we can get information about Apple's stock as a Python dictionary." 132 | ] 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": 6, 137 | "metadata": {}, 138 | "outputs": [ 139 | { 140 | "data": { 141 | "text/plain": [ 142 | "{'zip': '95014',\n", 143 | " 'sector': 'Technology',\n", 144 | " 'fullTimeEmployees': 147000,\n", 145 | " 'longBusinessSummary': 'Apple Inc. designs, manufactures, and markets smartphones, personal computers, tablets, wearables, and accessories worldwide. It also sells various related services. The company offers iPhone, a line of smartphones; Mac, a line of personal computers; iPad, a line of multi-purpose tablets; and wearables, home, and accessories comprising AirPods, Apple TV, Apple Watch, Beats products, HomePod, iPod touch, and other Apple-branded and third-party accessories. 
It also provides AppleCare support services; cloud services store services; and operates various platforms, including the App Store, that allow customers to discover and download applications and digital content, such as books, music, video, games, and podcasts. In addition, the company offers various services, such as Apple Arcade, a game subscription service; Apple Music, which offers users a curated listening experience with on-demand radio stations; Apple News+, a subscription news and magazine service; Apple TV+, which offers exclusive original content; Apple Card, a co-branded credit card; and Apple Pay, a cashless payment service, as well as licenses its intellectual property. The company serves consumers, and small and mid-sized businesses; and the education, enterprise, and government markets. It sells and delivers third-party applications for its products through the App Store. The company also sells its products through its retail and online stores, and direct sales force; and third-party cellular network carriers, wholesalers, retailers, and resellers. Apple Inc. 
was founded in 1977 and is headquartered in Cupertino, California.',\n", 146 | " 'city': 'Cupertino',\n", 147 | " 'phone': '408-996-1010',\n", 148 | " 'state': 'CA',\n", 149 | " 'country': 'United States',\n", 150 | " 'companyOfficers': [],\n", 151 | " 'website': 'http://www.apple.com',\n", 152 | " 'maxAge': 1,\n", 153 | " 'address1': 'One Apple Park Way',\n", 154 | " 'industry': 'Consumer Electronics',\n", 155 | " 'ebitdaMargins': 0.31955,\n", 156 | " 'profitMargins': 0.25004,\n", 157 | " 'grossMargins': 0.41005,\n", 158 | " 'operatingCashflow': 104414003200,\n", 159 | " 'revenueGrowth': 0.364,\n", 160 | " 'operatingMargins': 0.28788,\n", 161 | " 'ebitda': 110934999040,\n", 162 | " 'targetLowPrice': 125,\n", 163 | " 'recommendationKey': 'buy',\n", 164 | " 'grossProfits': 104956000000,\n", 165 | " 'freeCashflow': 80625876992,\n", 166 | " 'targetMedianPrice': 160,\n", 167 | " 'currentPrice': 147.2,\n", 168 | " 'earningsGrowth': 1,\n", 169 | " 'currentRatio': 1.062,\n", 170 | " 'returnOnAssets': 0.19302,\n", 171 | " 'numberOfAnalystOpinions': 39,\n", 172 | " 'targetMeanPrice': 159.34,\n", 173 | " 'debtToEquity': 210.782,\n", 174 | " 'returnOnEquity': 1.27125,\n", 175 | " 'targetHighPrice': 185,\n", 176 | " 'totalCash': 61696000000,\n", 177 | " 'totalDebt': 135491002368,\n", 178 | " 'totalRevenue': 347155005440,\n", 179 | " 'totalCashPerShare': 3.732,\n", 180 | " 'financialCurrency': 'USD',\n", 181 | " 'revenuePerShare': 20.61,\n", 182 | " 'quickRatio': 0.887,\n", 183 | " 'recommendationMean': 2,\n", 184 | " 'exchange': 'NMS',\n", 185 | " 'shortName': 'Apple Inc.',\n", 186 | " 'longName': 'Apple Inc.',\n", 187 | " 'exchangeTimezoneName': 'America/New_York',\n", 188 | " 'exchangeTimezoneShortName': 'EDT',\n", 189 | " 'isEsgPopulated': False,\n", 190 | " 'gmtOffSetMilliseconds': '-14400000',\n", 191 | " 'quoteType': 'EQUITY',\n", 192 | " 'symbol': 'AAPL',\n", 193 | " 'messageBoardId': 'finmb_24937',\n", 194 | " 'market': 'us_market',\n", 195 | " 
'annualHoldingsTurnover': None,\n", 196 | " 'enterpriseToRevenue': 7.21,\n", 197 | " 'beta3Year': None,\n", 198 | " 'enterpriseToEbitda': 22.562,\n", 199 | " '52WeekChange': 0.29013848,\n", 200 | " 'morningStarRiskRating': None,\n", 201 | " 'forwardEps': 5.34,\n", 202 | " 'revenueQuarterlyGrowth': None,\n", 203 | " 'sharesOutstanding': 16530199552,\n", 204 | " 'fundInceptionDate': None,\n", 205 | " 'annualReportExpenseRatio': None,\n", 206 | " 'totalAssets': None,\n", 207 | " 'bookValue': 3.882,\n", 208 | " 'sharesShort': 96355309,\n", 209 | " 'sharesPercentSharesOut': 0.0058,\n", 210 | " 'fundFamily': None,\n", 211 | " 'lastFiscalYearEnd': 1601078400,\n", 212 | " 'heldPercentInstitutions': 0.59102,\n", 213 | " 'netIncomeToCommon': 86801997824,\n", 214 | " 'trailingEps': 5.108,\n", 215 | " 'lastDividendValue': 0.22,\n", 216 | " 'SandP52WeekChange': 0.31455648,\n", 217 | " 'priceToBook': 37.9186,\n", 218 | " 'heldPercentInsiders': 0.00068999996,\n", 219 | " 'nextFiscalYearEnd': 1664150400,\n", 220 | " 'yield': None,\n", 221 | " 'mostRecentQuarter': 1624665600,\n", 222 | " 'shortRatio': 1.14,\n", 223 | " 'sharesShortPreviousMonthDate': 1623715200,\n", 224 | " 'floatShares': 16513139929,\n", 225 | " 'beta': 1.202797,\n", 226 | " 'enterpriseValue': 2502902939648,\n", 227 | " 'priceHint': 2,\n", 228 | " 'threeYearAverageReturn': None,\n", 229 | " 'lastSplitDate': 1598832000,\n", 230 | " 'lastSplitFactor': '4:1',\n", 231 | " 'legalType': None,\n", 232 | " 'lastDividendDate': 1620345600,\n", 233 | " 'morningStarOverallRating': None,\n", 234 | " 'earningsQuarterlyGrowth': 0.932,\n", 235 | " 'priceToSalesTrailing12Months': 7.0091033,\n", 236 | " 'dateShortInterest': 1626307200,\n", 237 | " 'pegRatio': 1.53,\n", 238 | " 'ytdReturn': None,\n", 239 | " 'forwardPE': 27.565542,\n", 240 | " 'lastCapGain': None,\n", 241 | " 'shortPercentOfFloat': 0.0058,\n", 242 | " 'sharesShortPriorMonth': 108937943,\n", 243 | " 'impliedSharesOutstanding': None,\n", 244 | " 'category': None,\n", 
245 | " 'fiveYearAverageReturn': None,\n", 246 | " 'previousClose': 146.95,\n", 247 | " 'regularMarketOpen': 146.98,\n", 248 | " 'twoHundredDayAverage': 131.78065,\n", 249 | " 'trailingAnnualDividendYield': 0.005682205,\n", 250 | " 'payoutRatio': 0.16309999,\n", 251 | " 'volume24Hr': None,\n", 252 | " 'regularMarketDayHigh': 147.84,\n", 253 | " 'navPrice': None,\n", 254 | " 'averageDailyVolume10Day': 75906475,\n", 255 | " 'regularMarketPreviousClose': 146.95,\n", 256 | " 'fiftyDayAverage': 141.56372,\n", 257 | " 'trailingAnnualDividendRate': 0.835,\n", 258 | " 'open': 146.98,\n", 259 | " 'toCurrency': None,\n", 260 | " 'averageVolume10days': 75906475,\n", 261 | " 'expireDate': None,\n", 262 | " 'algorithm': None,\n", 263 | " 'dividendRate': 0.88,\n", 264 | " 'exDividendDate': 1628208000,\n", 265 | " 'circulatingSupply': None,\n", 266 | " 'startDate': None,\n", 267 | " 'regularMarketDayLow': 146.17,\n", 268 | " 'currency': 'USD',\n", 269 | " 'trailingPE': 28.817541,\n", 270 | " 'regularMarketVolume': 29498682,\n", 271 | " 'lastMarket': None,\n", 272 | " 'maxSupply': None,\n", 273 | " 'openInterest': None,\n", 274 | " 'marketCap': 2433245249536,\n", 275 | " 'volumeAllCurrencies': None,\n", 276 | " 'strikePrice': None,\n", 277 | " 'averageVolume': 81387312,\n", 278 | " 'dayLow': 146.17,\n", 279 | " 'ask': 147.21,\n", 280 | " 'askSize': 1400,\n", 281 | " 'volume': 29498682,\n", 282 | " 'fiftyTwoWeekHigh': 150,\n", 283 | " 'fromCurrency': None,\n", 284 | " 'fiveYearAvgDividendYield': 1.29,\n", 285 | " 'fiftyTwoWeekLow': 103.1,\n", 286 | " 'bid': 147.2,\n", 287 | " 'tradeable': False,\n", 288 | " 'dividendYield': 0.006,\n", 289 | " 'bidSize': 1300,\n", 290 | " 'dayHigh': 147.84,\n", 291 | " 'regularMarketPrice': 147.2,\n", 292 | " 'logo_url': 'https://logo.clearbit.com/apple.com'}" 293 | ] 294 | }, 295 | "execution_count": 6, 296 | "metadata": {}, 297 | "output_type": "execute_result" 298 | } 299 | ], 300 | "source": [ 301 | "apple_info=apple.info\n", 302 | "apple_info 
#show the contents of the variable" 303 | ] 304 | }, 305 | { 306 | "cell_type": "markdown", 307 | "metadata": {}, 308 | "source": [ 309 | "Since we now have a dictionary, we can look up values by key. For example, we get the country with the key 'country'.\n" 310 | ] 311 | }, 312 | { 313 | "cell_type": "code", 314 | "execution_count": 7, 315 | "metadata": {}, 316 | "outputs": [ 317 | { 318 | "data": { 319 | "text/plain": [ 320 | "'United States'" 321 | ] 322 | }, 323 | "execution_count": 7, 324 | "metadata": {}, 325 | "output_type": "execute_result" 326 | } 327 | ], 328 | "source": [ 329 | "apple_info['country']" 330 | ] 331 | }, 332 | { 333 | "cell_type": "markdown", 334 | "metadata": {}, 335 | "source": [ 336 | "## 2 - Using yfinance to extract historical stock price data\n", 337 | "To analyze a stock we need the historical evolution of its main indicators (open, high, low, closing price, etc.). For this we use the `history()` method, passing the period we want as a parameter. The period options are 1 day (1d), 5d, 1 month (1mo), 3mo, 6mo, 1 year (1y), 2y, 5y, 10y, ytd and max; the last one (max) returns the full history since the company was first listed." 338 | ] 339 | }, 340 | { 341 | "cell_type": "code", 342 | "execution_count": 19, 343 | "metadata": {}, 344 | "outputs": [ 345 | { 346 | "name": "stdout", 347 | "output_type": "stream", 348 | "text": [ 349 | " Open High Low Close Volume \\\n", 350 | "Date \n", 351 | "1980-12-12 0.100751 0.101189 0.100751 0.100751 469033600 \n", 352 | "1980-12-15 0.095933 0.095933 0.095495 0.095495 175884800 \n", 353 | "1980-12-16 0.088923 0.088923 0.088485 0.088485 105728000 \n", 354 | "1980-12-17 0.090676 0.091114 0.090676 0.090676 86441600 \n", 355 | "1980-12-18 0.093304 0.093742 0.093304 0.093304 73449600 \n", 356 | "... ... ... ... ... ... 
\n", 357 | "2021-07-30 144.380005 146.330002 144.110001 145.860001 70382000 \n", 358 | "2021-08-02 146.360001 146.949997 145.250000 145.520004 62880000 \n", 359 | "2021-08-03 145.809998 148.039993 145.179993 147.360001 64660800 \n", 360 | "2021-08-04 147.270004 147.789993 146.279999 146.949997 56319800 \n", 361 | "2021-08-05 146.979996 147.839996 146.169998 147.301498 31752985 \n", 362 | "\n", 363 | " Dividends Stock Splits \n", 364 | "Date \n", 365 | "1980-12-12 0.0 0.0 \n", 366 | "1980-12-15 0.0 0.0 \n", 367 | "1980-12-16 0.0 0.0 \n", 368 | "1980-12-17 0.0 0.0 \n", 369 | "1980-12-18 0.0 0.0 \n", 370 | "... ... ... \n", 371 | "2021-07-30 0.0 0.0 \n", 372 | "2021-08-02 0.0 0.0 \n", 373 | "2021-08-03 0.0 0.0 \n", 374 | "2021-08-04 0.0 0.0 \n", 375 | "2021-08-05 0.0 0.0 \n", 376 | "\n", 377 | "[10249 rows x 7 columns]\n" 378 | ] 379 | } 380 | ], 381 | "source": [ 382 | "historico_apple = apple.history(period=\"max\")\n", 383 | "print(historico_apple)" 384 | ] 385 | }, 386 | { 387 | "cell_type": "markdown", 388 | "metadata": {}, 389 | "source": [ 390 | "The data is returned as a pandas DataFrame with the date ('Date') as the index and the columns 'Open', 'High', 'Low', 'Close', 'Volume', 'Dividends' and 'Stock Splits' for each day.\n", 391 | "The `head()` and `tail()` methods show the first or the last rows, respectively. Both take the number of rows to display as a parameter; if it is omitted, five rows are shown." 392 | ] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "execution_count": 20, 397 | "metadata": {}, 398 | "outputs": [ 399 | { 400 | "data": { 401 | "text/html": [ 402 | "
\n", 492 | "
" 493 | ], 494 | "text/plain": [ 495 | " Open High Low Close Volume Dividends \\\n", 496 | "Date \n", 497 | "1980-12-12 0.100751 0.101189 0.100751 0.100751 469033600 0.0 \n", 498 | "1980-12-15 0.095933 0.095933 0.095495 0.095495 175884800 0.0 \n", 499 | "1980-12-16 0.088923 0.088923 0.088485 0.088485 105728000 0.0 \n", 500 | "1980-12-17 0.090676 0.091114 0.090676 0.090676 86441600 0.0 \n", 501 | "1980-12-18 0.093304 0.093742 0.093304 0.093304 73449600 0.0 \n", 502 | "\n", 503 | " Stock Splits \n", 504 | "Date \n", 505 | "1980-12-12 0.0 \n", 506 | "1980-12-15 0.0 \n", 507 | "1980-12-16 0.0 \n", 508 | "1980-12-17 0.0 \n", 509 | "1980-12-18 0.0 " 510 | ] 511 | }, 512 | "execution_count": 20, 513 | "metadata": {}, 514 | "output_type": "execute_result" 515 | } 516 | ], 517 | "source": [ 518 | "historico_apple.head()" 519 | ] 520 | }, 521 | { 522 | "cell_type": "markdown", 523 | "metadata": {}, 524 | "source": [ 525 | "To show the first ten records of a single column, for example the share price at the close of each trading day:" 526 | ] 527 | }, 528 | { 529 | "cell_type": "code", 530 | "execution_count": 21, 531 | "metadata": {}, 532 | "outputs": [ 533 | { 534 | "name": "stdout", 535 | "output_type": "stream", 536 | "text": [ 537 | "Date\n", 538 | "1980-12-12 0.100751\n", 539 | "1980-12-15 0.095495\n", 540 | "1980-12-16 0.088485\n", 541 | "1980-12-17 0.090676\n", 542 | "1980-12-18 0.093304\n", 543 | "1980-12-19 0.098999\n", 544 | "1980-12-22 0.103817\n", 545 | "1980-12-23 0.108198\n", 546 | "1980-12-24 0.113892\n", 547 | "1980-12-26 0.124405\n", 548 | "Name: Close, dtype: float64\n" 549 | ] 550 | } 551 | ], 552 | "source": [ 553 | "print(historico_apple[\"Close\"].head(10))" 554 | ] 555 | }, 556 | { 557 | "cell_type": "markdown", 558 | "metadata": {}, 559 | "source": [ 560 | "We can reset the DataFrame's index with the `reset_index` method. 
También establecemos el parámetro inplace en True para que el cambio tenga lugar en el propio DataFrame.\n" 561 | ] 562 | }, 563 | { 564 | "cell_type": "code", 565 | "execution_count": 22, 566 | "metadata": {}, 567 | "outputs": [], 568 | "source": [ 569 | "historico_apple.reset_index(inplace=True)" 570 | ] 571 | }, 572 | { 573 | "cell_type": "markdown", 574 | "metadata": {}, 575 | "source": [ 576 | "Graficamos el precio al cierre por fecha. Usamos el campo `Close` y el campo `Date`:" 577 | ] 578 | }, 579 | { 580 | "cell_type": "code", 581 | "execution_count": 23, 582 | "metadata": {}, 583 | "outputs": [ 584 | { 585 | "data": { 586 | "text/plain": [ 587 | "" 588 | ] 589 | }, 590 | "execution_count": 23, 591 | "metadata": {}, 592 | "output_type": "execute_result" 593 | }, 594 | { 595 | "data": { 596 | "image/png": "\n", 597 | "text/plain": [ 598 | "
" 599 | ] 600 | }, 601 | "metadata": { 602 | "needs_background": "light" 603 | }, 604 | "output_type": "display_data" 605 | } 606 | ], 607 | "source": [ 608 | "historico_apple.plot(x=\"Date\", y=\"Close\")" 609 | ] 610 | }, 611 | { 612 | "cell_type": "markdown", 613 | "metadata": {}, 614 | "source": [ 615 | "## 3 - Using yfinance to extract historical dividend data\n" 616 | ] 617 | }, 618 | { 619 | "cell_type": "markdown", 620 | "metadata": {}, 621 | "source": [ 622 | "Dividends are the distribution of a company's profits to its shareholders. Here they are expressed as an amount of money paid per share that an investor holds. Using the `dividends` attribute we can obtain the data as a pandas Series. The period covered is the one set in the \"history\" function.\n" 623 | ] 624 | }, 625 | { 626 | "cell_type": "code", 627 | "execution_count": 24, 628 | "metadata": {}, 629 | "outputs": [ 630 | { 631 | "data": { 632 | "text/plain": [ 633 | "Date\n", 634 | "1987-05-11 0.000536\n", 635 | "1987-08-10 0.000536\n", 636 | "1987-11-17 0.000714\n", 637 | "1988-02-12 0.000714\n", 638 | "1988-05-16 0.000714\n", 639 | " ... 
\n", 640 | "2020-05-08 0.205000\n", 641 | "2020-08-07 0.205000\n", 642 | "2020-11-06 0.205000\n", 643 | "2021-02-05 0.205000\n", 644 | "2021-05-07 0.220000\n", 645 | "Name: Dividends, Length: 71, dtype: float64" 646 | ] 647 | }, 648 | "execution_count": 24, 649 | "metadata": {}, 650 | "output_type": "execute_result" 651 | } 652 | ], 653 | "source": [ 654 | "apple.dividends" 655 | ] 656 | }, 657 | { 658 | "cell_type": "markdown", 659 | "metadata": {}, 660 | "source": [ 661 | "We can plot the dividends over time:" 662 | ] 663 | }, 664 | { 665 | "cell_type": "code", 666 | "execution_count": 25, 667 | "metadata": {}, 668 | "outputs": [ 669 | { 670 | "data": { 671 | "text/plain": [ 672 | "" 673 | ] 674 | }, 675 | "execution_count": 25, 676 | "metadata": {}, 677 | "output_type": "execute_result" 678 | }, 679 | { 680 | "data": { 681 | "image/png": "\n", 682 | "text/plain": [ 683 | "
" 684 | ] 685 | }, 686 | "metadata": { 687 | "needs_background": "light" 688 | }, 689 | "output_type": "display_data" 690 | } 691 | ], 692 | "source": [ 693 | "apple.dividends.plot()" 694 | ] 695 | }, 696 | { 697 | "cell_type": "markdown", 698 | "metadata": {}, 699 | "source": [ 700 | "## 4 - Exercise\n" 701 | ] 702 | }, 703 | { 704 | "cell_type": "markdown", 705 | "metadata": {}, 706 | "source": [ 707 | "With everything covered so far, we extract the data for a well-known Spanish company, display it, and plot how its share price has evolved since it was first listed on the stock exchange.\n" 708 | ] 709 | }, 710 | { 711 | "cell_type": "code", 712 | "execution_count": 27, 713 | "metadata": {}, 714 | "outputs": [], 715 | "source": [ 716 | "inditex = yf.Ticker(\"ITX.MC\")" 717 | ] 718 | }, 719 | { 720 | "cell_type": "markdown", 721 | "metadata": {}, 722 | "source": [ 723 | "Question 1: Show the company's country" 724 | ] 725 | }, 726 | { 727 | "cell_type": "code", 728 | "execution_count": 29, 729 | "metadata": {}, 730 | "outputs": [ 731 | { 732 | "data": { 733 | "text/plain": [ 734 | "'Spain'" 735 | ] 736 | }, 737 | "execution_count": 29, 738 | "metadata": {}, 739 | "output_type": "execute_result" 740 | } 741 | ], 742 | "source": [ 743 | "inditex_info=inditex.info\n", 744 | "inditex_info['country']" 745 | ] 746 | }, 747 | { 748 | "cell_type": "markdown", 749 | "metadata": {}, 750 | "source": [ 751 | "Question 2: Show the sector it belongs to\n" 752 | ] 753 | }, 754 | { 755 | "cell_type": "code", 756 | "execution_count": 30, 757 | "metadata": {}, 758 | "outputs": [ 759 | { 760 | "data": { 761 | "text/plain": [ 762 | "'Consumer Cyclical'" 763 | ] 764 | }, 765 | "execution_count": 30, 766 | "metadata": {}, 767 | "output_type": "execute_result" 768 | } 769 | ], 770 | "source": [ 771 | "inditex_info['sector']" 772 | ] 773 | }, 774 | { 775 | "cell_type": "markdown", 776 | "metadata": {}, 777 | "source": [ 778 | "Question 3: Extract the price history since the company was 
first 
listed and plot it" 779 | ] 780 | }, 781 | { 782 | "cell_type": "code", 783 | "execution_count": 33, 784 | "metadata": {}, 785 | "outputs": [ 786 | { 787 | "name": "stdout", 788 | "output_type": "stream", 789 | "text": [ 790 | " Open High Low Close Volume Dividends \\\n", 791 | "Date \n", 792 | "2001-05-24 -0.137869 -0.138635 -0.134423 -0.138176 216270100 0.0 \n", 793 | "2001-05-25 -0.137869 -0.140780 -0.137103 -0.137946 50448300 0.0 \n", 794 | "2001-05-28 -0.136337 -0.138022 -0.135725 -0.137103 26118945 0.0 \n", 795 | "2001-05-29 -0.136414 -0.138865 -0.136414 -0.138405 26910070 0.0 \n", 796 | "2001-05-30 -0.138099 -0.139708 -0.137946 -0.138635 48229995 0.0 \n", 797 | "... ... ... ... ... ... ... \n", 798 | "2021-07-30 28.500000 28.719999 28.270000 28.590000 2524564 0.0 \n", 799 | "2021-08-02 28.809999 29.530001 28.750000 29.100000 13035296 0.0 \n", 800 | "2021-08-03 29.129999 29.290001 28.730000 28.870001 1425722 0.0 \n", 801 | "2021-08-04 29.040001 29.049999 28.580000 28.650000 1260747 0.0 \n", 802 | "2021-08-05 28.690001 28.879999 28.549999 28.780001 1035664 0.0 \n", 803 | "\n", 804 | " Stock Splits \n", 805 | "Date \n", 806 | "2001-05-24 0.0 \n", 807 | "2001-05-25 0.0 \n", 808 | "2001-05-28 0.0 \n", 809 | "2001-05-29 0.0 \n", 810 | "2001-05-30 0.0 \n", 811 | "... ... 
\n", 812 | "2021-07-30 0.0 \n", 813 | "2021-08-02 0.0 \n", 814 | "2021-08-03 0.0 \n", 815 | "2021-08-04 0.0 \n", 816 | "2021-08-05 0.0 \n", 817 | "\n", 818 | "[5137 rows x 7 columns]\n" 819 | ] 820 | } 821 | ], 822 | "source": [ 823 | "historico_inditex=inditex.history(period=\"max\")\n", 824 | "print(historico_inditex)" 825 | ] 826 | }, 827 | { 828 | "cell_type": "code", 829 | "execution_count": 35, 830 | "metadata": {}, 831 | "outputs": [], 832 | "source": [ 833 | "historico_inditex.reset_index(inplace=True)" 834 | ] 835 | }, 836 | { 837 | "cell_type": "code", 838 | "execution_count": 36, 839 | "metadata": {}, 840 | "outputs": [ 841 | { 842 | "data": { 843 | "text/plain": [ 844 | "" 845 | ] 846 | }, 847 | "execution_count": 36, 848 | "metadata": {}, 849 | "output_type": "execute_result" 850 | }, 851 | { 852 | "data": { 853 | "image/png": "\n", 854 | "text/plain": [ 855 | "
" 856 | ] 857 | }, 858 | "metadata": { 859 | "needs_background": "light" 860 | }, 861 | "output_type": "display_data" 862 | } 863 | ], 864 | "source": [ 865 | "historico_inditex.plot(x=\"Date\", y=\"Close\")" 866 | ] 867 | }, 868 | { 869 | "cell_type": "code", 870 | "execution_count": null, 871 | "metadata": {}, 872 | "outputs": [], 873 | "source": [] 874 | } 875 | ], 876 | "metadata": { 877 | "kernelspec": { 878 | "display_name": "Python 3", 879 | "language": "python", 880 | "name": "python3" 881 | }, 882 | "language_info": { 883 | "codemirror_mode": { 884 | "name": "ipython", 885 | "version": 3 886 | }, 887 | "file_extension": ".py", 888 | "mimetype": "text/x-python", 889 | "name": "python", 890 | "nbconvert_exporter": "python", 891 | "pygments_lexer": "ipython3", 892 | "version": "3.8.5" 893 | } 894 | }, 895 | "nbformat": 4, 896 | "nbformat_minor": 4 897 | } 898 | -------------------------------------------------------------------------------- /3. Tarea de extracción de datos mediante Webscraping.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "

Extracting stock market data with web scraping

\n" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "Not all stock data is available through an API; in this assignment we use web scraping to obtain the financial data. Using 'BeautifulSoup' and 'requests', we extract historical share data from a web page." 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "
\n", 22 | "
    \n", 23 | "
  • Downloading the web page with the requests library
  • \n", 24 | "
  • Parsing the page's HTML with BeautifulSoup
  • \n", 25 | "
  • Extracting the data and building a DataFrame
  • \n", 26 | "
\n", 27 | "
\n", 28 | "\n", 29 | "
\n" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": 1, 35 | "metadata": {}, 36 | "outputs": [ 37 | { 38 | "name": "stdout", 39 | "output_type": "stream", 40 | "text": [ 41 | "Requirement already satisfied: bs4 in c:\\users\\cyber\\anaconda3\\lib\\site-packages (0.0.1)\n", 42 | "Requirement already satisfied: beautifulsoup4 in c:\\users\\cyber\\anaconda3\\lib\\site-packages (from bs4) (4.9.3)\n", 43 | "Requirement already satisfied: soupsieve>1.2; python_version >= \"3.0\" in c:\\users\\cyber\\anaconda3\\lib\\site-packages (from beautifulsoup4->bs4) (2.0.1)\n", 44 | "Collecting plotly\n", 45 | " Downloading plotly-4.14.3-py2.py3-none-any.whl (13.2 MB)\n", 46 | "Collecting retrying>=1.3.3\n", 47 | " Downloading retrying-1.3.3.tar.gz (10 kB)\n", 48 | "Requirement already satisfied: six in c:\\users\\cyber\\appdata\\roaming\\python\\python38\\site-packages (from plotly) (1.14.0)\n", 49 | "Building wheels for collected packages: retrying\n", 50 | " Building wheel for retrying (setup.py): started\n", 51 | " Building wheel for retrying (setup.py): finished with status 'done'\n", 52 | " Created wheel for retrying: filename=retrying-1.3.3-py3-none-any.whl size=11434 sha256=03a9cb7700431c3918069a7139163cc15327df3edfcc702354a4f00f13c30489\n", 53 | " Stored in directory: c:\\users\\cyber\\appdata\\local\\pip\\cache\\wheels\\c4\\a7\\48\\0a434133f6d56e878ca511c0e6c38326907c0792f67b476e56\n", 54 | "Successfully built retrying\n", 55 | "Installing collected packages: retrying, plotly\n", 56 | "Successfully installed plotly-4.14.3 retrying-1.3.3\n" 57 | ] 58 | } 59 | ], 60 | "source": [ 61 | "!pip install pandas\n", 62 | "!pip install requests\n", 63 | "!pip install bs4\n", 64 | "!pip install plotly" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": 2, 70 | "metadata": {}, 71 | "outputs": [], 72 | "source": [ 73 | "import pandas as pd\n", 74 | "import requests\n", 75 | "from bs4 import BeautifulSoup" 76 | ] 77 | }, 78 | { 79 | 
"cell_type": "markdown", 80 | "metadata": {}, 81 | "source": [ 82 | "## Extracting stock market data with web scraping\n" 83 | ] 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "metadata": {}, 88 | "source": [ 89 | "Using the `requests` library, we download the web page [https://finance.yahoo.com/quote/AMZN/history?period1=1451606400&period2=1612137600&interval=1mo&filter=history&frequency=1mo&includeAdjustedClose=true](https://finance.yahoo.com/quote/AMZN/history?period1=1451606400&period2=1612137600&interval=1mo&filter=history&frequency=1mo&includeAdjustedClose=true). 
We save the response text in a variable called `html_data`.\n" 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": 3, 95 | "metadata": {}, 96 | "outputs": [], 97 | "source": [ 98 | "url=\"https://finance.yahoo.com/quote/AMZN/history?period1=1451606400&period2=1612137600&interval=1mo&filter=history&frequency=1mo&includeAdjustedClose=true\"\n", 99 | "html_data=requests.get(url).text" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": {}, 105 | "source": [ 106 | "We parse the HTML data using `BeautifulSoup`.\n" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": 4, 112 | "metadata": {}, 113 | "outputs": [], 114 | "source": [ 115 | "soup = BeautifulSoup(html_data,\"html5lib\")" 116 | ] 117 | }, 118 | { 119 | "cell_type": "markdown", 120 | "metadata": {}, 121 | "source": [ 122 | "To get the contents of the `title` tag:\n" 123 | ] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "execution_count": 6, 128 | "metadata": {}, 129 | "outputs": [ 130 | { 131 | "data": { 132 | "text/plain": [ 133 | "Amazon.com, Inc. (AMZN) Stock Historical Prices & Data - Yahoo Finance" 134 | ] 135 | }, 136 | "execution_count": 6, 137 | "metadata": {}, 138 | "output_type": "execute_result" 139 | } 140 | ], 141 | "source": [ 142 | "soup.title" 143 | ] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "metadata": {}, 148 | "source": [ 149 | "Using BeautifulSoup, we extract the table of historical share prices and store it in a DataFrame called `amazon_data`. The DataFrame should have the columns Date, Open, High, Low, Close, Adj Close and Volume.\n" 150 | ] 151 | }, 152 | { 153 | "cell_type": "markdown", 154 | "metadata": {}, 155 | "source": [ 156 | "### Building the DataFrame" 157 | ] 158 | }, 159 | { 160 | "cell_type": "code", 161 | "execution_count": 9, 162 | "metadata": {}, 163 | "outputs": [ 164 | { 165 | "data": { 166 | "text/html": [ 167 | "
\n", 168 | "\n", 181 | "\n", 182 | " \n", 183 | " \n", 184 | " \n", 185 | " \n", 186 | " \n", 187 | " \n", 188 | " \n", 189 | " \n", 190 | " \n", 191 | " \n", 192 | " \n", 193 | " \n", 194 | " \n", 195 | " \n", 196 | " \n", 197 | " \n", 198 | " \n", 199 | " \n", 200 | " \n", 201 | " \n", 202 | " \n", 203 | " \n", 204 | " \n", 205 | " \n", 206 | " \n", 207 | " \n", 208 | " \n", 209 | " \n", 210 | " \n", 211 | " \n", 212 | " \n", 213 | " \n", 214 | " \n", 215 | " \n", 216 | " \n", 217 | " \n", 218 | " \n", 219 | " \n", 220 | " \n", 221 | " \n", 222 | " \n", 223 | " \n", 224 | " \n", 225 | " \n", 226 | " \n", 227 | " \n", 228 | " \n", 229 | " \n", 230 | " \n", 231 | " \n", 232 | " \n", 233 | " \n", 234 | " \n", 235 | " \n", 236 | " \n", 237 | " \n", 238 | " \n", 239 | " \n", 240 | " \n", 241 | " \n", 242 | " \n", 243 | " \n", 244 | " \n", 245 | " \n", 246 | " \n", 247 | " \n", 248 | " \n", 249 | " \n", 250 | " \n", 251 | " \n", 252 | " \n", 253 | " \n", 254 | " \n", 255 | " \n", 256 | " \n", 257 | " \n", 258 | " \n", 259 | " \n", 260 | " \n", 261 | " \n", 262 | " \n", 263 | " \n", 264 | " \n", 265 | " \n", 266 | " \n", 267 | " \n", 268 | " \n", 269 | " \n", 270 | " \n", 271 | " \n", 272 | " \n", 273 | " \n", 274 | " \n", 275 | " \n", 276 | " \n", 277 | " \n", 278 | " \n", 279 | " \n", 280 | " \n", 281 | " \n", 282 | " \n", 283 | " \n", 284 | " \n", 285 | " \n", 286 | " \n", 287 | " \n", 288 | " \n", 289 | " \n", 290 | " \n", 291 | " \n", 292 | " \n", 293 | " \n", 294 | " \n", 295 | " \n", 296 | " \n", 297 | " \n", 298 | " \n", 299 | " \n", 300 | " \n", 301 | " \n", 302 | " \n", 303 | " \n", 304 | " \n", 305 | " \n", 306 | "
    Date          Open      High      Low       Close     Volume       Adj Close
0   Jan 01, 2021  3,270.00  3,363.89  3,086.00  3,206.20  71,528,900   3,206.20
1   Dec 01, 2020  3,188.50  3,350.65  3,072.82  3,256.93  77,567,800   3,256.93
2   Nov 01, 2020  3,061.74  3,366.80  2,950.12  3,168.04  90,810,500   3,168.04
3   Oct 01, 2020  3,208.00  3,496.24  3,019.00  3,036.15  116,242,300  3,036.15
4   Sep 01, 2020  3,489.58  3,552.25  2,871.00  3,148.73  115,943,500  3,148.73
..  ...           ...       ...       ...       ...       ...          ...
56  May 01, 2016  663.92    724.23    656.00    722.79    90,614,500   722.79
57  Apr 01, 2016  590.49    669.98    585.25    659.59    78,464,200   659.59
58  Mar 01, 2016  556.29    603.24    538.58    593.64    94,009,500   593.64
59  Feb 01, 2016  578.15    581.80    474.00    552.52    124,144,800  552.52
60  Jan 01, 2016  656.29    657.72    547.18    587.00    130,200,900  587.00
\n", 307 | "

61 rows × 7 columns

\n", 308 | "
" 309 | ], 310 | "text/plain": [ 311 | " Date Open High Low Close Volume \\\n", 312 | "0 Jan 01, 2021 3,270.00 3,363.89 3,086.00 3,206.20 71,528,900 \n", 313 | "1 Dec 01, 2020 3,188.50 3,350.65 3,072.82 3,256.93 77,567,800 \n", 314 | "2 Nov 01, 2020 3,061.74 3,366.80 2,950.12 3,168.04 90,810,500 \n", 315 | "3 Oct 01, 2020 3,208.00 3,496.24 3,019.00 3,036.15 116,242,300 \n", 316 | "4 Sep 01, 2020 3,489.58 3,552.25 2,871.00 3,148.73 115,943,500 \n", 317 | ".. ... ... ... ... ... ... \n", 318 | "56 May 01, 2016 663.92 724.23 656.00 722.79 90,614,500 \n", 319 | "57 Apr 01, 2016 590.49 669.98 585.25 659.59 78,464,200 \n", 320 | "58 Mar 01, 2016 556.29 603.24 538.58 593.64 94,009,500 \n", 321 | "59 Feb 01, 2016 578.15 581.80 474.00 552.52 124,144,800 \n", 322 | "60 Jan 01, 2016 656.29 657.72 547.18 587.00 130,200,900 \n", 323 | "\n", 324 | " Adj Close \n", 325 | "0 3,206.20 \n", 326 | "1 3,256.93 \n", 327 | "2 3,168.04 \n", 328 | "3 3,036.15 \n", 329 | "4 3,148.73 \n", 330 | ".. ... 
\n", 331 | "56 722.79 \n", 332 | "57 659.59 \n", 333 | "58 593.64 \n", 334 | "59 552.52 \n", 335 | "60 587.00 \n", 336 | "\n", 337 | "[61 rows x 7 columns]" 338 | ] 339 | }, 340 | "execution_count": 9, 341 | "metadata": {}, 342 | "output_type": "execute_result" 343 | } 344 | ], 345 | "source": [ 346 | "\n", 347 | "rows = []  # one dict per table row; DataFrame.append was removed in pandas 2.0\n", 348 | "\n", 349 | "for row in soup.find(\"tbody\").find_all(\"tr\"):\n", 350 | "    col = row.find_all(\"td\")\n", 351 | "    date = col[0].text\n", 352 | "    Open = col[1].text  # capitalised so it does not shadow the built-in open()\n", 353 | "    high = col[2].text\n", 354 | "    low = col[3].text\n", 355 | "    close = col[4].text\n", 356 | "    adj_close = col[5].text\n", 357 | "    volume = col[6].text\n", 358 | "    rows.append({\"Date\":date, \"Open\":Open, \"High\":high, \"Low\":low, \"Close\":close, \"Adj Close\":adj_close, \"Volume\":volume})\n", 359 | "amazon_data = pd.DataFrame(rows, columns=[\"Date\", \"Open\", \"High\", \"Low\", \"Close\", \"Volume\", \"Adj Close\"])\n", 360 | "amazon_data " 361 | ] 362 | }, 363 | { 364 | 
"cell_type": "markdown", 365 | "metadata": {}, 366 | "source": [ 367 | "We print the first five rows of the `amazon_data` DataFrame.\n" 368 | ] 369 | }, 370 | { 371 | "cell_type": "code", 372 | "execution_count": 18, 373 | "metadata": {}, 374 | "outputs": [ 375 | { 376 | "data": { 377 | "text/html": [ 378 | "
\n", 379 | "\n", 392 | "\n", 393 | " \n", 394 | " \n", 395 | " \n", 396 | " \n", 397 | " \n", 398 | " \n", 399 | " \n", 400 | " \n", 401 | " \n", 402 | " \n", 403 | " \n", 404 | " \n", 405 | " \n", 406 | " \n", 407 | " \n", 408 | " \n", 409 | " \n", 410 | " \n", 411 | " \n", 412 | " \n", 413 | " \n", 414 | " \n", 415 | " \n", 416 | " \n", 417 | " \n", 418 | " \n", 419 | " \n", 420 | " \n", 421 | " \n", 422 | " \n", 423 | " \n", 424 | " \n", 425 | " \n", 426 | " \n", 427 | " \n", 428 | " \n", 429 | " \n", 430 | " \n", 431 | " \n", 432 | " \n", 433 | " \n", 434 | " \n", 435 | " \n", 436 | " \n", 437 | " \n", 438 | " \n", 439 | " \n", 440 | " \n", 441 | " \n", 442 | " \n", 443 | " \n", 444 | " \n", 445 | " \n", 446 | " \n", 447 | " \n", 448 | " \n", 449 | " \n", 450 | " \n", 451 | " \n", 452 | " \n", 453 | " \n", 454 | " \n", 455 | " \n", 456 | " \n", 457 | "
   Date          Open      High      Low       Close     Volume       Adj Close
0  Jan 01, 2021  3,270.00  3,363.89  3,086.00  3,206.20  71,528,900   3,206.20
1  Dec 01, 2020  3,188.50  3,350.65  3,072.82  3,256.93  77,567,800   3,256.93
2  Nov 01, 2020  3,061.74  3,366.80  2,950.12  3,168.04  90,810,500   3,168.04
3  Oct 01, 2020  3,208.00  3,496.24  3,019.00  3,036.15  116,242,300  3,036.15
4  Sep 01, 2020  3,489.58  3,552.25  2,871.00  3,148.73  115,943,500  3,148.73
\n", 458 | "
" 459 | ], 460 | "text/plain": [ 461 | " Date Open High Low Close Volume Adj Close\n", 462 | "0 Jan 01, 2021 3,270.00 3,363.89 3,086.00 3,206.20 71,528,900 3,206.20\n", 463 | "1 Dec 01, 2020 3,188.50 3,350.65 3,072.82 3,256.93 77,567,800 3,256.93\n", 464 | "2 Nov 01, 2020 3,061.74 3,366.80 2,950.12 3,168.04 90,810,500 3,168.04\n", 465 | "3 Oct 01, 2020 3,208.00 3,496.24 3,019.00 3,036.15 116,242,300 3,036.15\n", 466 | "4 Sep 01, 2020 3,489.58 3,552.25 2,871.00 3,148.73 115,943,500 3,148.73" 467 | ] 468 | }, 469 | "execution_count": 18, 470 | "metadata": {}, 471 | "output_type": "execute_result" 472 | } 473 | ], 474 | "source": [ 475 | "amazon_data.head(5)" 476 | ] 477 | }, 478 | { 479 | "cell_type": "markdown", 480 | "metadata": {}, 481 | "source": [ 482 | "To get the column names of the DataFrame:\n" 483 | ] 484 | }, 485 | { 486 | "cell_type": "code", 487 | "execution_count": 20, 488 | "metadata": {}, 489 | "outputs": [ 490 | { 491 | "data": { 492 | "text/plain": [ 493 | "array(['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'],\n", 494 | " dtype=object)" 495 | ] 496 | }, 497 | "execution_count": 20, 498 | "metadata": {}, 499 | "output_type": "execute_result" 500 | } 501 | ], 502 | "source": [ 503 | "amazon_data.columns.values" 504 | ] 505 | }, 506 | { 507 | "cell_type": "markdown", 508 | "metadata": {}, 509 | "source": [ 510 | "To get the share's opening price for 1 June 2019 from the DataFrame, we write: " 511 | ] 512 | }, 513 | { 514 | "cell_type": "code", 515 | "execution_count": 57, 516 | "metadata": {}, 517 | "outputs": [ 518 | { 519 | "data": { 520 | "text/plain": [ 521 | "19 1,760.01\n", 522 | "Name: Open, dtype: object" 523 | ] 524 | }, 525 | "execution_count": 57, 526 | "metadata": {}, 527 | "output_type": "execute_result" 528 | } 529 | ], 530 | "source": [ 531 | "amazon_data.loc[(amazon_data.Date=='Jun 01, 2019'),'Open']" 532 | ] 533 | } 534 | ], 535 | "metadata": { 536 | "kernelspec": { 537 | 
"display_name": "Python 3", 538 | "language": "python", 539 | "name": "python3" 540 | }, 541 | "language_info": { 542 | "codemirror_mode": { 543 | "name": "ipython", 544 | "version": 3 545 | }, 546 | "file_extension": ".py", 547 | "mimetype": "text/x-python", 548 | "name": "python", 549 | "nbconvert_exporter": "python", 550 | "pygments_lexer": "ipython3", 551 | "version": "3.8.5" 552 | } 553 | }, 554 | "nbformat": 4, 555 | "nbformat_minor": 4 556 | } 557 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Python exercises in extracting, transforming and loading data for analysis and plotting. 2 | 3 | The contents of the four Jupyter notebooks in this repository are described below: 4 | 5 | 1 - Practice using the yfinance library to extract stock market data. 6 | 7 | Several examples of how to use the yfinance library's methods. 8 | 9 | 2 - Practice with web scraping, read_html and pandas: 10 | 11 | Various examples combining BeautifulSoup, read_html and filters to download a web page's content, extract it and load it into a DataFrame. 12 | 13 | 3 - Data extraction assignment using web scraping 14 | 15 | Using the BeautifulSoup and requests libraries, we extract stock market data from Yahoo Finance without the yfinance library. 16 | 17 | 4 - Final project 18 | 19 | This notebook puts into practice everything covered in the previous notebooks; among other tasks, it requires answering a number of questions. 20 | 21 | --------------------------------------------------------------------------------
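In the web scraping notebook (notebook 3), every scraped cell stays a string: the `Open` lookup reports `dtype: object` and values such as `'1,760.01'`. The block below is a minimal sketch of how such values could be converted to numbers and dates before plotting or doing arithmetic. The helper names `parse_number`, `parse_date` and `clean_row` and the `MONTHS` table are hypothetical and do not appear in the notebooks; the sample row is copied from the notebook output. It uses only the standard library, so it is an illustration rather than the notebooks' own method.

```python
from datetime import date

# Hypothetical lookup table for the English month abbreviations seen in the
# scraped dates (e.g. 'Jan 01, 2021'); avoids any dependence on the locale.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def parse_number(text: str) -> float:
    """Convert a scraped figure such as '3,270.00' to a float."""
    return float(text.replace(",", ""))


def parse_date(text: str) -> date:
    """Convert a scraped date such as 'Jan 01, 2021' to a date object."""
    month, day, year = text.replace(",", "").split()
    return date(int(year), MONTHS[month], int(day))


def clean_row(row: dict) -> dict:
    """Return a copy of one scraped row with typed values instead of strings."""
    cleaned = {}
    for key, value in row.items():
        if key == "Date":
            cleaned[key] = parse_date(value)
        elif key == "Volume":
            # Volume is a whole number of shares, so keep it as an int.
            cleaned[key] = int(value.replace(",", ""))
        else:
            cleaned[key] = parse_number(value)
    return cleaned


# Sample row copied from the notebook's output for January 2021.
raw = {
    "Date": "Jan 01, 2021", "Open": "3,270.00", "High": "3,363.89",
    "Low": "3,086.00", "Close": "3,206.20", "Volume": "71,528,900",
    "Adj Close": "3,206.20",
}
cleaned = clean_row(raw)
print(cleaned)
```

A list of rows cleaned this way could then be passed to `pd.DataFrame(...)` in place of the raw strings, which would give numeric columns that can be plotted directly.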