├── Airbnb Rio de Janeiro
│   ├── Dados.csv
│   ├── Projeto.ipynb
│   └── README.md
├── Case interview
│   ├── Modelagem e Avaliação do modelo.ipynb
│   ├── Seleção de variáveis e engenharia de recursos.ipynb
│   ├── X.csv
│   ├── X_dados_balanceados.csv
│   ├── _CaseCandidatosDataScience.pdf
│   ├── callcenter_case.csv
│   ├── dados_tratados.csv
│   ├── exploracao_dos_dados.ipynb
│   ├── logs.log
│   ├── processamento_dos_dados.ipynb
│   └── y.csv
├── Covid-19
│   ├── Dados.2.csv
│   ├── Dados.csv
│   ├── Projeto.ipynb
│   └── README.md
├── Detecção do Câncer de Mama
│   ├── Dados.csv
│   ├── Projeto.ipynb
│   └── README.md
├── Geração de Indicadores
│   ├── Projeto.ipynb
│   ├── README.md
│   └── Relatório Final.xlsx
├── LICENSE
├── Possiveis resultados copa do mundo 2022
│   ├── 1_coleta_preparacao_dados.ipynb
│   ├── 2-exploracao_dados-TESTE02.ipynb
│   ├── 2-exploracao_dados.ipynb
│   ├── 3-Modelagem e avaliação do modelo-TESTE2.ipynb
│   ├── 3-Modelagem e avaliação do modelo.ipynb
│   └── README.md
├── Prevendo Partidas de League of Legends
│   ├── Dados.csv
│   ├── README.md
│   ├── [EN-US] Project LOL end-to-end.ipynb
│   └── [PT-BR] Project LOL end-to-end.ipynb
├── Previsão de Ocorrência de Diabetes
│   ├── Dados.csv
│   ├── Projeto.ipynb
│   └── README.md
├── README.md
├── Spotify & Python & Data Science
│   ├── Dados.csv
│   ├── Projeto.ipynb
│   └── README.md
└── Validação modelo da copa do mundo 2022
    ├── Jogos da fase de grupos
    │   └── README.md
    ├── Jogos da fase mata-mata
    │   └── README.md
    └── README.md
/Airbnb Rio de Janeiro/README.md:
--------------------------------------------------------------------------------
1 | [](https://www.linkedin.com/in/luislauriano/) [](https://www.python.org/downloads/release/python-365/) [](http://perso.crans.org/besson/LICENSE.html) [](https://github.com/luislauriano/data_science)
2 |
3 | # Airbnb Data Analysis - Rio de Janeiro
4 |
5 | In 2020, Rio de Janeiro was among the three most sought-after cities for Carnival, with 2 million tourists expected for the festivities and corresponding growth in the hotel sector.
6 |
7 |
8 |
9 |
10 |
11 | During Carnival and the year-end holidays, Rio's hotel sector grows sharply. For 2020, the projection was that Rio's hotels would have 100% of their beds occupied during the event. Current data from Hotéis do Rio indicate that hotels in Barra da Tijuca and São Conrado report 84% occupancy, Ipanema and Leblon 80%, Leme and Copacabana 78%, Botafogo and Flamengo 89%, and downtown Rio (Centro) 83%.
12 |
13 |
14 |
15 |
16 |
17 |
18 | On top of that, when we travel we always wonder which hotel is best, which location is best, and which option offers the best value for money. With those questions in mind, I used the Rio de Janeiro data from Airbnb, one of the largest hospitality companies today, to build an exploratory data analysis.
19 |
20 |
21 |
22 |
23 |
24 | One of Airbnb's initiatives is to make site data available for some of the world's major cities. For Brazil, data are available only for the city of Rio de Janeiro.
25 |
--------------------------------------------------------------------------------
/Case interview/_CaseCandidatosDataScience.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/luislauriano/Data_Science/5ff00f3ff280c36108d9a88a1108adc070d6f3bd/Case interview/_CaseCandidatosDataScience.pdf
--------------------------------------------------------------------------------
/Covid-19/README.md:
--------------------------------------------------------------------------------
1 | [](https://www.linkedin.com/in/luislauriano/) [](https://www.python.org/downloads/release/python-365/) [](http://perso.crans.org/besson/LICENSE.html) [](https://github.com/luislauriano/data_science)
2 |
3 | # Coronavirus Data Analysis
4 |
5 | The 2019 novel coronavirus (2019-nCoV) is a virus identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the Wuhan outbreak reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly had no exposure to animal markets, indicating person-to-person spread.
6 |
7 |
8 |
9 |
10 |
11 | When I had the idea of doing a study analyzing coronavirus data, I immediately thought of looking for a dataset on Kaggle, and that is what I did. The dataset used to build the project was taken from Kaggle and made available by Johns Hopkins University. New coronavirus case data are recorded every day, so the data analyzed here run from January 22, 2020 to March 9, 2020. In that period, the number of recovered cases in China was already greater than the number of confirmed cases, whereas countries such as the United States and South Korea still had more deaths than recovered cases. In the project I noted and warned about the rising number of deaths in the United States, before the situation became alarming for the country at the end of March. For countries such as Canada and Brazil everything was still very new, so they did not become the focus of the project.
12 |
13 |
14 |
15 |
16 |
--------------------------------------------------------------------------------
/Detecção do Câncer de Mama/README.md:
--------------------------------------------------------------------------------
1 | [](https://www.linkedin.com/in/luislauriano/) [](https://www.python.org/downloads/release/python-365/) [](http://perso.crans.org/besson/LICENSE.html) [](https://github.com/luislauriano/data_science)
2 |
3 | # Machine Learning for Breast Cancer Detection
4 |
5 | The use of, and search for, new technologies such as machine learning, artificial intelligence, and computational intelligence models to support faster and more accurate diagnoses has been growing steadily. According to 2018 research by the International Agency for Research on Cancer (IARC), breast cancer ranked fifth in cancer mortality worldwide, with more than 627,000 deaths estimated in 2018, representing 6.6% of all cancer deaths. In 2019, 59,700 new cases in women were estimated, an incidence rate of 51.29 cases per 100,000 women.
6 |
7 |
8 |
9 |
10 |
11 | The main goal of using technologies such as machine learning is to improve the accuracy and speed of diagnosis, since early diagnosis is the main factor in reducing cancer mortality. With early diagnosis, the chance of cure reaches 95%.
12 |
13 |
14 |
15 |
16 |
--------------------------------------------------------------------------------
/Geração de Indicadores/Projeto.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "nbformat": 4,
3 | "nbformat_minor": 0,
4 | "metadata": {
5 | "colab": {
6 | "name": "Manipulando_Tratando_Dados.ipynb",
7 | "provenance": [],
8 | "include_colab_link": true
9 | },
10 | "kernelspec": {
11 | "name": "python3",
12 | "display_name": "Python 3"
13 | }
14 | },
15 | "cells": [
26 | {
27 | "cell_type": "markdown",
28 | "metadata": {
29 | "id": "Cq8Xsy_32hyS",
30 | "colab_type": "text"
31 | },
32 | "source": [
33 | "# **Activities**\n",
34 | "\n",
35 | "1. Reading files\n",
36 | "2. Data cleaning\n",
37 | " \n",
38 | " 2.1. Renaming columns\n",
39 | " \n",
40 | " 2.2. Checking types\n",
41 | " \n",
42 | " 2.3. Deleting data\n",
43 | "3. Concatenating tables\n",
44 | " \n",
45 | " 3.1. Grouping data\n",
46 | " \n",
47 | "4. Relating tables\n",
48 | "5. Business rule\n",
49 | "6. Exporting data"
50 | ]
51 | },
52 | {
53 | "cell_type": "markdown",
54 | "metadata": {
55 | "id": "3emU6dK83ang",
56 | "colab_type": "text"
57 | },
58 | "source": [
59 | "Objective: through data manipulation, clean and prepare the sales (vendas) and detail (detalhamento) datasets in order to generate a single dataset that will serve as a report and indicator."
60 | ]
61 | },
62 | {
63 | "cell_type": "markdown",
64 | "metadata": {
65 | "id": "KoRf_kyF4rVg",
66 | "colab_type": "text"
67 | },
68 | "source": [
69 | "# **Importing the library**"
70 | ]
71 | },
72 | {
73 | "cell_type": "markdown",
74 | "metadata": {
75 | "id": "xXaIviCNuplb",
76 | "colab_type": "text"
77 | },
78 | "source": [
79 | "First we need to import the library we will use to manipulate the data, in this case pandas."
80 | ]
81 | },
82 | {
83 | "cell_type": "code",
84 | "metadata": {
85 | "id": "7X1bIjc14IxZ",
86 | "colab_type": "code",
87 | "colab": {}
88 | },
89 | "source": [
90 | "import pandas as pd"
91 | ],
92 | "execution_count": 0,
93 | "outputs": []
94 | },
95 | {
96 | "cell_type": "markdown",
97 | "metadata": {
98 | "id": "41RW2g9d4yYn",
99 | "colab_type": "text"
100 | },
101 | "source": [
102 | "# **Reading files**"
103 | ]
104 | },
105 | {
106 | "cell_type": "markdown",
107 | "metadata": {
108 | "id": "XqasrL4LvbmM",
109 | "colab_type": "text"
110 | },
111 | "source": [
112 | "Before anything else, open the file and inspect it carefully to learn how the data are delimited, whether any rows are skipped before the table starts, and, for an XLSX file, whether it is split into sheets."
113 | ]
114 | },
115 | {
116 | "cell_type": "markdown",
117 | "metadata": {
118 | "id": "qCLjPMLy8N2W",
119 | "colab_type": "text"
120 | },
121 | "source": [
122 | "# CSV files"
123 | ]
124 | },
125 | {
126 | "cell_type": "markdown",
127 | "metadata": {
128 | "id": "jhf85rckwQdw",
129 | "colab_type": "text"
130 | },
131 | "source": [
132 | "The general form for reading a CSV file is:"
133 | ]
134 | },
135 | {
136 | "cell_type": "markdown",
137 | "metadata": {
138 | "id": "xbJiu9GEwbss",
139 | "colab_type": "text"
140 | },
141 | "source": [
142 | "df = pd.read_csv('file_name.csv', sep='delimiter character', skiprows=number_of_rows_to_skip)"
143 | ]
144 | },
145 | {
146 | "cell_type": "markdown",
147 | "metadata": {
148 | "id": "Y2IOP5o-wmB2",
149 | "colab_type": "text"
150 | },
151 | "source": [
152 | "Now let's read our **Vendas** (sales) file, which is in CSV format"
153 | ]
154 | },
155 | {
156 | "cell_type": "code",
157 | "metadata": {
158 | "id": "pR3N-j2V8Qz-",
159 | "colab_type": "code",
160 | "colab": {}
161 | },
162 | "source": [
163 | "df_vendas = pd.read_csv('vendas.csv', sep ='|')"
164 | ],
165 | "execution_count": 0,
166 | "outputs": []
167 | },
168 | {
169 | "cell_type": "markdown",
170 | "metadata": {
171 | "id": "e4UPHjZC8U0H",
172 | "colab_type": "text"
173 | },
174 | "source": [
175 | "# XLSX files"
176 | ]
177 | },
178 | {
179 | "cell_type": "markdown",
180 | "metadata": {
181 | "id": "_dj_dwprw4DY",
182 | "colab_type": "text"
183 | },
184 | "source": [
185 | "XLSX files are the files commonly known as Excel workbooks"
186 | ]
187 | },
188 | {
189 | "cell_type": "markdown",
190 | "metadata": {
191 | "id": "fI9ECBQEwxxw",
192 | "colab_type": "text"
193 | },
194 | "source": [
195 | "The general form for reading an XLSX file is:"
196 | ]
197 | },
198 | {
199 | "cell_type": "markdown",
200 | "metadata": {
201 | "id": "H_7T_42Jw-40",
202 | "colab_type": "text"
203 | },
204 | "source": [
205 | "df = pd.read_excel('file_name.xlsx', sheet_name='sheet name', skiprows=number_of_rows_to_skip)"
206 | ]
207 | },
208 | {
209 | "cell_type": "markdown",
210 | "metadata": {
211 | "id": "WX0R9qAwxH84",
212 | "colab_type": "text"
213 | },
214 | "source": [
215 | "Note that our XLSX file has more than one sheet: one for each state covered by the detalhamento file. We also had to skip 1 row to reach the start of the table. We need to build one dataset per sheet and, once the data are cleaned, join them."
216 | ]
217 | },
218 | {
219 | "cell_type": "code",
220 | "metadata": {
221 | "id": "azb_Ym-X4wlG",
222 | "colab_type": "code",
223 | "colab": {}
224 | },
225 | "source": [
226 | "df_detalhamentoAM = pd.read_excel('detalhamento.xlsx', sheet_name='AM', skiprows=1 )\n",
227 | "df_detalhamentoRR = pd.read_excel('detalhamento.xlsx', sheet_name='RR', skiprows=1 )\n",
228 | "df_detalhamentoRO = pd.read_excel('detalhamento.xlsx', sheet_name='RO', skiprows=1 )\n",
229 | "df_detalhamentoAC = pd.read_excel('detalhamento.xlsx', sheet_name='AC', skiprows=1 )"
230 | ],
231 | "execution_count": 0,
232 | "outputs": []
233 | },
234 | {
235 | "cell_type": "markdown",
236 | "metadata": {
237 | "id": "u5k9XhXS8m_2",
238 | "colab_type": "text"
239 | },
240 | "source": [
241 | "# **Data cleaning**"
242 | ]
243 | },
244 | {
245 | "cell_type": "markdown",
246 | "metadata": {
247 | "id": "iRY_uUuOxtLT",
248 | "colab_type": "text"
249 | },
250 | "source": [
251 | "Data cleaning and preparation is typically the most time-consuming step for a data scientist, because it affects everything else in the project. \n",
252 | "\n"
253 | ]
254 | },
255 | {
256 | "cell_type": "markdown",
257 | "metadata": {
258 | "id": "K9y_tRY28rDq",
259 | "colab_type": "text"
260 | },
261 | "source": [
262 | "## Renaming columns"
263 | ]
264 | },
265 | {
266 | "cell_type": "markdown",
267 | "metadata": {
268 | "id": "rl1uH8D1yEJk",
269 | "colab_type": "text"
270 | },
271 | "source": [
272 | "As a good practice, all column names should be lowercase, with no special characters and no spaces, which makes the columns easier to reference later."
273 | ]
274 | },
275 | {
276 | "cell_type": "markdown",
277 | "metadata": {
278 | "id": "UJ9oSccyyZXa",
279 | "colab_type": "text"
280 | },
281 | "source": [
282 | "To rename one or more columns, pass a Python dictionary that maps old names to new names:"
283 | ]
284 | },
285 | {
286 | "cell_type": "markdown",
287 | "metadata": {
288 | "id": "qC2yhHuvyg6k",
289 | "colab_type": "text"
290 | },
291 | "source": [
292 | "df = df.rename(columns={'column1': 'new_name1', 'column2': 'new_name2'})"
293 | ]
294 | },
295 | {
296 | "cell_type": "markdown",
297 | "metadata": {
298 | "id": "xH3yCHqKy0F7",
299 | "colab_type": "text"
300 | },
301 | "source": [
302 | "To rename all columns at once, assign a Python list with one name per column:"
303 | ]
304 | },
305 | {
306 | "cell_type": "markdown",
307 | "metadata": {
308 | "id": "EaLo_wXty4ZK",
309 | "colab_type": "text"
310 | },
311 | "source": [
312 | "df.columns = ['new_name1', 'new_name2']"
313 | ]
314 | },
315 | {
316 | "cell_type": "markdown",
317 | "metadata": {
318 | "id": "hdzWExuty9dI",
319 | "colab_type": "text"
320 | },
321 | "source": [
322 | "Now let's rename all the columns of the vendas and detalhamento datasets."
323 | ]
324 | },
325 | {
326 | "cell_type": "code",
327 | "metadata": {
328 | "id": "Zjd93Kcn87jm",
329 | "colab_type": "code",
330 | "outputId": "e2c5bbaf-aaaf-492b-970d-5e42338e98e7",
331 | "colab": {
332 | "base_uri": "https://localhost:8080/",
333 | "height": 142
334 | }
335 | },
336 | "source": [
337 | "df_vendas.columns = ['escritorio', 'operadora', 'material', 'data', 'valor_liquido']\n",
338 | "df_vendas.head(3)"
339 | ],
340 | "execution_count": 0,
341 | "outputs": [
342 | {
343 | "output_type": "execute_result",
344 | "data": {
345 | "text/html": [
346 | "<div>\n",
347 | "\n",
360 | "<table border=\"1\" class=\"dataframe\">\n",
361 | "  <thead>\n",
362 | "    <tr style=\"text-align: right;\">\n",
363 | "      <th></th>\n",
364 | "      <th>escritorio</th>\n",
365 | "      <th>operadora</th>\n",
366 | "      <th>material</th>\n",
367 | "      <th>data</th>\n",
368 | "      <th>valor_liquido</th>\n",
369 | "    </tr>\n",
370 | "  </thead>\n",
371 | "  <tbody>\n",
372 | "    <tr>\n",
373 | "      <th>0</th>\n",
374 | "      <td>101</td>\n",
375 | "      <td>Vivo</td>\n",
376 | "      <td>131318 RECARGA VIRTUAL VIVO R$ 1,00</td>\n",
377 | "      <td>02.01.2020</td>\n",
378 | "      <td>20.0</td>\n",
379 | "    </tr>\n",
380 | "    <tr>\n",
381 | "      <th>1</th>\n",
382 | "      <td>101</td>\n",
383 | "      <td>Vivo</td>\n",
384 | "      <td>131318 RECARGA VIRTUAL VIVO R$ 1,00</td>\n",
385 | "      <td>30.12.2019</td>\n",
386 | "      <td>10.0</td>\n",
387 | "    </tr>\n",
388 | "    <tr>\n",
389 | "      <th>2</th>\n",
390 | "      <td>101</td>\n",
391 | "      <td>Vivo</td>\n",
392 | "      <td>131318 RECARGA VIRTUAL VIVO R$ 1,00</td>\n",
393 | "      <td>04.01.2020</td>\n",
394 | "      <td>20.0</td>\n",
395 | "    </tr>\n",
396 | "  </tbody>\n",
397 | "</table>\n",
398 | "</div>"
399 | ],
400 | "text/plain": [
401 | " escritorio operadora ... data valor_liquido\n",
402 | "0 101 Vivo ... 02.01.2020 20.0\n",
403 | "1 101 Vivo ... 30.12.2019 10.0\n",
404 | "2 101 Vivo ... 04.01.2020 20.0\n",
405 | "\n",
406 | "[3 rows x 5 columns]"
407 | ]
408 | },
409 | "metadata": {
410 | "tags": []
411 | },
412 | "execution_count": 550
413 | }
414 | ]
415 | },
416 | {
417 | "cell_type": "code",
418 | "metadata": {
419 | "id": "71Vav_rn89GO",
420 | "colab_type": "code",
421 | "outputId": "fc03a6f9-681d-406e-90ad-8ec0e7f76359",
422 | "colab": {
423 | "base_uri": "https://localhost:8080/",
424 | "height": 142
425 | }
426 | },
427 | "source": [
428 | "df_detalhamentoAM.columns = ['loja', 'escritorio', 'uf', 'operadora', 'valor_bruto']\n",
429 | "df_detalhamentoAM.head(3)"
430 | ],
431 | "execution_count": 0,
432 | "outputs": [
433 | {
434 | "output_type": "execute_result",
435 | "data": {
436 | "text/html": [
437 | "<div>\n",
438 | "\n",
451 | "<table border=\"1\" class=\"dataframe\">\n",
452 | "  <thead>\n",
453 | "    <tr style=\"text-align: right;\">\n",
454 | "      <th></th>\n",
455 | "      <th>loja</th>\n",
456 | "      <th>escritorio</th>\n",
457 | "      <th>uf</th>\n",
458 | "      <th>operadora</th>\n",
459 | "      <th>valor_bruto</th>\n",
460 | "    </tr>\n",
461 | "  </thead>\n",
462 | "  <tbody>\n",
463 | "    <tr>\n",
464 | "      <th>0</th>\n",
465 | "      <td>BEMOL AVENIDA</td>\n",
466 | "      <td>103</td>\n",
467 | "      <td>AM</td>\n",
468 | "      <td>Claro</td>\n",
469 | "      <td>2651</td>\n",
470 | "    </tr>\n",
471 | "    <tr>\n",
472 | "      <th>1</th>\n",
473 | "      <td>BEMOL AVENIDA</td>\n",
474 | "      <td>103</td>\n",
475 | "      <td>AM</td>\n",
476 | "      <td>Oi</td>\n",
477 | "      <td>3309</td>\n",
478 | "    </tr>\n",
479 | "    <tr>\n",
480 | "      <th>2</th>\n",
481 | "      <td>BEMOL AVENIDA</td>\n",
482 | "      <td>103</td>\n",
483 | "      <td>AM</td>\n",
484 | "      <td>Tim</td>\n",
485 | "      <td>1550</td>\n",
486 | "    </tr>\n",
487 | "  </tbody>\n",
488 | "</table>\n",
489 | "</div>"
490 | ],
491 | "text/plain": [
492 | " loja escritorio uf operadora valor_bruto\n",
493 | "0 BEMOL AVENIDA 103 AM Claro 2651\n",
494 | "1 BEMOL AVENIDA 103 AM Oi 3309\n",
495 | "2 BEMOL AVENIDA 103 AM Tim 1550"
496 | ]
497 | },
498 | "metadata": {
499 | "tags": []
500 | },
501 | "execution_count": 551
502 | }
503 | ]
504 | },
505 | {
506 | "cell_type": "code",
507 | "metadata": {
508 | "id": "pLTqJdlm-zmJ",
509 | "colab_type": "code",
510 | "outputId": "b4097f47-3aa4-42ea-d3b7-0a6ad7e04f48",
511 | "colab": {
512 | "base_uri": "https://localhost:8080/",
513 | "height": 142
514 | }
515 | },
516 | "source": [
517 | "df_detalhamentoRR.columns = ['loja', 'escritorio', 'uf', 'operadora', 'valor_bruto']\n",
518 | "df_detalhamentoRR.head(3)"
519 | ],
520 | "execution_count": 0,
521 | "outputs": [
522 | {
523 | "output_type": "execute_result",
524 | "data": {
525 | "text/html": [
526 | "<div>\n",
527 | "\n",
540 | "<table border=\"1\" class=\"dataframe\">\n",
541 | "  <thead>\n",
542 | "    <tr style=\"text-align: right;\">\n",
543 | "      <th></th>\n",
544 | "      <th>loja</th>\n",
545 | "      <th>escritorio</th>\n",
546 | "      <th>uf</th>\n",
547 | "      <th>operadora</th>\n",
548 | "      <th>valor_bruto</th>\n",
549 | "    </tr>\n",
550 | "  </thead>\n",
551 | "  <tbody>\n",
552 | "    <tr>\n",
553 | "      <th>0</th>\n",
554 | "      <td>BEMOL BOA VISTA</td>\n",
555 | "      <td>701</td>\n",
556 | "      <td>RR</td>\n",
557 | "      <td>Claro</td>\n",
558 | "      <td>140</td>\n",
559 | "    </tr>\n",
560 | "    <tr>\n",
561 | "      <th>1</th>\n",
562 | "      <td>BEMOL BOA VISTA</td>\n",
563 | "      <td>701</td>\n",
564 | "      <td>RR</td>\n",
565 | "      <td>Oi</td>\n",
566 | "      <td>20</td>\n",
567 | "    </tr>\n",
568 | "    <tr>\n",
569 | "      <th>2</th>\n",
570 | "      <td>BEMOL BOA VISTA</td>\n",
571 | "      <td>701</td>\n",
572 | "      <td>RR</td>\n",
573 | "      <td>Tim</td>\n",
574 | "      <td>135</td>\n",
575 | "    </tr>\n",
576 | "  </tbody>\n",
577 | "</table>\n",
578 | "</div>"
579 | ],
580 | "text/plain": [
581 | " loja escritorio uf operadora valor_bruto\n",
582 | "0 BEMOL BOA VISTA 701 RR Claro 140\n",
583 | "1 BEMOL BOA VISTA 701 RR Oi 20\n",
584 | "2 BEMOL BOA VISTA 701 RR Tim 135"
585 | ]
586 | },
587 | "metadata": {
588 | "tags": []
589 | },
590 | "execution_count": 552
591 | }
592 | ]
593 | },
594 | {
595 | "cell_type": "code",
596 | "metadata": {
597 | "id": "LYYiy5Pa-6zg",
598 | "colab_type": "code",
599 | "outputId": "90d7db70-519f-44ef-b464-3f3de401a849",
600 | "colab": {
601 | "base_uri": "https://localhost:8080/",
602 | "height": 142
603 | }
604 | },
605 | "source": [
606 | "df_detalhamentoRO.columns = ['loja', 'escritorio', 'uf', 'operadora', 'valor_bruto']\n",
607 | "df_detalhamentoRO.head(3)"
608 | ],
609 | "execution_count": 0,
610 | "outputs": [
611 | {
612 | "output_type": "execute_result",
613 | "data": {
614 | "text/html": [
615 | "<div>\n",
616 | "\n",
629 | "<table border=\"1\" class=\"dataframe\">\n",
630 | "  <thead>\n",
631 | "    <tr style=\"text-align: right;\">\n",
632 | "      <th></th>\n",
633 | "      <th>loja</th>\n",
634 | "      <th>escritorio</th>\n",
635 | "      <th>uf</th>\n",
636 | "      <th>operadora</th>\n",
637 | "      <th>valor_bruto</th>\n",
638 | "    </tr>\n",
639 | "  </thead>\n",
640 | "  <tbody>\n",
641 | "    <tr>\n",
642 | "      <th>0</th>\n",
643 | "      <td>BEMOL FARMA PORTO VELHO</td>\n",
644 | "      <td>609</td>\n",
645 | "      <td>RO</td>\n",
646 | "      <td>Claro</td>\n",
647 | "      <td>486</td>\n",
648 | "    </tr>\n",
649 | "    <tr>\n",
650 | "      <th>1</th>\n",
651 | "      <td>BEMOL FARMA PORTO VELHO</td>\n",
652 | "      <td>609</td>\n",
653 | "      <td>RO</td>\n",
654 | "      <td>Oi</td>\n",
655 | "      <td>90</td>\n",
656 | "    </tr>\n",
657 | "    <tr>\n",
658 | "      <th>2</th>\n",
659 | "      <td>BEMOL FARMA PORTO VELHO</td>\n",
660 | "      <td>609</td>\n",
661 | "      <td>RO</td>\n",
662 | "      <td>Tim</td>\n",
663 | "      <td>90</td>\n",
664 | "    </tr>\n",
665 | "  </tbody>\n",
666 | "</table>\n",
667 | "</div>"
668 | ],
669 | "text/plain": [
670 | " loja escritorio uf operadora valor_bruto\n",
671 | "0 BEMOL FARMA PORTO VELHO 609 RO Claro 486\n",
672 | "1 BEMOL FARMA PORTO VELHO 609 RO Oi 90\n",
673 | "2 BEMOL FARMA PORTO VELHO 609 RO Tim 90"
674 | ]
675 | },
676 | "metadata": {
677 | "tags": []
678 | },
679 | "execution_count": 553
680 | }
681 | ]
682 | },
683 | {
684 | "cell_type": "code",
685 | "metadata": {
686 | "id": "xTO0uH3i_IRA",
687 | "colab_type": "code",
688 | "outputId": "45a1e869-19c4-487d-9b66-723eea19c591",
689 | "colab": {
690 | "base_uri": "https://localhost:8080/",
691 | "height": 142
692 | }
693 | },
694 | "source": [
695 | "df_detalhamentoAC.columns = ['loja', 'escritorio', 'uf', 'operadora', 'valor_bruto']\n",
696 | "df_detalhamentoAC.head(3)"
697 | ],
698 | "execution_count": 0,
699 | "outputs": [
700 | {
701 | "output_type": "execute_result",
702 | "data": {
703 | "text/html": [
704 | "<div>\n",
705 | "\n",
718 | "<table border=\"1\" class=\"dataframe\">\n",
719 | "  <thead>\n",
720 | "    <tr style=\"text-align: right;\">\n",
721 | "      <th></th>\n",
722 | "      <th>loja</th>\n",
723 | "      <th>escritorio</th>\n",
724 | "      <th>uf</th>\n",
725 | "      <th>operadora</th>\n",
726 | "      <th>valor_bruto</th>\n",
727 | "    </tr>\n",
728 | "  </thead>\n",
729 | "  <tbody>\n",
730 | "    <tr>\n",
731 | "      <th>0</th>\n",
732 | "      <td>BEMOL RIO BRANCO</td>\n",
733 | "      <td>401</td>\n",
734 | "      <td>AC</td>\n",
735 | "      <td>Claro</td>\n",
736 | "      <td>500</td>\n",
737 | "    </tr>\n",
738 | "    <tr>\n",
739 | "      <th>1</th>\n",
740 | "      <td>BEMOL RIO BRANCO</td>\n",
741 | "      <td>401</td>\n",
742 | "      <td>AC</td>\n",
743 | "      <td>Oi</td>\n",
744 | "      <td>375</td>\n",
745 | "    </tr>\n",
746 | "    <tr>\n",
747 | "      <th>2</th>\n",
748 | "      <td>BEMOL RIO BRANCO</td>\n",
749 | "      <td>401</td>\n",
750 | "      <td>AC</td>\n",
751 | "      <td>Tim</td>\n",
752 | "      <td>125</td>\n",
753 | "    </tr>\n",
754 | "  </tbody>\n",
755 | "</table>\n",
756 | "</div>"
757 | ],
758 | "text/plain": [
759 | " loja escritorio uf operadora valor_bruto\n",
760 | "0 BEMOL RIO BRANCO 401 AC Claro 500\n",
761 | "1 BEMOL RIO BRANCO 401 AC Oi 375\n",
762 | "2 BEMOL RIO BRANCO 401 AC Tim 125"
763 | ]
764 | },
765 | "metadata": {
766 | "tags": []
767 | },
768 | "execution_count": 554
769 | }
770 | ]
771 | },
772 | {
773 | "cell_type": "markdown",
774 | "metadata": {
775 | "id": "XBJ7PUeX_ojb",
776 | "colab_type": "text"
777 | },
778 | "source": [
779 | "## Checking data types"
780 | ]
781 | },
782 | {
783 | "cell_type": "markdown",
784 | "metadata": {
785 | "id": "-I8eyVFX_-kF",
786 | "colab_type": "text"
787 | },
788 | "source": [
789 | "### Checking types"
790 | ]
791 | },
792 | {
793 | "cell_type": "markdown",
794 | "metadata": {
795 | "id": "0C0cv1GzzOoH",
796 | "colab_type": "text"
797 | },
798 | "source": [
799 | "To check the data types in a dataset, we can use the dtype attribute. \n",
800 | "To check the types of all columns in the dataset, use the plural form, dtypes. The general form is:"
801 | ]
802 | },
803 | {
804 | "cell_type": "markdown",
805 | "metadata": {
806 | "id": "tOd7MnUmz4uT",
807 | "colab_type": "text"
808 | },
809 | "source": [
810 | "df.dtypes"
811 | ]
812 | },
813 | {
814 | "cell_type": "markdown",
815 | "metadata": {
816 | "id": "ANqtD3eHz_dr",
817 | "colab_type": "text"
818 | },
819 | "source": [
820 | "To check the type of one specific column in the dataset, use the singular form, dtype. The general form is:"
821 | ]
822 | },
823 | {
824 | "cell_type": "markdown",
825 | "metadata": {
826 | "id": "LdMmVK4o0PLb",
827 | "colab_type": "text"
828 | },
829 | "source": [
830 | "df['column'].dtype"
831 | ]
832 | },
833 | {
834 | "cell_type": "markdown",
835 | "metadata": {
836 | "id": "CH34i62L0Smy",
837 | "colab_type": "text"
838 | },
839 | "source": [
840 | "Now let's check the types of all columns in our vendas and detalhamento datasets."
841 | ]
842 | },
843 | {
844 | "cell_type": "code",
845 | "metadata": {
846 | "id": "9E4V0A1F_3ku",
847 | "colab_type": "code",
848 | "outputId": "72764df7-b485-4538-d414-b7515cf3f8fc",
849 | "colab": {
850 | "base_uri": "https://localhost:8080/",
851 | "height": 119
852 | }
853 | },
854 | "source": [
855 | "df_vendas.dtypes"
856 | ],
857 | "execution_count": 0,
858 | "outputs": [
859 | {
860 | "output_type": "execute_result",
861 | "data": {
862 | "text/plain": [
863 | "escritorio int64\n",
864 | "operadora object\n",
865 | "material object\n",
866 | "data object\n",
867 | "valor_liquido float64\n",
868 | "dtype: object"
869 | ]
870 | },
871 | "metadata": {
872 | "tags": []
873 | },
874 | "execution_count": 555
875 | }
876 | ]
877 | },
878 | {
879 | "cell_type": "code",
880 | "metadata": {
881 | "id": "dPEgRE5qAEs0",
882 | "colab_type": "code",
883 | "outputId": "36bb18ca-0a92-433f-bfe5-28635b12672b",
884 | "colab": {
885 | "base_uri": "https://localhost:8080/",
886 | "height": 119
887 | }
888 | },
889 | "source": [
890 | "df_detalhamentoAM.dtypes"
891 | ],
892 | "execution_count": 0,
893 | "outputs": [
894 | {
895 | "output_type": "execute_result",
896 | "data": {
897 | "text/plain": [
898 | "loja object\n",
899 | "escritorio int64\n",
900 | "uf object\n",
901 | "operadora object\n",
902 | "valor_bruto int64\n",
903 | "dtype: object"
904 | ]
905 | },
906 | "metadata": {
907 | "tags": []
908 | },
909 | "execution_count": 556
910 | }
911 | ]
912 | },
913 | {
914 | "cell_type": "code",
915 | "metadata": {
916 | "colab_type": "code",
917 | "outputId": "dc774eac-4afa-4a06-e651-8b68d91e5318",
918 | "id": "crpBh_3lAR4q",
919 | "colab": {
920 | "base_uri": "https://localhost:8080/",
921 | "height": 119
922 | }
923 | },
924 | "source": [
925 | "df_detalhamentoRR.dtypes"
926 | ],
927 | "execution_count": 0,
928 | "outputs": [
929 | {
930 | "output_type": "execute_result",
931 | "data": {
932 | "text/plain": [
933 | "loja object\n",
934 | "escritorio int64\n",
935 | "uf object\n",
936 | "operadora object\n",
937 | "valor_bruto int64\n",
938 | "dtype: object"
939 | ]
940 | },
941 | "metadata": {
942 | "tags": []
943 | },
944 | "execution_count": 557
945 | }
946 | ]
947 | },
948 | {
949 | "cell_type": "code",
950 | "metadata": {
951 | "colab_type": "code",
952 | "outputId": "aaa9b412-6553-4a47-d99c-ef332a1f52f9",
953 | "id": "nx7eR1kPAThv",
954 | "colab": {
955 | "base_uri": "https://localhost:8080/",
956 | "height": 119
957 | }
958 | },
959 | "source": [
960 | "df_detalhamentoRO.dtypes"
961 | ],
962 | "execution_count": 0,
963 | "outputs": [
964 | {
965 | "output_type": "execute_result",
966 | "data": {
967 | "text/plain": [
968 | "loja object\n",
969 | "escritorio int64\n",
970 | "uf object\n",
971 | "operadora object\n",
972 | "valor_bruto int64\n",
973 | "dtype: object"
974 | ]
975 | },
976 | "metadata": {
977 | "tags": []
978 | },
979 | "execution_count": 558
980 | }
981 | ]
982 | },
983 | {
984 | "cell_type": "code",
985 | "metadata": {
986 | "colab_type": "code",
987 | "outputId": "f4e2652b-3a11-463d-a19c-60d8a4fd0067",
988 | "id": "hoVPgUY0AVRy",
989 | "colab": {
990 | "base_uri": "https://localhost:8080/",
991 | "height": 119
992 | }
993 | },
994 | "source": [
995 | "df_detalhamentoAC.dtypes"
996 | ],
997 | "execution_count": 0,
998 | "outputs": [
999 | {
1000 | "output_type": "execute_result",
1001 | "data": {
1002 | "text/plain": [
1003 | "loja object\n",
1004 | "escritorio int64\n",
1005 | "uf object\n",
1006 | "operadora object\n",
1007 | "valor_bruto int64\n",
1008 | "dtype: object"
1009 | ]
1010 | },
1011 | "metadata": {
1012 | "tags": []
1013 | },
1014 | "execution_count": 559
1015 | }
1016 | ]
1017 | },
1018 | {
1019 | "cell_type": "markdown",
1020 | "metadata": {
1021 | "id": "XGjBkpAcCvEl",
1022 | "colab_type": "text"
1023 | },
1024 | "source": [
1025 | "### Changing a column's type"
1026 | ]
1027 | },
1028 | {
1029 | "cell_type": "markdown",
1030 | "metadata": {
1031 | "id": "oJSFjF9O04m5",
1032 | "colab_type": "text"
1033 | },
1034 | "source": [
1035 | "A column may be read as object (string) when it needs to be int or float. To change a column's type we use the astype method. The general form is:"
1036 | ]
1037 | },
1038 | {
1039 | "cell_type": "markdown",
1040 | "metadata": {
1041 | "id": "5efvW4J21P_T",
1042 | "colab_type": "text"
1043 | },
1044 | "source": [
1045 | "df['column'] = df['column'].astype(float)"
1046 | ]
1047 | },
1048 | {
1049 | "cell_type": "markdown",
1050 | "metadata": {
1051 | "id": "0TQlXTCL1Weu",
1052 | "colab_type": "text"
1053 | },
1054 | "source": [
1055 | "We need to change the type of the **valor_liquido** column in the **vendas** dataset to int."
1056 | ]
1057 | },
1058 | {
1059 | "cell_type": "code",
1060 | "metadata": {
1061 | "id": "sQs2tEPUCy8N",
1062 | "colab_type": "code",
1063 | "colab": {}
1064 | },
1065 | "source": [
1066 | "df_vendas['valor_liquido'] = df_vendas['valor_liquido'].astype(int)"
1067 | ],
1068 | "execution_count": 0,
1069 | "outputs": []
1070 | },
1071 | {
1072 | "cell_type": "markdown",
1073 | "metadata": {
1074 | "id": "A7H5VO4FBz48",
1075 | "colab_type": "text"
1076 | },
1077 | "source": [
1078 | "## Deleting data"
1079 | ]
1080 | },
1081 | {
1082 | "cell_type": "markdown",
1083 | "metadata": {
1084 | "id": "jyzfKwo21nGE",
1085 | "colab_type": "text"
1086 | },
1087 | "source": [
1088 | "One way to delete columns is the drop method:"
1089 | ]
1090 | },
1091 | {
1092 | "cell_type": "markdown",
1093 | "metadata": {
1094 | "id": "zao9vC1l16Cq",
1095 | "colab_type": "text"
1096 | },
1097 | "source": [
1098 | "df = df.drop(['column1', 'column2', 'column3'], axis='columns')"
1099 | ]
1100 | },
1101 | {
1102 | "cell_type": "markdown",
1103 | "metadata": {
1104 | "id": "9OiaSZep2GJ5",
1105 | "colab_type": "text"
1106 | },
1107 | "source": [
1108 | "Often, though, a better approach is to select the columns we want to keep, listed in the desired order. The remaining columns are simply left out rather than explicitly deleted. The general form is:"
1109 | ]
1110 | },
1111 | {
1112 | "cell_type": "markdown",
1113 | "metadata": {
1114 | "id": "uL8OVDFK2i22",
1115 | "colab_type": "text"
1116 | },
1117 | "source": [
1118 | "df = df[['column1', 'column2']].copy()"
1119 | ]
1120 | },
1121 | {
1122 | "cell_type": "markdown",
1123 | "metadata": {
1124 | "id": "vuic7Rm_2s90",
1125 | "colab_type": "text"
1126 | },
1127 | "source": [
1128 | "Using this form, let's select the columns of our datasets that we will use."
1129 | ]
1130 | },
1131 | {
1132 | "cell_type": "code",
1133 | "metadata": {
1134 | "id": "nCJn1Uh_B3k3",
1135 | "colab_type": "code",
1136 | "colab": {}
1137 | },
1138 | "source": [
1139 | "df_vendas = df_vendas[['escritorio', 'operadora', 'valor_liquido']].copy()"
1140 | ],
1141 | "execution_count": 0,
1142 | "outputs": []
1143 | },
1144 | {
1145 | "cell_type": "code",
1146 | "metadata": {
1147 | "id": "sxtAfp0MCNph",
1148 | "colab_type": "code",
1149 | "colab": {}
1150 | },
1151 | "source": [
1152 | "df_detalhamentoAM = df_detalhamentoAM[['loja', 'escritorio', 'operadora', 'valor_bruto']].copy()\n",
1153 | "df_detalhamentoRR = df_detalhamentoRR[['loja', 'escritorio', 'operadora', 'valor_bruto']].copy()\n",
1154 | "df_detalhamentoRO = df_detalhamentoRO[['loja', 'escritorio', 'operadora', 'valor_bruto']].copy()\n",
1155 | "df_detalhamentoAC = df_detalhamentoAC[['loja', 'escritorio', 'operadora', 'valor_bruto']].copy()"
1156 | ],
1157 | "execution_count": 0,
1158 | "outputs": []
1159 | },
1160 | {
1161 | "cell_type": "markdown",
1162 | "metadata": {
1163 | "id": "DQVElJU-29Kx",
1164 | "colab_type": "text"
1165 | },
1166 | "source": [
1167 | "We could also delete rows with the `drop` method:"
1168 | ]
1169 | },
1170 | {
1171 | "cell_type": "markdown",
1172 | "metadata": {
1173 | "id": "4UwgH7NZ3Q_g",
1174 | "colab_type": "text"
1175 | },
1176 | "source": [
1177 | "`df = df.drop([nome_da_linha], axis='index')`"
1178 | ]
1179 | },
1180 | {
1181 | "cell_type": "markdown",
1182 | "metadata": {
1183 | "id": "c-0xnQOREOGl",
1184 | "colab_type": "text"
1185 | },
1186 | "source": [
1187 | "# Concatenating tables (joining tables)"
1188 | ]
1189 | },
1190 | {
1191 | "cell_type": "markdown",
1192 | "metadata": {
1193 | "id": "YGrQeH-G3ng2",
1194 | "colab_type": "text"
1195 | },
1196 | "source": [
1197 | "Concatenation is used to stack two or more tables that share the same structure (the same columns), such as the XLSX datasets used here, which are split across sheets. The format for concatenating tables is:"
1198 | ]
1199 | },
1200 | {
1201 | "cell_type": "markdown",
1202 | "metadata": {
1203 | "id": "1e3WgIph4KoS",
1204 | "colab_type": "text"
1205 | },
1206 | "source": [
1207 | "`df = pd.concat([df1, df2, df3])`"
1208 | ]
1209 | },
1210 | {
1211 | "cell_type": "markdown",
1212 | "metadata": {
1213 | "id": "28YLEN554PoG",
1214 | "colab_type": "text"
1215 | },
1216 | "source": [
1217 | "We have four tables (datasets) with identical columns that are split across sheets, so we will combine them into a single dataset. They are:"
1218 | ]
1219 | },
1220 | {
1221 | "cell_type": "code",
1222 | "metadata": {
1223 | "id": "MTjNJyz8ESw4",
1224 | "colab_type": "code",
1225 | "outputId": "c1e8ea9b-23e7-45d6-b021-bc93bee5807a",
1226 | "colab": {
1227 | "base_uri": "https://localhost:8080/",
1228 | "height": 68
1229 | }
1230 | },
1231 | "source": [
1232 | "print('Amazonas(AM): ')\n",
1233 | "print(f'Linhas: {df_detalhamentoAM.shape[0]}')\n",
1234 | "print(f'Colunas {df_detalhamentoAM.shape[1]}')"
1235 | ],
1236 | "execution_count": 0,
1237 | "outputs": [
1238 | {
1239 | "output_type": "stream",
1240 | "text": [
1241 | "Amazonas(AM): \n",
1242 | "Linhas: 126\n",
1243 | "Colunas 4\n"
1244 | ],
1245 | "name": "stdout"
1246 | }
1247 | ]
1248 | },
1249 | {
1250 | "cell_type": "code",
1251 | "metadata": {
1252 | "id": "_PA1z9m8Ezg0",
1253 | "colab_type": "code",
1254 | "outputId": "d39bc9de-082b-4331-d7e3-a35704ab99af",
1255 | "colab": {
1256 | "base_uri": "https://localhost:8080/",
1257 | "height": 68
1258 | }
1259 | },
1260 | "source": [
1261 | "print('Roraima(RR): ')\n",
1262 | "print(f'Linhas: {df_detalhamentoRR.shape[0]}')\n",
1263 | "print(f'Colunas {df_detalhamentoRR.shape[1]}')"
1264 | ],
1265 | "execution_count": 0,
1266 | "outputs": [
1267 | {
1268 | "output_type": "stream",
1269 | "text": [
1270 | "Roraima(RR): \n",
1271 | "Linhas: 8\n",
1272 | "Colunas 4\n"
1273 | ],
1274 | "name": "stdout"
1275 | }
1276 | ]
1277 | },
1278 | {
1279 | "cell_type": "code",
1280 | "metadata": {
1281 | "id": "PisJfK7hFQaH",
1282 | "colab_type": "code",
1283 | "outputId": "19df1fba-e11b-4b02-ebd5-21ef06db4184",
1284 | "colab": {
1285 | "base_uri": "https://localhost:8080/",
1286 | "height": 68
1287 | }
1288 | },
1289 | "source": [
1290 | "print('Rondônia(RO): ')\n",
1291 | "print(f'Linhas: {df_detalhamentoRO.shape[0]}')\n",
1292 | "print(f'Colunas {df_detalhamentoRO.shape[1]}')"
1293 | ],
1294 | "execution_count": 0,
1295 | "outputs": [
1296 | {
1297 | "output_type": "stream",
1298 | "text": [
1299 | "Rondônia(RO): \n",
1300 | "Linhas: 24\n",
1301 | "Colunas 4\n"
1302 | ],
1303 | "name": "stdout"
1304 | }
1305 | ]
1306 | },
1307 | {
1308 | "cell_type": "code",
1309 | "metadata": {
1310 | "id": "OBsTi0YQFmDR",
1311 | "colab_type": "code",
1312 | "outputId": "e6cd4f30-94ad-48f9-effd-946d89e4b32d",
1313 | "colab": {
1314 | "base_uri": "https://localhost:8080/",
1315 | "height": 68
1316 | }
1317 | },
1318 | "source": [
1319 | "print('Acre(AC): ')\n",
1320 | "print(f'Linhas: {df_detalhamentoAC.shape[0]}')\n",
1321 | "print(f'Colunas {df_detalhamentoAC.shape[1]}')"
1322 | ],
1323 | "execution_count": 0,
1324 | "outputs": [
1325 | {
1326 | "output_type": "stream",
1327 | "text": [
1328 | "Acre(AC): \n",
1329 | "Linhas: 4\n",
1330 | "Colunas 4\n"
1331 | ],
1332 | "name": "stdout"
1333 | }
1334 | ]
1335 | },
1336 | {
1337 | "cell_type": "markdown",
1338 | "metadata": {
1339 | "id": "HCALI1SoGTmd",
1340 | "colab_type": "text"
1341 | },
1342 | "source": [
1343 | "Now let's concatenate all of these datasets into a single table (DataFrame)."
1344 | ]
1345 | },
1346 | {
1347 | "cell_type": "code",
1348 | "metadata": {
1349 | "id": "FkpuDsmdFu7u",
1350 | "colab_type": "code",
1351 | "colab": {}
1352 | },
1353 | "source": [
1354 | "df_detalhamento = pd.concat([df_detalhamentoAM,df_detalhamentoRR, df_detalhamentoRO, df_detalhamentoAC])"
1355 | ],
1356 | "execution_count": 0,
1357 | "outputs": []
1358 | },
1359 | {
1360 | "cell_type": "markdown",
1361 | "metadata": {
1362 | "id": "tE4Vt-9Q4-A8",
1363 | "colab_type": "text"
1364 | },
1365 | "source": [
1366 | "After the tables are concatenated, the pandas index is out of order, because each source table keeps its own index labels. To get a clean sequential index we need to reset it."
1367 | ]
1368 | },
1369 | {
1370 | "cell_type": "code",
1371 | "metadata": {
1372 | "id": "e0I_QywJGwz2",
1373 | "colab_type": "code",
1374 | "outputId": "674ba9e6-f445-42e4-eff8-040cb7757475",
1375 | "colab": {
1376 | "base_uri": "https://localhost:8080/",
1377 | "height": 419
1378 | }
1379 | },
1380 | "source": [
1381 | "df_detalhamento = df_detalhamento.reset_index(drop=True)\n",
1382 | "df_detalhamento"
1383 | ],
1384 | "execution_count": 0,
1385 | "outputs": [
1386 | {
1387 | "output_type": "execute_result",
1388 | "data": {
1497 | "text/plain": [
1498 | " loja escritorio operadora valor_bruto\n",
1499 | "0 BEMOL AVENIDA 103 Claro 2651\n",
1500 | "1 BEMOL AVENIDA 103 Oi 3309\n",
1501 | "2 BEMOL AVENIDA 103 Tim 1550\n",
1502 | "3 BEMOL AVENIDA 103 Vivo 4739\n",
1503 | "4 BEMOL BARROSO 107 Claro 1495\n",
1504 | ".. ... ... ... ...\n",
1505 | "157 LOJA BEMOL JI-PARANA 205 Claro 123\n",
1506 | "158 BEMOL RIO BRANCO 401 Claro 500\n",
1507 | "159 BEMOL RIO BRANCO 401 Oi 375\n",
1508 | "160 BEMOL RIO BRANCO 401 Tim 125\n",
1509 | "161 BEMOL RIO BRANCO 401 Vivo 343\n",
1510 | "\n",
1511 | "[162 rows x 4 columns]"
1512 | ]
1513 | },
1514 | "metadata": {
1515 | "tags": []
1516 | },
1517 | "execution_count": 568
1518 | }
1519 | ]
1520 | },
1521 | {
1522 | "cell_type": "markdown",
1523 | "metadata": {
1524 | "id": "V5o4Rov1MnhI",
1525 | "colab_type": "text"
1526 | },
1527 | "source": [
1528 | "## Grouping data"
1529 | ]
1530 | },
1531 | {
1532 | "cell_type": "markdown",
1533 | "metadata": {
1534 | "id": "hggwPjdq5fTE",
1535 | "colab_type": "text"
1536 | },
1537 | "source": [
1538 | "Grouping collects the rows that share the same values in one or more key columns. It always pairs a `groupby` call, which names the key columns, with an aggregation (`agg`) that dictates how the rows of the remaining columns are combined. The format is:"
1539 | ]
1540 | },
1541 | {
1542 | "cell_type": "markdown",
1543 | "metadata": {
1544 | "id": "4kYDDaG16CZS",
1545 | "colab_type": "text"
1546 | },
1547 | "source": [
1548 | "`df = df.groupby(['coluna1', 'coluna2']).agg({'coluna3': 'metodo', 'coluna4': 'metodo'}).reset_index()`"
1550 | ]
1551 | },
1552 | {
1553 | "cell_type": "markdown",
1554 | "metadata": {
1555 | "id": "47wBkwck6cc9",
1556 | "colab_type": "text"
1557 | },
1558 | "source": [
1559 | "Several aggregation methods can be applied to the repeated rows of the columns that were not used as grouping keys. The most common are mean (average), sum, max (largest value), min (smallest value), first (first value), and last (last value)."
1560 | ]
1561 | },
1562 | {
1563 | "cell_type": "markdown",
1564 | "metadata": {
1565 | "id": "v8AbC92562zf",
1566 | "colab_type": "text"
1567 | },
1568 | "source": [
1569 | "Let's now **group** our sales dataset by the **operadora** and **escritorio** columns. For the remaining column we use the **sum** method, which adds up the rows that repeat within each group and collapses them into a single record (row)."
1570 | ]
1571 | },
1572 | {
1573 | "cell_type": "code",
1574 | "metadata": {
1575 | "id": "UgK_bDLSOGU4",
1576 | "colab_type": "code",
1577 | "outputId": "f7f33063-66df-481a-dce7-994813efc28b",
1578 | "colab": {
1579 | "base_uri": "https://localhost:8080/",
1580 | "height": 419
1581 | }
1582 | },
1583 | "source": [
1584 | "df_vendas = df_vendas.groupby(['operadora', 'escritorio']).agg({'valor_liquido':'sum'})\n",
1585 | "df_vendas.sort_values(ascending=False, by='valor_liquido').reset_index()"
1586 | ],
1587 | "execution_count": 0,
1588 | "outputs": [
1589 | {
1590 | "output_type": "execute_result",
1591 | "data": {
1688 | "text/plain": [
1689 | " operadora escritorio valor_liquido\n",
1690 | "0 Vivo 103 4739\n",
1691 | "1 Vivo 115 4616\n",
1692 | "2 Vivo 604 4474\n",
1693 | "3 Vivo 500 3751\n",
1694 | "4 Vivo 106 3717\n",
1695 | ".. ... ... ...\n",
1696 | "158 Oi 701 20\n",
1697 | "159 Tim 530 20\n",
1698 | "160 Vivo 206 15\n",
1699 | "161 Tim 614 10\n",
1700 | "162 Tim 611 10\n",
1701 | "\n",
1702 | "[163 rows x 3 columns]"
1703 | ]
1704 | },
1705 | "metadata": {
1706 | "tags": []
1707 | },
1708 | "execution_count": 569
1709 | }
1710 | ]
1711 | },
1712 | {
1713 | "cell_type": "markdown",
1714 | "metadata": {
1715 | "id": "UhJ6cEqP7g82",
1716 | "colab_type": "text"
1717 | },
1718 | "source": [
1719 | "For our detalhamento dataset, we **group** by the **loja**, **operadora**, and **escritorio** columns, again using the **sum** method for the remaining column."
1720 | ]
1721 | },
1722 | {
1723 | "cell_type": "code",
1724 | "metadata": {
1725 | "id": "RtXcTrj6Q0W8",
1726 | "colab_type": "code",
1727 | "outputId": "06d68b08-97e9-4077-8931-b88383f6f92d",
1728 | "colab": {
1729 | "base_uri": "https://localhost:8080/",
1730 | "height": 419
1731 | }
1732 | },
1733 | "source": [
1734 | "df_detalhamento = df_detalhamento.groupby(['loja', 'operadora', 'escritorio']).agg({'valor_bruto':'sum'}).reset_index()\n",
1735 | "df_detalhamento.sort_values(ascending=False, by='valor_bruto').reset_index()"
1736 | ],
1737 | "execution_count": 0,
1738 | "outputs": [
1739 | {
1740 | "output_type": "execute_result",
1741 | "data": {
1862 | "text/plain": [
1863 | " index loja operadora escritorio valor_bruto\n",
1864 | "0 3 BEMOL AVENIDA Vivo 103 4739\n",
1865 | "1 19 BEMOL CIDADE NOVA Vivo 115 4616\n",
1866 | "2 43 BEMOL FARMA GRANDE CIRCULAR Vivo 604 4474\n",
1867 | "3 149 BEMOL TORQUATO Vivo 500 3751\n",
1868 | "4 137 BEMOL SHOPPING Vivo 106 3704\n",
1869 | ".. ... ... ... ... ...\n",
1870 | "157 156 LOJA BEMOL FARMA ARIQUEMES Vivo 613 35\n",
1871 | "158 128 BEMOL PRESIDENTE FIGUEIREDO Tim 530 20\n",
1872 | "159 9 BEMOL BOA VISTA Oi 701 20\n",
1873 | "160 152 LOJA BEMOL ARIQUEMES Vivo 206 15\n",
1874 | "161 46 BEMOL FARMA ITACOATIARA Tim 611 10\n",
1875 | "\n",
1876 | "[162 rows x 5 columns]"
1877 | ]
1878 | },
1879 | "metadata": {
1880 | "tags": []
1881 | },
1882 | "execution_count": 570
1883 | }
1884 | ]
1885 | },
1886 | {
1887 | "cell_type": "markdown",
1888 | "metadata": {
1889 | "id": "5LhLOHgPLJeE",
1890 | "colab_type": "text"
1891 | },
1892 | "source": [
1893 | "# **Relating Tables**"
1894 | ]
1895 | },
1896 | {
1897 | "cell_type": "markdown",
1898 | "metadata": {
1899 | "id": "chYLModgVmXg",
1900 | "colab_type": "text"
1901 | },
1902 | "source": [
1903 | "## Merging tables"
1904 | ]
1905 | },
1906 | {
1907 | "cell_type": "markdown",
1908 | "metadata": {
1909 | "id": "mvvcEPqt8EN3",
1910 | "colab_type": "text"
1911 | },
1912 | "source": [
1913 | "The **merge** function joins two tables (datasets) that have one or more columns in common. The format is:"
1914 | ]
1915 | },
1916 | {
1917 | "cell_type": "markdown",
1918 | "metadata": {
1919 | "id": "Tp8YYjaY8lI8",
1920 | "colab_type": "text"
1921 | },
1922 | "source": [
1923 | "`df = pd.merge(df1, df2, on=['common columns'], how='join type')`"
1924 | ]
1925 | },
1926 | {
1927 | "cell_type": "markdown",
1928 | "metadata": {
1929 | "id": "JJ9DSICO8syE",
1930 | "colab_type": "text"
1931 | },
1932 | "source": [
1933 | "The `how` argument (the join type) determines which table's keys are kept in the result: `left` keeps every key from the first table, `right` keeps every key from the second table, `inner` keeps only keys present in both, and `outer` keeps keys from either one."
1934 | ]
1935 | },
1936 | {
1937 | "cell_type": "markdown",
1938 | "metadata": {
1939 | "id": "e0CF8ptl9mfR",
1940 | "colab_type": "text"
1941 | },
1942 | "source": [
1943 | "Let's now join our two datasets, **df_detalhamento** and **df_vendas**, which have two columns in common, and create a **new DataFrame**."
1944 | ]
1945 | },
1946 | {
1947 | "cell_type": "code",
1948 | "metadata": {
1949 | "id": "6hUUq70OVv1f",
1950 | "colab_type": "code",
1951 | "colab": {}
1952 | },
1953 | "source": [
1954 | "df_relatorio = pd.merge(df_detalhamento, df_vendas, on=['operadora', 'escritorio'], how='right')"
1955 | ],
1956 | "execution_count": 0,
1957 | "outputs": []
1958 | },
1959 | {
1960 | "cell_type": "code",
1961 | "metadata": {
1962 | "id": "b_bT-o9JanEz",
1963 | "colab_type": "code",
1964 | "outputId": "35495f72-49fc-4858-e03a-9673c533957c",
1965 | "colab": {
1966 | "base_uri": "https://localhost:8080/",
1967 | "height": 204
1968 | }
1969 | },
1970 | "source": [
1971 | "df_relatorio.head(5)"
1972 | ],
1973 | "execution_count": 0,
1974 | "outputs": [
1975 | {
1976 | "output_type": "execute_result",
1977 | "data": {
2049 | "text/plain": [
2050 | " loja operadora escritorio valor_bruto valor_liquido\n",
2051 | "0 BEMOL AVENIDA Claro 103 2651.0 2652\n",
2052 | "1 BEMOL AVENIDA Oi 103 3309.0 3309\n",
2053 | "2 BEMOL AVENIDA Tim 103 1550.0 1550\n",
2054 | "3 BEMOL AVENIDA Vivo 103 4739.0 4739\n",
2055 | "4 BEMOL BARROSO Claro 107 1495.0 1495"
2056 | ]
2057 | },
2058 | "metadata": {
2059 | "tags": []
2060 | },
2061 | "execution_count": 572
2062 | }
2063 | ]
2064 | },
2065 | {
2066 | "cell_type": "markdown",
2067 | "metadata": {
2068 | "id": "l8GKtQwE909W",
2069 | "colab_type": "text"
2070 | },
2071 | "source": [
2072 | "As you can see, the **how** argument was set to `right`, so **df_vendas** (the second table) takes priority. \n",
2073 | "In this case every row of **df_vendas** is kept and then matched against the first table on the common columns **operadora** and **escritorio**."
2074 | ]
2075 | },
2076 | {
2077 | "cell_type": "markdown",
2078 | "metadata": {
2079 | "id": "ervT41cr_NZL",
2080 | "colab_type": "text"
2081 | },
2082 | "source": [
2083 | "If a combination of key values exists only in **df_vendas**, the columns that come from **df_detalhamento** are filled with NaN for that row."
2084 | ]
2085 | },
2086 | {
2087 | "cell_type": "markdown",
2088 | "metadata": {
2089 | "id": "BCO_GKEZYicx",
2090 | "colab_type": "text"
2091 | },
2092 | "source": [
2093 | "### Cleaning up after the merge"
2094 | ]
2095 | },
2096 | {
2097 | "cell_type": "markdown",
2098 | "metadata": {
2099 | "id": "WOsGAFHyAE2C",
2100 | "colab_type": "text"
2101 | },
2102 | "source": [
2103 | "When NaN (null or missing) values appear in a table, it is sometimes necessary to replace them with another value. One way to do this is with the **fillna** method. The format is:"
2104 | ]
2105 | },
2106 | {
2107 | "cell_type": "markdown",
2108 | "metadata": {
2109 | "id": "NdXWk4ROAlhT",
2110 | "colab_type": "text"
2111 | },
2112 | "source": [
2113 | "To replace the NaN values in a specific column:"
2114 | ]
2115 | },
2116 | {
2117 | "cell_type": "markdown",
2118 | "metadata": {
2119 | "id": "IfmnRWylAwSM",
2120 | "colab_type": "text"
2121 | },
2122 | "source": [
2123 | "`df['coluna'] = df['coluna'].fillna(novo_valor)`"
2124 | ]
2125 | },
2126 | {
2127 | "cell_type": "markdown",
2128 | "metadata": {
2129 | "id": "y4d1YbUqA2Y1",
2130 | "colab_type": "text"
2131 | },
2132 | "source": [
2133 | "To replace the NaN values in the entire table (dataset):"
2134 | ]
2135 | },
2136 | {
2137 | "cell_type": "markdown",
2138 | "metadata": {
2139 | "id": "m-eUjNg3A8tX",
2140 | "colab_type": "text"
2141 | },
2142 | "source": [
2143 | "`df = df.fillna(novo_valor)`"
2144 | ]
2145 | },
2146 | {
2147 | "cell_type": "markdown",
2148 | "metadata": {
2149 | "id": "8QAmzB8cZMBr",
2150 | "colab_type": "text"
2151 | },
2152 | "source": [
2153 | "I did not spot missing values while inspecting the data, but a right merge can still introduce them, so I use fillna to replace any NaN with 0. Note that this also puts 0 in text columns such as loja."
2154 | ]
2155 | },
2156 | {
2157 | "cell_type": "code",
2158 | "metadata": {
2159 | "id": "f73RYBvIY6xG",
2160 | "colab_type": "code",
2161 | "colab": {}
2162 | },
2163 | "source": [
2164 | "df_relatorio = df_relatorio.fillna(0)"
2165 | ],
2166 | "execution_count": 0,
2167 | "outputs": []
2168 | },
2169 | {
2170 | "cell_type": "markdown",
2171 | "metadata": {
2172 | "id": "Jqz2pF2hZjJ9",
2173 | "colab_type": "text"
2174 | },
2175 | "source": [
2176 | "# **Business Rule**"
2177 | ]
2178 | },
2179 | {
2180 | "cell_type": "markdown",
2181 | "metadata": {
2182 | "id": "cL9Y55lNBGrj",
2183 | "colab_type": "text"
2184 | },
2185 | "source": [
2186 | "## Operations between columns (with a condition)"
2187 | ]
2188 | },
2189 | {
2190 | "cell_type": "markdown",
2191 | "metadata": {
2192 | "id": "StLYTElsBLfm",
2193 | "colab_type": "text"
2194 | },
2195 | "source": [
2196 | "When a calculation on the table depends on a condition, it helps to write a function so that the condition can be applied conveniently to every row."
2197 | ]
2198 | },
2199 | {
2200 | "cell_type": "code",
2201 | "metadata": {
2202 | "id": "F0a-UYZMZk0h",
2203 | "colab_type": "code",
2204 | "colab": {}
2205 | },
2206 | "source": [
2207 | "def status(valor_liquido, valor_bruto):\n",
2208 | "    # Flag rows whose net and gross values disagree\n",
2209 | "    if valor_liquido != valor_bruto:\n",
2210 | "        return 'alerta'\n",
2211 | "    return ''"
2212 | ],
2213 | "execution_count": 0,
2214 | "outputs": []
2215 | },
2216 | {
2217 | "cell_type": "markdown",
2218 | "metadata": {
2219 | "id": "UtBOrNUkBiv2",
2220 | "colab_type": "text"
2221 | },
2222 | "source": [
2223 | "From this function we will create a new column, the status column. \n",
2224 | "To apply the function to every row of the table (dataset), use the `apply` method together with a `lambda` function. The format is:"
2225 | ]
2226 | },
2227 | {
2228 | "cell_type": "markdown",
2229 | "metadata": {
2230 | "id": "ukSlG1TXB-o3",
2231 | "colab_type": "text"
2232 | },
2233 | "source": [
2234 | "`df['nova_coluna'] = df.apply(lambda row: funcao(row['coluna1']), axis='columns')`"
2235 | ]
2236 | },
2237 | {
2238 | "cell_type": "markdown",
2239 | "metadata": {
2240 | "id": "-gNS_R3yCTHc",
2241 | "colab_type": "text"
2242 | },
2243 | "source": [
2244 | "Let's now apply the function to every row of our table (dataset) **df_relatorio** and create a new column."
2245 | ]
2246 | },
2247 | {
2248 | "cell_type": "code",
2249 | "metadata": {
2250 | "id": "ZhGPIR0Ydbfu",
2251 | "colab_type": "code",
2252 | "colab": {}
2253 | },
2254 | "source": [
2255 | "df_relatorio['status'] = df_relatorio.apply(lambda row: status(row['valor_liquido'], row['valor_bruto']), axis='columns')"
2256 | ],
2257 | "execution_count": 0,
2258 | "outputs": []
2259 | },
2260 | {
2261 | "cell_type": "markdown",
2262 | "metadata": {
2263 | "id": "Mga9rIDFCsV4",
2264 | "colab_type": "text"
2265 | },
2266 | "source": [
2267 | "**row** represents the current row: the lambda reads the columns it needs from it and passes them as arguments to the function."
2268 | ]
2269 | },
2270 | {
2271 | "cell_type": "code",
2272 | "metadata": {
2273 | "id": "1Z87RyeCdvJE",
2274 | "colab_type": "code",
2275 | "outputId": "40e42c72-cbff-48c2-9e95-62f2836cf642",
2276 | "colab": {
2277 | "base_uri": "https://localhost:8080/",
2278 | "height": 419
2279 | }
2280 | },
2281 | "source": [
2282 | "df_relatorio"
2283 | ],
2284 | "execution_count": 0,
2285 | "outputs": [
2286 | {
2287 | "output_type": "execute_result",
2288 | "data": {
2421 | "text/plain": [
2422 | " loja operadora escritorio valor_bruto valor_liquido status\n",
2423 | "0 BEMOL AVENIDA Claro 103 2651.0 2652 alerta\n",
2424 | "1 BEMOL AVENIDA Oi 103 3309.0 3309 \n",
2425 | "2 BEMOL AVENIDA Tim 103 1550.0 1550 \n",
2426 | "3 BEMOL AVENIDA Vivo 103 4739.0 4739 \n",
2427 | "4 BEMOL BARROSO Claro 107 1495.0 1495 \n",
2428 | ".. ... ... ... ... ... ...\n",
2429 | "158 LOJA BEMOL FARMA JI PARANA Oi 616 95.0 95 \n",
2430 | "159 LOJA BEMOL FARMA JI PARANA Tim 616 50.0 50 \n",
2431 | "160 LOJA BEMOL FARMA JI PARANA Vivo 616 70.0 70 \n",
2432 | "161 LOJA BEMOL JI-PARANA Claro 205 123.0 123 \n",
2433 | "162 0 Tim 614 0.0 10 alerta\n",
2434 | "\n",
2435 | "[163 rows x 6 columns]"
2436 | ]
2437 | },
2438 | "metadata": {
2439 | "tags": []
2440 | },
2441 | "execution_count": 576
2442 | }
2443 | ]
2444 | },
2445 | {
2446 | "cell_type": "markdown",
2447 | "metadata": {
2448 | "id": "qb16VJO1knHE",
2449 | "colab_type": "text"
2450 | },
2451 | "source": [
2452 | "# **Exporting Data**"
2453 | ]
2454 | },
2455 | {
2456 | "cell_type": "markdown",
2457 | "metadata": {
2458 | "id": "oYw5spf3DCVU",
2459 | "colab_type": "text"
2460 | },
2461 | "source": [
2462 | "## Exporting to Excel (XLSX)"
2463 | ]
2464 | },
2465 | {
2466 | "cell_type": "markdown",
2467 | "metadata": {
2468 | "id": "Z9EhdGFDDKRF",
2469 | "colab_type": "text"
2470 | },
2471 | "source": [
2472 | "First, create a writer for a blank Excel file. The format is:"
2473 | ]
2474 | },
2475 | {
2476 | "cell_type": "markdown",
2477 | "metadata": {
2478 | "id": "zMKdyhUwDWPD",
2479 | "colab_type": "text"
2480 | },
2481 | "source": [
2482 | "`arquivo = pd.ExcelWriter('nome_do_arquivo.xlsx', engine='xlsxwriter')`"
2483 | ]
2484 | },
2485 | {
2486 | "cell_type": "markdown",
2487 | "metadata": {
2488 | "id": "iABrZYNADXRT",
2489 | "colab_type": "text"
2490 | },
2491 | "source": [
2492 | "Once the file is created, we can fill it with the table (dataset) we built."
2493 | ]
2494 | },
2495 | {
2496 | "cell_type": "markdown",
2497 | "metadata": {
2498 | "id": "mHJz9EEgDnkT",
2499 | "colab_type": "text"
2500 | },
2501 | "source": [
2502 |         "df.to_excel(arquivo, sheet_name='name of the sheet where the data will be written', index=False)"
2503 | ]
2504 | },
2505 | {
2506 | "cell_type": "markdown",
2507 | "metadata": {
2508 | "id": "Te_Z3bDpDqW0",
2509 | "colab_type": "text"
2510 | },
2511 | "source": [
2512 |         "Passing index=False keeps the pandas index out of the saved file. The last step is to write the file to disk."
2513 | ]
2514 | },
2515 | {
2516 | "cell_type": "markdown",
2517 | "metadata": {
2518 | "id": "OgU89Ko-D7NK",
2519 | "colab_type": "text"
2520 | },
2521 | "source": [
2522 |         "arquivo.close()"
2523 | ]
2524 | },
2525 | {
2526 | "cell_type": "markdown",
2527 | "metadata": {
2528 | "id": "I1G7GnidD83h",
2529 | "colab_type": "text"
2530 | },
2531 | "source": [
2532 |         "Now let's save the table (dataset) we built in XLSX (Excel) format."
2533 | ]
2534 | },
2535 | {
2536 | "cell_type": "code",
2537 | "metadata": {
2538 | "id": "0RZW8ifxk59U",
2539 | "colab_type": "code",
2540 | "colab": {}
2541 | },
2542 | "source": [
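2543 |         "# Create the Excel writer for the final report\n",
2544 |         "writer = pd.ExcelWriter('Relatório final.xlsx')\n",
2545 |         "\n",
2546 |         "# Write the report to the 'relatorio' sheet without the pandas index\n",
2547 |         "df_relatorio.to_excel(writer, sheet_name='relatorio', index=False)\n",
2548 |         "\n",
2549 |         "# close() writes the file to disk (save() was removed in pandas 2.0)\n",
2550 |         "writer.close()"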
2548 | ],
2549 | "execution_count": 0,
2550 | "outputs": []
2551 | },
2552 | {
2553 | "cell_type": "markdown",
2554 | "metadata": {
2555 | "id": "8yh20ehmEWGb",
2556 | "colab_type": "text"
2557 | },
2558 | "source": [
2559 |         "## Exporting to CSV"
2560 | ]
2561 | },
2562 | {
2563 | "cell_type": "markdown",
2564 | "metadata": {
2565 | "id": "7vuEgHVIEaGz",
2566 | "colab_type": "text"
2567 | },
2568 | "source": [
2569 |         "We could also export to a file with the .csv extension by following this form:"
2570 | ]
2571 | },
2572 | {
2573 | "cell_type": "code",
2574 | "metadata": {
2575 | "id": "K3LcvHTXEtnf",
2576 | "colab_type": "code",
2577 | "colab": {}
2578 | },
2579 | "source": [
2580 |         "df_relatorio.to_csv('relatorio_final.csv', sep=';', index=False)"
2581 | ],
2582 | "execution_count": 0,
2583 | "outputs": []
2584 | },
2585 | {
2586 | "cell_type": "markdown",
2587 | "metadata": {
2588 | "id": "g7I9y8_3E5CE",
2589 | "colab_type": "text"
2590 | },
2591 | "source": [
2592 |         "A semicolon is commonly used as the field separator, especially where the comma serves as the decimal mark."
2593 | ]
2594 | },
2595 | {
2596 | "cell_type": "markdown",
2597 | "metadata": {
2598 | "id": "B0wQkk2NFCui",
2599 | "colab_type": "text"
2600 | },
2601 | "source": [
2602 |         "# **Conclusion**"
2603 | ]
2604 | },
2605 | {
2606 | "cell_type": "markdown",
2607 | "metadata": {
2608 | "id": "j4XObjuNFI06",
2609 | "colab_type": "text"
2610 | },
2611 | "source": [
2612 |         "We can download an executable Python script from this notebook and place it in the same folder as the files used to build our final table (dataset). We could then change those files and, by double-clicking the script, have it regenerate an updated table (dataset) to serve as the company's report."
2613 | ]
2614 | }
2615 | ]
2616 | }
--------------------------------------------------------------------------------
/Geração de Indicadores/README.md:
--------------------------------------------------------------------------------
1 | [](https://www.linkedin.com/in/luislauriano/) [](https://www.python.org/downloads/release/python-365/) [](http://perso.crans.org/besson/LICENSE.html) [](https://github.com/luislauriano/data_science)
2 |
3 | This project was built to clean and manipulate data from the Bemol company's XLSX and CSV files, such as sales and materials, and to generate a single XLSX file that serves as a report and indicator for the company. If the source files that were cleaned and prepared for the report are changed, the executable is ready to generate a new report with an updated dataset.
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
--------------------------------------------------------------------------------
/Geração de Indicadores/Relatório Final.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/luislauriano/Data_Science/5ff00f3ff280c36108d9a88a1108adc070d6f3bd/Geração de Indicadores/Relatório Final.xlsx
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 luislauriano
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/Possiveis resultados copa do mundo 2022/README.md:
--------------------------------------------------------------------------------
1 | [](https://www.linkedin.com/in/luislauriano/) [](https://www.python.org/downloads/release/python-365/) [](http://perso.crans.org/besson/LICENSE.html) [](https://github.com/luislauriano/data_science)
2 |
3 | # Machine learning project to predict possible match results for the 2022 World Cup
4 |
5 |
6 | This project was made out of curiosity and to study machine learning. Its goal is to develop a model capable of predicting possible results for the matches of the 2022 World Cup, all the way to the overall winner of the tournament.
7 |
8 | How was this done? From the data analysis and an understanding of the problem, it became clear that on average the home team scores more goals and that the number of goals per match follows a Poisson distribution, which is consistent with how football matches go. The idea was therefore to build a Generalized Linear Model based on Poisson regression. Poisson regression predicts a count, and for this problem the count is the number of goals per match, which leads to the probable winner of the match and the chances of a win, a draw and a loss.
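
To illustrate how expected goals turn into win, draw and loss chances, here is a minimal sketch. It assumes the two teams' goal counts are independent Poisson variables. The function names and the expected-goal values (1.6 and 1.1) are hypothetical and are not taken from the project's model.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected number of goals is lam."""
    return exp(-lam) * lam**k / factorial(k)


def match_probabilities(lam_home: float, lam_away: float, max_goals: int = 10):
    """Return (home win, draw, away win) probabilities.

    Assumes independent Poisson goal counts for each team and truncates the
    score grid at max_goals goals per team.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            # Probability of the exact score h x a under independence.
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Hypothetical expected goals, not values produced by the project's model.
p_home, p_draw, p_away = match_probabilities(1.6, 1.1)
print(f"home win {p_home:.3f}, draw {p_draw:.3f}, away win {p_away:.3f}")
```

In a fitted Poisson regression, the two expected-goal values would come from the model's predictions for each team rather than being typed in by hand.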
9 |
10 |
11 |
12 |
13 |
14 |
15 | ---
16 |
17 | After the project was finished, the idea was also to build a Streamlit application for the model and deploy it with Heroku, so that other people can use the model and obtain possible results for any match of the tournament. The application can also simulate how the group stage and the knockout stage might turn out, all the way to the winning team of the 2022 World Cup.
18 |
19 |
20 | * **[Repository/Source code of the application](https://github.com/luislauriano/Aplicacao_CopaDoMundo22-projeto)**
21 |
22 | * **[Web application](https://luislauriano-aplicacao-copadomundo22-projeto-app-r5k28a.streamlit.app/)**
23 |
24 |
25 |
26 |
27 |
28 |
29 | ---
30 |
31 | ## Possible final results from the model
32 |
33 | Keep in mind that the match result is chosen as the most probable one, given the probability of each possible result (win, draw and loss), which a function computes from the probabilities of all possible scores. For this reason, the match result may turn out different if tested again. To make this clearer, picture the Doctor Strange scene in the movie Infinity War, where he finds a single outcome in which they win the war among all the possible outcomes the war against Thanos could have.
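
One way to read the note above is that the outcome is drawn at random in proportion to the three probabilities, which would explain why a repeated run can give a different result. That reading is an assumption, and the sketch below only contrasts the two options. The probabilities are hypothetical values, not output from the project's model.

```python
import random

# Hypothetical probabilities for one match (home win, draw, away win).
outcomes = ["home win", "draw", "away win"]
probabilities = [0.48, 0.25, 0.27]

# Deterministic choice: always the single most likely outcome.
most_likely = outcomes[probabilities.index(max(probabilities))]

# Random choice weighted by probability: may change from run to run.
rng = random.Random()  # pass a seed, e.g. random.Random(42), to reproduce a run
simulated = rng.choices(outcomes, weights=probabilities, k=5)

print(most_likely)
print(simulated)
```

Picking the most likely outcome always returns the same answer for the same probabilities, while the weighted draw reproduces the run-to-run variation described above.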
34 |
35 | Another observation: the predicted score often ends up repeating 1x0, perhaps because the model is not precise enough to quantify larger goal differences well. What matters is to understand and take into account which team the model predicts as the probable winner of the match.
36 |
37 | ### Test 01
38 | This is the project's main bet for the final result of the knockout-stage games of the 2022 World Cup. It follows the possible results given by the model, using FIFA international match data from 2010 to 2022 and FIFA ranking data from 1992 to 2021, in an attempt to get better model performance.
39 |
40 |
41 |
42 |
43 | ### Test 02
44 | For test 02, the idea was again to follow the possible results given by the model, this time using FIFA international match data from 1993 to 2022 (unlike test 01) and FIFA ranking data from 1992 to 2021, in an attempt to get better model performance.
45 |
46 |
47 |
48 |
49 |
50 |
51 |
52 |
53 |
54 |
--------------------------------------------------------------------------------
/Prevendo Partidas de League of Legends/README.md:
--------------------------------------------------------------------------------
1 | [](https://www.linkedin.com/in/luislauriano/) [](https://www.python.org/downloads/release/python-365/) [](http://perso.crans.org/besson/LICENSE.html) [](https://github.com/luislauriano/data_science)
2 |
3 | # League of Legends and Data Science - Predicting match results
4 |
5 | What happens if I combine League of Legends and Data Science?
6 |
7 | Well, this is my newest Machine Learning project, an end-to-end project that goes from collecting match data to building a machine learning model that predicts the odds of the team playing on the blue side of the map winning. It covers steps such as pre-processing and data analysis, dimensionality reduction and variable selection, and building both a final model with XGBClassifier and a logistic regression model based on the results obtained from AutoML with PyCaret.
8 |
9 |
10 |
11 |
12 |
13 |
14 | ---
15 |
--------------------------------------------------------------------------------
/Previsão de Ocorrência de Diabetes/Dados.csv:
--------------------------------------------------------------------------------
1 | Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
2 | 6,148,72,35,0,33.6,0.627,50,1
3 | 1,85,66,29,0,26.6,0.351,31,0
4 | 8,183,64,0,0,23.3,0.672,32,1
5 | 1,89,66,23,94,28.1,0.167,21,0
6 | 0,137,40,35,168,43.1,2.288,33,1
7 | 5,116,74,0,0,25.6,0.201,30,0
8 | 3,78,50,32,88,31,0.248,26,1
9 | 10,115,0,0,0,35.3,0.134,29,0
10 | 2,197,70,45,543,30.5,0.158,53,1
11 | 8,125,96,0,0,0,0.232,54,1
12 | 4,110,92,0,0,37.6,0.191,30,0
13 | 10,168,74,0,0,38,0.537,34,1
14 | 10,139,80,0,0,27.1,1.441,57,0
15 | 1,189,60,23,846,30.1,0.398,59,1
16 | 5,166,72,19,175,25.8,0.587,51,1
17 | 7,100,0,0,0,30,0.484,32,1
18 | 0,118,84,47,230,45.8,0.551,31,1
19 | 7,107,74,0,0,29.6,0.254,31,1
20 | 1,103,30,38,83,43.3,0.183,33,0
21 | 1,115,70,30,96,34.6,0.529,32,1
22 | 3,126,88,41,235,39.3,0.704,27,0
23 | 8,99,84,0,0,35.4,0.388,50,0
24 | 7,196,90,0,0,39.8,0.451,41,1
25 | 9,119,80,35,0,29,0.263,29,1
26 | 11,143,94,33,146,36.6,0.254,51,1
27 | 10,125,70,26,115,31.1,0.205,41,1
28 | 7,147,76,0,0,39.4,0.257,43,1
29 | 1,97,66,15,140,23.2,0.487,22,0
30 | 13,145,82,19,110,22.2,0.245,57,0
31 | 5,117,92,0,0,34.1,0.337,38,0
32 | 5,109,75,26,0,36,0.546,60,0
33 | 3,158,76,36,245,31.6,0.851,28,1
34 | 3,88,58,11,54,24.8,0.267,22,0
35 | 6,92,92,0,0,19.9,0.188,28,0
36 | 10,122,78,31,0,27.6,0.512,45,0
37 | 4,103,60,33,192,24,0.966,33,0
38 | 11,138,76,0,0,33.2,0.42,35,0
39 | 9,102,76,37,0,32.9,0.665,46,1
40 | 2,90,68,42,0,38.2,0.503,27,1
41 | 4,111,72,47,207,37.1,1.39,56,1
42 | 3,180,64,25,70,34,0.271,26,0
43 | 7,133,84,0,0,40.2,0.696,37,0
44 | 7,106,92,18,0,22.7,0.235,48,0
45 | 9,171,110,24,240,45.4,0.721,54,1
46 | 7,159,64,0,0,27.4,0.294,40,0
47 | 0,180,66,39,0,42,1.893,25,1
48 | 1,146,56,0,0,29.7,0.564,29,0
49 | 2,71,70,27,0,28,0.586,22,0
50 | 7,103,66,32,0,39.1,0.344,31,1
51 | 7,105,0,0,0,0,0.305,24,0
52 | 1,103,80,11,82,19.4,0.491,22,0
53 | 1,101,50,15,36,24.2,0.526,26,0
54 | 5,88,66,21,23,24.4,0.342,30,0
55 | 8,176,90,34,300,33.7,0.467,58,1
56 | 7,150,66,42,342,34.7,0.718,42,0
57 | 1,73,50,10,0,23,0.248,21,0
58 | 7,187,68,39,304,37.7,0.254,41,1
59 | 0,100,88,60,110,46.8,0.962,31,0
60 | 0,146,82,0,0,40.5,1.781,44,0
61 | 0,105,64,41,142,41.5,0.173,22,0
62 | 2,84,0,0,0,0,0.304,21,0
63 | 8,133,72,0,0,32.9,0.27,39,1
64 | 5,44,62,0,0,25,0.587,36,0
65 | 2,141,58,34,128,25.4,0.699,24,0
66 | 7,114,66,0,0,32.8,0.258,42,1
67 | 5,99,74,27,0,29,0.203,32,0
68 | 0,109,88,30,0,32.5,0.855,38,1
69 | 2,109,92,0,0,42.7,0.845,54,0
70 | 1,95,66,13,38,19.6,0.334,25,0
71 | 4,146,85,27,100,28.9,0.189,27,0
72 | 2,100,66,20,90,32.9,0.867,28,1
73 | 5,139,64,35,140,28.6,0.411,26,0
74 | 13,126,90,0,0,43.4,0.583,42,1
75 | 4,129,86,20,270,35.1,0.231,23,0
76 | 1,79,75,30,0,32,0.396,22,0
77 | 1,0,48,20,0,24.7,0.14,22,0
78 | 7,62,78,0,0,32.6,0.391,41,0
79 | 5,95,72,33,0,37.7,0.37,27,0
80 | 0,131,0,0,0,43.2,0.27,26,1
81 | 2,112,66,22,0,25,0.307,24,0
82 | 3,113,44,13,0,22.4,0.14,22,0
83 | 2,74,0,0,0,0,0.102,22,0
84 | 7,83,78,26,71,29.3,0.767,36,0
85 | 0,101,65,28,0,24.6,0.237,22,0
86 | 5,137,108,0,0,48.8,0.227,37,1
87 | 2,110,74,29,125,32.4,0.698,27,0
88 | 13,106,72,54,0,36.6,0.178,45,0
89 | 2,100,68,25,71,38.5,0.324,26,0
90 | 15,136,70,32,110,37.1,0.153,43,1
91 | 1,107,68,19,0,26.5,0.165,24,0
92 | 1,80,55,0,0,19.1,0.258,21,0
93 | 4,123,80,15,176,32,0.443,34,0
94 | 7,81,78,40,48,46.7,0.261,42,0
95 | 4,134,72,0,0,23.8,0.277,60,1
96 | 2,142,82,18,64,24.7,0.761,21,0
97 | 6,144,72,27,228,33.9,0.255,40,0
98 | 2,92,62,28,0,31.6,0.13,24,0
99 | 1,71,48,18,76,20.4,0.323,22,0
100 | 6,93,50,30,64,28.7,0.356,23,0
101 | 1,122,90,51,220,49.7,0.325,31,1
102 | 1,163,72,0,0,39,1.222,33,1
103 | 1,151,60,0,0,26.1,0.179,22,0
104 | 0,125,96,0,0,22.5,0.262,21,0
105 | 1,81,72,18,40,26.6,0.283,24,0
106 | 2,85,65,0,0,39.6,0.93,27,0
107 | 1,126,56,29,152,28.7,0.801,21,0
108 | 1,96,122,0,0,22.4,0.207,27,0
109 | 4,144,58,28,140,29.5,0.287,37,0
110 | 3,83,58,31,18,34.3,0.336,25,0
111 | 0,95,85,25,36,37.4,0.247,24,1
112 | 3,171,72,33,135,33.3,0.199,24,1
113 | 8,155,62,26,495,34,0.543,46,1
114 | 1,89,76,34,37,31.2,0.192,23,0
115 | 4,76,62,0,0,34,0.391,25,0
116 | 7,160,54,32,175,30.5,0.588,39,1
117 | 4,146,92,0,0,31.2,0.539,61,1
118 | 5,124,74,0,0,34,0.22,38,1
119 | 5,78,48,0,0,33.7,0.654,25,0
120 | 4,97,60,23,0,28.2,0.443,22,0
121 | 4,99,76,15,51,23.2,0.223,21,0
122 | 0,162,76,56,100,53.2,0.759,25,1
123 | 6,111,64,39,0,34.2,0.26,24,0
124 | 2,107,74,30,100,33.6,0.404,23,0
125 | 5,132,80,0,0,26.8,0.186,69,0
126 | 0,113,76,0,0,33.3,0.278,23,1
127 | 1,88,30,42,99,55,0.496,26,1
128 | 3,120,70,30,135,42.9,0.452,30,0
129 | 1,118,58,36,94,33.3,0.261,23,0
130 | 1,117,88,24,145,34.5,0.403,40,1
131 | 0,105,84,0,0,27.9,0.741,62,1
132 | 4,173,70,14,168,29.7,0.361,33,1
133 | 9,122,56,0,0,33.3,1.114,33,1
134 | 3,170,64,37,225,34.5,0.356,30,1
135 | 8,84,74,31,0,38.3,0.457,39,0
136 | 2,96,68,13,49,21.1,0.647,26,0
137 | 2,125,60,20,140,33.8,0.088,31,0
138 | 0,100,70,26,50,30.8,0.597,21,0
139 | 0,93,60,25,92,28.7,0.532,22,0
140 | 0,129,80,0,0,31.2,0.703,29,0
141 | 5,105,72,29,325,36.9,0.159,28,0
142 | 3,128,78,0,0,21.1,0.268,55,0
143 | 5,106,82,30,0,39.5,0.286,38,0
144 | 2,108,52,26,63,32.5,0.318,22,0
145 | 10,108,66,0,0,32.4,0.272,42,1
146 | 4,154,62,31,284,32.8,0.237,23,0
147 | 0,102,75,23,0,0,0.572,21,0
148 | 9,57,80,37,0,32.8,0.096,41,0
149 | 2,106,64,35,119,30.5,1.4,34,0
150 | 5,147,78,0,0,33.7,0.218,65,0
151 | 2,90,70,17,0,27.3,0.085,22,0
152 | 1,136,74,50,204,37.4,0.399,24,0
153 | 4,114,65,0,0,21.9,0.432,37,0
154 | 9,156,86,28,155,34.3,1.189,42,1
155 | 1,153,82,42,485,40.6,0.687,23,0
156 | 8,188,78,0,0,47.9,0.137,43,1
157 | 7,152,88,44,0,50,0.337,36,1
158 | 2,99,52,15,94,24.6,0.637,21,0
159 | 1,109,56,21,135,25.2,0.833,23,0
160 | 2,88,74,19,53,29,0.229,22,0
161 | 17,163,72,41,114,40.9,0.817,47,1
162 | 4,151,90,38,0,29.7,0.294,36,0
163 | 7,102,74,40,105,37.2,0.204,45,0
164 | 0,114,80,34,285,44.2,0.167,27,0
165 | 2,100,64,23,0,29.7,0.368,21,0
166 | 0,131,88,0,0,31.6,0.743,32,1
167 | 6,104,74,18,156,29.9,0.722,41,1
168 | 3,148,66,25,0,32.5,0.256,22,0
169 | 4,120,68,0,0,29.6,0.709,34,0
170 | 4,110,66,0,0,31.9,0.471,29,0
171 | 3,111,90,12,78,28.4,0.495,29,0
172 | 6,102,82,0,0,30.8,0.18,36,1
173 | 6,134,70,23,130,35.4,0.542,29,1
174 | 2,87,0,23,0,28.9,0.773,25,0
175 | 1,79,60,42,48,43.5,0.678,23,0
176 | 2,75,64,24,55,29.7,0.37,33,0
177 | 8,179,72,42,130,32.7,0.719,36,1
178 | 6,85,78,0,0,31.2,0.382,42,0
179 | 0,129,110,46,130,67.1,0.319,26,1
180 | 5,143,78,0,0,45,0.19,47,0
181 | 5,130,82,0,0,39.1,0.956,37,1
182 | 6,87,80,0,0,23.2,0.084,32,0
183 | 0,119,64,18,92,34.9,0.725,23,0
184 | 1,0,74,20,23,27.7,0.299,21,0
185 | 5,73,60,0,0,26.8,0.268,27,0
186 | 4,141,74,0,0,27.6,0.244,40,0
187 | 7,194,68,28,0,35.9,0.745,41,1
188 | 8,181,68,36,495,30.1,0.615,60,1
189 | 1,128,98,41,58,32,1.321,33,1
190 | 8,109,76,39,114,27.9,0.64,31,1
191 | 5,139,80,35,160,31.6,0.361,25,1
192 | 3,111,62,0,0,22.6,0.142,21,0
193 | 9,123,70,44,94,33.1,0.374,40,0
194 | 7,159,66,0,0,30.4,0.383,36,1
195 | 11,135,0,0,0,52.3,0.578,40,1
196 | 8,85,55,20,0,24.4,0.136,42,0
197 | 5,158,84,41,210,39.4,0.395,29,1
198 | 1,105,58,0,0,24.3,0.187,21,0
199 | 3,107,62,13,48,22.9,0.678,23,1
200 | 4,109,64,44,99,34.8,0.905,26,1
201 | 4,148,60,27,318,30.9,0.15,29,1
202 | 0,113,80,16,0,31,0.874,21,0
203 | 1,138,82,0,0,40.1,0.236,28,0
204 | 0,108,68,20,0,27.3,0.787,32,0
205 | 2,99,70,16,44,20.4,0.235,27,0
206 | 6,103,72,32,190,37.7,0.324,55,0
207 | 5,111,72,28,0,23.9,0.407,27,0
208 | 8,196,76,29,280,37.5,0.605,57,1
209 | 5,162,104,0,0,37.7,0.151,52,1
210 | 1,96,64,27,87,33.2,0.289,21,0
211 | 7,184,84,33,0,35.5,0.355,41,1
212 | 2,81,60,22,0,27.7,0.29,25,0
213 | 0,147,85,54,0,42.8,0.375,24,0
214 | 7,179,95,31,0,34.2,0.164,60,0
215 | 0,140,65,26,130,42.6,0.431,24,1
216 | 9,112,82,32,175,34.2,0.26,36,1
217 | 12,151,70,40,271,41.8,0.742,38,1
218 | 5,109,62,41,129,35.8,0.514,25,1
219 | 6,125,68,30,120,30,0.464,32,0
220 | 5,85,74,22,0,29,1.224,32,1
221 | 5,112,66,0,0,37.8,0.261,41,1
222 | 0,177,60,29,478,34.6,1.072,21,1
223 | 2,158,90,0,0,31.6,0.805,66,1
224 | 7,119,0,0,0,25.2,0.209,37,0
225 | 7,142,60,33,190,28.8,0.687,61,0
226 | 1,100,66,15,56,23.6,0.666,26,0
227 | 1,87,78,27,32,34.6,0.101,22,0
228 | 0,101,76,0,0,35.7,0.198,26,0
229 | 3,162,52,38,0,37.2,0.652,24,1
230 | 4,197,70,39,744,36.7,2.329,31,0
231 | 0,117,80,31,53,45.2,0.089,24,0
232 | 4,142,86,0,0,44,0.645,22,1
233 | 6,134,80,37,370,46.2,0.238,46,1
234 | 1,79,80,25,37,25.4,0.583,22,0
235 | 4,122,68,0,0,35,0.394,29,0
236 | 3,74,68,28,45,29.7,0.293,23,0
237 | 4,171,72,0,0,43.6,0.479,26,1
238 | 7,181,84,21,192,35.9,0.586,51,1
239 | 0,179,90,27,0,44.1,0.686,23,1
240 | 9,164,84,21,0,30.8,0.831,32,1
241 | 0,104,76,0,0,18.4,0.582,27,0
242 | 1,91,64,24,0,29.2,0.192,21,0
243 | 4,91,70,32,88,33.1,0.446,22,0
244 | 3,139,54,0,0,25.6,0.402,22,1
245 | 6,119,50,22,176,27.1,1.318,33,1
246 | 2,146,76,35,194,38.2,0.329,29,0
247 | 9,184,85,15,0,30,1.213,49,1
248 | 10,122,68,0,0,31.2,0.258,41,0
249 | 0,165,90,33,680,52.3,0.427,23,0
250 | 9,124,70,33,402,35.4,0.282,34,0
251 | 1,111,86,19,0,30.1,0.143,23,0
252 | 9,106,52,0,0,31.2,0.38,42,0
253 | 2,129,84,0,0,28,0.284,27,0
254 | 2,90,80,14,55,24.4,0.249,24,0
255 | 0,86,68,32,0,35.8,0.238,25,0
256 | 12,92,62,7,258,27.6,0.926,44,1
257 | 1,113,64,35,0,33.6,0.543,21,1
258 | 3,111,56,39,0,30.1,0.557,30,0
259 | 2,114,68,22,0,28.7,0.092,25,0
260 | 1,193,50,16,375,25.9,0.655,24,0
261 | 11,155,76,28,150,33.3,1.353,51,1
262 | 3,191,68,15,130,30.9,0.299,34,0
263 | 3,141,0,0,0,30,0.761,27,1
264 | 4,95,70,32,0,32.1,0.612,24,0
265 | 3,142,80,15,0,32.4,0.2,63,0
266 | 4,123,62,0,0,32,0.226,35,1
267 | 5,96,74,18,67,33.6,0.997,43,0
268 | 0,138,0,0,0,36.3,0.933,25,1
269 | 2,128,64,42,0,40,1.101,24,0
270 | 0,102,52,0,0,25.1,0.078,21,0
271 | 2,146,0,0,0,27.5,0.24,28,1
272 | 10,101,86,37,0,45.6,1.136,38,1
273 | 2,108,62,32,56,25.2,0.128,21,0
274 | 3,122,78,0,0,23,0.254,40,0
275 | 1,71,78,50,45,33.2,0.422,21,0
276 | 13,106,70,0,0,34.2,0.251,52,0
277 | 2,100,70,52,57,40.5,0.677,25,0
278 | 7,106,60,24,0,26.5,0.296,29,1
279 | 0,104,64,23,116,27.8,0.454,23,0
280 | 5,114,74,0,0,24.9,0.744,57,0
281 | 2,108,62,10,278,25.3,0.881,22,0
282 | 0,146,70,0,0,37.9,0.334,28,1
283 | 10,129,76,28,122,35.9,0.28,39,0
284 | 7,133,88,15,155,32.4,0.262,37,0
285 | 7,161,86,0,0,30.4,0.165,47,1
286 | 2,108,80,0,0,27,0.259,52,1
287 | 7,136,74,26,135,26,0.647,51,0
288 | 5,155,84,44,545,38.7,0.619,34,0
289 | 1,119,86,39,220,45.6,0.808,29,1
290 | 4,96,56,17,49,20.8,0.34,26,0
291 | 5,108,72,43,75,36.1,0.263,33,0
292 | 0,78,88,29,40,36.9,0.434,21,0
293 | 0,107,62,30,74,36.6,0.757,25,1
294 | 2,128,78,37,182,43.3,1.224,31,1
295 | 1,128,48,45,194,40.5,0.613,24,1
296 | 0,161,50,0,0,21.9,0.254,65,0
297 | 6,151,62,31,120,35.5,0.692,28,0
298 | 2,146,70,38,360,28,0.337,29,1
299 | 0,126,84,29,215,30.7,0.52,24,0
300 | 14,100,78,25,184,36.6,0.412,46,1
301 | 8,112,72,0,0,23.6,0.84,58,0
302 | 0,167,0,0,0,32.3,0.839,30,1
303 | 2,144,58,33,135,31.6,0.422,25,1
304 | 5,77,82,41,42,35.8,0.156,35,0
305 | 5,115,98,0,0,52.9,0.209,28,1
306 | 3,150,76,0,0,21,0.207,37,0
307 | 2,120,76,37,105,39.7,0.215,29,0
308 | 10,161,68,23,132,25.5,0.326,47,1
309 | 0,137,68,14,148,24.8,0.143,21,0
310 | 0,128,68,19,180,30.5,1.391,25,1
311 | 2,124,68,28,205,32.9,0.875,30,1
312 | 6,80,66,30,0,26.2,0.313,41,0
313 | 0,106,70,37,148,39.4,0.605,22,0
314 | 2,155,74,17,96,26.6,0.433,27,1
315 | 3,113,50,10,85,29.5,0.626,25,0
316 | 7,109,80,31,0,35.9,1.127,43,1
317 | 2,112,68,22,94,34.1,0.315,26,0
318 | 3,99,80,11,64,19.3,0.284,30,0
319 | 3,182,74,0,0,30.5,0.345,29,1
320 | 3,115,66,39,140,38.1,0.15,28,0
321 | 6,194,78,0,0,23.5,0.129,59,1
322 | 4,129,60,12,231,27.5,0.527,31,0
323 | 3,112,74,30,0,31.6,0.197,25,1
324 | 0,124,70,20,0,27.4,0.254,36,1
325 | 13,152,90,33,29,26.8,0.731,43,1
326 | 2,112,75,32,0,35.7,0.148,21,0
327 | 1,157,72,21,168,25.6,0.123,24,0
328 | 1,122,64,32,156,35.1,0.692,30,1
329 | 10,179,70,0,0,35.1,0.2,37,0
330 | 2,102,86,36,120,45.5,0.127,23,1
331 | 6,105,70,32,68,30.8,0.122,37,0
332 | 8,118,72,19,0,23.1,1.476,46,0
333 | 2,87,58,16,52,32.7,0.166,25,0
334 | 1,180,0,0,0,43.3,0.282,41,1
335 | 12,106,80,0,0,23.6,0.137,44,0
336 | 1,95,60,18,58,23.9,0.26,22,0
337 | 0,165,76,43,255,47.9,0.259,26,0
338 | 0,117,0,0,0,33.8,0.932,44,0
339 | 5,115,76,0,0,31.2,0.343,44,1
340 | 9,152,78,34,171,34.2,0.893,33,1
341 | 7,178,84,0,0,39.9,0.331,41,1
342 | 1,130,70,13,105,25.9,0.472,22,0
343 | 1,95,74,21,73,25.9,0.673,36,0
344 | 1,0,68,35,0,32,0.389,22,0
345 | 5,122,86,0,0,34.7,0.29,33,0
346 | 8,95,72,0,0,36.8,0.485,57,0
347 | 8,126,88,36,108,38.5,0.349,49,0
348 | 1,139,46,19,83,28.7,0.654,22,0
349 | 3,116,0,0,0,23.5,0.187,23,0
350 | 3,99,62,19,74,21.8,0.279,26,0
351 | 5,0,80,32,0,41,0.346,37,1
352 | 4,92,80,0,0,42.2,0.237,29,0
353 | 4,137,84,0,0,31.2,0.252,30,0
354 | 3,61,82,28,0,34.4,0.243,46,0
355 | 1,90,62,12,43,27.2,0.58,24,0
356 | 3,90,78,0,0,42.7,0.559,21,0
357 | 9,165,88,0,0,30.4,0.302,49,1
358 | 1,125,50,40,167,33.3,0.962,28,1
359 | 13,129,0,30,0,39.9,0.569,44,1
360 | 12,88,74,40,54,35.3,0.378,48,0
361 | 1,196,76,36,249,36.5,0.875,29,1
362 | 5,189,64,33,325,31.2,0.583,29,1
363 | 5,158,70,0,0,29.8,0.207,63,0
364 | 5,103,108,37,0,39.2,0.305,65,0
365 | 4,146,78,0,0,38.5,0.52,67,1
366 | 4,147,74,25,293,34.9,0.385,30,0
367 | 5,99,54,28,83,34,0.499,30,0
368 | 6,124,72,0,0,27.6,0.368,29,1
369 | 0,101,64,17,0,21,0.252,21,0
370 | 3,81,86,16,66,27.5,0.306,22,0
371 | 1,133,102,28,140,32.8,0.234,45,1
372 | 3,173,82,48,465,38.4,2.137,25,1
373 | 0,118,64,23,89,0,1.731,21,0
374 | 0,84,64,22,66,35.8,0.545,21,0
375 | 2,105,58,40,94,34.9,0.225,25,0
376 | 2,122,52,43,158,36.2,0.816,28,0
377 | 12,140,82,43,325,39.2,0.528,58,1
378 | 0,98,82,15,84,25.2,0.299,22,0
379 | 1,87,60,37,75,37.2,0.509,22,0
380 | 4,156,75,0,0,48.3,0.238,32,1
381 | 0,93,100,39,72,43.4,1.021,35,0
382 | 1,107,72,30,82,30.8,0.821,24,0
383 | 0,105,68,22,0,20,0.236,22,0
384 | 1,109,60,8,182,25.4,0.947,21,0
385 | 1,90,62,18,59,25.1,1.268,25,0
386 | 1,125,70,24,110,24.3,0.221,25,0
387 | 1,119,54,13,50,22.3,0.205,24,0
388 | 5,116,74,29,0,32.3,0.66,35,1
389 | 8,105,100,36,0,43.3,0.239,45,1
390 | 5,144,82,26,285,32,0.452,58,1
391 | 3,100,68,23,81,31.6,0.949,28,0
392 | 1,100,66,29,196,32,0.444,42,0
393 | 5,166,76,0,0,45.7,0.34,27,1
394 | 1,131,64,14,415,23.7,0.389,21,0
395 | 4,116,72,12,87,22.1,0.463,37,0
396 | 4,158,78,0,0,32.9,0.803,31,1
397 | 2,127,58,24,275,27.7,1.6,25,0
398 | 3,96,56,34,115,24.7,0.944,39,0
399 | 0,131,66,40,0,34.3,0.196,22,1
400 | 3,82,70,0,0,21.1,0.389,25,0
401 | 3,193,70,31,0,34.9,0.241,25,1
402 | 4,95,64,0,0,32,0.161,31,1
403 | 6,137,61,0,0,24.2,0.151,55,0
404 | 5,136,84,41,88,35,0.286,35,1
405 | 9,72,78,25,0,31.6,0.28,38,0
406 | 5,168,64,0,0,32.9,0.135,41,1
407 | 2,123,48,32,165,42.1,0.52,26,0
408 | 4,115,72,0,0,28.9,0.376,46,1
409 | 0,101,62,0,0,21.9,0.336,25,0
410 | 8,197,74,0,0,25.9,1.191,39,1
411 | 1,172,68,49,579,42.4,0.702,28,1
412 | 6,102,90,39,0,35.7,0.674,28,0
413 | 1,112,72,30,176,34.4,0.528,25,0
414 | 1,143,84,23,310,42.4,1.076,22,0
415 | 1,143,74,22,61,26.2,0.256,21,0
416 | 0,138,60,35,167,34.6,0.534,21,1
417 | 3,173,84,33,474,35.7,0.258,22,1
418 | 1,97,68,21,0,27.2,1.095,22,0
419 | 4,144,82,32,0,38.5,0.554,37,1
420 | 1,83,68,0,0,18.2,0.624,27,0
421 | 3,129,64,29,115,26.4,0.219,28,1
422 | 1,119,88,41,170,45.3,0.507,26,0
423 | 2,94,68,18,76,26,0.561,21,0
424 | 0,102,64,46,78,40.6,0.496,21,0
425 | 2,115,64,22,0,30.8,0.421,21,0
426 | 8,151,78,32,210,42.9,0.516,36,1
427 | 4,184,78,39,277,37,0.264,31,1
428 | 0,94,0,0,0,0,0.256,25,0
429 | 1,181,64,30,180,34.1,0.328,38,1
430 | 0,135,94,46,145,40.6,0.284,26,0
431 | 1,95,82,25,180,35,0.233,43,1
432 | 2,99,0,0,0,22.2,0.108,23,0
433 | 3,89,74,16,85,30.4,0.551,38,0
434 | 1,80,74,11,60,30,0.527,22,0
435 | 2,139,75,0,0,25.6,0.167,29,0
436 | 1,90,68,8,0,24.5,1.138,36,0
437 | 0,141,0,0,0,42.4,0.205,29,1
438 | 12,140,85,33,0,37.4,0.244,41,0
439 | 5,147,75,0,0,29.9,0.434,28,0
440 | 1,97,70,15,0,18.2,0.147,21,0
441 | 6,107,88,0,0,36.8,0.727,31,0
442 | 0,189,104,25,0,34.3,0.435,41,1
443 | 2,83,66,23,50,32.2,0.497,22,0
444 | 4,117,64,27,120,33.2,0.23,24,0
445 | 8,108,70,0,0,30.5,0.955,33,1
446 | 4,117,62,12,0,29.7,0.38,30,1
447 | 0,180,78,63,14,59.4,2.42,25,1
448 | 1,100,72,12,70,25.3,0.658,28,0
449 | 0,95,80,45,92,36.5,0.33,26,0
450 | 0,104,64,37,64,33.6,0.51,22,1
451 | 0,120,74,18,63,30.5,0.285,26,0
452 | 1,82,64,13,95,21.2,0.415,23,0
453 | 2,134,70,0,0,28.9,0.542,23,1
454 | 0,91,68,32,210,39.9,0.381,25,0
455 | 2,119,0,0,0,19.6,0.832,72,0
456 | 2,100,54,28,105,37.8,0.498,24,0
457 | 14,175,62,30,0,33.6,0.212,38,1
458 | 1,135,54,0,0,26.7,0.687,62,0
459 | 5,86,68,28,71,30.2,0.364,24,0
460 | 10,148,84,48,237,37.6,1.001,51,1
461 | 9,134,74,33,60,25.9,0.46,81,0
462 | 9,120,72,22,56,20.8,0.733,48,0
463 | 1,71,62,0,0,21.8,0.416,26,0
464 | 8,74,70,40,49,35.3,0.705,39,0
465 | 5,88,78,30,0,27.6,0.258,37,0
466 | 10,115,98,0,0,24,1.022,34,0
467 | 0,124,56,13,105,21.8,0.452,21,0
468 | 0,74,52,10,36,27.8,0.269,22,0
469 | 0,97,64,36,100,36.8,0.6,25,0
470 | 8,120,0,0,0,30,0.183,38,1
471 | 6,154,78,41,140,46.1,0.571,27,0
472 | 1,144,82,40,0,41.3,0.607,28,0
473 | 0,137,70,38,0,33.2,0.17,22,0
474 | 0,119,66,27,0,38.8,0.259,22,0
475 | 7,136,90,0,0,29.9,0.21,50,0
476 | 4,114,64,0,0,28.9,0.126,24,0
477 | 0,137,84,27,0,27.3,0.231,59,0
478 | 2,105,80,45,191,33.7,0.711,29,1
479 | 7,114,76,17,110,23.8,0.466,31,0
480 | 8,126,74,38,75,25.9,0.162,39,0
481 | 4,132,86,31,0,28,0.419,63,0
482 | 3,158,70,30,328,35.5,0.344,35,1
483 | 0,123,88,37,0,35.2,0.197,29,0
484 | 4,85,58,22,49,27.8,0.306,28,0
485 | 0,84,82,31,125,38.2,0.233,23,0
486 | 0,145,0,0,0,44.2,0.63,31,1
487 | 0,135,68,42,250,42.3,0.365,24,1
488 | 1,139,62,41,480,40.7,0.536,21,0
489 | 0,173,78,32,265,46.5,1.159,58,0
490 | 4,99,72,17,0,25.6,0.294,28,0
491 | 8,194,80,0,0,26.1,0.551,67,0
492 | 2,83,65,28,66,36.8,0.629,24,0
493 | 2,89,90,30,0,33.5,0.292,42,0
494 | 4,99,68,38,0,32.8,0.145,33,0
495 | 4,125,70,18,122,28.9,1.144,45,1
496 | 3,80,0,0,0,0,0.174,22,0
497 | 6,166,74,0,0,26.6,0.304,66,0
498 | 5,110,68,0,0,26,0.292,30,0
499 | 2,81,72,15,76,30.1,0.547,25,0
500 | 7,195,70,33,145,25.1,0.163,55,1
501 | 6,154,74,32,193,29.3,0.839,39,0
502 | 2,117,90,19,71,25.2,0.313,21,0
503 | 3,84,72,32,0,37.2,0.267,28,0
504 | 6,0,68,41,0,39,0.727,41,1
505 | 7,94,64,25,79,33.3,0.738,41,0
506 | 3,96,78,39,0,37.3,0.238,40,0
507 | 10,75,82,0,0,33.3,0.263,38,0
508 | 0,180,90,26,90,36.5,0.314,35,1
509 | 1,130,60,23,170,28.6,0.692,21,0
510 | 2,84,50,23,76,30.4,0.968,21,0
511 | 8,120,78,0,0,25,0.409,64,0
512 | 12,84,72,31,0,29.7,0.297,46,1
513 | 0,139,62,17,210,22.1,0.207,21,0
514 | 9,91,68,0,0,24.2,0.2,58,0
515 | 2,91,62,0,0,27.3,0.525,22,0
516 | 3,99,54,19,86,25.6,0.154,24,0
517 | 3,163,70,18,105,31.6,0.268,28,1
518 | 9,145,88,34,165,30.3,0.771,53,1
519 | 7,125,86,0,0,37.6,0.304,51,0
520 | 13,76,60,0,0,32.8,0.18,41,0
521 | 6,129,90,7,326,19.6,0.582,60,0
522 | 2,68,70,32,66,25,0.187,25,0
523 | 3,124,80,33,130,33.2,0.305,26,0
524 | 6,114,0,0,0,0,0.189,26,0
525 | 9,130,70,0,0,34.2,0.652,45,1
526 | 3,125,58,0,0,31.6,0.151,24,0
527 | 3,87,60,18,0,21.8,0.444,21,0
528 | 1,97,64,19,82,18.2,0.299,21,0
529 | 3,116,74,15,105,26.3,0.107,24,0
530 | 0,117,66,31,188,30.8,0.493,22,0
531 | 0,111,65,0,0,24.6,0.66,31,0
532 | 2,122,60,18,106,29.8,0.717,22,0
533 | 0,107,76,0,0,45.3,0.686,24,0
534 | 1,86,66,52,65,41.3,0.917,29,0
535 | 6,91,0,0,0,29.8,0.501,31,0
536 | 1,77,56,30,56,33.3,1.251,24,0
537 | 4,132,0,0,0,32.9,0.302,23,1
538 | 0,105,90,0,0,29.6,0.197,46,0
539 | 0,57,60,0,0,21.7,0.735,67,0
540 | 0,127,80,37,210,36.3,0.804,23,0
541 | 3,129,92,49,155,36.4,0.968,32,1
542 | 8,100,74,40,215,39.4,0.661,43,1
543 | 3,128,72,25,190,32.4,0.549,27,1
544 | 10,90,85,32,0,34.9,0.825,56,1
545 | 4,84,90,23,56,39.5,0.159,25,0
546 | 1,88,78,29,76,32,0.365,29,0
547 | 8,186,90,35,225,34.5,0.423,37,1
548 | 5,187,76,27,207,43.6,1.034,53,1
549 | 4,131,68,21,166,33.1,0.16,28,0
550 | 1,164,82,43,67,32.8,0.341,50,0
551 | 4,189,110,31,0,28.5,0.68,37,0
552 | 1,116,70,28,0,27.4,0.204,21,0
553 | 3,84,68,30,106,31.9,0.591,25,0
554 | 6,114,88,0,0,27.8,0.247,66,0
555 | 1,88,62,24,44,29.9,0.422,23,0
556 | 1,84,64,23,115,36.9,0.471,28,0
557 | 7,124,70,33,215,25.5,0.161,37,0
558 | 1,97,70,40,0,38.1,0.218,30,0
559 | 8,110,76,0,0,27.8,0.237,58,0
560 | 11,103,68,40,0,46.2,0.126,42,0
561 | 11,85,74,0,0,30.1,0.3,35,0
562 | 6,125,76,0,0,33.8,0.121,54,1
563 | 0,198,66,32,274,41.3,0.502,28,1
564 | 1,87,68,34,77,37.6,0.401,24,0
565 | 6,99,60,19,54,26.9,0.497,32,0
566 | 0,91,80,0,0,32.4,0.601,27,0
567 | 2,95,54,14,88,26.1,0.748,22,0
568 | 1,99,72,30,18,38.6,0.412,21,0
569 | 6,92,62,32,126,32,0.085,46,0
570 | 4,154,72,29,126,31.3,0.338,37,0
571 | 0,121,66,30,165,34.3,0.203,33,1
572 | 3,78,70,0,0,32.5,0.27,39,0
573 | 2,130,96,0,0,22.6,0.268,21,0
574 | 3,111,58,31,44,29.5,0.43,22,0
575 | 2,98,60,17,120,34.7,0.198,22,0
576 | 1,143,86,30,330,30.1,0.892,23,0
577 | 1,119,44,47,63,35.5,0.28,25,0
578 | 6,108,44,20,130,24,0.813,35,0
579 | 2,118,80,0,0,42.9,0.693,21,1
580 | 10,133,68,0,0,27,0.245,36,0
581 | 2,197,70,99,0,34.7,0.575,62,1
582 | 0,151,90,46,0,42.1,0.371,21,1
583 | 6,109,60,27,0,25,0.206,27,0
584 | 12,121,78,17,0,26.5,0.259,62,0
585 | 8,100,76,0,0,38.7,0.19,42,0
586 | 8,124,76,24,600,28.7,0.687,52,1
587 | 1,93,56,11,0,22.5,0.417,22,0
588 | 8,143,66,0,0,34.9,0.129,41,1
589 | 6,103,66,0,0,24.3,0.249,29,0
590 | 3,176,86,27,156,33.3,1.154,52,1
591 | 0,73,0,0,0,21.1,0.342,25,0
592 | 11,111,84,40,0,46.8,0.925,45,1
593 | 2,112,78,50,140,39.4,0.175,24,0
594 | 3,132,80,0,0,34.4,0.402,44,1
595 | 2,82,52,22,115,28.5,1.699,25,0
596 | 6,123,72,45,230,33.6,0.733,34,0
597 | 0,188,82,14,185,32,0.682,22,1
598 | 0,67,76,0,0,45.3,0.194,46,0
599 | 1,89,24,19,25,27.8,0.559,21,0
600 | 1,173,74,0,0,36.8,0.088,38,1
601 | 1,109,38,18,120,23.1,0.407,26,0
602 | 1,108,88,19,0,27.1,0.4,24,0
603 | 6,96,0,0,0,23.7,0.19,28,0
604 | 1,124,74,36,0,27.8,0.1,30,0
605 | 7,150,78,29,126,35.2,0.692,54,1
606 | 4,183,0,0,0,28.4,0.212,36,1
607 | 1,124,60,32,0,35.8,0.514,21,0
608 | 1,181,78,42,293,40,1.258,22,1
609 | 1,92,62,25,41,19.5,0.482,25,0
610 | 0,152,82,39,272,41.5,0.27,27,0
611 | 1,111,62,13,182,24,0.138,23,0
612 | 3,106,54,21,158,30.9,0.292,24,0
613 | 3,174,58,22,194,32.9,0.593,36,1
614 | 7,168,88,42,321,38.2,0.787,40,1
615 | 6,105,80,28,0,32.5,0.878,26,0
616 | 11,138,74,26,144,36.1,0.557,50,1
617 | 3,106,72,0,0,25.8,0.207,27,0
618 | 6,117,96,0,0,28.7,0.157,30,0
619 | 2,68,62,13,15,20.1,0.257,23,0
620 | 9,112,82,24,0,28.2,1.282,50,1
621 | 0,119,0,0,0,32.4,0.141,24,1
622 | 2,112,86,42,160,38.4,0.246,28,0
623 | 2,92,76,20,0,24.2,1.698,28,0
624 | 6,183,94,0,0,40.8,1.461,45,0
625 | 0,94,70,27,115,43.5,0.347,21,0
626 | 2,108,64,0,0,30.8,0.158,21,0
627 | 4,90,88,47,54,37.7,0.362,29,0
628 | 0,125,68,0,0,24.7,0.206,21,0
629 | 0,132,78,0,0,32.4,0.393,21,0
630 | 5,128,80,0,0,34.6,0.144,45,0
631 | 4,94,65,22,0,24.7,0.148,21,0
632 | 7,114,64,0,0,27.4,0.732,34,1
633 | 0,102,78,40,90,34.5,0.238,24,0
634 | 2,111,60,0,0,26.2,0.343,23,0
635 | 1,128,82,17,183,27.5,0.115,22,0
636 | 10,92,62,0,0,25.9,0.167,31,0
637 | 13,104,72,0,0,31.2,0.465,38,1
638 | 5,104,74,0,0,28.8,0.153,48,0
639 | 2,94,76,18,66,31.6,0.649,23,0
640 | 7,97,76,32,91,40.9,0.871,32,1
641 | 1,100,74,12,46,19.5,0.149,28,0
642 | 0,102,86,17,105,29.3,0.695,27,0
643 | 4,128,70,0,0,34.3,0.303,24,0
644 | 6,147,80,0,0,29.5,0.178,50,1
645 | 4,90,0,0,0,28,0.61,31,0
646 | 3,103,72,30,152,27.6,0.73,27,0
647 | 2,157,74,35,440,39.4,0.134,30,0
648 | 1,167,74,17,144,23.4,0.447,33,1
649 | 0,179,50,36,159,37.8,0.455,22,1
650 | 11,136,84,35,130,28.3,0.26,42,1
651 | 0,107,60,25,0,26.4,0.133,23,0
652 | 1,91,54,25,100,25.2,0.234,23,0
653 | 1,117,60,23,106,33.8,0.466,27,0
654 | 5,123,74,40,77,34.1,0.269,28,0
655 | 2,120,54,0,0,26.8,0.455,27,0
656 | 1,106,70,28,135,34.2,0.142,22,0
657 | 2,155,52,27,540,38.7,0.24,25,1
658 | 2,101,58,35,90,21.8,0.155,22,0
659 | 1,120,80,48,200,38.9,1.162,41,0
660 | 11,127,106,0,0,39,0.19,51,0
661 | 3,80,82,31,70,34.2,1.292,27,1
662 | 10,162,84,0,0,27.7,0.182,54,0
663 | 1,199,76,43,0,42.9,1.394,22,1
664 | 8,167,106,46,231,37.6,0.165,43,1
665 | 9,145,80,46,130,37.9,0.637,40,1
666 | 6,115,60,39,0,33.7,0.245,40,1
667 | 1,112,80,45,132,34.8,0.217,24,0
668 | 4,145,82,18,0,32.5,0.235,70,1
669 | 10,111,70,27,0,27.5,0.141,40,1
670 | 6,98,58,33,190,34,0.43,43,0
671 | 9,154,78,30,100,30.9,0.164,45,0
672 | 6,165,68,26,168,33.6,0.631,49,0
673 | 1,99,58,10,0,25.4,0.551,21,0
674 | 10,68,106,23,49,35.5,0.285,47,0
675 | 3,123,100,35,240,57.3,0.88,22,0
676 | 8,91,82,0,0,35.6,0.587,68,0
677 | 6,195,70,0,0,30.9,0.328,31,1
678 | 9,156,86,0,0,24.8,0.23,53,1
679 | 0,93,60,0,0,35.3,0.263,25,0
680 | 3,121,52,0,0,36,0.127,25,1
681 | 2,101,58,17,265,24.2,0.614,23,0
682 | 2,56,56,28,45,24.2,0.332,22,0
683 | 0,162,76,36,0,49.6,0.364,26,1
684 | 0,95,64,39,105,44.6,0.366,22,0
685 | 4,125,80,0,0,32.3,0.536,27,1
686 | 5,136,82,0,0,0,0.64,69,0
687 | 2,129,74,26,205,33.2,0.591,25,0
688 | 3,130,64,0,0,23.1,0.314,22,0
689 | 1,107,50,19,0,28.3,0.181,29,0
690 | 1,140,74,26,180,24.1,0.828,23,0
691 | 1,144,82,46,180,46.1,0.335,46,1
692 | 8,107,80,0,0,24.6,0.856,34,0
693 | 13,158,114,0,0,42.3,0.257,44,1
694 | 2,121,70,32,95,39.1,0.886,23,0
695 | 7,129,68,49,125,38.5,0.439,43,1
696 | 2,90,60,0,0,23.5,0.191,25,0
697 | 7,142,90,24,480,30.4,0.128,43,1
698 | 3,169,74,19,125,29.9,0.268,31,1
699 | 0,99,0,0,0,25,0.253,22,0
700 | 4,127,88,11,155,34.5,0.598,28,0
701 | 4,118,70,0,0,44.5,0.904,26,0
702 | 2,122,76,27,200,35.9,0.483,26,0
703 | 6,125,78,31,0,27.6,0.565,49,1
704 | 1,168,88,29,0,35,0.905,52,1
705 | 2,129,0,0,0,38.5,0.304,41,0
706 | 4,110,76,20,100,28.4,0.118,27,0
707 | 6,80,80,36,0,39.8,0.177,28,0
708 | 10,115,0,0,0,0,0.261,30,1
709 | 2,127,46,21,335,34.4,0.176,22,0
710 | 9,164,78,0,0,32.8,0.148,45,1
711 | 2,93,64,32,160,38,0.674,23,1
712 | 3,158,64,13,387,31.2,0.295,24,0
713 | 5,126,78,27,22,29.6,0.439,40,0
714 | 10,129,62,36,0,41.2,0.441,38,1
715 | 0,134,58,20,291,26.4,0.352,21,0
716 | 3,102,74,0,0,29.5,0.121,32,0
717 | 7,187,50,33,392,33.9,0.826,34,1
718 | 3,173,78,39,185,33.8,0.97,31,1
719 | 10,94,72,18,0,23.1,0.595,56,0
720 | 1,108,60,46,178,35.5,0.415,24,0
721 | 5,97,76,27,0,35.6,0.378,52,1
722 | 4,83,86,19,0,29.3,0.317,34,0
723 | 1,114,66,36,200,38.1,0.289,21,0
724 | 1,149,68,29,127,29.3,0.349,42,1
725 | 5,117,86,30,105,39.1,0.251,42,0
726 | 1,111,94,0,0,32.8,0.265,45,0
727 | 4,112,78,40,0,39.4,0.236,38,0
728 | 1,116,78,29,180,36.1,0.496,25,0
729 | 0,141,84,26,0,32.4,0.433,22,0
730 | 2,175,88,0,0,22.9,0.326,22,0
731 | 2,92,52,0,0,30.1,0.141,22,0
732 | 3,130,78,23,79,28.4,0.323,34,1
733 | 8,120,86,0,0,28.4,0.259,22,1
734 | 2,174,88,37,120,44.5,0.646,24,1
735 | 2,106,56,27,165,29,0.426,22,0
736 | 2,105,75,0,0,23.3,0.56,53,0
737 | 4,95,60,32,0,35.4,0.284,28,0
738 | 0,126,86,27,120,27.4,0.515,21,0
739 | 8,65,72,23,0,32,0.6,42,0
740 | 2,99,60,17,160,36.6,0.453,21,0
741 | 1,102,74,0,0,39.5,0.293,42,1
742 | 11,120,80,37,150,42.3,0.785,48,1
743 | 3,102,44,20,94,30.8,0.4,26,0
744 | 1,109,58,18,116,28.5,0.219,22,0
745 | 9,140,94,0,0,32.7,0.734,45,1
746 | 13,153,88,37,140,40.6,1.174,39,0
747 | 12,100,84,33,105,30,0.488,46,0
748 | 1,147,94,41,0,49.3,0.358,27,1
749 | 1,81,74,41,57,46.3,1.096,32,0
750 | 3,187,70,22,200,36.4,0.408,36,1
751 | 6,162,62,0,0,24.3,0.178,50,1
752 | 4,136,70,0,0,31.2,1.182,22,1
753 | 1,121,78,39,74,39,0.261,28,0
754 | 3,108,62,24,0,26,0.223,25,0
755 | 0,181,88,44,510,43.3,0.222,26,1
756 | 8,154,78,32,0,32.4,0.443,45,1
757 | 1,128,88,39,110,36.5,1.057,37,1
758 | 7,137,90,41,0,32,0.391,39,0
759 | 0,123,72,0,0,36.3,0.258,52,1
760 | 1,106,76,0,0,37.5,0.197,26,0
761 | 6,190,92,0,0,35.5,0.278,66,1
762 | 2,88,58,26,16,28.4,0.766,22,0
763 | 9,170,74,31,0,44,0.403,43,1
764 | 9,89,62,0,0,22.5,0.142,33,0
765 | 10,101,76,48,180,32.9,0.171,63,0
766 | 2,122,70,27,0,36.8,0.34,27,0
767 | 5,121,72,23,112,26.2,0.245,30,0
768 | 1,126,60,0,0,30.1,0.349,47,1
769 | 1,93,70,31,0,30.4,0.315,23,0
--------------------------------------------------------------------------------
/Previsão de Ocorrência de Diabetes/README.md:
--------------------------------------------------------------------------------
1 | [](https://www.linkedin.com/in/luislauriano/) [](https://www.python.org/downloads/release/python-365/) [](http://perso.crans.org/besson/LICENSE.html) [](https://github.com/luislauriano/data_science)
2 |
3 | # Building a Predictive Model for the Occurrence of Diabetes
4 |
5 | The dataset was taken from the UCI Machine Learning Repository and can also be found on Kaggle. It originally comes from the National Institute of Diabetes and Digestive and Kidney Diseases.
6 | The goal of this short project is to use this dataset to predict whether or not a pregnant patient has diabetes, based on a set of diagnostic measurements.
7 |
8 |
9 |
10 |
11 |
12 | The dataset, made up specifically of women at least 21 years old, contains several medical predictor variables and one target variable, the outcome. The predictor variables, or input data, include the number of pregnancies the patient has had, her BMI, insulin level, age, and so on.
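
As an illustration of the split between predictor variables and the target described above, here is a minimal sketch using only the standard library. The column names are assumptions (the header row of `Dados.csv` is not reproduced here, so the names follow the usual layout of this dataset), and the four sample rows are copied from `Dados.csv` so the snippet runs without the file.

```python
import csv
import io

# Assumed column names; the last column is treated as the target (outcome).
COLUMNS = [
    "pregnancies", "glucose", "blood_pressure", "skin_thickness",
    "insulin", "bmi", "pedigree", "age", "outcome",
]

# A few rows copied from Dados.csv so the sketch is self-contained.
SAMPLE = """\
0,107,76,0,0,45.3,0.686,24,0
1,86,66,52,65,41.3,0.917,29,0
4,132,0,0,0,32.9,0.302,23,1
3,129,92,49,155,36.4,0.968,32,1
"""


def load_rows(text):
    """Parse header-less CSV text into a list of {column: float} dicts."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(COLUMNS, map(float, row))) for row in reader if row]


def split_features_target(rows, target="outcome"):
    """Separate the predictor variables (X) from the target variable (y)."""
    features = [c for c in COLUMNS if c != target]
    X = [[row[c] for c in features] for row in rows]
    y = [int(row[target]) for row in rows]
    return X, y


rows = load_rows(SAMPLE)
X, y = split_features_target(rows)
```

Each entry of `X` holds the eight input measurements for one patient, and `y` holds the matching 0/1 outcome that a classifier would be trained to predict.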
13 |
14 |
15 |
16 |
17 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | [](https://www.linkedin.com/in/luislauriano/) [](https://www.python.org/downloads/release/python-365/) [](http://perso.crans.org/besson/LICENSE.html) [](https://github.com/luislauriano/data_science)
2 |
3 |
4 |
5 |
6 |
7 | # Luis Vinicius
8 |
9 | [Linkedin](https://www.linkedin.com/in/luislauriano)
10 |
11 |
12 | ## Projects:
13 |
14 | Here you can find the notebooks of my projects in the area of Data Science and Machine Learning.
15 |
16 | * **[Machine learning project to predict possible outcomes of 2022 world cup matches:](https://github.com/luislauriano/Data_Science/tree/master/Possiveis%20resultados%20copa%20do%20mundo%202022)**
17 | A project driven by curiosity and the study of machine learning, aimed at developing a model capable of predicting possible outcomes of the 2022 World Cup matches, all the way to the winner of the championship.
18 |
19 |
20 | * **[League of Legends and Data Science – Predicting match results:](https://github.com/luislauriano/Data_Science/tree/master/Prevendo%20Partidas%20de%20League%20of%20Legends)**
21 | This end-to-end Machine Learning project went from collecting match data to building a machine learning model that predicts the chances of the team playing on the blue side of the map winning. The steps included pre-processing and data analysis, dimensionality reduction and variable selection, and the construction of both a final model with XGBClassifier and a logistic regression model based on the results obtained from AutoML with PyCaret.
22 |
23 | * **[Spotify & Python and Data Science – Data Analysis of Artist NexoAnexo Albums:](https://github.com/luislauriano/Data_Science/tree/master/Spotify%20%26%20Python%20%26%20Data%20Science)**
24 | The objective of the project was to analyze data from the Spotify albums of the artist NexoAnexo, going through the main steps of a data analysis: data collection, pre-processing, exploration, and visualization. After the analysis was completed, an application/dashboard was built with Python and made available on the web through Heroku. Another objective was to use the conclusions drawn from the albums' track data to identify factors that help an album or song be more successful, and how these could be used in future releases.
25 |
26 | **[Repository/Application source code](https://github.com/luislauriano/Aplicacao_NexoAnexo)**
27 |
28 | **[Project Application/Dashboard](http://analise-nexoanexo.herokuapp.com/)**
29 |
30 | After I publicized the project and the application, developed with Streamlit, on LinkedIn, Ted Ricks (Product Marketing at Streamlit) found my
31 | application and, in his words, said "Really enjoyed your app- wanted to let you know it was included in this week's Weekly Roundup on our community
32 | forum Streamlit". The application was therefore included in the Streamlit community's
33 | **[weekly summary of 29/11/2020](https://discuss.streamlit.io/t/weekly-roundup-agraph-components-streambackmachines-text-generation-tutorials-and-more/7640)**, in the apps of the week topic.
34 |
35 |
36 | * **[What they didn't tell you about the coronavirus: An analysis of covid-19 data:](https://github.com/luislauriano/Data_Science/tree/master/Covid-19)**
37 | During March, in China, the number of recovered cases was already greater than the number of confirmed cases; however, countries such as the United States and South Korea still had more deaths than recovered cases, and for countries like Canada and Brazil the outbreak was still very new. In this brief analysis of covid-19 data from 01/22 to 03/09, I was able to identify and flag the rising number of deaths in countries like the United States, even before the peak of the virus.
38 |
39 |
40 | * **[Predictive model for the occurrence of diabetes:](https://github.com/luislauriano/Data_Science/tree/master/Previs%C3%A3o%20de%20Ocorr%C3%AAncia%20de%20Diabetes)**
41 | Based on the dataset of the National Institute of Diabetes and Digestive and Kidney Diseases, a simple model was built to predict whether or not a pregnant patient has diabetes, using the diagnostic measurements included in the dataset.
42 |
43 | * **[Manipulating and processing data to generate indicators for the company:](https://github.com/luislauriano/Data_Science/tree/master/Gera%C3%A7%C3%A3o%20de%20Indicadores)**
44 | The objective of the project was to process and manipulate data from XLSX and CSV files from the company Bemol in order to generate a single XLSX dataset that serves as a report and set of indicators for the company.
45 |
46 | * **[Airbnb data analysis for the city of Rio de Janeiro:](https://github.com/luislauriano/Data_Science/tree/master/Airbnb%20Rio%20de%20Janeiro)**
47 | Recife is one of the cities that attract the most tourists during Carnival in Brazil. In 2020, however, Rio de Janeiro was among the three most sought-after cities for Carnival, with 2 million tourists expected for the carnival marathon and a corresponding growth in the hotel sector. When we travel, we always think about the best hotel, the best location, and the best value for money. With that in mind, an exploratory analysis was carried out on data from one of today's largest hospitality companies, Airbnb, using the dataset provided by the company itself.
48 |
49 | * **[Machine Learning for breast cancer detection:](https://github.com/luislauriano/Data_Science/tree/master/Detec%C3%A7%C3%A3o%20do%20C%C3%A2ncer%20de%20Mama)**
50 | In this project, a simple Machine Learning model was built to detect the presence of breast cancer.
51 |
52 | * **[Exploratory Data Analysis with Streamlit:](https://gentle-bayou-96352.herokuapp.com/)**
53 | This application was built with Streamlit, a Python framework for creating applications and dashboards. It performs an initial exploratory analysis of the data through statistical methods and data visualization. I also took the opportunity to include some statistical explanations in the application.
54 |
55 |
56 | ---
57 |
58 |
59 |
60 | > Made with 💖 by Luis Vinicius
61 |
--------------------------------------------------------------------------------
/Spotify & Python & Data Science/Dados.csv:
--------------------------------------------------------------------------------
1 | album,album_type,track_number,id_track,name,popularity,explicit,duration_ms,release_date,artists
2 | Trap from Future,album,1,6Mx214YlNnbj7un9PrvmGi,Vem No Tum Tum,17,True,216917,2020-03-13,['NexoAnexo']
3 | Trap from Future,album,2,1xOs8YJkkKm5HhRiP1ha4g,Drip Know Me,3,True,154646,2020-03-13,"['NexoAnexo', 'Drippy Kid Jay']"
4 | Trap from Future,album,3,4s4G0Z6mSkrauIvBXRlVHk,Match,4,True,180716,2020-03-13,['NexoAnexo']
5 | Trap from Future,album,4,5C1iQCV3xuc0oHQpe7fhhQ,Trap & Brega,6,True,164468,2020-03-13,['NexoAnexo']
6 | Trap from Future,album,5,3H2TfCpxaSfEh5nVbcDzrp,Marca Sem Roupa,6,True,148364,2020-03-13,"['NexoAnexo', 'Victor KR']"
7 | Trap from Future,album,6,0obGglfLwWBXRa8chTF2wW,Wow!,3,True,145675,2020-03-13,['NexoAnexo']
8 | Trap from Future,album,7,1eFva22COwaEByjSnoOSjx,Não Posso Morrer Novo,3,False,150116,2020-03-13,['NexoAnexo']
9 | Trap from Future,album,8,3yFbmcmxdQ9ZW5KL4BWNTM,Fuck Cópias,2,True,140880,2020-03-13,"['NexoAnexo', 'Vitorm']"
10 | Trap from Future,album,9,5bx1duyHkWh1pT9hfMNz97,Passa Nada,3,True,135745,2020-03-13,['NexoAnexo']
11 | Trap from Future,album,10,27wJhLhJ5DitqVNF0aa3Y0,Novo Rock,1,True,130971,2020-03-13,['NexoAnexo']
12 | Trap from Future,album,11,44Q3DVLkHklbzBSw4lbgyw,Como Tem Que Ser,3,True,169166,2020-03-13,"['NexoAnexo', 'Rudah', 'Wavee']"
13 | Trap from Future,album,12,1Azgt4uiFUO8oKAu428DPv,Grife,3,True,140925,2020-03-13,"['NexoAnexo', 'Vhulto']"
14 | Trap from Future,album,13,395I6lTZiL99xnHniKZ5PA,Fogo No Incenso,5,True,260158,2020-03-13,"['NexoAnexo', 'Léo da Bodega']"
15 | Trap from Future,album,14,5aTZgS07v6N3ad4kykhDmU,Zombieland,1,True,190357,2020-03-13,['NexoAnexo']
16 | Trap from Future,album,15,6995EcL1GtkmeDXf69pzdJ,Baila Comigo?,2,True,192530,2020-03-13,"['NexoAnexo', 'Prodbygrillo']"
17 | Trap de Cria Mixtape,album,1,1T1oT4ia9vgoUIHczIPhsk,Trap de Cria,9,True,158634,2019-07-29,"['NexoAnexo', 'TheKickBoy']"
18 | Trap de Cria Mixtape,album,2,1cwKuW0dBzneRLyKerSbhd,A Lista das Bandidas,2,True,192697,2019-07-29,['NexoAnexo']
19 | Trap de Cria Mixtape,album,3,6vkMeNQdKrYDGEF3JU2sRZ,A3,1,True,140732,2019-07-29,['NexoAnexo']
20 | Trap de Cria Mixtape,album,4,6Eo9qteRJP9Yb4AP8z225m,E Ai Fake,1,True,167852,2019-07-29,['NexoAnexo']
21 | Trap de Cria Mixtape,album,5,0vHXBOz8viB0SmQRp8Pvcs,Video Call,1,True,192468,2019-07-29,"['NexoAnexo', 'Joma', 'GG']"
22 | Trap de Cria Mixtape,album,6,1odUH2Dq97dJiqEqREdcZx,Aliviando o Stress,1,True,144564,2019-07-29,"['NexoAnexo', 'Young Torvi']"
23 | Trap de Cria Mixtape,album,7,381YrLPNZzDeWk24JRGrvX,Hino dos Irmãos,1,True,126168,2019-07-29,['NexoAnexo']
24 | Trap de Cria Mixtape,album,8,0RcjaQmdPEkwl2rSCX5saG,Bebê da Cara de Mal,1,True,174180,2019-07-29,"['NexoAnexo', 'Jovem Zine']"
25 | Trap de Cria Mixtape,album,9,6qUFQJgAhxB5otxDgYUQdH,Isso Que É Foda,4,True,135086,2019-07-29,['NexoAnexo']
26 | Trap de Cria Mixtape,album,10,3SK7kymFPiQ4pkr3HX3jIx,Pjl,0,True,211323,2019-07-29,['NexoAnexo']
27 | Real Plug Mixtape,album,1,6sCyj7quVzcGbs9kAxrrzV,The Train,0,True,169865,2018-05-15,"['NexoAnexo', 'GG', 'Young Zine']"
28 | Real Plug Mixtape,album,2,1rJOxl2UFAVnbhCWfKZT2b,Fica Bem,0,True,130134,2018-05-15,"['NexoAnexo', 'Guimael']"
29 | Real Plug Mixtape,album,3,0o8xKwkHMiTqboFWj7MdsF,Gospel,0,True,122278,2018-05-15,"['NexoAnexo', 'TheKickBoy']"
30 | Real Plug Mixtape,album,4,6JM9oQxdsP9ZBr7MD0bjzf,Nxanx,0,True,167355,2018-05-15,"['NexoAnexo', 'Guimael']"
31 | Real Plug Mixtape,album,5,2JlPpGTj9MuKjLRD27DXwH,Tudo,0,True,161403,2018-05-15,"['NexoAnexo', 'TheKickBoy']"
32 | Real Plug Mixtape,album,6,1dHrNfJgczkxDCOBtiGJkv,A Lei,0,True,134274,2018-05-15,"['NexoAnexo', 'LC']"
33 | Real Plug Mixtape,album,7,4rDhvolZebPhxzs8wPq802,Dreams Cup,0,True,230822,2018-05-15,"['NexoAnexo', '$ketchBoyBeat$']"
34 | Real Plug Mixtape,album,8,21hFSC02UuBc4NfHv10J8O,2020,4,True,154298,2018-05-15,['NexoAnexo']
35 |
--------------------------------------------------------------------------------
/Spotify & Python & Data Science/README.md:
--------------------------------------------------------------------------------
1 | [](https://www.linkedin.com/in/luislauriano/) [](https://www.python.org/downloads/release/python-365/) [](http://perso.crans.org/besson/LICENSE.html) [](https://github.com/luislauriano/data_science)
2 |
3 | # Spotify & Python & Data Science - Data Analysis of the Artist NexoAnexo's Albums
4 |
5 | What happens when you combine Spotify & Python & Data Science?
6 | In this project I set out to combine these three fields by collecting data on the artist NexoAnexo's Spotify albums with Python, through the SpotiPy library, and then carrying out the steps of a data analysis: pre-processing, exploration, and data visualization.
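
A minimal sketch of the pre-processing and exploration steps, using only the standard library. The rows are copied from `Dados.csv`; the helper names `load_tracks` and `mean_popularity_by_album` are illustrative and not taken from the notebook. Note that the `artists` column stores a Python-style list as text, so it is parsed with `ast.literal_eval`.

```python
import ast
import csv
import io
from collections import defaultdict

# Header and a few rows copied from Dados.csv so the sketch is self-contained.
SAMPLE = """album,album_type,track_number,id_track,name,popularity,explicit,duration_ms,release_date,artists
Trap from Future,album,1,6Mx214YlNnbj7un9PrvmGi,Vem No Tum Tum,17,True,216917,2020-03-13,['NexoAnexo']
Trap from Future,album,2,1xOs8YJkkKm5HhRiP1ha4g,Drip Know Me,3,True,154646,2020-03-13,"['NexoAnexo', 'Drippy Kid Jay']"
Trap de Cria Mixtape,album,1,1T1oT4ia9vgoUIHczIPhsk,Trap de Cria,9,True,158634,2019-07-29,"['NexoAnexo', 'TheKickBoy']"
Real Plug Mixtape,album,8,21hFSC02UuBc4NfHv10J8O,2020,4,True,154298,2018-05-15,['NexoAnexo']
"""


def load_tracks(text):
    """Pre-processing: convert the text fields into proper Python types."""
    tracks = []
    for row in csv.DictReader(io.StringIO(text)):
        row["popularity"] = int(row["popularity"])
        row["duration_ms"] = int(row["duration_ms"])
        row["explicit"] = row["explicit"] == "True"
        # The artists field is a list literal stored as a string.
        row["artists"] = ast.literal_eval(row["artists"])
        tracks.append(row)
    return tracks


def mean_popularity_by_album(tracks):
    """Exploration: average track popularity for each album."""
    by_album = defaultdict(list)
    for track in tracks:
        by_album[track["album"]].append(track["popularity"])
    return {album: sum(v) / len(v) for album, v in by_album.items()}


tracks = load_tracks(SAMPLE)
popularity = mean_popularity_by_album(tracks)
```

With the full file in place of `SAMPLE`, the same two helpers would give a per-album popularity summary that can feed a chart or dashboard.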
7 |
8 | Besides being a great friend, NexoAnexo is an artist from Recife and a producer from Pernambuco who helps spread trap music in Brazil. I take this opportunity to invite you all to get to know his work in music.
9 |
10 |
11 |
12 |
13 |
14 | ---
15 |
16 | After the analysis project was finished, I built an application/dashboard for it with Python's Streamlit framework and finished by making the application available on the web through Heroku.
17 |
18 | After I publicized the Streamlit application on LinkedIn, Ted Ricks (Product Marketing at Streamlit) found my post and the application and sent me a message saying that it had been included in the Streamlit community's **[Weekly Roundup](https://discuss.streamlit.io/t/weekly-roundup-agraph-components-streambackmachines-text-generation-tutorials-and-more/7640)** for the week of 29/11/2020, in the apps of the week topic.
19 |
20 | * **[Repository/Application source code](https://github.com/luislauriano/Aplicacao_NexoAnexo)**
21 |
22 | * **[Project application/dashboard](https://luislauriano-aplicacao-nexoanexo-app-tg0gwa.streamlit.app/)**
23 |
24 |
25 |
26 |
27 |
28 | ---
29 |
30 | At Python Brasil 2020 I gave a talk about this project and how you can build something similar. To watch it, open my [talk](https://bit.ly/362kOjK) at the 1h 53min mark.
31 |
32 |
33 |
34 |
35 |
36 |
37 |
--------------------------------------------------------------------------------
/Validação modelo da copa do mundo 2022/Jogos da fase de grupos/README.md:
--------------------------------------------------------------------------------
1 |
2 | [](https://www.linkedin.com/in/luislauriano/) [](https://www.python.org/downloads/release/python-365/) [](http://perso.crans.org/besson/LICENSE.html) [](https://github.com/luislauriano/data_science)
3 |
4 |
5 | # Record of possible results for all group stage matches based on the model
6 |
7 | In this spreadsheet I will record the possible results of the group stage matches based on the model and compare them with the official results to obtain the percentage of hits and misses.
8 |
9 | [Spreadsheet with all matches recorded and validated](https://docs.google.com/spreadsheets/d/1dS63KIzL1290yHe5re0eEVEL2TIT8ICD/edit#gid=1242548967)
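
The percentage of hits and misses can be computed along these lines. This is only a sketch: the `hit_rate` helper and the sample labels are hypothetical, and the real comparison lives in the spreadsheet.

```python
def hit_rate(predicted, official):
    """Return the percentage of matches where the prediction equals the official result."""
    if len(predicted) != len(official):
        raise ValueError("predicted and official must have the same length")
    if not predicted:
        return 0.0
    hits = sum(p == o for p, o in zip(predicted, official))
    return 100.0 * hits / len(predicted)


# Illustrative labels only: one entry per match, from the first team's point of view.
predicted = ["win", "draw", "loss", "win"]
official = ["win", "loss", "loss", "win"]

pct_hits = hit_rate(predicted, official)
pct_misses = 100.0 - pct_hits
```

Here three of the four illustrative predictions match, so `pct_hits` is 75.0 and `pct_misses` is 25.0.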
10 |
11 |
12 |
13 | # Possible final group stage results based on the model
14 |
15 | ## Based on Test 01 of the project (FIFA international match data from 2010 to 2022 and FIFA ranking data from 1992 to 2021)
16 |
17 | ### One of the possible outcomes
18 |
19 | * **Group A**
20 |
21 |
22 |
23 |
24 |
25 | * **Group B**
26 |
27 |
28 |
29 |
30 |
31 |
32 | * **Group C**
33 |
34 |
35 |
36 |
37 |
38 | * **Group D**
39 |
40 |
41 |
42 |
43 |
44 | * **Group E**
45 |
46 |
47 |
48 |
49 |
50 |
51 | * **Group F**
52 |
53 |
54 |
55 |
56 |
57 |
58 | * **Group G**
59 |
60 |
61 |
62 |
63 |
64 |
65 | * **Group H**
66 |
67 |
68 |
69 |
70 |
71 |
72 | ## Based on Test 02 of the project (FIFA international match data from 1993 to 2022 and FIFA ranking data from 1992 to 2021)
73 |
74 | ### One of the possible outcomes
75 |
76 | * **Group A**
77 |
78 |
79 |
80 |
81 |
82 | * **Group B**
83 |
84 |
85 |
86 |
87 |
88 |
89 | * **Group C**
90 |
91 |
92 |
93 |
94 |
95 | * **Group D**
96 |
97 |
98 |
99 |
100 |
101 | * **Group E**
102 |
103 |
104 |
105 |
106 |
107 |
108 | * **Group F**
109 |
110 |
111 |
112 |
113 |
114 |
115 | * **Group G**
116 |
117 |
118 |
119 |
120 |
121 |
122 | * **Group H**
123 |
124 |
125 |
126 |
127 |
128 |
129 |
130 |
131 | # Official result
132 |
133 |
134 | * **Group A**
135 |
136 |
137 | | Team | Points | Diff | Goals |
138 | | --- | --- | --- | --- |
139 | | Netherlands | x | x | x |
140 | | Ecuador | x | x | x |
141 | | Senegal | x | x | x |
142 | | Qatar | x | x | x |
143 |
144 |
145 | * **Group B**
146 |
147 |
--------------------------------------------------------------------------------
/Validação modelo da copa do mundo 2022/Jogos da fase mata-mata/README.md:
--------------------------------------------------------------------------------
1 | ## Possible final results from the model
2 |
3 | Keep in mind that each match result is obtained by choosing the most likely outcome, given the probability of each possible outcome (win, draw, and loss), which is calculated in a function based on all the probabilities of possible results. For this reason, the match result may turn out different if the test is run again. To make this clearer, picture the scene with Doctor Strange in the film Infinity War, where he finds a single positive outcome in which they win the war among all the possible final outcomes that the war against Thanos could have.
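
A minimal sketch of why a prediction can change between runs, assuming the outcome is drawn at random according to the predicted probabilities. The project's actual function is not shown here, so `most_likely`, `sample_outcome`, and the probability values are hypothetical. Picking the highest probability is deterministic, while a weighted draw is not.

```python
import random

OUTCOMES = ("win", "draw", "loss")


def most_likely(probs):
    """Deterministic pick: the outcome with the highest probability."""
    return max(OUTCOMES, key=lambda outcome: probs[outcome])


def sample_outcome(probs, rng):
    """Weighted random pick: the result may differ from one run to the next."""
    weights = [probs[outcome] for outcome in OUTCOMES]
    return rng.choices(OUTCOMES, weights=weights, k=1)[0]


# Illustrative probabilities for one match (not produced by the model).
probs = {"win": 0.45, "draw": 0.30, "loss": 0.25}

rng = random.Random(0)  # fixed seed so this sketch is reproducible
draws = [sample_outcome(probs, rng) for _ in range(1000)]
favourite = most_likely(probs)
```

Over many draws the favourite comes up most often, but any single draw can still be a draw or a loss, which is one way repeated tests could produce different brackets.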
4 |
5 | Another note: the predicted score of a match may often repeat as 1x0, perhaps because the model is not precise enough to quantify larger goal margins well. What matters is to understand and take into account which team the model predicts as the likely winner of the match.
6 |
7 | ### Test 01
8 | This is the project's main bet for the final result of the knockout stage matches of the 2022 World Cup, following the possible results given by the model and using FIFA international match data from 2010 to 2022 and FIFA ranking data from 1992 to 2021, in an attempt to find better performance for the model.
9 |
10 |
11 |
12 |
13 | ### Test 02
14 | For Test 02, the idea was to follow the possible results given by the model, but using FIFA international match data from 1993 to 2022 (unlike Test 01) and FIFA ranking data from 1992 to 2021, in an attempt to find better performance for the model.
15 |
16 |
17 |
18 |
19 |
--------------------------------------------------------------------------------
/Validação modelo da copa do mundo 2022/README.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------