├── Documentos teóricos
│   ├── Evaluación del modelo de ML.pdf
│   ├── Metricas de clasificacion.png
│   ├── Métricas de Evaluación para modelos de regresión.pdf
│   ├── Preguntas
│   ├── README.md
│   ├── Sesgo y Varianza.pdf
│   └── resumen ML.pdf
├── Ejemplo con KMeans - dataset Penguins.ipynb
├── Feature scaling.ipynb
├── Feature selection.-.ipynb
├── Funcion_Aplicar_varios_algoritmos.ipynb
├── Principal Component Analysis-.ipynb
├── Proyecto final Data Science - Coderhouse
│   ├── Predicción de Diabetes
│   │   ├── Deteccion_diabetes_ProyectoFinal_Camandone_Belén .ipynb
│   │   └── README.md
│   └── README.md
├── Proyecto-Dataset-Iris
│   ├── Ejercicio Iris - ML clasificación (2).ipynb
│   └── README.md
├── Proyecto_calorias
│   ├── ML_calorías quemadas por minutos corriendo.ipynb
│   ├── README.md
│   ├── calories_time.csv
│   └── calories_time_weight_speed (1).csv
├── Proyecto_salarios
│   ├── ML_regresión_lineal_salarios (1).ipynb
│   ├── README.md
│   └── data_rrhh.csv
├── README.md
├── Recommender System Final Project
│   ├── Final task - Recommender System - Camandone Belén Part 2.ipynb
│   ├── Final_task_Recommender_System_Camandone Belén Part 1.ipynb
│   ├── ml-capstone-Camandone Belén.pptx.pdf
│   └── readme.md
├── SMOTE.ipynb
├── Scikit-plot..ipynb
├── Trampa dummy-.ipynb
├── Transformaciones de Variables Categóricas.ipynb
├── Técnicas_de_selección_de_datos_.ipynb
├── Unsupervised Machine Learning Proyecto Final
│   ├── Proyecto Final Aprendizaje No Supervisado Camandone Belén - 2024.ipynb
│   └── readme.md
└── _Librería Lazypredict.ipynb
/Documentos teóricos/Evaluación del modelo de ML.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bcamandone/Machine-Learning/a185c926938671814f5722df26f82e6238897902/Documentos teóricos/Evaluación del modelo de ML.pdf
--------------------------------------------------------------------------------
/Documentos teóricos/Metricas de clasificacion.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bcamandone/Machine-Learning/a185c926938671814f5722df26f82e6238897902/Documentos teóricos/Metricas de clasificacion.png
--------------------------------------------------------------------------------
/Documentos teóricos/Métricas de Evaluación para modelos de regresión.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bcamandone/Machine-Learning/a185c926938671814f5722df26f82e6238897902/Documentos teóricos/Métricas de Evaluación para modelos de regresión.pdf
--------------------------------------------------------------------------------
/Documentos teóricos/Preguntas:
--------------------------------------------------------------------------------
1 |
2 | 1) What is the difference between supervised and unsupervised learning in machine learning?
3 |
4 | Supervised learning requires labeled data for training. For example, to train a classifier, the data must first be labeled.
5 | Unsupervised learning, in contrast, does not require the input data to be explicitly labeled.
6 |
7 | 2) Explain the bias-variance trade-off.
8 |
9 | It refers to the balance between a model's ability to fit the training data well (low bias) and its ability to generalize to new data (low variance).
10 | A model with high bias is typically a model with low complexity (it is too simple). As complexity increases, bias decreases and variance tends to increase.
11 | High bias: a model with high bias tends to underfit the data, meaning it fails to capture the underlying patterns.
12 | High variance: a model with high variance is too complex and fits the noise in the data, which leads to poor generalization.
13 | Finding the right balance between bias and variance is crucial for building models that perform well on both training and test data.
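
The trade-off described above can be sketched with a small experiment. This is only an illustration under stated assumptions: the synthetic sine data, the noise level, and the polynomial degrees (1, 3, 10) are all made up for the example and do not come from the repository.

```python
import numpy as np

# Assumption: synthetic data, y = sin(2*pi*x) + Gaussian noise.
rng = np.random.default_rng(0)

def make_data(n_samples):
    x = rng.uniform(0.0, 1.0, n_samples)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n_samples)
    return x, y

x_train, y_train = make_data(30)
x_test, y_test = make_data(200)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

errors = {}
for degree in (1, 3, 10):
    # Higher degree = more complex model = lower bias, higher variance.
    coeffs = np.polyfit(x_train, y_train, degree)
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree={degree:2d}  train MSE={errors[degree][0]:.4f}  test MSE={errors[degree][1]:.4f}")
```

The degree-1 model underfits (high error on both sets), and the training error keeps shrinking as the degree grows. The test error of the most complex model typically stops improving or gets worse, which is the high-variance side of the trade-off.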
14 |
15 | 3) What is overfitting and how can it be prevented?
16 |
17 | Overfitting is a phenomenon in which a model fits the training data too closely but performs poorly on new data, that is,
18 | data it has not seen during training.
19 |
20 | Ways to prevent it:
21 |
22 | - Simplify the model (for example, reduce its complexity).
23 | - Use regularization techniques.
24 | - Use cross-validation to evaluate the model's performance.
25 | - Feature selection: carefully selecting the features most relevant to the problem
26 | and removing those that may introduce noise into the model can help prevent overfitting.
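
As one concrete instance of the regularization item in the list above, here is a minimal sketch of L2 (ridge) regularization using its closed-form solution. The data, the `ridge_fit` helper, and the `alpha` values are hypothetical and chosen only for illustration; no intercept is fitted, to keep the example short.

```python
import numpy as np

# Assumption: synthetic data with few samples and many features,
# a setting where an unregularized linear model tends to overfit.
rng = np.random.default_rng(42)
n_samples, n_features = 20, 15
X = rng.normal(size=(n_samples, n_features))
true_w = np.zeros(n_features)
true_w[:3] = [2.0, -1.0, 0.5]  # only three features actually matter
y = X @ true_w + rng.normal(scale=0.5, size=n_samples)

def ridge_fit(X, y, alpha):
    """Closed-form ridge solution: w = (X^T X + alpha * I)^-1 X^T y."""
    identity = np.eye(X.shape[1])
    return np.linalg.solve(X.T @ X + alpha * identity, X.T @ y)

norms = {}
for alpha in (0.0, 1.0, 10.0):
    w = ridge_fit(X, y, alpha)
    norms[alpha] = float(np.linalg.norm(w))
    print(f"alpha={alpha:5.1f}  ||w||={norms[alpha]:.3f}")
```

A larger `alpha` shrinks the coefficient vector, which limits how closely the model can chase noise in the training data. In practice, scikit-learn's `Ridge` and `Lasso` estimators provide this kind of penalty.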
27 |
28 | 4) What is cross-validation and why is it important?
29 |
30 | Cross-validation is a technique used to evaluate a model's performance. It involves splitting the data into multiple subsets, called "folds", and then rotating through them so that each fold is used once for evaluation while the remaining folds are used for training.
31 | This technique is important because it provides a more robust estimate of a model's performance, which helps detect problems such as overfitting.
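
The fold rotation described above can be written out by hand. This is a sketch under assumptions: the helpers `k_fold_indices` and `cross_val_mse`, the synthetic regression data, and the plain least-squares model are all hypothetical and serve only to show the mechanics.

```python
import numpy as np

def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and split them into n_folds disjoint folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_folds)

def cross_val_mse(X, y, n_folds=5):
    """Train on n_folds - 1 folds, evaluate on the held-out fold, and rotate."""
    folds = k_fold_indices(len(X), n_folds)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Least-squares linear model with an intercept column.
        A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        w, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        scores.append(float(np.mean((A_test @ w - y[test_idx]) ** 2)))
    return scores

# Assumption: synthetic linear data with Gaussian noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.7]) + 3.0 + rng.normal(scale=0.3, size=100)

scores = cross_val_mse(X, y, n_folds=5)
print("MSE per fold:", [round(s, 4) for s in scores])
print("mean MSE:", round(float(np.mean(scores)), 4))
```

Every sample is used for evaluation exactly once, so the mean score depends less on one lucky or unlucky split. scikit-learn offers the same procedure through `KFold` and `cross_val_score`.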
32 |
33 |
34 | 5) Explain how the area under the curve (AUC) works.
35 |
36 | The area under the curve (AUC) is a numerical measure of the performance of a binary classification model, obtained as the area under its ROC curve.
37 | It ranges from 0 to 1, where a value of 1 indicates a perfect classifier and a value of 0.5 indicates performance similar to random guessing. The higher the AUC, the better the classifier's performance.
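
One way to see what the AUC measures is its ranking interpretation: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition. The `roc_auc` helper and the toy labels and scores are hypothetical, written only for illustration.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, t in zip(scores, y_true) if t == 1]
    negatives = [s for s, t in zip(scores, y_true) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(positives) * len(negatives))

# Toy example: three of the four positive/negative pairs are ranked correctly.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, scores))  # 0.75
```

The computation uses continuous scores such as predicted probabilities. Passing hard 0/1 predictions instead collapses the ROC curve to a single threshold, so the result no longer reflects the model's ranking quality across thresholds.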
38 |
39 |
--------------------------------------------------------------------------------
/Documentos teóricos/README.md:
--------------------------------------------------------------------------------
1 | # My summaries 📑
2 |
3 |
4 | ## What will you find in this folder?
5 |
6 | - ML summary
7 |
8 | - ML model evaluation
9 |
10 | - Evaluation metrics for classification models
11 |
12 | - Evaluation metrics for regression models
13 |
14 | - Theory questions
15 |
16 | - Bias and variance
17 |
--------------------------------------------------------------------------------
/Documentos teóricos/Sesgo y Varianza.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bcamandone/Machine-Learning/a185c926938671814f5722df26f82e6238897902/Documentos teóricos/Sesgo y Varianza.pdf
--------------------------------------------------------------------------------
/Documentos teóricos/resumen ML.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bcamandone/Machine-Learning/a185c926938671814f5722df26f82e6238897902/Documentos teóricos/resumen ML.pdf
--------------------------------------------------------------------------------
/Feature scaling.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "f9ea4a1d",
6 | "metadata": {},
7 | "source": [
8 | "# Feature scaling"
9 | ]
10 | },
11 | {
12 | "cell_type": "markdown",
13 | "id": "cda048f1",
14 | "metadata": {},
15 | "source": [
16 | "***Why is scaling useful?*** Some models are sensitive to feature scale, such as logistic regression, support vector machines (SVM), neural networks, and distance-based algorithms (such as KNN).\n",
17 | "\n",
18 | "If the features have different ranges, these models can give more weight to the features with larger values."
19 | ]
20 | },
21 | {
22 | "cell_type": "code",
23 | "execution_count": 6,
24 | "id": "b220a289",
25 | "metadata": {},
26 | "outputs": [],
27 | "source": [
28 | "from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler\n",
29 | "import numpy as np\n",
30 | "import pandas as pd\n",
31 | "from sklearn.datasets import load_iris\n",
32 | "from sklearn.model_selection import train_test_split"
33 | ]
34 | },
35 | {
36 | "cell_type": "code",
37 | "execution_count": 7,
38 | "id": "852d1e24",
39 | "metadata": {},
40 | "outputs": [],
41 | "source": [
42 | "# Load the iris dataset\n",
43 | "iris = load_iris()\n",
44 | "X = pd.DataFrame(iris.data, columns=iris.feature_names)\n",
45 | "y = iris.target\n",
46 | "\n",
47 | "# Split the dataset into training and test sets\n",
48 | "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)\n"
49 | ]
50 | },
51 | {
52 | "cell_type": "markdown",
53 | "id": "8f934e4f",
54 | "metadata": {},
55 | "source": [
56 | "## StandardScaler"
57 | ]
58 | },
59 | {
60 | "cell_type": "markdown",
61 | "id": "ddb7a526",
62 | "metadata": {},
63 | "source": [
64 | "### $\\frac{x_i - \\mu}{\\sigma}$ $\\mu$: mean, $\\sigma$: standard deviation"
65 | ]
66 | },
67 | {
68 | "cell_type": "markdown",
69 | "id": "baa1668b",
70 | "metadata": {},
71 | "source": [
72 | "The mean μ is subtracted from each value of feature X, and the result is divided by the standard deviation σ. \n",
73 | "\n",
74 | "It is useful when the data roughly follow a normal distribution.\n"
75 | ]
76 | },
77 | {
78 | "cell_type": "code",
79 | "execution_count": 8,
80 | "id": "d3f80fe2",
81 | "metadata": {},
82 | "outputs": [
83 | {
84 | "name": "stdout",
85 | "output_type": "stream",
86 | "text": [
87 | "Primeras filas de X_train estandarizado:\n"
88 | ]
89 | },
90 | {
91 | "data": {
157 | "text/plain": [
158 | " sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)\n",
159 | "0 -0.413416 -1.462003 -0.099511 -0.323398\n",
160 | "1 0.551222 -0.502563 0.717703 0.353032\n",
161 | "2 0.671802 0.217016 0.951192 0.758890\n",
162 | "3 0.912961 -0.022844 0.309096 0.217746\n",
163 | "4 1.636440 1.416315 1.301427 1.705891"
164 | ]
165 | },
166 | "execution_count": 8,
167 | "metadata": {},
168 | "output_type": "execute_result"
169 | }
170 | ],
171 | "source": [
172 | "# Standardization (mean 0, standard deviation 1)\n",
173 | "scaler_standard = StandardScaler()\n",
174 | "X_train_standard = scaler_standard.fit_transform(X_train)\n",
175 | "X_test_standard = scaler_standard.transform(X_test)\n",
176 | "\n",
177 | "print(\"Primeras filas de X_train estandarizado:\")\n",
178 | "pd.DataFrame(X_train_standard, columns=X.columns).head()\n"
179 | ]
180 | },
181 | {
182 | "cell_type": "markdown",
183 | "id": "22aec1a6",
184 | "metadata": {},
185 | "source": [
186 | "## MinMaxScaler $\\frac{x_i - min(x)}{max(x) - min(x)}$"
187 | ]
188 | },
189 | {
190 | "cell_type": "markdown",
191 | "id": "273fc8f4",
192 | "metadata": {},
193 | "source": [
194 | "With the min-max scaler, we can transform and scale the feature values so that every value in the fitted data falls within the range [0, 1]. \n",
195 | "\n",
196 | "The MinMaxScaler class in scikit-learn also lets you specify your own lower and upper bounds for the scaled values through the feature_range parameter. \n"
197 | ]
198 | },
199 | {
200 | "cell_type": "code",
201 | "execution_count": 9,
202 | "id": "eef1ae7a",
203 | "metadata": {},
204 | "outputs": [
205 | {
206 | "name": "stdout",
207 | "output_type": "stream",
208 | "text": [
209 | "\n",
210 | "Primeras filas de X_train escalado con Min-Max:\n"
211 | ]
212 | },
213 | {
214 | "data": {
280 | "text/plain": [
281 | " sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)\n",
282 | "0 0.352941 0.181818 0.464286 0.375000\n",
283 | "1 0.588235 0.363636 0.714286 0.583333\n",
284 | "2 0.617647 0.500000 0.785714 0.708333\n",
285 | "3 0.676471 0.454545 0.589286 0.541667\n",
286 | "4 0.852941 0.727273 0.892857 1.000000"
287 | ]
288 | },
289 | "execution_count": 9,
290 | "metadata": {},
291 | "output_type": "execute_result"
292 | }
293 | ],
294 | "source": [
295 | "# Min-Max scaling (rescales features to the range [0, 1])\n",
296 | "scaler_minmax = MinMaxScaler()\n",
297 | "X_train_minmax = scaler_minmax.fit_transform(X_train)\n",
298 | "X_test_minmax = scaler_minmax.transform(X_test)\n",
299 | "\n",
300 | "print(\"\\nPrimeras filas de X_train escalado con Min-Max:\")\n",
301 | "pd.DataFrame(X_train_minmax, columns=X.columns).head()\n"
302 | ]
303 | },
304 | {
305 | "cell_type": "markdown",
306 | "id": "152864fc",
307 | "metadata": {},
308 | "source": [
309 | "## RobustScaler $\\frac{x_i - median(x)}{IQR_{(1,3)}(x)}$"
310 | ]
311 | },
312 | {
313 | "cell_type": "markdown",
314 | "id": "ac12c200",
315 | "metadata": {},
316 | "source": [
317 | "It uses the median and the interquartile range (IQR) to scale the features, which makes it more robust to outliers."
318 | ]
319 | },
320 | {
321 | "cell_type": "code",
322 | "execution_count": 10,
323 | "id": "a30623a4",
324 | "metadata": {},
325 | "outputs": [
326 | {
327 | "name": "stdout",
328 | "output_type": "stream",
329 | "text": [
330 | "\n",
331 | "Primeras filas de X_train escalado con RobustScaler:\n"
332 | ]
333 | },
334 | {
335 | "data": {
401 | "text/plain": [
402 | " sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)\n",
403 | "0 -0.230769 -1.2 -0.176471 -0.214286\n",
404 | "1 0.384615 -0.4 0.235294 0.142857\n",
405 | "2 0.461538 0.2 0.352941 0.357143\n",
406 | "3 0.615385 0.0 0.029412 0.071429\n",
407 | "4 1.076923 1.2 0.529412 0.857143"
408 | ]
409 | },
410 | "execution_count": 10,
411 | "metadata": {},
412 | "output_type": "execute_result"
413 | }
414 | ],
415 | "source": [
416 | "# Robust scaling \n",
417 | "scaler_robust = RobustScaler()\n",
418 | "X_train_robust = scaler_robust.fit_transform(X_train)\n",
419 | "X_test_robust = scaler_robust.transform(X_test)\n",
420 | "\n",
421 | "print(\"\\nPrimeras filas de X_train escalado con RobustScaler:\")\n",
422 | "pd.DataFrame(X_train_robust, columns=X.columns).head()\n"
423 | ]
424 | }
425 | ],
426 | "metadata": {
427 | "kernelspec": {
428 | "display_name": "Python 3 (ipykernel)",
429 | "language": "python",
430 | "name": "python3"
431 | },
432 | "language_info": {
433 | "codemirror_mode": {
434 | "name": "ipython",
435 | "version": 3
436 | },
437 | "file_extension": ".py",
438 | "mimetype": "text/x-python",
439 | "name": "python",
440 | "nbconvert_exporter": "python",
441 | "pygments_lexer": "ipython3",
442 | "version": "3.11.4"
443 | }
444 | },
445 | "nbformat": 4,
446 | "nbformat_minor": 5
447 | }
448 |
--------------------------------------------------------------------------------
/Funcion_Aplicar_varios_algoritmos.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 20,
6 | "id": "24f0f807",
7 | "metadata": {},
8 | "outputs": [],
9 | "source": [
10 | "import pandas as pd\n",
11 | "from sklearn.linear_model import LogisticRegression\n",
12 | "from sklearn.neighbors import KNeighborsClassifier\n",
13 | "from sklearn.ensemble import RandomForestClassifier \n",
14 | "from sklearn.metrics import f1_score\n",
15 | "from sklearn.metrics import roc_auc_score\n",
16 | "from sklearn.metrics import precision_score\n",
17 | "from sklearn.metrics import recall_score\n",
18 | "from sklearn.metrics import accuracy_score\n",
19 | "import warnings\n",
20 | "warnings.filterwarnings(\"ignore\")"
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 11,
26 | "id": "1d4352c6",
27 | "metadata": {},
28 | "outputs": [],
29 | "source": [
30 | "X_train = pd.read_csv(\"X_train.csv\") "
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": 12,
36 | "id": "63770ee0",
37 | "metadata": {},
38 | "outputs": [],
39 | "source": [
40 | "y_train = pd.read_csv(\"y_train.csv\") "
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": 13,
46 | "id": "90209f4b",
47 | "metadata": {},
48 | "outputs": [],
49 | "source": [
50 | "X_test = pd.read_csv(\"X_test.csv\") "
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": 14,
56 | "id": "515093b1",
57 | "metadata": {},
58 | "outputs": [],
59 | "source": [
60 | "y_test = pd.read_csv(\"y_test.csv\") "
61 | ]
62 | },
63 | {
64 | "cell_type": "code",
65 | "execution_count": 15,
66 | "id": "54e8fce8",
67 | "metadata": {},
68 | "outputs": [],
69 | "source": [
70 | "def evaluate_model(model, X_train, y_train, X_test, y_test):\n",
71 | " model.fit(X_train, y_train)\n",
72 | " y_train_pred = model.predict(X_train)\n",
73 | " y_test_pred = model.predict(X_test)\n",
74 | " df = pd.DataFrame({'Accuracy_train': [accuracy_score(y_train, y_train_pred)],\n",
75 | " 'Accuracy_test': [accuracy_score(y_test, y_test_pred)],\n",
76 | " 'Precision_train': [precision_score(y_train, y_train_pred)],\n",
77 | " 'Precision_test': [precision_score(y_test, y_test_pred)],\n",
78 | " 'Recall_train': [recall_score(y_train, y_train_pred)],\n",
79 | " 'Recall_test': [recall_score(y_test, y_test_pred)],\n",
80 | " 'f1_score_train': [f1_score(y_train, y_train_pred)],\n",
81 | " 'f1_score_test': [f1_score(y_test, y_test_pred)],\n",
82 | " 'Roc_auc_train': [roc_auc_score(y_train, y_train_pred)],\n",
83 | " 'Roc_auc_test': [roc_auc_score(y_test, y_test_pred)],\n",
84 | " \n",
85 | " })\n",
86 | " return df"
87 | ]
88 | },
89 | {
90 | "cell_type": "code",
91 | "execution_count": 16,
92 | "id": "a717df77",
93 | "metadata": {},
94 | "outputs": [],
95 | "source": [
96 | "models = {\n",
97 | " 'Logistic Regression': LogisticRegression(random_state=42),\n",
98 | " 'KNN':KNeighborsClassifier(),\n",
99 | " 'Random Forest': RandomForestClassifier(random_state = 42),\n",
100 | "}"
101 | ]
102 | },
103 | {
104 | "cell_type": "code",
105 | "execution_count": 21,
106 | "id": "a41b2b5f",
107 | "metadata": {},
108 | "outputs": [],
109 | "source": [
110 | "# Iterate over the dictionary and evaluate each model\n",
111 | "results = []\n",
112 | "for name, model in models.items():\n",
113 | " model_results = evaluate_model(model, X_train, y_train, X_test, y_test)\n",
114 | " model_results['model'] = name\n",
115 | " results.append(model_results)"
116 | ]
117 | },
118 | {
119 | "cell_type": "code",
120 | "execution_count": 22,
121 | "id": "92822939",
122 | "metadata": {},
123 | "outputs": [],
124 | "source": [
125 | "resultados = pd.concat(results, axis=0).reset_index(drop=True)"
126 | ]
127 | },
128 | {
129 | "cell_type": "code",
130 | "execution_count": 23,
131 | "id": "4dc5a126",
132 | "metadata": {},
133 | "outputs": [
134 | {
135 | "data": {
215 | "text/plain": [
216 | " Accuracy_train Accuracy_test Precision_train Precision_test \\\n",
217 | "0 0.810214 0.805865 0.853608 0.854895 \n",
218 | "1 0.852708 0.795482 0.905435 0.852291 \n",
219 | "2 0.932223 0.785161 0.965689 0.807234 \n",
220 | "\n",
221 | " Recall_train Recall_test f1_score_train f1_score_test Roc_auc_train \\\n",
222 | "0 0.748054 0.740032 0.797352 0.793328 0.810105 \n",
223 | "1 0.787108 0.718285 0.842135 0.779571 0.852594 \n",
224 | "2 0.896044 0.753156 0.929564 0.779258 0.932160 \n",
225 | "\n",
226 | " Roc_auc_test model \n",
227 | "0 0.806328 Logistic Regression \n",
228 | "1 0.796025 KNN \n",
229 | "2 0.785387 Random Forest "
230 | ]
231 | },
232 | "execution_count": 23,
233 | "metadata": {},
234 | "output_type": "execute_result"
235 | }
236 | ],
237 | "source": [
238 | "resultados.sort_values(by='Accuracy_test', ascending=False)"
239 | ]
240 | }
241 | ],
242 | "metadata": {
243 | "kernelspec": {
244 | "display_name": "Python 3 (ipykernel)",
245 | "language": "python",
246 | "name": "python3"
247 | },
248 | "language_info": {
249 | "codemirror_mode": {
250 | "name": "ipython",
251 | "version": 3
252 | },
253 | "file_extension": ".py",
254 | "mimetype": "text/x-python",
255 | "name": "python",
256 | "nbconvert_exporter": "python",
257 | "pygments_lexer": "ipython3",
258 | "version": "3.9.13"
259 | }
260 | },
261 | "nbformat": 4,
262 | "nbformat_minor": 5
263 | }
264 |
--------------------------------------------------------------------------------
/Principal Component Analysis-.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 24,
6 | "id": "988ac54d",
7 | "metadata": {},
8 | "outputs": [],
9 | "source": [
10 | "# Basic libraries\n",
11 | "import pandas as pd\n",
12 | "import matplotlib.pyplot as plt\n",
13 | "import seaborn as sns\n",
14 | "\n",
15 | "import warnings\n",
16 | "warnings.filterwarnings(\"ignore\")"
17 | ]
18 | },
19 | {
20 | "cell_type": "markdown",
21 | "id": "937ea1c8",
22 | "metadata": {},
23 | "source": [
24 | "## What is Principal Component Analysis?\n",
25 | "\n",
26 | "PCA is a linear dimensionality reduction technique that transforms a set of correlated variables into a smaller number of uncorrelated variables called principal components, while retaining as much of the variation in the original data as possible."
27 | ]
28 | },
29 | {
30 | "cell_type": "code",
31 | "execution_count": 4,
32 | "id": "e0033ca9",
33 | "metadata": {},
34 | "outputs": [],
35 | "source": [
36 | "# Import the dataset\n",
37 | "dataset = pd.read_csv(\"iris.csv\")"
38 | ]
39 | },
40 | {
41 | "cell_type": "code",
42 | "execution_count": 5,
43 | "id": "11045ba1",
44 | "metadata": {},
45 | "outputs": [
46 | {
47 | "data": {
119 | "text/plain": [
120 | " sepal_length sepal_width petal_length petal_width species\n",
121 | "0 5.1 3.5 1.4 0.2 setosa\n",
122 | "1 4.9 3.0 1.4 0.2 setosa\n",
123 | "2 4.7 3.2 1.3 0.2 setosa\n",
124 | "3 4.6 3.1 1.5 0.2 setosa\n",
125 | "4 5.0 3.6 1.4 0.2 setosa"
126 | ]
127 | },
128 | "execution_count": 5,
129 | "metadata": {},
130 | "output_type": "execute_result"
131 | }
132 | ],
133 | "source": [
134 | "# Look at the dataset\n",
135 | "dataset.head()"
136 | ]
137 | },
138 | {
139 | "cell_type": "code",
140 | "execution_count": 9,
141 | "id": "59a394ee",
142 | "metadata": {},
143 | "outputs": [],
144 | "source": [
145 | "# Preprocessing\n",
146 | "X = dataset.drop('species', axis=1)\n",
147 | "y = dataset['species']"
148 | ]
149 | },
150 | {
151 | "cell_type": "code",
152 | "execution_count": 10,
153 | "id": "5455602f",
154 | "metadata": {},
155 | "outputs": [],
156 | "source": [
157 | "# Split into train and test sets\n",
158 | "from sklearn.model_selection import train_test_split\n",
159 | "\n",
160 | "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)"
161 | ]
162 | },
163 | {
164 | "cell_type": "code",
165 | "execution_count": 11,
166 | "id": "ebb4ec14",
167 | "metadata": {},
168 | "outputs": [],
169 | "source": [
170 | "# Standardize the features so that PCA is not dominated by features with larger scales\n",
171 | "from sklearn.preprocessing import StandardScaler\n",
172 | "\n",
173 | "sc = StandardScaler()\n",
174 | "X_train = sc.fit_transform(X_train)\n",
175 | "X_test = sc.transform(X_test)"
176 | ]
177 | },
178 | {
179 | "cell_type": "code",
180 | "execution_count": 12,
181 | "id": "63b92fdd",
182 | "metadata": {},
183 | "outputs": [],
184 | "source": [
185 | "# Apply PCA\n",
186 | "from sklearn.decomposition import PCA\n",
187 | "\n",
188 | "pca = PCA()\n",
189 | "X_train = pca.fit_transform(X_train)\n",
190 | "X_test = pca.transform(X_test)"
191 | ]
192 | },
193 | {
194 | "cell_type": "code",
195 | "execution_count": 13,
196 | "id": "e683a085",
197 | "metadata": {},
198 | "outputs": [
199 | {
200 | "data": {
201 | "text/plain": [
202 | "array([0.72551423, 0.23000922, 0.03960774, 0.00486882])"
203 | ]
204 | },
205 | "execution_count": 13,
206 | "metadata": {},
207 | "output_type": "execute_result"
208 | }
209 | ],
210 | "source": [
211 | "# Explained variance ratio of each component\n",
212 | "explained_variance = pca.explained_variance_ratio_\n",
213 | "explained_variance"
214 | ]
215 | },
216 | {
217 | "cell_type": "markdown",
218 | "id": "e1965326",
219 | "metadata": {},
220 | "source": [
221 | "The first principal component accounts for 72.55% of the variance, and the second principal component for a further 23.00%. Together, the first two principal components capture 95.55% (72.55 + 23.00) of the variance in the feature set."
222 | ]
223 | },
224 | {
225 | "cell_type": "code",
226 | "execution_count": 15,
227 | "id": "ec6a5f27",
228 | "metadata": {},
229 | "outputs": [],
230 | "source": [
231 | "components_ = [f'PC{i + 1}' for i in range(len(pca.explained_variance_ratio_))]\n",
232 | "\n",
233 | "comp_df_b = pd.DataFrame(list(zip(components_,pca.explained_variance_ratio_*100)),columns=[\"Componentes\",\"Contribucion\"])"
234 | ]
235 | },
236 | {
237 | "cell_type": "code",
238 | "execution_count": 23,
239 | "id": "47dda0ba",
240 | "metadata": {},
241 | "outputs": [
242 | {
243 | "data": {
244 | "image/png": "[base64-encoded PNG omitted: bar chart of each component's contribution to the variance]",
245 | "text/plain": [
246 | ""
247 | ]
248 | },
249 | "metadata": {},
250 | "output_type": "display_data"
251 | }
252 | ],
253 | "source": [
254 | "plt.figure(figsize=(6, 4))\n",
255 | "\n",
256 | "sns.barplot(\n",
257 | " data=comp_df_b,\n",
258 | " x=\"Componentes\", \n",
259 | " y=\"Contribucion\");\n",
260 | "\n",
261 | "plt.title(\"Percentage contribution to the variance\")\n",
262 | "plt.show()"
263 | ]
264 | }
265 | ],
266 | "metadata": {
267 | "kernelspec": {
268 | "display_name": "Python 3 (ipykernel)",
269 | "language": "python",
270 | "name": "python3"
271 | },
272 | "language_info": {
273 | "codemirror_mode": {
274 | "name": "ipython",
275 | "version": 3
276 | },
277 | "file_extension": ".py",
278 | "mimetype": "text/x-python",
279 | "name": "python",
280 | "nbconvert_exporter": "python",
281 | "pygments_lexer": "ipython3",
282 | "version": "3.9.13"
283 | }
284 | },
285 | "nbformat": 4,
286 | "nbformat_minor": 5
287 | }
288 |
--------------------------------------------------------------------------------
/Proyecto final Data Science - Coderhouse/Predicción de Diabetes/README.md:
--------------------------------------------------------------------------------
1 | ## Data Science Final Project - 04/2023
2 |
3 | Objective:
4 | Apply several machine learning algorithms to predict whether or not a person has diabetes, based on the attributes used as model inputs.
5 |
6 | Context:
7 | We have a report on a group of people for whom a set of health indicators was recorded. The data are labeled: one variable indicates whether each respondent has diabetes, does not have diabetes, or has prediabetes.
8 |
9 | Although there is no cure for diabetes, strategies such as losing weight, eating healthily, staying active, and receiving medical treatment can mitigate the harm of the disease in many patients. Early diagnosis can lead to lifestyle changes and more effective treatment, which makes predictive models of diabetes risk important tools for the public and for public health officials.
10 |
--------------------------------------------------------------------------------
/Proyecto final Data Science - Coderhouse/README.md:
--------------------------------------------------------------------------------
1 | ## Training and optimization of Machine Learning models
2 |
3 | Base requirements
4 |
5 | A notebook (Colab or Jupyter) that must contain:
6 |
7 | - Abstract with motivation and audience: A high-level description of what motivates analyzing the chosen data and of the audience that could benefit from the analysis.
8 | - Questions/problem to solve: Although there may be more than one problem to address, the main one must be framed as a classification or regression problem.
9 | - Brief Exploratory Data Analysis (EDA): Descriptive analysis of the data using visualizations and statistical tools, plus an analysis of missing values.
10 | - Feature engineering: Creation of new variables and transformation of existing ones (e.g. normalization, encoding, etc.)
11 | - Training and testing: Train and test at least 2 different Machine Learning models using some cross-validation method.
12 | - Optimization: Use a hyperparameter optimization technique (e.g. grid search, randomized search, etc.)
13 | - Model selection: Use appropriate metrics to select the best model (e.g. AUC, MSE, etc.)
14 |
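The training, optimization, and model-selection requirements above could be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the synthetic data from `make_classification`, the two candidate models, and the small hyperparameter grid are all assumptions chosen only to keep the example short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Assumption: synthetic binary-classification data stands in for a real dataset.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# One shared cross-validation scheme so every model is scored on the same folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Training and testing: at least two different models (illustrative choices).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}
cv_auc = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in candidates.items()
}

# Optimization: grid search over a small, hypothetical hyperparameter grid.
param_grid = {"max_depth": [3, None], "min_samples_leaf": [1, 5]}
search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid,
    cv=cv,
    scoring="roc_auc",
)
search.fit(X, y)

# Model selection: pick the candidate with the best cross-validated AUC.
best_name = max(cv_auc, key=cv_auc.get)
print(cv_auc)
print(best_name, search.best_params_, round(search.best_score_, 3))
```

For a regression problem the same structure would apply with a regression estimator, `KFold` instead of `StratifiedKFold`, and a scoring string such as `"neg_mean_squared_error"`.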
--------------------------------------------------------------------------------
/Proyecto-Dataset-Iris/README.md:
--------------------------------------------------------------------------------
1 | # Iris
2 |
3 | 
4 |
5 | The famous Iris dataset is, without a doubt, the first dataset most of us work with when learning about machine learning projects.
6 | This project builds a classifier for Iris flowers using a classification algorithm. The dataset contains 50 samples of each of three Iris species: Iris setosa, Iris virginica, and Iris versicolor. Four features were measured for each sample: the length and width of the sepals and petals.
7 |
8 | Libraries
9 |
10 | - pandas
11 | - numpy
12 | - seaborn
13 | - matplotlib
14 | - sklearn
15 |
16 |
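As a rough sketch of the kind of classifier described above, the snippet below loads the copy of the Iris data bundled with scikit-learn and fits a k-nearest-neighbors model. The README does not say which algorithm the notebook uses, so `KNeighborsClassifier`, the 70/30 split, and `n_neighbors=5` are assumptions made for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# 150 samples (50 per species), 4 features: sepal/petal length and width.
iris = load_iris()
X, y = iris.data, iris.target

# Hold out 30% of the data; stratify so each species keeps its proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Assumption: k-nearest neighbors stands in for the notebook's algorithm.
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```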
--------------------------------------------------------------------------------
/Proyecto_calorias/README.md:
--------------------------------------------------------------------------------
1 | ## Machine Learning - Linear regression
2 |
3 |
4 | 
5 |
6 |
7 | In this ML project we predict how many calories are burned as a function of:
8 |
9 | 1) The minutes a person runs, for simple linear regression.
10 | 2) The minutes a person runs, their speed, and their body weight, for multiple linear regression.
11 |
12 | The goal is to find the parameters of the line that best fits the data.
13 |
14 |
15 | Libraries used:
16 |
17 | - numpy
18 | - pandas
19 | - matplotlib
20 | - sklearn
21 |
22 | To solve this problem we follow these steps:
23 |
24 | - Data loading
25 | - Data restructuring
26 | - Data visualization
27 | - Model (hypothesis) creation and fitting
28 | - Retrieving the fitted model (hypothesis)
29 | - Model visualization
30 | - Measuring the quality of the results
31 |
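To make "finding the parameters of the line that best fits the data" concrete, here is a minimal sketch of simple linear regression using the closed-form least-squares formulas. It is written in plain Python so the arithmetic is visible. The six measurements are made up for illustration and are not taken from `calories_time.csv`.

```python
# Hypothetical measurements (invented for this sketch, not from the project's CSV).
minutes = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
calories = [105.0, 198.0, 310.0, 402.0, 495.0, 610.0]

n = len(minutes)
mean_x = sum(minutes) / n
mean_y = sum(calories) / n

# Least-squares estimates for the line: calories = intercept + slope * minutes
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(minutes, calories))
s_xx = sum((x - mean_x) ** 2 for x in minutes)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Quality of the fit: coefficient of determination R^2 = 1 - SS_res / SS_tot
predictions = [intercept + slope * x for x in minutes]
ss_res = sum((y - p) ** 2 for y, p in zip(calories, predictions))
ss_tot = sum((y - mean_y) ** 2 for y in calories)
r2 = 1 - ss_res / ss_tot

print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.4f}")
```

Fitting scikit-learn's `LinearRegression` on the same minutes reshaped to a single column should give the same slope (`coef_[0]`) and `intercept_`. The multiple-regression case adds speed and weight as extra columns of the feature matrix.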
--------------------------------------------------------------------------------
/Proyecto_salarios/README.md:
--------------------------------------------------------------------------------
1 | # Salary review using linear regression
2 |
3 | Business context: You are a data scientist at an organization. Your company is undergoing an internal review of its hiring and employee
4 | compensation practices. In recent years, the company has had little success converting the high-quality female candidates it wanted to hire.
5 | Management hypothesizes that this is due to possible pay discrimination and wants to find out what is causing it.
6 |
7 | Business problem: As part of the internal review, the human resources department has asked you to investigate the following question specifically:
8 | "Overall, are men paid more than women in your organization? If so, what is driving this gap?"
9 |
10 | Analytical context: The human resources department has given you an employee database containing information on several attributes
11 | such as performance, education, income, seniority, etc.
12 | You will need to apply linear regression techniques to this dataset to solve the business problem described above.
13 | Linear regression quantifies the relationship between the dependent variable (salary) and the independent variables (for example, education, income, seniority, etc.)
14 |
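A minimal sketch of how the business question could be approached is shown below. It compares the unadjusted gap in mean income between men and women with the coefficient on a gender dummy in a regression that also controls for seniority. Several things here are assumptions: treating the `income` column as the salary measure, controlling only for `seniority`, and using just the first eight rows of `data_rrhh.csv`, which is far too small a sample for real conclusions.

```python
import io

import numpy as np
import pandas as pd

# The first eight data rows of data_rrhh.csv, embedded so the sketch runs on its own.
rows = io.StringIO(
    "jobtitle,gender,age,performance,education,department,seniority,income,bonus\n"
    "Graphic Designer,Female,18,5,College,Operations,2,42363,9938\n"
    "Software Engineer,Male,21,5,College,Management,5,108476,11128\n"
    "Warehouse Associate,Female,19,4,PhD,Administration,5,90208,9268\n"
    "Software Engineer,Male,20,5,Masters,Sales,4,108080,10154\n"
    "Graphic Designer,Male,26,5,Masters,Engineering,5,99464,9319\n"
    "IT,Female,20,5,PhD,Operations,4,70890,10126\n"
    "Graphic Designer,Female,20,5,College,Sales,4,67585,10541\n"
    "Software Engineer,Male,18,4,PhD,Engineering,5,97523,10240\n"
)
df = pd.read_csv(rows)

# Encode gender as a 0/1 dummy so it can enter the regression.
df["is_male"] = (df["gender"] == "Male").astype(int)

# Unadjusted gap: difference in mean income between men and women.
raw_gap = (
    df.loc[df["is_male"] == 1, "income"].mean()
    - df.loc[df["is_male"] == 0, "income"].mean()
)

# Adjusted gap: income ~ intercept + is_male + seniority, fitted by least squares.
X = np.column_stack(
    [np.ones(len(df)), df["is_male"], df["seniority"]]
).astype(float)
y = df["income"].to_numpy(dtype=float)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, male_coef, seniority_coef = coef

print(f"Unadjusted gap: {raw_gap:.2f}")
print(f"Gap after controlling for seniority: {male_coef:.2f}")
```

If the coefficient on `is_male` shrinks once controls such as job title, education, or seniority are added, part of the raw gap is associated with those attributes rather than with gender alone. A regression of this kind describes associations and does not by itself establish the cause of a gap.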
--------------------------------------------------------------------------------
/Proyecto_salarios/data_rrhh.csv:
--------------------------------------------------------------------------------
1 | jobtitle,gender,age,performance,education,department,seniority,income,bonus
2 | Graphic Designer,Female,18,5,College,Operations,2,42363,9938
3 | Software Engineer,Male,21,5,College,Management,5,108476,11128
4 | Warehouse Associate,Female,19,4,PhD,Administration,5,90208,9268
5 | Software Engineer,Male,20,5,Masters,Sales,4,108080,10154
6 | Graphic Designer,Male,26,5,Masters,Engineering,5,99464,9319
7 | IT,Female,20,5,PhD,Operations,4,70890,10126
8 | Graphic Designer,Female,20,5,College,Sales,4,67585,10541
9 | Software Engineer,Male,18,4,PhD,Engineering,5,97523,10240
10 | Graphic Designer,Female,33,5,High School,Engineering,5,112976,9836
11 | Sales Associate,Female,35,5,College,Engineering,5,106524,9941
12 | Graphic Designer,Male,24,5,PhD,Engineering,5,102261,10212
13 | Driver,Female,18,5,College,Management,3,62759,10124
14 | Financial Analyst,Female,19,5,College,Sales,3,84007,8990
15 | Warehouse Associate,Female,30,5,Masters,Administration,5,86220,9583
16 | Warehouse Associate,Female,35,5,PhD,Operations,4,95584,9745
17 | Marketing Associate,Female,27,5,PhD,Management,3,73357,10334
18 | Financial Analyst,Female,23,5,PhD,Administration,5,88422,10768
19 | Warehouse Associate,Female,24,5,College,Administration,5,99545,9949
20 | Sales Associate,Male,21,5,High School,Engineering,5,90386,9461
21 | Data Scientist,Female,30,5,College,Engineering,5,92067,9838
22 | Warehouse Associate,Male,36,5,PhD,Engineering,5,108446,9210
23 | Financial Analyst,Male,24,5,PhD,Sales,3,83323,9329
24 | Warehouse Associate,Male,24,5,PhD,Sales,3,85205,9792
25 | Driver,Female,35,5,College,Management,1,72038,9031
26 | Software Engineer,Male,21,4,PhD,Engineering,5,132823,9625
27 | Graphic Designer,Female,26,5,College,Engineering,2,71182,10015
28 | Software Engineer,Male,19,5,Masters,Administration,4,100305,9618
29 | Graphic Designer,Male,35,5,PhD,Management,3,88566,9469
30 | Warehouse Associate,Male,34,5,PhD,Engineering,4,104271,10177
31 | IT,Female,38,5,College,Administration,5,112392,10504
32 | Software Engineer,Male,20,5,Masters,Management,2,66359,10137
33 | Sales Associate,Male,35,5,Masters,Management,5,103007,10512
34 | Marketing Associate,Female,26,5,PhD,Operations,4,80306,9233
35 | Data Scientist,Female,22,5,High School,Engineering,3,74523,9972
36 | Data Scientist,Female,45,5,PhD,Management,5,113252,10139
37 | Software Engineer,Male,27,5,PhD,Management,4,96040,10050
38 | Software Engineer,Male,21,5,High School,Management,4,91674,9780
39 | Software Engineer,Male,31,5,High School,Operations,3,92928,9094
40 | IT,Female,33,5,College,Management,4,84638,9409
41 | IT,Male,19,5,College,Administration,4,78986,9023
42 | Data Scientist,Male,29,5,PhD,Administration,5,96355,9784
43 | Data Scientist,Male,32,5,PhD,Management,3,87121,8703
44 | Software Engineer,Male,31,5,PhD,Sales,2,88724,8949
45 | Warehouse Associate,Male,22,5,Masters,Engineering,2,58219,9868
46 | Software Engineer,Male,25,5,Masters,Sales,5,119024,11293
47 | Software Engineer,Male,22,5,Masters,Management,3,81503,9729
48 | Data Scientist,Female,21,5,PhD,Operations,3,70643,10033
49 | Graphic Designer,Female,26,5,PhD,Sales,3,74167,10346
50 | Marketing Associate,Male,21,5,College,Management,5,86886,9424
51 | IT,Female,22,5,High School,Sales,4,102120,10215
52 | Sales Associate,Male,18,4,Masters,Operations,5,90780,9353
53 | Software Engineer,Male,18,4,Masters,Operations,4,89474,9104
54 | Marketing Associate,Female,35,5,PhD,Engineering,5,97376,9564
55 | Data Scientist,Female,33,5,Masters,Sales,2,89415,9654
56 | Software Engineer,Female,22,4,Masters,Operations,3,88037,8949
57 | Data Scientist,Female,24,4,College,Administration,4,71105,8623
58 | Driver,Female,31,5,High School,Engineering,2,62692,8886
59 | Software Engineer,Male,40,5,PhD,Management,4,112466,9493
60 | Data Scientist,Female,49,5,College,Operations,3,71193,8331
61 | Marketing Associate,Female,20,5,Masters,Administration,4,76286,9479
62 | Financial Analyst,Male,18,5,PhD,Engineering,4,97328,9874
63 | Graphic Designer,Male,34,5,Masters,Management,3,86766,9841
64 | Warehouse Associate,Female,41,5,College,Operations,2,82453,9564
65 | Software Engineer,Male,19,5,PhD,Administration,3,88814,10153
66 | Driver,Male,31,5,College,Management,4,82832,9958
67 | Sales Associate,Female,33,5,High School,Engineering,3,83034,8841
68 | Warehouse Associate,Female,53,5,College,Engineering,5,127085,9687
69 | Sales Associate,Male,24,5,High School,Administration,1,66168,9242
70 | Data Scientist,Male,18,5,PhD,Administration,4,59212,10209
71 | Data Scientist,Male,18,5,High School,Operations,2,55189,9405
72 | Manager,Male,25,5,College,Engineering,5,131311,11067
73 | Sales Associate,Female,31,5,College,Management,1,52633,9516
74 | Warehouse Associate,Female,31,4,PhD,Operations,5,101154,10324
75 | Financial Analyst,Female,42,5,College,Engineering,5,113640,8762
76 | Data Scientist,Male,22,4,High School,Management,2,50401,8590
77 | Marketing Associate,Female,23,5,PhD,Administration,3,46263,9432
78 | Warehouse Associate,Female,36,4,College,Management,3,74938,8234
79 | Marketing Associate,Female,39,5,College,Sales,5,103384,9168
80 | Warehouse Associate,Female,39,5,College,Engineering,2,77584,9798
81 | Graphic Designer,Male,23,5,Masters,Management,3,83031,10240
82 | Financial Analyst,Male,29,5,Masters,Engineering,5,111019,10042
83 | Warehouse Associate,Female,25,3,Masters,Engineering,4,80192,8747
84 | Warehouse Associate,Male,21,5,PhD,Management,1,48755,9931
85 | Warehouse Associate,Male,28,5,High School,Engineering,2,59044,9368
86 | Manager,Male,29,4,PhD,Engineering,5,146615,10173
87 | Data Scientist,Female,32,5,PhD,Engineering,1,46693,8457
88 | IT,Female,48,5,PhD,Operations,3,88152,8857
89 | Data Scientist,Female,38,5,Masters,Administration,5,127156,9979
90 | Graphic Designer,Male,31,5,PhD,Engineering,1,69462,8872
91 | Manager,Male,19,4,PhD,Sales,5,119162,9395
92 | Software Engineer,Male,26,5,PhD,Operations,2,68466,8384
93 | Financial Analyst,Female,36,5,Masters,Administration,3,92622,8509
94 | Marketing Associate,Female,34,5,Masters,Engineering,1,51331,9130
95 | Warehouse Associate,Male,49,5,High School,Sales,4,109419,8375
96 | Software Engineer,Male,18,5,College,Operations,3,80355,9945
97 | Driver,Female,21,4,Masters,Sales,4,77032,8725
98 | Financial Analyst,Female,47,5,High School,Sales,5,146190,8961
99 | Graphic Designer,Female,22,4,Masters,Administration,4,96045,9129
100 | Software Engineer,Male,29,5,High School,Management,1,82534,8914
101 | IT,Male,55,5,College,Management,5,127769,8802
102 | Marketing Associate,Female,34,5,High School,Sales,5,96061,9093
103 | Warehouse Associate,Female,48,5,College,Management,5,111342,10122
104 | Financial Analyst,Female,43,5,PhD,Management,3,117554,8369
105 | Financial Analyst,Female,30,5,High School,Engineering,5,109369,10052
106 | Warehouse Associate,Female,25,4,Masters,Operations,4,92358,9611
107 | Financial Analyst,Female,21,4,Masters,Operations,4,91978,8647
108 | Sales Associate,Male,28,5,PhD,Administration,1,52290,9330
109 | IT,Male,18,5,High School,Engineering,1,47036,9130
110 | Warehouse Associate,Male,46,5,PhD,Sales,1,76517,8502
111 | Marketing Associate,Female,29,4,Masters,Administration,5,89822,8818
112 | Driver,Male,32,4,PhD,Engineering,4,90368,8215
113 | Financial Analyst,Male,26,5,College,Engineering,4,106194,9421
114 | Warehouse Associate,Female,51,5,High School,Management,3,87684,8645
115 | Marketing Associate,Female,40,5,Masters,Administration,3,62040,8056
116 | Sales Associate,Female,52,5,Masters,Engineering,2,84132,7204
117 | Financial Analyst,Female,29,4,Masters,Administration,4,88047,8459
118 | Driver,Female,37,5,Masters,Administration,2,65012,8357
119 | Marketing Associate,Female,23,4,Masters,Engineering,4,59251,8596
120 | Graphic Designer,Female,18,4,PhD,Operations,3,78462,8743
121 | Driver,Male,40,5,College,Management,2,97414,8833
122 | Driver,Female,18,4,Masters,Administration,2,42722,8515
123 | Marketing Associate,Female,26,4,College,Management,3,63695,8897
124 | Data Scientist,Male,51,5,PhD,Operations,4,110731,8643
125 | Software Engineer,Male,54,5,PhD,Administration,4,125572,8245
126 | Driver,Male,49,5,Masters,Operations,2,72754,8014
127 | IT,Female,26,5,College,Administration,2,53956,9454
128 | Sales Associate,Male,58,5,High School,Engineering,5,135013,8972
129 | IT,Female,49,5,College,Management,3,108495,8668
130 | Financial Analyst,Female,27,4,High School,Administration,4,87608,7511
131 | Data Scientist,Male,38,5,College,Administration,5,101173,9255
132 | IT,Male,53,5,PhD,Management,5,126828,9310
133 | Financial Analyst,Male,49,5,High School,Engineering,4,102716,8290
134 | Marketing Associate,Male,26,4,High School,Engineering,5,77793,8700
135 | Marketing Associate,Female,44,5,High School,Sales,1,57469,7539
136 | Graphic Designer,Male,33,5,Masters,Management,3,83758,8498
137 | Graphic Designer,Male,22,5,PhD,Administration,1,40187,8549
138 | Financial Analyst,Male,41,5,Masters,Engineering,3,98531,6855
139 | Financial Analyst,Male,26,4,College,Operations,4,104435,7802
140 | Data Scientist,Male,29,5,High School,Administration,1,67617,9075
141 | Driver,Female,24,4,High School,Administration,5,91049,10375
142 | Manager,Female,32,5,PhD,Operations,1,98519,8943
143 | Marketing Associate,Female,36,4,High School,Operations,5,91713,7627
144 | Warehouse Associate,Female,41,4,Masters,Engineering,3,88994,8873
145 | Financial Analyst,Male,44,5,Masters,Administration,4,119381,9345
146 | Financial Analyst,Female,43,5,High School,Sales,5,113180,9089
147 | Warehouse Associate,Male,48,5,PhD,Management,2,68639,9553
148 | Software Engineer,Female,28,3,PhD,Engineering,4,103763,8383
149 | Software Engineer,Male,40,5,College,Management,1,68295,7902
150 | Manager,Male,21,4,High School,Engineering,3,103107,9036
151 | Warehouse Associate,Female,19,4,College,Sales,2,64844,9042
152 | Driver,Female,61,5,PhD,Sales,4,129132,8472
153 | Warehouse Associate,Female,57,5,High School,Engineering,3,93470,8578
154 | Data Scientist,Male,28,4,Masters,Administration,5,97625,7985
155 | Graphic Designer,Female,52,5,College,Administration,1,66529,7067
156 | Data Scientist,Female,48,5,Masters,Management,1,100115,7679
157 | Warehouse Associate,Female,57,5,PhD,Administration,2,86423,8058
158 | Sales Associate,Male,30,4,Masters,Sales,5,115165,8841
159 | Data Scientist,Female,53,5,Masters,Engineering,5,133495,8552
160 | Marketing Associate,Male,47,5,Masters,Operations,3,86516,8049
161 | IT,Male,18,4,Masters,Administration,4,69855,8347
162 | Sales Associate,Female,35,4,PhD,Sales,1,84844,7205
163 | Graphic Designer,Female,51,5,College,Operations,3,108241,8602
164 | Warehouse Associate,Female,51,5,High School,Sales,3,106478,8744
165 | Financial Analyst,Male,38,4,College,Management,5,114421,7444
166 | Software Engineer,Male,24,4,Masters,Management,1,79208,8602
167 | Software Engineer,Male,49,5,Masters,Administration,5,142105,9653
168 | Sales Associate,Male,22,5,High School,Administration,1,57783,9306
169 | Marketing Associate,Female,33,5,High School,Operations,2,46903,8130
170 | Financial Analyst,Male,36,5,Masters,Operations,1,71858,7475
171 | Marketing Associate,Female,27,5,High School,Management,1,44325,8867
172 | Driver,Male,39,5,High School,Engineering,4,100183,8832
173 | Graphic Designer,Male,19,4,Masters,Sales,1,61000,8872
174 | Software Engineer,Male,59,5,Masters,Engineering,1,109623,6897
175 | Marketing Associate,Female,20,3,PhD,Sales,5,60070,8273
176 | IT,Female,29,4,Masters,Management,3,72542,8109
177 | Driver,Male,54,5,PhD,Administration,4,110049,8928
178 | IT,Female,30,4,Masters,Engineering,2,80981,8201
179 | IT,Female,42,5,College,Operations,2,83499,8950
180 | Driver,Female,42,4,Masters,Management,4,106788,8974
181 | Warehouse Associate,Female,50,4,Masters,Engineering,5,108922,8477
182 | IT,Female,43,5,PhD,Administration,5,127062,9619
183 | Software Engineer,Male,64,5,College,Sales,4,120593,7573
184 | Manager,Male,23,4,PhD,Administration,1,90685,8816
185 | Marketing Associate,Female,33,5,High School,Engineering,1,49830,8543
186 | Software Engineer,Male,39,4,High School,Sales,4,106896,7918
187 | Sales Associate,Male,58,5,PhD,Sales,2,129425,7524
188 | Marketing Associate,Female,24,3,Masters,Operations,5,82338,7414
189 | Marketing Associate,Female,37,4,PhD,Administration,3,86036,7567
190 | Graphic Designer,Male,53,5,High School,Operations,2,93073,8188
191 | Software Engineer,Male,33,5,PhD,Administration,2,87352,8435
192 | Software Engineer,Male,56,5,PhD,Administration,5,139760,9229
193 | Financial Analyst,Female,52,5,High School,Engineering,2,100070,7169
194 | Warehouse Associate,Male,40,5,Masters,Sales,4,102815,9090
195 | Software Engineer,Male,43,4,PhD,Management,5,101262,7794
196 | Data Scientist,Male,19,5,High School,Operations,2,58054,9585
197 | Sales Associate,Female,53,4,Masters,Operations,5,129609,7392
198 | Sales Associate,Male,44,5,Masters,Administration,2,100091,9185
199 | Manager,Male,23,4,Masters,Sales,3,103479,7850
200 | Driver,Female,51,5,College,Operations,1,72827,7285
201 | IT,Male,20,4,PhD,Engineering,2,74073,8509
202 | Financial Analyst,Male,47,5,College,Administration,2,64852,8023
203 | Graphic Designer,Male,49,5,College,Management,4,106887,8347
204 | Software Engineer,Male,26,4,College,Administration,2,77309,7957
205 | Graphic Designer,Female,44,4,High School,Administration,4,100416,7513
206 | Software Engineer,Male,18,3,Masters,Engineering,2,88482,7385
207 | Data Scientist,Female,32,4,College,Management,4,78623,9012
208 | Sales Associate,Male,18,4,High School,Management,4,80135,9007
209 | Warehouse Associate,Female,63,5,High School,Sales,2,106158,7654
210 | Graphic Designer,Female,37,4,College,Engineering,4,89411,9522
211 | Graphic Designer,Male,22,4,Masters,Sales,2,57398,7827
212 | Marketing Associate,Female,64,5,PhD,Sales,2,95687,7279
213 | Graphic Designer,Female,39,4,PhD,Operations,5,120294,8090
214 | Manager,Male,20,4,College,Sales,2,107140,8199
215 | Financial Analyst,Female,39,5,College,Operations,1,76773,7278
216 | Data Scientist,Female,37,4,Masters,Engineering,2,98010,8210
217 | Data Scientist,Female,39,4,PhD,Management,3,91099,8306
218 | Financial Analyst,Male,29,4,High School,Management,4,89559,8338
219 | Software Engineer,Male,58,5,High School,Management,3,120156,7200
220 | Driver,Male,38,5,PhD,Engineering,4,102966,8978
221 | Marketing Associate,Female,64,5,PhD,Sales,5,128970,7346
222 | Warehouse Associate,Female,51,4,High School,Management,5,121376,7832
223 | Financial Analyst,Male,53,5,Masters,Administration,3,112941,7404
224 | Graphic Designer,Male,46,5,High School,Engineering,4,102736,9736
225 | Driver,Female,28,4,High School,Management,1,64008,7185
226 | IT,Male,52,5,Masters,Engineering,1,90024,7350
227 | Financial Analyst,Female,48,4,Masters,Management,5,129406,7280
228 | IT,Female,34,5,College,Operations,1,54245,8245
229 | Marketing Associate,Female,27,4,High School,Administration,5,67530,9311
230 | Graphic Designer,Male,18,4,High School,Sales,4,75506,9289
231 | Graphic Designer,Female,31,4,College,Engineering,2,81826,8375
232 | Software Engineer,Female,27,4,High School,Operations,1,65711,7789
233 | Sales Associate,Male,29,3,Masters,Management,2,61164,6346
234 | Driver,Female,41,5,Masters,Administration,1,73801,8317
235 | Manager,Male,62,5,College,Administration,5,157410,7840
236 | Graphic Designer,Male,23,4,High School,Operations,1,50601,7351
237 | Data Scientist,Male,29,4,High School,Engineering,3,65907,7668
238 | Data Scientist,Male,38,5,High School,Sales,2,80937,8335
239 | Driver,Male,34,4,College,Administration,5,90330,7022
240 | Sales Associate,Female,62,5,College,Operations,2,93268,7657
241 | IT,Male,40,4,College,Administration,5,109345,7791
242 | Software Engineer,Male,27,4,Masters,Engineering,1,66084,7769
243 | Financial Analyst,Male,35,4,PhD,Operations,5,98900,7960
244 | IT,Male,19,4,College,Engineering,3,78113,9335
245 | Software Engineer,Male,36,4,Masters,Engineering,5,130826,9694
246 | Sales Associate,Male,23,3,High School,Engineering,3,85928,7229
247 | IT,Male,43,5,College,Operations,4,98515,9090
248 | Graphic Designer,Female,48,4,College,Administration,4,103133,7793
249 | Marketing Associate,Female,20,4,College,Sales,4,69254,8609
250 | Graphic Designer,Male,27,4,College,Sales,5,114680,8472
251 | Driver,Male,25,3,PhD,Engineering,5,92687,7634
252 | Graphic Designer,Male,21,3,Masters,Engineering,4,72301,8391
253 | Financial Analyst,Male,24,4,Masters,Administration,1,67749,8517
254 | Data Scientist,Female,61,5,Masters,Management,2,97000,7757
255 | Manager,Male,52,5,High School,Operations,2,104526,7253
256 | Data Scientist,Male,26,4,High School,Management,3,80494,8449
257 | Financial Analyst,Female,30,4,High School,Operations,2,40341,7320
258 | Software Engineer,Male,29,4,High School,Engineering,1,74316,8761
259 | IT,Female,46,5,PhD,Sales,3,102247,8307
260 | Driver,Male,49,4,College,Sales,5,104657,6839
261 | Warehouse Associate,Female,37,3,High School,Engineering,5,92722,8010
262 | Sales Associate,Female,51,5,Masters,Management,1,72120,9084
263 | Warehouse Associate,Female,28,4,Masters,Operations,1,62260,7740
264 | Manager,Male,45,5,High School,Engineering,3,123422,8894
265 | Warehouse Associate,Male,48,5,College,Sales,1,62097,7075
266 | Data Scientist,Male,53,5,High School,Management,2,102910,8335
267 | Data Scientist,Female,29,4,High School,Engineering,3,84021,9111
268 | Warehouse Associate,Female,19,4,High School,Operations,1,37026,8636
269 | Driver,Male,18,3,PhD,Operations,3,56318,8407
270 | IT,Male,28,4,High School,Operations,3,64958,7910
271 | Data Scientist,Female,31,4,High School,Operations,2,55954,7999
272 | Graphic Designer,Female,58,4,PhD,Sales,3,108820,7776
273 | IT,Female,41,5,High School,Operations,3,100014,8621
274 | Driver,Male,36,4,PhD,Operations,3,88213,8581
275 | Sales Associate,Female,34,3,High School,Sales,4,108288,7491
276 | Driver,Male,56,5,Masters,Sales,5,100859,8388
277 | Financial Analyst,Female,30,4,Masters,Administration,5,114007,9248
278 | Warehouse Associate,Male,37,4,High School,Management,3,83025,8105
279 | IT,Male,28,4,High School,Administration,5,95964,9147
280 | Warehouse Associate,Female,47,4,College,Administration,2,94973,7885
281 | Graphic Designer,Male,44,4,High School,Sales,4,102108,8339
282 | Data Scientist,Male,24,4,Masters,Sales,2,60736,7873
283 | IT,Male,47,5,PhD,Operations,1,82126,6971
284 | Financial Analyst,Male,33,5,Masters,Operations,1,69878,8389
285 | IT,Female,22,4,Masters,Sales,1,37780,8443
286 | Graphic Designer,Male,57,4,Masters,Management,4,132873,7266
287 | Financial Analyst,Male,20,4,High School,Operations,2,65164,7827
288 | IT,Female,65,5,Masters,Administration,4,121210,7949
289 | Graphic Designer,Female,49,4,PhD,Engineering,2,95169,7415
290 | Sales Associate,Male,19,3,PhD,Management,4,94492,8209
291 | Driver,Female,39,4,PhD,Operations,2,75901,7402
292 | Sales Associate,Male,33,3,Masters,Operations,5,100463,6332
293 | Sales Associate,Female,40,4,Masters,Operations,5,92276,9551
294 | Driver,Female,21,2,College,Engineering,5,92458,6966
295 | Graphic Designer,Male,46,4,High School,Administration,4,88571,7464
296 | Financial Analyst,Male,39,4,PhD,Management,3,87409,7168
297 | Warehouse Associate,Male,58,5,College,Sales,4,133927,8278
298 | Manager,Male,33,4,Masters,Sales,1,90482,7094
299 | Marketing Associate,Female,27,4,High School,Sales,4,87435,9230
300 | Graphic Designer,Female,65,5,High School,Operations,4,130251,7750
301 | Manager,Female,34,3,Masters,Management,2,106560,7257
302 | Software Engineer,Male,56,5,College,Operations,3,107366,7774
303 | Warehouse Associate,Female,34,4,High School,Sales,1,56172,7739
304 | Marketing Associate,Female,35,4,Masters,Sales,2,58807,7316
305 | Financial Analyst,Female,49,4,Masters,Sales,4,126626,7144
306 | Manager,Female,24,3,PhD,Sales,3,104202,8456
307 | Warehouse Associate,Male,39,4,PhD,Operations,3,77793,8093
308 | Warehouse Associate,Male,45,4,High School,Operations,2,84930,6914
309 | Driver,Male,32,3,Masters,Engineering,2,69367,6378
310 | Graphic Designer,Male,65,5,Masters,Operations,5,116751,8505
311 | Software Engineer,Male,52,4,Masters,Administration,5,139841,7641
312 | Driver,Female,52,4,College,Administration,2,78750,6666
313 | Warehouse Associate,Female,38,3,College,Administration,5,93306,8265
314 | Financial Analyst,Male,19,4,Masters,Operations,2,60892,8462
315 | Graphic Designer,Male,20,2,Masters,Engineering,5,99543,6111
316 | Sales Associate,Female,29,3,PhD,Operations,2,56745,6495
317 | Data Scientist,Female,29,4,College,Operations,1,59458,7939
318 | Driver,Female,18,2,Masters,Sales,4,71824,6397
319 | Software Engineer,Male,64,5,High School,Operations,2,111903,6824
320 | IT,Male,55,5,College,Sales,3,104361,7813
321 | Warehouse Associate,Female,22,2,High School,Sales,5,91568,7139
322 | Software Engineer,Male,59,4,Masters,Engineering,4,155834,7101
323 | Financial Analyst,Male,30,5,Masters,Sales,1,83620,7864
324 | Marketing Associate,Female,50,5,PhD,Administration,1,81995,7840
325 | Graphic Designer,Female,27,4,PhD,Sales,1,66125,8465
326 | Driver,Male,27,3,College,Engineering,5,89678,7778
327 | Sales Associate,Female,57,5,Masters,Operations,3,98321,8980
328 | Manager,Male,48,4,College,Management,1,109346,7730
329 | Manager,Female,62,4,Masters,Sales,4,155203,7808
330 | IT,Male,64,5,PhD,Management,2,98100,7460
331 | Data Scientist,Female,29,3,High School,Management,4,73306,7765
332 | Driver,Male,40,5,College,Engineering,2,68797,7817
333 | Financial Analyst,Male,58,4,College,Management,4,120154,6751
334 | Warehouse Associate,Female,22,3,College,Sales,4,76865,8145
335 | Graphic Designer,Female,59,4,PhD,Management,4,125770,8013
336 | Financial Analyst,Female,36,4,High School,Administration,5,119931,8400
337 | Marketing Associate,Female,55,4,Masters,Operations,1,71794,6685
338 | Driver,Female,54,4,College,Management,2,87998,7028
339 | Software Engineer,Male,63,4,High School,Administration,5,140445,7905
340 | Warehouse Associate,Male,18,2,College,Administration,5,85306,7735
341 | Sales Associate,Female,31,3,High School,Operations,3,77275,7089
342 | Manager,Male,48,4,PhD,Sales,4,143887,7622
343 | Driver,Female,58,4,High School,Sales,5,130720,7246
344 | Software Engineer,Female,41,4,PhD,Sales,1,86675,7947
345 | Graphic Designer,Female,22,3,College,Operations,2,56954,7550
346 | IT,Male,63,5,Masters,Operations,5,133472,8688
347 | Data Scientist,Male,23,3,Masters,Administration,1,62753,7242
348 | Sales Associate,Male,63,5,PhD,Operations,4,128173,8360
349 | Software Engineer,Female,28,3,College,Engineering,5,112491,8727
350 | Marketing Associate,Female,22,4,PhD,Operations,1,36548,8090
351 | Driver,Female,62,4,PhD,Engineering,1,88781,6413
352 | Software Engineer,Male,65,4,High School,Management,5,157852,6669
353 | Financial Analyst,Male,51,4,PhD,Operations,4,113730,6578
354 | Driver,Male,57,5,College,Management,1,68403,6893
355 | Manager,Male,59,4,PhD,Sales,5,176789,6773
356 | Data Scientist,Female,45,3,Masters,Sales,5,129226,6912
357 | Graphic Designer,Male,24,3,College,Engineering,2,59399,6692
358 | Sales Associate,Male,61,5,PhD,Management,1,102473,8468
359 | Manager,Female,60,4,PhD,Management,3,160614,8354
360 | Sales Associate,Female,28,3,PhD,Sales,3,89570,7411
361 | Marketing Associate,Female,41,4,College,Sales,3,73989,8780
362 | Financial Analyst,Female,35,4,Masters,Management,2,67762,7449
363 | Financial Analyst,Male,35,4,College,Administration,4,90307,7663
364 | Driver,Female,38,3,Masters,Management,2,77751,6554
365 | Data Scientist,Female,57,4,High School,Management,1,87150,6814
366 | Driver,Female,19,2,College,Sales,2,79291,5000
367 | Graphic Designer,Male,55,5,Masters,Engineering,1,79144,8328
368 | Manager,Male,62,5,PhD,Engineering,3,150467,8100
369 | Driver,Female,31,3,Masters,Engineering,2,57070,6615
370 | Marketing Associate,Female,49,4,College,Sales,4,102201,7640
371 | Driver,Male,37,4,College,Engineering,3,70088,8244
372 | Marketing Associate,Female,24,3,PhD,Operations,4,57782,7144
373 | Manager,Male,32,3,High School,Operations,4,117629,6291
374 | Sales Associate,Male,28,3,Masters,Administration,5,118653,7036
375 | Marketing Associate,Female,38,2,High School,Management,5,71772,5764
376 | Software Engineer,Male,65,5,Masters,Sales,1,134758,7500
377 | Marketing Associate,Female,53,5,College,Administration,1,66438,6935
378 | Warehouse Associate,Male,46,3,High School,Operations,4,87134,6184
379 | Marketing Associate,Female,57,4,College,Operations,3,79674,6993
380 | Marketing Associate,Female,45,3,College,Engineering,4,85161,6682
381 | Warehouse Associate,Female,34,2,PhD,Management,5,114479,6700
382 | Marketing Associate,Male,25,3,College,Operations,4,65547,6530
383 | Data Scientist,Female,33,3,High School,Administration,3,71946,6386
384 | Financial Analyst,Male,64,5,High School,Operations,3,96777,7736
385 | Marketing Associate,Female,23,3,PhD,Administration,3,62054,6955
386 | Software Engineer,Male,26,3,High School,Sales,2,61527,6345
387 | Warehouse Associate,Female,24,3,High School,Management,1,64624,8253
388 | Driver,Male,49,4,Masters,Administration,5,118629,6740
389 | Data Scientist,Male,19,2,PhD,Management,5,80171,6454
390 | Sales Associate,Female,47,3,College,Administration,4,105410,6363
391 | Software Engineer,Male,36,4,PhD,Management,3,112134,8587
392 | Warehouse Associate,Female,42,3,Masters,Engineering,1,90005,6631
393 | Marketing Associate,Female,50,5,High School,Operations,1,70374,7466
394 | Manager,Male,21,2,College,Administration,5,121682,7189
395 | Marketing Associate,Female,49,4,PhD,Operations,5,104355,7555
396 | IT,Female,49,4,College,Engineering,1,75811,5520
397 | Sales Associate,Male,48,3,Masters,Engineering,5,113544,6839
398 | Graphic Designer,Female,22,2,Masters,Operations,4,107025,6577
399 | Financial Analyst,Male,20,3,PhD,Operations,3,78564,7875
400 | Financial Analyst,Male,47,5,College,Operations,3,91905,7614
401 | Financial Analyst,Male,60,5,High School,Engineering,4,121728,7932
402 | Marketing Associate,Male,27,4,College,Management,1,54817,6519
403 | Sales Associate,Male,20,3,College,Operations,1,37898,6448
404 | IT,Male,59,5,PhD,Operations,1,79544,6760
405 | Data Scientist,Female,20,3,High School,Engineering,1,40056,7632
406 | IT,Female,45,4,College,Operations,2,70846,7404
407 | Marketing Associate,Female,34,3,High School,Sales,5,79707,7196
408 | Software Engineer,Male,59,4,High School,Operations,4,124817,7268
409 | Manager,Male,53,4,College,Management,1,123114,6946
410 | Financial Analyst,Female,20,3,High School,Operations,3,62377,6999
411 | Data Scientist,Male,28,3,Masters,Sales,3,82850,5794
412 | IT,Female,38,4,High School,Sales,2,75932,7003
413 | Manager,Male,43,3,PhD,Management,2,121949,6739
414 | Marketing Associate,Female,36,3,PhD,Sales,3,82755,7325
415 | Data Scientist,Male,45,3,PhD,Administration,5,122037,8147
416 | Warehouse Associate,Female,19,2,Masters,Administration,2,78021,6970
417 | IT,Male,18,4,PhD,Sales,1,56309,7559
418 | Financial Analyst,Male,59,5,PhD,Management,1,90002,8123
419 | IT,Male,34,3,College,Sales,4,91066,6257
420 | Manager,Male,32,3,Masters,Management,5,118088,7721
421 | Software Engineer,Male,29,3,Masters,Operations,2,92099,7246
422 | Financial Analyst,Female,46,4,College,Administration,4,116758,7744
423 | Marketing Associate,Female,49,5,High School,Sales,2,96500,8092
424 | Manager,Male,48,3,College,Management,5,138851,6929
425 | Marketing Associate,Female,34,4,College,Administration,1,49632,7169
426 | Driver,Female,45,3,PhD,Management,3,76700,7242
427 | IT,Male,58,4,College,Management,5,131275,6997
428 | Manager,Male,21,2,PhD,Operations,3,89648,6573
429 | Graphic Designer,Female,26,2,Masters,Administration,5,100516,6737
430 | Manager,Male,20,2,PhD,Operations,4,112854,6564
431 | Software Engineer,Male,60,4,Masters,Sales,5,155676,6586
432 | Driver,Male,57,4,College,Sales,4,135781,6042
433 | Sales Associate,Female,47,3,College,Engineering,2,84935,6122
434 | Warehouse Associate,Male,31,4,High School,Engineering,2,66407,7522
435 | Driver,Female,56,4,PhD,Sales,3,101249,7329
436 | Manager,Male,57,4,Masters,Administration,4,143189,6321
437 | Graphic Designer,Male,63,5,PhD,Engineering,1,105800,7822
438 | Software Engineer,Male,63,5,College,Operations,2,124871,7327
439 | Data Scientist,Male,23,3,High School,Sales,3,83894,6922
440 | Manager,Female,45,2,High School,Management,5,144146,6340
441 | Marketing Associate,Female,46,4,PhD,Sales,2,69268,7106
442 | Sales Associate,Male,63,5,High School,Sales,1,85948,7646
443 | Software Engineer,Male,20,2,PhD,Engineering,4,92289,6643
444 | Financial Analyst,Male,45,4,High School,Operations,2,75397,6490
445 | Software Engineer,Male,65,4,High School,Engineering,1,102597,5937
446 | Sales Associate,Female,63,4,PhD,Engineering,3,127561,7552
447 | Warehouse Associate,Male,26,3,High School,Sales,3,72945,7279
448 | Software Engineer,Male,60,3,PhD,Management,5,145632,6439
449 | Marketing Associate,Female,36,3,Masters,Sales,3,77223,5272
450 | Manager,Male,63,5,Masters,Operations,1,130093,6416
451 | Marketing Associate,Male,33,3,PhD,Administration,2,53220,5153
452 | Warehouse Associate,Female,41,3,College,Engineering,3,92080,6897
453 | Graphic Designer,Female,55,3,College,Sales,5,118762,6545
454 | Sales Associate,Female,21,2,College,Operations,5,85398,7113
455 | Data Scientist,Male,31,3,College,Operations,1,75227,6027
456 | Data Scientist,Male,41,3,Masters,Administration,4,104314,6611
457 | Manager,Male,27,3,High School,Operations,1,83262,6884
458 | Data Scientist,Female,55,3,Masters,Administration,4,123172,5797
459 | Data Scientist,Male,41,4,PhD,Administration,1,67040,7412
460 | Warehouse Associate,Male,33,3,College,Administration,1,77655,6581
461 | Data Scientist,Female,46,3,College,Administration,3,107476,6182
462 | Software Engineer,Male,45,4,Masters,Operations,2,89811,6471
463 | Graphic Designer,Male,47,3,College,Administration,5,106893,7958
464 | Software Engineer,Male,58,4,Masters,Engineering,5,154039,7304
465 | Warehouse Associate,Male,64,4,High School,Operations,3,116774,6160
466 | Software Engineer,Male,40,2,High School,Management,2,98281,5646
467 | Marketing Associate,Female,18,3,High School,Engineering,2,34208,6620
468 | Software Engineer,Male,42,3,Masters,Engineering,5,121821,6499
469 | Software Engineer,Male,30,2,Masters,Engineering,5,113122,6060
470 | Software Engineer,Male,39,4,PhD,Management,1,92154,7642
471 | Manager,Male,31,3,PhD,Engineering,2,110643,7291
472 | Data Scientist,Female,34,3,PhD,Administration,1,82345,7211
473 | Data Scientist,Male,32,3,PhD,Sales,4,103908,8109
474 | Marketing Associate,Female,24,3,High School,Engineering,3,49303,6615
475 | IT,Female,46,4,High School,Operations,3,98456,7177
476 | Warehouse Associate,Male,49,3,PhD,Operations,4,94927,6274
477 | Software Engineer,Male,30,2,College,Administration,5,76654,6696
478 | Manager,Male,46,4,Masters,Sales,1,107859,6989
479 | Manager,Female,53,4,High School,Operations,3,129885,7314
480 | Graphic Designer,Male,62,5,Masters,Sales,1,85702,6556
481 | Financial Analyst,Female,63,4,Masters,Operations,1,97797,5239
482 | Manager,Male,18,1,Masters,Management,5,115567,4775
483 | Sales Associate,Female,30,3,High School,Administration,4,91566,6996
484 | Sales Associate,Female,61,3,PhD,Management,4,131608,6537
485 | Sales Associate,Male,58,4,PhD,Administration,3,118888,7438
486 | Graphic Designer,Male,55,4,College,Sales,2,75833,6942
487 | Financial Analyst,Female,36,3,College,Engineering,2,65750,6177
488 | Graphic Designer,Male,65,4,College,Management,4,123371,6931
489 | Marketing Associate,Female,49,4,Masters,Sales,3,76806,6662
490 | Warehouse Associate,Male,38,3,College,Management,3,83774,7127
491 | Warehouse Associate,Female,22,1,PhD,Operations,5,114733,6290
492 | Financial Analyst,Male,22,3,PhD,Administration,2,69320,6765
493 | Sales Associate,Male,51,3,PhD,Operations,4,110746,5947
494 | Graphic Designer,Male,61,4,Masters,Engineering,2,105632,6394
495 | Manager,Male,62,4,PhD,Engineering,1,117648,4765
496 | Driver,Female,21,1,PhD,Management,4,89170,6722
497 | Sales Associate,Male,28,3,College,Management,3,85986,8160
498 | Financial Analyst,Male,32,4,Masters,Sales,3,78132,8035
499 | Manager,Male,57,4,Masters,Operations,3,135073,6898
500 | Driver,Female,52,4,High School,Operations,1,78751,6258
501 | Software Engineer,Male,64,5,High School,Sales,2,118231,7440
502 | Data Scientist,Male,58,5,High School,Operations,1,86651,7250
503 | IT,Female,36,3,PhD,Management,3,75518,6711
504 | Data Scientist,Male,44,3,Masters,Management,3,114693,6832
505 | Financial Analyst,Male,39,3,Masters,Engineering,4,116875,5479
506 | IT,Male,23,4,PhD,Administration,1,64468,8113
507 | Marketing Associate,Male,40,2,PhD,Sales,5,116834,5516
508 | Marketing Associate,Female,18,2,Masters,Management,5,75294,6496
509 | Graphic Designer,Female,63,3,College,Management,5,127608,6558
510 | Financial Analyst,Female,43,3,College,Sales,5,116069,6432
511 | Driver,Female,50,3,High School,Engineering,2,86122,6295
512 | Warehouse Associate,Male,23,2,College,Management,4,81988,7160
513 | Warehouse Associate,Female,22,2,Masters,Sales,2,68300,5953
514 | Manager,Male,60,4,College,Sales,2,148178,6777
515 | Financial Analyst,Female,50,3,PhD,Operations,2,94590,5327
516 | Sales Associate,Female,33,2,College,Engineering,4,87418,6430
517 | Warehouse Associate,Male,25,3,College,Administration,1,68384,7443
518 | Manager,Male,48,3,PhD,Operations,1,106056,4161
519 | Software Engineer,Male,35,2,Masters,Management,3,110886,6228
520 | Graphic Designer,Female,37,2,Masters,Management,3,104082,5477
521 | IT,Female,43,3,Masters,Management,4,94354,7052
522 | Warehouse Associate,Male,53,3,PhD,Engineering,5,128669,6866
523 | Financial Analyst,Male,40,3,PhD,Sales,3,88725,6243
524 | Manager,Male,56,3,Masters,Engineering,4,149771,5495
525 | Financial Analyst,Male,45,4,Masters,Management,2,98535,8001
526 | Data Scientist,Female,57,4,PhD,Sales,1,90713,6552
527 | IT,Male,33,3,College,Management,3,88749,7433
528 | Graphic Designer,Male,49,3,Masters,Operations,5,126100,6336
529 | Manager,Male,23,3,College,Sales,3,110594,6753
530 | Manager,Male,60,4,College,Management,3,157644,7213
531 | Manager,Female,41,3,PhD,Management,1,113009,6812
532 | Financial Analyst,Male,20,2,PhD,Engineering,4,105601,5943
533 | Financial Analyst,Male,56,4,High School,Engineering,3,100769,6453
534 | Warehouse Associate,Male,38,3,High School,Administration,2,82244,6020
535 | Marketing Associate,Female,55,4,High School,Administration,2,72030,5569
536 | Driver,Female,20,1,College,Management,3,66203,5627
537 | Driver,Female,34,2,High School,Management,3,67210,5662
538 | Manager,Female,32,2,PhD,Management,2,115383,5878
539 | Graphic Designer,Female,49,3,PhD,Engineering,4,120697,7205
540 | Graphic Designer,Male,29,3,PhD,Operations,2,70311,6235
541 | Data Scientist,Female,49,3,College,Engineering,4,125770,7301
542 | Warehouse Associate,Male,33,1,College,Engineering,5,100465,5520
543 | Software Engineer,Male,30,2,College,Administration,4,83264,6251
544 | Warehouse Associate,Male,61,3,PhD,Engineering,4,126375,6630
545 | Driver,Male,25,3,College,Engineering,1,43076,5829
546 | Data Scientist,Female,37,2,College,Sales,5,97846,7198
547 | Sales Associate,Female,24,2,Masters,Management,3,76522,6900
548 | Driver,Male,42,3,PhD,Operations,4,111726,5989
549 | Sales Associate,Male,55,4,College,Administration,3,112188,6689
550 | Data Scientist,Male,35,2,College,Operations,5,98959,6572
551 | Sales Associate,Female,35,2,College,Management,4,87131,6356
552 | IT,Male,40,3,College,Administration,3,77881,5569
553 | IT,Male,36,3,Masters,Engineering,2,65225,6462
554 | Driver,Male,30,4,Masters,Operations,1,78472,8439
555 | Financial Analyst,Male,58,4,College,Operations,4,126269,6490
556 | Warehouse Associate,Male,61,3,High School,Management,3,106233,5792
557 | IT,Female,37,3,High School,Sales,2,56656,5914
558 | IT,Female,42,3,High School,Engineering,2,70157,6016
559 | Data Scientist,Female,41,2,PhD,Engineering,2,98366,6398
560 | Manager,Male,59,4,College,Engineering,2,127013,6124
561 | Software Engineer,Male,53,3,High School,Management,3,106146,6618
562 | Manager,Male,22,2,Masters,Operations,4,90736,6219
563 | Graphic Designer,Male,44,3,PhD,Sales,1,66611,5924
564 | Manager,Male,57,3,PhD,Operations,5,150914,5982
565 | Graphic Designer,Male,30,2,Masters,Management,4,70559,6497
566 | Data Scientist,Female,33,3,College,Sales,1,73798,6224
567 | Graphic Designer,Male,22,2,High School,Sales,3,69734,5383
568 | Marketing Associate,Female,52,3,High School,Sales,3,94763,5668
569 | Financial Analyst,Female,23,2,College,Administration,5,101534,6861
570 | Graphic Designer,Male,23,1,PhD,Management,5,91504,5136
571 | Sales Associate,Female,19,1,College,Management,4,90426,5020
572 | Manager,Male,20,2,Masters,Operations,3,100183,6363
573 | IT,Female,62,4,College,Operations,2,99369,6746
574 | Graphic Designer,Female,40,3,College,Operations,1,55845,6616
575 | Financial Analyst,Female,26,3,College,Administration,2,48570,6604
576 | Data Scientist,Male,56,3,Masters,Engineering,3,98165,5470
577 | Data Scientist,Female,57,3,Masters,Operations,3,113274,6186
578 | Software Engineer,Male,23,2,Masters,Engineering,3,95754,6596
579 | Data Scientist,Male,27,2,College,Engineering,1,57915,4584
580 | Manager,Male,53,4,College,Sales,2,121506,6122
581 | Manager,Male,37,3,Masters,Management,1,121151,5934
582 | Marketing Associate,Female,46,3,College,Management,3,86241,5814
583 | Data Scientist,Male,40,2,Masters,Management,2,87682,4152
584 | Software Engineer,Male,42,2,College,Administration,3,114029,6842
585 | IT,Male,43,3,College,Management,1,67323,4680
586 | Data Scientist,Male,28,3,High School,Management,2,68049,6660
587 | IT,Male,39,3,High School,Engineering,4,93335,5243
588 | Driver,Female,55,3,PhD,Operations,2,100433,5090
589 | Marketing Associate,Female,62,3,PhD,Operations,5,123242,5453
590 | Sales Associate,Female,32,2,PhD,Administration,2,87585,6476
591 | IT,Female,18,2,Masters,Sales,1,45747,4890
592 | Marketing Associate,Female,41,3,High School,Administration,2,36972,7058
593 | Sales Associate,Female,29,3,High School,Management,3,64754,7377
594 | IT,Female,44,2,PhD,Operations,5,132267,5679
595 | Driver,Male,53,3,PhD,Administration,3,111043,5838
596 | Warehouse Associate,Male,25,2,Masters,Engineering,1,58130,6327
597 | IT,Male,27,3,PhD,Operations,3,90542,7554
598 | Manager,Male,40,2,College,Administration,2,124891,5436
599 | Marketing Associate,Female,54,3,PhD,Engineering,5,112507,7460
600 | Marketing Associate,Female,64,4,College,Sales,5,100316,6010
601 | Software Engineer,Male,63,3,PhD,Engineering,2,134757,5944
602 | Manager,Male,42,3,College,Sales,4,125339,6019
603 | IT,Female,62,4,College,Management,3,111120,6536
604 | Data Scientist,Female,36,3,High School,Operations,3,76523,7122
605 | Driver,Female,50,3,College,Operations,2,91160,5863
606 | Manager,Male,28,3,College,Sales,2,105795,6179
607 | Financial Analyst,Female,18,2,High School,Operations,2,66935,6508
608 | Sales Associate,Male,19,1,PhD,Management,3,75238,5308
609 | Data Scientist,Female,55,2,Masters,Sales,5,138365,5359
610 | Software Engineer,Male,42,2,College,Management,1,103978,5146
611 | IT,Male,65,4,Masters,Sales,5,165229,6506
612 | Warehouse Associate,Male,54,2,Masters,Engineering,4,118322,5224
613 | Graphic Designer,Male,39,2,Masters,Operations,5,106883,5831
614 | Marketing Associate,Female,45,2,High School,Management,2,59006,5876
615 | Manager,Male,30,1,PhD,Engineering,4,125164,5888
616 | Software Engineer,Male,53,4,High School,Operations,1,89225,5751
617 | Driver,Male,63,4,High School,Sales,2,101716,5881
618 | Marketing Associate,Female,45,3,Masters,Sales,3,74305,7126
619 | Marketing Associate,Female,30,3,Masters,Administration,1,38451,5562
620 | Graphic Designer,Female,48,1,High School,Operations,5,113465,3846
621 | Sales Associate,Female,27,2,College,Operations,2,66377,5803
622 | Financial Analyst,Male,31,3,PhD,Engineering,1,79142,5445
623 | Graphic Designer,Female,20,1,Masters,Operations,4,55569,5236
624 | Graphic Designer,Male,60,4,Masters,Engineering,1,97240,5412
625 | Manager,Female,25,1,Masters,Management,5,119033,6381
626 | Marketing Associate,Female,65,3,Masters,Administration,5,106315,5618
627 | Driver,Male,24,2,College,Engineering,3,81301,5713
628 | Software Engineer,Male,23,1,PhD,Management,2,79462,4673
629 | IT,Female,28,1,PhD,Sales,5,105598,5380
630 | Data Scientist,Male,30,2,College,Operations,3,72497,6452
631 | Financial Analyst,Female,23,1,High School,Management,4,78084,5598
632 | Graphic Designer,Male,50,3,Masters,Management,1,84803,5406
633 | Graphic Designer,Female,18,2,College,Administration,1,41603,6092
634 | Driver,Female,47,2,Masters,Operations,2,78002,5175
635 | Warehouse Associate,Male,56,3,Masters,Administration,2,79274,4618
636 | IT,Male,20,2,PhD,Management,1,49622,5452
637 | Sales Associate,Male,62,3,College,Management,3,129620,5608
638 | Software Engineer,Male,36,2,High School,Administration,4,87848,5166
639 | Driver,Male,24,2,College,Management,3,51906,6274
640 | Sales Associate,Female,44,2,High School,Engineering,3,81866,5183
641 | Financial Analyst,Male,24,2,PhD,Engineering,3,71590,5845
642 | Warehouse Associate,Male,19,2,High School,Sales,2,63928,4998
643 | Driver,Male,44,4,High School,Administration,4,83333,7113
644 | Marketing Associate,Male,49,3,High School,Administration,4,92797,5405
645 | Financial Analyst,Male,36,3,PhD,Operations,1,93067,5103
646 | Graphic Designer,Male,22,2,High School,Sales,3,72422,5697
647 | Driver,Male,59,3,College,Operations,3,120159,4265
648 | Data Scientist,Female,39,3,Masters,Operations,3,91711,6775
649 | Marketing Associate,Female,50,3,Masters,Sales,5,93805,6044
650 | Marketing Associate,Female,58,3,Masters,Operations,5,109931,5878
651 | Marketing Associate,Female,42,2,College,Administration,5,88365,4668
652 | Financial Analyst,Male,58,4,Masters,Operations,3,104744,6425
653 | Warehouse Associate,Female,44,3,High School,Administration,1,79334,6505
654 | Driver,Male,41,2,Masters,Administration,4,95795,4898
655 | Financial Analyst,Female,58,3,PhD,Engineering,1,98512,4548
656 | Data Scientist,Female,33,2,High School,Operations,4,78910,5927
657 | Manager,Male,22,1,College,Management,2,103156,6277
658 | Driver,Male,35,2,High School,Administration,5,92760,4826
659 | Warehouse Associate,Female,42,1,PhD,Operations,5,125574,5688
660 | Data Scientist,Male,62,4,College,Operations,3,93742,6580
661 | Sales Associate,Male,50,2,Masters,Engineering,5,124470,5454
662 | Software Engineer,Male,51,2,Masters,Engineering,3,132323,4872
663 | Sales Associate,Male,22,1,High School,Sales,4,70595,5234
664 | Software Engineer,Male,32,2,College,Operations,1,71278,4701
665 | Data Scientist,Male,33,2,High School,Engineering,2,53855,4980
666 | IT,Male,35,1,College,Management,4,98423,3879
667 | Software Engineer,Male,47,2,Masters,Operations,4,126239,5396
668 | Sales Associate,Male,53,2,Masters,Management,5,101133,4700
669 | Warehouse Associate,Male,23,1,High School,Administration,4,92516,5866
670 | Data Scientist,Male,49,3,College,Administration,3,93320,7236
671 | Warehouse Associate,Female,45,1,High School,Administration,5,106963,5084
672 | IT,Female,26,1,PhD,Sales,5,111770,5345
673 | IT,Male,33,2,Masters,Engineering,5,106584,5426
674 | Marketing Associate,Female,26,2,High School,Operations,3,57672,6476
675 | Warehouse Associate,Male,50,3,Masters,Administration,1,90612,5913
676 | Graphic Designer,Female,35,2,High School,Engineering,5,97768,7233
677 | Financial Analyst,Male,31,2,PhD,Administration,3,82790,5826
678 | Graphic Designer,Female,50,3,Masters,Administration,1,76103,5173
679 | Financial Analyst,Male,38,3,Masters,Administration,1,84628,5680
680 | Warehouse Associate,Female,59,2,College,Management,2,93441,5499
681 | Marketing Associate,Male,42,2,Masters,Management,5,110636,5239
682 | Data Scientist,Female,55,3,PhD,Administration,1,108366,5108
683 | Manager,Male,38,2,Masters,Management,3,135106,5066
684 | Warehouse Associate,Female,55,2,PhD,Administration,3,85453,5135
685 | Graphic Designer,Male,54,3,High School,Management,3,100160,5562
686 | Data Scientist,Female,48,2,College,Administration,3,79765,5329
687 | Manager,Male,48,2,PhD,Engineering,4,157169,5798
688 | Financial Analyst,Female,63,3,PhD,Engineering,3,129683,5754
689 | Financial Analyst,Female,20,1,College,Management,2,76908,4756
690 | Financial Analyst,Female,53,3,High School,Management,1,93696,5029
691 | Driver,Male,40,3,High School,Sales,2,88230,5512
692 | Sales Associate,Female,46,2,Masters,Management,2,83095,5284
693 | IT,Female,26,2,College,Sales,3,81325,5647
694 | Warehouse Associate,Female,62,2,PhD,Engineering,2,102106,5195
695 | Graphic Designer,Female,43,1,PhD,Engineering,1,80635,4322
696 | Marketing Associate,Female,32,2,College,Sales,3,73404,5038
697 | Data Scientist,Male,44,3,High School,Sales,1,83674,4854
698 | IT,Male,62,3,Masters,Engineering,4,118314,4892
699 | Graphic Designer,Female,57,2,Masters,Sales,5,133741,4925
700 | Software Engineer,Male,50,3,PhD,Sales,1,119522,4650
701 | Sales Associate,Male,40,2,High School,Administration,3,84592,4635
702 | Sales Associate,Male,64,3,PhD,Sales,2,121589,5133
703 | Manager,Female,28,1,High School,Engineering,4,104329,4962
704 | Warehouse Associate,Male,24,1,College,Engineering,3,71371,5220
705 | Data Scientist,Male,59,4,Masters,Sales,1,96582,6350
706 | Data Scientist,Male,62,3,PhD,Sales,3,108778,4357
707 | Driver,Male,63,3,PhD,Management,4,128520,6151
708 | Sales Associate,Female,34,1,High School,Management,5,102745,6303
709 | Manager,Female,62,4,Masters,Administration,3,146008,6235
710 | Marketing Associate,Female,44,2,PhD,Management,2,68364,4076
711 | Driver,Male,31,2,High School,Administration,4,62753,5735
712 | Data Scientist,Female,27,1,High School,Administration,5,96584,5293
713 | Software Engineer,Male,32,1,College,Engineering,5,128680,4677
714 | Manager,Male,55,2,College,Management,5,152710,6107
715 | Software Engineer,Male,36,1,High School,Operations,5,109136,4572
716 | Data Scientist,Female,53,2,PhD,Operations,2,102265,4760
717 | Marketing Associate,Female,33,1,High School,Management,5,81249,4845
718 | IT,Female,56,3,Masters,Management,2,101791,6083
719 | Marketing Associate,Female,62,4,College,Sales,1,77584,4524
720 | Driver,Female,36,1,Masters,Administration,3,88175,5004
721 | Data Scientist,Female,21,1,PhD,Engineering,2,65816,6817
722 | IT,Male,33,2,Masters,Sales,1,84878,5009
723 | Marketing Associate,Female,28,1,Masters,Engineering,5,73685,4439
724 | IT,Female,27,1,Masters,Administration,2,55560,4449
725 | Software Engineer,Male,36,2,Masters,Administration,2,94745,5930
726 | Data Scientist,Female,43,2,PhD,Sales,3,106452,5346
727 | Sales Associate,Male,46,2,College,Management,2,95505,4549
728 | Software Engineer,Male,25,1,High School,Engineering,4,103181,4566
729 | Software Engineer,Male,40,2,Masters,Sales,1,120096,4690
730 | Warehouse Associate,Female,47,2,College,Administration,2,87575,4422
731 | Sales Associate,Male,51,3,High School,Management,2,98499,6274
732 | Driver,Female,60,2,PhD,Sales,3,113140,4079
733 | Manager,Male,56,2,PhD,Administration,4,152081,4621
734 | Data Scientist,Female,40,1,Masters,Management,5,100505,4626
735 | IT,Female,60,3,Masters,Management,2,104589,4536
736 | Manager,Male,36,2,Masters,Sales,3,134450,5214
737 | IT,Female,48,3,High School,Management,1,61081,6337
738 | Manager,Male,37,2,PhD,Operations,1,100819,5390
739 | Software Engineer,Male,64,2,PhD,Sales,5,157277,4844
740 | Marketing Associate,Female,51,2,High School,Management,2,79237,4453
741 | Driver,Male,20,2,High School,Sales,2,68175,5698
742 | Driver,Female,61,2,College,Management,4,108475,4489
743 | Software Engineer,Male,65,3,High School,Sales,2,122624,4165
744 | Graphic Designer,Female,34,1,College,Management,4,99942,5352
745 | Data Scientist,Female,58,2,PhD,Engineering,2,108296,4708
746 | Graphic Designer,Male,45,2,Masters,Management,2,101273,3817
747 | Sales Associate,Male,18,1,Masters,Administration,3,59334,5729
748 | Manager,Male,43,1,PhD,Management,4,152625,4702
749 | IT,Female,62,3,College,Sales,4,126370,6032
750 | Financial Analyst,Female,45,2,Masters,Administration,5,122142,5435
751 | Driver,Female,19,1,High School,Operations,4,75919,5613
752 | Data Scientist,Male,26,1,Masters,Operations,3,90526,4191
753 | IT,Female,47,2,PhD,Administration,3,103416,5536
754 | Manager,Male,59,3,College,Engineering,3,133910,5795
755 | Driver,Female,63,3,High School,Sales,1,104468,4466
756 | Sales Associate,Female,57,2,High School,Management,4,117202,5068
757 | Data Scientist,Female,51,1,PhD,Sales,5,139141,4969
758 | Warehouse Associate,Female,58,1,Masters,Management,4,134109,4184
759 | Sales Associate,Female,24,1,High School,Administration,2,66887,5014
760 | Data Scientist,Male,41,1,College,Operations,5,77839,4458
761 | Driver,Male,21,1,High School,Sales,2,61018,5513
762 | IT,Female,58,2,High School,Sales,4,124847,3100
763 | Financial Analyst,Female,33,2,PhD,Administration,1,50758,5386
764 | Graphic Designer,Female,34,1,PhD,Operations,3,75316,5861
765 | Driver,Male,54,3,PhD,Sales,2,118093,4115
766 | Software Engineer,Male,31,1,College,Operations,4,106503,5225
767 | Graphic Designer,Female,55,2,College,Administration,3,111502,4413
768 | Sales Associate,Female,47,1,PhD,Engineering,4,116642,4621
769 | Financial Analyst,Male,55,3,College,Operations,1,84683,4476
770 | IT,Female,45,2,PhD,Sales,4,113283,5404
771 | Marketing Associate,Female,48,2,High School,Administration,4,86538,4212
772 | Marketing Associate,Female,62,3,Masters,Engineering,2,77742,5430
773 | Software Engineer,Male,22,1,College,Sales,1,67913,4045
774 | Financial Analyst,Female,54,2,College,Sales,2,114171,3768
775 | Manager,Male,47,1,Masters,Engineering,4,155826,4964
776 | Graphic Designer,Female,63,2,College,Operations,4,106204,4534
777 | Graphic Designer,Female,38,1,High School,Administration,4,111041,3727
778 | Software Engineer,Male,62,2,PhD,Engineering,3,138184,4664
779 | Financial Analyst,Female,63,3,College,Sales,2,95795,5723
780 | Sales Associate,Male,60,3,High School,Administration,3,106798,5201
781 | Marketing Associate,Female,65,3,High School,Engineering,3,98575,5410
782 | Warehouse Associate,Male,37,2,Masters,Administration,1,82343,5232
783 | Software Engineer,Female,57,1,College,Engineering,3,117131,2324
784 | Sales Associate,Male,26,2,High School,Sales,2,78625,4073
785 | Financial Analyst,Male,27,2,PhD,Administration,4,99134,6255
786 | Marketing Associate,Female,20,2,College,Administration,2,38613,6079
787 | Marketing Associate,Female,26,2,College,Operations,1,62866,5114
788 | Graphic Designer,Male,44,1,Masters,Engineering,5,107818,5281
789 | Graphic Designer,Female,47,1,High School,Operations,5,108731,4793
790 | Manager,Male,21,1,College,Management,2,114621,3927
791 | Marketing Associate,Female,23,1,High School,Sales,2,47739,4829
792 | Data Scientist,Male,45,1,Masters,Operations,3,104290,3474
793 | Financial Analyst,Male,65,4,Masters,Sales,1,111403,5078
794 | IT,Male,52,2,College,Administration,5,93802,5404
795 | Manager,Male,30,1,High School,Sales,5,138163,3415
796 | Financial Analyst,Male,50,2,Masters,Operations,3,115613,3881
797 | Sales Associate,Female,18,1,Masters,Engineering,1,64920,4967
798 | Software Engineer,Female,36,1,PhD,Management,1,79177,3661
799 | Warehouse Associate,Male,25,1,College,Operations,1,64994,4310
800 | Driver,Male,53,3,PhD,Sales,2,100941,5992
801 | Manager,Male,55,1,High School,Management,5,163208,4095
802 | Sales Associate,Male,24,1,PhD,Operations,2,51837,4472
803 | Graphic Designer,Male,56,2,Masters,Management,4,114048,5097
804 | Manager,Female,59,2,High School,Management,2,136215,3603
805 | Marketing Associate,Female,55,3,High School,Administration,1,63327,4066
806 | Software Engineer,Male,64,2,PhD,Engineering,4,139068,4126
807 | Sales Associate,Male,63,2,Masters,Engineering,3,136209,3910
808 | Software Engineer,Male,55,2,PhD,Management,3,121910,4494
809 | Graphic Designer,Male,43,2,PhD,Management,2,82057,5268
810 | Sales Associate,Male,51,2,High School,Engineering,1,96023,4148
811 | Marketing Associate,Female,31,1,High School,Sales,3,52578,5538
812 | Driver,Female,61,2,College,Administration,4,114436,4249
813 | Marketing Associate,Female,56,2,Masters,Sales,4,99543,4442
814 | Sales Associate,Female,41,2,College,Sales,1,60983,5022
815 | Software Engineer,Male,49,3,Masters,Administration,1,82277,5460
816 | Marketing Associate,Female,23,1,College,Sales,2,55885,3795
817 | Graphic Designer,Male,29,1,Masters,Operations,3,73142,4712
818 | Graphic Designer,Male,53,2,High School,Operations,4,107065,3652
819 | Driver,Female,61,2,PhD,Engineering,2,107064,4202
820 | Marketing Associate,Female,56,2,High School,Operations,5,107370,4916
821 | Marketing Associate,Female,25,1,High School,Sales,1,39741,4981
822 | Sales Associate,Male,64,2,High School,Management,3,120579,4687
823 | Driver,Female,55,2,High School,Engineering,3,103242,4495
824 | Driver,Female,42,1,Masters,Sales,5,120112,5070
825 | Sales Associate,Female,61,2,High School,Sales,2,83308,3653
826 | Marketing Associate,Female,26,1,High School,Management,1,36585,4373
827 | Graphic Designer,Female,59,2,College,Engineering,4,115344,4827
828 | Sales Associate,Male,36,2,High School,Management,1,49553,5784
829 | IT,Female,54,2,Masters,Administration,3,96982,4750
830 | Financial Analyst,Female,24,1,High School,Management,4,80030,5481
831 | Driver,Male,58,2,College,Sales,4,110157,4275
832 | Financial Analyst,Female,32,1,High School,Administration,4,84047,4683
833 | Marketing Associate,Male,47,2,PhD,Sales,3,93054,4243
834 | Data Scientist,Female,53,2,Masters,Engineering,2,84764,4409
835 | Warehouse Associate,Female,61,1,Masters,Management,3,120277,4617
836 | Marketing Associate,Female,56,2,High School,Sales,3,81661,4965
837 | Manager,Female,45,1,High School,Engineering,5,149893,5120
838 | Software Engineer,Male,45,2,High School,Management,1,81515,5001
839 | Driver,Male,55,2,Masters,Operations,3,95898,4309
840 | Warehouse Associate,Male,42,2,PhD,Operations,3,90000,4774
841 | Manager,Male,60,3,High School,Management,2,111896,5597
842 | Driver,Male,53,1,Masters,Engineering,5,130082,4734
843 | Sales Associate,Female,60,2,PhD,Engineering,5,130417,4694
844 | Marketing Associate,Female,34,1,College,Engineering,2,38855,4503
845 | Software Engineer,Male,25,1,High School,Administration,4,81057,5259
846 | Data Scientist,Male,41,1,High School,Management,5,117215,4322
847 | Graphic Designer,Female,49,1,High School,Administration,4,96232,3697
848 | Financial Analyst,Male,64,1,PhD,Operations,5,141186,3415
849 | Manager,Male,52,2,High School,Operations,2,114943,3438
850 | Data Scientist,Male,43,1,PhD,Administration,4,107640,3896
851 | Manager,Male,31,2,College,Operations,1,80259,4791
852 | Marketing Associate,Female,48,1,College,Operations,5,86563,4415
853 | Data Scientist,Male,20,1,High School,Operations,2,61589,3990
854 | Warehouse Associate,Male,35,1,High School,Administration,2,74829,4263
855 | Graphic Designer,Female,44,1,Masters,Administration,2,72248,3634
856 | Sales Associate,Male,49,2,High School,Operations,3,91342,4516
857 | Sales Associate,Male,49,2,Masters,Operations,2,101140,4851
858 | Marketing Associate,Male,51,2,PhD,Sales,1,62600,4300
859 | Software Engineer,Male,21,1,High School,Engineering,2,65313,4685
860 | Manager,Male,41,1,College,Administration,5,123153,4111
861 | IT,Male,27,1,High School,Management,3,57500,3665
862 | Manager,Male,37,1,PhD,Administration,3,115604,4098
863 | Software Engineer,Male,18,1,High School,Sales,1,55555,4390
864 | Data Scientist,Male,58,1,PhD,Engineering,5,124166,3875
865 | IT,Male,50,2,Masters,Management,3,101050,3879
866 | Data Scientist,Female,56,1,College,Engineering,5,125828,3961
867 | Graphic Designer,Female,45,1,High School,Management,3,75017,5576
868 | Marketing Associate,Female,40,1,Masters,Engineering,4,100415,3871
869 | Manager,Female,57,1,Masters,Operations,3,145095,3889
870 | Data Scientist,Female,63,2,High School,Administration,5,115641,4339
871 | Software Engineer,Male,59,1,College,Sales,5,160460,2930
872 | Financial Analyst,Male,45,2,High School,Administration,2,85947,3909
873 | Financial Analyst,Female,48,1,College,Engineering,5,119058,4037
874 | Warehouse Associate,Female,56,1,College,Administration,4,106722,4044
875 | Financial Analyst,Female,41,2,PhD,Operations,3,91129,4825
876 | Graphic Designer,Male,59,1,PhD,Sales,4,135512,3955
877 | Financial Analyst,Female,49,2,College,Operations,1,92891,3578
878 | Financial Analyst,Male,64,3,High School,Administration,3,124151,3392
879 | Data Scientist,Female,58,1,High School,Sales,5,141005,3384
880 | Manager,Male,36,1,High School,Operations,2,115742,3661
881 | IT,Male,31,1,Masters,Sales,2,93358,3223
882 | IT,Female,44,1,High School,Sales,5,131411,3084
883 | Sales Associate,Female,61,2,College,Sales,1,96900,4157
884 | IT,Male,59,3,Masters,Sales,2,98995,4596
885 | IT,Male,57,3,High School,Sales,3,109645,5472
886 | Financial Analyst,Male,39,2,College,Sales,1,61609,3738
887 | Marketing Associate,Female,56,2,High School,Engineering,3,88820,4289
888 | Data Scientist,Male,61,1,Masters,Administration,5,122352,3590
889 | Marketing Associate,Female,53,1,Masters,Management,4,112489,4237
890 | Software Engineer,Male,29,1,High School,Sales,1,86650,3518
891 | Data Scientist,Male,62,3,High School,Operations,2,101284,4993
892 | Manager,Male,41,1,Masters,Sales,2,119794,3712
893 | Graphic Designer,Male,58,2,Masters,Operations,1,96609,3044
894 | Marketing Associate,Female,38,1,PhD,Engineering,1,53839,3270
895 | Financial Analyst,Male,29,1,Masters,Operations,2,84725,3274
896 | IT,Female,57,2,High School,Administration,1,64807,3887
897 | IT,Female,28,1,High School,Management,1,41151,4296
898 | Financial Analyst,Female,48,1,College,Administration,4,114031,4510
899 | Driver,Male,33,1,High School,Management,1,59493,3955
900 | Manager,Female,42,1,High School,Management,2,119893,3721
901 | Manager,Male,60,2,College,Sales,1,131980,3841
902 | Sales Associate,Female,52,1,High School,Sales,3,94154,3964
903 | Marketing Associate,Female,60,1,College,Management,4,114906,4766
904 | Driver,Female,63,1,Masters,Management,4,115981,2661
905 | Data Scientist,Male,37,1,PhD,Administration,2,78836,3588
906 | Driver,Male,64,2,Masters,Operations,2,112719,3382
907 | Marketing Associate,Female,59,1,Masters,Sales,3,93018,2730
908 | Marketing Associate,Female,55,2,College,Sales,2,69928,4811
909 | Software Engineer,Female,57,2,High School,Engineering,1,104623,3791
910 | Driver,Male,55,2,College,Administration,2,103328,4543
911 | Financial Analyst,Male,47,2,High School,Management,1,68643,4674
912 | Graphic Designer,Female,33,1,High School,Sales,1,63914,3521
913 | Sales Associate,Male,45,2,High School,Engineering,1,67296,4780
914 | Graphic Designer,Female,44,1,College,Sales,3,90474,3530
915 | Driver,Female,22,1,High School,Administration,1,60026,4605
916 | Marketing Associate,Female,41,1,PhD,Management,5,108587,4275
917 | Financial Analyst,Female,61,1,Masters,Engineering,4,139042,2982
918 | Driver,Male,65,1,High School,Operations,5,116803,3431
919 | Sales Associate,Female,57,1,Masters,Operations,3,101423,3641
920 | Software Engineer,Male,36,1,Masters,Operations,1,86375,3368
921 | IT,Male,31,1,College,Operations,1,49449,4149
922 | Marketing Associate,Female,56,2,High School,Sales,3,95995,4045
923 | Software Engineer,Male,51,1,College,Management,2,118282,3599
924 | IT,Female,47,1,PhD,Operations,3,99472,3943
925 | Manager,Female,55,1,High School,Administration,3,135638,2824
926 | IT,Male,50,1,College,Operations,3,85648,2728
927 | Data Scientist,Female,58,1,PhD,Management,4,121457,4723
928 | Data Scientist,Female,57,1,PhD,Operations,2,114141,3746
929 | Manager,Male,58,1,PhD,Management,4,179726,4284
930 | Marketing Associate,Female,64,1,College,Administration,5,107572,3107
931 | Sales Associate,Female,65,2,PhD,Sales,5,142505,4214
932 | Financial Analyst,Male,43,1,PhD,Sales,4,108719,4143
933 | IT,Male,36,1,High School,Sales,4,109832,3528
934 | Software Engineer,Male,60,1,PhD,Administration,4,127250,4263
935 | Sales Associate,Female,50,1,Masters,Sales,3,106279,3892
936 | Software Engineer,Male,37,1,High School,Sales,1,81547,3510
937 | Data Scientist,Male,56,1,Masters,Administration,2,95798,2530
938 | Manager,Female,47,1,High School,Management,3,106916,3959
939 | Manager,Male,59,2,PhD,Administration,1,124782,4618
940 | Warehouse Associate,Male,53,1,College,Operations,4,126256,3463
941 | IT,Female,59,1,PhD,Engineering,3,92257,3908
942 | Software Engineer,Male,62,1,High School,Sales,3,124660,2200
943 | IT,Female,32,2,High School,Engineering,1,59377,5236
944 | Marketing Associate,Female,38,1,PhD,Engineering,2,76726,4204
945 | IT,Female,54,1,High School,Engineering,5,112502,3988
946 | Data Scientist,Male,65,2,Masters,Sales,2,128730,3247
947 | Manager,Male,61,2,College,Sales,1,136836,4070
948 | IT,Male,51,1,High School,Engineering,3,89272,2600
949 | Financial Analyst,Female,62,1,High School,Operations,3,101515,2251
950 | Software Engineer,Male,36,1,Masters,Operations,1,93855,4099
951 | Graphic Designer,Male,30,1,College,Operations,2,36642,4388
952 | Sales Associate,Male,60,1,College,Operations,3,112604,3366
953 | Financial Analyst,Female,49,1,PhD,Engineering,1,67089,3078
954 | Driver,Female,51,1,PhD,Management,2,97789,3536
955 | Data Scientist,Male,62,1,PhD,Administration,3,123333,3353
956 | IT,Male,62,1,PhD,Management,2,105771,2978
957 | Marketing Associate,Female,46,1,College,Administration,1,54945,2991
958 | Data Scientist,Female,65,1,PhD,Operations,2,121239,3078
959 | Data Scientist,Female,64,2,High School,Sales,1,87314,3688
960 | Driver,Female,65,1,PhD,Operations,3,124530,3647
961 | Driver,Female,45,1,College,Sales,2,67150,3255
962 | Financial Analyst,Male,39,1,College,Engineering,4,99643,3789
963 | Sales Associate,Male,54,1,College,Administration,3,92468,3242
964 | Financial Analyst,Female,47,1,College,Operations,2,63300,3655
965 | IT,Male,55,2,Masters,Operations,1,91447,3196
966 | Marketing Associate,Female,58,2,High School,Operations,1,60271,3927
967 | Warehouse Associate,Male,34,1,Masters,Operations,1,45915,4765
968 | Marketing Associate,Female,53,1,Masters,Operations,1,66028,3745
969 | Sales Associate,Male,57,1,PhD,Administration,5,113031,3409
970 | Software Engineer,Male,62,1,High School,Sales,3,132815,3367
971 | Manager,Male,52,1,High School,Management,1,110078,1703
972 | Data Scientist,Male,57,1,Masters,Sales,3,106918,3180
973 | Software Engineer,Male,60,1,High School,Administration,3,106045,3079
974 | Warehouse Associate,Male,65,1,High School,Operations,2,107479,2051
975 | Software Engineer,Male,64,1,Masters,Sales,1,109606,2193
976 | Marketing Associate,Female,44,1,High School,Engineering,3,91300,3680
977 | Sales Associate,Male,56,1,High School,Management,1,88195,2375
978 | Financial Analyst,Female,54,1,Masters,Operations,2,90999,2963
979 | Sales Associate,Female,58,1,College,Operations,3,86795,3453
980 | IT,Female,59,1,College,Sales,3,110627,2041
981 | Marketing Associate,Female,52,1,PhD,Sales,3,97696,3888
982 | IT,Male,64,1,Masters,Sales,1,92950,2125
983 | Sales Associate,Male,56,1,High School,Operations,1,82164,2611
984 | Software Engineer,Male,62,1,High School,Management,3,119069,2788
985 | Marketing Associate,Female,59,1,College,Engineering,1,62738,2656
986 | IT,Female,53,1,High School,Administration,2,112169,2763
987 | Manager,Male,55,1,High School,Administration,1,120574,2683
988 | Data Scientist,Male,54,1,College,Sales,2,97311,3083
989 | Marketing Associate,Female,59,1,High School,Operations,3,98796,3042
990 | Sales Associate,Male,62,1,Masters,Administration,2,102593,1823
991 | Financial Analyst,Female,65,1,High School,Administration,2,96665,2645
992 | Graphic Designer,Female,61,1,Masters,Engineering,1,91030,3318
993 | IT,Female,65,1,Masters,Administration,1,106945,2041
994 | Graphic Designer,Female,63,1,College,Administration,2,81545,3418
995 | Marketing Associate,Female,65,1,Masters,Administration,1,80789,1884
996 | Marketing Associate,Female,64,1,PhD,Administration,2,85253,2777
997 | Marketing Associate,Female,61,1,High School,Administration,1,62644,3270
998 | Data Scientist,Male,57,1,Masters,Sales,2,108977,3567
999 | Financial Analyst,Male,48,1,High School,Operations,1,92347,2724
1000 | Financial Analyst,Male,65,2,High School,Administration,1,97376,2225
1001 | Financial Analyst,Male,60,1,PhD,Sales,2,123108,2244
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Machine Learning
2 |
3 | 
4 |
5 |
6 | Machine learning is a branch of artificial intelligence that lets machines learn without being explicitly programmed to do so. This ability is essential for building systems that can identify patterns in data and use them to make predictions. The technology is present in countless applications, such as Netflix and Spotify recommendations, Gmail's smart replies, and the speech of Siri and Alexa.
7 | In short, machine learning excels at pattern recognition: it can turn a sample of data into a computer program that draws inferences from new datasets it was not previously trained on. This learning capability is used to improve search engines, robotics, medical diagnosis, and even the detection of credit card fraud.
8 |
9 | ## What will you find in this folder?
10 |
11 | - [Theoretical documents](https://github.com/bcamandone/Machine-Learning/tree/main/Documentos%20te%C3%B3ricos)
12 | - [Data Science final project - Coderhouse](https://github.com/bcamandone/Machine-Learning/tree/main/Proyecto%20final%20Data%20Science%20-%20Coderhouse)
13 | - [Iris dataset project](https://github.com/bcamandone/Machine-Learning/tree/main/Proyecto-Dataset-Iris)
14 | - [Calories project](https://github.com/bcamandone/Machine-Learning/tree/main/Proyecto_calorias)
15 | - [Salaries project](https://github.com/bcamandone/Machine-Learning/tree/main/Proyecto_salarios)
16 | - [KMeans - Example with the Penguins dataset](https://github.com/bcamandone/Machine-Learning/blob/main/Ejemplo%20con%20KMeans%20-%20dataset%20Penguins.ipynb)
17 | - [Feature selection](https://github.com/bcamandone/Machine-Learning/blob/main/Feature%20selection.-.ipynb)
18 | - [Function to apply several algorithms](https://github.com/bcamandone/Machine-Learning/blob/main/Funcion_Aplicar_varios_algoritmos.ipynb)
19 | - [Principal Component Analysis (PCA)](https://github.com/bcamandone/Machine-Learning/blob/main/Principal%20Component%20Analysis-.ipynb)
20 | - [SMOTE](https://github.com/bcamandone/Machine-Learning/blob/main/SMOTE.ipynb)
21 | - [Dummy variable trap](https://github.com/bcamandone/Machine-Learning/blob/main/Trampa%20dummy-.ipynb)
22 | - [Scikit-plot](https://github.com/bcamandone/Machine-Learning/blob/main/Scikit-plot..ipynb)
23 | - [Lazypredict library](https://github.com/bcamandone/Machine-Learning/blob/main/_Librer%C3%ADa%20Lazypredict.ipynb)
24 |
--------------------------------------------------------------------------------
/Recommender System Final Project/ml-capstone-Camandone Belén.pptx.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bcamandone/Machine-Learning/a185c926938671814f5722df26f82e6238897902/Recommender System Final Project/ml-capstone-Camandone Belén.pptx.pdf
--------------------------------------------------------------------------------
/Recommender System Final Project/readme.md:
--------------------------------------------------------------------------------
1 | ## Capstone Project
2 | The final task of this capstone project is to create a presentation based on the outcomes of all tasks in the previous modules and labs.
3 | Your presentation should tell the story of your whole machine learning journey in this project, and it should be compelling and easy to understand.
4 |
5 | - Uploaded your completed presentation in PDF format
6 |
7 | - Completed the required Introduction slide
8 |
9 | - Completed the required Exploratory Data Analysis slides
10 |
11 | - Completed the required content-based recommender system using user profile and course genres slides
12 |
13 | - Completed the required content-based recommender system using course similarity slides
14 |
15 | - Completed the required content-based recommender system using user profile clustering slides
16 |
17 | - Completed the required KNN-based collaborative filtering slide
18 |
19 | - Completed the required NMF-based collaborative filtering slide
20 |
21 | - Completed the required neural network embedding based collaborative filtering slide
22 |
23 | - Completed the required collaborative filtering algorithms evaluation slides
24 |
25 | - Completed the required Conclusion slide
26 |
27 | - Applied your creativity to improve the presentation beyond the template
28 |
29 | - Displayed any innovative insights
30 |
--------------------------------------------------------------------------------
/SMOTE.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "id": "ab7ea8fe",
7 | "metadata": {},
8 | "outputs": [],
9 | "source": [
10 | "import pandas as pd"
11 | ]
12 | },
13 | {
14 | "cell_type": "code",
15 | "execution_count": 2,
16 | "id": "90dd833b",
17 | "metadata": {},
18 | "outputs": [],
19 | "source": [
20 | "data = pd.read_csv('creditcard.csv')"
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 3,
26 | "id": "3deb0b07",
27 | "metadata": {},
28 | "outputs": [
29 | {
30 | "data": {
31 | "text/plain": [
32 | "Index(['Time', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10',\n",
33 | " 'V11', 'V12', 'V13', 'V14', 'V15', 'V16', 'V17', 'V18', 'V19', 'V20',\n",
34 | " 'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'Amount',\n",
35 | " 'Class'],\n",
36 | " dtype='object')"
37 | ]
38 | },
39 | "execution_count": 3,
40 | "metadata": {},
41 | "output_type": "execute_result"
42 | }
43 | ],
44 | "source": [
45 | "data.columns"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": 4,
51 | "id": "e87e5d7d",
52 | "metadata": {},
53 | "outputs": [
54 | {
55 | "data": {
224 | "text/plain": [
225 | " Time V1 V2 V3 V4 V5 V6 V7 \\\n",
226 | "0 0.0 -1.359807 -0.072781 2.536347 1.378155 -0.338321 0.462388 0.239599 \n",
227 | "1 0.0 1.191857 0.266151 0.166480 0.448154 0.060018 -0.082361 -0.078803 \n",
228 | "2 1.0 -1.358354 -1.340163 1.773209 0.379780 -0.503198 1.800499 0.791461 \n",
229 | "3 1.0 -0.966272 -0.185226 1.792993 -0.863291 -0.010309 1.247203 0.237609 \n",
230 | "4 2.0 -1.158233 0.877737 1.548718 0.403034 -0.407193 0.095921 0.592941 \n",
231 | "\n",
232 | " V8 V9 ... V21 V22 V23 V24 V25 \\\n",
233 | "0 0.098698 0.363787 ... -0.018307 0.277838 -0.110474 0.066928 0.128539 \n",
234 | "1 0.085102 -0.255425 ... -0.225775 -0.638672 0.101288 -0.339846 0.167170 \n",
235 | "2 0.247676 -1.514654 ... 0.247998 0.771679 0.909412 -0.689281 -0.327642 \n",
236 | "3 0.377436 -1.387024 ... -0.108300 0.005274 -0.190321 -1.175575 0.647376 \n",
237 | "4 -0.270533 0.817739 ... -0.009431 0.798278 -0.137458 0.141267 -0.206010 \n",
238 | "\n",
239 | " V26 V27 V28 Amount Class \n",
240 | "0 -0.189115 0.133558 -0.021053 149.62 0 \n",
241 | "1 0.125895 -0.008983 0.014724 2.69 0 \n",
242 | "2 -0.139097 -0.055353 -0.059752 378.66 0 \n",
243 | "3 -0.221929 0.062723 0.061458 123.50 0 \n",
244 | "4 0.502292 0.219422 0.215153 69.99 0 \n",
245 | "\n",
246 | "[5 rows x 31 columns]"
247 | ]
248 | },
249 | "execution_count": 4,
250 | "metadata": {},
251 | "output_type": "execute_result"
252 | }
253 | ],
254 | "source": [
255 | "data.head()"
256 | ]
257 | },
258 | {
259 | "cell_type": "code",
260 | "execution_count": 5,
261 | "id": "8b841634",
262 | "metadata": {},
263 | "outputs": [
264 | {
265 | "data": {
266 | "text/plain": [
267 | "Time 0\n",
268 | "V1 0\n",
269 | "V2 0\n",
270 | "V3 0\n",
271 | "V4 0\n",
272 | "V5 0\n",
273 | "V6 0\n",
274 | "V7 0\n",
275 | "V8 0\n",
276 | "V9 0\n",
277 | "V10 0\n",
278 | "V11 0\n",
279 | "V12 0\n",
280 | "V13 0\n",
281 | "V14 0\n",
282 | "V15 0\n",
283 | "V16 0\n",
284 | "V17 0\n",
285 | "V18 0\n",
286 | "V19 0\n",
287 | "V20 0\n",
288 | "V21 0\n",
289 | "V22 0\n",
290 | "V23 0\n",
291 | "V24 0\n",
292 | "V25 0\n",
293 | "V26 0\n",
294 | "V27 0\n",
295 | "V28 0\n",
296 | "Amount 0\n",
297 | "Class 0\n",
298 | "dtype: int64"
299 | ]
300 | },
301 | "execution_count": 5,
302 | "metadata": {},
303 | "output_type": "execute_result"
304 | }
305 | ],
306 | "source": [
307 | "data.isnull().sum()"
308 | ]
309 | },
310 | {
311 | "cell_type": "code",
312 | "execution_count": 6,
313 | "id": "736ce555",
314 | "metadata": {},
315 | "outputs": [
316 | {
317 | "name": "stdout",
318 | "output_type": "stream",
319 | "text": [
320 | "Non-fraud 99.83 % of the dataset\n",
321 | "Fraud 0.17 % of the dataset\n"
322 | ]
323 | }
324 | ],
325 | "source": [
326 | "print('Non-fraud', round(data['Class'].value_counts()[0]/len(data) * 100,2), '% of the dataset')\n",
327 | "print('Fraud', round(data['Class'].value_counts()[1]/len(data) * 100,2), '% of the dataset')"
328 | ]
329 | },
330 | {
331 | "cell_type": "code",
332 | "execution_count": 7,
333 | "id": "3c795e21",
334 | "metadata": {},
335 | "outputs": [],
336 | "source": [
337 | "X = data.drop('Class', axis=1)\n",
338 | "y = data['Class']"
339 | ]
340 | },
341 | {
342 | "cell_type": "code",
343 | "execution_count": 8,
344 | "id": "661ba9e3",
345 | "metadata": {},
346 | "outputs": [
347 | {
348 | "data": {
349 | "text/plain": [
350 | "0 284315\n",
351 | "1 492\n",
352 | "Name: Class, dtype: int64"
353 | ]
354 | },
355 | "execution_count": 8,
356 | "metadata": {},
357 | "output_type": "execute_result"
358 | }
359 | ],
360 | "source": [
361 | "y.value_counts()"
362 | ]
363 | },
364 | {
365 | "cell_type": "code",
366 | "execution_count": 9,
367 | "id": "f97563b4",
368 | "metadata": {},
369 | "outputs": [],
370 | "source": [
371 | "from imblearn.over_sampling import SMOTE\n",
372 | "oversample = SMOTE()\n",
373 | "X_s, y_s = oversample.fit_resample(X, y)"
374 | ]
375 | },
376 | {
377 | "cell_type": "code",
378 | "execution_count": 10,
379 | "id": "9beac7a3",
380 | "metadata": {},
381 | "outputs": [
382 | {
383 | "data": {
384 | "text/plain": [
385 | "0 284315\n",
386 | "1 284315\n",
387 | "Name: Class, dtype: int64"
388 | ]
389 | },
390 | "execution_count": 10,
391 | "metadata": {},
392 | "output_type": "execute_result"
393 | }
394 | ],
395 | "source": [
396 | "y_s.value_counts()"
397 | ]
398 | }
399 | ],
400 | "metadata": {
401 | "kernelspec": {
402 | "display_name": "Python 3 (ipykernel)",
403 | "language": "python",
404 | "name": "python3"
405 | },
406 | "language_info": {
407 | "codemirror_mode": {
408 | "name": "ipython",
409 | "version": 3
410 | },
411 | "file_extension": ".py",
412 | "mimetype": "text/x-python",
413 | "name": "python",
414 | "nbconvert_exporter": "python",
415 | "pygments_lexer": "ipython3",
416 | "version": "3.11.4"
417 | }
418 | },
419 | "nbformat": 4,
420 | "nbformat_minor": 5
421 | }
422 |
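The `fit_resample` call in the notebook above takes the minority class from 492 to 284315 rows by creating synthetic samples. As a rough illustration of the idea behind SMOTE (interpolating between a minority point and one of its nearest minority neighbors), here is a minimal pure-Python sketch. It is not imblearn's implementation: the function name `smote_sketch`, the toy data, and the parameter values are assumptions made for this example only.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (squared Euclidean distance)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


# Toy data (hypothetical): 8 majority rows and 3 minority rows with two features
majority = [[float(i), float(i % 3)] for i in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

new_rows = smote_sketch(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))
```

Because every synthetic row lies on a segment between two real minority rows, it stays inside the region the minority class already occupies. When a model is trained afterwards, a common precaution is to resample only the training split, so that synthetic rows derived from test data do not leak into evaluation.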
--------------------------------------------------------------------------------
/Trampa dummy-.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 21,
6 | "id": "8f26ffd0",
7 | "metadata": {},
8 | "outputs": [],
9 | "source": [
10 | "import pandas as pd\n"
11 | ]
12 | },
13 | {
14 | "cell_type": "code",
15 | "execution_count": 22,
16 | "id": "84322f3b",
17 | "metadata": {},
18 | "outputs": [],
19 | "source": [
20 | "df = pd.read_csv('https://datasets-humai.s3.amazonaws.com/datasets/titanic_clase4.csv')"
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 23,
26 | "id": "705c6e45",
27 | "metadata": {},
28 | "outputs": [
29 | {
30 | "data": {
144 | "text/plain": [
145 | " PassengerId Survived Pclass \\\n",
146 | "0 1 0 3 \n",
147 | "1 2 1 1 \n",
148 | "2 3 1 3 \n",
149 | "3 4 1 1 \n",
150 | "4 5 0 3 \n",
151 | "\n",
152 | " Name Sex Age SibSp \\\n",
153 | "0 Braund, Mr. Owen Harris male 22.0 1 \n",
154 | "1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 \n",
155 | "2 Heikkinen, Miss. Laina female 26.0 0 \n",
156 | "3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 \n",
157 | "4 Allen, Mr. William Henry male 35.0 0 \n",
158 | "\n",
159 | " Parch Ticket Fare Cabin Embarked \n",
160 | "0 0 A/5 21171 7.2500 NaN S \n",
161 | "1 0 PC 17599 71.2833 C85 C \n",
162 | "2 0 STON/O2. 3101282 7.9250 NaN S \n",
163 | "3 0 113803 53.1000 C123 S \n",
164 | "4 0 373450 8.0500 NaN S "
165 | ]
166 | },
167 | "execution_count": 23,
168 | "metadata": {},
169 | "output_type": "execute_result"
170 | }
171 | ],
172 | "source": [
173 | "df.head()"
174 | ]
175 | },
176 | {
177 | "cell_type": "code",
178 | "execution_count": 24,
179 | "id": "a97dcd37",
180 | "metadata": {},
181 | "outputs": [],
182 | "source": [
183 | "columns = ['Survived', 'Pclass', 'Sex', 'Age', 'SibSp', 'Fare', 'Embarked']\n",
184 | "df = df[columns]"
185 | ]
186 | },
187 | {
188 | "cell_type": "code",
189 | "execution_count": 25,
190 | "id": "d75b9de6",
191 | "metadata": {},
192 | "outputs": [],
193 | "source": [
194 | "cat_vars = ['Sex', 'Pclass', 'Embarked']\n",
195 | "\n",
196 | "for var in cat_vars:\n",
197 | "    # drop_first=True drops one level per variable, which avoids the dummy variable trap\n",
198 | "    dummies = pd.get_dummies(df[var], prefix=var, drop_first=True)\n",
199 | "    df = df.join(dummies)"
201 | ]
202 | },
203 | {
204 | "cell_type": "code",
205 | "execution_count": 26,
206 | "id": "5fda435d",
207 | "metadata": {},
208 | "outputs": [
209 | {
210 | "data": {
294 | "text/plain": [
295 | " Survived Pclass Sex Age SibSp Fare Embarked Sex_male \\\n",
296 | "0 0 3 male 22.0 1 7.2500 S 1 \n",
297 | "1 1 1 female 38.0 1 71.2833 C 0 \n",
298 | "2 1 3 female 26.0 0 7.9250 S 0 \n",
299 | "\n",
300 | " Pclass_2 Pclass_3 Embarked_Q Embarked_S \n",
301 | "0 0 1 0 1 \n",
302 | "1 0 0 0 0 \n",
303 | "2 0 1 0 1 "
304 | ]
305 | },
306 | "execution_count": 26,
307 | "metadata": {},
308 | "output_type": "execute_result"
309 | }
310 | ],
311 | "source": [
312 | "df.head(3)"
313 | ]
314 | },
315 | {
316 | "cell_type": "code",
317 | "execution_count": 27,
318 | "id": "dd607f80",
319 | "metadata": {},
320 | "outputs": [],
321 | "source": [
322 | "data_vars = df.columns.values.tolist()\n",
323 | "to_keep = [i for i in data_vars if i not in cat_vars]\n",
324 | "df_final=df[to_keep]"
325 | ]
326 | },
327 | {
328 | "cell_type": "code",
329 | "execution_count": 28,
330 | "id": "61dd68b6",
331 | "metadata": {},
332 | "outputs": [
333 | {
334 | "data": {
430 | "text/plain": [
431 | " Survived Age SibSp Fare Sex_male Pclass_2 Pclass_3 Embarked_Q \\\n",
432 | "0 0 22.0 1 7.2500 1 0 1 0 \n",
433 | "1 1 38.0 1 71.2833 0 0 0 0 \n",
434 | "2 1 26.0 0 7.9250 0 0 1 0 \n",
435 | "3 1 35.0 1 53.1000 0 0 0 0 \n",
436 | "4 0 35.0 0 8.0500 1 0 1 0 \n",
437 | "\n",
438 | " Embarked_S \n",
439 | "0 1 \n",
440 | "1 0 \n",
441 | "2 1 \n",
442 | "3 1 \n",
443 | "4 1 "
444 | ]
445 | },
446 | "execution_count": 28,
447 | "metadata": {},
448 | "output_type": "execute_result"
449 | }
450 | ],
451 | "source": [
452 | "df_final.head()"
453 | ]
454 | }
455 | ],
456 | "metadata": {
457 | "kernelspec": {
458 | "display_name": "Python 3 (ipykernel)",
459 | "language": "python",
460 | "name": "python3"
461 | },
462 | "language_info": {
463 | "codemirror_mode": {
464 | "name": "ipython",
465 | "version": 3
466 | },
467 | "file_extension": ".py",
468 | "mimetype": "text/x-python",
469 | "name": "python",
470 | "nbconvert_exporter": "python",
471 | "pygments_lexer": "ipython3",
472 | "version": "3.9.13"
473 | }
474 | },
475 | "nbformat": 4,
476 | "nbformat_minor": 5
477 | }
478 |
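The notebook above relies on `drop_first=True` to avoid the dummy variable trap. To make the redundancy visible without pandas, the following minimal sketch builds the indicator columns by hand. The helper `one_hot` and the toy `embarked` list are assumptions for illustration; they only mimic the alphabetical category order that `pd.get_dummies` uses.

```python
def one_hot(values, drop_first=False):
    """Return (column_names, rows) of 0/1 indicators, categories sorted alphabetically."""
    categories = sorted(set(values))
    kept = categories[1:] if drop_first else categories
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


embarked = ["S", "C", "S", "Q", "S"]  # toy values, same labels as the Embarked column

full_cols, full = one_hot(embarked)
red_cols, reduced = one_hot(embarked, drop_first=True)

# With every category kept, each row sums to 1: the columns duplicate the intercept
print(full_cols, [sum(r) for r in full])
# With the first category dropped, an all-zero row identifies the dropped category "C"
print(red_cols, reduced)
```

Keeping all three columns adds no information over keeping two, which is why the notebook ends up with only `Embarked_Q` and `Embarked_S`.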
--------------------------------------------------------------------------------
/Transformaciones de Variables Categóricas.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "### Categorical Variable Transformations"
8 | ]
9 | },
10 | {
11 | "cell_type": "code",
12 | "execution_count": 53,
13 | "metadata": {
14 | "id": "MMHzykyQfQQN",
15 | "outputId": "709fd4c9-4e33-4618-c445-a392d0c824bf"
16 | },
17 | "outputs": [
18 | {
19 | "data": {
73 | "text/plain": [
74 | " edad genero clase\n",
75 | "0 20 Masculino Primera\n",
76 | "1 30 Masculino Segunda\n",
77 | "2 40 Femenino Segunda\n",
78 | "3 50 Femenino Tercera"
79 | ]
80 | },
81 | "execution_count": 53,
82 | "metadata": {},
83 | "output_type": "execute_result"
84 | }
85 | ],
86 | "source": [
87 | "import pandas as pd\n",
88 | "\n",
89 | "df = pd.DataFrame(data={\n",
90 | " 'edad': [20, 30, 40, 50],\n",
91 | " 'genero': ['Masculino', 'Masculino', 'Femenino', 'Femenino'], \n",
92 | " 'clase': ['Primera', 'Segunda', 'Segunda','Tercera',]\n",
93 | "})\n",
94 | "\n",
95 | "df"
96 | ]
97 | },
98 | {
99 | "cell_type": "markdown",
100 | "metadata": {
101 | "id": "CVKT04n2fQQO"
102 | },
103 | "source": [
104 | " \n",
105 | "\n",
106 | "\n",
107 | "# 1. Binary Encoding\n",
108 | "\n",
109 | "\n",
110 | "* We will transform the `genero` variable into a binary variable.\n",
111 | "\n",
112 | "\n",
113 | "* This can be done with scikit-learn's **\"LabelBinarizer()\"** class as follows:"
114 | ]
115 | },
116 | {
117 | "cell_type": "code",
118 | "execution_count": 54,
119 | "metadata": {
120 | "id": "EuOo4Us6fQQO",
121 | "outputId": "fac92611-04c6-40ce-9434-314ce62a761b"
122 | },
123 | "outputs": [
124 | {
125 | "data": {
174 | "text/plain": [
175 | " genero genero_binario\n",
176 | "0 Masculino 1\n",
177 | "1 Masculino 1\n",
178 | "2 Femenino 0\n",
179 | "3 Femenino 0"
180 | ]
181 | },
182 | "execution_count": 54,
183 | "metadata": {},
184 | "output_type": "execute_result"
185 | }
186 | ],
187 | "source": [
188 | "from sklearn.preprocessing import LabelBinarizer\n",
189 | "\n",
190 | "lb = LabelBinarizer()\n",
191 | "df['genero_binario'] = lb.fit_transform(df['genero'])\n",
192 | "df[['genero', 'genero_binario']]"
193 | ]
194 | },
195 | {
196 | "cell_type": "markdown",
197 | "metadata": {
198 | "id": "FnW2yQtvfQQO"
199 | },
200 | "source": [
201 | " \n",
202 | "\n",
203 | "\n",
204 | "# 2. One-Hot Encoding \n",
205 | "\n"
206 | ]
207 | },
208 | {
209 | "cell_type": "code",
210 | "execution_count": 55,
211 | "metadata": {
212 | "id": "xDu0ecPYfQQO",
213 | "outputId": "a93c60f1-6aa2-4768-b075-3ece2fa05956"
214 | },
215 | "outputs": [],
216 | "source": [
217 | "from sklearn.preprocessing import OneHotEncoder"
218 | ]
219 | },
220 | {
221 | "cell_type": "code",
222 | "execution_count": 56,
223 | "metadata": {},
224 | "outputs": [
225 | {
226 | "data": {
227 | "text/plain": [
228 | "array([[0., 0.],\n",
229 | " [1., 0.],\n",
230 | " [1., 0.],\n",
231 | " [0., 1.]])"
232 | ]
233 | },
234 | "execution_count": 56,
235 | "metadata": {},
236 | "output_type": "execute_result"
237 | }
238 | ],
239 | "source": [
240 | "ohe = OneHotEncoder(drop='first')\n",
241 | "x = df[['clase']].values\n",
242 | "x_one_hot = ohe.fit_transform(x).toarray()\n",
243 | "x_one_hot"
244 | ]
245 | },
246 | {
247 | "cell_type": "code",
248 | "execution_count": 57,
249 | "metadata": {},
250 | "outputs": [
251 | {
252 | "data": {
321 | "text/plain": [
322 | " edad genero clase genero_binario clase_0 clase_1\n",
323 | "0 20 Masculino Primera 1 0.0 0.0\n",
324 | "1 30 Masculino Segunda 1 1.0 0.0\n",
325 | "2 40 Femenino Segunda 0 1.0 0.0\n",
326 | "3 50 Femenino Tercera 0 0.0 1.0"
327 | ]
328 | },
329 | "execution_count": 57,
330 | "metadata": {},
331 | "output_type": "execute_result"
332 | }
333 | ],
334 | "source": [
335 | "df_one_hot = pd.DataFrame(x_one_hot, columns = [\"clase_\"+str(int(i)) for i in range(x_one_hot.shape[1])])\n",
336 | "df_all = pd.concat([df, df_one_hot], axis=1)\n",
337 | "df_all"
338 | ]
339 | },
340 | {
341 | "cell_type": "markdown",
342 | "metadata": {},
343 | "source": [
344 | "The \"dummy variable trap\" is a problem that can arise when the categorical variables of a linear regression model (or other models) are converted into dummy (indicator) variables. Dummy variables turn one categorical variable into several binary (0 or 1) columns, so that regression models can work with non-numeric variables. The trap appears when every category gets its own column: the columns of one variable always sum to 1, so each of them is fully determined by the others.\n",
345 | "\n",
346 | "**What is a dummy variable?**\n",
347 | "When you have a categorical variable (such as \"color\" with the values \"red\", \"green\", and \"blue\"), you need to turn it into several columns so that the model can process it. This is done with dummy variables. If \"color\" has three values, the dummy variables look like this:\n",
348 | "\n",
349 | "* \"red\" → (1, 0, 0)\n",
350 | "* \"green\" → (0, 1, 0)\n",
351 | "* \"blue\" → (0, 0, 1)"
352 | ]
353 | },
354 | {
355 | "cell_type": "markdown",
356 | "metadata": {},
357 | "source": [
358 | "#### Why drop one category?\n",
359 | "\n",
360 | "Dropping one category removes the redundancy between the dummy columns. For example, if you already know that a row belongs to neither \"Segunda\" nor \"Tercera\", you automatically know that it belongs to \"Primera\". This avoids perfect multicollinearity among the dummy columns, which makes the coefficient estimates of linear models more stable."
361 | ]
362 | },
363 | {
364 | "cell_type": "markdown",
365 | "metadata": {
366 | "id": "WXvFp08QfQQP"
367 | },
368 | "source": [
369 | "### One-Hot Encoding in Pandas\n",
370 | "\n",
371 | "\n",
372 | "* Pandas also supports this transformation, as follows:"
373 | ]
374 | },
375 | {
376 | "cell_type": "code",
377 | "execution_count": 58,
378 | "metadata": {},
379 | "outputs": [
380 | {
381 | "data": {
445 | "text/plain": [
446 | " edad genero genero_binario clase_Segunda clase_Tercera\n",
447 | "0 20 Masculino 1 0 0\n",
448 | "1 30 Masculino 1 1 0\n",
449 | "2 40 Femenino 0 1 0\n",
450 | "3 50 Femenino 0 0 1"
451 | ]
452 | },
453 | "execution_count": 58,
454 | "metadata": {},
455 | "output_type": "execute_result"
456 | }
457 | ],
458 | "source": [
459 | "import pandas as pd\n",
460 | "\n",
461 | "\n",
462 | "df_encoded = pd.get_dummies(df, columns=['clase'], drop_first=True)\n",
463 | "\n",
464 | "# Show the DataFrame with the new columns\n",
465 | "df_encoded\n"
466 | ]
467 | },
468 | {
469 | "cell_type": "markdown",
470 | "metadata": {
471 | "id": "fo1i2yWGfQQQ"
472 | },
473 | "source": [
474 | " \n",
475 | "\n",
476 | "\n",
477 | "\n",
478 | "# 3. Variables with a natural order \n",
479 | "\n",
480 | "\n"
481 | ]
482 | },
483 | {
484 | "cell_type": "code",
485 | "execution_count": 59,
486 | "metadata": {
487 | "id": "EiLYj-HDfQQQ",
488 | "outputId": "8b4f5074-dcf0-4669-d04f-25e8dd63a7a2"
489 | },
490 | "outputs": [],
491 | "source": [
492 | "df = pd.DataFrame({\n",
493 | " 'nombre': ['Juan', 'María', 'Luis', 'Ana'],\n",
494 | " 'nivel_educativo': ['Primaria', 'Posgrado', 'Secundaria', 'Pregrado']\n",
495 | "})\n",
496 | "\n"
497 | ]
498 | },
499 | {
500 | "cell_type": "code",
501 | "execution_count": 60,
502 | "metadata": {},
503 | "outputs": [
504 | {
505 | "data": {
559 | "text/plain": [
560 | " nombre nivel_educativo nivel_educativo_encoded\n",
561 | "0 Juan Primaria 0\n",
562 | "1 María Posgrado 3\n",
563 | "2 Luis Secundaria 1\n",
564 | "3 Ana Pregrado 2"
565 | ]
566 | },
567 | "execution_count": 60,
568 | "metadata": {},
569 | "output_type": "execute_result"
570 | }
571 | ],
572 | "source": [
573 | "# Define the correct order of the categories\n",
574 | "orden_nivel_educativo = ['Primaria', 'Secundaria', 'Pregrado', 'Posgrado']\n",
575 | "\n",
576 | "# Convert the 'nivel_educativo' column into an ordered categorical variable\n",
577 | "df['nivel_educativo'] = pd.Categorical(df['nivel_educativo'], categories=orden_nivel_educativo, ordered=True)\n",
578 | "\n",
579 | "# Map each category to the integer code given by its position in the order\n",
580 | "df['nivel_educativo_encoded'] = df['nivel_educativo'].cat.codes\n",
581 | "\n",
582 | "# Show the DataFrame with the encoded column\n",
583 | "df"
584 | ]
585 | }
586 | ],
587 | "metadata": {
588 | "colab": {
589 | "provenance": []
590 | },
591 | "kernelspec": {
592 | "display_name": "Python 3 (ipykernel)",
593 | "language": "python",
594 | "name": "python3"
595 | },
596 | "language_info": {
597 | "codemirror_mode": {
598 | "name": "ipython",
599 | "version": 3
600 | },
601 | "file_extension": ".py",
602 | "mimetype": "text/x-python",
603 | "name": "python",
604 | "nbconvert_exporter": "python",
605 | "pygments_lexer": "ipython3",
606 | "version": "3.11.4"
607 | }
608 | },
609 | "nbformat": 4,
610 | "nbformat_minor": 1
611 | }
612 |
--------------------------------------------------------------------------------
/Unsupervised Machine Learning Proyecto Final/readme.md:
--------------------------------------------------------------------------------
1 |
2 | ## IBM Machine Learning Professional Certificate - Unsupervised Machine Learning
3 |
4 | Required in your Project:
5 |
6 | - Main objective of the analysis that also specifies whether your model will be focused on clustering or dimensionality reduction and the benefits that your analysis brings to the business or stakeholders of this data.
7 |
8 | - Brief description of the data set you chose, a summary of its attributes, and an outline of what you are trying to accomplish with this analysis.
9 |
10 | - Brief summary of data exploration and actions taken for data cleaning or feature engineering.
11 |
12 | - Summary of training at least three variations of the unsupervised model you selected. For example, you can use different clustering techniques or different hyperparameters.
13 |
14 | - A paragraph explaining which of your Unsupervised Learning models you recommend as a final model that best fits your needs.
15 |
16 | - Summary of key findings and insights, which walks your reader through the main findings of your modeling exercise.
17 |
18 | - Suggestions for next steps in analyzing this data.
19 |
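As a rough illustration of the "at least three variations" requirement listed above, the sketch below fits KMeans with three different values of `n_clusters` and compares them by silhouette score. The synthetic `make_blobs` data, the candidate values of `k`, and the choice of silhouette score as the comparison metric are all assumptions made for this example; they are not taken from the project notebook.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): three groups of 2-D points.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=42)

# Scale the features so that no single feature dominates the distance computation.
X_scaled = StandardScaler().fit_transform(X)

# Three variations of the same model: KMeans with a different number of clusters.
scores = {}
for k in (2, 3, 4):
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    labels = model.fit_predict(X_scaled)
    # Silhouette score lies in [-1, 1]; higher means better-separated clusters.
    scores[k] = silhouette_score(X_scaled, labels)

best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print(f"Highest silhouette score: k={best_k}")
```

The same loop could compare different algorithms (for example DBSCAN or agglomerative clustering) instead of different hyperparameters; the silhouette score is only one of several possible criteria for recommending a final model.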
20 |
--------------------------------------------------------------------------------
/_Librería Lazypredict.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 50,
6 | "id": "2c92e51b",
7 | "metadata": {},
8 | "outputs": [],
9 | "source": [
10 | "import pandas as pd \n",
11 | "from sklearn.model_selection import train_test_split\n",
12 | "import lazypredict \n",
13 | "from lazypredict.Supervised import LazyRegressor"
14 | ]
15 | },
16 | {
17 | "cell_type": "code",
18 | "execution_count": 51,
19 | "id": "0da3b1bf",
20 | "metadata": {},
21 | "outputs": [],
22 | "source": [
23 | "# Set the random seed\n",
24 | "SEMILLA = 42"
25 | ]
26 | },
27 | {
28 | "cell_type": "code",
29 | "execution_count": 52,
30 | "id": "e1f1c656",
31 | "metadata": {},
32 | "outputs": [],
33 | "source": [
34 | "# Read the dataset\n",
35 | "data = pd.read_csv('properati_caba_2021.csv')"
36 | ]
37 | },
38 | {
39 | "cell_type": "code",
40 | "execution_count": 53,
41 | "id": "1dbc66f4",
42 | "metadata": {},
43 | "outputs": [
44 | {
45 | "data": {
147 | "text/plain": [
148 | " id lat lon l3 rooms bathrooms \\\n",
149 | "0 5dmWZ4uqAU4kpJw0AEc/Hw== -34.62 -58.40 San Cristobal 7.00 6.00 \n",
150 | "1 kFa6ndbLuJ2k2zfzI1fY3A== -34.56 -58.46 Belgrano 1.00 1.00 \n",
151 | "2 S0fct9jgpfmuqmOaPntC/Q== -34.62 -58.38 Monserrat 1.00 1.00 \n",
152 | "3 sduUfHxdOh9PuRwJruEcyA== -34.62 -58.38 San Telmo 1.00 1.00 \n",
153 | "4 Tl1ebIQJyPOXV2XJMEImQg== -34.62 -58.38 Constitución 1.00 1.00 \n",
154 | "\n",
155 | " surface_total surface_covered price property_type \n",
156 | "0 534.00 384.00 470000.00 Casa \n",
157 | "1 25.00 25.00 60000.00 Departamento \n",
158 | "2 40.00 40.00 82500.00 Departamento \n",
159 | "3 40.00 40.00 82500.00 Departamento \n",
160 | "4 40.00 40.00 82500.00 Departamento "
161 | ]
162 | },
163 | "execution_count": 53,
164 | "metadata": {},
165 | "output_type": "execute_result"
166 | }
167 | ],
168 | "source": [
169 | "# Inspect the first rows of the DataFrame\n",
170 | "data.head()"
171 | ]
172 | },
173 | {
174 | "cell_type": "code",
175 | "execution_count": 54,
176 | "id": "90ad4a14",
177 | "metadata": {},
178 | "outputs": [
179 | {
180 | "data": {
284 | "text/plain": [
285 | " lat lon rooms bathrooms surface_total surface_covered \\\n",
286 | "lat 1.00 -0.14 -0.01 0.10 0.02 0.04 \n",
287 | "lon -0.14 1.00 0.05 0.08 0.05 0.09 \n",
288 | "rooms -0.01 0.05 1.00 0.62 0.74 0.75 \n",
289 | "bathrooms 0.10 0.08 0.62 1.00 0.74 0.75 \n",
290 | "surface_total 0.02 0.05 0.74 0.74 1.00 0.97 \n",
291 | "surface_covered 0.04 0.09 0.75 0.75 0.97 1.00 \n",
292 | "price 0.15 0.18 0.49 0.67 0.72 0.76 \n",
293 | "\n",
294 | " price \n",
295 | "lat 0.15 \n",
296 | "lon 0.18 \n",
297 | "rooms 0.49 \n",
298 | "bathrooms 0.67 \n",
299 | "surface_total 0.72 \n",
300 | "surface_covered 0.76 \n",
301 | "price 1.00 "
302 | ]
303 | },
304 | "execution_count": 54,
305 | "metadata": {},
306 | "output_type": "execute_result"
307 | }
308 | ],
309 | "source": [
310 | "# Correlation matrix of the numeric variables (numeric_only skips text columns such as 'id' and 'l3')\n",
311 | "data.corr(numeric_only=True)"
312 | ]
313 | },
314 | {
315 | "cell_type": "code",
316 | "execution_count": 55,
317 | "id": "85b88a5f",
318 | "metadata": {},
319 | "outputs": [],
320 | "source": [
321 | "# Split the dataset into X (predictor variables) and y (target variable)\n",
322 | "X = data[['rooms', 'bathrooms', 'surface_total', 'surface_covered']]\n",
323 | "y = data['price']"
324 | ]
325 | },
326 | {
327 | "cell_type": "code",
328 | "execution_count": 56,
329 | "id": "76759d00",
330 | "metadata": {},
331 | "outputs": [],
332 | "source": [
333 | "from sklearn.model_selection import train_test_split\n",
334 | "# Split X and y into training and test sets\n",
335 | "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=SEMILLA)"
336 | ]
337 | },
338 | {
339 | "cell_type": "code",
340 | "execution_count": 57,
341 | "id": "f9a8ea4b",
342 | "metadata": {},
343 | "outputs": [
344 | {
345 | "name": "stderr",
346 | "output_type": "stream",
347 | "text": [
348 | "100%|██████████| 42/42 [06:42<00:00, 9.58s/it]"
349 | ]
350 | },
351 | {
352 | "name": "stdout",
353 | "output_type": "stream",
354 | "text": [
355 | "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000606 seconds.\n",
356 | "You can set `force_row_wise=true` to remove the overhead.\n",
357 | "And if memory is not enough, you can set `force_col_wise=true`.\n",
358 | "[LightGBM] [Info] Total Bins 541\n",
359 | "[LightGBM] [Info] Number of data points in the train set: 32124, number of used features: 4\n",
360 | "[LightGBM] [Info] Start training from score 245499.655491\n"
361 | ]
362 | },
363 | {
364 | "name": "stderr",
365 | "output_type": "stream",
366 | "text": [
367 | "\n"
368 | ]
369 | }
370 | ],
371 | "source": [
372 | "# For regression tasks\n",
373 | "reg = LazyRegressor(verbose=0, ignore_warnings=True, predictions=True, random_state=SEMILLA)\n",
374 | "models, predictions = reg.fit(X_train, X_test, y_train, y_test)"
375 | ]
376 | },
377 | {
378 | "cell_type": "code",
379 | "execution_count": 58,
380 | "id": "57f95645",
381 | "metadata": {},
382 | "outputs": [
383 | {
384 | "data": {
527 | "text/plain": [
528 | " Adjusted R-Squared R-Squared RMSE \\\n",
529 | "Model \n",
530 | "RandomForestRegressor 0.65 0.65 197526.48 \n",
531 | "LassoLarsCV 0.64 0.64 199850.17 \n",
532 | "LarsCV 0.64 0.64 199850.17 \n",
533 | "OrthogonalMatchingPursuitCV 0.64 0.64 199850.17 \n",
534 | "LassoLarsIC 0.64 0.64 199850.17 \n",
535 | "Lars 0.64 0.64 199850.17 \n",
536 | "TransformedTargetRegressor 0.64 0.64 199850.17 \n",
537 | "LinearRegression 0.64 0.64 199850.17 \n",
538 | "LassoLars 0.64 0.64 199850.94 \n",
539 | "Lasso 0.64 0.64 199850.95 \n",
540 | "Ridge 0.64 0.64 199854.96 \n",
541 | "BayesianRidge 0.64 0.64 199859.18 \n",
542 | "GradientBoostingRegressor 0.64 0.64 199883.02 \n",
543 | "RidgeCV 0.64 0.64 199898.04 \n",
544 | "LassoCV 0.64 0.64 200042.25 \n",
545 | "\n",
546 | " Time Taken \n",
547 | "Model \n",
548 | "RandomForestRegressor 4.60 \n",
549 | "LassoLarsCV 0.05 \n",
550 | "LarsCV 0.11 \n",
551 | "OrthogonalMatchingPursuitCV 0.03 \n",
552 | "LassoLarsIC 0.06 \n",
553 | "Lars 0.23 \n",
554 | "TransformedTargetRegressor 0.03 \n",
555 | "LinearRegression 0.04 \n",
556 | "LassoLars 0.03 \n",
557 | "Lasso 0.03 \n",
558 | "Ridge 0.02 \n",
559 | "BayesianRidge 0.05 \n",
560 | "GradientBoostingRegressor 1.30 \n",
561 | "RidgeCV 0.03 \n",
562 | "LassoCV 0.31 "
563 | ]
564 | },
565 | "execution_count": 58,
566 | "metadata": {},
567 | "output_type": "execute_result"
568 | }
569 | ],
570 | "source": [
571 | "models.head(15)"
572 | ]
573 | },
574 | {
575 | "cell_type": "code",
576 | "execution_count": 59,
577 | "id": "67becea8",
578 | "metadata": {},
579 | "outputs": [
580 | {
581 | "data": {
750 | "text/plain": [
751 | " AdaBoostRegressor BaggingRegressor BayesianRidge DecisionTreeRegressor \\\n",
752 | "0 228968.84 214928.57 180022.93 207747.48 \n",
753 | "1 294374.96 273537.08 320488.60 220725.00 \n",
754 | "2 112564.81 94430.07 100085.44 94290.30 \n",
755 | "3 744567.56 242700.00 515875.50 130000.00 \n",
756 | "4 112564.81 137013.53 120031.06 139345.56 \n",
757 | "\n",
758 | " DummyRegressor ElasticNet ElasticNetCV ExtraTreeRegressor \\\n",
759 | "0 245499.66 191236.47 244862.47 207747.48 \n",
760 | "1 245499.66 329842.74 247122.24 299000.00 \n",
761 | "2 245499.66 114400.22 243411.69 94290.30 \n",
762 | "3 245499.66 548703.57 250178.81 630000.00 \n",
763 | "4 245499.66 132591.67 243663.21 139345.56 \n",
764 | "\n",
765 | " ExtraTreesRegressor GammaRegressor ... RANSACRegressor \\\n",
766 | "0 207747.48 174113.15 ... 180061.19 \n",
767 | "1 265603.50 250171.98 ... 250576.81 \n",
768 | "2 94290.30 139537.87 ... 109747.86 \n",
769 | "3 289404.00 394957.25 ... 448760.09 \n",
770 | "4 139345.56 144895.42 ... 125485.36 \n",
771 | "\n",
772 | " RandomForestRegressor Ridge RidgeCV SGDRegressor SVR \\\n",
773 | "0 209371.87 180038.49 179881.35 180233.31 154946.08 \n",
774 | "1 252777.32 320494.04 320439.26 306135.61 161310.78 \n",
775 | "2 94360.00 100091.05 100034.46 106628.80 150499.26 \n",
776 | "3 367419.26 515773.57 516804.31 496984.95 160433.09 \n",
777 | "4 138948.17 120031.38 120028.17 126168.38 150863.50 \n",
778 | "\n",
779 | " TransformedTargetRegressor TweedieRegressor XGBRegressor LGBMRegressor \n",
780 | "0 180056.17 202048.48 188592.42 188580.82 \n",
781 | "1 320500.23 325697.63 318698.81 298953.89 \n",
782 | "2 100097.42 129835.00 97081.09 100720.71 \n",
783 | "3 515657.77 510097.69 372152.19 428587.70 \n",
784 | "4 120031.75 145090.58 134329.66 128959.05 \n",
785 | "\n",
786 | "[5 rows x 39 columns]"
787 | ]
788 | },
789 | "execution_count": 59,
790 | "metadata": {},
791 | "output_type": "execute_result"
792 | }
793 | ],
794 | "source": [
795 | "predictions.head()"
796 | ]
797 | }
798 | ],
799 | "metadata": {
800 | "kernelspec": {
801 | "display_name": "Python 3 (ipykernel)",
802 | "language": "python",
803 | "name": "python3"
804 | },
805 | "language_info": {
806 | "codemirror_mode": {
807 | "name": "ipython",
808 | "version": 3
809 | },
810 | "file_extension": ".py",
811 | "mimetype": "text/x-python",
812 | "name": "python",
813 | "nbconvert_exporter": "python",
814 | "pygments_lexer": "ipython3",
815 | "version": "3.11.4"
816 | }
817 | },
818 | "nbformat": 4,
819 | "nbformat_minor": 5
820 | }
821 |
--------------------------------------------------------------------------------