├── .gitignore
├── BookRef
│   ├── GPU-apps-catalog-mar2015.pdf
│   ├── OpenMP
│   │   └── IntrotoOpenMP.pdf
│   └── Programming Massively Parallel Processors.pdf
├── LICENSE
├── Lab1
│   ├── .ipynb_checkpoints
│   │   ├── Analisis-checkpoint.ipynb
│   │   └── readme-checkpoint.ipynb
│   ├── clang
│   │   ├── allomen.c
│   │   ├── colorinunix.c
│   │   ├── matrixr
│   │   ├── matrixtest.c
│   │   ├── matrixtestcalloc.c
│   │   ├── test1
│   │   ├── test2
│   │   ├── testcolor
│   │   └── usingsizeof.c
│   ├── data1.txt
│   ├── data2.txt
│   ├── readme.ipynb
│   ├── readme.md
│   ├── result.txt
│   ├── run.sh
│   └── solve.c
├── Lab2
│   ├── datasets
│   │   ├── data1.txt
│   │   ├── data1500.txt
│   │   ├── data2.txt
│   │   └── data2500.txt
│   ├── logtimes.txt
│   ├── readme.ipynb
│   ├── readme.md
│   ├── result.txt
│   ├── run.sh
│   ├── samplesomp
│   │   ├── sample1.c
│   │   └── sample4.c
│   └── solve.c
├── Lab3.1CudaAddVector
│   ├── vecadd.cu
│   └── vecadd.sh
├── Lab3NvidiaCudaIntro
│   ├── Accelerated Computing.ipynb
│   ├── gpu_computing.zip
│   └── readme.md
├── Lab4CudaMM
│   ├── matmul.sh
│   ├── matmult.cu
│   └── readme.md
├── Lab5Numba
│   ├── GPULibrariesBasics.ipynb
│   └── mc.py
├── Notes
│   ├── HPC-Class1.ipynb
│   └── OficialSlides
│       ├── Class1-openMP.pdf
│       ├── Class2-Slurm.pdf
│       ├── Class3openMP.pdf
│       ├── Class4MPI.pdf
│       ├── Class5introductionDataParallelismCUDAC.pdf
│       ├── Class6vecAddition.pdf
│       ├── Class8CUDAMemories.pdf
│       └── NyuPowerWall.pdf
└── README.md
/.gitignore:
--------------------------------------------------------------------------------
1 | # Prerequisites
2 | *.d
3 |
4 | # Compiled Object files
5 | *.slo
6 | *.lo
7 | *.o
8 | *.obj
9 |
10 | # Precompiled Headers
11 | *.gch
12 | *.pch
13 |
14 | # Compiled Dynamic libraries
15 | *.so
16 | *.dylib
17 | *.dll
18 |
19 | # Fortran module files
20 | *.mod
21 | *.smod
22 |
23 | # Compiled Static libraries
24 | *.lai
25 | *.la
26 | *.a
27 | *.lib
28 |
29 | # Executables
30 | *.exe
31 | *.out
32 | *.app
33 | */.ipynb_checkpoints
34 | .ipynb_checkpoints
35 |
36 |
--------------------------------------------------------------------------------
/BookRef/GPU-apps-catalog-mar2015.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/BookRef/GPU-apps-catalog-mar2015.pdf
--------------------------------------------------------------------------------
/BookRef/OpenMP/IntrotoOpenMP.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/BookRef/OpenMP/IntrotoOpenMP.pdf
--------------------------------------------------------------------------------
/BookRef/Programming Massively Parallel Processors.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/BookRef/Programming Massively Parallel Processors.pdf
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2017 Hector F. Jimenez.
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/Lab1/.ipynb_checkpoints/Analisis-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Laboratorio 1, High Performance Computing\n",
8 | "Con el fin de realizar un repaso de las habilidades con el lenguaje ***C*** se debe construir un programa que permita multiplicar 2 matrices, teniendo en cuenta las siguientes condiciones :\n",
9 | "\n",
10 | "* Las dos Matrices se leeran de dos archivos de texto separados por comas. \n",
11 | "* los dos primeras lineas contienen la cantidad de filas y columnas de la matriz respectiva. \n",
12 | "\n",
13 | "*Ejemplo:* \n",
14 | "```\n",
15 | "2\n",
16 | "2\n",
17 | "1,2\n",
18 | "3,4\n",
19 | "```\n",
20 | "El resulta de la multiplicación de las dos matrices debe ser enviado via stdout aun archivo de texto,\n",
21 | "utilizar ***Taller1HPC*** es el asunto que debe tener el email enviado al docente John.\n",
22 | "\n",
23 | "Como tipo de dato se utiliza precisión sencilla ***float***\n",
24 | "# Solución\n",
25 | "\n",
26 | "## Multiplicación de Matrices\n",
27 | "dadas dos matrices $A$,$B$ de la forma:\n",
28 | "$$A = \\begin{pmatrix}\n",
29 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
30 | " \\vdots & \\ddots & \\vdots \\\\\n",
31 | " a_{m 1} & \\cdots & a_{m n}\n",
32 | " \\end{pmatrix}, B = \\begin{pmatrix}\n",
33 | " b_{1 1} & \\cdots & b_{1 n} \\\\\n",
34 | " \\vdots & \\ddots & \\vdots \\\\\n",
35 | " b_{m 1} & \\cdots & b_{m n}\n",
36 | " \\end{pmatrix}\n",
37 | " $$\n",
38 | "\n",
39 | "Escritas en los textos como $A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, donde $m,n,p$ indican las filas y columnas de cada matriz, el producto de $A\\cdot B$ es:\n",
40 | "$$\n",
41 | "C = AB_{}^{}, \\\\= \\begin{pmatrix}\n",
42 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
43 | " \\vdots & \\ddots & \\vdots \\\\\n",
44 | " a_{m 1} & \\cdots & a_{m n}\n",
45 | " \\end{pmatrix} \\cdot \\begin{pmatrix}\n",
46 | " b_{1 1} & \\cdots & b_{1 p} \\\\\n",
47 | " \\vdots & \\ddots & \\vdots \\\\\n",
48 | " b_{n 1} & \\cdots & b_{n p}\n",
49 | " \\end{pmatrix} \\\\ \\begin{pmatrix}\n",
50 | " a_{11}b_{11}+ \\cdots +a_{1n}b_{n1} & \\cdots & a_{11}b_{1p}+ \\cdots +a_{1n}b_{np} \\\\\n",
51 | " \\vdots & \\ddots & \\vdots \\\\\n",
52 | " a_{m1}b_{11}+ \\cdots +a_{mn}b_{n1} & \\cdots & a_{m1}b_{1p}+ \\cdots +a_{mn}b_{np}\n",
53 | " \\end{pmatrix}\n",
54 | "$$\n",
55 | "Que no es mas que la sumatoria de multiplicar la fila por la columna para cada elemento de la matriz resulta:\n",
56 | "$$\n",
57 | "c_{ij} = \\sum_{r=1}^n a_{ir}b_{rj}$$\n",
58 | "\n",
59 | "\n",
60 | "*Nota[!]:La cantindad de Columnas debe ser igual a la cantidad de filas de la segunda matriz,$A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, tendra como resultado una Matriz $B:=(b_{i j})_{m \\times p}$*.\n",
61 | "\n",
62 | "De las observaciones anteriores podemos decir que para resolver la multiplicación de las matrices $A\\cdot B$ es necesario realizar $m*n*p$ multiplicaciones, Seria posible utilizar una gran cantidad de tecnicas como se describe en [1][1] y [2][2], pero por tiempo utilizare la version interactiva tiene un costo $Θ(n^{3})$ como se muestra en [3][3]:\n",
63 | "\n",
64 | "
\n",
65 | "\n",
66 | "## Codigo implementado\n",
67 | "Para resolver este problema he hecho uso de los Makefiles para hacer parseo de los datos que se me entrega, \n",
68 | "la ejecucion de este programa obligatoriamente estable que los archivos de entrada tengan los nombre data1.txt, data2.txt, siendo *data1.txt* el archivo que contiene los datos reales de la Matriz $A$, y lo mismo para $B$, \n",
69 | "luego lo unico que sera hacaer es:\n",
70 | "**make run**\n",
71 | "\n",
72 | "```c\n",
73 | "#include \n",
74 | "#include \n",
75 | "//High Performance Computing\n",
76 | "//Lab#1: Matrix Multiplication \n",
77 | "//hfjimenez@utp.edu.co, 2017-2\n",
78 | "int main(int argc, const char * argv[]){\n",
79 | "\tFILE *archi;\n",
80 | "\tFILE *archi2;\n",
81 | "\tarchi = fopen(argv[1], \"r\");\t\n",
82 | "\tarchi2 = fopen(argv[2], \"r\");\n",
83 | "\n",
84 | "\t//the first two number of each datamatrix file opts=[rmat1,colmat1,rmat2,colmat2]\n",
85 | "\tint opts[4];\n",
86 | "\t//NULL in C is ugly, NULL is equal to 0, not like in C++ <3 that is a real null value.\n",
87 | "\tif (archi == NULL || archi2 == NULL){\t\t\n",
88 | "\t printf(\"Imposible abrir los archivos pasados como argumentos\");\n",
89 | "\t exit(0);\n",
90 | "\t \t}\n",
91 | "\t //size of the matrix is fixed we Just need to get the number of rows and cols\n",
92 | "\t // of each matrix \n",
93 | "\tfor (i = 0; i < 2; i++){\n",
94 | "\t fscanf(archi,\"%d\",&opts[i]);\n",
95 | "\t fscanf(archi2,\"%d\",&opts[i+2]);\n",
96 | "\t}\n",
97 | "\n",
98 | "\t//printf(\"%d %d\\n\",opts[1],opts[2]);\n",
99 | "\tif(opts[1]!=opts[2]){\n",
100 | "\t\tprintf(\"\\n[!] No Es Posible realizar la Multiplicacion entre Matrices\\n \");\n",
101 | "\t exit(0);\n",
102 | "\t}\n",
103 | "\tint i=0;\n",
104 | "\tfor(i=0;i<4;i++){\n",
105 | "\t\tprintf(\"%d\",opts[i] );\n",
106 | "\t}\n",
107 | "\n",
108 | "\tfloat A[(opts[0]*opts[1])+2];\n",
109 | "\tfloat B[(opts[2]*opts[3])+2];\n",
110 | "\n",
111 | "\tfclose(archi);\n",
112 | "\tfclose(archi2);\n",
113 | "\treturn 0;\n",
114 | "} \n",
115 | "\n",
116 | "\n",
117 | "\n",
118 | "```\n",
119 | "#### Referencias :\n",
120 | "- https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
121 | "- https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
122 | "[1]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
123 | "[2]: https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
124 | "[3]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm"
125 | ]
126 | }
127 | ],
128 | "metadata": {
129 | "kernelspec": {
130 | "display_name": "Python 3",
131 | "language": "python",
132 | "name": "python3"
133 | },
134 | "language_info": {
135 | "codemirror_mode": {
136 | "name": "ipython",
137 | "version": 3
138 | },
139 | "file_extension": ".py",
140 | "mimetype": "text/x-python",
141 | "name": "python",
142 | "nbconvert_exporter": "python",
143 | "pygments_lexer": "ipython3",
144 | "version": "3.5.4rc1"
145 | }
146 | },
147 | "nbformat": 4,
148 | "nbformat_minor": 2
149 | }
150 |
--------------------------------------------------------------------------------
/Lab1/.ipynb_checkpoints/readme-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Laboratorio 1, High Performance Computing\n",
8 | "Con el fin de realizar un repaso de las habilidades con el lenguaje ***C*** se debe construir un programa que permita multiplicar 2 matrices, teniendo en cuenta las siguientes condiciones :\n",
9 | "\n",
10 | "* Las dos Matrices se leeran de dos archivos de texto separados por comas. \n",
11 | "* los dos primeras lineas contienen la cantidad de filas y columnas de la matriz respectiva. \n",
12 | "*Ejemplo:* \n",
13 | "```\n",
14 | "2\n",
15 | "2\n",
16 | "1,2\n",
17 | "3,4\n",
18 | "```\n",
19 | "El resulta de la multiplicación de las dos matrices debe ser enviado via stdout aun archivo de texto,\n",
20 | "utilizar ***Taller1HPC*** es el asunto que debe tener el email enviado al docente John.\n",
21 | "\n",
22 | "Como tipo de dato se utiliza precisión sencilla ***float***\n",
23 | "# Solución\n",
24 | "\n",
25 | "## Multiplicación de Matrices\n",
26 | "dadas dos matrices $A$,$B$ de la forma:\n",
27 | "$$A = \\begin{pmatrix}\n",
28 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
29 | " \\vdots & \\ddots & \\vdots \\\\\n",
30 | " a_{m 1} & \\cdots & a_{m n}\n",
31 | " \\end{pmatrix}, B = \\begin{pmatrix}\n",
32 | " b_{1 1} & \\cdots & b_{1 n} \\\\\n",
33 | " \\vdots & \\ddots & \\vdots \\\\\n",
34 | " b_{m 1} & \\cdots & b_{m n}\n",
35 | " \\end{pmatrix}\n",
36 | " $$\n",
37 | "\n",
38 | "Escritas en los textos como $A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, donde $m,n,p$ indican las filas y columnas de cada matriz, el producto de $A\\cdot B$ es:\n",
39 | "$C = AB_{}^{}$.\n",
40 | "\n",
41 | "$$ \\begin{pmatrix}\n",
42 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
43 | " \\vdots & \\ddots & \\vdots \\\\\n",
44 | " a_{m 1} & \\cdots & a_{m n}\n",
45 | " \\end{pmatrix} \\cdot \\begin{pmatrix}\n",
46 | " b_{1 1} & \\cdots & b_{1 p} \\\\\n",
47 | " \\vdots & \\ddots & \\vdots \\\\\n",
48 | " b_{n 1} & \\cdots & b_{n p}\n",
49 | " \\end{pmatrix}$$\n",
50 | " \n",
51 | " $$\\begin{pmatrix}\n",
52 | " a_{11}b_{11}+ \\cdots +a_{1n}b_{n1} & \\cdots & a_{11}b_{1p}+ \\cdots +a_{1n}b_{np} \\\\\n",
53 | " \\vdots & \\ddots & \\vdots \\\\\n",
54 | " a_{m1}b_{11}+ \\cdots +a_{mn}b_{n1} & \\cdots & a_{m1}b_{1p}+ \\cdots +a_{mn}b_{np}\n",
55 | " \\end{pmatrix}\n",
56 | "$$\n",
57 | "\n",
58 | "Que no es mas que la sumatoria de multiplicar la fila por la columna para cada elemento de la matriz resulta:\n",
59 | "$$c_{ij} = \\sum_{r=1}^n a_{ir}b_{rj}$$\n",
60 | "\n",
61 | "\n",
62 | "*Nota[!]:La cantindad de Columnas debe ser igual a la cantidad de filas de la segunda matriz,$A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, tendra como resultado una Matriz $B:=(b_{i j})_{m \\times p}$*.\n",
63 | "\n",
64 | "De las observaciones anteriores podemos decir que para resolver la multiplicación de las matrices $A\\cdot B$ es necesario realizar $m*n*p$ multiplicaciones, Seria posible utilizar una gran cantidad de tecnicas como se describe en [1][1] y [2][2], pero por tiempo utilizare la version interactiva tiene un costo $Θ(n^{3})$ como se muestra en [3][3]:\n",
65 | "\n",
66 | "
\n",
67 | "\n",
68 | "## Codigo implementado\n",
69 | "Para resolver este problema he hecho uso de los Makefiles para hacer parseo de los datos que se me entrega, \n",
70 | "la ejecucion de este programa obligatoriamente establece que los archivos de entrada tengan los nombre ***data1.txt***, ***data2.txt***, siendo *data1.txt* el archivo que contiene los datos reales de la Matriz $A$, y lo mismo para $B$, \n",
71 | "luego lo unico que sera hacaer es:\n",
72 | "**make run**\n",
73 | "\n",
74 | "### Lectura de Archivo\n",
75 | "Para lograr la lectura de los archivo en lenguaje **C** utilizamos las funciones fopen,fwrite para la apertura del archivo, y el almacenamiento de este, el siguiente fragmento de codigo en C hace uso de fopen para abrir el archivo:\n",
76 | "```c\n",
77 | "#include \n",
78 | "int main(){\n",
79 | " FILE *archivo;\n",
80 | " archivo = fopen(\"data1.txt\",\"r\");\n",
81 | " fclose(fclose);\n",
82 | " return 0;\n",
83 | " }\n",
84 | "```\n",
85 | "Con el ejemplo anterior tenemos la posibilidad de acceder a los datos del archivo, ahora para la lectura hacemos uso de la funcion **fscanf()** que tiene los mismos parametros de **scanf()**, utilizando las mismas opciones para los diferentes tipos de dato. `int fscanf(FILE *stream, const char *format, ...)`. de los archivos recibidos por el programa, nos damos cuenta que las dos primeras lineas siempre serán enteros, por lo cual simplemente se leen cuatro enteros indicando filas, y columnas de cada matriz:\n",
86 | "\n",
87 | "```C\n",
88 | " //previous version has conflict with the pointer movmnt.\n",
89 | " fscanf(archi,\"%i\",&rows1);\n",
90 | " fscanf(archi,\"%i\",&cols1);\n",
91 | " fscanf(archi2,\"%i\",&rows2);\n",
92 | " fscanf(archi2,\"%i\",&cols2);\n",
93 | "```\n",
94 | "\n",
95 | "Siguiendo el algoritmo iterativo anterior validamos que las columnas de la primera matriz sean iguales a las filas de la segunda matriz, de lo contrario saqueremos un mensaje de *Dimensiones incompatibles*. \n",
96 | "\n",
97 | "Uno de los temas mas delicados a la hora de realizar este ejercicio fue el de manejar la memoria de manera dinamica, el lenguaje C nos da mucho control para hacer uso de esta, mediante un mecanismo muy simple, solicitud de memoria(tipicamente la parte del *heap*) y liberacion de la misma. El ciclo es sencillo, cuando se precisa almacenar un nuevo dato, se solicita tanta memoria en **bytes** como sea necesaria, y una vez que ese dato ya no se necesita, a la memoria se devuelve para poder ser reutilizada, para solicitar memoria se utiliza la funcion **malloc** o **calloc** en mi caso dado que inicializa los datos de la memoria solicitada con valores de ceros.\n",
98 | "\n",
99 | "El siguiente snippet documenta como es posible realizar la reserva de memoria para la Matriz $A$\n",
100 | "```C\n",
101 | " float **MatA = (float **)calloc(rows1,sizeof(float*));\n",
102 | " for(i = 0; i < rows1; i++)\n",
103 | " MatA[i] = (float *)calloc(cols1 ,sizeof(float));\n",
104 | "```\n",
105 | "al ser una reserva de memoria con **calloc** es necesario validar que efectivamente hallamos reservado el espacio en memoria:\n",
106 | "```C\n",
107 | "if (!MatA || !MatB || !MatC) { \n",
108 | " printf(\"\\n%s[!]%s Falla de Reserva de Memoria\",RED,RES);\n",
109 | " exit(ENOMEM);}\n",
110 | "```\n",
111 | "\n",
112 | "Ahora que ya hemos reservado la memoria para las tres matrices es necesario leer los datos :\n",
113 | "```C\n",
114 | "while(!feof(archi)){\n",
115 | " for(i=0;iclean1.txt\n",
159 | "sed -e \"s/,/ /g\" data2.txt>clean2.txt\n",
160 | "touch r.txt\n",
161 | "echo -e \"\\x1B[33m[*]\\x1B[0m Corriendo Programa\"\n",
162 | "gcc solve.c -w -s -o solver \n",
163 | "./solver clean1.txt clean2.txt\n",
164 | "echo -e \"\\x1B[33m[*]\\x1B[0m Escribiendo Resultado a Stdout(result.txt)\"\n",
165 | "sed -e \"s/ /,/g\" r.txt>result.txt\n",
166 | "rm r.txt clean1.txt clean2.txt\n",
167 | "echo -e \"\\x1B[32m[✔]TERMINADO\\x1B[0m\"\n",
168 | "```\n",
169 | "Ahora solo basta con realizar `./run.sh` y nuestro programa esta listo con complejidad cubica.\n",
170 | "\n",
171 | "#### Referencias :\n",
172 | "- https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
173 | "- https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
174 | "[1]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
175 | "[2]: https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
176 | "[3]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm"
177 | ]
178 | }
179 | ],
180 | "metadata": {
181 | "kernelspec": {
182 | "display_name": "Python 3",
183 | "language": "python",
184 | "name": "python3"
185 | },
186 | "language_info": {
187 | "codemirror_mode": {
188 | "name": "ipython",
189 | "version": 3
190 | },
191 | "file_extension": ".py",
192 | "mimetype": "text/x-python",
193 | "name": "python",
194 | "nbconvert_exporter": "python",
195 | "pygments_lexer": "ipython3",
196 | "version": "3.5.4rc1"
197 | }
198 | },
199 | "nbformat": 4,
200 | "nbformat_minor": 2
201 | }
202 |
--------------------------------------------------------------------------------
/Lab1/clang/allomen.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <stdlib.h>
3 | #define TABLE_SIZE 10
4 | struct cell_info
5 | {
6 | int a;
7 | int b;
8 | int table[TABLE_SIZE];
9 | };
10 | //define a pointer to the cell_info struct
11 | struct cell_info *cell_ptr;
12 |
13 | int main(int argc, char *argv[])
14 | {
15 | //Explicit memory management, you allocate the exact mem required in the heap.
16 | //the previous struct is located in the stack memory and static variables in
17 | //the global memory, three kind of different memories.
18 | cell_ptr = (struct cell_info *)malloc(sizeof(struct cell_info));
19 | printf("%dBytes",sizeof(cell_ptr));
20 | printf("\n%dBytes",sizeof(struct cell_info));
21 | //after the usage of the memory, we need to free up the mem using free.
22 | free(cell_ptr);
23 | //then the operative system just use the mem space that was free up.
24 | return 0;
25 | }
26 |
--------------------------------------------------------------------------------
/Lab1/clang/colorinunix.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
2 |
3 | #define KNRM "\x1B[0m"
4 | #define KRED "\x1B[31m"
5 | #define KGRN "\x1B[32m"
6 | #define KYEL "\x1B[33m"
7 | #define KBLU "\x1B[34m"
8 | #define KMAG "\x1B[35m"
9 | #define KCYN "\x1B[36m"
10 | #define KWHT "\x1B[37m"
11 |
12 | int main()
13 | {
14 | printf("%sblue\n", KBLU);
15 | printf("%smagenta\n", KMAG);
16 | printf("%scyan\n", KCYN);
17 | printf("%swhite\n", KWHT);
18 | printf("%snormal\n", KNRM);
19 |
20 | return 0;
21 | }
--------------------------------------------------------------------------------
/Lab1/clang/matrixr:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Lab1/clang/matrixr
--------------------------------------------------------------------------------
/Lab1/clang/matrixtest.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <stdlib.h>
3 | //stored in the global memory
4 | //int rows=4, cols=4;
5 | #define rows 4
6 | #define cols 4
7 | //store in the stack
8 | //use gdb to check this.
9 | int i,j;
10 | int **matrix; // we need a double pointer,
11 | // a matrix is a vector of vectors
12 |
13 | int main(){
14 | //store in stack
15 | int x ; // just testing gdb as debuger
16 | //I allocate a simple row
17 | matrix =(int**)malloc(sizeof(int*)*rows);
18 | //then I iterate over it, to allocate the cols.
19 |     for(i=0; i<rows; i++)
--------------------------------------------------------------------------------
/Lab1/clang/matrixtestcalloc.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <stdlib.h>
3 |
4 |
5 | //stored in the global memory
6 | //int rows=4, cols=4;
7 | #define rows 2
8 | #define cols 2
9 | //store in the stack
10 | //use gdb to check this.
11 | int i,j;
12 | int **matrix; // we need a double pointer,
13 | // a matrix is a vector of vectors
14 |
15 | int main(){
16 | //store in stack
17 | int x ; // just testing gdb as debuger
18 | //I allocate a simple row
19 | matrix =(int**)calloc(rows,sizeof(int*));
20 | //then I iterate over it, to allocate the cols.
21 |    for(i=0; i<rows; i++)
--------------------------------------------------------------------------------
/Lab1/clang/test1:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Lab1/clang/test1
--------------------------------------------------------------------------------
/Lab1/clang/test2:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Lab1/clang/test2
--------------------------------------------------------------------------------
/Lab1/clang/testcolor:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Lab1/clang/testcolor
--------------------------------------------------------------------------------
/Lab1/clang/usingsizeof.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
2 | #define NAME_LENGTH 10
3 | #define TABLE_SIZE 100
4 | #define UNITS_NUMBER 10
5 |
6 | struct unit
7 | { /* Define a struct with an internal union */
8 | int x;
9 | float y;
10 | double z;
11 | short int a;
12 | long b;
13 | union
14 | { /* Union with no name because it is internal to the struct */
15 | char name[NAME_LENGTH];
16 | int id;
17 | short int sid;
18 | } identifier;
19 | };
20 |
21 | int main(int argc, char *argv[])
22 | {
23 | int table[TABLE_SIZE];
24 | struct unit data[UNITS_NUMBER];
25 |
26 |     printf("%zuBytes\n", sizeof(struct unit));    /* Print size of structure */
27 |     printf("%zuBytes\n", sizeof(table));          /* Print size of table of ints */
28 |     printf("%zuBytes\n", sizeof(data));           /* Print size of table of structs */
29 |
30 | return 0;
31 | }
--------------------------------------------------------------------------------
/Lab1/data1.txt:
--------------------------------------------------------------------------------
1 | 12
2 | 12
3 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
4 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
5 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
6 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
7 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
8 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
9 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
10 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
11 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
12 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
13 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
14 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,40
15 |
--------------------------------------------------------------------------------
/Lab1/data2.txt:
--------------------------------------------------------------------------------
1 | 12
2 | 12
3 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
4 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
5 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
6 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
7 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
8 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
9 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
10 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
11 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
12 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
13 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
14 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
15 |
--------------------------------------------------------------------------------
/Lab1/readme.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Laboratorio 1, High Performance Computing\n",
8 | "Con el fin de realizar un repaso de las habilidades con el lenguaje ***C*** se debe construir un programa que permita multiplicar 2 matrices, teniendo en cuenta las siguientes condiciones :\n",
9 | "\n",
10 | "* Las dos Matrices se leeran de dos archivos de texto separados por comas. \n",
11 | "* los dos primeras lineas contienen la cantidad de filas y columnas de la matriz respectiva. \n",
12 | "*Ejemplo:* \n",
13 | "```\n",
14 | "2\n",
15 | "2\n",
16 | "1,2\n",
17 | "3,4\n",
18 | "```\n",
19 | "El resulta de la multiplicación de las dos matrices debe ser enviado via stdout aun archivo de texto,\n",
20 | "utilizar ***Taller1HPC*** es el asunto que debe tener el email enviado al docente John.\n",
21 | "\n",
22 | "Como tipo de dato se utiliza precisión sencilla ***float***\n",
23 | "# Solución\n",
24 | "\n",
25 | "## Multiplicación de Matrices\n",
26 | "dadas dos matrices $A$,$B$ de la forma:\n",
27 | "$$A = \\begin{pmatrix}\n",
28 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
29 | " \\vdots & \\ddots & \\vdots \\\\\n",
30 | " a_{m 1} & \\cdots & a_{m n}\n",
31 | " \\end{pmatrix}, B = \\begin{pmatrix}\n",
32 | " b_{1 1} & \\cdots & b_{1 n} \\\\\n",
33 | " \\vdots & \\ddots & \\vdots \\\\\n",
34 | " b_{m 1} & \\cdots & b_{m n}\n",
35 | " \\end{pmatrix}\n",
36 | " $$\n",
37 | "\n",
38 | "Escritas en los textos como $A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, donde $m,n,p$ indican las filas y columnas de cada matriz, el producto de $A\\cdot B$ es:\n",
39 | "$C = AB_{}^{}$.\n",
40 | "\n",
41 | "$$ \\begin{pmatrix}\n",
42 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
43 | " \\vdots & \\ddots & \\vdots \\\\\n",
44 | " a_{m 1} & \\cdots & a_{m n}\n",
45 | " \\end{pmatrix} \\cdot \\begin{pmatrix}\n",
46 | " b_{1 1} & \\cdots & b_{1 p} \\\\\n",
47 | " \\vdots & \\ddots & \\vdots \\\\\n",
48 | " b_{n 1} & \\cdots & b_{n p}\n",
49 | " \\end{pmatrix}$$\n",
50 | " \n",
51 | " $$\\begin{pmatrix}\n",
52 | " a_{11}b_{11}+ \\cdots +a_{1n}b_{n1} & \\cdots & a_{11}b_{1p}+ \\cdots +a_{1n}b_{np} \\\\\n",
53 | " \\vdots & \\ddots & \\vdots \\\\\n",
54 | " a_{m1}b_{11}+ \\cdots +a_{mn}b_{n1} & \\cdots & a_{m1}b_{1p}+ \\cdots +a_{mn}b_{np}\n",
55 | " \\end{pmatrix}\n",
56 | "$$\n",
57 | "\n",
58 | "Que no es mas que la sumatoria de multiplicar la fila por la columna para cada elemento de la matriz resulta:\n",
59 | "$$c_{ij} = \\sum_{r=1}^n a_{ir}b_{rj}$$\n",
60 | "\n",
61 | "\n",
62 | "*Nota[!]:La cantindad de Columnas debe ser igual a la cantidad de filas de la segunda matriz,$A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, tendra como resultado una Matriz $B:=(b_{i j})_{m \\times p}$*.\n",
63 | "\n",
64 | "De las observaciones anteriores podemos decir que para resolver la multiplicación de las matrices $A\\cdot B$ es necesario realizar $m*n*p$ multiplicaciones, Seria posible utilizar una gran cantidad de tecnicas como se describe en [1][1] y [2][2], pero por tiempo utilizare la version interactiva tiene un costo $Θ(n^{3})$ como se muestra en [3][3]:\n",
65 | "\n",
66 | "
\n",
67 | "\n",
68 | "## Codigo implementado\n",
69 | "Para resolver este problema he hecho uso de los Makefiles para hacer parseo de los datos que se me entrega, \n",
70 | "la ejecucion de este programa obligatoriamente establece que los archivos de entrada tengan los nombre ***data1.txt***, ***data2.txt***, siendo *data1.txt* el archivo que contiene los datos reales de la Matriz $A$, y lo mismo para $B$, \n",
71 | "luego lo unico que sera hacaer es:\n",
72 | "**make run**\n",
73 | "\n",
74 | "### Lectura de Archivo\n",
75 | "Para lograr la lectura de los archivo en lenguaje **C** utilizamos las funciones fopen,fwrite para la apertura del archivo, y el almacenamiento de este, el siguiente fragmento de codigo en C hace uso de fopen para abrir el archivo:\n",
76 | "```c\n",
77 | "#include \n",
78 | "int main(){\n",
79 | " FILE *archivo;\n",
80 | " archivo = fopen(\"data1.txt\",\"r\");\n",
81 | " fclose(fclose);\n",
82 | " return 0;\n",
83 | " }\n",
84 | "```\n",
85 | "Con el ejemplo anterior tenemos la posibilidad de acceder a los datos del archivo, ahora para la lectura hacemos uso de la funcion **fscanf()** que tiene los mismos parametros de **scanf()**, utilizando las mismas opciones para los diferentes tipos de dato. `int fscanf(FILE *stream, const char *format, ...)`. de los archivos recibidos por el programa, nos damos cuenta que las dos primeras lineas siempre serán enteros, por lo cual simplemente se leen cuatro enteros indicando filas, y columnas de cada matriz:\n",
86 | "\n",
87 | "```C\n",
88 | " //previous version has conflict with the pointer movmnt.\n",
89 | " fscanf(archi,\"%i\",&rows1);\n",
90 | " fscanf(archi,\"%i\",&cols1);\n",
91 | " fscanf(archi2,\"%i\",&rows2);\n",
92 | " fscanf(archi2,\"%i\",&cols2);\n",
93 | "```\n",
94 | "\n",
95 | "Siguiendo el algoritmo iterativo anterior validamos que las columnas de la primera matriz sean iguales a las filas de la segunda matriz, de lo contrario saqueremos un mensaje de *Dimensiones incompatibles*. \n",
96 | "\n",
97 | "Uno de los temas mas delicados a la hora de realizar este ejercicio fue el de manejar la memoria de manera dinamica, el lenguaje C nos da mucho control para hacer uso de esta, mediante un mecanismo muy simple, solicitud de memoria(tipicamente la parte del *heap*) y liberacion de la misma. El ciclo es sencillo, cuando se precisa almacenar un nuevo dato, se solicita tanta memoria en **bytes** como sea necesaria, y una vez que ese dato ya no se necesita, a la memoria se devuelve para poder ser reutilizada, para solicitar memoria se utiliza la funcion **malloc** o **calloc** en mi caso dado que inicializa los datos de la memoria solicitada con valores de ceros.\n",
98 | "\n",
99 | "El siguiente snippet documenta como es posible realizar la reserva de memoria para la Matriz $A$\n",
100 | "```C\n",
101 | " float **MatA = (float **)calloc(rows1,sizeof(float*));\n",
102 | " for(i = 0; i < rows1; i++)\n",
103 | " MatA[i] = (float *)calloc(cols1 ,sizeof(float));\n",
104 | "```\n",
105 | "al ser una reserva de memoria con **calloc** es necesario validar que efectivamente hallamos reservado el espacio en memoria:\n",
106 | "```C\n",
107 | "if (!MatA || !MatB || !MatC) { \n",
108 | " printf(\"\\n%s[!]%s Falla de Reserva de Memoria\",RED,RES);\n",
109 | " exit(ENOMEM);}\n",
110 | "```\n",
111 | "\n",
112 | "Ahora que ya hemos reservado la memoria para las tres matrices es necesario leer los datos :\n",
113 | "```C\n",
114 | "while(!feof(archi)){\n",
115 | " for(i=0;iclean1.txt\n",
159 | "sed -e \"s/,/ /g\" data2.txt>clean2.txt\n",
160 | "touch r.txt\n",
161 | "echo -e \"\\x1B[33m[*]\\x1B[0m Corriendo Programa\"\n",
162 | "gcc solve.c -w -s -o solver \n",
163 | "./solver clean1.txt clean2.txt\n",
164 | "echo -e \"\\x1B[33m[*]\\x1B[0m Escribiendo Resultado a Stdout(result.txt)\"\n",
165 | "sed -e \"s/ /,/g\" r.txt>result.txt\n",
166 | "rm r.txt clean1.txt clean2.txt\n",
167 | "echo -e \"\\x1B[32m[✔]TERMINADO\\x1B[0m\"\n",
168 | "```\n",
169 | "Ahora solo basta con realizar `./run.sh` y nuestro programa esta listo con complejidad cubica.\n",
170 | "\n",
171 | "#### Referencias :\n",
172 | "- https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
173 | "- https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
174 | "[1]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
175 | "[2]: https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
176 | "[3]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm"
177 | ]
178 | }
179 | ],
180 | "metadata": {
181 | "kernelspec": {
182 | "display_name": "Python 3",
183 | "language": "python",
184 | "name": "python3"
185 | },
186 | "language_info": {
187 | "codemirror_mode": {
188 | "name": "ipython",
189 | "version": 3
190 | },
191 | "file_extension": ".py",
192 | "mimetype": "text/x-python",
193 | "name": "python",
194 | "nbconvert_exporter": "python",
195 | "pygments_lexer": "ipython3",
196 | "version": "3.5.4rc1"
197 | }
198 | },
199 | "nbformat": 4,
200 | "nbformat_minor": 2
201 | }
202 |
--------------------------------------------------------------------------------
/Lab1/readme.md:
--------------------------------------------------------------------------------
 1 | # Lab1
 2 | The goal of this lab is to exercise the use of static and dynamic memory in the C language;
 3 | gdb is used as the debugger and radare2 to inspect the binary and locate where the data lives (not included).
 4 | ## Information
 5 | This lab includes a small analysis done with Jupyter, so I recommend reading [readme.ipynb](https://github.com/h3ct0rjs/HighPerformanceComputing/blob/master/Lab1/readme.ipynb).
 6 | 
 7 | ## Running
 8 | This program has only been tested on a Debian GNU/Linux environment, although it should also work on a Windows environment.
 9 | 
10 | To run the code, simply do:
11 | ```sh
12 | user@host:/tmp/: git clone https://github.com/h3ct0rjs/HighPerformanceComputing
13 | user@host:/tmp/: cd HighPerformanceComputing/Lab1 && ./run.sh
14 | ```
15 | **Keep in mind that to test with other data the input files must be named data1.txt and data2.txt and contain the corresponding matrices; the result is printed on screen and the resulting matrix is stored in *result.txt*.**
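16 | 
17 | The expected input file format, taken from the example in `readme.ipynb` (the first two lines are the row and column counts, followed by the comma-separated rows):
18 | ```
19 | 2
20 | 2
21 | 1,2
22 | 3,4
23 | ```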
--------------------------------------------------------------------------------
/Lab1/result.txt:
--------------------------------------------------------------------------------
1 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
2 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
3 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
4 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
5 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
6 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
7 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
8 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
9 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
10 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
11 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
12 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
13 |
--------------------------------------------------------------------------------
/Lab1/run.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 | #hfjimenez@utp.edu.co
3 | #Preprocess the Txt Files, more easy manipulation in the C program
4 | echo -e "[*]\x1B[0m Limpiando Archivos\n"
5 | sed -e "s/,/ /g" data1.txt>clean1.txt
6 | sed -e "s/,/ /g" data2.txt>clean2.txt
7 | touch r.txt
8 | echo -e "\x1B[33m[*]\x1B[0m Corriendo Programa"
9 | gcc solve.c -w -s -o solver
10 | ./solver clean1.txt clean2.txt
11 | echo -e "\x1B[33m[*]\x1B[0m Escribiendo Resultado a Stdout(result.txt)"
12 | sed -e "s/ /,/g" r.txt>result.txt
13 | rm r.txt clean1.txt clean2.txt
14 | echo -e "\x1B[32m[✔]TERMINADO\x1B[0m"
15 |
--------------------------------------------------------------------------------
/Lab1/solve.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <stdlib.h>
 3 | #include <errno.h>
4 | //High Performance Computing
5 | //Lab#1: Matrix Multiplication
6 | //hfjimenez@utp.edu.co, 2017-2
7 |
8 | //Strings label :
9 | //[!]:Warning,Exit Process.
10 | //[*]:Information interesting
11 | //[Ok]: Process Completed
12 | //References:
13 | //[1]https://linux.die.net/man/3/calloc
14 |
15 | //A*B = C, representation of all the matrix
16 | float **MatA, **MatB, **MatC; //Heap MemStorage
17 | int i,j,k; //global Mem
18 |
19 | //My Function helpers
20 | void printmat(float **Mat,int r, int c);
21 | void title(void);
22 |
23 | #define version "v0.1"
24 | //First Time Using Colors Hooray
25 | #define RES "\x1B[0m"
26 | #define RED "\x1B[31m"
27 | #define GRN "\x1B[32m"
28 | #define YEL "\x1B[33m"
29 | #define BLU "\x1B[34m"
30 | #define MAG "\x1B[35m"
31 | #define CYN "\x1B[36m"
32 | #define WHT "\x1B[37m"
33 |
34 | int main(int argc, const char * argv[])
35 | {
36 | title();
37 | //File pointers,in read mode
38 | FILE *archi;
39 | FILE *archi2;
40 | FILE *result;
41 | int rows1,cols1,rows2,cols2;
42 | int sum=0;
43 | archi = fopen(argv[1],"r");
44 | archi2 = fopen(argv[2],"r");
45 | //NULL in C is ugly, NULL is equal to 0, not like in C++ <3 that is a real null value.
46 | if (archi == NULL || archi2 == NULL){
47 | printf("%s[!]%s Imposible abrir los archivos pasados como argumentos\n",RED,RES);
48 | exit(0);
49 | }
50 |
51 | /* int opts[4]; //to store the values.
52 | for(i=0; i<2; i++){
53 | fscanf(archi,"%d",&opts[i]); //read the amount of rows and cols
54 | fscanf(archi2,"%d",&opts[i+2]); //read the amount of rows and cols
55 | }
56 |
57 | //fclose(archi); //Because I was having a trouble dealing with this,
58 | //fclose(archi2); //I need to reopen the file using another bar. I try to figured it out with
59 | // with gdb and it seems that had something to do with a program that I use to
60 | // chroot process. Investigating about this.
61 | */
62 |
63 | //previous version has conflict with the pointer movmnt.
64 | fscanf(archi,"%i",&rows1);
65 | fscanf(archi,"%i",&cols1);
66 | fscanf(archi2,"%i",&rows2);
67 | fscanf(archi2,"%i",&cols2);
68 |
69 | printf("%s**Dimensiones Matriz**%s",CYN, RES);
70 | printf("\n%s[*]%s Matriz **A** Filas:%d Columnas:%d",YEL,RES,rows1,cols1);
71 | printf("\n%s[*]%s Matriz **B** Filas:%d Columnas:%d",YEL,RES,rows2,cols2);
72 |
73 | if(cols1!=rows2){
74 | printf("\n%s[!]%s No Es Posible realizar la Multiplicacion entre Matrices\n ",RED,RES);
75 | printf("%s[NOTA:!]%s Dimensiones Incompatibles,saliendo...\n",RED,RES );
76 | exit(0);
77 | }
78 |
79 | //Allocate memory for the Matrices, with initilize values to cero.
80 | //avoid the for loops to init the matriz values in 0.see[1] for more info
81 |
82 | float **MatA = (float **)calloc(rows1,sizeof(float*));
83 | for(i = 0; i < rows1; i++)
84 | MatA[i] = (float *)calloc(cols1 ,sizeof(float));
85 |
86 | float **MatB = (float **)calloc(rows2, sizeof(float*));
87 | for(i = 0; i < rows2; i++)
88 | MatB[i] = (float *)calloc(cols2, sizeof(float));
89 |
90 | float **MatC = (float **)calloc(rows1, sizeof(float*));
91 | for(i = 0; i < rows1; i++)
92 | MatC[i] = (float *)calloc(cols2, sizeof(float));
93 |
94 | if (!MatA || !MatB || !MatC) {
95 | printf("\n%s[!]%s Falla de Reserva de Memoria",RED,RES);
96 | exit(ENOMEM);}
97 |
98 | printf("\n%s[*]%s Leyendo Valores de la Matriz A en el archivos\n",YEL, RES);
99 | while(!feof(archi)){
 100 |      for(i=0;i<rows1;i++){
--------------------------------------------------------------------------------
/Lab2/run.sh:
--------------------------------------------------------------------------------
 6 | sed -e "s/,/ /g" datasets/data1500.txt>clean1.txt
7 | sed -e "s/,/ /g" datasets/data2500.txt>clean2.txt
8 | touch r.txt
9 | echo -e "\x1B[33m[*]\x1B[0m Corriendo Programa"
10 | gcc solve.c -w -s -o solver -fopenmp
11 | ./solver clean1.txt clean2.txt
12 | echo -e "\x1B[33m[*]\x1B[0m Escribiendo Resultado a Stdout(result.txt)"
13 | sed -e "s/ /,/g" r.txt>result.txt
14 | rm r.txt clean1.txt clean2.txt solver
15 | echo -e "\x1B[32m[✔]TERMINADO\x1B[0m"
16 |
--------------------------------------------------------------------------------
/Lab2/samplesomp/sample1.c:
--------------------------------------------------------------------------------
1 | /******************************************************************************
2 | * FILE: omp_workshare1.c
3 | * DESCRIPTION:
4 | * OpenMP Example - Loop Work-sharing - C/C++ Version
5 | * In this example, the iterations of a loop are scheduled dynamically
6 | * across the team of threads. A thread will perform CHUNK iterations
7 | * at a time before being scheduled for the next CHUNK of work.
8 | * AUTHOR: Blaise Barney 5/99
9 | * LAST REVISED: 04/06/05
10 | ******************************************************************************/
11 | #include <omp.h>
12 | #include <stdio.h>
13 | #include <stdlib.h>
14 |
15 | #define CHUNKSIZE 10
16 | #define N 100
17 |
18 | int main (int argc, char *argv[])
19 | {
20 | int nthreads, tid, i, chunk;
21 | float a[N], b[N], c[N];
22 |
23 | /* Some initializations */
24 | for (i=0; i < N; i++)
25 | a[i] = b[i] = i * 1.0;
26 | chunk = CHUNKSIZE;
27 |
28 | #pragma omp parallel shared(a,b,c,nthreads,chunk) private(i,tid)
29 | {
30 | tid = omp_get_thread_num();
31 | if (tid == 0)
32 | {
33 | nthreads = omp_get_num_threads();
34 | printf("Number of threads = %d\n", nthreads);
35 | }
36 | printf("Thread %d starting...\n",tid);
37 |
38 | #pragma omp for schedule(dynamic,chunk)
39 |   for (i=0; i<N; i++)
--------------------------------------------------------------------------------
/Lab2/samplesomp/sample4.c:
--------------------------------------------------------------------------------
11 | #include <omp.h>
12 | #include <stdio.h>
13 | #include <stdlib.h>
14 |
15 | #define NRA 62 /* number of rows in matrix A */
16 | #define NCA 15 /* number of columns in matrix A */
17 | #define NCB 7 /* number of columns in matrix B */
18 |
19 | int main (int argc, char *argv[]){
20 | int tid, nthreads, i, j, k, chunk;
21 | double a[NRA][NCA], /* matrix A to be multiplied */
22 | b[NCA][NCB], /* matrix B to be multiplied */
23 | c[NRA][NCB]; /* result matrix C */
24 |
25 | chunk = 10; /* set loop iteration chunk size */
26 |
27 | /*** Spawn a parallel region explicitly scoping all variables ***/
28 | #pragma omp parallel shared(a,b,c,nthreads,chunk) private(tid,i,j,k)
29 | {
30 | tid = omp_get_thread_num();
31 | if (tid == 0)
32 | {
33 | nthreads = omp_get_num_threads();
34 | printf("Starting matrix multiple example with %d threads\n",nthreads);
35 | printf("Initializing matrices...\n");
36 | }
37 | /*** Initialize matrices ***/
38 | #pragma omp for schedule (static, chunk)
39 |   for (i=0; i<NRA; i++)
--------------------------------------------------------------------------------
/Lab2/solve.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <stdlib.h>
 3 | #include <errno.h>
 4 | #include <omp.h>
 5 | #include <time.h>
6 | //High Performance Computing
7 | //Lab#1: Matrix Multiplication
8 | //hfjimenez@utp.edu.co, 2017-2
9 |
10 | //Strings label :
11 | //[!]:Warning,Exit Process.
12 | //[*]:Information interesting
13 | //[Ok]: Process Completed
14 | //References:
15 | //[1]https://linux.die.net/man/3/calloc
16 | //A*B = C, representation of all the matrix
17 | float **MatA, **MatB, **MatC; //Heap MemStorage
18 | int i,j,k; //global Mem
19 | int tid, nthreads;
20 | int chunk = 10; /* set loop iteration chunk size */
21 | int tmp=0;
22 |
23 | //My Function helpers
24 | void printmat(float **Mat,int r, int c);
25 | void title(void);
26 |
27 | #define version "v0.1"
28 | //First Time Using Colors Hooray
29 | #define RES "\x1B[0m"
30 | #define RED "\x1B[31m"
31 | #define GRN "\x1B[32m"
32 | #define YEL "\x1B[33m"
33 | #define BLU "\x1B[34m"
34 | #define MAG "\x1B[35m"
35 | #define CYN "\x1B[36m"
36 | #define WHT "\x1B[37m"
37 |
38 | int main(int argc, const char * argv[])
39 | {
40 |
41 | float t_1; // Execution time measures
42 | clock_t c_1, c_2;
43 |
44 | title();
45 | //File pointers,in read mode
46 | FILE *archi;
47 | FILE *archi2;
48 | FILE *result;
49 | int rows1,cols1,rows2,cols2;
50 | int sum=0;
51 |
52 | archi = fopen(argv[1],"r");
53 | archi2 = fopen(argv[2],"r");
54 | //NULL in C is ugly, NULL is equal to 0, not like in C++ <3 that is a real null value.
55 | if (archi == NULL || archi2 == NULL){
56 | printf("%s[!]%s Imposible abrir los archivos pasados como argumentos\n",RED,RES);
57 | exit(0);
58 | }
59 | fscanf(archi,"%i",&rows1);
60 | fscanf(archi,"%i",&cols1);
61 | fscanf(archi2,"%i",&rows2);
62 | fscanf(archi2,"%i",&cols2);
63 | printf("%s**Dimensiones Matriz**%s",CYN, RES);
64 | printf("\n%s[*]%s Matriz **A** Filas:%d Columnas:%d",YEL,RES,rows1,cols1);
65 | printf("\n%s[*]%s Matriz **B** Filas:%d Columnas:%d",YEL,RES,rows2,cols2);
66 |
67 | if(cols1!=rows2){
68 | printf("\n%s[!]%s No Es Posible realizar la Multiplicacion entre Matrices\n ",RED,RES);
69 | printf("%s[NOTA:!]%s Dimensiones Incompatibles,saliendo...\n",RED,RES );
70 | exit(0);
71 | }
72 |
73 | //Allocate memory for the Matrices, with initilize values to cero.
74 | //avoid the for loops to init the matriz values in 0.see[1] for more info
75 |
76 | float **MatA = (float **)calloc(rows1,sizeof(float*));
77 | float **MatB = (float **)calloc(rows2, sizeof(float*));
78 | float **MatC = (float **)calloc(rows1, sizeof(float*));
79 | //double **M = (double **)malloc(rows1*sizeof(double*));
80 |
81 |
82 |
83 | /*** Initialize matrices with ceros ***/
84 | for(i = 0; i < rows1; i++)
85 | MatA[i] = (float *)calloc(cols1 ,sizeof(float));
86 |
87 |
88 | for(i = 0; i < rows2; i++)
89 | MatB[i] = (float *)calloc(cols2, sizeof(float));
90 |
91 | for(i = 0; i < rows1; i++)
92 | MatC[i] = (float *)calloc(cols2, sizeof(float));
93 |
94 | tid = omp_get_thread_num();
95 | nthreads = omp_get_num_threads();
96 | printf("\nMultiplicando las matrices con # %d de threads\n",nthreads);
97 |
98 | if (!MatA || !MatB || !MatC) {
99 | printf("\n%s[!]%s Falla de Reserva de Memoria",RED,RES);
100 | exit(ENOMEM);}
101 |
102 | printf("\n%s[*]%s Leyendo Valores de la Matriz A en el archivo\n",YEL, RES);
103 | while(!feof(archi)){
 104 |          for(i=0;i<rows1;i++){
--------------------------------------------------------------------------------
/Lab3.1CudaAddVector/vecadd.cu:
--------------------------------------------------------------------------------
 3 | #include <stdio.h>
 4 | #include <stdlib.h>
 5 | #include <math.h>
6 |
7 | // CUDA kernel. Each thread takes care of one element of c
8 | __global__ void vecAdd(double *a, double *b, double *c, int n)
9 | {
10 | // Get our global thread ID
11 | int id = blockIdx.x*blockDim.x+threadIdx.x;
12 |
13 | // Make sure we do not go out of bounds
14 | if (id < n){
15 | c[id] = a[id] + b[id];
16 |         printf("En Thread id: %d Oper : %f + %f : %f\n", id, a[id], b[id], c[id]);
17 | }
18 |
19 | }
20 |
21 | int main( int argc, char* argv[] )
22 | {
23 | // Size of vectors
24 | int n = 100000;
25 |
26 | // Host input vectors
27 | double *h_a;
28 | double *h_b;
29 | //Host output vector
30 | double *h_c;
31 | // Device input vectors
32 | double *d_a;
33 | double *d_b;
34 | //Device output vector
35 | double *d_c;
36 |
37 | // Size, in bytes, of each vector
38 | size_t bytes = n*sizeof(double);
39 |
40 | // Allocate memory for each vector on host
41 | h_a = (double*)malloc(bytes);
42 | h_b = (double*)malloc(bytes);
43 | h_c = (double*)malloc(bytes);
44 |
45 | // Allocate memory for each vector on GPU
46 | cudaMalloc(&d_a, bytes);
47 | cudaMalloc(&d_b, bytes);
48 | cudaMalloc(&d_c, bytes);
49 |
50 | int i;
51 | // Initialize vectors on host
52 | for( i = 0; i < n; i++ ) {
53 | h_a[i] = sin(i)*sin(i);
54 | h_b[i] = cos(i)*cos(i);
55 | }
56 |
57 | // Copy host vectors to device
58 | cudaMemcpy( d_a, h_a, bytes, cudaMemcpyHostToDevice);
59 | cudaMemcpy( d_b, h_b, bytes, cudaMemcpyHostToDevice);
60 |
61 | int blockSize, gridSize;
62 |
63 | // Number of threads in each thread block
64 | blockSize = 1024;
65 |
66 | // Number of thread blocks in grid
67 | gridSize = (int)ceil((float)n/blockSize);
68 |
69 | // Execute the kernel
70 | vecAdd<<>>(d_a, d_b, d_c, n);
71 |
72 | // Copy array back to host
73 | cudaMemcpy( h_c, d_c, bytes, cudaMemcpyDeviceToHost );
74 |
75 | // Sum up vector c and print result divided by n, this should equal 1 within error
76 | double sum = 0;
77 |     for(i=0; i<n; i++)
--------------------------------------------------------------------------------
/Lab4CudaMM/matmult.cu:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <time.h>
3 | #define N 1024
4 |
5 | __global__ void Matriz_GPU_Mult(int *a, int *b, int *c) {
6 | int k, sum = 0;
7 | int i = blockIdx.x * blockDim.x + threadIdx.x;
8 | int j = blockIdx.y * blockDim.y + threadIdx.y;
9 | if (i < N && j < N) {
10 | for (k = 0; k < N; k++) {
11 | sum += a[j * N + k] * b[k * N + i];
12 | }
13 | c[j * N + i] = sum;
14 | }
15 | }
16 |
17 | int main() {
18 | double timeGPU;
19 |     static int h_A[N][N], h_B[N][N], h_C[N][N];   // static: three N*N int arrays are too large for the stack
20 |     int *d_a, *d_b, *d_c;
21 |     size_t size = N * N * sizeof(int);
22 |     cudaMalloc((void **) &d_a, size);
23 |     cudaMalloc((void **) &d_b, size);
24 |     cudaMalloc((void **) &d_c, size);
25 | 
26 |     cudaMemcpy(d_a, h_A, size, cudaMemcpyHostToDevice);
27 |     cudaMemcpy(d_b, h_B, size, cudaMemcpyHostToDevice);
28 |
29 | //int threadsPerBlock(16);
30 | //int numBlocks(N/threadsPerBlock);
31 | dim3 threadsPerBlock(32, 32);
32 | dim3 numBlocks(N/threadsPerBlock.x, N/threadsPerBlock.y);
33 | clock_t startGPU = clock();
34 | Matriz_GPU_Mult<<>>(d_a, d_b, d_c);
35 | timeGPU = ((double)(clock() - startGPU))/CLOCKS_PER_SEC;
36 | cudaMemcpy(C, d_c, size, cudaMemcpyDeviceToHost);
37 | cudaFree(d_a);
38 | cudaFree(d_b);
39 | cudaFree(d_c);
40 |     // execution time
41 | printf("tiempo GPU = %f s",timeGPU);
42 | return 0;
43 | }
44 |
--------------------------------------------------------------------------------
/Lab4CudaMM/readme.md:
--------------------------------------------------------------------------------
1 | # Matrix Multiplication Using CUDA
2 |
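3 | A minimal sketch of how to build and run `matmult.cu` by hand, assuming the CUDA toolkit's `nvcc` is available (the output name is illustrative; `matmul.sh` in this folder is the script provided with the lab):
4 | ```sh
5 | nvcc matmult.cu -o matmult
6 | ./matmult
7 | ```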
--------------------------------------------------------------------------------
/Lab5Numba/mc.py:
--------------------------------------------------------------------------------
1 | import numpy as np # numpy namespace
2 | from timeit import default_timer as timer # for timing
3 | from matplotlib import pyplot # for plotting
4 | import math
5 | from numbapro import vectorize
6 |
7 | @vectorize(['float64(float64, float64, float64, float64, float64)'], target='gpu')
8 | def step(price, dt, c0, c1, noise):
9 | return price * math.exp(c0 * dt + c1 * noise)
10 |
11 | # Stock information parameters
12 | StockPrice = 20.83
13 | StrikePrice = 21.50
14 | Volatility = 0.021
15 | InterestRate = 0.20
16 | Maturity = 5. / 12.
17 |
18 | # monte-carlo simulation parameters
19 | NumPath = 3000000
20 | NumStep = 100
21 |
22 | # plotting
23 | MAX_PATH_IN_PLOT = 50
24 |
25 | def driver(pricer, do_plot=False):
26 | paths = np.zeros((NumPath, NumStep + 1), order='F')
27 | paths[:, 0] = StockPrice
28 | DT = Maturity / NumStep
29 |
30 | ts = timer()
31 | pricer(paths, DT, InterestRate, Volatility)
32 | te = timer()
33 | elapsed = te - ts
34 |
35 | ST = paths[:, -1]
36 | PaidOff = np.maximum(paths[:, -1] - StrikePrice, 0)
37 | print 'Result'
38 | fmt = '%20s: %s'
39 | print fmt % ('stock price', np.mean(ST))
40 | print fmt % ('standard error', np.std(ST) / np.sqrt(NumPath))
41 | print fmt % ('paid off', np.mean(PaidOff))
42 | optionprice = np.mean(PaidOff) * np.exp(-InterestRate * Maturity)
43 | print fmt % ('option price', optionprice)
44 |
45 | print 'Performance'
46 | NumCompute = NumPath * NumStep
47 | print fmt % ('Mstep/second', '%.2f' % (NumCompute / elapsed / 1e6))
48 | print fmt % ('time elapsed', '%.3fs' % (te - ts))
49 |
50 | if do_plot:
51 | pathct = min(NumPath, MAX_PATH_IN_PLOT)
52 | for i in xrange(pathct):
53 | pyplot.plot(paths[i])
54 | print 'Plotting %d/%d paths' % (pathct, NumPath)
55 | pyplot.show()
56 |
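57 | # Note: driver() expects a `pricer` callable (for example, one built on the
58 | # vectorized step() above) and is not invoked anywhere in this file.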
--------------------------------------------------------------------------------
/Notes/HPC-Class1.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## 08/11/2017 OpenMP \n",
8 | "\n",
9 | "Modelo de memoria compartida donde diferentes nucleos hacen uso de la memoria compartida, paralelismo esta basado en hilos (*threads*) la idea es aprovechar usar la tendencia multinucleo para hacer procesamiento y optimizar al maximo. Se plantea que el valor optimo de hilos asignado es igual al numero de procesadores o soportados por el procesador. \n",
10 | "\n",
11 | "* Se tiene control total de lo que se hace\n",
12 | "* Debe ser explicito donde se realiza el paralelismo\n",
13 | "\n",
14 | "** Modelo de Fork Join** \n",
15 | "\n",
16 | "
\n",
17 | "\n",
18 | "\n",
19 | "### Recursos Electronicos\n",
20 | "\n",
21 | "[https://computing.llnl.gov/tutorials/openMP/](https://computing.llnl.gov/tutorials/openMP/)\n",
22 | "\n",
23 | "***Revisar : Threads en C/C++, Modelo Fork-Join ***"
24 | ]
25 | }
26 | ],
27 | "metadata": {
28 | "kernelspec": {
29 | "display_name": "Python 3",
30 | "language": "python",
31 | "name": "python3"
32 | },
33 | "language_info": {
34 | "codemirror_mode": {
35 | "name": "ipython",
36 | "version": 3
37 | },
38 | "file_extension": ".py",
39 | "mimetype": "text/x-python",
40 | "name": "python",
41 | "nbconvert_exporter": "python",
42 | "pygments_lexer": "ipython3",
43 | "version": "3.5.4rc1"
44 | }
45 | },
46 | "nbformat": 4,
47 | "nbformat_minor": 2
48 | }
49 |
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class1-openMP.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class1-openMP.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class2-Slurm.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class2-Slurm.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class3openMP.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class3openMP.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class4MPI.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class4MPI.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class5introductionDataParallelismCUDAC.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class5introductionDataParallelismCUDAC.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class6vecAddition.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class6vecAddition.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class8CUDAMemories.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class8CUDAMemories.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/NyuPowerWall.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/NyuPowerWall.pdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # HighPerformanceComputing
2 | High Performance Computing class taken at U.T.P., 2017.
3 |
4 | For each of the labs check:
5 | * readme.ipynb
6 | * solve.xx
7 | * datasets
8 |
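 9 | As an example, Lab1 can be cloned and run as described in `Lab1/readme.md`; the other labs ship similar helper scripts (`run.sh`, `vecadd.sh`, `matmul.sh`):
10 | ```sh
11 | git clone https://github.com/h3ct0rjs/HighPerformanceComputing
12 | cd HighPerformanceComputing/Lab1 && ./run.sh
13 | ```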
--------------------------------------------------------------------------------