├── .gitignore
├── BookRef
│   ├── GPU-apps-catalog-mar2015.pdf
│   ├── OpenMP
│   │   └── IntrotoOpenMP.pdf
│   └── Programming Massively Parallel Processors.pdf
├── LICENSE
├── Lab1
│   ├── .ipynb_checkpoints
│   │   ├── Analisis-checkpoint.ipynb
│   │   └── readme-checkpoint.ipynb
│   ├── clang
│   │   ├── allomen.c
│   │   ├── colorinunix.c
│   │   ├── matrixr
│   │   ├── matrixtest.c
│   │   ├── matrixtestcalloc.c
│   │   ├── test1
│   │   ├── test2
│   │   ├── testcolor
│   │   └── usingsizeof.c
│   ├── data1.txt
│   ├── data2.txt
│   ├── readme.ipynb
│   ├── readme.md
│   ├── result.txt
│   ├── run.sh
│   └── solve.c
├── Lab2
│   ├── datasets
│   │   ├── data1.txt
│   │   ├── data1500.txt
│   │   ├── data2.txt
│   │   └── data2500.txt
│   ├── logtimes.txt
│   ├── readme.ipynb
│   ├── readme.md
│   ├── result.txt
│   ├── run.sh
│   ├── samplesomp
│   │   ├── sample1.c
│   │   └── sample4.c
│   └── solve.c
├── Lab3.1CudaAddVector
│   ├── vecadd.cu
│   └── vecadd.sh
├── Lab3NvidiaCudaIntro
│   ├── Accelerated Computing.ipynb
│   ├── gpu_computing.zip
│   └── readme.md
├── Lab4CudaMM
│   ├── matmul.sh
│   ├── matmult.cu
│   └── readme.md
├── Lab5Numba
│   ├── GPULibrariesBasics.ipynb
│   └── mc.py
├── Notes
│   ├── HPC-Class1.ipynb
│   └── OficialSlides
│       ├── Class1-openMP.pdf
│       ├── Class2-Slurm.pdf
│       ├── Class3openMP.pdf
│       ├── Class4MPI.pdf
│       ├── Class5introductionDataParallelismCUDAC.pdf
│       ├── Class6vecAddition.pdf
│       ├── Class8CUDAMemories.pdf
│       └── NyuPowerWall.pdf
└── README.md
/.gitignore:
--------------------------------------------------------------------------------
1 | # Prerequisites
2 | *.d
3 |
4 | # Compiled Object files
5 | *.slo
6 | *.lo
7 | *.o
8 | *.obj
9 |
10 | # Precompiled Headers
11 | *.gch
12 | *.pch
13 |
14 | # Compiled Dynamic libraries
15 | *.so
16 | *.dylib
17 | *.dll
18 |
19 | # Fortran module files
20 | *.mod
21 | *.smod
22 |
23 | # Compiled Static libraries
24 | *.lai
25 | *.la
26 | *.a
27 | *.lib
28 |
29 | # Executables
30 | *.exe
31 | *.out
32 | *.app
33 | */.ipynb_checkpoints
34 | .ipynb_checkpoints
35 |
36 |
--------------------------------------------------------------------------------
/BookRef/GPU-apps-catalog-mar2015.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/BookRef/GPU-apps-catalog-mar2015.pdf
--------------------------------------------------------------------------------
/BookRef/OpenMP/IntrotoOpenMP.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/BookRef/OpenMP/IntrotoOpenMP.pdf
--------------------------------------------------------------------------------
/BookRef/Programming Massively Parallel Processors.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/BookRef/Programming Massively Parallel Processors.pdf
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2017 Hector F. Jimenez.
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/Lab1/.ipynb_checkpoints/Analisis-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Laboratorio 1, High Performance Computing\n",
8 | "Con el fin de realizar un repaso de las habilidades con el lenguaje ***C*** se debe construir un programa que permita multiplicar 2 matrices, teniendo en cuenta las siguientes condiciones :\n",
9 | "\n",
10 | "* Las dos Matrices se leeran de dos archivos de texto separados por comas. \n",
11 | "* los dos primeras lineas contienen la cantidad de filas y columnas de la matriz respectiva. \n",
12 | "\n",
13 | "*Ejemplo:* \n",
14 | "```\n",
15 | "2\n",
16 | "2\n",
17 | "1,2\n",
18 | "3,4\n",
19 | "```\n",
20 | "El resulta de la multiplicación de las dos matrices debe ser enviado via stdout aun archivo de texto,\n",
21 | "utilizar ***Taller1HPC*** es el asunto que debe tener el email enviado al docente John.\n",
22 | "\n",
23 | "Como tipo de dato se utiliza precisión sencilla ***float***\n",
24 | "# Solución\n",
25 | "\n",
26 | "## Multiplicación de Matrices\n",
27 | "dadas dos matrices $A$,$B$ de la forma:\n",
28 | "$$A = \\begin{pmatrix}\n",
29 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
30 | " \\vdots & \\ddots & \\vdots \\\\\n",
31 | " a_{m 1} & \\cdots & a_{m n}\n",
32 | " \\end{pmatrix}, B = \\begin{pmatrix}\n",
33 | " b_{1 1} & \\cdots & b_{1 n} \\\\\n",
34 | " \\vdots & \\ddots & \\vdots \\\\\n",
35 | " b_{m 1} & \\cdots & b_{m n}\n",
36 | " \\end{pmatrix}\n",
37 | " $$\n",
38 | "\n",
39 | "Escritas en los textos como $A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, donde $m,n,p$ indican las filas y columnas de cada matriz, el producto de $A\\cdot B$ es:\n",
40 | "$$\n",
41 | "C = AB_{}^{}, \\\\= \\begin{pmatrix}\n",
42 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
43 | " \\vdots & \\ddots & \\vdots \\\\\n",
44 | " a_{m 1} & \\cdots & a_{m n}\n",
45 | " \\end{pmatrix} \\cdot \\begin{pmatrix}\n",
46 | " b_{1 1} & \\cdots & b_{1 p} \\\\\n",
47 | " \\vdots & \\ddots & \\vdots \\\\\n",
48 | " b_{n 1} & \\cdots & b_{n p}\n",
49 | " \\end{pmatrix} \\\\ \\begin{pmatrix}\n",
50 | " a_{11}b_{11}+ \\cdots +a_{1n}b_{n1} & \\cdots & a_{11}b_{1p}+ \\cdots +a_{1n}b_{np} \\\\\n",
51 | " \\vdots & \\ddots & \\vdots \\\\\n",
52 | " a_{m1}b_{11}+ \\cdots +a_{mn}b_{n1} & \\cdots & a_{m1}b_{1p}+ \\cdots +a_{mn}b_{np}\n",
53 | " \\end{pmatrix}\n",
54 | "$$\n",
55 | "Que no es mas que la sumatoria de multiplicar la fila por la columna para cada elemento de la matriz resulta:\n",
56 | "$$\n",
57 | "c_{ij} = \\sum_{r=1}^n a_{ir}b_{rj}$$\n",
58 | "\n",
59 | "\n",
60 | "*Nota[!]:La cantindad de Columnas debe ser igual a la cantidad de filas de la segunda matriz,$A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, tendra como resultado una Matriz $B:=(b_{i j})_{m \\times p}$*.\n",
61 | "\n",
62 | "De las observaciones anteriores podemos decir que para resolver la multiplicación de las matrices $A\\cdot B$ es necesario realizar $m*n*p$ multiplicaciones, Seria posible utilizar una gran cantidad de tecnicas como se describe en [1][1] y [2][2], pero por tiempo utilizare la version interactiva tiene un costo $Θ(n^{3})$ como se muestra en [3][3]:\n",
63 | "\n",
64 | "
\n",
65 | "\n",
66 | "## Codigo implementado\n",
67 | "Para resolver este problema he hecho uso de los Makefiles para hacer parseo de los datos que se me entrega, \n",
68 | "la ejecucion de este programa obligatoriamente estable que los archivos de entrada tengan los nombre data1.txt, data2.txt, siendo *data1.txt* el archivo que contiene los datos reales de la Matriz $A$, y lo mismo para $B$, \n",
69 | "luego lo unico que sera hacaer es:\n",
70 | "**make run**\n",
71 | "\n",
72 | "```c\n",
73 | "#include \n",
74 | "#include \n",
75 | "//High Performance Computing\n",
76 | "//Lab#1: Matrix Multiplication \n",
77 | "//hfjimenez@utp.edu.co, 2017-2\n",
78 | "int main(int argc, const char * argv[]){\n",
79 | "\tFILE *archi;\n",
80 | "\tFILE *archi2;\n",
81 | "\tarchi = fopen(argv[1], \"r\");\t\n",
82 | "\tarchi2 = fopen(argv[2], \"r\");\n",
83 | "\n",
84 | "\t//the first two number of each datamatrix file opts=[rmat1,colmat1,rmat2,colmat2]\n",
85 | "\tint opts[4];\n",
86 | "\t//NULL in C is ugly, NULL is equal to 0, not like in C++ <3 that is a real null value.\n",
87 | "\tif (archi == NULL || archi2 == NULL){\t\t\n",
88 | "\t printf(\"Imposible abrir los archivos pasados como argumentos\");\n",
89 | "\t exit(0);\n",
90 | "\t \t}\n",
91 | "\t //size of the matrix is fixed we Just need to get the number of rows and cols\n",
92 | "\t // of each matrix \n",
93 | "\tfor (i = 0; i < 2; i++){\n",
94 | "\t fscanf(archi,\"%d\",&opts[i]);\n",
95 | "\t fscanf(archi2,\"%d\",&opts[i+2]);\n",
96 | "\t}\n",
97 | "\n",
98 | "\t//printf(\"%d %d\\n\",opts[1],opts[2]);\n",
99 | "\tif(opts[1]!=opts[2]){\n",
100 | "\t\tprintf(\"\\n[!] No Es Posible realizar la Multiplicacion entre Matrices\\n \");\n",
101 | "\t exit(0);\n",
102 | "\t}\n",
103 | "\tint i=0;\n",
104 | "\tfor(i=0;i<4;i++){\n",
105 | "\t\tprintf(\"%d\",opts[i] );\n",
106 | "\t}\n",
107 | "\n",
108 | "\tfloat A[(opts[0]*opts[1])+2];\n",
109 | "\tfloat B[(opts[2]*opts[3])+2];\n",
110 | "\n",
111 | "\tfclose(archi);\n",
112 | "\tfclose(archi2);\n",
113 | "\treturn 0;\n",
114 | "} \n",
115 | "\n",
116 | "\n",
117 | "\n",
118 | "```\n",
119 | "#### Referencias :\n",
120 | "- https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
121 | "- https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
122 | "[1]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
123 | "[2]: https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
124 | "[3]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm"
125 | ]
126 | }
127 | ],
128 | "metadata": {
129 | "kernelspec": {
130 | "display_name": "Python 3",
131 | "language": "python",
132 | "name": "python3"
133 | },
134 | "language_info": {
135 | "codemirror_mode": {
136 | "name": "ipython",
137 | "version": 3
138 | },
139 | "file_extension": ".py",
140 | "mimetype": "text/x-python",
141 | "name": "python",
142 | "nbconvert_exporter": "python",
143 | "pygments_lexer": "ipython3",
144 | "version": "3.5.4rc1"
145 | }
146 | },
147 | "nbformat": 4,
148 | "nbformat_minor": 2
149 | }
150 |
--------------------------------------------------------------------------------
/Lab1/.ipynb_checkpoints/readme-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Laboratorio 1, High Performance Computing\n",
8 | "Con el fin de realizar un repaso de las habilidades con el lenguaje ***C*** se debe construir un programa que permita multiplicar 2 matrices, teniendo en cuenta las siguientes condiciones :\n",
9 | "\n",
10 | "* Las dos Matrices se leeran de dos archivos de texto separados por comas. \n",
11 | "* los dos primeras lineas contienen la cantidad de filas y columnas de la matriz respectiva. \n",
12 | "*Ejemplo:* \n",
13 | "```\n",
14 | "2\n",
15 | "2\n",
16 | "1,2\n",
17 | "3,4\n",
18 | "```\n",
19 | "El resulta de la multiplicación de las dos matrices debe ser enviado via stdout aun archivo de texto,\n",
20 | "utilizar ***Taller1HPC*** es el asunto que debe tener el email enviado al docente John.\n",
21 | "\n",
22 | "Como tipo de dato se utiliza precisión sencilla ***float***\n",
23 | "# Solución\n",
24 | "\n",
25 | "## Multiplicación de Matrices\n",
26 | "dadas dos matrices $A$,$B$ de la forma:\n",
27 | "$$A = \\begin{pmatrix}\n",
28 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
29 | " \\vdots & \\ddots & \\vdots \\\\\n",
30 | " a_{m 1} & \\cdots & a_{m n}\n",
31 | " \\end{pmatrix}, B = \\begin{pmatrix}\n",
32 | " b_{1 1} & \\cdots & b_{1 n} \\\\\n",
33 | " \\vdots & \\ddots & \\vdots \\\\\n",
34 | " b_{m 1} & \\cdots & b_{m n}\n",
35 | " \\end{pmatrix}\n",
36 | " $$\n",
37 | "\n",
38 | "Escritas en los textos como $A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, donde $m,n,p$ indican las filas y columnas de cada matriz, el producto de $A\\cdot B$ es:\n",
39 | "$C = AB_{}^{}$.\n",
40 | "\n",
41 | "$$ \\begin{pmatrix}\n",
42 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
43 | " \\vdots & \\ddots & \\vdots \\\\\n",
44 | " a_{m 1} & \\cdots & a_{m n}\n",
45 | " \\end{pmatrix} \\cdot \\begin{pmatrix}\n",
46 | " b_{1 1} & \\cdots & b_{1 p} \\\\\n",
47 | " \\vdots & \\ddots & \\vdots \\\\\n",
48 | " b_{n 1} & \\cdots & b_{n p}\n",
49 | " \\end{pmatrix}$$\n",
50 | " \n",
51 | " $$\\begin{pmatrix}\n",
52 | " a_{11}b_{11}+ \\cdots +a_{1n}b_{n1} & \\cdots & a_{11}b_{1p}+ \\cdots +a_{1n}b_{np} \\\\\n",
53 | " \\vdots & \\ddots & \\vdots \\\\\n",
54 | " a_{m1}b_{11}+ \\cdots +a_{mn}b_{n1} & \\cdots & a_{m1}b_{1p}+ \\cdots +a_{mn}b_{np}\n",
55 | " \\end{pmatrix}\n",
56 | "$$\n",
57 | "\n",
58 | "Que no es mas que la sumatoria de multiplicar la fila por la columna para cada elemento de la matriz resulta:\n",
59 | "$$c_{ij} = \\sum_{r=1}^n a_{ir}b_{rj}$$\n",
60 | "\n",
61 | "\n",
62 | "*Nota[!]:La cantindad de Columnas debe ser igual a la cantidad de filas de la segunda matriz,$A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, tendra como resultado una Matriz $B:=(b_{i j})_{m \\times p}$*.\n",
63 | "\n",
64 | "De las observaciones anteriores podemos decir que para resolver la multiplicación de las matrices $A\\cdot B$ es necesario realizar $m*n*p$ multiplicaciones, Seria posible utilizar una gran cantidad de tecnicas como se describe en [1][1] y [2][2], pero por tiempo utilizare la version interactiva tiene un costo $Θ(n^{3})$ como se muestra en [3][3]:\n",
65 | "\n",
66 | "
\n",
67 | "\n",
68 | "## Codigo implementado\n",
69 | "Para resolver este problema he hecho uso de los Makefiles para hacer parseo de los datos que se me entrega, \n",
70 | "la ejecucion de este programa obligatoriamente establece que los archivos de entrada tengan los nombre ***data1.txt***, ***data2.txt***, siendo *data1.txt* el archivo que contiene los datos reales de la Matriz $A$, y lo mismo para $B$, \n",
71 | "luego lo unico que sera hacaer es:\n",
72 | "**make run**\n",
73 | "\n",
74 | "### Lectura de Archivo\n",
75 | "Para lograr la lectura de los archivo en lenguaje **C** utilizamos las funciones fopen,fwrite para la apertura del archivo, y el almacenamiento de este, el siguiente fragmento de codigo en C hace uso de fopen para abrir el archivo:\n",
76 | "```c\n",
77 | "#include \n",
78 | "int main(){\n",
79 | " FILE *archivo;\n",
80 | " archivo = fopen(\"data1.txt\",\"r\");\n",
81 | " fclose(fclose);\n",
82 | " return 0;\n",
83 | " }\n",
84 | "```\n",
85 | "Con el ejemplo anterior tenemos la posibilidad de acceder a los datos del archivo, ahora para la lectura hacemos uso de la funcion **fscanf()** que tiene los mismos parametros de **scanf()**, utilizando las mismas opciones para los diferentes tipos de dato. `int fscanf(FILE *stream, const char *format, ...)`. de los archivos recibidos por el programa, nos damos cuenta que las dos primeras lineas siempre serán enteros, por lo cual simplemente se leen cuatro enteros indicando filas, y columnas de cada matriz:\n",
86 | "\n",
87 | "```C\n",
88 | " //previous version has conflict with the pointer movmnt.\n",
89 | " fscanf(archi,\"%i\",&rows1);\n",
90 | " fscanf(archi,\"%i\",&cols1);\n",
91 | " fscanf(archi2,\"%i\",&rows2);\n",
92 | " fscanf(archi2,\"%i\",&cols2);\n",
93 | "```\n",
94 | "\n",
95 | "Siguiendo el algoritmo iterativo anterior validamos que las columnas de la primera matriz sean iguales a las filas de la segunda matriz, de lo contrario saqueremos un mensaje de *Dimensiones incompatibles*. \n",
96 | "\n",
97 | "Uno de los temas mas delicados a la hora de realizar este ejercicio fue el de manejar la memoria de manera dinamica, el lenguaje C nos da mucho control para hacer uso de esta, mediante un mecanismo muy simple, solicitud de memoria(tipicamente la parte del *heap*) y liberacion de la misma. El ciclo es sencillo, cuando se precisa almacenar un nuevo dato, se solicita tanta memoria en **bytes** como sea necesaria, y una vez que ese dato ya no se necesita, a la memoria se devuelve para poder ser reutilizada, para solicitar memoria se utiliza la funcion **malloc** o **calloc** en mi caso dado que inicializa los datos de la memoria solicitada con valores de ceros.\n",
98 | "\n",
99 | "El siguiente snippet documenta como es posible realizar la reserva de memoria para la Matriz $A$\n",
100 | "```C\n",
101 | " float **MatA = (float **)calloc(rows1,sizeof(float*));\n",
102 | " for(i = 0; i < rows1; i++)\n",
103 | " MatA[i] = (float *)calloc(cols1 ,sizeof(float));\n",
104 | "```\n",
105 | "al ser una reserva de memoria con **calloc** es necesario validar que efectivamente hallamos reservado el espacio en memoria:\n",
106 | "```C\n",
107 | "if (!MatA || !MatB || !MatC) { \n",
108 | " printf(\"\\n%s[!]%s Falla de Reserva de Memoria\",RED,RES);\n",
109 | " exit(ENOMEM);}\n",
110 | "```\n",
111 | "\n",
112 | "Ahora que ya hemos reservado la memoria para las tres matrices es necesario leer los datos :\n",
113 | "```C\n",
114 | "while(!feof(archi)){\n",
115 | " for(i=0;iclean1.txt\n",
159 | "sed -e \"s/,/ /g\" data2.txt>clean2.txt\n",
160 | "touch r.txt\n",
161 | "echo -e \"\\x1B[33m[*]\\x1B[0m Corriendo Programa\"\n",
162 | "gcc solve.c -w -s -o solver \n",
163 | "./solver clean1.txt clean2.txt\n",
164 | "echo -e \"\\x1B[33m[*]\\x1B[0m Escribiendo Resultado a Stdout(result.txt)\"\n",
165 | "sed -e \"s/ /,/g\" r.txt>result.txt\n",
166 | "rm r.txt clean1.txt clean2.txt\n",
167 | "echo -e \"\\x1B[32m[✔]TERMINADO\\x1B[0m\"\n",
168 | "```\n",
169 | "Ahora solo basta con realizar `./run.sh` y nuestro programa esta listo con complejidad cubica.\n",
170 | "\n",
171 | "#### Referencias :\n",
172 | "- https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
173 | "- https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
174 | "[1]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
175 | "[2]: https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
176 | "[3]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm"
177 | ]
178 | }
179 | ],
180 | "metadata": {
181 | "kernelspec": {
182 | "display_name": "Python 3",
183 | "language": "python",
184 | "name": "python3"
185 | },
186 | "language_info": {
187 | "codemirror_mode": {
188 | "name": "ipython",
189 | "version": 3
190 | },
191 | "file_extension": ".py",
192 | "mimetype": "text/x-python",
193 | "name": "python",
194 | "nbconvert_exporter": "python",
195 | "pygments_lexer": "ipython3",
196 | "version": "3.5.4rc1"
197 | }
198 | },
199 | "nbformat": 4,
200 | "nbformat_minor": 2
201 | }
202 |
--------------------------------------------------------------------------------
/Lab1/clang/allomen.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <stdlib.h>
3 | #define TABLE_SIZE 10
4 | struct cell_info
5 | {
6 | int a;
7 | int b;
8 | int table[TABLE_SIZE];
9 | };
10 | //define a pointer to the cell_info struct
11 | struct cell_info *cell_ptr;
12 |
13 | int main(int argc, char *argv[])
14 | {
15 | //Explicit memory management, you allocate the exact mem required in the heap.
16 | //the previous struct is located in the stack memory and static variables in
17 | //the global memory, three kind of different memories.
18 | cell_ptr = (struct cell_info *)malloc(sizeof(struct cell_info));
19 | printf("%dBytes",sizeof(cell_ptr));
20 | printf("\n%dBytes",sizeof(struct cell_info));
21 | //after the usage of the memory, we need to free up the mem using free.
22 | free(cell_ptr);
23 | //then the operative system just use the mem space that was free up.
24 | return 0;
25 | }
26 |
--------------------------------------------------------------------------------
/Lab1/clang/colorinunix.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
2 |
3 | #define KNRM "\x1B[0m"
4 | #define KRED "\x1B[31m"
5 | #define KGRN "\x1B[32m"
6 | #define KYEL "\x1B[33m"
7 | #define KBLU "\x1B[34m"
8 | #define KMAG "\x1B[35m"
9 | #define KCYN "\x1B[36m"
10 | #define KWHT "\x1B[37m"
11 |
12 | int main()
13 | {
14 | printf("%sblue\n", KBLU);
15 | printf("%smagenta\n", KMAG);
16 | printf("%scyan\n", KCYN);
17 | printf("%swhite\n", KWHT);
18 | printf("%snormal\n", KNRM);
19 |
20 | return 0;
21 | }
--------------------------------------------------------------------------------
/Lab1/clang/matrixr:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Lab1/clang/matrixr
--------------------------------------------------------------------------------
/Lab1/clang/matrixtest.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <stdlib.h>
3 | //stored in the global memory
4 | //int rows=4, cols=4;
5 | #define rows 4
6 | #define cols 4
7 | //store in the stack
8 | //use gdb to check this.
9 | int i,j;
10 | int **matrix; // we need a double pointer,
11 | // a matrix is a vector of vectors
12 |
13 | int main(){
14 | //store in stack
15 | int x ; // just testing gdb as debuger
16 | //I allocate a simple row
17 | matrix =(int**)malloc(sizeof(int*)*rows);
18 | //then I iterate over it, to allocate the cols.
19 |     for(i=0; i<rows; i++)
--------------------------------------------------------------------------------
/Lab1/clang/matrixtestcalloc.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <stdlib.h>
3 |
4 |
5 | //stored in the global memory
6 | //int rows=4, cols=4;
7 | #define rows 2
8 | #define cols 2
9 | //store in the stack
10 | //use gdb to check this.
11 | int i,j;
12 | int **matrix; // we need a double pointer,
13 | // a matrix is a vector of vectors
14 |
15 | int main(){
16 | //store in stack
17 | int x ; // just testing gdb as debuger
18 | //I allocate a simple row
19 | matrix =(int**)calloc(rows,sizeof(int*));
20 | //then I iterate over it, to allocate the cols.
21 |    for(i=0; i<rows; i++)
--------------------------------------------------------------------------------
/Lab1/clang/test1:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Lab1/clang/test1
--------------------------------------------------------------------------------
/Lab1/clang/test2:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Lab1/clang/test2
--------------------------------------------------------------------------------
/Lab1/clang/testcolor:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Lab1/clang/testcolor
--------------------------------------------------------------------------------
/Lab1/clang/usingsizeof.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
2 | #define NAME_LENGTH 10
3 | #define TABLE_SIZE 100
4 | #define UNITS_NUMBER 10
5 |
6 | struct unit
7 | { /* Define a struct with an internal union */
8 | int x;
9 | float y;
10 | double z;
11 | short int a;
12 | long b;
13 | union
14 | { /* Union with no name because it is internal to the struct */
15 | char name[NAME_LENGTH];
16 | int id;
17 | short int sid;
18 | } identifier;
19 | };
20 |
21 | int main(int argc, char *argv[])
22 | {
23 | int table[TABLE_SIZE];
24 | struct unit data[UNITS_NUMBER];
25 |
26 |     printf("%zuBytes\n", sizeof(struct unit));    /* Print size of structure */
27 |     printf("%zuBytes\n", sizeof(table));          /* Print size of table of ints */
28 |     printf("%zuBytes\n", sizeof(data));           /* Print size of table of structs */
29 |
30 | return 0;
31 | }
--------------------------------------------------------------------------------
/Lab1/data1.txt:
--------------------------------------------------------------------------------
1 | 12
2 | 12
3 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
4 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
5 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
6 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
7 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
8 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
9 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
10 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
11 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
12 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
13 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,3
14 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1,40
15 |
--------------------------------------------------------------------------------
/Lab1/data2.txt:
--------------------------------------------------------------------------------
1 | 12
2 | 12
3 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
4 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
5 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
6 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
7 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
8 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
9 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
10 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
11 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
12 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
13 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
14 | 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3
15 |
--------------------------------------------------------------------------------
/Lab1/readme.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Laboratorio 1, High Performance Computing\n",
8 | "Con el fin de realizar un repaso de las habilidades con el lenguaje ***C*** se debe construir un programa que permita multiplicar 2 matrices, teniendo en cuenta las siguientes condiciones :\n",
9 | "\n",
10 | "* Las dos Matrices se leeran de dos archivos de texto separados por comas. \n",
11 | "* los dos primeras lineas contienen la cantidad de filas y columnas de la matriz respectiva. \n",
12 | "*Ejemplo:* \n",
13 | "```\n",
14 | "2\n",
15 | "2\n",
16 | "1,2\n",
17 | "3,4\n",
18 | "```\n",
19 | "El resulta de la multiplicación de las dos matrices debe ser enviado via stdout aun archivo de texto,\n",
20 | "utilizar ***Taller1HPC*** es el asunto que debe tener el email enviado al docente John.\n",
21 | "\n",
22 | "Como tipo de dato se utiliza precisión sencilla ***float***\n",
23 | "# Solución\n",
24 | "\n",
25 | "## Multiplicación de Matrices\n",
26 | "dadas dos matrices $A$,$B$ de la forma:\n",
27 | "$$A = \\begin{pmatrix}\n",
28 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
29 | " \\vdots & \\ddots & \\vdots \\\\\n",
30 | " a_{m 1} & \\cdots & a_{m n}\n",
31 | " \\end{pmatrix}, B = \\begin{pmatrix}\n",
32 | " b_{1 1} & \\cdots & b_{1 n} \\\\\n",
33 | " \\vdots & \\ddots & \\vdots \\\\\n",
34 | " b_{m 1} & \\cdots & b_{m n}\n",
35 | " \\end{pmatrix}\n",
36 | " $$\n",
37 | "\n",
38 | "Escritas en los textos como $A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, donde $m,n,p$ indican las filas y columnas de cada matriz, el producto de $A\\cdot B$ es:\n",
39 | "$C = AB_{}^{}$.\n",
40 | "\n",
41 | "$$ \\begin{pmatrix}\n",
42 | " a_{1 1} & \\cdots & a_{1 n} \\\\\n",
43 | " \\vdots & \\ddots & \\vdots \\\\\n",
44 | " a_{m 1} & \\cdots & a_{m n}\n",
45 | " \\end{pmatrix} \\cdot \\begin{pmatrix}\n",
46 | " b_{1 1} & \\cdots & b_{1 p} \\\\\n",
47 | " \\vdots & \\ddots & \\vdots \\\\\n",
48 | " b_{n 1} & \\cdots & b_{n p}\n",
49 | " \\end{pmatrix}$$\n",
50 | " \n",
51 | " $$\\begin{pmatrix}\n",
52 | " a_{11}b_{11}+ \\cdots +a_{1n}b_{n1} & \\cdots & a_{11}b_{1p}+ \\cdots +a_{1n}b_{np} \\\\\n",
53 | " \\vdots & \\ddots & \\vdots \\\\\n",
54 | " a_{m1}b_{11}+ \\cdots +a_{mn}b_{n1} & \\cdots & a_{m1}b_{1p}+ \\cdots +a_{mn}b_{np}\n",
55 | " \\end{pmatrix}\n",
56 | "$$\n",
57 | "\n",
58 | "Que no es mas que la sumatoria de multiplicar la fila por la columna para cada elemento de la matriz resulta:\n",
59 | "$$c_{ij} = \\sum_{r=1}^n a_{ir}b_{rj}$$\n",
60 | "\n",
61 | "\n",
62 | "*Nota[!]:La cantindad de Columnas debe ser igual a la cantidad de filas de la segunda matriz,$A:=(a_{i j})_{m \\times n}$, $B:=(b_{i j})_{n \\times p}$, tendra como resultado una Matriz $B:=(b_{i j})_{m \\times p}$*.\n",
63 | "\n",
64 | "De las observaciones anteriores podemos decir que para resolver la multiplicación de las matrices $A\\cdot B$ es necesario realizar $m*n*p$ multiplicaciones, Seria posible utilizar una gran cantidad de tecnicas como se describe en [1][1] y [2][2], pero por tiempo utilizare la version interactiva tiene un costo $Θ(n^{3})$ como se muestra en [3][3]:\n",
65 | "\n",
66 | "
\n",
67 | "\n",
68 | "## Codigo implementado\n",
69 | "Para resolver este problema he hecho uso de los Makefiles para hacer parseo de los datos que se me entrega, \n",
70 | "la ejecucion de este programa obligatoriamente establece que los archivos de entrada tengan los nombre ***data1.txt***, ***data2.txt***, siendo *data1.txt* el archivo que contiene los datos reales de la Matriz $A$, y lo mismo para $B$, \n",
71 | "luego lo unico que sera hacaer es:\n",
72 | "**make run**\n",
73 | "\n",
74 | "### Lectura de Archivo\n",
75 | "Para lograr la lectura de los archivo en lenguaje **C** utilizamos las funciones fopen,fwrite para la apertura del archivo, y el almacenamiento de este, el siguiente fragmento de codigo en C hace uso de fopen para abrir el archivo:\n",
76 | "```c\n",
77 | "#include \n",
78 | "int main(){\n",
79 | " FILE *archivo;\n",
80 | " archivo = fopen(\"data1.txt\",\"r\");\n",
81 | " fclose(fclose);\n",
82 | " return 0;\n",
83 | " }\n",
84 | "```\n",
85 | "Con el ejemplo anterior tenemos la posibilidad de acceder a los datos del archivo, ahora para la lectura hacemos uso de la funcion **fscanf()** que tiene los mismos parametros de **scanf()**, utilizando las mismas opciones para los diferentes tipos de dato. `int fscanf(FILE *stream, const char *format, ...)`. de los archivos recibidos por el programa, nos damos cuenta que las dos primeras lineas siempre serán enteros, por lo cual simplemente se leen cuatro enteros indicando filas, y columnas de cada matriz:\n",
86 | "\n",
87 | "```C\n",
88 | " //previous version has conflict with the pointer movmnt.\n",
89 | " fscanf(archi,\"%i\",&rows1);\n",
90 | " fscanf(archi,\"%i\",&cols1);\n",
91 | " fscanf(archi2,\"%i\",&rows2);\n",
92 | " fscanf(archi2,\"%i\",&cols2);\n",
93 | "```\n",
94 | "\n",
95 | "Siguiendo el algoritmo iterativo anterior validamos que las columnas de la primera matriz sean iguales a las filas de la segunda matriz, de lo contrario saqueremos un mensaje de *Dimensiones incompatibles*. \n",
96 | "\n",
97 | "Uno de los temas mas delicados a la hora de realizar este ejercicio fue el de manejar la memoria de manera dinamica, el lenguaje C nos da mucho control para hacer uso de esta, mediante un mecanismo muy simple, solicitud de memoria(tipicamente la parte del *heap*) y liberacion de la misma. El ciclo es sencillo, cuando se precisa almacenar un nuevo dato, se solicita tanta memoria en **bytes** como sea necesaria, y una vez que ese dato ya no se necesita, a la memoria se devuelve para poder ser reutilizada, para solicitar memoria se utiliza la funcion **malloc** o **calloc** en mi caso dado que inicializa los datos de la memoria solicitada con valores de ceros.\n",
98 | "\n",
99 | "El siguiente snippet documenta como es posible realizar la reserva de memoria para la Matriz $A$\n",
100 | "```C\n",
101 | " float **MatA = (float **)calloc(rows1,sizeof(float*));\n",
102 | " for(i = 0; i < rows1; i++)\n",
103 | " MatA[i] = (float *)calloc(cols1 ,sizeof(float));\n",
104 | "```\n",
105 | "al ser una reserva de memoria con **calloc** es necesario validar que efectivamente hallamos reservado el espacio en memoria:\n",
106 | "```C\n",
107 | "if (!MatA || !MatB || !MatC) { \n",
108 | " printf(\"\\n%s[!]%s Falla de Reserva de Memoria\",RED,RES);\n",
109 | " exit(ENOMEM);}\n",
110 | "```\n",
111 | "\n",
112 | "Ahora que ya hemos reservado la memoria para las tres matrices es necesario leer los datos :\n",
113 | "```C\n",
114 | "while(!feof(archi)){\n",
115 | " for(i=0;iclean1.txt\n",
159 | "sed -e \"s/,/ /g\" data2.txt>clean2.txt\n",
160 | "touch r.txt\n",
161 | "echo -e \"\\x1B[33m[*]\\x1B[0m Corriendo Programa\"\n",
162 | "gcc solve.c -w -s -o solver \n",
163 | "./solver clean1.txt clean2.txt\n",
164 | "echo -e \"\\x1B[33m[*]\\x1B[0m Escribiendo Resultado a Stdout(result.txt)\"\n",
165 | "sed -e \"s/ /,/g\" r.txt>result.txt\n",
166 | "rm r.txt clean1.txt clean2.txt\n",
167 | "echo -e \"\\x1B[32m[✔]TERMINADO\\x1B[0m\"\n",
168 | "```\n",
169 | "Ahora solo basta con realizar `./run.sh` y nuestro programa esta listo con complejidad cubica.\n",
170 | "\n",
171 | "#### Referencias :\n",
172 | "- https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
173 | "- https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
174 | "[1]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm\n",
175 | "[2]: https://es.wikibooks.org/wiki/Optimizaci%C3%B3n_del_Producto_de_Matrices\n",
176 | "[3]: https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm#Iterative_algorithm"
177 | ]
178 | }
179 | ],
180 | "metadata": {
181 | "kernelspec": {
182 | "display_name": "Python 3",
183 | "language": "python",
184 | "name": "python3"
185 | },
186 | "language_info": {
187 | "codemirror_mode": {
188 | "name": "ipython",
189 | "version": 3
190 | },
191 | "file_extension": ".py",
192 | "mimetype": "text/x-python",
193 | "name": "python",
194 | "nbconvert_exporter": "python",
195 | "pygments_lexer": "ipython3",
196 | "version": "3.5.4rc1"
197 | }
198 | },
199 | "nbformat": 4,
200 | "nbformat_minor": 2
201 | }
202 |
--------------------------------------------------------------------------------
/Lab1/readme.md:
--------------------------------------------------------------------------------
 1 | # Lab1
 2 | The goal of this lab is to exercise the use of static and dynamic memory in the C language;
 3 | gdb is used as the debugger and radare2 to inspect the binary and locate where the data lives (not included).
 4 | ## Information
 5 | This lab includes a small analysis done with Jupyter, so I recommend reading [readme.ipynb](https://github.com/h3ct0rjs/HighPerformanceComputing/blob/master/Lab1/readme.ipynb).
 6 | 
 7 | ## Running
 8 | This program has only been tested on a Debian GNU/Linux environment, although it should also work on a Windows environment.
 9 | 
10 | To run the code, simply do:
11 | ```sh
12 | user@host:/tmp/: git clone https://github.com/h3ct0rjs/HighPerformanceComputing
13 | user@host:/tmp/: cd HighPerformanceComputing/Lab1 && ./run.sh
14 | ```
15 | **Keep in mind that to test with other data the input files must be named data1.txt and data2.txt and contain the corresponding matrices; the result is printed on screen and the resulting matrix is stored in *result.txt*.**
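16 | 
17 | The expected input file format, taken from the example in `readme.ipynb` (the first two lines are the row and column counts, followed by the comma-separated rows):
18 | ```
19 | 2
20 | 2
21 | 1,2
22 | 3,4
23 | ```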
--------------------------------------------------------------------------------
/Lab1/result.txt:
--------------------------------------------------------------------------------
1 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
2 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
3 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
4 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
5 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
6 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
7 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
8 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
9 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
10 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
11 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
12 | 1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,1.00,1.00,3.00,
13 |
--------------------------------------------------------------------------------
/Lab1/run.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 | #hfjimenez@utp.edu.co
3 | #Preprocess the Txt Files, more easy manipulation in the C program
4 | echo -e "[*]\x1B[0m Limpiando Archivos\n"
5 | sed -e "s/,/ /g" data1.txt>clean1.txt
6 | sed -e "s/,/ /g" data2.txt>clean2.txt
7 | touch r.txt
8 | echo -e "\x1B[33m[*]\x1B[0m Corriendo Programa"
9 | gcc solve.c -w -s -o solver
10 | ./solver clean1.txt clean2.txt
11 | echo -e "\x1B[33m[*]\x1B[0m Escribiendo Resultado a Stdout(result.txt)"
12 | sed -e "s/ /,/g" r.txt>result.txt
13 | rm r.txt clean1.txt clean2.txt
14 | echo -e "\x1B[32m[✔]TERMINADO\x1B[0m"
15 |
--------------------------------------------------------------------------------
/Lab1/solve.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <stdlib.h>
 3 | #include <errno.h>
4 | //High Performance Computing
5 | //Lab#1: Matrix Multiplication
6 | //hfjimenez@utp.edu.co, 2017-2
7 |
8 | //Strings label :
9 | //[!]:Warning,Exit Process.
10 | //[*]:Information interesting
11 | //[Ok]: Process Completed
12 | //References:
13 | //[1]https://linux.die.net/man/3/calloc
14 |
15 | //A*B = C, representation of all the matrix
16 | float **MatA, **MatB, **MatC; //Heap MemStorage
17 | int i,j,k; //global Mem
18 |
19 | //My Function helpers
20 | void printmat(float **Mat,int r, int c);
21 | void title(void);
22 |
23 | #define version "v0.1"
24 | //First Time Using Colors Hooray
25 | #define RES "\x1B[0m"
26 | #define RED "\x1B[31m"
27 | #define GRN "\x1B[32m"
28 | #define YEL "\x1B[33m"
29 | #define BLU "\x1B[34m"
30 | #define MAG "\x1B[35m"
31 | #define CYN "\x1B[36m"
32 | #define WHT "\x1B[37m"
33 |
34 | int main(int argc, const char * argv[])
35 | {
36 | title();
37 | //File pointers,in read mode
38 | FILE *archi;
39 | FILE *archi2;
40 | FILE *result;
41 | int rows1,cols1,rows2,cols2;
42 | int sum=0;
43 | archi = fopen(argv[1],"r");
44 | archi2 = fopen(argv[2],"r");
45 | //NULL in C is ugly, NULL is equal to 0, not like in C++ <3 that is a real null value.
46 | if (archi == NULL || archi2 == NULL){
47 | printf("%s[!]%s Imposible abrir los archivos pasados como argumentos\n",RED,RES);
48 | exit(0);
49 | }
50 |
51 | /* int opts[4]; //to store the values.
52 | for(i=0; i<2; i++){
53 | fscanf(archi,"%d",&opts[i]); //read the amount of rows and cols
54 | fscanf(archi2,"%d",&opts[i+2]); //read the amount of rows and cols
55 | }
56 |
57 | //fclose(archi); //Because I was having a trouble dealing with this,
58 | //fclose(archi2); //I need to reopen the file using another bar. I try to figured it out with
59 | // with gdb and it seems that had something to do with a program that I use to
60 | // chroot process. Investigating about this.
61 | */
62 |
63 | //previous version has conflict with the pointer movmnt.
64 | fscanf(archi,"%i",&rows1);
65 | fscanf(archi,"%i",&cols1);
66 | fscanf(archi2,"%i",&rows2);
67 | fscanf(archi2,"%i",&cols2);
68 |
69 | printf("%s**Dimensiones Matriz**%s",CYN, RES);
70 | printf("\n%s[*]%s Matriz **A** Filas:%d Columnas:%d",YEL,RES,rows1,cols1);
71 | printf("\n%s[*]%s Matriz **B** Filas:%d Columnas:%d",YEL,RES,rows2,cols2);
72 |
73 | if(cols1!=rows2){
74 | printf("\n%s[!]%s No Es Posible realizar la Multiplicacion entre Matrices\n ",RED,RES);
75 | printf("%s[NOTA:!]%s Dimensiones Incompatibles,saliendo...\n",RED,RES );
76 | exit(0);
77 | }
78 |
79 | //Allocate memory for the Matrices, with initilize values to cero.
80 | //avoid the for loops to init the matriz values in 0.see[1] for more info
81 |
82 | float **MatA = (float **)calloc(rows1,sizeof(float*));
83 | for(i = 0; i < rows1; i++)
84 | MatA[i] = (float *)calloc(cols1 ,sizeof(float));
85 |
86 | float **MatB = (float **)calloc(rows2, sizeof(float*));
87 | for(i = 0; i < rows2; i++)
88 | MatB[i] = (float *)calloc(cols2, sizeof(float));
89 |
90 | float **MatC = (float **)calloc(rows1, sizeof(float*));
91 | for(i = 0; i < rows1; i++)
92 | MatC[i] = (float *)calloc(cols2, sizeof(float));
93 |
94 | if (!MatA || !MatB || !MatC) {
95 | printf("\n%s[!]%s Falla de Reserva de Memoria",RED,RES);
96 | exit(ENOMEM);}
97 |
98 | printf("\n%s[*]%s Leyendo Valores de la Matriz A en el archivos\n",YEL, RES);
99 | while(!feof(archi)){
 100 |      for(i=0;i<rows1;i++){
--------------------------------------------------------------------------------
/Lab2/run.sh:
--------------------------------------------------------------------------------
 6 | sed -e "s/,/ /g" datasets/data1500.txt>clean1.txt
7 | sed -e "s/,/ /g" datasets/data2500.txt>clean2.txt
8 | touch r.txt
9 | echo -e "\x1B[33m[*]\x1B[0m Corriendo Programa"
10 | gcc solve.c -w -s -o solver -fopenmp
11 | ./solver clean1.txt clean2.txt
12 | echo -e "\x1B[33m[*]\x1B[0m Escribiendo Resultado a Stdout(result.txt)"
13 | sed -e "s/ /,/g" r.txt>result.txt
14 | rm r.txt clean1.txt clean2.txt solver
15 | echo -e "\x1B[32m[✔]TERMINADO\x1B[0m"
16 |
--------------------------------------------------------------------------------
/Lab2/samplesomp/sample1.c:
--------------------------------------------------------------------------------
1 | /******************************************************************************
2 | * FILE: omp_workshare1.c
3 | * DESCRIPTION:
4 | * OpenMP Example - Loop Work-sharing - C/C++ Version
5 | * In this example, the iterations of a loop are scheduled dynamically
6 | * across the team of threads. A thread will perform CHUNK iterations
7 | * at a time before being scheduled for the next CHUNK of work.
8 | * AUTHOR: Blaise Barney 5/99
9 | * LAST REVISED: 04/06/05
10 | ******************************************************************************/
11 | #include <omp.h>
12 | #include <stdio.h>
13 | #include <stdlib.h>
14 |
15 | #define CHUNKSIZE 10
16 | #define N 100
17 |
18 | int main (int argc, char *argv[])
19 | {
20 | int nthreads, tid, i, chunk;
21 | float a[N], b[N], c[N];
22 |
23 | /* Some initializations */
24 | for (i=0; i < N; i++)
25 | a[i] = b[i] = i * 1.0;
26 | chunk = CHUNKSIZE;
27 |
28 | #pragma omp parallel shared(a,b,c,nthreads,chunk) private(i,tid)
29 | {
30 | tid = omp_get_thread_num();
31 | if (tid == 0)
32 | {
33 | nthreads = omp_get_num_threads();
34 | printf("Number of threads = %d\n", nthreads);
35 | }
36 | printf("Thread %d starting...\n",tid);
37 |
38 | #pragma omp for schedule(dynamic,chunk)
39 |   for (i=0; i<N; i++)
--------------------------------------------------------------------------------
/Lab2/samplesomp/sample4.c:
--------------------------------------------------------------------------------
11 | #include <omp.h>
12 | #include <stdio.h>
13 | #include <stdlib.h>
14 |
15 | #define NRA 62 /* number of rows in matrix A */
16 | #define NCA 15 /* number of columns in matrix A */
17 | #define NCB 7 /* number of columns in matrix B */
18 |
19 | int main (int argc, char *argv[]){
20 | int tid, nthreads, i, j, k, chunk;
21 | double a[NRA][NCA], /* matrix A to be multiplied */
22 | b[NCA][NCB], /* matrix B to be multiplied */
23 | c[NRA][NCB]; /* result matrix C */
24 |
25 | chunk = 10; /* set loop iteration chunk size */
26 |
27 | /*** Spawn a parallel region explicitly scoping all variables ***/
28 | #pragma omp parallel shared(a,b,c,nthreads,chunk) private(tid,i,j,k)
29 | {
30 | tid = omp_get_thread_num();
31 | if (tid == 0)
32 | {
33 | nthreads = omp_get_num_threads();
34 | printf("Starting matrix multiple example with %d threads\n",nthreads);
35 | printf("Initializing matrices...\n");
36 | }
37 | /*** Initialize matrices ***/
38 | #pragma omp for schedule (static, chunk)
39 |   for (i=0; i<NRA; i++)
--------------------------------------------------------------------------------
/Lab2/solve.c:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <stdlib.h>
 3 | #include <errno.h>
 4 | #include <omp.h>
 5 | #include <time.h>
6 | //High Performance Computing
7 | //Lab#1: Matrix Multiplication
8 | //hfjimenez@utp.edu.co, 2017-2
9 |
10 | //Strings label :
11 | //[!]:Warning,Exit Process.
12 | //[*]:Information interesting
13 | //[Ok]: Process Completed
14 | //References:
15 | //[1]https://linux.die.net/man/3/calloc
16 | //A*B = C, representation of all the matrix
17 | float **MatA, **MatB, **MatC; //Heap MemStorage
18 | int i,j,k; //global Mem
19 | int tid, nthreads;
20 | int chunk = 10; /* set loop iteration chunk size */
21 | int tmp=0;
22 |
23 | //My Function helpers
24 | void printmat(float **Mat,int r, int c);
25 | void title(void);
26 |
27 | #define version "v0.1"
28 | //First Time Using Colors Hooray
29 | #define RES "\x1B[0m"
30 | #define RED "\x1B[31m"
31 | #define GRN "\x1B[32m"
32 | #define YEL "\x1B[33m"
33 | #define BLU "\x1B[34m"
34 | #define MAG "\x1B[35m"
35 | #define CYN "\x1B[36m"
36 | #define WHT "\x1B[37m"
37 |
38 | int main(int argc, const char * argv[])
39 | {
40 |
41 | float t_1; // Execution time measures
42 | clock_t c_1, c_2;
43 |
44 | title();
45 | //File pointers,in read mode
46 | FILE *archi;
47 | FILE *archi2;
48 | FILE *result;
49 | int rows1,cols1,rows2,cols2;
50 | int sum=0;
51 |
52 | archi = fopen(argv[1],"r");
53 | archi2 = fopen(argv[2],"r");
54 | //NULL in C is ugly, NULL is equal to 0, not like in C++ <3 that is a real null value.
55 | if (archi == NULL || archi2 == NULL){
56 | printf("%s[!]%s Imposible abrir los archivos pasados como argumentos\n",RED,RES);
57 | exit(0);
58 | }
59 | fscanf(archi,"%i",&rows1);
60 | fscanf(archi,"%i",&cols1);
61 | fscanf(archi2,"%i",&rows2);
62 | fscanf(archi2,"%i",&cols2);
63 | printf("%s**Dimensiones Matriz**%s",CYN, RES);
64 | printf("\n%s[*]%s Matriz **A** Filas:%d Columnas:%d",YEL,RES,rows1,cols1);
65 | printf("\n%s[*]%s Matriz **B** Filas:%d Columnas:%d",YEL,RES,rows2,cols2);
66 |
67 | if(cols1!=rows2){
68 | printf("\n%s[!]%s No Es Posible realizar la Multiplicacion entre Matrices\n ",RED,RES);
69 | printf("%s[NOTA:!]%s Dimensiones Incompatibles,saliendo...\n",RED,RES );
70 | exit(0);
71 | }
72 |
73 | //Allocate memory for the Matrices, with initilize values to cero.
74 | //avoid the for loops to init the matriz values in 0.see[1] for more info
75 |
76 | float **MatA = (float **)calloc(rows1,sizeof(float*));
77 | float **MatB = (float **)calloc(rows2, sizeof(float*));
78 | float **MatC = (float **)calloc(rows1, sizeof(float*));
79 | //double **M = (double **)malloc(rows1*sizeof(double*));
80 |
81 |
82 |
83 | /*** Initialize matrices with ceros ***/
84 | for(i = 0; i < rows1; i++)
85 | MatA[i] = (float *)calloc(cols1 ,sizeof(float));
86 |
87 |
88 | for(i = 0; i < rows2; i++)
89 | MatB[i] = (float *)calloc(cols2, sizeof(float));
90 |
91 | for(i = 0; i < rows1; i++)
92 | MatC[i] = (float *)calloc(cols2, sizeof(float));
93 |
94 | tid = omp_get_thread_num();
95 | nthreads = omp_get_num_threads();
96 | printf("\nMultiplicando las matrices con # %d de threads\n",nthreads);
97 |
98 | if (!MatA || !MatB || !MatC) {
99 | printf("\n%s[!]%s Falla de Reserva de Memoria",RED,RES);
100 | exit(ENOMEM);}
101 |
102 | printf("\n%s[*]%s Leyendo Valores de la Matriz A en el archivo\n",YEL, RES);
103 | while(!feof(archi)){
 104 |          for(i=0;i<rows1;i++){
--------------------------------------------------------------------------------
/Lab3.1CudaAddVector/vecadd.cu:
--------------------------------------------------------------------------------
 3 | #include <stdio.h>
 4 | #include <stdlib.h>
 5 | #include <math.h>
6 |
7 | // CUDA kernel. Each thread takes care of one element of c
8 | __global__ void vecAdd(double *a, double *b, double *c, int n)
9 | {
10 | // Get our global thread ID
11 | int id = blockIdx.x*blockDim.x+threadIdx.x;
12 |
13 | // Make sure we do not go out of bounds
14 | if (id < n){
15 | c[id] = a[id] + b[id];
16 |         printf("En Thread id: %d Oper : %f + %f : %f\n", id, a[id], b[id], c[id]);
17 | }
18 |
19 | }
20 |
21 | int main( int argc, char* argv[] )
22 | {
23 | // Size of vectors
24 | int n = 100000;
25 |
26 | // Host input vectors
27 | double *h_a;
28 | double *h_b;
29 | //Host output vector
30 | double *h_c;
31 | // Device input vectors
32 | double *d_a;
33 | double *d_b;
34 | //Device output vector
35 | double *d_c;
36 |
37 | // Size, in bytes, of each vector
38 | size_t bytes = n*sizeof(double);
39 |
40 | // Allocate memory for each vector on host
41 | h_a = (double*)malloc(bytes);
42 | h_b = (double*)malloc(bytes);
43 | h_c = (double*)malloc(bytes);
44 |
45 | // Allocate memory for each vector on GPU
46 | cudaMalloc(&d_a, bytes);
47 | cudaMalloc(&d_b, bytes);
48 | cudaMalloc(&d_c, bytes);
49 |
50 | int i;
51 | // Initialize vectors on host
52 | for( i = 0; i < n; i++ ) {
53 | h_a[i] = sin(i)*sin(i);
54 | h_b[i] = cos(i)*cos(i);
55 | }
56 |
57 | // Copy host vectors to device
58 | cudaMemcpy( d_a, h_a, bytes, cudaMemcpyHostToDevice);
59 | cudaMemcpy( d_b, h_b, bytes, cudaMemcpyHostToDevice);
60 |
61 | int blockSize, gridSize;
62 |
63 | // Number of threads in each thread block
64 | blockSize = 1024;
65 |
66 | // Number of thread blocks in grid
67 | gridSize = (int)ceil((float)n/blockSize);
68 |
69 | // Execute the kernel
70 | vecAdd<<>>(d_a, d_b, d_c, n);
71 |
72 | // Copy array back to host
73 | cudaMemcpy( h_c, d_c, bytes, cudaMemcpyDeviceToHost );
74 |
75 | // Sum up vector c and print result divided by n, this should equal 1 within error
76 | double sum = 0;
77 |     for(i=0; i<n; i++)
--------------------------------------------------------------------------------
/Lab4CudaMM/matmult.cu:
--------------------------------------------------------------------------------
 1 | #include <stdio.h>
 2 | #include <time.h>
3 | #define N 1024
4 |
5 | __global__ void Matriz_GPU_Mult(int *a, int *b, int *c) {
6 | int k, sum = 0;
7 | int i = blockIdx.x * blockDim.x + threadIdx.x;
8 | int j = blockIdx.y * blockDim.y + threadIdx.y;
9 | if (i < N && j < N) {
10 | for (k = 0; k < N; k++) {
11 | sum += a[j * N + k] * b[k * N + i];
12 | }
13 | c[j * N + i] = sum;
14 | }
15 | }
16 |
17 | int main() {
18 | double timeGPU;
19 |     static int h_A[N][N], h_B[N][N], h_C[N][N];   // static: three N*N int arrays are too large for the stack
20 |     int *d_a, *d_b, *d_c;
21 |     size_t size = N * N * sizeof(int);
22 |     cudaMalloc((void **) &d_a, size);
23 |     cudaMalloc((void **) &d_b, size);
24 |     cudaMalloc((void **) &d_c, size);
25 | 
26 |     cudaMemcpy(d_a, h_A, size, cudaMemcpyHostToDevice);
27 |     cudaMemcpy(d_b, h_B, size, cudaMemcpyHostToDevice);
28 |
29 | //int threadsPerBlock(16);
30 | //int numBlocks(N/threadsPerBlock);
31 | dim3 threadsPerBlock(32, 32);
32 | dim3 numBlocks(N/threadsPerBlock.x, N/threadsPerBlock.y);
33 | clock_t startGPU = clock();
34 | Matriz_GPU_Mult<<>>(d_a, d_b, d_c);
35 | timeGPU = ((double)(clock() - startGPU))/CLOCKS_PER_SEC;
36 | cudaMemcpy(C, d_c, size, cudaMemcpyDeviceToHost);
37 | cudaFree(d_a);
38 | cudaFree(d_b);
39 | cudaFree(d_c);
40 |     // execution time
41 | printf("tiempo GPU = %f s",timeGPU);
42 | return 0;
43 | }
44 |
--------------------------------------------------------------------------------
/Lab4CudaMM/readme.md:
--------------------------------------------------------------------------------
1 | # Matrix Multiplication Using CUDA
2 |
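3 | A minimal sketch of how to build and run `matmult.cu` by hand, assuming the CUDA toolkit's `nvcc` is available (the output name is illustrative; `matmul.sh` in this folder is the script provided with the lab):
4 | ```sh
5 | nvcc matmult.cu -o matmult
6 | ./matmult
7 | ```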
--------------------------------------------------------------------------------
/Lab5Numba/mc.py:
--------------------------------------------------------------------------------
1 | import numpy as np # numpy namespace
2 | from timeit import default_timer as timer # for timing
3 | from matplotlib import pyplot # for plotting
4 | import math
5 | from numbapro import vectorize
6 |
7 | @vectorize(['float64(float64, float64, float64, float64, float64)'], target='gpu')
8 | def step(price, dt, c0, c1, noise):
9 | return price * math.exp(c0 * dt + c1 * noise)
10 |
11 | # Stock information parameters
12 | StockPrice = 20.83
13 | StrikePrice = 21.50
14 | Volatility = 0.021
15 | InterestRate = 0.20
16 | Maturity = 5. / 12.
17 |
18 | # monte-carlo simulation parameters
19 | NumPath = 3000000
20 | NumStep = 100
21 |
22 | # plotting
23 | MAX_PATH_IN_PLOT = 50
24 |
25 | def driver(pricer, do_plot=False):
26 | paths = np.zeros((NumPath, NumStep + 1), order='F')
27 | paths[:, 0] = StockPrice
28 | DT = Maturity / NumStep
29 |
30 | ts = timer()
31 | pricer(paths, DT, InterestRate, Volatility)
32 | te = timer()
33 | elapsed = te - ts
34 |
35 | ST = paths[:, -1]
36 | PaidOff = np.maximum(paths[:, -1] - StrikePrice, 0)
37 | print 'Result'
38 | fmt = '%20s: %s'
39 | print fmt % ('stock price', np.mean(ST))
40 | print fmt % ('standard error', np.std(ST) / np.sqrt(NumPath))
41 | print fmt % ('paid off', np.mean(PaidOff))
42 | optionprice = np.mean(PaidOff) * np.exp(-InterestRate * Maturity)
43 | print fmt % ('option price', optionprice)
44 |
45 | print 'Performance'
46 | NumCompute = NumPath * NumStep
47 | print fmt % ('Mstep/second', '%.2f' % (NumCompute / elapsed / 1e6))
48 | print fmt % ('time elapsed', '%.3fs' % (te - ts))
49 |
50 | if do_plot:
51 | pathct = min(NumPath, MAX_PATH_IN_PLOT)
52 | for i in xrange(pathct):
53 | pyplot.plot(paths[i])
54 | print 'Plotting %d/%d paths' % (pathct, NumPath)
55 | pyplot.show()
56 |
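57 | # Note: driver() expects a `pricer` callable (for example, one built on the
58 | # vectorized step() above) and is not invoked anywhere in this file.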
--------------------------------------------------------------------------------
/Notes/HPC-Class1.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## 08/11/2017 OpenMP \n",
8 | "\n",
9 | "Modelo de memoria compartida donde diferentes nucleos hacen uso de la memoria compartida, paralelismo esta basado en hilos (*threads*) la idea es aprovechar usar la tendencia multinucleo para hacer procesamiento y optimizar al maximo. Se plantea que el valor optimo de hilos asignado es igual al numero de procesadores o soportados por el procesador. \n",
10 | "\n",
11 | "* Se tiene control total de lo que se hace\n",
12 | "* Debe ser explicito donde se realiza el paralelismo\n",
13 | "\n",
14 | "** Modelo de Fork Join** \n",
15 | "\n",
16 | "
\n",
17 | "\n",
18 | "\n",
19 | "### Recursos Electronicos\n",
20 | "\n",
21 | "[https://computing.llnl.gov/tutorials/openMP/](https://computing.llnl.gov/tutorials/openMP/)\n",
22 | "\n",
23 | "***Revisar : Threads en C/C++, Modelo Fork-Join ***"
24 | ]
25 | }
26 | ],
27 | "metadata": {
28 | "kernelspec": {
29 | "display_name": "Python 3",
30 | "language": "python",
31 | "name": "python3"
32 | },
33 | "language_info": {
34 | "codemirror_mode": {
35 | "name": "ipython",
36 | "version": 3
37 | },
38 | "file_extension": ".py",
39 | "mimetype": "text/x-python",
40 | "name": "python",
41 | "nbconvert_exporter": "python",
42 | "pygments_lexer": "ipython3",
43 | "version": "3.5.4rc1"
44 | }
45 | },
46 | "nbformat": 4,
47 | "nbformat_minor": 2
48 | }
49 |
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class1-openMP.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class1-openMP.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class2-Slurm.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class2-Slurm.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class3openMP.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class3openMP.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class4MPI.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class4MPI.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class5introductionDataParallelismCUDAC.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class5introductionDataParallelismCUDAC.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class6vecAddition.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class6vecAddition.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/Class8CUDAMemories.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/Class8CUDAMemories.pdf
--------------------------------------------------------------------------------
/Notes/OficialSlides/NyuPowerWall.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/h3ct0rjs/HighPerformanceComputing/10456afc3976b41dbcdb29d7bfb035a5c94e227f/Notes/OficialSlides/NyuPowerWall.pdf
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # HighPerformanceComputing
2 | High Performance Computing class taken at U.T.P., 2017.
3 |
4 | For each of the labs check:
5 | * readme.ipynb
6 | * solve.xx
7 | * datasets
8 |
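 9 | As an example, Lab1 can be cloned and run as described in `Lab1/readme.md`; the other labs ship similar helper scripts (`run.sh`, `vecadd.sh`, `matmul.sh`):
10 | ```sh
11 | git clone https://github.com/h3ct0rjs/HighPerformanceComputing
12 | cd HighPerformanceComputing/Lab1 && ./run.sh
13 | ```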
--------------------------------------------------------------------------------