├── .gitignore
├── LICENSE
├── Makefile
├── README.md
├── gemm_cpu_naive.cc
├── gemm_cpu_naive.h
├── gemm_cpu_simd.cc
├── gemm_cpu_simd.h
├── gemm_gpu_1thread.cu
├── gemm_gpu_1thread.h
├── gemm_gpu_mult_block.cu
├── gemm_gpu_mult_block.h
├── gemm_gpu_mult_block_no_restrict.cu
├── gemm_gpu_mult_block_no_restrict.h
├── gemm_gpu_mult_block_no_restrict_reg.cu
├── gemm_gpu_mult_block_no_restrict_reg.h
├── gemm_gpu_mult_thread.cu
├── gemm_gpu_mult_thread.h
├── gemm_gpu_tiling.cu
├── gemm_gpu_tiling.h
├── gemm_test.cc
└── lecture.md

/.gitignore:
--------------------------------------------------------------------------------
/.vscode
/gemm_test
*.o

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/LICENSE

--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/Makefile

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/README.md

--------------------------------------------------------------------------------
/gemm_cpu_naive.cc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_cpu_naive.cc

--------------------------------------------------------------------------------
/gemm_cpu_naive.h:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_cpu_naive.h

--------------------------------------------------------------------------------
/gemm_cpu_simd.cc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_cpu_simd.cc

--------------------------------------------------------------------------------
/gemm_cpu_simd.h:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_cpu_simd.h

--------------------------------------------------------------------------------
/gemm_gpu_1thread.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_1thread.cu

--------------------------------------------------------------------------------
/gemm_gpu_1thread.h:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_1thread.h

--------------------------------------------------------------------------------
/gemm_gpu_mult_block.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_mult_block.cu

--------------------------------------------------------------------------------
/gemm_gpu_mult_block.h:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_mult_block.h

--------------------------------------------------------------------------------
/gemm_gpu_mult_block_no_restrict.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_mult_block_no_restrict.cu

--------------------------------------------------------------------------------
/gemm_gpu_mult_block_no_restrict.h:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_mult_block_no_restrict.h

--------------------------------------------------------------------------------
/gemm_gpu_mult_block_no_restrict_reg.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_mult_block_no_restrict_reg.cu

--------------------------------------------------------------------------------
/gemm_gpu_mult_block_no_restrict_reg.h:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_mult_block_no_restrict_reg.h

--------------------------------------------------------------------------------
/gemm_gpu_mult_thread.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_mult_thread.cu

--------------------------------------------------------------------------------
/gemm_gpu_mult_thread.h:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_mult_thread.h

--------------------------------------------------------------------------------
/gemm_gpu_tiling.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_tiling.cu

--------------------------------------------------------------------------------
/gemm_gpu_tiling.h:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_gpu_tiling.h

--------------------------------------------------------------------------------
/gemm_test.cc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/gemm_test.cc

--------------------------------------------------------------------------------
/lecture.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/interestingLSY/CUDA-From-Correctness-To-Performance-Code/HEAD/lecture.md

--------------------------------------------------------------------------------