├── .gitignore
├── .vscode
    └── settings.json
├── CMakeLists.txt
├── CONTRIBUTORS.md
├── KernelAndLibExamples
    ├── simple_thrust
    │   ├── Makefile
    │   ├── README.md
    │   └── simple_thrust.cu
    └── vector_add
    │   ├── Makefile
    │   ├── README.md
    │   └── vector_add.cu
├── MemoryAndStructureExamples
    ├── alloc_init_vs_alloc_uninit
    │   ├── Makefile
    │   └── alloc_init_vs_alloc_uninit.cu
    └── pinned_vs_pageable
    │   ├── Makefile
    │   ├── README.md
    │   └── pinned_vs_pageable.cu
├── PerformanceChecklistExamples
    ├── README.md
    └── cuda_streams
    │   ├── Makefile
    │   ├── cuda_streams.cu
    │   └── make_run_and_profile.sh
├── ProfilingExamples
    ├── bandwidth_check
    │   ├── Makefile
    │   └── bandwidth_check.cu
    ├── global_vs_shared_mem
    │   ├── Makefile
    │   └── global_vs_shared.cu
    ├── memory_bank_conflict
    │   ├── Makefile
    │   ├── bank_conflict.cu
    │   └── make_run_and_profile.sh
    └── nvtx
    │   ├── Makefile
    │   ├── make_run_and_profile.sh
    │   └── nvtx_example.cu
├── README.md
├── SetupAndInitExamples
    └── setup_check
    │   ├── Makefile
    │   └── hello_world.cu
├── TensorParallelFromScratch
    ├── 00_intro_to_tensors
    │   └── README.md
    ├── 01_simple_matmul_no_tp
    │   ├── Makefile
    │   ├── README.md
    │   └── matmul.cu
    └── 02_matmul_tp
    │   ├── Makefile
    │   ├── matmul_tp_big.cu
    │   └── matmul_tp_small.cu
└── utils
    ├── nvtx_event.cuh
    └── utils.cuh

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/.gitignore
--------------------------------------------------------------------------------
/.vscode/settings.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/.vscode/settings.json
--------------------------------------------------------------------------------
/CMakeLists.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/CMakeLists.txt
--------------------------------------------------------------------------------
/CONTRIBUTORS.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/CONTRIBUTORS.md
--------------------------------------------------------------------------------
/KernelAndLibExamples/simple_thrust/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/KernelAndLibExamples/simple_thrust/Makefile
--------------------------------------------------------------------------------
/KernelAndLibExamples/simple_thrust/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/KernelAndLibExamples/simple_thrust/README.md
--------------------------------------------------------------------------------
/KernelAndLibExamples/simple_thrust/simple_thrust.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/KernelAndLibExamples/simple_thrust/simple_thrust.cu
--------------------------------------------------------------------------------
/KernelAndLibExamples/vector_add/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/KernelAndLibExamples/vector_add/Makefile
--------------------------------------------------------------------------------
/KernelAndLibExamples/vector_add/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/KernelAndLibExamples/vector_add/README.md
--------------------------------------------------------------------------------
/KernelAndLibExamples/vector_add/vector_add.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/KernelAndLibExamples/vector_add/vector_add.cu
--------------------------------------------------------------------------------
/MemoryAndStructureExamples/alloc_init_vs_alloc_uninit/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/MemoryAndStructureExamples/alloc_init_vs_alloc_uninit/Makefile
--------------------------------------------------------------------------------
/MemoryAndStructureExamples/alloc_init_vs_alloc_uninit/alloc_init_vs_alloc_uninit.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/MemoryAndStructureExamples/alloc_init_vs_alloc_uninit/alloc_init_vs_alloc_uninit.cu
--------------------------------------------------------------------------------
/MemoryAndStructureExamples/pinned_vs_pageable/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/MemoryAndStructureExamples/pinned_vs_pageable/Makefile
--------------------------------------------------------------------------------
/MemoryAndStructureExamples/pinned_vs_pageable/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/MemoryAndStructureExamples/pinned_vs_pageable/README.md
--------------------------------------------------------------------------------
/MemoryAndStructureExamples/pinned_vs_pageable/pinned_vs_pageable.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/MemoryAndStructureExamples/pinned_vs_pageable/pinned_vs_pageable.cu
--------------------------------------------------------------------------------
/PerformanceChecklistExamples/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/PerformanceChecklistExamples/README.md
--------------------------------------------------------------------------------
/PerformanceChecklistExamples/cuda_streams/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/PerformanceChecklistExamples/cuda_streams/Makefile
--------------------------------------------------------------------------------
/PerformanceChecklistExamples/cuda_streams/cuda_streams.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/PerformanceChecklistExamples/cuda_streams/cuda_streams.cu
--------------------------------------------------------------------------------
/PerformanceChecklistExamples/cuda_streams/make_run_and_profile.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/PerformanceChecklistExamples/cuda_streams/make_run_and_profile.sh
--------------------------------------------------------------------------------
/ProfilingExamples/bandwidth_check/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/ProfilingExamples/bandwidth_check/Makefile
--------------------------------------------------------------------------------
/ProfilingExamples/bandwidth_check/bandwidth_check.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/ProfilingExamples/bandwidth_check/bandwidth_check.cu
--------------------------------------------------------------------------------
/ProfilingExamples/global_vs_shared_mem/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/ProfilingExamples/global_vs_shared_mem/Makefile
--------------------------------------------------------------------------------
/ProfilingExamples/global_vs_shared_mem/global_vs_shared.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/ProfilingExamples/global_vs_shared_mem/global_vs_shared.cu
--------------------------------------------------------------------------------
/ProfilingExamples/memory_bank_conflict/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/ProfilingExamples/memory_bank_conflict/Makefile
--------------------------------------------------------------------------------
/ProfilingExamples/memory_bank_conflict/bank_conflict.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/ProfilingExamples/memory_bank_conflict/bank_conflict.cu
--------------------------------------------------------------------------------
/ProfilingExamples/memory_bank_conflict/make_run_and_profile.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/ProfilingExamples/memory_bank_conflict/make_run_and_profile.sh
--------------------------------------------------------------------------------
/ProfilingExamples/nvtx/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/ProfilingExamples/nvtx/Makefile
--------------------------------------------------------------------------------
/ProfilingExamples/nvtx/make_run_and_profile.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/ProfilingExamples/nvtx/make_run_and_profile.sh
--------------------------------------------------------------------------------
/ProfilingExamples/nvtx/nvtx_example.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/ProfilingExamples/nvtx/nvtx_example.cu
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/README.md
--------------------------------------------------------------------------------
/SetupAndInitExamples/setup_check/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/SetupAndInitExamples/setup_check/Makefile
--------------------------------------------------------------------------------
/SetupAndInitExamples/setup_check/hello_world.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/SetupAndInitExamples/setup_check/hello_world.cu
--------------------------------------------------------------------------------
/TensorParallelFromScratch/00_intro_to_tensors/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/TensorParallelFromScratch/00_intro_to_tensors/README.md
--------------------------------------------------------------------------------
/TensorParallelFromScratch/01_simple_matmul_no_tp/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/TensorParallelFromScratch/01_simple_matmul_no_tp/Makefile
--------------------------------------------------------------------------------
/TensorParallelFromScratch/01_simple_matmul_no_tp/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/TensorParallelFromScratch/01_simple_matmul_no_tp/README.md
--------------------------------------------------------------------------------
/TensorParallelFromScratch/01_simple_matmul_no_tp/matmul.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/TensorParallelFromScratch/01_simple_matmul_no_tp/matmul.cu
--------------------------------------------------------------------------------
/TensorParallelFromScratch/02_matmul_tp/Makefile:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/TensorParallelFromScratch/02_matmul_tp/Makefile
--------------------------------------------------------------------------------
/TensorParallelFromScratch/02_matmul_tp/matmul_tp_big.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/TensorParallelFromScratch/02_matmul_tp/matmul_tp_big.cu
--------------------------------------------------------------------------------
/TensorParallelFromScratch/02_matmul_tp/matmul_tp_small.cu:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/TensorParallelFromScratch/02_matmul_tp/matmul_tp_small.cu
--------------------------------------------------------------------------------
/utils/nvtx_event.cuh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/utils/nvtx_event.cuh
--------------------------------------------------------------------------------
/utils/utils.cuh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/drkennetz/cuda_examples/HEAD/utils/utils.cuh
--------------------------------------------------------------------------------