├── amdgpu-install_5.4.50400-1_all.deb
├── rocblas_2.46.0.50401-84.20.04_amd64.deb
├── README.md
└── check_pytorch.py


/amdgpu-install_5.4.50400-1_all.deb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikos230/Run-Pytorch-with-AMD-Radeon-GPU/HEAD/amdgpu-install_5.4.50400-1_all.deb

--------------------------------------------------------------------------------
/rocblas_2.46.0.50401-84.20.04_amd64.deb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikos230/Run-Pytorch-with-AMD-Radeon-GPU/HEAD/rocblas_2.46.0.50401-84.20.04_amd64.deb

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Run-Pytorch-with-AMD-Radeon-GPU

## Introduction
This guide shows how to run PyTorch 2.1.1 on an AMD Radeon GPU. It has been tested on an RX 470 4GB. Your GPU needs to belong to the gfx803 family, such as the RX 400 and RX 500 series.

### Requirements
- Ubuntu 22.04 LTS
- AMD Radeon GPU in the gfx803 family (rx460, rx470, rx480, rx550, rx560, rx570, rx580)

### Download Required Files
- Download [PyTorch 2.1.1](https://drive.google.com/file/d/1Tkyqe8VxUPkpf_jLZRzJphKNlW5Cqixi/view?usp=sharing) or build it yourself (see below)
- Download [rocblas_2.46.0.50401-84.20.04_amd64.deb](https://github.com/xuhuisheng/rocm-gfx803/releases/tag/rocm541)


## Guide to Setup ROCm and Pytorch

Start with a fresh installation of Ubuntu 22.04, then install the AMD ROCm drivers. The required version is ROCm 5.4.0 or 5.4.3; choose one of these.
- Open a terminal and type

```
echo ROC_ENABLE_PRE_VEGA=1 | sudo tee -a /etc/environment
echo HSA_OVERRIDE_GFX_VERSION=8.0.3 | sudo tee -a /etc/environment
```
Reboot your system
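
After the reboot you may want to confirm that both variables actually ended up in `/etc/environment`. The following is a minimal sketch, not part of this repo: the helper names `parse_environment` and `missing_settings` are made up for illustration, and it assumes the file uses plain `KEY=VALUE` lines.

```python
from pathlib import Path

# The two settings this guide appends to /etc/environment.
REQUIRED = {
    "ROC_ENABLE_PRE_VEGA": "1",
    "HSA_OVERRIDE_GFX_VERSION": "8.0.3",
}


def parse_environment(text):
    """Parse KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def missing_settings(text, required=REQUIRED):
    """Return the required settings that are absent or set to another value."""
    found = parse_environment(text)
    return {k: v for k, v in required.items() if found.get(k) != v}


if __name__ == "__main__":
    try:
        text = Path("/etc/environment").read_text()
    except OSError:
        text = ""
    problems = missing_settings(text)
    if problems:
        print("Missing or different in /etc/environment:", problems)
    else:
        print("Both variables are set as expected.")
```

If the script reports a missing variable, repeat the step above and reboot again.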


- Open a terminal and install ROCm (these commands are for ROCm 5.4.0)

```
cd Downloads
wget https://repo.radeon.com/amdgpu-install/5.4/ubuntu/jammy/amdgpu-install_5.4.50400-1_all.deb
sudo apt install ./amdgpu-install_5.4.50400-1_all.deb
sudo amdgpu-install -y --no-dkms --usecase=rocm,hiplibsdk,mlsdk
sudo usermod -aG video $LOGNAME
sudo usermod -aG render $LOGNAME
```

- Alternatively, install ROCm 5.4.3
```
cd Downloads
wget https://repo.radeon.com/amdgpu-install/5.4.3/ubuntu/jammy/amdgpu-install_5.4.50403-1_all.deb
sudo apt install ./amdgpu-install_5.4.50403-1_all.deb
sudo amdgpu-install -y --no-dkms --usecase=rocm,hiplibsdk,mlsdk
sudo usermod -aG video $LOGNAME
sudo usermod -aG render $LOGNAME
```
Reboot your system
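
The `usermod` commands above add your user to the `video` and `render` groups, and the change only takes effect after logging in again. A small sketch for checking the membership is shown here; the helper names are hypothetical, and it only looks at supplementary group membership in the system group database.

```python
import getpass
import grp

# Groups the install step adds the user to.
REQUIRED_GROUPS = ("video", "render")


def missing_groups(member_of, required=REQUIRED_GROUPS):
    """Return the required groups that are not in the given collection."""
    return [g for g in required if g not in member_of]


def groups_for(user):
    """Names of all groups that list `user` as a member."""
    return {g.gr_name for g in grp.getgrall() if user in g.gr_mem}


if __name__ == "__main__":
    try:
        user = getpass.getuser()
    except (KeyError, OSError):
        user = ""
    missing = missing_groups(groups_for(user))
    if missing:
        print(f"User {user!r} is not yet in: {', '.join(missing)}")
    else:
        print(f"User {user!r} is in video and render.")
```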


- Open a terminal and check that ROCm is installed correctly. Both commands should list your GPU.

```
rocminfo
clinfo
```

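
If the `rocminfo` output is long, you can pull out just the GPU architecture names. This is a rough sketch under the assumption that `rocminfo` prints architecture names of the form `gfx803` somewhere in its output; the function name `gfx_targets` is made up for this example.

```python
import re
import shutil
import subprocess


def gfx_targets(rocminfo_output):
    """Return the distinct gfx architecture names found in the text."""
    return sorted(set(re.findall(r"\bgfx[0-9a-f]+\b", rocminfo_output)))


if __name__ == "__main__":
    if shutil.which("rocminfo") is None:
        print("rocminfo was not found on PATH; is ROCm installed?")
    else:
        result = subprocess.run(
            ["rocminfo"], capture_output=True, text=True, check=False
        )
        targets = gfx_targets(result.stdout)
        print("Detected targets:", targets or "none")
        if "gfx803" not in targets:
            print("gfx803 was not reported; this guide targets gfx803 GPUs.")
```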

- Then install libopenmpi3, libstdc++-11-dev, libopenblas-dev and the patched rocBLAS package for gfx803

```
sudo apt install libopenmpi3 libstdc++-11-dev
sudo apt-get install libopenblas-dev
cd Downloads
sudo apt install ./rocblas_2.46.0.50401-84.20.04_amd64.deb
```


- Now install PyTorch. You can use the pre-built wheel linked above, or build it yourself, which takes some time. Installing PyTorch with ROCm support directly from the official PyTorch repository will not work for gfx803 GPUs.

```
cd Downloads
sudo apt install python3-pip
pip install torch-2.1.1-cp310-cp310-linux_x86_64.whl
```
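
The `cp310` part of the wheel name means the wheel was built for CPython 3.10, which is the default `python3` on Ubuntu 22.04. If `pip` rejects the wheel as unsupported, a mismatch here is a likely cause. The sketch below reads the tags from the file name; `wheel_tags` and `interpreter_tag` are illustrative helpers, not part of pip.

```python
import sys


def wheel_tags(filename):
    """Split a wheel file name into (python tag, ABI tag, platform tag).

    Wheel names end in -<python>-<abi>-<platform>.whl, so the last three
    dash-separated fields are the tags.
    """
    stem = filename[: -len(".whl")] if filename.endswith(".whl") else filename
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def interpreter_tag():
    """CPython tag of the running interpreter, for example cp310."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


if __name__ == "__main__":
    wheel = "torch-2.1.1-cp310-cp310-linux_x86_64.whl"
    python_tag, _, _ = wheel_tags(wheel)
    if python_tag == interpreter_tag():
        print("The wheel matches this Python interpreter.")
    else:
        print(f"Wheel is for {python_tag}, but this Python is {interpreter_tag()}.")
```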
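
**Done!** You can check that PyTorch works correctly with the provided test script. Besides matplotlib, the script imports seaborn, scikit-learn and TensorBoard, so install them first.

```
pip install matplotlib seaborn scikit-learn tensorboard
cd Downloads
python3 check_pytorch.py
```
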
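
Before running the test script it can help to see which of its imports are still missing. The sketch below checks the top-level modules that `check_pytorch.py` imports; the function name `missing_modules` is made up for this example.

```python
from importlib.util import find_spec

# Top-level modules imported by check_pytorch.py.
SCRIPT_MODULES = ["torch", "matplotlib", "numpy", "seaborn", "sklearn", "tensorboard"]


def missing_modules(names):
    """Return the module names that cannot be found by the import system."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(SCRIPT_MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All modules used by check_pytorch.py are available.")
```

Note that `sklearn` is installed by the `scikit-learn` package.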


### Build PyTorch for gfx803 (patched version)
- First install the dependencies
```
sudo apt install build-essential cmake python3-dev python3-numpy ninja-build libomp-dev libcurl4-openssl-dev libgflags-dev libgoogle-glog-dev libssl-dev libyaml-cpp-dev git
```
- Now you can start the build
```
git clone --recursive https://github.com/pytorch/pytorch.git -b v2.1.1
cd pytorch
export PATH=/opt/rocm/bin:$PATH ROCM_PATH=/opt/rocm HIP_PATH=/opt/rocm/hip
export PYTORCH_ROCM_ARCH=gfx803
export PYTORCH_BUILD_VERSION=2.1.1 PYTORCH_BUILD_NUMBER=1
python3 tools/amd_build/build_amd.py
USE_ROCM=1 USE_NINJA=1 python3 setup.py bdist_wheel
pip3 install dist/torch-2.1.1-cp310-cp310-linux_x86_64.whl
```
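
If you prefer to drive the build from a script, the exported variables can be collected in one place. This is only a sketch of the same settings as the shell commands above; `rocm_build_env` and `BUILD_STEPS` are hypothetical names, and nothing here starts a build on its own.

```python
import os


def rocm_build_env(base=None, rocm_path="/opt/rocm", arch="gfx803", version="2.1.1"):
    """Return a copy of the environment with the build variables from this guide."""
    env = dict(os.environ if base is None else base)
    env["PATH"] = f"{rocm_path}/bin:" + env.get("PATH", "")
    env["ROCM_PATH"] = rocm_path
    env["HIP_PATH"] = f"{rocm_path}/hip"
    env["PYTORCH_ROCM_ARCH"] = arch
    env["PYTORCH_BUILD_VERSION"] = version
    env["PYTORCH_BUILD_NUMBER"] = "1"
    env["USE_ROCM"] = "1"
    env["USE_NINJA"] = "1"
    return env


# The two build commands, to be run from inside the pytorch checkout.
BUILD_STEPS = [
    ["python3", "tools/amd_build/build_amd.py"],
    ["python3", "setup.py", "bdist_wheel"],
]

if __name__ == "__main__":
    env = rocm_build_env()
    for key in ("ROCM_PATH", "HIP_PATH", "PYTORCH_ROCM_ARCH"):
        print(f"{key}={env[key]}")
```

Each step could then be run with `subprocess.run(step, env=env, cwd="pytorch", check=True)`.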
If the build fails with the following rocBLAS error and you are using ROCm 5.4.0:
```
CMake Error at /opt/rocm-5.4.0/lib/cmake/rocblas/rocblas-targets.cmake:79 (message):
  The imported target "roc::rocblas" references the file

     "/opt/rocm-5.4.0/lib/librocblas.so.0.1.50400"
  but this file does not exist.  Possible reasons include:
  * The file was deleted, renamed, or moved to another location.
  * An install or uninstall procedure did not complete successfully.
  * The installation package was faulty and contained
     "/opt/rocm-5.4.0/lib/cmake/rocblas/rocblas-targets.cmake"
  but not all the files it references.
Call Stack (most recent call first):
  /opt/rocm/lib/cmake/rocblas/rocblas-config.cmake:92 (include)
  cmake/public/LoadHIP.cmake:161 (find_package)
  cmake/public/LoadHIP.cmake:291 (find_package_and_print_version)
  cmake/Dependencies.cmake:1268 (include)
  CMakeLists.txt:722 (include)
```
create the following symbolic links:
```
sudo ln -s /opt/rocm-5.4.1/lib/librocblas.so.0 /opt/rocm-5.4.0/lib/librocblas.so.0
sudo ln -s /opt/rocm-5.4.1/lib/librocblas.so.0.1.50401 /opt/rocm-5.4.0/lib/librocblas.so.0.1.50400
```
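
Running `ln -s` a second time fails because the link already exists. If you script the setup, an idempotent variant may be handy. The sketch below is an assumption-labeled helper (`ensure_symlink` is not part of any tool mentioned in this guide); it would need root privileges to write under `/opt`.

```python
import os
from pathlib import Path


def ensure_symlink(target, link):
    """Create `link` pointing at `target` unless it already does.

    Returns True if the link was created, False if it was already in place.
    Raises FileExistsError if `link` exists but is something else.
    """
    link = Path(link)
    if link.is_symlink():
        if os.readlink(link) == str(target):
            return False
        raise FileExistsError(f"{link} points somewhere else")
    if link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.symlink_to(target)
    return True


# Example call for one of the two links above (run as root):
# ensure_symlink("/opt/rocm-5.4.1/lib/librocblas.so.0",
#                "/opt/rocm-5.4.0/lib/librocblas.so.0")
```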

## References
- [https://github.com/tsl0922/pytorch-gfx803](https://github.com/tsl0922/pytorch-gfx803)
- [https://github.com/xuhuisheng/rocm-gfx803](https://github.com/xuhuisheng/rocm-gfx803)

--------------------------------------------------------------------------------
/check_pytorch.py:
--------------------------------------------------------------------------------
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader
from torch.utils.data.dataset import random_split
import matplotlib.pyplot as plt


class SimpleClassifier(nn.Module):

    def __init__(self, num_inputs, num_hidden, num_outputs):
        super().__init__()
        # Initialize the modules we need to build the network
        self.linear1 = nn.Linear(num_inputs, num_hidden)
        self.act_fn = nn.Tanh()
        self.linear2 = nn.Linear(num_hidden, num_outputs)

    def forward(self, x):
        # Perform the calculation of the model to determine the prediction
        x = self.linear1(x)
        x = self.act_fn(x)
        x = self.linear2(x)
        return x


# ROCm builds of PyTorch expose the GPU through the torch.cuda API
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
#device = 'cpu'
print(device)
model = SimpleClassifier(num_inputs=2, num_hidden=4, num_outputs=1).to(device)
# Printing a module shows all its submodules
print(model)


for name, param in model.named_parameters():
    print(f"Parameter {name}, shape {param.shape}")


# Continuous XOR logic function
def continuous_xor(x, y):
    # The label is 1.0 when exactly one of x and y is positive, else 0.0
    return ((x > 0) ^ (y > 0)).float()


# Generate random (x, y) pairs in the range of [-1, 1]
n_samples = 10000

x = torch.rand(n_samples, 1, device=device) * 2 - 1  # Scale to [-1, 1]
y = torch.rand(n_samples, 1, device=device) * 2 - 1  # Scale to [-1, 1]

# Apply the continuous XOR logic
labels = continuous_xor(x, y)

# Combine the (x, y) pairs
inputs = torch.cat((x, y), dim=1)

# Create a TensorDataset
dataset = TensorDataset(inputs, labels)

# Split the dataset into train, validation, and test sets
train_size = int(n_samples * 0.8)  # 80% of the dataset
val_size = int(n_samples * 0.05)  # 5% of the dataset
test_size = n_samples - (train_size + val_size)  # The remaining 15%
train_dataset, val_dataset, test_dataset = random_split(
    dataset, [train_size, val_size, test_size],
    generator=torch.Generator().manual_seed(42))

# Create DataLoader for each set
batch_size = 32
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=0)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, num_workers=0)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=0)

# Loss and Optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01)
#optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.BCEWithLogitsLoss()

# Training Loop
epochs = 100
loss_log = []
epoch_log = []
for epoch in range(epochs):
    model.train()
    train_loss = 0.0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()

    # Validation Loop
    model.eval()
    val_loss = 0.0
    with torch.no_grad():
        for inputs, labels in val_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            val_loss += loss.item()
    loss_log.append(val_loss / len(val_loader))
    epoch_log.append(epoch + 1)
    # Report the mean training loss rather than the last validation batch loss
    print(f'Epoch {epoch+1}, Loss: {train_loss / len(train_loader)}, Validation Loss: {val_loss / len(val_loader)}')

import seaborn as sns

# Set the seaborn style for plotting
sns.set(style="whitegrid")

# Plot the validation loss recorded after each epoch
plt.figure(figsize=(14, 8))
plt.plot(epoch_log, loss_log, label='Validation Loss', markersize=8, linewidth=2)

# Adjusting titles and labels with enhanced font settings
plt.title('Validation Loss Over 100 Epochs', fontsize=20, fontweight='bold', color='darkslateblue')
plt.xlabel('Epoch', fontsize=16, fontweight='bold')
plt.ylabel('Validation Loss', fontsize=16, fontweight='bold')

# Adjusting x-axis to show every 10th epoch for better readability
plt.xticks(range(1, 101, 10), fontsize=12, fontweight='bold')
plt.yticks(fontsize=12, fontweight='bold')
plt.legend(fontsize=14, frameon=True, shadow=True, borderpad=1)

plt.show()


state_dict = model.state_dict()
print(state_dict)

# torch.save(object, filename). For the filename, any extension can be used
torch.save(state_dict, "our_model.tar")

# Load state dict from the disk (make sure it is the same name as above)
state_dict = torch.load("our_model.tar")

# Create a new model and load the state
new_model = SimpleClassifier(num_inputs=2, num_hidden=4, num_outputs=1)
new_model.load_state_dict(state_dict)

# Verify that the parameters are the same
print("Original model\n", model.state_dict())
print("\nLoaded model\n", new_model.state_dict())


from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix


def evaluate_model_and_metrics(model, data_loader, device):
    model.eval()  # Set the model to evaluation mode
    all_labels = []
    all_preds = []

    with torch.no_grad():
        for inputs, labels in data_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            preds = torch.sigmoid(outputs) > 0.5  # Convert to binary predictions
            # Flatten both so scikit-learn receives 1-D label arrays
            all_labels.extend(labels.cpu().numpy().flatten())
            all_preds.extend(preds.cpu().numpy().flatten())

    # Calculate metrics
    accuracy = accuracy_score(all_labels, all_preds)
    precision = precision_score(all_labels, all_preds)
    recall = recall_score(all_labels, all_preds)
    f1 = f1_score(all_labels, all_preds)
    conf_matrix = confusion_matrix(all_labels, all_preds)

    # Plotting
    metrics = [accuracy, precision, recall, f1]
    metric_names = ['Accuracy', 'Precision', 'Recall', 'F1 Score']

    # Bar chart for metrics
    plt.figure(figsize=(10, 6))
    sns.barplot(x=metric_names, y=metrics)
    plt.title('Model Performance Metrics')
    plt.ylabel('Score')
    plt.show()

    # Confusion Matrix
    plt.figure(figsize=(6, 6))
    sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
                xticklabels=['False', 'True'], yticklabels=['False', 'True'])
    plt.xlabel('Predicted Label')
    plt.ylabel('True Label')
    plt.title('Confusion Matrix')
    plt.show()


evaluate_model_and_metrics(model, test_loader, device)


# Train a second model and log its metrics to TensorBoard
from torch.utils.tensorboard import SummaryWriter

# Model instantiation
model_board = SimpleClassifier(num_inputs=2, num_hidden=4, num_outputs=1).to(device)

# Optimizer and Criterion
optimizer = optim.SGD(model_board.parameters(), lr=0.1)
criterion = nn.BCEWithLogitsLoss()

# TensorBoard Writer (a single writer, closed at the end)
writer = SummaryWriter('runs/xor_experiment_1')

# Training Loop
epochs = 100
for epoch in range(epochs):
    model_board.train()
    train_loss, train_correct = 0, 0

    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()
        outputs = model_board(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        train_loss += loss.item()
        predictions = torch.sigmoid(outputs) > 0.5
        train_correct += (predictions == labels.bool()).sum().item()

    train_accuracy = 100. * train_correct / len(train_loader.dataset)
    train_loss /= len(train_loader)

    # Log training metrics
    writer.add_scalar('Loss/train', train_loss, epoch)
    writer.add_scalar('Accuracy/train', train_accuracy, epoch)

    # Log gradients and weights histograms
    for name, param in model_board.named_parameters():
        writer.add_histogram(f'{name}/gradients', param.grad, epoch)
        writer.add_histogram(f'{name}/weights', param, epoch)

    # Validation phase
    model_board.eval()
    val_loss, val_correct = 0, 0
    with torch.no_grad():
        for inputs, labels in val_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model_board(inputs)
            loss = criterion(outputs, labels)

            val_loss += loss.item()
            predictions = torch.sigmoid(outputs) > 0.5
            val_correct += (predictions == labels.bool()).sum().item()

    val_accuracy = 100. * val_correct / len(val_loader.dataset)
    val_loss /= len(val_loader)

    # Log validation metrics
    writer.add_scalar('Loss/val', val_loss, epoch)
    writer.add_scalar('Accuracy/val', val_accuracy, epoch)

    print(f'Epoch {epoch+1}/{epochs}, Train Loss: {train_loss:.4f}, Train Accuracy: {train_accuracy:.2f}%, Val Loss: {val_loss:.4f}, Val Accuracy: {val_accuracy:.2f}%')

# Close the writer when done
writer.close()

# To view the logs, run in a terminal: tensorboard --logdir runs

--------------------------------------------------------------------------------