├── amdgpu-install_5.4.50400-1_all.deb
├── rocblas_2.46.0.50401-84.20.04_amd64.deb
├── README.md
└── check_pytorch.py


/amdgpu-install_5.4.50400-1_all.deb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikos230/Run-Pytorch-with-AMD-Radeon-GPU/HEAD/amdgpu-install_5.4.50400-1_all.deb


--------------------------------------------------------------------------------
/rocblas_2.46.0.50401-84.20.04_amd64.deb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikos230/Run-Pytorch-with-AMD-Radeon-GPU/HEAD/rocblas_2.46.0.50401-84.20.04_amd64.deb


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # Run-Pytorch-with-AMD-Radeon-GPU
  2 | 
  3 | ## Introduction
  4 | With this guide you can run PyTorch 2.1.1 on an AMD Radeon GPU. It has been tested on an RX 470 4GB. Your GPU needs to belong to the gfx803 family, such as the RX 400 and RX 500 series.
  5 | 
  6 | ### Requirements
  7 | - Ubuntu 22.04 LTS
  8 | - AMD Radeon GPU in the gfx803 family (rx460, rx470, rx480, rx550, rx560, rx570, rx580)
  9 | 
 10 | ### Download Required Files
 11 | - Download [Pytorch 2.1.1](https://drive.google.com/file/d/1Tkyqe8VxUPkpf_jLZRzJphKNlW5Cqixi/view?usp=sharing) or build it yourself (see below)
 12 | - Download [rocblas_2.46.0.50401-84.20.04_amd64.deb](https://github.com/xuhuisheng/rocm-gfx803/releases/tag/rocm541)
 13 | 
 14 | 
 15 | ## Guide to Setup ROCm and Pytorch
 16 | 
 17 | Start with a fresh installation of Ubuntu 22.04, then install the AMD ROCm stack. The version needed is ROCm 5.4.0 or 5.4.3; choose one of these.
 18 | - Open a terminal and type
 19 | <pre style="background-color: #f4f4f4; padding: 10px; border-radius: 8px;">
 20 | sudo su
 21 | sudo echo ROC_ENABLE_PRE_VEGA=1 >> /etc/environment
 22 | sudo echo HSA_OVERRIDE_GFX_VERSION=8.0.3 >> /etc/environment
 23 | </pre>
 24 | Reboot your system <br /><br />
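
After the reboot, it can help to confirm that both variables reached your login session. The following is a minimal sketch, assuming `/etc/environment` is applied to your session at login; the function name `missing_or_wrong` is hypothetical and only the two variable names and values come from the commands above.

```python
import os

# Variables the guide appends to /etc/environment, with the values it uses.
EXPECTED = {
    "ROC_ENABLE_PRE_VEGA": "1",
    "HSA_OVERRIDE_GFX_VERSION": "8.0.3",
}


def missing_or_wrong(env, expected=EXPECTED):
    """Return {name: (expected, actual)} for every variable that does not match."""
    return {
        name: (want, env.get(name))
        for name, want in expected.items()
        if env.get(name) != want
    }


if __name__ == "__main__":
    problems = missing_or_wrong(os.environ)
    if not problems:
        print("Both ROCm variables are set as expected.")
    for name, (want, actual) in problems.items():
        print(f"{name}: expected {want!r}, found {actual!r}")
```

If a variable is reported as missing, check the contents of `/etc/environment` and log in again before continuing.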
 25 | 
 26 | 
 27 | - Open a terminal and install ROCm (these commands are for ROCm 5.4.0)
 28 | 
 29 | <pre style="background-color: #f4f4f4; padding: 10px; border-radius: 8px;">
 30 | cd Downloads
 31 | wget https://repo.radeon.com/amdgpu-install/5.4/ubuntu/jammy/amdgpu-install_5.4.50400-1_all.deb
 32 | sudo apt install ./amdgpu-install_5.4.50400-1_all.deb
 33 | sudo amdgpu-install -y --no-dkms --usecase=rocm,hiplibsdk,mlsdk
 34 | sudo usermod -aG video $LOGNAME
 35 | sudo usermod -aG render $LOGNAME
 36 | </pre>
 37 | 
 38 | - Alternatively, install ROCm 5.4.3
 39 | <pre style="background-color: #f4f4f4; padding: 10px; border-radius: 8px;">
 40 | cd Downloads
 41 | wget https://repo.radeon.com/amdgpu-install/5.4.3/ubuntu/jammy/amdgpu-install_5.4.50403-1_all.deb
 42 | sudo apt install ./amdgpu-install_5.4.50403-1_all.deb
 43 | sudo amdgpu-install -y --no-dkms --usecase=rocm,hiplibsdk,mlsdk
 44 | sudo usermod -aG video $LOGNAME
 45 | sudo usermod -aG render $LOGNAME
 46 | </pre>
 47 | Reboot your system<br /><br />
 48 | 
 49 | - Open a terminal and check that ROCm is installed correctly
 50 |   
 51 | <pre style="background-color: #f4f4f4; padding: 10px; border-radius: 8px;">
 52 | rocminfo
 53 | clinfo
 54 | </pre>  
 55 | <br />
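
The `rocminfo` output is long, so a small filter can make the check easier. This is a rough sketch, assuming `rocminfo` is on your `PATH` and that it names each agent's target with a string such as `gfx803`; the helper name `find_gfx_targets` is hypothetical.

```python
import re
import shutil
import subprocess


def find_gfx_targets(rocminfo_output):
    """Return the sorted, de-duplicated gfx target names found in the text."""
    return sorted(set(re.findall(r"\bgfx[0-9a-f]+\b", rocminfo_output)))


if __name__ == "__main__":
    if shutil.which("rocminfo") is None:
        print("rocminfo was not found on PATH; is ROCm installed?")
    else:
        result = subprocess.run(
            ["rocminfo"], capture_output=True, text=True, check=False
        )
        targets = find_gfx_targets(result.stdout)
        print("Detected targets:", targets or "none")
        if "gfx803" not in targets:
            print("No gfx803 agent was reported; this guide targets gfx803 GPUs.")
```

If `gfx803` does not appear, recheck the environment variables and the group membership from the earlier steps.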
 56 | 
 57 |  - Then install libopenmpi3, libstdc++-11-dev, libopenblas-dev, and the patched rocblas build for gfx803
 58 | 
 59 | <pre style="background-color: #f4f4f4; padding: 10px; border-radius: 8px;">
 60 | sudo apt install libopenmpi3 libstdc++-11-dev
 61 | sudo apt-get install libopenblas-dev
 62 | cd Downloads
 63 | sudo apt install ./rocblas_2.46.0.50401-84.20.04_amd64.deb 
 64 | </pre><br /> 
 65 | 
 66 | - Now install PyTorch. You can use the pre-built wheel listed under Download Required Files, or build it yourself, which takes some time. You cannot install PyTorch with ROCm support directly from the official PyTorch packages, because those builds do not work on gfx803 GPUs.
 67 | 
 68 | <pre style="background-color: #f4f4f4; padding: 10px; border-radius: 8px;">
 69 | cd Downloads
 70 | sudo apt install python3-pip
 71 | pip install torch-2.1.1-cp310-cp310-linux_x86_64.whl
 72 | </pre>
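
Before running the full test script, a short smoke test can confirm that the wheel imports and sees the GPU. This is a minimal sketch: ROCm builds of PyTorch expose the GPU through the `torch.cuda` API, and the function name `torch_smoke_test` and the 64x64 matrix size are arbitrary choices for illustration.

```python
import importlib.util


def torch_smoke_test():
    """Return a small report dict; does not raise if torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # Set on ROCm builds, None on other builds.
        "hip_version": getattr(torch.version, "hip", None),
        "gpu_available": torch.cuda.is_available(),
    }
    if report["gpu_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
        # Run one small matrix multiplication on the GPU.
        a = torch.rand(64, 64, device="cuda")
        report["matmul_ok"] = bool(torch.isfinite(a @ a).all().item())
    return report


if __name__ == "__main__":
    for key, value in torch_smoke_test().items():
        print(f"{key}: {value}")
```

If `gpu_available` is `False`, the test script will still run, but on the CPU.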
 73 | 
 74 | **Done!** You can check that PyTorch works correctly with the provided test script.<br /><br />
 75 | <pre style="background-color: #f4f4f4; padding: 10px; border-radius: 8px;">
 76 | pip install matplotlib seaborn scikit-learn tensorboard
 77 | cd Downloads
 78 | python3 check_pytorch.py
 79 | </pre>  
 80 | <br />
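
`check_pytorch.py` imports several third-party packages. If it stops with an `ImportError`, a sketch like the one below can list everything that is missing in one pass. The helper name `missing_modules` is hypothetical; the module list is taken from the import statements in the script.

```python
import importlib.util

# Top-level modules imported by check_pytorch.py, mapped to the pip package
# that provides each one.
REQUIRED = {
    "torch": "torch (the gfx803 wheel from this guide)",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "sklearn": "scikit-learn",
    "tensorboard": "tensorboard",
}


def missing_modules(required=REQUIRED):
    """Return the module names from `required` that cannot be found."""
    return [name for name in required if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if not missing:
        print("All modules used by check_pytorch.py are available.")
    for name in missing:
        print(f"Missing module {name!r}; install package: {REQUIRED[name]}")
```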
 81 | 
 82 | 
 83 | ### Build PyTorch for gfx803 (patched version)
 84 | - First, install the dependencies
 85 | <pre style="background-color: #f4f4f4; padding: 10px; border-radius: 8px;">
 86 | sudo apt install build-essential cmake python3-dev python3-numpy ninja-build libomp-dev libcurl4-openssl-dev libgflags-dev libgoogle-glog-dev libssl-dev libyaml-cpp-dev git
 87 | </pre>
 88 | - Now you can start the build
 89 | <pre style="background-color: #f4f4f4; padding: 10px; border-radius: 8px;">
 90 | git clone https://github.com/pytorch/pytorch.git -b v2.1.1
 91 | cd pytorch
 92 | export PATH=/opt/rocm/bin:$PATH ROCM_PATH=/opt/rocm HIP_PATH=/opt/rocm/hip
 93 | export PYTORCH_ROCM_ARCH=gfx803
 94 | export PYTORCH_BUILD_VERSION=2.1.1 PYTORCH_BUILD_NUMBER=1
 95 | python3 tools/amd_build/build_amd.py
 96 | USE_ROCM=1 USE_NINJA=1 python3 setup.py bdist_wheel
 97 | pip3 install dist/torch-2.1.1-cp310-cp310-linux_x86_64.whl
 98 | </pre>
 99 | If you get the following error from the rocblas library and you are using ROCm 5.4.0, create the symbolic links shown after the error message:
100 | <pre style="background-color: #f4f4f4; padding: 10px; border-radius: 8px;">
101 | CMake Error at /opt/rocm-5.4.0/lib/cmake/rocblas/rocblas-targets.cmake:79 (message):
102 |   The imported target "roc::rocblas" references the file
103 | 
104 |      "/opt/rocm-5.4.0/lib/librocblas.so.0.1.50400"
105 |   but this file does not exist.  Possible reasons include:
106 |   * The file was deleted, renamed, or moved to another location.
107 |   * An install or uninstall procedure did not complete successfully.
108 |   * The installation package was faulty and contained
109 |      "/opt/rocm-5.4.0/lib/cmake/rocblas/rocblas-targets.cmake"
110 |   but not all the files it references.
111 | Call Stack (most recent call first):
112 |   /opt/rocm/lib/cmake/rocblas/rocblas-config.cmake:92 (include)
113 |   cmake/public/LoadHIP.cmake:161 (find_package)
114 |   cmake/public/LoadHIP.cmake:291 (find_package_and_print_version)
115 |   cmake/Dependencies.cmake:1268 (include)
116 |   CMakeLists.txt:722 (include)
117 | </pre>
118 | <pre style="background-color: #f4f4f4; padding: 10px; border-radius: 8px;">
119 | sudo ln -s /opt/rocm-5.4.1/lib/librocblas.so.0 /opt/rocm-5.4.0/lib/librocblas.so.0
120 | sudo ln -s /opt/rocm-5.4.1/lib/librocblas.so.0.1.50401 /opt/rocm-5.4.0/lib/librocblas.so.0.1.50400
121 | </pre>
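
Before creating the links, you may want to confirm that the source files exist and that nothing already occupies the link paths. The following is a dry-run sketch that changes nothing on disk; the paths and file names are copied from the `ln -s` commands above, and the function name `plan_links` is hypothetical.

```python
import os

# Directories and file names taken from the ln -s commands above.
SOURCE_DIR = "/opt/rocm-5.4.1/lib"
TARGET_DIR = "/opt/rocm-5.4.0/lib"
LINKS = {
    "librocblas.so.0": "librocblas.so.0",
    "librocblas.so.0.1.50401": "librocblas.so.0.1.50400",
}


def plan_links(source_dir=SOURCE_DIR, target_dir=TARGET_DIR, links=LINKS):
    """Return (source, link_path, status) tuples without modifying anything."""
    plan = []
    for src_name, link_name in links.items():
        src = os.path.join(source_dir, src_name)
        dst = os.path.join(target_dir, link_name)
        if os.path.lexists(dst):
            status = "link path already exists"
        elif not os.path.exists(src):
            status = "source file missing"
        else:
            status = "ready to link"
        plan.append((src, dst, status))
    return plan


if __name__ == "__main__":
    for src, dst, status in plan_links():
        print(f"{dst} -> {src}: {status}")
```

A "source file missing" status suggests the patched rocblas package did not install into `/opt/rocm-5.4.1`.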
122 | 
123 | ## References
124 | [https://github.com/tsl0922/pytorch-gfx803](https://github.com/tsl0922/pytorch-gfx803) <br />
125 | [https://github.com/xuhuisheng/rocm-gfx803](https://github.com/xuhuisheng/rocm-gfx803)
126 | 


--------------------------------------------------------------------------------
/check_pytorch.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.optim as optim
  4 | from torch.utils.data import TensorDataset, DataLoader
  5 | from torch.utils.data.dataset import random_split
  6 | import matplotlib.pyplot as plt
  7 | import numpy as np
  8 | 
  9 | class SimpleClassifier(nn.Module):
 10 | 
 11 |     def __init__(self, num_inputs, num_hidden, num_outputs):
 12 |         super().__init__()
 13 |         # Initialize the modules we need to build the network
 14 |         self.linear1 = nn.Linear(num_inputs, num_hidden)
 15 |         self.act_fn = nn.Tanh()
 16 |         self.linear2 = nn.Linear(num_hidden, num_outputs)
 17 | 
 18 |     def forward(self, x):
 19 |         # Perform the calculation of the model to determine the prediction
 20 |         x = self.linear1(x)
 21 |         x = self.act_fn(x)
 22 |         x = self.linear2(x)
 23 |         return x
 24 | 
 25 | 
 26 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
 27 | #device = 'cpu'
 28 | print(device)
 29 | model = SimpleClassifier(num_inputs=2, num_hidden=4, num_outputs=1).to(device)
 30 | # Printing a module shows all its submodules
 31 | print(model)
 32 | 
 33 | 
 34 | for name, param in model.named_parameters():
 35 |     print(f"Parameter {name}, shape {param.shape}")
 36 | 
 37 | 
 38 | 
 44 | 
 45 | 
 46 | 
 47 | # Continuous XOR: the label is 1.0 when exactly one of x and y is positive
 48 | def continuous_xor(x, y):
 49 |     # Compare each input with 0, XOR the boolean masks, and return floats
 50 |     return ((x > 0) ^ (y > 0)).float()
 51 | 
 52 | 
 53 | # Generate random (x, y) pairs in the range of [-1, 1]
 54 | n_samples = 10000
 55 | 
 56 | 
 57 | 
 58 | x = torch.rand(n_samples, 1, device=device) * 2 - 1  # Scale to [-1, 1)
 59 | y = torch.rand(n_samples, 1, device=device) * 2 - 1  # Scale to [-1, 1)
 60 | 
 61 | # Apply the continuous XOR logic
 62 | labels = continuous_xor(x, y)
 63 | 
 64 | # Combine the (x, y) pairs
 65 | inputs = torch.cat((x, y), dim=1)
 66 | 
 67 | # Create a TensorDataset
 68 | dataset = TensorDataset(inputs, labels)
 69 | 
 70 | # Split the dataset into train, validation, and test sets
 71 | train_size = int(n_samples * 0.8)  # 80% of the dataset
 72 | val_size = int(n_samples * 0.05)  # 5% of the dataset
 73 | test_size = n_samples - (train_size + val_size)  # The remaining 15%
 74 | train_dataset, val_dataset, test_dataset = random_split(dataset, [train_size, val_size, test_size], generator=torch.Generator().manual_seed(42))
 75 | 
 76 | 
 77 | 
 78 | # Create DataLoader for each set
 79 | batch_size = 32
 80 | train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=0)
 81 | val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, num_workers=0)
 82 | test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=0)
 83 | 
 84 | 
 85 | 
 86 | # Visualization of a batch from the training set
 87 | 
 88 | 
 89 | 
 90 | 
 91 | optimizer = optim.SGD(model.parameters(), lr=0.01)
 92 | 
 93 | #import torch.optim as optim
 94 | 
 95 | #optimizer = optim.Adam(model.parameters(), lr=0.001)
 96 | 
 97 | 
 98 | 
 99 | # Loss and Optimizer
100 | criterion = nn.BCEWithLogitsLoss()
101 | 
102 | # Training Loop
103 | epochs = 100
104 | loss_log = []
105 | epoch_log = []
106 | for epoch in range(epochs):
107 |     model.train()
108 |     for inputs, labels in train_loader:
109 |         inputs, labels = inputs.to(device), labels.to(device)
110 | 
111 |         optimizer.zero_grad()
112 |         outputs = model(inputs)
113 |         loss = criterion(outputs, labels)
114 |         loss.backward()
115 |         optimizer.step()
116 | 
117 |     # Validation Loop
118 |     model.eval()
119 |     val_loss = 0.0
120 |     with torch.no_grad():
121 |         for inputs, labels in val_loader:
122 |             inputs, labels = inputs.to(device), labels.to(device)
123 |             outputs = model(inputs)
124 |             val_loss += criterion(outputs, labels).item()
125 |     loss_log.append(val_loss / len(val_loader))
126 |     epoch_log.append(epoch + 1)  # 1-based, to match the printed epoch number
127 |     # 'loss' still holds the last training batch loss of this epoch
128 |     print(f'Epoch {epoch+1}, Last Train Batch Loss: {loss.item()}, Validation Loss: {val_loss / len(val_loader)}')
129 | 
130 | import matplotlib.pyplot as plt
131 | import seaborn as sns
132 | 
133 | # 'epoch_log' and 'loss_log' are the lists filled during the training loop above:
134 | # epoch_log holds one entry per epoch
135 | # loss_log holds the mean validation loss for that epoch
136 | 
137 | # Set the seaborn style for plotting
138 | sns.set(style="whitegrid")
139 | 
140 | # Plotting with enhanced aesthetics for 100 epochs
141 | plt.figure(figsize=(14, 8))
142 | plt.plot(epoch_log, loss_log, label='Validation Loss',  markersize=8, linewidth=2)
143 | 
144 | # Adjusting titles and labels with enhanced font settings
145 | plt.title('Validation Loss Over 100 Epochs', fontsize=20, fontweight='bold', color='darkslateblue')
146 | plt.xlabel('Epoch', fontsize=16, fontweight='bold')
147 | plt.ylabel('Validation Loss', fontsize=16, fontweight='bold')
148 | 
149 | # Adjusting x-axis to show every 10th epoch for better readability
150 | plt.xticks(range(1, 101, 10), fontsize=12, fontweight='bold')
151 | plt.yticks(fontsize=12, fontweight='bold')
152 | plt.legend(fontsize=14, frameon=True, shadow=True, borderpad=1)
153 | 
154 | # Optional: Remove the top and right spines for a cleaner look and adjust the grid
155 | 
156 | plt.show()
157 | 
158 | 
159 | 
160 | state_dict = model.state_dict()
161 | print(state_dict)
162 | 
163 | 
164 | # torch.save(object, filename). For the filename, any extension can be used
165 | torch.save(state_dict, "our_model.tar")
166 | 
167 | 
168 | # Load state dict from the disk (make sure it is the same name as above)
169 | state_dict = torch.load("our_model.tar")
170 | 
171 | # Create a new model and load the state
172 | new_model = SimpleClassifier(num_inputs=2, num_hidden=4, num_outputs=1)
173 | new_model.load_state_dict(state_dict)
174 | 
175 | # Verify that the parameters are the same
176 | print("Original model\n", model.state_dict())
177 | print("\nLoaded model\n", new_model.state_dict())
178 | 
179 | 
180 | 
181 | from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix
182 | import seaborn as sns
183 | 
184 | def evaluate_model_and_metrics(model, data_loader, device):
185 |     model.eval()  # Set the model to evaluation mode
186 |     all_labels = []
187 |     all_preds = []
188 | 
189 |     with torch.no_grad():
190 |         for inputs, labels in data_loader:
191 |             inputs, labels = inputs.to(device), labels.to(device)
192 |             outputs = model(inputs)
193 |             preds = torch.sigmoid(outputs) > 0.5  # Convert to binary predictions
194 |             all_labels.extend(labels.cpu().numpy().flatten())  # flatten (batch, 1) to 1-D to match all_preds
195 |             all_preds.extend(preds.cpu().numpy().flatten())
196 | 
197 |     # Calculate metrics
198 |     accuracy = accuracy_score(all_labels, all_preds)
199 |     precision = precision_score(all_labels, all_preds)
200 |     recall = recall_score(all_labels, all_preds)
201 |     f1 = f1_score(all_labels, all_preds)
202 |     conf_matrix = confusion_matrix(all_labels, all_preds)
203 | 
204 |     # Plotting
205 |     metrics = [accuracy, precision, recall, f1]
206 |     metric_names = ['Accuracy', 'Precision', 'Recall', 'F1 Score']
207 | 
208 |     # Bar chart for metrics
209 |     plt.figure(figsize=(10, 6))
210 |     sns.barplot(x=metric_names, y=metrics)
211 |     plt.title('Model Performance Metrics')
212 |     plt.ylabel('Score')
213 |     plt.show()
214 | 
215 |     # Confusion Matrix
216 |     plt.figure(figsize=(6, 6))
217 |     sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues', xticklabels=['False', 'True'], yticklabels=['False', 'True'])
218 |     plt.xlabel('Predicted Label')
219 |     plt.ylabel('True Label')
220 |     plt.title('Confusion Matrix')
221 |     plt.show()
222 | 
223 | # Assuming test_loader is defined and contains the test dataset
224 | evaluate_model_and_metrics(model, test_loader, device)
225 | 
226 | 
227 | 
228 | # Commented out IPython magic to ensure Python compatibility.
229 | # Import tensorboard logger from PyTorch
230 | from torch.utils.tensorboard import SummaryWriter
231 | writer = SummaryWriter('runs/xor_experiment_1')
232 | # Load tensorboard extension for Jupyter Notebook, only need to start TB in the notebook
233 | # %load_ext tensorboard
234 | 
235 | 
236 | 
241 | # Assuming SimpleClassifier, train_loader, val_loader are defined
242 | 
243 | # Device configuration
244 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
245 | 
246 | # Model instantiation
247 | model_board = SimpleClassifier(num_inputs=2, num_hidden=4, num_outputs=1).to(device)
248 | 
249 | # Optimizer and Criterion
250 | optimizer = optim.SGD(model_board.parameters(), lr=0.1)
251 | criterion = nn.BCEWithLogitsLoss()
252 | 
253 | # TensorBoard Writer
254 | writer = SummaryWriter()
255 | 
256 | # Training Loop
257 | epochs = 100
258 | for epoch in range(epochs):
259 |     model_board.train()
260 |     train_loss, train_correct = 0, 0
261 | 
262 |     for inputs, labels in train_loader:
263 |         inputs, labels = inputs.to(device), labels.to(device)
264 | 
265 |         optimizer.zero_grad()
266 |         outputs = model_board(inputs)
267 |         loss = criterion(outputs, labels)
268 |         loss.backward()
269 |         optimizer.step()
270 | 
271 |         train_loss += loss.item()
272 |         predictions = torch.sigmoid(outputs) > 0.5
273 |         train_correct += (predictions == labels.bool()).sum().item()  # both tensors have shape (batch, 1)
274 | 
275 |     train_accuracy = 100. * train_correct / len(train_loader.dataset)
276 |     train_loss /= len(train_loader)
277 | 
278 |     # Log training metrics
279 |     writer.add_scalar('Loss/train', train_loss, epoch)
280 |     writer.add_scalar('Accuracy/train', train_accuracy, epoch)
281 | 
282 |     # Log gradients and weights histograms
283 |     for name, param in model_board.named_parameters():
284 |         writer.add_histogram(f'{name}/gradients', param.grad, epoch)
285 |         writer.add_histogram(f'{name}/weights', param, epoch)
286 | 
287 |     # Validation phase
288 |     model_board.eval()
289 |     val_loss, val_correct = 0, 0
290 |     with torch.no_grad():
291 |         for inputs, labels in val_loader:
292 |             inputs, labels = inputs.to(device), labels.to(device)
293 |             outputs = model_board(inputs)
294 |             loss = criterion(outputs, labels)
295 | 
296 |             val_loss += loss.item()
297 |             predictions = torch.sigmoid(outputs) > 0.5
298 |             val_correct += (predictions == labels.bool()).sum().item()  # both tensors have shape (batch, 1)
299 | 
300 |     val_accuracy = 100. * val_correct / len(val_loader.dataset)
301 |     val_loss /= len(val_loader)
302 | 
303 |     # Log validation metrics
304 |     writer.add_scalar('Loss/val', val_loss, epoch)
305 |     writer.add_scalar('Accuracy/val', val_accuracy, epoch)
306 | 
307 |     print(f'Epoch {epoch+1}/{epochs}, Train Loss: {train_loss:.4f}, Train Accuracy: {train_accuracy:.2f}%, Val Loss: {val_loss:.4f}, Val Accuracy: {val_accuracy:.2f}%')
308 | 
309 | # Close the writer when done
310 | writer.close()
311 | 
312 | 
313 | 
314 | # Commented out IPython magic to ensure Python compatibility.
315 | # %tensorboard --logdir runs
316 | 


--------------------------------------------------------------------------------