├── amdgpu-install_5.4.50400-1_all.deb
├── rocblas_2.46.0.50401-84.20.04_amd64.deb
├── README.md
└── check_pytorch.py
/amdgpu-install_5.4.50400-1_all.deb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikos230/Run-Pytorch-with-AMD-Radeon-GPU/HEAD/amdgpu-install_5.4.50400-1_all.deb
--------------------------------------------------------------------------------
/rocblas_2.46.0.50401-84.20.04_amd64.deb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nikos230/Run-Pytorch-with-AMD-Radeon-GPU/HEAD/rocblas_2.46.0.50401-84.20.04_amd64.deb
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Run-Pytorch-with-AMD-Radeon-GPU
2 |
3 | ## Introduction
4 | This guide shows how to run PyTorch 2.1.1 on an AMD Radeon GPU. It has been tested on an RX 470 4GB. Your GPU needs to belong to the gfx803 family, such as the RX 400 and RX 500 series.
5 |
6 | ### Requirements
7 | - Ubuntu 22.04 LTS
8 | - AMD Radeon GPU in the gfx803 family (rx460, rx470, rx480, rx550, rx560, rx570, rx580)
9 |
10 | ### Download Required Files
11 | - Download [Pytorch 2.1.1](https://drive.google.com/file/d/1Tkyqe8VxUPkpf_jLZRzJphKNlW5Cqixi/view?usp=sharing) or build it yourself (see below)
12 | - Download [rocblas_2.46.0.50401-84.20.04_amd64.deb](https://github.com/xuhuisheng/rocm-gfx803/releases/tag/rocm541)
13 |
14 |
15 | ## Guide to Setup ROCm and Pytorch
16 |
17 | Start with a fresh installation of Ubuntu 22.04, then install ROCm. The required version is ROCm 5.4.0 or 5.4.3; choose one of these.
18 | - Open a terminal and type
19 |
20 | sudo su
21 | sudo echo ROC_ENABLE_PRE_VEGA=1 >> /etc/environment
22 | sudo echo HSA_OVERRIDE_GFX_VERSION=8.0.3 >> /etc/environment
23 |
24 | Reboot your system
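
After the reboot, it can be worth confirming that both overrides are actually visible to new sessions. The following is a minimal sketch, not part of the original repo; the helper name `missing_overrides` is made up for illustration, and the expected values are simply the ones written to `/etc/environment` above.

```python
import os

# Values taken from the /etc/environment lines in this guide.
EXPECTED = {
    "ROC_ENABLE_PRE_VEGA": "1",
    "HSA_OVERRIDE_GFX_VERSION": "8.0.3",
}


def missing_overrides(env):
    """Return the names of expected variables that are unset or have another value."""
    return [name for name, value in EXPECTED.items() if env.get(name) != value]


if __name__ == "__main__":
    missing = missing_overrides(os.environ)
    if missing:
        print("Not set as expected:", ", ".join(missing))
    else:
        print("Both overrides are visible in this session.")
```

If a variable is reported as missing, check `/etc/environment` for typos and log in again before continuing.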
25 |
26 |
27 | - Open a terminal and install ROCm (the commands below are for ROCm 5.4.0)
28 |
29 |
30 | cd Downloads
31 | wget https://repo.radeon.com/amdgpu-install/5.4/ubuntu/jammy/amdgpu-install_5.4.50400-1_all.deb
32 | sudo apt install ./amdgpu-install_5.4.50400-1_all.deb
33 | sudo amdgpu-install -y --no-dkms --usecase=rocm,hiplibsdk,mlsdk
34 | sudo usermod -aG video $LOGNAME
35 | sudo usermod -aG render $LOGNAME
36 |
37 |
38 | - Alternatively, install ROCm 5.4.3
39 |
40 | cd Downloads
41 | wget https://repo.radeon.com/amdgpu-install/5.4.3/ubuntu/jammy/amdgpu-install_5.4.50403-1_all.deb
42 | sudo apt install ./amdgpu-install_5.4.50403-1_all.deb
43 | sudo amdgpu-install -y --no-dkms --usecase=rocm,hiplibsdk,mlsdk
44 | sudo usermod -aG video $LOGNAME
45 | sudo usermod -aG render $LOGNAME
46 |
47 | Reboot your system
48 |
49 | - Open a terminal and check that ROCm is installed correctly
50 |
51 |
52 | rocminfo
53 | clinfo
54 |
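
The `rocminfo` output is long, so a small script can help pick out the GPU target. This is a rough sketch under the assumption that `rocminfo` lists each agent with a name such as `gfx803`; the function name `gfx_targets` is hypothetical and the script is not part of this repo.

```python
import re
import shutil
import subprocess


def gfx_targets(rocminfo_output):
    """Return the distinct gfx target names found in the text, in first-seen order."""
    seen = []
    for match in re.findall(r"\bgfx[0-9a-f]+\b", rocminfo_output):
        if match not in seen:
            seen.append(match)
    return seen


if __name__ == "__main__":
    if shutil.which("rocminfo") is None:
        print("rocminfo not found on PATH; is ROCm installed?")
    else:
        # Capture the full report and scan it for gfx names.
        out = subprocess.run(["rocminfo"], capture_output=True, text=True).stdout
        targets = gfx_targets(out)
        print("Detected targets:", targets if targets else "none")
        print("gfx803 present:", "gfx803" in targets)
```

If `gfx803` is not reported, the rest of the guide is unlikely to work as written.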
55 |
56 |
57 | - Then install libopenmpi3, libstdc++-11-dev, libopenblas-dev, and the patched rocblas version for gfx803
58 |
59 |
60 | sudo apt install libopenmpi3 libstdc++-11-dev
61 | sudo apt-get install libopenblas-dev
62 | cd Downloads
63 | sudo apt install ./rocblas_2.46.0.50401-84.20.04_amd64.deb
64 |
65 |
66 | - Now install PyTorch. You can use the pre-built wheel from the Download Required Files section, or build it yourself, which takes some time. You cannot install PyTorch with ROCm support directly from the PyTorch repo because those builds do not work on gfx803 GPUs.
67 |
68 |
69 | cd Downloads
70 | sudo apt install python3-pip
71 | pip install torch-2.1.1-cp310-cp310-linux_x86_64.whl
72 |
73 |
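74 | **Done!** You can check that PyTorch works correctly with the provided test script, `check_pytorch.py`. The script also imports seaborn, scikit-learn, and TensorBoard, so install them along with matplotlib.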
75 |
76 | pip install matplotlib seaborn scikit-learn tensorboard
77 | cd Downloads
78 | python3 check_pytorch.py
79 |
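
The full test script trains a model and opens plots, so a quicker first check can be useful. The sketch below is an illustration rather than part of the repo: `gpu_smoke_test` is a made-up helper that runs one small matrix multiplication on the GPU. It relies on the fact that ROCm builds of PyTorch expose the GPU through the `torch.cuda` API, which is also what `check_pytorch.py` uses.

```python
def gpu_smoke_test():
    """Return a one-line status describing whether a small GPU matmul works."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"

    if not torch.cuda.is_available():
        return "torch imported, but torch.cuda.is_available() is False"

    device = torch.device("cuda")
    a = torch.rand(256, 256, device=device)
    b = torch.rand(256, 256, device=device)
    c = (a @ b).cpu()  # copying back to the host waits for the kernel to finish
    ok = bool(torch.isfinite(c).all())
    status = "ok" if ok else "produced non-finite values"
    return f"{torch.cuda.get_device_name(0)}: matmul {status}"


print(gpu_smoke_test())
```

If this reports that no GPU is available, recheck the environment variables and the `video`/`render` group membership before running the longer script.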
80 |
81 |
82 |
83 | ### Build PyTorch for gfx803 (patched version)
84 | - First install the dependencies
85 |
86 | sudo apt install build-essential cmake python3-dev python3-numpy ninja-build libomp-dev libcurl4-openssl-dev libgflags-dev libgoogle-glog-dev libssl-dev libyaml-cpp-dev git
87 |
88 | - Now you can start the build
89 |
90 | git clone https://github.com/pytorch/pytorch.git -b v2.1.1
91 | cd pytorch
92 | export PATH=/opt/rocm/bin:$PATH ROCM_PATH=/opt/rocm HIP_PATH=/opt/rocm/hip
93 | export PYTORCH_ROCM_ARCH=gfx803
94 | export PYTORCH_BUILD_VERSION=2.1.1 PYTORCH_BUILD_NUMBER=1
95 | python3 tools/amd_build/build_amd.py
96 | USE_ROCM=1 USE_NINJA=1 python3 setup.py bdist_wheel
97 | pip3 install dist/torch-2.1.1-cp310-cp310-linux_x86_64.whl
98 |
99 | If you get the following error from the rocblas library while using ROCm 5.4.0, create the symbolic links shown after the error message:
100 |
101 | CMake Error at /opt/rocm-5.4.0/lib/cmake/rocblas/rocblas-targets.cmake:79 (message):
102 | The imported target "roc::rocblas" references the file
103 |
104 | "/opt/rocm-5.4.0/lib/librocblas.so.0.1.50400"
105 | but this file does not exist. Possible reasons include:
106 | * The file was deleted, renamed, or moved to another location.
107 | * An install or uninstall procedure did not complete successfully.
108 | * The installation package was faulty and contained
109 | "/opt/rocm-5.4.0/lib/cmake/rocblas/rocblas-targets.cmake"
110 | but not all the files it references.
111 | Call Stack (most recent call first):
112 | /opt/rocm/lib/cmake/rocblas/rocblas-config.cmake:92 (include)
113 | cmake/public/LoadHIP.cmake:161 (find_package)
114 | cmake/public/LoadHIP.cmake:291 (find_package_and_print_version)
115 | cmake/Dependencies.cmake:1268 (include)
116 | CMakeLists.txt:722 (include)
117 |
118 |
119 | sudo ln -s /opt/rocm-5.4.1/lib/librocblas.so.0 /opt/rocm-5.4.0/lib/librocblas.so.0
120 | sudo ln -s /opt/rocm-5.4.1/lib/librocblas.so.0.1.50401 /opt/rocm-5.4.0/lib/librocblas.so.0.1.50400
121 |
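
After creating the links, you may want to confirm that they resolve to real files. The following is a small sketch, assuming the ROCm 5.4.0 library directory is `/opt/rocm-5.4.0/lib` as in the error message above; `dangling_rocblas_links` is a hypothetical helper name.

```python
import os


def dangling_rocblas_links(lib_dir):
    """Return librocblas symlinks in lib_dir whose targets do not exist."""
    broken = []
    if not os.path.isdir(lib_dir):
        return broken
    for name in sorted(os.listdir(lib_dir)):
        path = os.path.join(lib_dir, name)
        # islink() is true for any symlink; exists() follows it to the target.
        if name.startswith("librocblas.so") and os.path.islink(path) and not os.path.exists(path):
            broken.append(path)
    return broken


if __name__ == "__main__":
    lib_dir = "/opt/rocm-5.4.0/lib"  # path taken from the CMake error above
    broken = dangling_rocblas_links(lib_dir)
    if broken:
        print("Dangling links:", ", ".join(broken))
    else:
        print("No dangling librocblas links found in", lib_dir)
```

A dangling link usually means the patched rocblas package was not installed where the link expects it.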
122 |
123 | ## References
124 | [https://github.com/tsl0922/pytorch-gfx803](https://github.com/tsl0922/pytorch-gfx803)
125 | [https://github.com/xuhuisheng/rocm-gfx803](https://github.com/xuhuisheng/rocm-gfx803)
126 |
--------------------------------------------------------------------------------
/check_pytorch.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import torch.optim as optim
4 | from torch.utils.data import TensorDataset, DataLoader
5 | from torch.utils.data.dataset import random_split
6 | import matplotlib.pyplot as plt
7 | import numpy as np
8 |
9 | class SimpleClassifier(nn.Module):
10 |
11 |     def __init__(self, num_inputs, num_hidden, num_outputs):
12 |         super().__init__()
13 |         # Initialize the modules we need to build the network
14 |         self.linear1 = nn.Linear(num_inputs, num_hidden)
15 |         self.act_fn = nn.Tanh()
16 |         self.linear2 = nn.Linear(num_hidden, num_outputs)
17 |
18 |     def forward(self, x):
19 |         # Perform the calculation of the model to determine the prediction
20 |         x = self.linear1(x)
21 |         x = self.act_fn(x)
22 |         x = self.linear2(x)
23 |         return x
24 |
25 |
26 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
27 | #device = 'cpu'
28 | print(device)
29 | model = SimpleClassifier(num_inputs=2, num_hidden=4, num_outputs=1).to(device)
30 | # Printing a module shows all its submodules
31 | print(model)
32 |
33 |
34 | for name, param in model.named_parameters():
35 |     print(f"Parameter {name}, shape {param.shape}")
36 |
37 |
38 |
39 | import torch
40 | from torch.utils.data import TensorDataset, DataLoader
41 | from torch.utils.data.dataset import random_split
42 | import matplotlib.pyplot as plt
43 | import numpy as np
44 |
45 |
46 |
47 | # Continuous XOR: the label is 1 when exactly one of x and y is positive
48 | def continuous_xor(x, y):
49 |     # Convert to integers for bitwise XOR, then convert back to float
50 |     return ((x > 0).float().int() ^ (y > 0).float().int()).float()
51 |
52 |
53 | # Generate random (x, y) pairs in the range of [-1, 1]
54 | n_samples = 10000
55 |
56 |
57 |
58 | x = (torch.rand(n_samples, 1, device=device) * 2 - 1).to(device) # Scale to [-1, 1]
59 | y = (torch.rand(n_samples, 1, device=device) * 2 - 1).to(device) # Scale to [-1, 1]
60 |
61 | # Apply the continuous XOR logic
62 | labels = continuous_xor(x, y)
63 |
64 | # Combine the (x, y) pairs
65 | inputs = torch.cat((x, y), dim=1)
66 |
67 | # Create a TensorDataset
68 | dataset = TensorDataset(inputs, labels)
69 |
70 | # Split the dataset into train, validation, and test sets
71 | train_size = int(n_samples * 0.8) # 80% of the dataset
72 | val_size = int(n_samples * 0.05) # 5% of the dataset
73 | test_size = n_samples - (train_size + val_size) # The remaining 15%
74 | train_dataset, val_dataset, test_dataset = random_split(dataset, [train_size, val_size, test_size], generator=torch.Generator().manual_seed(42))
75 |
76 |
77 |
78 | # Create DataLoader for each set
79 | batch_size = 32
80 | train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=0)
81 | val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, num_workers=0)
82 | test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=0)
83 |
84 |
85 |
86 | # Visualization of a batch from the training set
87 |
88 |
89 |
90 |
91 | optimizer = optim.SGD(model.parameters(), lr=0.01)
92 |
93 | #import torch.optim as optim
94 |
95 | #optimizer = optim.Adam(model.parameters(), lr=0.001)
96 |
97 |
98 |
99 | # Loss and Optimizer
100 | criterion = nn.BCEWithLogitsLoss()
101 |
102 | # Training Loop
103 | epochs = 100
104 | loss_log = []
105 | epoch_log = []
106 | for epoch in range(epochs):
107 |     model.train()
108 |     for inputs, labels in train_loader:
109 |         inputs, labels = inputs.to(device), labels.to(device)
110 |
111 |         optimizer.zero_grad()
112 |         outputs = model(inputs)
113 |         loss = criterion(outputs, labels)
114 |         loss.backward()
115 |         optimizer.step()
116 |
117 |     # Validation Loop
118 |     model.eval()
119 |     val_loss = 0.0
120 |     with torch.no_grad():
121 |         for inputs, labels in val_loader:
122 |             inputs, labels = inputs.to(device), labels.to(device)
123 |             outputs = model(inputs)
124 |             loss = criterion(outputs, labels)
125 |             val_loss += loss.item()
126 |     loss_log.append(val_loss/len(val_loader))
127 |     epoch_log.append(epoch)
128 |     print(f'Epoch {epoch+1}, Loss: {loss.item()}, Validation Loss: {val_loss / len(val_loader)}')
129 |
130 | import matplotlib.pyplot as plt
131 | import seaborn as sns
132 |
133 | # 'epoch_log' and 'loss_log' were filled in by the training loop above
134 | # epoch_log = [0, 1, 2, ..., 99]
135 | # loss_log = [loss_value1, loss_value2, loss_value3, ..., loss_value100]
136 |
137 | # Set the seaborn style for plotting
138 | sns.set(style="whitegrid")
139 |
140 | # Plotting with enhanced aesthetics for 100 epochs
141 | plt.figure(figsize=(14, 8))
142 | plt.plot(epoch_log, loss_log, label='Validation Loss', markersize=8, linewidth=2)
143 |
144 | # Adjusting titles and labels with enhanced font settings
145 | plt.title('Validation Loss Over 100 Epochs', fontsize=20, fontweight='bold', color='darkslateblue')
146 | plt.xlabel('Epoch', fontsize=16, fontweight='bold')
147 | plt.ylabel('Validation Loss', fontsize=16, fontweight='bold')
148 |
149 | # Adjusting x-axis to show every 10th epoch for better readability
150 | plt.xticks(range(1, 101, 10), fontsize=12, fontweight='bold')
151 | plt.yticks(fontsize=12, fontweight='bold')
152 | plt.legend(fontsize=14, frameon=True, shadow=True, borderpad=1)
153 |
154 | # Optional: Remove the top and right spines for a cleaner look and adjust the grid
155 |
156 | plt.show()
157 |
158 |
159 |
160 | state_dict = model.state_dict()
161 | print(state_dict)
162 |
163 |
164 | # torch.save(object, filename). For the filename, any extension can be used
165 | torch.save(state_dict, "our_model.tar")
166 |
167 |
168 | # Load state dict from the disk (make sure it is the same name as above)
169 | state_dict = torch.load("our_model.tar")
170 |
171 | # Create a new model and load the state
172 | new_model = SimpleClassifier(num_inputs=2, num_hidden=4, num_outputs=1)
173 | new_model.load_state_dict(state_dict)
174 |
175 | # Verify that the parameters are the same
176 | print("Original model\n", model.state_dict())
177 | print("\nLoaded model\n", new_model.state_dict())
178 |
179 |
180 |
181 | from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix
182 | import seaborn as sns
183 |
184 | def evaluate_model_and_metrics(model, data_loader, device):
185 |     model.eval() # Set the model to evaluation mode
186 |     all_labels = []
187 |     all_preds = []
188 |
189 |     with torch.no_grad():
190 |         for inputs, labels in data_loader:
191 |             inputs, labels = inputs.to(device), labels.to(device)
192 |             outputs = model(inputs)
193 |             preds = torch.sigmoid(outputs) > 0.5 # Convert to binary predictions
194 |             all_labels.extend(labels.cpu().numpy())
195 |             all_preds.extend(preds.cpu().numpy().flatten())
196 |
197 |     # Calculate metrics
198 |     accuracy = accuracy_score(all_labels, all_preds)
199 |     precision = precision_score(all_labels, all_preds)
200 |     recall = recall_score(all_labels, all_preds)
201 |     f1 = f1_score(all_labels, all_preds)
202 |     conf_matrix = confusion_matrix(all_labels, all_preds)
203 |
204 |     # Plotting
205 |     metrics = [accuracy, precision, recall, f1]
206 |     metric_names = ['Accuracy', 'Precision', 'Recall', 'F1 Score']
207 |
208 |     # Bar chart for metrics
209 |     plt.figure(figsize=(10, 6))
210 |     sns.barplot(x=metric_names, y=metrics)
211 |     plt.title('Model Performance Metrics')
212 |     plt.ylabel('Score')
213 |     plt.show()
214 |
215 |     # Confusion Matrix
216 |     plt.figure(figsize=(6, 6))
217 |     sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues', xticklabels=['False', 'True'], yticklabels=['False', 'True'])
218 |     plt.xlabel('Predicted Label')
219 |     plt.ylabel('True Label')
220 |     plt.title('Confusion Matrix')
221 |     plt.show()
222 |
223 | # Assuming test_loader is defined and contains the test dataset
224 | evaluate_model_and_metrics(model, test_loader, device)
225 |
226 |
227 |
228 | # Commented out IPython magic to ensure Python compatibility.
229 | # Import tensorboard logger from PyTorch
230 | from torch.utils.tensorboard import SummaryWriter
231 | writer = SummaryWriter('runs/xor_experiment_1')
232 | # Load tensorboard extension for Jupyter Notebook, only need to start TB in the notebook
233 | # %load_ext tensorboard
234 |
235 |
236 |
237 | import torch
238 | import torch.nn as nn
239 | import torch.optim as optim
240 | from torch.utils.tensorboard import SummaryWriter
241 | # Assuming SimpleClassifier, train_loader, val_loader are defined
242 |
243 | # Device configuration
244 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
245 |
246 | # Model instantiation
247 | model_board = SimpleClassifier(num_inputs=2, num_hidden=4, num_outputs=1).to(device)
248 |
249 | # Optimizer and Criterion
250 | optimizer = optim.SGD(model_board.parameters(), lr=0.1)
251 | criterion = nn.BCEWithLogitsLoss()
252 |
253 | # TensorBoard Writer
254 | writer = SummaryWriter()
255 |
256 | # Training Loop
257 | epochs = 100
258 | for epoch in range(epochs):
259 |     model_board.train()
260 |     train_loss, train_correct = 0, 0
261 |
262 |     for inputs, labels in train_loader:
263 |         inputs, labels = inputs.to(device), labels.to(device)
264 |
265 |         optimizer.zero_grad()
266 |         outputs = model_board(inputs)
267 |         loss = criterion(outputs, labels)
268 |         loss.backward()
269 |         optimizer.step()
270 |
271 |         train_loss += loss.item()
272 |         predictions = torch.sigmoid(outputs) > 0.5
273 |         train_correct += predictions.eq(labels.unsqueeze(1).data.view_as(predictions)).sum().item()
274 |
275 |     train_accuracy = 100. * train_correct / len(train_loader.dataset)
276 |     train_loss /= len(train_loader)
277 |
278 |     # Log training metrics
279 |     writer.add_scalar('Loss/train', train_loss, epoch)
280 |     writer.add_scalar('Accuracy/train', train_accuracy, epoch)
281 |
282 |     # Log gradients and weights histograms
283 |     for name, param in model_board.named_parameters():
284 |         writer.add_histogram(f'{name}/gradients', param.grad, epoch)
285 |         writer.add_histogram(f'{name}/weights', param, epoch)
286 |
287 |     # Validation phase
288 |     model_board.eval()
289 |     val_loss, val_correct = 0, 0
290 |     with torch.no_grad():
291 |         for inputs, labels in val_loader:
292 |             inputs, labels = inputs.to(device), labels.to(device)
293 |             outputs = model_board(inputs)
294 |             loss = criterion(outputs, labels)
295 |
296 |             val_loss += loss.item()
297 |             predictions = torch.sigmoid(outputs) > 0.5
298 |             val_correct += predictions.eq(labels.unsqueeze(1).data.view_as(predictions)).sum().item()
299 |
300 |     val_accuracy = 100. * val_correct / len(val_loader.dataset)
301 |     val_loss /= len(val_loader)
302 |
303 |     # Log validation metrics
304 |     writer.add_scalar('Loss/val', val_loss, epoch)
305 |     writer.add_scalar('Accuracy/val', val_accuracy, epoch)
306 |
307 |     print(f'Epoch {epoch+1}/{epochs}, Train Loss: {train_loss:.4f}, Train Accuracy: {train_accuracy:.2f}%, Val Loss: {val_loss:.4f}, Val Accuracy: {val_accuracy:.2f}%')
308 |
309 | # Close the writer when done
310 | writer.close()
311 |
312 |
313 |
314 | # Commented out IPython magic to ensure Python compatibility.
315 | # %tensorboard --logdir runs
316 |
--------------------------------------------------------------------------------