├── .README.md.swp └── README.md /.README.md.swp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kazzastic/Tensorflow-BuildFromSource/34517029fcac17a4762188577caa8ae03eb68059/.README.md.swp -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Tensorflow-BuildFromSource 2 | How to built tensorflow from source, on Xeon processor, ubuntu 19.x, tf 2.x with GPU support. 3 | 4 | Even though all the required information that one might need when building tensorflow from source can be obtained from 5 | https://www.tensorflow.org/install/source but I just want to highlight the current situation of tensorflow on ubuntu and xeon processor. 6 | 7 | ## Configuration 8 | These are the configuration which will be able to re produce results most accurately 9 | - Ubuntu 19.10 10 | - CPU: Intel Xeon E5620 (8) @ 2.395GHz 11 | - GPU: NVIDIA GeForce GTX 1050 Ti 12 | - NVIDIA-SMI 440.64.00 13 | - CUDA 10.1 14 | - nvcc cuDNN 7.6 15 | 16 | ## GPU softwares 17 | 18 | ### CUDA Toolkit 10.1 Download 19 | [Here](https://developer.nvidia.com/cuda-10.1-download-archive-update2?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal) you can see how to install the CUDA 10.1, even though this is outdated but you would want to install this once since TF 2.X supports this version. 20 | 21 | ``` 22 | $ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin 23 | 24 | $ sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600 25 | 26 | $ wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb 27 | 28 | $ sudo dpkg -i cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb 29 | 30 | $ sudo apt-key add /var/cuda-repo-10-1-local-10.1.243-418.87.00/7fa2af80.pub 31 | 32 | $ sudo apt-get update 33 | 34 | $ sudo apt-get -y install cuda 35 | ``` 36 | now, if you had already tried installing an nvidia toolkit before then you might run into this minor problem which might give you following error. 37 | 38 | ``` 39 | $ sudo apt-get install cuda 40 | Reading package lists... Done 41 | Building dependency tree 42 | Reading state information... Done 43 | Some packages could not be installed. This may mean that you have 44 | requested an impossible situation or if you are using the unstable 45 | distribution that some required packages have not yet been created 46 | or been moved out of Incoming. 47 | The following information may help to resolve the situation: 48 | 49 | The following packages have unmet dependencies. 50 | cuda : Depends: cuda-10-0 (>= 10.0.130) but it is not going to be installed 51 | E: Unable to correct problems, you have held broken packages 52 | ``` 53 | this is normal and can be solved by the following commands, 54 | 55 | ``` 56 | $ sudo add-apt-repository -r ppa:graphics-drivers/ppa 57 | 58 | $ sudo apt remove nvidia-* 59 | 60 | $ sudo apt autoremove 61 | ``` 62 | this should solved the problem by removing previouly installed toolkits of any version, and now proceed where you left from meaning, 63 | 64 | ``` 65 | $ sudo apt install cuda 66 | ``` 67 | once done installing, modify the `~/.profile` file and add the following lines, 68 | first gedit, nano or vim open the file using following command, 69 | ``` 70 | $ nano ~/.profile 71 | ``` 72 | then add these lines, 73 | ``` 74 | # set PATH for cuda 10.1 installation 75 | if [ -d "/usr/local/cuda-10.1/bin/" ]; then 76 | export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}} 77 | export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}} 78 | fi 79 | ``` 80 | then compile this once, just to be sure! 81 | ``` 82 | $ source ~/.profile 83 | ``` 84 | now restart the OS once or use this, 85 | ``` 86 | $ sudo reboot now 87 | ``` 88 | Now check the installation using these commmands, 89 | ``` 90 | $ nvidia-smi 91 | Sun Apr 5 18:34:20 2020 92 | +-----------------------------------------------------------------------------+ 93 | | NVIDIA-SMI 440.64.00 Driver Version: 440.64.00 CUDA Version: 10.2 | 94 | |-------------------------------+----------------------+----------------------+ 95 | | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | 96 | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | 97 | |===============================+======================+======================| 98 | | 0 GeForce GTX 105... On | 00000000:0F:00.0 On | N/A | 99 | | 0% 46C P0 N/A / 72W | 341MiB / 4038MiB | 15% Default | 100 | +-------------------------------+----------------------+----------------------+ 101 | 102 | +-----------------------------------------------------------------------------+ 103 | | Processes: GPU Memory | 104 | | GPU PID Type Process name Usage | 105 | |=============================================================================| 106 | | 0 1113 G /usr/lib/xorg/Xorg 14MiB | 107 | | 0 1705 G /usr/lib/xorg/Xorg 85MiB | 108 | | 0 1912 G /usr/bin/gnome-shell 101MiB | 109 | | 0 7262 G ...AAAAAAAAAAAAAAgAAAAAAAAA --shared-files 88MiB | 110 | +-----------------------------------------------------------------------------+ 111 | 112 | $ nvcc -V 113 | nvcc: NVIDIA (R) Cuda compiler driver 114 | Copyright (c) 2005-2019 NVIDIA Corporation 115 | Built on Sun_Jul_28_19:07:16_PDT_2019 116 | Cuda compilation tools, release 10.1, V10.1.243 117 | ``` 118 | if you get any error here reach out to github/stackoverflow community. 119 | 120 | If you run into any gcc or gnu version error till now, or in any further installation then install these versions of gcc and gnu, 121 | ``` 122 | $ sudo apt-get install -y software-properties-common 123 | 124 | $ sudo add-apt-repository ppa:ubuntu-toolchain-r/test 125 | 126 | $ sudo apt update 127 | 128 | $ sudo apt install g++-7 -y 129 | ``` 130 | Set it up so the symbolic links gcc, g++ point to the newer version: 131 | ``` 132 | $ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 60 \ 133 | --slave /usr/bin/g++ g++ /usr/bin/g++-7 134 | 135 | $ sudo update-alternatives --config gcc 136 | 137 | $ gcc --version 138 | 139 | $ g++ --version 140 | ``` 141 | 142 | ### cuDNN v7.6.5 for CUDA 10.1 143 | 144 | You can find the file [here](https://developer.nvidia.com/rdp/cudnn-download), you might have to sign up to download it. 145 | Download the given packages 146 | - cuDNN Runtime Library for Ubuntu18.04 (Deb) 147 | - cuDNN Developer Library for Ubuntu18.04 (Deb) 148 | - cuDNN Code Samples and User Guide for Ubuntu18.04 (Deb) 149 | These are for Ubuntu 18.04 but they work like a charm even on Ubuntu 19.10 150 | 151 | ``` 152 | # see files we downloaded 153 | $ ll | grep cudnn 154 | 155 | $ sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.1_amd64.deb 156 | 157 | $ sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb 158 | 159 | $ sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb 160 | 161 | $ apt search libcudnn7 162 | ``` 163 | navigate the packages and install each as above. 164 | We then test the CUDNN installation with: 165 | ``` 166 | $ cp -r /usr/src/cudnn_samples_v7/ ~/test/ 167 | $ cd ~/test/mnistCUDNN/ 168 | $ make clean && make 169 | $ $ ./mnistCUDNN 170 | # ... 171 | # Test passed! 172 | ``` 173 | We need to know where CUDDN7 is installed for tensorflow build configuration 174 | ``` 175 | $ whereis cudnn 176 | 177 | cudnn: /usr/include/cudnn.h 178 | 179 | $ ll /usr/include/ | grep cudnn 180 | 181 | lrwxrwxrwx 1 root root 26 Sep 29 22:06 cudnn.h -> /etc/alternatives/libcudnn 182 | 183 | $ ll /etc/alternatives/libcudnn 184 | 185 | libcudnn libcudnn_so libcudnn_stlib 186 | 187 | $ ll /etc/alternatives/ | grep cudnn 188 | lrwxrwxrwx 1 root root 40 Sep 29 22:06 libcudnn -> /usr/include/x86_64-linux-gnu/cudnn_v7.h 189 | lrwxrwxrwx 1 root root 39 Sep 29 22:06 libcudnn_so -> /usr/lib/x86_64-linux-gnu/libcudnn.so.7 190 | lrwxrwxrwx 1 root root 46 Sep 29 22:06 libcudnn_stlib -> /usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a 191 | 192 | $ dpkg-query -L libcudnn7 193 | /. 194 | /usr 195 | /usr/lib 196 | /usr/lib/x86_64-linux-gnu 197 | /usr/lib/x86_64-linux-gnu/libcudnn.so.7.3.1 198 | /usr/share 199 | /usr/share/doc 200 | /usr/share/doc/libcudnn7 201 | /usr/share/doc/libcudnn7/changelog.Debian.gz 202 | /usr/share/doc/libcudnn7/copyright 203 | /usr/share/lintian 204 | /usr/share/lintian/overrides 205 | /usr/share/lintian/overrides/libcudnn7 206 | /usr/lib/x86_64-linux-gnu/libcudnn.so.7 207 | 208 | $ dpkg-query -L libcudnn7-dev 209 | /. 210 | /usr 211 | /usr/include 212 | /usr/include/x86_64-linux-gnu 213 | /usr/include/x86_64-linux-gnu/cudnn_v7.h 214 | /usr/lib 215 | /usr/lib/x86_64-linux-gnu 216 | /usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a 217 | /usr/share 218 | /usr/share/doc 219 | /usr/share/doc/libcudnn7-dev 220 | /usr/share/doc/libcudnn7-dev/changelog.Debian.gz 221 | /usr/share/doc/libcudnn7-dev/copyright 222 | /usr/share/lintian 223 | /usr/share/lintian/overrides 224 | /usr/share/lintian/overrides/libcudnn7-dev 225 | ``` 226 | We decide to use `/etc/alternatives/` for the tensorflow configuration. 227 | 228 | ## Install Bazel 229 | 230 | ### Release Version of Bazel 231 | The most suitable for our build can be checked [here](https://www.tensorflow.org/install/source#tested_build_configurations) further more navigate to correct release version of bazel from [here](https://github.com/bazelbuild/bazel/releases) I used bazel version 2.0.0 232 | download the `bazel-2.0.0-installer-linux-x86_64.sh` 233 | 234 | ``` 235 | $ cd where-bazel-is-downloaded 236 | 237 | $ chmod +x bazel-2.0.0-installer-linux-x86_64.sh 238 | 239 | $ ./bazel-2.0.0-installer-linux-x86_64.sh --user 240 | ``` 241 | this would install the bazel. Now setup the enviroment, 242 | ``` 243 | export PATH="$PATH:$HOME/bin" 244 | ``` 245 | add these lines to your ./bashrc file, you can open this file with these commands, 246 | ``` 247 | $ nano ~/.bashrc 248 | ``` 249 | 250 | ### NVIDIA Collective Communications Library (NCCL) 251 | From [here](https://developer.nvidia.com/nccl/nccl-download) download `Download NCCL v2.6.4, for CUDA 10.1, March 26,2020`. When enabling CUDA tensorflow asks us to specify the nccl library location. We go at this nvidia webpage to install it. We download the network installer, so that it is more straightforward to get updates. 252 | 253 | ``` 254 | $ sudo dpkg-i nccl-repo-ubuntu1804-2.6.4-ga-cuda10.2_1-1_amd64.deb 255 | 256 | $ sudo apt update 257 | 258 | $ apt search libnccl 259 | Full Text Search... Done 260 | libnccl-dev/unknown 2.3.5-2+cuda10.0 amd64 261 | NVIDIA Collectives Communication Library (NCCL) Development Files 262 | 263 | libnccl2/unknown 2.3.5-2+cuda10.0 amd64 264 | NVIDIA Collectives Communication Library (NCCL) Runtime 265 | 266 | $ sudo apt install libnccl2 libnccl-dev 267 | 268 | # See where the libraries got installed 269 | $ dpkg-query -L libnccl-dev libnccl2 270 | /. 271 | /usr 272 | /usr/include 273 | /usr/include/nccl.h 274 | /usr/lib 275 | /usr/lib/x86_64-linux-gnu 276 | /usr/lib/x86_64-linux-gnu/libnccl_static.a 277 | /usr/share 278 | /usr/share/doc 279 | /usr/share/doc/libnccl-dev 280 | /usr/share/doc/libnccl-dev/changelog.Debian.gz 281 | /usr/share/doc/libnccl-dev/copyright 282 | /usr/lib/x86_64-linux-gnu/libnccl.so 283 | 284 | /. 285 | /usr 286 | /usr/lib 287 | /usr/lib/x86_64-linux-gnu 288 | /usr/lib/x86_64-linux-gnu/libnccl.so.2.3.5 289 | /usr/share 290 | /usr/share/doc 291 | /usr/share/doc/libnccl2 292 | /usr/share/doc/libnccl2/changelog.Debian.gz 293 | /usr/share/doc/libnccl2/copyright 294 | /usr/lib/x86_64-linux-gnu/libnccl.so.2 295 | ``` 296 | We decide to create symbolic links from within the cuda installation to where they are: 297 | ``` 298 | $ cd /usr/local/cuda/ 299 | 300 | $ sudo mkdir lib 301 | 302 | $ cd lib 303 | 304 | $ sudo ln -s /usr/lib/x86_64-linux-gnu/libnccl.so.2 libnccl.so.2 305 | 306 | $ cd ../include 307 | 308 | $ sudo ln -s /usr/include/nccl.h nccl.h 309 | ``` 310 | ## Building Tensorflow 311 | 312 | ### Clone Tensorflow 313 | 314 | ``` 315 | $ git clone https://github.com/tensorflow/tensorflow 316 | $ cd tensorflow 317 | $ git checkout v2.2.0-rc2 318 | ``` 319 | 320 | ### Configuration 321 | ``` 322 | $ pip install -U pip six numpy wheel setuptools mock 'future>=0.17.1' 323 | $ pip install -U keras_applications --no-deps 324 | $ pip install -U keras_preprocessing --no-deps 325 | ``` 326 | 327 | now the main part, 328 | 329 | ``` 330 | $ ./configure 331 | You have bazel 2.0.0 installed. 332 | Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3.7 333 | 334 | Found possible Python library paths: 335 | /usr/local/lib/python3.7/dist-packages 336 | /usr/lib/python3.7/dist-packages 337 | Please input the desired Python library path to use. Default is [/usr/lib/python3.7/dist-packages] 338 | 339 | Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: y 340 | jemalloc as malloc support will be enabled for TensorFlow. 341 | 342 | Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n 343 | Google Cloud Platform support will be enabled for TensorFlow. 344 | 345 | Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n 346 | Hadoop File System support will be enabled for TensorFlow. 347 | 348 | Do you wish to build TensorFlow with Amazon AWS Platform support? [Y/n]: n 349 | Amazon AWS Platform support will be enabled for TensorFlow. 350 | 351 | Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n 352 | Apache Kafka Platform support will be enabled for TensorFlow. 353 | 354 | Do you wish to build TensorFlow with XLA JIT support? [y/N]: n 355 | No XLA JIT support will be enabled for TensorFlow. 356 | 357 | Do you wish to build TensorFlow with GDR support? [y/N]: n 358 | No GDR support will be enabled for TensorFlow. 359 | 360 | Do you wish to build TensorFlow with VERBS support? [y/N]: n 361 | No VERBS support will be enabled for TensorFlow. 362 | 363 | Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n 364 | No OpenCL SYCL support will be enabled for TensorFlow. 365 | 366 | Do you wish to build TensorFlow with CUDA support? [y/N]: Y 367 | CUDA support will be enabled for TensorFlow. 368 | 369 | Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10.1]: 10.1 370 | 371 | Please specify the location where CUDA 10.1 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 372 | 373 | Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.6]: 374 | 375 | Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 376 | 377 | Do you wish to build TensorFlow with TensorRT support? [y/N]: n 378 | No TensorRT support will be enabled for TensorFlow. 379 | 380 | Please specify a list of comma-separated Cuda compute capabilities you want to build with. 381 | You can find the compute capability of your device at: developer.nvidia.com/cuda-gpus 382 | Please note that each additional compute capability significantly increases your 383 | build time and binary size. [Default is: 3.5,7.0] 5.2,6.1,7.0 384 | 385 | Do you want to use clang as CUDA compiler? [y/N]: n 386 | nvcc will be used as CUDA compiler. 387 | 388 | Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 389 | 390 | Do you wish to build TensorFlow with MPI support? [y/N]: n 391 | No MPI support will be enabled for TensorFlow. 392 | 393 | Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 394 | 395 | Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n 396 | Not configuring the WORKSPACE for Android builds. 397 | 398 | Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details. 399 | --config=mkl # Build with MKL support. 400 | Configuration finished 401 | --config=monolithic # Config for mostly static monolithic build. 402 | ``` 403 | 404 | After this you're almost there. 405 | 406 | ### Build 407 | Now using bazel you will build the package, 408 | 409 | ``` 410 | bazel build //tensorflow/tools/pip_package:build_pip_package 411 | ``` 412 | If your PC is freezing and hanging due to this build, use this command to build package rather than that, 413 | 414 | ``` 415 | bazel build --local_ram_resources=2048 --jobs=1 //tensorflow/tools/pip_package:build_pip_package 416 | ``` 417 | This other command will take wayyyy more time but it will get the job done without freezing the PC at any point at all! 418 | 419 | 420 | ### Build the package 421 | ``` 422 | $ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg 423 | ``` 424 | 425 | ### Install the package 426 | ``` 427 | $ pip install /tmp/tensorflow_pkg/tensorflow-version-tags.whl 428 | ``` 429 | 430 | These links helped me the most, 431 | https://github.com/Iolaum/CompileTF/blob/master/README.md 432 | 433 | https://gist.github.com/kmhofmann/e368a2ebba05f807fa1a90b3bf9a1e03 434 | 435 | https://www.tensorflow.org/install/source 436 | 437 | https://github.com/bazelbuild/bazel/releases 438 | 439 | https://github.com/tensorflow/tensorflow/issues/31804 440 | 441 | 442 | --------------------------------------------------------------------------------