├── LICENSE
└── README.md

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2017 Andrew Adare

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
# Installing PyTorch on the NVIDIA Jetson TX1/TX2

[PyTorch](http://pytorch.org/) is a new deep learning framework that runs very well on the Jetson TX1 and TX2 boards. It is relatively simple and quick to install. Unlike TensorFlow, it requires no external swap partition to build on the TX1.

Although the TX2 has an ample 32 GB of eMMC, the TX1 has only half that, and it is easy to run out of space due to cruft from JetPack, Ubuntu packages, and installation artifacts. The cleanup section below lists ways to slim things down, and the steps here lean in the direction of minimalism.

The PyTorch developers recommend the Anaconda distribution. I was unable to find a recent Anaconda setup for ARM64, so I used the global Python libraries.

**Tip:** On the TX2, running `~/jetson_clocks.sh` throttles up the CPUs and enables two more cores. This reduces the PyTorch compilation time from 45 to 37 minutes. I didn't test on the TX1, but would expect a less dramatic speedup.

To avoid issues with system-wide installation as superuser, I appended `--user` to all `pip3 install` commands below. This puts packages in `$HOME/.local/lib/python3.5/site-packages`, which I added to my `PYTHONPATH`.

## Scipy and LA libs
- `sudo apt install libopenblas-dev libatlas-dev liblapack-dev`
- `sudo apt install liblapacke-dev checkinstall` # for OpenCV
- `pip3 install numpy scipy` # ~20-30 min

## Build tool prerequisites
- `pip3 install pyyaml`
- `pip3 install scikit-build`
- `sudo apt install ninja-build`

## CMake
Check `cmake --version`. It looks like CMake >= 3.6 is required for the Python bindings used in PyTorch. I followed http://askubuntu.com/a/865294 (and used `--no-check-certificate`) to get 3.7.2. The cmake executable was installed to `/usr/local/bin`. Make sure `cmake --version` reports the new version after installation.
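For reference, here is a rough sketch of the from-source CMake build along the lines of that askubuntu answer. The tarball URL, the exact version, and the use of `~/temp` as a scratch directory are assumptions on my part, so adjust them as needed:
```
# Assumed example: build CMake 3.7.2 from a release tarball in a scratch directory.
mkdir -p ~/temp && cd ~/temp
wget --no-check-certificate https://cmake.org/files/v3.7/cmake-3.7.2.tar.gz
tar xzf cmake-3.7.2.tar.gz
cd cmake-3.7.2
./bootstrap        # configure the build
make -j4           # adjust -j to the number of available cores
sudo make install  # installs to /usr/local by default
cmake --version    # should now report 3.7.2
```
The `rm -r ~/temp` line in the cleanup section at the end removes this scratch directory.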

## CFFI
- `sudo apt install python3-dev`
- `sudo apt install libffi-dev`
- `pip3 install cffi`

## OpenCV
I followed a subset of these excellent [instructions](http://www.pyimagesearch.com/2016/10/24/ubuntu-16-04-how-to-install-opencv/) for Python 3 from the pyimagesearch blog. I skipped CUDA and OpenCL integration, and went system-wide with the Python 3 bindings (no virtualenvs). Under the OpenCV source directory, I created `build/` and ran this:
```
cmake \
    -D ENABLE_PRECOMPILED_HEADERS=OFF \
    -D WITH_OPENCL=OFF \
    -D WITH_CUDA=OFF \
    -D WITH_CUFFT=OFF \
    -D WITH_CUBLAS=OFF \
    -D CMAKE_BUILD_TYPE=RELEASE \
    -D CMAKE_INSTALL_PREFIX=/usr/local \
    -D INSTALL_PYTHON_EXAMPLES=OFF \
    -D INSTALL_C_EXAMPLES=OFF \
    -D PYTHON_EXECUTABLE=/usr/bin/python3.5 \
    -D BUILD_EXAMPLES=OFF ..
```
The first option significantly reduces the storage requirements during the build, at the expense of a slightly longer build time. Before discovering this option, I encountered errors like "failed to copy PCH file" on the TX1.

With this setup, OpenCV builds quickly: 19 minutes on the TX2.
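The remaining OpenCV steps follow the pyimagesearch guide. As a quick sketch (the `-j` value and the final import check are my additions, not from that writeup), the build and install from the same `build/` directory look like this:
```
make -j4           # compile; lower -j if memory gets tight on the TX1
sudo make install  # install headers and libraries under /usr/local
sudo ldconfig      # refresh the shared library cache
# Sanity check that the Python 3 bindings are importable:
python3 -c "import cv2; print(cv2.__version__)"
```
If the import fails, check where the `cv2` module landed (e.g. under /usr/local/lib/python3.5/site-packages) and make sure that directory is on your `PYTHONPATH`.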

## CUDA and cuDNN
The Jetson does not ship ready to run deep learning models on the GPU. Unfortunately, CUDA and cuDNN cannot simply be downloaded directly as for x86 systems; NVIDIA directs us to the JetPack SDK, which currently runs only on Ubuntu 14.04. In a VirtualBox instance, I [downloaded](https://developer.nvidia.com/embedded/downloads) the JetPack .run file. The default full installation is massive and takes hours to download, build, and flash. Moreover, it copies many GB of files to the local VM, which is not what I want! I realized I could avoid the overkill and just select the CUDA and cuDNN packages, then cancel the installation once the .deb files are downloaded. Only the following four files are needed to install CUDA and cuDNN:
1. cuda-repo-l4t-8-0-local_8.0.64-1_arm64.deb
2. libcudnn5_5.1.10-1+cuda8.0_arm64.deb
3. libcudnn5-dev_5.1.10-1+cuda8.0_arm64.deb
4. cuda-l4t.sh

I created `~/cuda-l4t` on the Jetson and copied these four files there.
- Run `sudo ./cuda-l4t.sh cuda-repo...arm64.deb 8.0 8-0` to install the CUDA libs. Note that this script prepends CUDA directories to PATH and LD_LIBRARY_PATH, then appends the hard-coded redefinitions to ~/.bashrc. If it matters to you, now is a good time to tidy this up.
- To get cuDNN, run `sudo dpkg -i` on files 2 and 3 above. Then do
```
sudo apt install -f
```
- Check: `nvcc -V`
- `ldconfig -p | grep cu` and `ldconfig -p | grep dnn` can be used to show the library locations. I also see /usr/include/cudnn.h.

### cuDNN for PyTorch
Since tools/setup_helpers/cuda.py assumes /usr/local/cuda, CUDA_HOME need not be set. But the TX1/TX2 install locations are missing from the search list in tools/setup_helpers/cudnn.py. In that script, there is a check for `os.getenv("CUDNN_LIB_DIR")` and another for the include dir. So I added the following to ~/.profile:
```
export CUDNN_LIB_DIR=/usr/lib/aarch64-linux-gnu
export CUDNN_INCLUDE_DIR=/usr/include
```
*Note:* Echoing the variables at the prompt is not sufficient to test whether they are visible to the PyTorch build setup scripts. If CUDNN_LIB_DIR is properly set in your bash environment, but you also see this:
```
[~]$ sudo python3 -c 'import os; print(os.getenv("CUDNN_LIB_DIR"))'
None
```
then cuDNN will fail to be included. Here are three options:
- Skip the problem and install PyTorch locally with the `--user` option (see the next section).
- Use `sudo -E` to make user environment variables available to root.
- Export from ~/.profile, then log out and back in.

## PyTorch source
```
git clone https://github.com/pytorch/pytorch.git
cd pytorch
git checkout -b v0.1.10 v0.1.10
```

Now build PyTorch:
```
time python3 setup.py install --user
```
For a system-wide install:
```
sudo time python3 setup.py install  # or sudo -E time python3 setup.py install
```
Test it out in the python3 REPL (outside the pytorch directory):
```
import torch
torch.backends.cudnn.is_acceptable(torch.cuda.FloatTensor(1))
```
If this prints `True`, congratulations!

## TorchVision
`pip3 install --no-deps torchvision`

## Cleanup
The [postFlashTX1](https://github.com/jetsonhacks/postFlashTX1.git) repo contains some useful cleanup scripts. In addition:
```
sudo apt clean
sudo apt autoremove --purge
sudo rm /usr/src/*.tbz2                 # I had 6.9 GB of archives here
sudo rm /var/cuda-repo-8.0-local/*.deb
rm -r ~/temp                            # from my CMake 3.7 install
```
The OpenCV sources can also be removed if necessary.
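Since the cleanup removes the local CUDA repo packages and other installer leftovers, it is worth re-running a quick GPU smoke test afterwards to confirm nothing essential was deleted. This block is my own addition (not part of the walkthrough above) and simply extends the REPL check with a small matrix multiply on the GPU:
```
python3 - <<'EOF'
import torch

# Allocate a small tensor on the GPU and multiply it by itself.
x = torch.randn(64, 64).cuda()
y = torch.mm(x, x)
print("CUDA result size:", y.size())

# Same cuDNN check as in the REPL test above.
print("cuDNN acceptable:", torch.backends.cudnn.is_acceptable(x))
EOF
```
If both lines print without errors and the cuDNN check reports `True`, the installation survived the cleanup.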