├── LICENSE
└── README.md


/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2018 Roman Ring
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Optimized TensorFlow Wheels
 2 | 
 3 | If you see messages like these when you start TensorFlow, then these wheels are for you!
 4 | 
 5 | ```
 6 | The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
 7 | The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
 8 | The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
 9 | The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
10 | ```
11 | 
12 | ## Introduction
13 | 
14 | The builds enable various performance flags targeting modern CPUs, including SIMD support (AVX2, SSE4, FMA).
15 | If you have a CPU released after ~2013, you'll likely benefit from these, e.g. in data pre-processing.
16 | 
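To check whether your CPU actually offers the instruction sets mentioned above, you can inspect `/proc/cpuinfo` on Linux. The snippet below is a minimal sketch and not part of the wheels: the helper name `supported_flags` and the `WANTED_FLAGS` tuple are made up for illustration, and the flag spellings (`sse4_1`, `sse4_2`, ...) are the ones the Linux kernel uses on x86.

```python
from pathlib import Path

# Instruction sets named in the warnings and the introduction, spelled the way
# the Linux kernel lists them on the "flags" line of /proc/cpuinfo (x86 only).
WANTED_FLAGS = ("avx", "avx2", "fma", "sse4_1", "sse4_2")


def supported_flags(cpuinfo_text, wanted=WANTED_FLAGS):
    """Map each wanted flag to True/False based on the first "flags" line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present = set(value.split())
            return {flag: flag in present for flag in wanted}
    # No "flags" line found (e.g. a non-x86 CPU): report everything as missing.
    return {flag: False for flag in wanted}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    for flag, ok in supported_flags(text).items():
        print(f"{flag:7s} {'yes' if ok else 'no'}")
```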
17 | The builds also enable [XLA](https://www.tensorflow.org/xla/) (Accelerated Linear Algebra), a domain-specific just-in-time compiler.
18 | 
19 | Additional CUDA compute capabilities (5.0, 6.1, 7.0 and, in the 2.1.0 wheels, 7.5) are enabled, meaning the wheels should work well on a wide range of GPUs, from the `GTX 7xx` to the `RTX 20xx` families.
20 | 
21 | ## Available Wheels
22 | 
23 | |TensorFlow|Python|CUDA|cuDNN|TensorRT|NCCL|Compute Capability|OS|Link|
24 | |---:|---:|---:|---:|---:|---:|---:|:---:|:---:|
25 | |2.1.0|3.8|10.2|7.6|7.0|2.5|5.0,6.1,7.0,7.5|Linux|[tensorflow-2.1.0-cp38-cp38-linux_x86_64.whl](https://github.com/inoryy/tensorflow-optimized-wheels/releases/download/v2.1.0/tensorflow-2.1.0-cp38-cp38-linux_x86_64.whl)|
26 | |2.1.0|3.7|10.2|7.6|7.0|2.5|5.0,6.1,7.0,7.5|Linux|[tensorflow-2.1.0-cp37-cp37m-linux_x86_64.whl](https://github.com/inoryy/tensorflow-optimized-wheels/releases/download/v2.1.0/tensorflow-2.1.0-cp37-cp37m-linux_x86_64.whl)|
27 | |2.0.0|3.8|10.2|7.6|N/A|2.5|5.0,6.1,7.0|Linux|[tensorflow-2.0.0-cp38-cp38-linux_x86_64.whl](https://github.com/inoryy/tensorflow-optimized-wheels/releases/download/v2.0.0-py3.8/tensorflow-2.0.0-cp38-cp38-linux_x86_64.whl)|
28 | |2.0.0|3.7|10.1|7.5|N/A|2.4|5.0,6.1,7.0|Linux|[tensorflow-2.0.0-cp37-cp37m-linux_x86_64.whl](https://github.com/inoryy/tensorflow-optimized-wheels/releases/download/v2.0.0/tensorflow-2.0.0-cp37-cp37m-linux_x86_64.whl)|
29 | 
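If you script your environment setup, the table above can be turned into a small lookup that returns the right download URL for the running interpreter. This is only a sketch: the `WHEELS` mapping and the `wheel_url` helper are hypothetical names, and the release tags and file names are copied from the table as it stands, so they would need updating whenever new wheels are published.

```python
import sys

BASE = "https://github.com/inoryy/tensorflow-optimized-wheels/releases/download"

# (TensorFlow version, Python version) -> (release tag, wheel file name),
# copied from the "Available Wheels" table.
WHEELS = {
    ("2.1.0", "3.8"): ("v2.1.0", "tensorflow-2.1.0-cp38-cp38-linux_x86_64.whl"),
    ("2.1.0", "3.7"): ("v2.1.0", "tensorflow-2.1.0-cp37-cp37m-linux_x86_64.whl"),
    ("2.0.0", "3.8"): ("v2.0.0-py3.8", "tensorflow-2.0.0-cp38-cp38-linux_x86_64.whl"),
    ("2.0.0", "3.7"): ("v2.0.0", "tensorflow-2.0.0-cp37-cp37m-linux_x86_64.whl"),
}


def wheel_url(tf_version, python_version=None):
    """Return the download URL for the given TensorFlow / Python combination.

    When python_version is omitted, the running interpreter's version is used.
    """
    if python_version is None:
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    try:
        tag, filename = WHEELS[(tf_version, python_version)]
    except KeyError:
        raise LookupError(
            f"no wheel listed for TensorFlow {tf_version} on Python {python_version}"
        ) from None
    return f"{BASE}/{tag}/{filename}"


# Explicit versions, so the example does not depend on the running interpreter.
print(wheel_url("2.1.0", "3.8"))
```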
30 | ## Installation
31 | 
32 | Assuming you have all the requirements, you can install the wheel that matches your Python version directly via pip:
33 | 
34 | ```
35 | pip install https://github.com/inoryy/tensorflow-optimized-wheels/releases/download/v2.1.0/tensorflow-2.1.0-cp38-cp38-linux_x86_64.whl
36 | ```
37 | Then verify the installation (note the absence of the warning messages):
38 | 
39 | ```
40 | $ python
41 | Python 3.8.0 | packaged by conda-forge | (default, Nov 22 2019, 19:11:38)
42 | [GCC 7.3.0] :: Anaconda, Inc. on linux
43 | >>> import tensorflow as tf
44 | I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
45 | >>> tf.__version__
46 | '2.1.0'
47 | >>> tf.executing_eagerly()
48 | True
49 | >>> tf.constant([123]) + tf.constant([321])
50 | I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
51 | ...
52 | I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
53 | I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (...) -> physical GPU (...)
54 | <tf.Tensor: shape=(1,), dtype=int32, numpy=array([444], dtype=int32)>
55 | ```
56 | 
57 | ## Benchmark
58 | 
59 | The wheels are benchmarked by training an MNIST model from [TF Models](https://github.com/tensorflow/models) on a CPU. Results for TF 2.1 are as follows:
60 | 
61 | | Build / Time Per Epoch |Mean|Min|Max|
62 | |---:|---:|---:|---:|
63 | | Official  | 16.7s | 16s | 19s |
64 | | Optimized | 14.3s | 12s | 17s |
65 | 
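To put the mean epoch times in relative terms, the arithmetic can be sketched as follows. The two values are copied from the table above; the variable names are illustrative only, and the result says nothing about other models or hardware.

```python
# Mean time per epoch in seconds, copied from the benchmark table.
official_mean = 16.7
optimized_mean = 14.3

# Speedup factor: how many times faster the optimized build finishes an epoch.
speedup = official_mean / optimized_mean

# Fraction of the official build's epoch time that the optimized build saves.
time_saved = 1 - optimized_mean / official_mean

print(f"speedup: {speedup:.2f}x")                 # speedup: 1.17x
print(f"time saved per epoch: {time_saved:.1%}")  # time saved per epoch: 14.4%
```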
66 | ## Requests
67 | 
68 | If you need a different TensorFlow / CUDA / cuDNN / Python combination, feel free to open a GitHub issue.
69 | 


--------------------------------------------------------------------------------