├── .gitignore
├── README.md
├── build_tutorial_1.10.md
├── build_tutorial_2.0.0.md
├── patch
│   └── _nccl_ops.so
├── source_patches
│   ├── .DS_Store
│   ├── v1.15.0_macos.patch
│   ├── v2.0.0_macos.patch
│   ├── v2.1.0_macos.patch
│   └── v2.2.0_macos.patch
└── usr_local_lib
    ├── libgcc_s.1.dylib
    └── libgomp.1.dylib

/.gitignore:
--------------------------------------------------------------------------------
.DS_Store

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Tensorflow OSX Build

Unfortunately, the Tensorflow team stopped releasing binary packages for Mac OS with CUDA support after Tensorflow 1.2. This project provides off-the-shelf binary packages. ``Both Python 2.7 and 3.7 are supported now!``

很不幸,Tensorflow团队自从1.2版本开始停止了发布 Mac OS CUDA版。本项目提供 Mac OS 上编译好、可直接安装的Tensorflow CUDA版本。``本项目同时支持Python 2.7 和 3.7 了!``

# Releases

You can find releases on the [releases page](https://github.com/TomHeaven/tensorflow-osx-build/releases).

你可以在[Releases页面](https://github.com/TomHeaven/tensorflow-osx-build/releases)找到以前发布的版本。

# My Fork of Tensorflow

Besides making patches for release versions of TF, I maintain a fork of the TF sources at [https://github.com/TomHeaven/tensorflow](https://github.com/TomHeaven/tensorflow) and keep fixing build issues of TF on macOS with CUDA there. The corresponding PR is [https://github.com/tensorflow/tensorflow/pull/39297](https://github.com/tensorflow/tensorflow/pull/39297). You can use the PR to make your own builds.
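
If you want to build from that PR directly, GitHub exposes every pull request as a fetchable ref. A minimal sketch (the local branch name `macos-cuda-fixes` is arbitrary):

```shell
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
# Fetch PR #39297 into a local branch and switch to it
git fetch origin pull/39297/head:macos-cuda-fixes
git checkout macos-cuda-fixes
```
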
# Installation for Python 2.7

First, ensure your CUDA driver and cuDNN are installed properly, and copy the dependencies in folder `usr_local_lib` to the path `/usr/local/lib`:

首先,确保CUDA驱动和cudnn正确安装,并且将文件夹`usr_local_lib`中的依赖项复制到路径`/usr/local/lib`。

```
sudo mkdir -p /usr/local/lib
sudo cp usr_local_lib/* /usr/local/lib/
```

Second, uninstall any previous Tensorflow installation:

其次,卸载之前版本的Tensorflow:

```
pip uninstall tensorflow
pip uninstall tensorflow-gpu  # for earlier versions with official support
```

At last, download a binary package from the [Releases](https://github.com/TomHeaven/tensorflow-osx-build/releases) page and install it:

最后,从[Releases页面](https://github.com/TomHeaven/tensorflow-osx-build/releases)下载并安装:

```
pip install tensorflow*.whl
```
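
After installing, you can optionally check that the package imports and sees your GPU. A minimal sanity check, assuming the CUDA driver and cuDNN from the first step are in place (`tf.test.is_gpu_available()` may print a deprecation warning on TF 2.x):

```shell
python -c "import tensorflow as tf; print(tf.__version__)"
python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"  # should print True
```
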
# Installation for Python 3.7

Install Python 3.7 from Homebrew first, and then simply follow the guide for Python 2.7, replacing the `pip` command with `pip3` and `python` with `python3`.

首先从Homebrew安装Python 3.7,然后按照Python 2.7的安装步骤执行,注意将`pip`替换为`pip3`,将`python`替换为`python3`。

Enjoy!

开始使用新版Tensorflow吧!

# Build Tutorial
If you want to build your own wheel packages, refer to the following tutorials:

+ [v1.10](https://github.com/TomHeaven/tensorflow-osx-build/blob/master/build_tutorial_1.10.md)
+ [v2.0.0](https://github.com/TomHeaven/tensorflow-osx-build/blob/master/build_tutorial_2.0.0.md) This tutorial also works for v1.15.0; just use source patch v1.15.0 instead of v2.0.0.

# Related Links

If you need Pytorch builds for osx, go to this page: [https://github.com/TomHeaven/pytorch-osx-build](https://github.com/TomHeaven/pytorch-osx-build)

If you need MxNet builds for osx, go to this page: [https://github.com/TomHeaven/mxnet_osx_build](https://github.com/TomHeaven/mxnet_osx_build)

如果你需要Pytorch包,请看这个页面:[https://github.com/TomHeaven/pytorch-osx-build](https://github.com/TomHeaven/pytorch-osx-build)

如果你需要MxNet包,请看这个页面:[https://github.com/TomHeaven/mxnet_osx_build](https://github.com/TomHeaven/mxnet_osx_build)
--------------------------------------------------------------------------------
/build_tutorial_1.10.md:
--------------------------------------------------------------------------------
# Tensorflow OSX Build Tutorial (1.10)

By `Tom Heaven` @ 2018.08.25

Project page: [https://github.com/TomHeaven/tensorflow-osx-build](https://github.com/TomHeaven/tensorflow-osx-build)

---

Note that the patches for every release of Tensorflow are a bit different! So these instructions work for v1.10.0 only.

First make sure XCode 8, the CUDA 9.0 SDK and CUDNN 7 are properly installed; an Internet connection is also required. You need to install Python3 using `Homebrew` if you want to compile for Python3.

If running on Mac OS High Sierra (10.13), you need to keep both XCode 8 (the CUDA 9 SDK only supports this version) and XCode 9 (Homebrew requires this version) and switch between them by renaming.
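
For example, switching back to XCode 8 could look like the following sketch (the `Xcode8.app`/`Xcode9.app` names are assumptions; use whatever names you gave your copies):

```shell
# Make XCode 8 the active /Applications/Xcode.app, keeping XCode 9 around
sudo mv /Applications/Xcode.app /Applications/Xcode9.app
sudo mv /Applications/Xcode8.app /Applications/Xcode.app
sudo xcode-select -s /Applications/Xcode.app/Contents/Developer
```
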
I've found that building against CUDA 9.2 does not work properly: it keeps reporting `CUDA OUT OF MEMORY` errors, and my test program is easily killed by the operating system. If you are interested, feel free to try.

## For Python 2.7 and 3.6

The following instructions will help you build your own wheel files for Python 2.7 and 3.6 using CUDA 9.0.

+ Download the Tensorflow sources and switch to `v1.10.0`:

```shell
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
git checkout v1.10.0
```
+ Patch the sources using the following shell commands:

```shell
# new patches
sed "s/__align__(sizeof(T))/__align__(sizeof(T) > 16 ? sizeof(T) : 16)/" "tensorflow/core/kernels/split_lib_gpu.cu.cc" > tmp.h.cc
cp -f tmp.h.cc "tensorflow/core/kernels/split_lib_gpu.cu.cc"
rm -f tmp.h.cc
## 2
sed "s/__align__(sizeof(T))/__align__(sizeof(T) > 16 ? sizeof(T) : 16)/" "tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc" > tmp.h.cc
cp -f tmp.h.cc "tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc"
rm -f tmp.h.cc
## 3
sed "s/__align__(sizeof(T))/__align__(sizeof(T) > 16 ? sizeof(T) : 16)/" "tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc" > tmp.h.cc
cp -f tmp.h.cc "tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc"
rm -f tmp.h.cc

## 4 patch for nccl configuration
sed "s/libnccl.so.%s/libnccl.%s.dylib/" "third_party/nccl/nccl_configure.bzl" > tmp.h.cc
cp -f tmp.h.cc "third_party/nccl/nccl_configure.bzl"
rm -f tmp.h.cc
```
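
You can quickly confirm the substitutions took effect before configuring, e.g.:

```shell
# Each patched kernel should now contain the widened alignment expression
grep -n "sizeof(T) > 16" tensorflow/core/kernels/split_lib_gpu.cu.cc
# And the nccl configuration should now reference the dylib name
grep -n "libnccl.%s.dylib" third_party/nccl/nccl_configure.bzl
```
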
+ Run `./configure`, and note that you should:
  - use `/usr/local/bin/python3` if you want a build for python3;
  - select the correct site-packages path;
  - set correct CUDA compute capability values;
  - use `/usr/bin/gcc` as the default compiler.

Here is an example for Python 2.7:

```
MacBook-Pro:tensorflow-master tomheaven$ ./configure
You have bazel 0.5.1-homebrew installed.
Please specify the location of python. [Default is /usr/bin/python]:


Found possible Python library paths:
/Library/Python/2.7/site-packages
/Users/tomheaven/Documents/caffe-master/python
/Users/tomheaven/Documents/Current/demosaicnet-master/demosaicnet
/Users/tomheaven/Documents/facenet/build/lib/src
Please input the desired Python library path to use. Default is [/Library/Python/2.7/site-packages]

Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]:
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [y/N]:
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]:
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]:
No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]:
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL support? [y/N]:
No OpenCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 9.0. [Leave empty to default to CUDA 9.0]:


Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:


Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]:


Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:


Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,5.2] 3.0,3.5,5.2,6.1


Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/local/bin/gcc]: /usr/bin/gcc

Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.
```
+ Start building:

```
bazel build --config=opt --config=nonccl //tensorflow/tools/pip_package:build_pip_package
```
Note that `NCCL` currently only works on Linux, not on Mac OS X or Windows. We need to disable it by passing `--config=nonccl`, or you will run into NCCL-related errors.

+ Generate a wheel package:
```
bazel-bin/tensorflow/tools/pip_package/build_pip_package ./
```
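
The command above drops a `.whl` file into the current directory. A minimal sketch of installing and smoke-testing it (the exact wheel filename depends on your Python version and platform tags; adjust the glob accordingly):

```shell
pip install tensorflow-1.10.0*.whl
python -c "import tensorflow as tf; print(tf.Session().run(tf.constant('build ok')))"
```
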
## For Python 3.7
Python 3.7 is newly released. There are a few major changes affecting Tensorflow v1.10.0:

+ The PyUnicode_AsXXXAndSize() functions return `const char*` rather than `char*`. Refer to this post: [https://github.com/protocolbuffers/protobuf/pull/4862/files](https://github.com/protocolbuffers/protobuf/pull/4862/files) for the corresponding patches to protobuf. Similar patches are required at the following locations:

  - `tensorflow/python/lib/core/ndarray_tensor.cc:157:13`
  - `tensorflow/python/lib/core/py_func.cc:355:16`
  - `tensorflow/python/eager/pywrap_tfe_src.cc:219:11`

+ `async` is a reserved word now. Replace the function parameter `async` with `t_async` (or any other valid parameter name) in `/tensorflow/c/eager/c_api`. There are four places requiring the patch.
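
A quick way to locate all of these spots before editing (a sketch; the grep patterns are derived from the notes above):

```shell
# Places where the new const char* return type must be handled
grep -rn "PyUnicode_As" tensorflow/python/lib/core tensorflow/python/eager
# Places where the parameter name `async` must be renamed
grep -wn "async" tensorflow/c/eager/c_api.h tensorflow/c/eager/c_api.cc
```
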
After applying these additional patches, Tensorflow 1.10.0 should build for Python 3.7.

--------------------------------------------------------------------------------
/build_tutorial_2.0.0.md:
--------------------------------------------------------------------------------
# Tensorflow OSX Build Tutorial (v2.0.0)

By `Tom Heaven` @ 2019.10.01

Project page: [https://github.com/TomHeaven/tensorflow-osx-build](https://github.com/TomHeaven/tensorflow-osx-build)

---

Note that the patches for every release of Tensorflow are different! So these instructions work for v2.0.0 only.

First make sure XCode 9.4.1, the CUDA 10.0 SDK and CUDNN 7.4 are properly installed; an Internet connection is also required. Install Python3 using `Homebrew` if you need to compile for Python3.

## For Python 3.7
The following instructions will help you build your own wheel files for Python 3.7 with CUDA 10.0.

+ Download the Tensorflow sources and switch to `v2.0.0`:

```shell
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
git checkout v2.0.0
```
+ Download the source patches in this repo and use them to patch the sources:

```shell
git apply v2.0.0_macos.patch
```
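
The patch file lives in the `source_patches` folder of this repo, so it has to be fetched into the tensorflow checkout first; one way to do that (the raw URL is an assumption based on the repo layout):

```shell
curl -LO https://raw.githubusercontent.com/TomHeaven/tensorflow-osx-build/master/source_patches/v2.0.0_macos.patch
```
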
+ Run `./configure`, and note that you should:
  - use `/usr/local/bin/python3` if you want a build for python3;
  - select the correct site-packages path;
  - set correct CUDA compute capability values;
  - use `/usr/bin/gcc` as the default compiler.

Here is an example for Python 3.7:

```
iMac18:tensorflow tomheaven$ ./configure
WARNING: Running Bazel server needs to be killed, because the startup options are different.
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.24.1 installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/local/bin/python3


Found possible Python library paths:
/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages
Please input the desired Python library path to use. Default is [/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages]

Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]:
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]:
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Found CUDA 10.0 in:
/usr/local/cuda/lib64
/usr/local/cuda/include
Found cuDNN 7 in:
/usr/local/cuda/lib64
/usr/local/cuda/include


Please specify a list of comma-separated CUDA compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 3.5,7.0]: 3.0,3.5,5.0,5.2,6.1,7.0


WARNING: XLA does not support CUDA compute capabilities lower than 3.5. Disable XLA when running on older GPUs.
Do you want to use clang as CUDA compiler? [y/N]:
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/local/bin/gcc]: /usr/bin/gcc


Do you wish to build TensorFlow with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:


Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:
Not configuring the WORKSPACE for Android builds.

Do you wish to build TensorFlow with iOS support? [y/N]:
No iOS support will be enabled for TensorFlow.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
	--config=mkl            # Build with MKL support.
	--config=monolithic     # Config for mostly static monolithic build.
	--config=gdr            # Build with GDR support.
	--config=verbs          # Build with libverbs support.
	--config=ngraph         # Build with Intel nGraph support.
	--config=numa           # Build with NUMA support.
	--config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
	--config=v2             # Build TensorFlow 2.x instead of 1.x.
Preconfigured Bazel build configs to DISABLE default on features:
	--config=noaws          # Disable AWS S3 filesystem support.
	--config=nogcp          # Disable GCP support.
	--config=nohdfs         # Disable HDFS support.
	--config=noignite       # Disable Apache Ignite support.
	--config=nokafka        # Disable Apache Kafka support.
	--config=nonccl         # Disable NVIDIA NCCL support.
Configuration finished
```
+ Start building:

```
bazel build --config=opt --config=nonccl //tensorflow/tools/pip_package:build_pip_package
```
Note that `NCCL` currently only works on Linux, not on Mac OS X or Windows. We need to disable it by passing `--config=nonccl`, or you will run into NCCL-related errors.

+ Fix a source error in the external sources. The first build will end with an error:

```
external/com_google_absl/absl/container/internal/compressed_tuple.h:170:53: error: use 'template' keyword to treat 'Storage' as a dependent template name
  return (std::move(*this).internal_compressed_tuple::Storage<CompressedTuple, I>::get());
```

Solution: edit `bazel-tensorflow/external/com_google_absl/absl/container/internal/compressed_tuple.h:168-178` and comment out the two problematic functions:

```cpp
/*
template <int I>
ElemT<I>&& get() && {
  return std::move(*this).internal_compressed_tuple::template Storage<CompressedTuple, I>::get();
}
template <int I>
constexpr const ElemT<I>&& get() const&& {
  return absl::move(*this).internal_compressed_tuple::template Storage<CompressedTuple, I>::get();
}
*/
```
Then build again using the same command as before. This time, the build should succeed.

+ Generate a wheel package:

```
bazel-bin/tensorflow/tools/pip_package/build_pip_package ./
```

## For Python 2.7

For Python 2.7, we need an additional patch to an external source file. Edit the file `bazel-tensorflow/external/cython/Cython/Compiler/Nodes.py` and add the following lines at the top:

```python
import sys
if sys.version < '3':
    reload(sys)
    sys.setdefaultencoding('utf-8')
```
Note that this external source patch can be applied only after the first build failure, because the external sources are downloaded at the beginning of the build process.
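
So the overall Python 2.7 flow looks like this (a sketch; the first invocation is expected to fail once the external sources have been downloaded):

```shell
bazel build --config=opt --config=nonccl //tensorflow/tools/pip_package:build_pip_package || true
# ...now add the four lines above to bazel-tensorflow/external/cython/Cython/Compiler/Nodes.py...
bazel build --config=opt --config=nonccl //tensorflow/tools/pip_package:build_pip_package
```
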
"%s/" \""", 158 | )""" % (name, "\n".join(outs), src_dir, out_dir) 159 | 160 | def _read_dir(repository_ctx, src_dir): 161 | -------------------------------------------------------------------------------- /source_patches/v2.0.0_macos.patch: -------------------------------------------------------------------------------- 1 | diff --git a/tensorflow/core/kernels/conv_grad_filter_ops.cc b/tensorflow/core/kernels/conv_grad_filter_ops.cc 2 | index 9d5f316..4f95a38 100644 3 | --- a/tensorflow/core/kernels/conv_grad_filter_ops.cc 4 | +++ b/tensorflow/core/kernels/conv_grad_filter_ops.cc 5 | @@ -831,10 +831,10 @@ void LaunchConv2DBackpropFilterOp::operator()( 6 | << " data_format=" << ToString(data_format) 7 | << " compute_data_format=" << ToString(compute_data_format); 8 | 9 | - constexpr auto kComputeInNHWC = 10 | + auto kComputeInNHWC = 11 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 12 | se::dnn::FilterLayout::kOutputYXInput); 13 | - constexpr auto kComputeInNCHW = 14 | + auto kComputeInNCHW = 15 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 16 | se::dnn::FilterLayout::kOutputInputYX); 17 | 18 | diff --git a/tensorflow/core/kernels/conv_grad_input_ops.cc b/tensorflow/core/kernels/conv_grad_input_ops.cc 19 | index 8974aa1..71daf6c 100644 20 | --- a/tensorflow/core/kernels/conv_grad_input_ops.cc 21 | +++ b/tensorflow/core/kernels/conv_grad_input_ops.cc 22 | @@ -947,10 +947,10 @@ void LaunchConv2DBackpropInputOp::operator()( 23 | << " data_format=" << ToString(data_format) 24 | << " compute_data_format=" << ToString(compute_data_format); 25 | 26 | - constexpr auto kComputeInNHWC = 27 | + auto kComputeInNHWC = 28 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 29 | se::dnn::FilterLayout::kOutputYXInput); 30 | - constexpr auto kComputeInNCHW = 31 | + auto kComputeInNCHW = 32 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 33 | se::dnn::FilterLayout::kOutputInputYX); 34 | 35 | diff --git a/tensorflow/core/kernels/conv_ops.cc b/tensorflow/core/kernels/conv_ops.cc 36 | index 5ad2489..e26c31f 100644 37 | --- a/tensorflow/core/kernels/conv_ops.cc 38 | +++ b/tensorflow/core/kernels/conv_ops.cc 39 | @@ -864,10 +864,10 @@ void LaunchConv2DOp::operator()( 40 | << "Negative row or col paddings: (" << common_padding_rows << ", " 41 | << common_padding_cols << ")"; 42 | 43 | - constexpr auto kComputeInNHWC = 44 | + auto kComputeInNHWC = 45 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 46 | se::dnn::FilterLayout::kOutputYXInput); 47 | - constexpr auto kComputeInNCHW = 48 | + auto kComputeInNCHW = 49 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 50 | se::dnn::FilterLayout::kOutputInputYX); 51 | 52 | diff --git a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 53 | index 88a3f2d..2d69598 100644 54 | --- a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 55 | +++ b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 56 | @@ -40,7 +40,7 @@ static const char kNotInvertibleScalarMsg[] = 57 | "The matrix is not invertible: it is a scalar with value zero."; 58 | 59 | template 60 | -__global__ void SolveForSizeOneOrTwoKernel(const int m, const Scalar* diags, 61 | +__device__ void SolveForSizeOneOrTwoKernel(const int m, const Scalar* diags, 62 | const Scalar* rhs, const int num_rhs, 63 | Scalar* x, bool* not_invertible) { 64 | if (m == 1) { 65 | diff --git a/tensorflow/core/util/gpu_device_functions.h b/tensorflow/core/util/gpu_device_functions.h 66 | index 9040e78..ff41a09 100644 67 | --- 
--------------------------------------------------------------------------------
/source_patches/v2.0.0_macos.patch:
--------------------------------------------------------------------------------
diff --git a/tensorflow/core/kernels/conv_grad_filter_ops.cc b/tensorflow/core/kernels/conv_grad_filter_ops.cc
index 9d5f316..4f95a38 100644
--- a/tensorflow/core/kernels/conv_grad_filter_ops.cc
+++ b/tensorflow/core/kernels/conv_grad_filter_ops.cc
@@ -831,10 +831,10 @@ void LaunchConv2DBackpropFilterOp<GPUDevice, T>::operator()(
           << " data_format=" << ToString(data_format)
           << " compute_data_format=" << ToString(compute_data_format);
 
-  constexpr auto kComputeInNHWC =
+  auto kComputeInNHWC =
       std::make_tuple(se::dnn::DataLayout::kBatchYXDepth,
                       se::dnn::FilterLayout::kOutputYXInput);
-  constexpr auto kComputeInNCHW =
+  auto kComputeInNCHW =
       std::make_tuple(se::dnn::DataLayout::kBatchDepthYX,
                       se::dnn::FilterLayout::kOutputInputYX);
 
diff --git a/tensorflow/core/kernels/conv_grad_input_ops.cc b/tensorflow/core/kernels/conv_grad_input_ops.cc
index 8974aa1..71daf6c 100644
--- a/tensorflow/core/kernels/conv_grad_input_ops.cc
+++ b/tensorflow/core/kernels/conv_grad_input_ops.cc
@@ -947,10 +947,10 @@ void LaunchConv2DBackpropInputOp<GPUDevice, T>::operator()(
           << " data_format=" << ToString(data_format)
           << " compute_data_format=" << ToString(compute_data_format);
 
-  constexpr auto kComputeInNHWC =
+  auto kComputeInNHWC =
       std::make_tuple(se::dnn::DataLayout::kBatchYXDepth,
                       se::dnn::FilterLayout::kOutputYXInput);
-  constexpr auto kComputeInNCHW =
+  auto kComputeInNCHW =
       std::make_tuple(se::dnn::DataLayout::kBatchDepthYX,
                       se::dnn::FilterLayout::kOutputInputYX);
 
diff --git a/tensorflow/core/kernels/conv_ops.cc b/tensorflow/core/kernels/conv_ops.cc
index 5ad2489..e26c31f 100644
--- a/tensorflow/core/kernels/conv_ops.cc
+++ b/tensorflow/core/kernels/conv_ops.cc
@@ -864,10 +864,10 @@ void LaunchConv2DOp<GPUDevice, T>::operator()(
       << "Negative row or col paddings: (" << common_padding_rows << ", "
       << common_padding_cols << ")";
 
-  constexpr auto kComputeInNHWC =
+  auto kComputeInNHWC =
       std::make_tuple(se::dnn::DataLayout::kBatchYXDepth,
                       se::dnn::FilterLayout::kOutputYXInput);
-  constexpr auto kComputeInNCHW =
+  auto kComputeInNCHW =
       std::make_tuple(se::dnn::DataLayout::kBatchDepthYX,
                       se::dnn::FilterLayout::kOutputInputYX);
 
diff --git a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc
index 88a3f2d..2d69598 100644
--- a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc
+++ b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc
@@ -40,7 +40,7 @@ static const char kNotInvertibleScalarMsg[] =
     "The matrix is not invertible: it is a scalar with value zero.";
 
 template <typename Scalar>
-__global__ void SolveForSizeOneOrTwoKernel(const int m, const Scalar* diags,
+__device__ void SolveForSizeOneOrTwoKernel(const int m, const Scalar* diags,
                                            const Scalar* rhs, const int num_rhs,
                                            Scalar* x, bool* not_invertible) {
   if (m == 1) {
diff --git a/tensorflow/core/util/gpu_device_functions.h b/tensorflow/core/util/gpu_device_functions.h
index 9040e78..ff41a09 100644
--- a/tensorflow/core/util/gpu_device_functions.h
+++ b/tensorflow/core/util/gpu_device_functions.h
@@ -140,11 +140,11 @@ __device__ const unsigned kGpuWarpAll = 0xffffffff;
 __device__ inline unsigned GpuLaneId() {
   unsigned int lane_id;
 #if GOOGLE_CUDA
-#if __clang__
-  return __nvvm_read_ptx_sreg_laneid();
-#else  // __clang__
-  asm("mov.u32 %0, %%laneid;" : "=r"(lane_id));
-#endif  // __clang__
+  //#if __clang__
+  //  return __nvvm_read_ptx_sreg_laneid();
+  //#else  // __clang__
+  asm("mov.u32 %0, %%laneid;" : "=r"(lane_id));
+  //#endif  // __clang__
 #elif TENSORFLOW_USE_ROCM
   lane_id = __lane_id();
 #endif
diff --git a/tensorflow/core/util/gpu_kernel_helper.h b/tensorflow/core/util/gpu_kernel_helper.h
index 51fd2a8..2a9d8cb 100644
--- a/tensorflow/core/util/gpu_kernel_helper.h
+++ b/tensorflow/core/util/gpu_kernel_helper.h
@@ -57,7 +57,7 @@ using gpuError_t = hipError_t;
 #if GOOGLE_CUDA
 
 #define GPU_DYNAMIC_SHARED_MEM_DECL(ALIGN, TYPE, NAME) \
-  extern __shared__ __align__(ALIGN) TYPE NAME[]
+  extern __shared__ TYPE NAME[]
 
 #elif TENSORFLOW_USE_ROCM
 
diff --git a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
index a9289e3..db727bb 100644
--- a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
+++ b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
@@ -195,7 +195,7 @@ static string GetBinaryDir(bool strip_exe) {
   _NSGetExecutablePath(nullptr, &buffer_size);
   char unresolved_path[buffer_size];
   _NSGetExecutablePath(unresolved_path, &buffer_size);
-  CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1);
+  //CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1);
 #else
 #if defined(PLATFORM_WINDOWS)
   HMODULE hModule = GetModuleHandle(NULL);
diff --git a/third_party/gpus/cuda_configure.bzl b/third_party/gpus/cuda_configure.bzl
index cf63adc..418a4c6 100644
--- a/third_party/gpus/cuda_configure.bzl
+++ b/third_party/gpus/cuda_configure.bzl
@@ -553,8 +553,9 @@ def find_lib(repository_ctx, paths, check_soname = True):
             continue
        if check_soname and objdump != None and not _is_windows(repository_ctx):
             output = repository_ctx.execute([objdump, "-p", str(path)]).stdout
-            output = [line for line in output.splitlines() if "SONAME" in line]
-            sonames = [line.strip().split(" ")[-1] for line in output]
+            output = [line for line in output.splitlines() if "name @rpath/" in line]
+            sonames = [line.strip().split("/")[-1] for line in output]
+            sonames = [sonames[0].strip().split(" ")[0] for line in output]
             if not any([soname == path.basename for soname in sonames]):
                 mismatches.append(str(path))
                 continue
@@ -603,7 +604,7 @@ def _find_libs(repository_ctx, cuda_config):
         Map of library names to structs of filename and path.
     """
     cpu_value = cuda_config.cpu_value
-    stub_dir = "" if _is_windows(repository_ctx) else "/stubs"
+    stub_dir = "" if _is_windows(repository_ctx) else ""
     return {
         "cuda": _find_cuda_lib(
             "cuda",
@@ -932,7 +933,7 @@ def make_copy_dir_rule(repository_ctx, name, src_dir, out_dir):
     outs = [
         %s
     ],
-    cmd = \"""cp -rLf "%s/." "%s/" \""",
+    cmd = \"""cp -r -f "%s/." "%s/" \""",
)\""" % (name, "\n".join(outs), src_dir, out_dir)
 
 def _read_dir(repository_ctx, src_dir):
"%s/" \""", 143 | )""" % (name, "\n".join(outs), src_dir, out_dir) 144 | 145 | def _read_dir(repository_ctx, src_dir): 146 | -------------------------------------------------------------------------------- /source_patches/v2.1.0_macos.patch: -------------------------------------------------------------------------------- 1 | diff --git a/tensorflow/core/kernels/conv_grad_filter_ops.cc b/tensorflow/core/kernels/conv_grad_filter_ops.cc 2 | index 594dbd0..a533e6f 100644 3 | --- a/tensorflow/core/kernels/conv_grad_filter_ops.cc 4 | +++ b/tensorflow/core/kernels/conv_grad_filter_ops.cc 5 | @@ -839,10 +839,10 @@ void LaunchConv2DBackpropFilterOp::operator()( 6 | << " data_format=" << ToString(data_format) 7 | << " compute_data_format=" << ToString(compute_data_format); 8 | 9 | - constexpr auto kComputeInNHWC = 10 | + auto kComputeInNHWC = 11 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 12 | se::dnn::FilterLayout::kOutputYXInput); 13 | - constexpr auto kComputeInNCHW = 14 | + auto kComputeInNCHW = 15 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 16 | se::dnn::FilterLayout::kOutputInputYX); 17 | 18 | diff --git a/tensorflow/core/kernels/conv_grad_input_ops.cc b/tensorflow/core/kernels/conv_grad_input_ops.cc 19 | index 2f6200e..c9e17c9 100644 20 | --- a/tensorflow/core/kernels/conv_grad_input_ops.cc 21 | +++ b/tensorflow/core/kernels/conv_grad_input_ops.cc 22 | @@ -997,10 +997,10 @@ void LaunchConv2DBackpropInputOp::operator()( 23 | << " data_format=" << ToString(data_format) 24 | << " compute_data_format=" << ToString(compute_data_format); 25 | 26 | - constexpr auto kComputeInNHWC = 27 | + auto kComputeInNHWC = 28 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 29 | se::dnn::FilterLayout::kOutputYXInput); 30 | - constexpr auto kComputeInNCHW = 31 | + auto kComputeInNCHW = 32 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 33 | se::dnn::FilterLayout::kOutputInputYX); 34 | 35 | diff --git a/tensorflow/core/kernels/conv_ops.cc b/tensorflow/core/kernels/conv_ops.cc 36 | index d5ce7de..5a36c53 100644 37 | --- a/tensorflow/core/kernels/conv_ops.cc 38 | +++ b/tensorflow/core/kernels/conv_ops.cc 39 | @@ -859,10 +859,10 @@ void LaunchConv2DOp::operator()( 40 | << "Negative row or col paddings: (" << common_padding_rows << ", " 41 | << common_padding_cols << ")"; 42 | 43 | - constexpr auto kComputeInNHWC = 44 | + auto kComputeInNHWC = 45 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 46 | se::dnn::FilterLayout::kOutputYXInput); 47 | - constexpr auto kComputeInNCHW = 48 | + auto kComputeInNCHW = 49 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 50 | se::dnn::FilterLayout::kOutputInputYX); 51 | 52 | diff --git a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 53 | index 4899cd8..12d9705 100644 54 | --- a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 55 | +++ b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 56 | @@ -40,7 +40,7 @@ static const char kNotInvertibleScalarMsg[] = 57 | "The matrix is not invertible: it is a scalar with value zero."; 58 | 59 | template 60 | -__global__ void SolveForSizeOneOrTwoKernel(const int m, 61 | +__device__ void SolveForSizeOneOrTwoKernel(const int m, 62 | const Scalar* __restrict__ diags, 63 | const Scalar* __restrict__ rhs, 64 | const int num_rhs, 65 | diff --git a/tensorflow/core/util/gpu_device_functions.h b/tensorflow/core/util/gpu_device_functions.h 66 | index 7c54294..b5ac3a6 100644 67 | --- a/tensorflow/core/util/gpu_device_functions.h 68 | +++ 
--------------------------------------------------------------------------------
/source_patches/v2.2.0_macos.patch:
--------------------------------------------------------------------------------
diff --git a/configure b/configure
index 66b66ba..e43908e 100755
--- a/configure
+++ b/configure
@@ -4,7 +4,7 @@ set -e
 set -o pipefail
 
 if [ -z "$PYTHON_BIN_PATH" ]; then
-  PYTHON_BIN_PATH=$(which python || which python3 || true)
+  PYTHON_BIN_PATH=$(which python3 || which python || true)
 fi
 
 # Set all env variables
diff --git a/tensorflow/core/kernels/conv_grad_filter_ops.cc b/tensorflow/core/kernels/conv_grad_filter_ops.cc
index f9bf64f..eb9803c 100644
--- a/tensorflow/core/kernels/conv_grad_filter_ops.cc
+++ b/tensorflow/core/kernels/conv_grad_filter_ops.cc
@@ -839,10 +839,10 @@ void LaunchConv2DBackpropFilterOp<GPUDevice, T>::operator()(
           << " data_format=" << ToString(data_format)
           << " compute_data_format=" << ToString(compute_data_format);
 
-  constexpr auto kComputeInNHWC =
+  auto kComputeInNHWC =
      std::make_tuple(se::dnn::DataLayout::kBatchYXDepth,
                      se::dnn::FilterLayout::kOutputYXInput);
-  constexpr auto kComputeInNCHW =
+  auto kComputeInNCHW =
      std::make_tuple(se::dnn::DataLayout::kBatchDepthYX,
                      se::dnn::FilterLayout::kOutputInputYX);
 
diff --git a/tensorflow/core/kernels/conv_grad_input_ops.cc b/tensorflow/core/kernels/conv_grad_input_ops.cc
index be5d821..dd17d4b 100644
--- a/tensorflow/core/kernels/conv_grad_input_ops.cc
+++ b/tensorflow/core/kernels/conv_grad_input_ops.cc
@@ -997,10 +997,10 @@ void LaunchConv2DBackpropInputOp<GPUDevice, T>::operator()(
           << " data_format=" << ToString(data_format)
           << " compute_data_format=" << ToString(compute_data_format);
 
-  constexpr auto kComputeInNHWC =
+  auto kComputeInNHWC =
      std::make_tuple(se::dnn::DataLayout::kBatchYXDepth,
                      se::dnn::FilterLayout::kOutputYXInput);
-  constexpr auto kComputeInNCHW =
+  auto kComputeInNCHW =
      std::make_tuple(se::dnn::DataLayout::kBatchDepthYX,
                      se::dnn::FilterLayout::kOutputInputYX);
 
"%s/" \""", 143 | )""" % (name, "\n".join(outs), src_dir, out_dir) 144 | 145 | def _read_dir(repository_ctx, src_dir): 146 | -------------------------------------------------------------------------------- /source_patches/v2.2.0_macos.patch: -------------------------------------------------------------------------------- 1 | diff --git a/configure b/configure 2 | index 66b66ba..e43908e 100755 3 | --- a/configure 4 | +++ b/configure 5 | @@ -4,7 +4,7 @@ set -e 6 | set -o pipefail 7 | 8 | if [ -z "$PYTHON_BIN_PATH" ]; then 9 | - PYTHON_BIN_PATH=$(which python || which python3 || true) 10 | + PYTHON_BIN_PATH=$(which python3 || which python || true) 11 | fi 12 | 13 | # Set all env variables 14 | diff --git a/tensorflow/core/kernels/conv_grad_filter_ops.cc b/tensorflow/core/kernels/conv_grad_filter_ops.cc 15 | index f9bf64f..eb9803c 100644 16 | --- a/tensorflow/core/kernels/conv_grad_filter_ops.cc 17 | +++ b/tensorflow/core/kernels/conv_grad_filter_ops.cc 18 | @@ -839,10 +839,10 @@ void LaunchConv2DBackpropFilterOp::operator()( 19 | << " data_format=" << ToString(data_format) 20 | << " compute_data_format=" << ToString(compute_data_format); 21 | 22 | - constexpr auto kComputeInNHWC = 23 | + auto kComputeInNHWC = 24 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 25 | se::dnn::FilterLayout::kOutputYXInput); 26 | - constexpr auto kComputeInNCHW = 27 | + auto kComputeInNCHW = 28 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 29 | se::dnn::FilterLayout::kOutputInputYX); 30 | 31 | diff --git a/tensorflow/core/kernels/conv_grad_input_ops.cc b/tensorflow/core/kernels/conv_grad_input_ops.cc 32 | index be5d821..dd17d4b 100644 33 | --- a/tensorflow/core/kernels/conv_grad_input_ops.cc 34 | +++ b/tensorflow/core/kernels/conv_grad_input_ops.cc 35 | @@ -997,10 +997,10 @@ void LaunchConv2DBackpropInputOp::operator()( 36 | << " data_format=" << ToString(data_format) 37 | << " compute_data_format=" << ToString(compute_data_format); 38 | 39 | - constexpr auto kComputeInNHWC = 40 | + auto kComputeInNHWC = 41 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 42 | se::dnn::FilterLayout::kOutputYXInput); 43 | - constexpr auto kComputeInNCHW = 44 | + auto kComputeInNCHW = 45 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 46 | se::dnn::FilterLayout::kOutputInputYX); 47 | 48 | diff --git a/tensorflow/core/kernels/conv_ops.cc b/tensorflow/core/kernels/conv_ops.cc 49 | index d265e9d..354c9e9 100644 50 | --- a/tensorflow/core/kernels/conv_ops.cc 51 | +++ b/tensorflow/core/kernels/conv_ops.cc 52 | @@ -863,10 +863,10 @@ void LaunchConv2DOp::operator()( 53 | << "Negative row or col paddings: (" << common_padding_rows << ", " 54 | << common_padding_cols << ")"; 55 | 56 | - constexpr auto kComputeInNHWC = 57 | + auto kComputeInNHWC = 58 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 59 | se::dnn::FilterLayout::kOutputYXInput); 60 | - constexpr auto kComputeInNCHW = 61 | + auto kComputeInNCHW = 62 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 63 | se::dnn::FilterLayout::kOutputInputYX); 64 | 65 | diff --git a/tensorflow/core/kernels/data/experimental/snapshot_util.cc b/tensorflow/core/kernels/data/experimental/snapshot_util.cc 66 | index 391ece3..1df8c82 100644 67 | --- a/tensorflow/core/kernels/data/experimental/snapshot_util.cc 68 | +++ b/tensorflow/core/kernels/data/experimental/snapshot_util.cc 69 | @@ -32,6 +32,12 @@ limitations under the License. 
diff --git a/tensorflow/core/kernels/data/experimental/snapshot_util.h b/tensorflow/core/kernels/data/experimental/snapshot_util.h
index a2df3cc..43eda32 100644
--- a/tensorflow/core/kernels/data/experimental/snapshot_util.h
+++ b/tensorflow/core/kernels/data/experimental/snapshot_util.h
@@ -70,11 +70,11 @@ class SnapshotReader {
   // The reader input buffer size is deliberately large because the input reader
   // will throw an error if the compressed block length cannot fit in the input
   // buffer.
-  static constexpr const int64 kSnappyReaderInputBufferSizeBytes =
-      1 << 30;  // 1 GiB
+  //static constexpr const int64 kSnappyReaderInputBufferSizeBytes =
+  //    1 << 30;  // 1 GiB
   // TODO(b/148804377): Set this in a smarter fashion.
-  static constexpr const int64 kSnappyReaderOutputBufferSizeBytes =
-      32 << 20;  // 32 MiB
+  //static constexpr const int64 kSnappyReaderOutputBufferSizeBytes =
+  //    32 << 20;  // 32 MiB
   static constexpr const size_t kHeaderSize = sizeof(uint64);
 
   static constexpr const char* const kClassName = "SnapshotReader";
diff --git a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc
index 3825e29..c75fca7 100644
--- a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc
+++ b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc
@@ -40,7 +40,7 @@ static const char kNotInvertibleScalarMsg[] =
     "The matrix is not invertible: it is a scalar with value zero.";
 
 template <typename Scalar>
-__global__ void SolveForSizeOneOrTwoKernel(const int m,
+__device__ void SolveForSizeOneOrTwoKernel(const int m,
                                            const Scalar* __restrict__ diags,
                                            const Scalar* __restrict__ rhs,
                                            const int num_rhs,
diff --git a/tensorflow/core/platform/tstring.h b/tensorflow/core/platform/tstring.h
index 3fe1be2..515dbf7 100644
--- a/tensorflow/core/platform/tstring.h
+++ b/tensorflow/core/platform/tstring.h
@@ -15,7 +15,7 @@ limitations under the License.
 
 #ifndef TENSORFLOW_CORE_PLATFORM_TSTRING_H_
 #define TENSORFLOW_CORE_PLATFORM_TSTRING_H_
-
+#include <functional>  // Tom added
 #include <assert.h>
 
 #include <ostream>
@@ -225,7 +225,7 @@ class tstring {
   friend bool operator==(const std::string& a, const tstring& b);
   friend tstring operator+(const tstring& a, const tstring& b);
   friend std::ostream& operator<<(std::ostream& o, const tstring& str);
-  friend std::hash<tstring>;
+  //friend struct std::hash<tstring>;  //Tom modified
 };
 
 // Non-member function overloads
diff --git a/tensorflow/core/util/gpu_device_functions.h b/tensorflow/core/util/gpu_device_functions.h
index 7c54294..e648517 100644
--- a/tensorflow/core/util/gpu_device_functions.h
+++ b/tensorflow/core/util/gpu_device_functions.h
@@ -140,11 +140,11 @@ __device__ const unsigned kGpuWarpAll = 0xffffffff;
 __device__ inline unsigned GpuLaneId() {
   unsigned int lane_id;
 #if GOOGLE_CUDA
-#if __clang__
-  return __nvvm_read_ptx_sreg_laneid();
-#else  // __clang__
+//#if __clang__
+//  return __nvvm_read_ptx_sreg_laneid();
+//#else  // __clang__
   asm("mov.u32 %0, %%laneid;" : "=r"(lane_id));
-#endif  // __clang__
+//#endif  // __clang__
 #elif TENSORFLOW_USE_ROCM
   lane_id = __lane_id();
 #endif
diff --git a/tensorflow/core/util/gpu_kernel_helper.h b/tensorflow/core/util/gpu_kernel_helper.h
index 51fd2a8..cff59b6 100644
--- a/tensorflow/core/util/gpu_kernel_helper.h
+++ b/tensorflow/core/util/gpu_kernel_helper.h
@@ -57,7 +57,8 @@ using gpuError_t = hipError_t;
 #if GOOGLE_CUDA
 
 #define GPU_DYNAMIC_SHARED_MEM_DECL(ALIGN, TYPE, NAME) \
-  extern __shared__ __align__(ALIGN) TYPE NAME[]
+extern __shared__ TYPE NAME[]
+// extern __shared__ __align__(ALIGN) TYPE NAME[]
 
 #elif TENSORFLOW_USE_ROCM
 
diff --git a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
index 44bb359..6bb31fe 100644
--- a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
+++ b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
@@ -195,7 +195,7 @@ static string GetBinaryDir(bool strip_exe) {
   _NSGetExecutablePath(nullptr, &buffer_size);
   char unresolved_path[buffer_size];
   _NSGetExecutablePath(unresolved_path, &buffer_size);
-  CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1);
+  //CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1);
 #else
 #if defined(PLATFORM_WINDOWS)
   HMODULE hModule = GetModuleHandle(NULL);
diff --git a/third_party/gpus/cuda_configure.bzl b/third_party/gpus/cuda_configure.bzl
index bdaaa4a..544d1c9 100644
--- a/third_party/gpus/cuda_configure.bzl
+++ b/third_party/gpus/cuda_configure.bzl
@@ -462,7 +462,7 @@ def _check_cuda_lib_params(lib, cpu_value, basedir, version, static = False):
         _should_check_soname(version, static),
     )
 
-def _check_cuda_libs(repository_ctx, script_path, libs):
+def _check_cuda_libs_failed(repository_ctx, script_path, libs):
     python_bin = get_python_bin(repository_ctx)
     contents = repository_ctx.read(script_path).splitlines()
 
@@ -476,6 +476,7 @@ def _check_cuda_libs(repository_ctx, script_path, libs):
     cmd += "system('%s script.py %s');" % (python_bin, args)
 
     all_paths = [path for path, _ in libs]
+    print('cmd %s' % cmd)
    checked_paths = execute(repository_ctx, [python_bin, "-c", cmd]).stdout.splitlines()
 
     # Filter out empty lines from splitting on '\r\n' on Windows
@@ -483,6 +484,33 @@ def _check_cuda_libs(repository_ctx, script_path, libs):
     if all_paths != checked_paths:
         auto_configure_fail("Error with installed CUDA libs. Expected '%s'. Actual '%s'." % (all_paths, checked_paths))
 
+def _check_cuda_libs(repository_ctx, script_path, paths, check_soname = True):
+    """
+    Finds a library among a list of potential paths.
+    Args:
+      paths: List of paths to inspect.
+    Returns:
+      Returns the first path in paths that exist.
+    """
+    objdump = repository_ctx.which("objdump")
+    mismatches = []
+    for path in paths:
+        path = path[0]
+        print('mypath', path)
+        #if not path.exists:
+        #    continue
+        output = repository_ctx.execute([objdump, "-p", str(path)]).stdout
+        output = [line for line in output.splitlines() if "name @rpath/" in line]
+        sonames = [line.strip().split("/")[-1] for line in output]
+        sonames = [sonames[0].strip().split(" ")[0] for line in output]
+        return path
+
+    if mismatches:
+        auto_configure_fail(
+            "None of the libraries match their SONAME: " + ", ".join(mismatches),
+        )
+    auto_configure_fail("No library found under: " + ", ".join(paths))
+
 def _find_libs(repository_ctx, check_cuda_libs_script, cuda_config):
     """Returns the CUDA and cuDNN libraries on the system.
 
@@ -498,7 +526,7 @@ def _find_libs(repository_ctx, check_cuda_libs_script, cuda_config):
         Map of library names to structs of filename and path.
     """
     cpu_value = cuda_config.cpu_value
-    stub_dir = "" if is_windows(repository_ctx) else "/stubs"
+    stub_dir = "" if is_windows(repository_ctx) else ""
 
     check_cuda_libs_params = {
         "cuda": _check_cuda_lib_params(
@@ -826,7 +854,7 @@ def make_copy_dir_rule(repository_ctx, name, src_dir, out_dir):
     outs = [
         %s
     ],
-    cmd = \"""cp -rLf "%s/." "%s/" \""",
+    cmd = \"""cp -r -f "%s/." "%s/" \""",
)\""" % (name, "\n".join(outs), src_dir, out_dir)
 
 def _flag_enabled(repository_ctx, flag_name):
"%s/" \""", 254 | )""" % (name, "\n".join(outs), src_dir, out_dir) 255 | 256 | def _flag_enabled(repository_ctx, flag_name): 257 | -------------------------------------------------------------------------------- /usr_local_lib/libgcc_s.1.dylib: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TomHeaven/tensorflow-osx-build/aad96d435c6b91eaeeb841a4a9e070708872f40d/usr_local_lib/libgcc_s.1.dylib -------------------------------------------------------------------------------- /usr_local_lib/libgomp.1.dylib: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TomHeaven/tensorflow-osx-build/aad96d435c6b91eaeeb841a4a9e070708872f40d/usr_local_lib/libgomp.1.dylib --------------------------------------------------------------------------------