├── .gitignore ├── README.md ├── build_tutorial_1.10.md ├── build_tutorial_2.0.0.md ├── patch └── _nccl_ops.so ├── source_patches ├── .DS_Store ├── v1.15.0_macos.patch ├── v2.0.0_macos.patch ├── v2.1.0_macos.patch └── v2.2.0_macos.patch └── usr_local_lib ├── libgcc_s.1.dylib └── libgomp.1.dylib /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Tensorflow OSX Build 2 | 3 | Unfortunately, the Tensorflow team stops releasing binary package for Mac OS with CUDA support since Tensorflow 1.2. This project provides off-the-shelf binary packages. ``Both Python 2.7 and 3.7 are supported now!`` 4 | 5 | 很不幸,Tensorflow团队自从1.2版本开始停止了发布 Mac OS CUDA版。本项目提供 Mac OS 上编译好、可直接安装的Tensorflow CUDA版本。``本项目同时支持Python 2.7 和 3.7 了!`` 6 | 7 | 8 | 9 | # Releases 10 | 11 | You can find releases in the [releases page](https://github.com/TomHeaven/tensorflow-osx-build/releases). 12 | 13 | 你可以在[Releases页面](https://github.com/TomHeaven/tensorflow-osx-build/releases)找到以前发布的版本。 14 | 15 | # My Fork of Tensorflow 16 | 17 | Except for making patches for release verions of TF, I fork TF sources at [https://github.com/TomHeaven/tensorflow](https://github.com/TomHeaven/tensorflow) and keep fixing build issues of TF on macOS with CUDA. The corresponding PR is at: [https://github.com/tensorflow/tensorflow/pull/39297](https://github.com/tensorflow/tensorflow/pull/39297). You can use the PR to make your own builds. 18 | 19 | 20 | # Installation for Python 2.7 21 | 22 | First, ensure your CUDA driver and cudnn is installed properly, and copy dependencies in folder `usr_local_lib` to path `/usr/local/lib`. 23 | 24 | 首先,确保CUDA驱动和cudnn正确安装,并且将文件夹`usr_local_lib`中的依赖项复制到路径`/usr/local/lib`。 25 | 26 | ``` 27 | sudo mkdir /usr/local 28 | sudo mkdir /usr/local/lib 29 | sudo cp usr_local_lib/* /usr/local/lib/ 30 | ``` 31 | 32 | Second, uninstall the previous tensorflow installtion by 33 | 34 | 其次,卸载之前版本的Tensorflow: 35 | 36 | ``` 37 | pip uninstall tensorflow 38 | pip uninstall tensorflow-gpu # for early version with offical support 39 | ``` 40 | 41 | At last, download binary packages from [Releases](https://github.com/TomHeaven/tensorflow-osx-build/releases) page and install 42 | 43 | 最后,从[Releases页面](https://github.com/TomHeaven/tensorflow-osx-build/releases)下载并安装: 44 | 45 | ``` 46 | 47 | pip install tensorflow*.whl 48 | ``` 49 | 50 | # Installation for Python 3.7 51 | 52 | Install Python 3.7 from Homebrew first, and then simply follow the guide for Python 2.7 and replace `pip` command with `pip3` and `python` with `python3`. 53 | 54 | 首先从Homebrew安装Python 3.7,然后按照Python 2.7的安装步骤执行,注意将`pip`替换为`pip3`,并用`python3`启动`python`。 55 | 56 | 57 | 58 | Enjoy! 59 | 60 | 开始使用新版Tensorflow吧! 61 | 62 | # Build Tutorial 63 | If you want to build your own wheel packages, refer to the following tutorials: 64 | 65 | + [v1.10](https://github.com/TomHeaven/tensorflow-osx-build/blob/master/build_tutorial_1.10.md) 66 | + [v2.0.0](https://github.com/TomHeaven/tensorflow-osx-build/blob/master/build_tutorial_2.0.0.md) This tutorial also works for v1.15.0, just use source patch v1.15.0 instead of v2.0.0. 67 | 68 | 69 | 70 | # Related Links 71 | 72 | If you need Pytorch builds for osx, go to this page: [https://github.com/TomHeaven/pytorch-osx-build](https://github.com/TomHeaven/pytorch-osx-build) 73 | 74 | If you need MxNet builds for osx, go to this page: [https://github.com/TomHeaven/mxnet_osx_build](https://github.com/TomHeaven/mxnet_osx_build) 75 | 76 | 如果你需要Pytorch包,请看这个页面:[https://github.com/TomHeaven/pytorch-osx-build](https://github.com/TomHeaven/pytorch-osx-build) 77 | 78 | 如果你需要MxNet包,请看这个页面:[https://github.com/TomHeaven/mxnet_osx_build](https://github.com/TomHeaven/mxnet_osx_build) -------------------------------------------------------------------------------- /build_tutorial_1.10.md: -------------------------------------------------------------------------------- 1 | # Tensorflow OSX Build Tutorial (1.10) 2 | 3 | By `Tom Heaven` @ 2018.08.25 4 | 5 | Project page: [https://github.com/TomHeaven/tensorflow-osx-build](https://github.com/TomHeaven/tensorflow-osx-build) 6 | 7 | --- 8 | 9 | Note patches for every release of Tensorflow is a bit different! So the instructions work for v1.10.0 only. 10 | 11 | First make sure your XCode 8, CUDA 9.0 SDK and CUDNN 7 are properly installed, and Internet connection is also required. You need to install Python3 using `Homebrew` if you need to compile for Python3. 12 | 13 | If running on Mac OS High Sierra(10.13), you need to keep both XCode 8 (CUDA 9 SDK only supports this version) and XCode 9 (Homebrew requires this version) and switch between them by renaming. 14 | 15 | I've found that build against CUDA 9.2 does not work properly, which keeps reporting `CUDA OUT OF MEMORY ERROR` and my test program is easily killed by operating system. If you have interests, feel free to try. 16 | 17 | 18 | ## For Python 2.7 and 3.6 19 | 20 | The following instructions will help you build your own wheel files for Python 2.7 and 3.6 using CUDA 9.0. 21 | 22 | 23 | 24 | + download Tensorflow sources and switch to `v1.10.0` 25 | 26 | ```shell 27 | git clone https://github.com/tensorflow/tensorflow 28 | cd tensorflow 29 | git checkout v1.10.0 30 | ``` 31 | + patch sources using the following shell commands: 32 | 33 | ```shell 34 | # new patches 35 | sed "s/__align__(sizeof(T))/__align__(sizeof(T) > 16 ? sizeof(T) : 16)/" "tensorflow/core/kernels/split_lib_gpu.cu.cc" > tmp.h.cc 36 | cp -f tmp.h.cc "tensorflow/core/kernels/split_lib_gpu.cu.cc" 37 | rm -f tmp.h.cc 38 | ## 2 39 | sed "s/__align__(sizeof(T))/__align__(sizeof(T) > 16 ? sizeof(T) : 16)/" "tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc" > tmp.h.cc 40 | cp -f tmp.h.cc "tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc" 41 | rm -f tmp.h.cc 42 | ## 3 43 | sed "s/__align__(sizeof(T))/__align__(sizeof(T) > 16 ? sizeof(T) : 16)/" "tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc" > tmp.h.cc 44 | cp -f tmp.h.cc "tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc" 45 | rm -f tmp.h.cc 46 | 47 | ## 4 patch for nccl configuration 48 | sed "s/libnccl.so.%s/libnccl.%s.dylib/" "third_party/nccl/nccl_configure.bzl" > tmp.h.cc 49 | cp -f tmp.h.cc "third_party/nccl/nccl_configure.bzl" 50 | rm -f tmp.h.cc 51 | ``` 52 | + run `./configure`, note that: 53 | - using `/usr/local/bin/python3` if you want a build for python3. 54 | - select the correct site packages path. 55 | - Set correct CUDA compute capability values. 56 | - Use `/usr/bin/gcc` as the default compiler. 57 | 58 | Here is an example for Python 2.7: 59 | 60 | ``` 61 | MacBook-Pro:tensorflow-master tomheaven$ ./configure 62 | You have bazel 0.5.1-homebrew installed. 63 | Please specify the location of python. [Default is /usr/bin/python]: 64 | 65 | 66 | Found possible Python library paths: 67 | /Library/Python/2.7/site-packages 68 | /Users/tomheaven/Documents/caffe-master/python 69 | /Users/tomheaven/Documents/Current/demosaicnet-master/demosaicnet 70 | /Users/tomheaven/Documents/facenet/build/lib/src 71 | Please input the desired Python library path to use. Default is [/Library/Python/2.7/site-packages] 72 | 73 | Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]: 74 | No Google Cloud Platform support will be enabled for TensorFlow. 75 | 76 | Do you wish to build TensorFlow with Hadoop File System support? [y/N]: 77 | No Hadoop File System support will be enabled for TensorFlow. 78 | 79 | Do you wish to build TensorFlow with XLA JIT support? [y/N]: 80 | No XLA JIT support will be enabled for TensorFlow. 81 | 82 | Do you wish to build TensorFlow with GDR support? [y/N]: 83 | No GDR support will be enabled for TensorFlow. 84 | 85 | Do you wish to build TensorFlow with VERBS support? [y/N]: 86 | No VERBS support will be enabled for TensorFlow. 87 | 88 | Do you wish to build TensorFlow with OpenCL support? [y/N]: 89 | No OpenCL support will be enabled for TensorFlow. 90 | 91 | Do you wish to build TensorFlow with CUDA support? [y/N]: y 92 | CUDA support will be enabled for TensorFlow. 93 | 94 | Please specify the CUDA SDK version you want to use, e.g. 9.0. [Leave empty to default to CUDA 9.0]: 95 | 96 | 97 | Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 98 | 99 | 100 | Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 101 | 102 | 103 | Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 104 | 105 | 106 | Please specify a list of comma-separated Cuda compute capabilities you want to build with. 107 | You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. 108 | Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,5.2] 3.0,3.5,5.2,6.1 109 | 110 | 111 | Do you want to use clang as CUDA compiler? [y/N]: n 112 | nvcc will be used as CUDA compiler. 113 | 114 | Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/local/bin/gcc]: /usr/bin/gcc 115 | 116 | Do you wish to build TensorFlow with MPI support? [y/N]: n 117 | No MPI support will be enabled for TensorFlow. 118 | ``` 119 | + start building 120 | 121 | ``` 122 | bazel build --config=opt --config=nonccl //tensorflow/tools/pip_package:build_pip_package 123 | ``` 124 | Note `NCCL` only works on Linux now, which does not work on Mac OS X or Windows. We need to disable it by passing `--config=nonccl`, or you will meet NCCL related errors. 125 | 126 | + generate a wheel package 127 | ``` 128 | bazel-bin/tensorflow/tools/pip_package/build_pip_package ./ 129 | ``` 130 | 131 | ## For Python 3.7 132 | Python 3.7 is newly released. There are a few major changes affecting Tensorflow v1.10.0: 133 | 134 | + PyUnicode_AsXXXAndSize() function returns `const char*` rather than `char*`. Refer to this post: [https://github.com/protocolbuffers/protobuf/pull/4862/files](https://github.com/protocolbuffers/protobuf/pull/4862/files) for patches to protobuf. Similar patches are required for the following files: 135 | 136 | - `tensorflow/python/lib/core/ndarray_tensor.cc:157:13` 137 | - `tensorflow/python/lib/core/py_func.cc:355:16` 138 | - `tensorflow/python/eager/pywrap_tfe_src.cc:219:11` 139 | 140 | + `async` is a reserved word now. Replace function parameter `async` with `t_async` (or any other valid parameter name) in `/tensorflow/c/eager/c_api`. There are four places requiring the patch. 141 | 142 | After applying the additional patches, Tensorflow 1.10.0 should be built on Python 3.7. 143 | 144 | 145 | -------------------------------------------------------------------------------- /build_tutorial_2.0.0.md: -------------------------------------------------------------------------------- 1 | # Tensorflow OSX Build Tutorial (v2.0.0) 2 | 3 | By `Tom Heaven` @ 2019.10.01 4 | 5 | Project page: [https://github.com/TomHeaven/tensorflow-osx-build](https://github.com/TomHeaven/tensorflow-osx-build) 6 | 7 | --- 8 | 9 | Note patches for every release of Tensorflow is different! So the instructions work for v2.0.0 only. 10 | 11 | First make sure your XCode 9.4.1, CUDA 10.0 SDK and CUDNN 7.4 are properly installed, and Internet connection is also required. Install Python3 using `Homebrew` if you need to compile for Python3. 12 | 13 | 14 | ## For Python 3.7 15 | The following instructions will help you build your own wheel files for Python 3.7 with CUDA 10.0. 16 | 17 | + download Tensorflow sources and switch to `v2.0.0` 18 | 19 | ```shell 20 | git clone https://github.com/tensorflow/tensorflow 21 | cd tensorflow 22 | git checkout v2.0.0 23 | ``` 24 | + download and use source patches in this repo to patch sources: 25 | 26 | ```shell 27 | git apply v2.0.0_macos.patch 28 | 29 | ``` 30 | + run `./configure`, note that: 31 | - using `/usr/local/bin/python3` if you want a build for python3. 32 | - select the correct site packages path. 33 | - Set correct CUDA compute capability values. 34 | - Use `/usr/bin/gcc` as the default compiler. 35 | 36 | Here is an example for Python 3.7: 37 | 38 | ``` 39 | iMac18:tensorflow tomheaven$ ./configure 40 | WARNING: Running Bazel server needs to be killed, because the startup options are different. 41 | WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown". 42 | You have bazel 0.24.1 installed. 43 | Please specify the location of python. [Default is /usr/bin/python]: /usr/local/bin/python3 44 | 45 | 46 | Found possible Python library paths: 47 | /usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages 48 | Please input the desired Python library path to use. Default is [/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages] 49 | 50 | Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n 51 | No XLA JIT support will be enabled for TensorFlow. 52 | 53 | Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: 54 | No OpenCL SYCL support will be enabled for TensorFlow. 55 | 56 | Do you wish to build TensorFlow with ROCm support? [y/N]: 57 | No ROCm support will be enabled for TensorFlow. 58 | 59 | Do you wish to build TensorFlow with CUDA support? [y/N]: y 60 | CUDA support will be enabled for TensorFlow. 61 | 62 | Found CUDA 10.0 in: 63 | /usr/local/cuda/lib64 64 | /usr/local/cuda/include 65 | Found cuDNN 7 in: 66 | /usr/local/cuda/lib64 67 | /usr/local/cuda/include 68 | 69 | 70 | Please specify a list of comma-separated CUDA compute capabilities you want to build with. 71 | You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. 72 | Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 3.5,7.0]: 3.0,3.5,5.0,5.2,6.1,7.0 73 | 74 | 75 | WARNING: XLA does not support CUDA compute capabilities lower than 3.5. Disable XLA when running on older GPUs. 76 | Do you want to use clang as CUDA compiler? [y/N]: 77 | nvcc will be used as CUDA compiler. 78 | 79 | Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/local/bin/gcc]: /usr/bin/gcc 80 | 81 | 82 | Do you wish to build TensorFlow with MPI support? [y/N]: 83 | No MPI support will be enabled for TensorFlow. 84 | 85 | Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]: 86 | 87 | 88 | Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: 89 | Not configuring the WORKSPACE for Android builds. 90 | 91 | Do you wish to build TensorFlow with iOS support? [y/N]: 92 | No iOS support will be enabled for TensorFlow. 93 | 94 | Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details. 95 | --config=mkl # Build with MKL support. 96 | --config=monolithic # Config for mostly static monolithic build. 97 | --config=gdr # Build with GDR support. 98 | --config=verbs # Build with libverbs support. 99 | --config=ngraph # Build with Intel nGraph support. 100 | --config=numa # Build with NUMA support. 101 | --config=dynamic_kernels # (Experimental) Build kernels into separate shared objects. 102 | --config=v2 # Build TensorFlow 2.x instead of 1.x. 103 | Preconfigured Bazel build configs to DISABLE default on features: 104 | --config=noaws # Disable AWS S3 filesystem support. 105 | --config=nogcp # Disable GCP support. 106 | --config=nohdfs # Disable HDFS support. 107 | --config=noignite # Disable Apache Ignite support. 108 | --config=nokafka # Disable Apache Kafka support. 109 | --config=nonccl # Disable NVIDIA NCCL support. 110 | Configuration finished 111 | ``` 112 | + start building 113 | 114 | ``` 115 | bazel build --config=opt --config=nonccl //tensorflow/tools/pip_package:build_pip_package 116 | ``` 117 | Note `NCCL` only works on Linux now, which does not work on Mac OS X or Windows. We need to disable it by passing `--config=nonccl`, or you will meet NCCL related errors. 118 | 119 | + fix a source error in external sources. Previous building process will end with an error: 120 | 121 | ``` 122 | external/com_google_absl/absl/container/internal/compressed_tuple.h:170:53: error: use 'template' keyword to treat 'Storage' as a dependent template name 123 | return (std::move(*this).internal_compressed_tuple::Storage< CompressedTuple, I> ::get()); 124 | ``` 125 | 126 | Solution: Edit `bazel-tensorflow/external/com_google_absl/absl/container/internal/compressed_tuple.h:168-178` and comment out two problematic functions: 127 | 128 | ```cpp 129 | 130 | /*template 131 | ElemT&& get() && { 132 | return std::move(*this).internal_compressed_tuple::template Storage::get(); 133 | } 134 | template 135 | constexpr const ElemT&& get() const&& { 136 | return absl::move(*this).internal_compressed_tuple::template Storage::get(); 137 | }*/ 138 | 139 | ``` 140 | Then, build again using the same previous commands. This time, the building process should be successful. 141 | 142 | + generate a wheel package 143 | 144 | ``` 145 | bazel-bin/tensorflow/tools/pip_package/build_pip_package ./ 146 | ``` 147 | 148 | ## For Python 2.7 149 | 150 | For Python 2.7, we need an additional patch for external source file. Edit file `bazel-tensorflow/external/cython/Cython/Compiler/Nodes.py` and add the following lines at top: 151 | 152 | ```python 153 | import sys 154 | if sys.version < '3': 155 | reload(sys) 156 | sys.setdefaultencoding('utf-8') 157 | ``` 158 | Note external source patch can be applied only after the first build failure because external sources are downloaded at the beginning of the building process. 159 | 160 | 161 | -------------------------------------------------------------------------------- /patch/_nccl_ops.so: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TomHeaven/tensorflow-osx-build/aad96d435c6b91eaeeb841a4a9e070708872f40d/patch/_nccl_ops.so -------------------------------------------------------------------------------- /source_patches/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TomHeaven/tensorflow-osx-build/aad96d435c6b91eaeeb841a4a9e070708872f40d/source_patches/.DS_Store -------------------------------------------------------------------------------- /source_patches/v1.15.0_macos.patch: -------------------------------------------------------------------------------- 1 | diff --git a/tensorflow/core/kernels/conv_grad_filter_ops.cc b/tensorflow/core/kernels/conv_grad_filter_ops.cc 2 | index 84588e2ff5..1686738126 100644 3 | --- a/tensorflow/core/kernels/conv_grad_filter_ops.cc 4 | +++ b/tensorflow/core/kernels/conv_grad_filter_ops.cc 5 | @@ -836,10 +836,10 @@ void LaunchConv2DBackpropFilterOp::operator()( 6 | << " data_format=" << ToString(data_format) 7 | << " compute_data_format=" << ToString(compute_data_format); 8 | 9 | - constexpr auto kComputeInNHWC = 10 | + auto kComputeInNHWC = 11 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 12 | se::dnn::FilterLayout::kOutputYXInput); 13 | - constexpr auto kComputeInNCHW = 14 | + auto kComputeInNCHW = 15 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 16 | se::dnn::FilterLayout::kOutputInputYX); 17 | 18 | diff --git a/tensorflow/core/kernels/conv_grad_input_ops.cc b/tensorflow/core/kernels/conv_grad_input_ops.cc 19 | index 165560d678..1a1814e302 100644 20 | --- a/tensorflow/core/kernels/conv_grad_input_ops.cc 21 | +++ b/tensorflow/core/kernels/conv_grad_input_ops.cc 22 | @@ -952,10 +952,10 @@ void LaunchConv2DBackpropInputOp::operator()( 23 | << " data_format=" << ToString(data_format) 24 | << " compute_data_format=" << ToString(compute_data_format); 25 | 26 | - constexpr auto kComputeInNHWC = 27 | + auto kComputeInNHWC = 28 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 29 | se::dnn::FilterLayout::kOutputYXInput); 30 | - constexpr auto kComputeInNCHW = 31 | + auto kComputeInNCHW = 32 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 33 | se::dnn::FilterLayout::kOutputInputYX); 34 | 35 | diff --git a/tensorflow/core/kernels/conv_ops.cc b/tensorflow/core/kernels/conv_ops.cc 36 | index d453e9d68e..e4b7c2c28d 100644 37 | --- a/tensorflow/core/kernels/conv_ops.cc 38 | +++ b/tensorflow/core/kernels/conv_ops.cc 39 | @@ -819,10 +819,10 @@ void LaunchConv2DOp::operator()( 40 | << "Negative row or col paddings: (" << common_padding_rows << ", " 41 | << common_padding_cols << ")"; 42 | 43 | - constexpr auto kComputeInNHWC = 44 | + auto kComputeInNHWC = 45 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 46 | se::dnn::FilterLayout::kOutputYXInput); 47 | - constexpr auto kComputeInNCHW = 48 | + auto kComputeInNCHW = 49 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 50 | se::dnn::FilterLayout::kOutputInputYX); 51 | 52 | diff --git a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 53 | index 88a3f2d1ca..2d695981be 100644 54 | --- a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 55 | +++ b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 56 | @@ -40,7 +40,7 @@ static const char kNotInvertibleScalarMsg[] = 57 | "The matrix is not invertible: it is a scalar with value zero."; 58 | 59 | template 60 | -__global__ void SolveForSizeOneOrTwoKernel(const int m, const Scalar* diags, 61 | +__device__ void SolveForSizeOneOrTwoKernel(const int m, const Scalar* diags, 62 | const Scalar* rhs, const int num_rhs, 63 | Scalar* x, bool* not_invertible) { 64 | if (m == 1) { 65 | diff --git a/tensorflow/core/util/gpu_device_functions.h b/tensorflow/core/util/gpu_device_functions.h 66 | index 9040e78d6f..ff41a09669 100644 67 | --- a/tensorflow/core/util/gpu_device_functions.h 68 | +++ b/tensorflow/core/util/gpu_device_functions.h 69 | @@ -140,11 +140,11 @@ __device__ const unsigned kGpuWarpAll = 0xffffffff; 70 | __device__ inline unsigned GpuLaneId() { 71 | unsigned int lane_id; 72 | #if GOOGLE_CUDA 73 | -#if __clang__ 74 | - return __nvvm_read_ptx_sreg_laneid(); 75 | -#else // __clang__ 76 | - asm("mov.u32 %0, %%laneid;" : "=r"(lane_id)); 77 | -#endif // __clang__ 78 | + //#if __clang__ 79 | + // return __nvvm_read_ptx_sreg_laneid(); 80 | + //#else // __clang__ 81 | + asm("mov.u32 %0, %%laneid;" : "=r"(lane_id)); 82 | + //#endif // __clang__ 83 | #elif TENSORFLOW_USE_ROCM 84 | lane_id = __lane_id(); 85 | #endif 86 | diff --git a/tensorflow/core/util/gpu_kernel_helper.h b/tensorflow/core/util/gpu_kernel_helper.h 87 | index 51fd2a84e3..ab2dda65c9 100644 88 | --- a/tensorflow/core/util/gpu_kernel_helper.h 89 | +++ b/tensorflow/core/util/gpu_kernel_helper.h 90 | @@ -57,7 +57,7 @@ using gpuError_t = hipError_t; 91 | #if GOOGLE_CUDA 92 | 93 | #define GPU_DYNAMIC_SHARED_MEM_DECL(ALIGN, TYPE, NAME) \ 94 | - extern __shared__ __align__(ALIGN) TYPE NAME[] 95 | + extern __shared__ TYPE NAME[] 96 | 97 | #elif TENSORFLOW_USE_ROCM 98 | 99 | diff --git a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 100 | index 38d3dc9846..cc60169754 100644 101 | --- a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 102 | +++ b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 103 | @@ -195,7 +195,7 @@ static string GetBinaryDir(bool strip_exe) { 104 | _NSGetExecutablePath(nullptr, &buffer_size); 105 | char unresolved_path[buffer_size]; 106 | _NSGetExecutablePath(unresolved_path, &buffer_size); 107 | - CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1); 108 | +//CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1); 109 | #else 110 | #if defined(PLATFORM_WINDOWS) 111 | HMODULE hModule = GetModuleHandle(NULL); 112 | diff --git a/tensorflow/tools/pip_package/setup.py b/tensorflow/tools/pip_package/setup.py 113 | index 992f2eae22..886359d790 100644 114 | --- a/tensorflow/tools/pip_package/setup.py 115 | +++ b/tensorflow/tools/pip_package/setup.py 116 | @@ -52,8 +52,8 @@ _VERSION = '1.15.0' 117 | REQUIRED_PACKAGES = [ 118 | 'absl-py >= 0.7.0', 119 | 'astor >= 0.6.0', 120 | - 'backports.weakref >= 1.0rc1;python_version<"3.4"', 121 | - 'enum34 >= 1.1.6;python_version<"3.4"', 122 | + 'backports.weakref >= 1.0', 123 | + 'enum34 >= 1.1.6', 124 | 'gast == 0.2.2', 125 | 'google_pasta >= 0.1.6', 126 | 'keras_applications >= 1.0.8', 127 | diff --git a/third_party/gpus/cuda_configure.bzl b/third_party/gpus/cuda_configure.bzl 128 | index cf63adcbaa..418a4c61f8 100644 129 | --- a/third_party/gpus/cuda_configure.bzl 130 | +++ b/third_party/gpus/cuda_configure.bzl 131 | @@ -553,8 +553,9 @@ def find_lib(repository_ctx, paths, check_soname = True): 132 | continue 133 | if check_soname and objdump != None and not _is_windows(repository_ctx): 134 | output = repository_ctx.execute([objdump, "-p", str(path)]).stdout 135 | - output = [line for line in output.splitlines() if "SONAME" in line] 136 | - sonames = [line.strip().split(" ")[-1] for line in output] 137 | + output = [line for line in output.splitlines() if "name @rpath/" in line] 138 | + sonames = [line.strip().split("/")[-1] for line in output] 139 | + sonames = [sonames[0].strip().split(" ")[0] for line in output] 140 | if not any([soname == path.basename for soname in sonames]): 141 | mismatches.append(str(path)) 142 | continue 143 | @@ -603,7 +604,7 @@ def _find_libs(repository_ctx, cuda_config): 144 | Map of library names to structs of filename and path. 145 | """ 146 | cpu_value = cuda_config.cpu_value 147 | - stub_dir = "" if _is_windows(repository_ctx) else "/stubs" 148 | + stub_dir = "" if _is_windows(repository_ctx) else "" 149 | return { 150 | "cuda": _find_cuda_lib( 151 | "cuda", 152 | @@ -932,7 +933,7 @@ def make_copy_dir_rule(repository_ctx, name, src_dir, out_dir): 153 | outs = [ 154 | %s 155 | ], 156 | - cmd = \"""cp -rLf "%s/." "%s/" \""", 157 | + cmd = \"""cp -r -f "%s/." "%s/" \""", 158 | )""" % (name, "\n".join(outs), src_dir, out_dir) 159 | 160 | def _read_dir(repository_ctx, src_dir): 161 | -------------------------------------------------------------------------------- /source_patches/v2.0.0_macos.patch: -------------------------------------------------------------------------------- 1 | diff --git a/tensorflow/core/kernels/conv_grad_filter_ops.cc b/tensorflow/core/kernels/conv_grad_filter_ops.cc 2 | index 9d5f316..4f95a38 100644 3 | --- a/tensorflow/core/kernels/conv_grad_filter_ops.cc 4 | +++ b/tensorflow/core/kernels/conv_grad_filter_ops.cc 5 | @@ -831,10 +831,10 @@ void LaunchConv2DBackpropFilterOp::operator()( 6 | << " data_format=" << ToString(data_format) 7 | << " compute_data_format=" << ToString(compute_data_format); 8 | 9 | - constexpr auto kComputeInNHWC = 10 | + auto kComputeInNHWC = 11 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 12 | se::dnn::FilterLayout::kOutputYXInput); 13 | - constexpr auto kComputeInNCHW = 14 | + auto kComputeInNCHW = 15 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 16 | se::dnn::FilterLayout::kOutputInputYX); 17 | 18 | diff --git a/tensorflow/core/kernels/conv_grad_input_ops.cc b/tensorflow/core/kernels/conv_grad_input_ops.cc 19 | index 8974aa1..71daf6c 100644 20 | --- a/tensorflow/core/kernels/conv_grad_input_ops.cc 21 | +++ b/tensorflow/core/kernels/conv_grad_input_ops.cc 22 | @@ -947,10 +947,10 @@ void LaunchConv2DBackpropInputOp::operator()( 23 | << " data_format=" << ToString(data_format) 24 | << " compute_data_format=" << ToString(compute_data_format); 25 | 26 | - constexpr auto kComputeInNHWC = 27 | + auto kComputeInNHWC = 28 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 29 | se::dnn::FilterLayout::kOutputYXInput); 30 | - constexpr auto kComputeInNCHW = 31 | + auto kComputeInNCHW = 32 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 33 | se::dnn::FilterLayout::kOutputInputYX); 34 | 35 | diff --git a/tensorflow/core/kernels/conv_ops.cc b/tensorflow/core/kernels/conv_ops.cc 36 | index 5ad2489..e26c31f 100644 37 | --- a/tensorflow/core/kernels/conv_ops.cc 38 | +++ b/tensorflow/core/kernels/conv_ops.cc 39 | @@ -864,10 +864,10 @@ void LaunchConv2DOp::operator()( 40 | << "Negative row or col paddings: (" << common_padding_rows << ", " 41 | << common_padding_cols << ")"; 42 | 43 | - constexpr auto kComputeInNHWC = 44 | + auto kComputeInNHWC = 45 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 46 | se::dnn::FilterLayout::kOutputYXInput); 47 | - constexpr auto kComputeInNCHW = 48 | + auto kComputeInNCHW = 49 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 50 | se::dnn::FilterLayout::kOutputInputYX); 51 | 52 | diff --git a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 53 | index 88a3f2d..2d69598 100644 54 | --- a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 55 | +++ b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 56 | @@ -40,7 +40,7 @@ static const char kNotInvertibleScalarMsg[] = 57 | "The matrix is not invertible: it is a scalar with value zero."; 58 | 59 | template 60 | -__global__ void SolveForSizeOneOrTwoKernel(const int m, const Scalar* diags, 61 | +__device__ void SolveForSizeOneOrTwoKernel(const int m, const Scalar* diags, 62 | const Scalar* rhs, const int num_rhs, 63 | Scalar* x, bool* not_invertible) { 64 | if (m == 1) { 65 | diff --git a/tensorflow/core/util/gpu_device_functions.h b/tensorflow/core/util/gpu_device_functions.h 66 | index 9040e78..ff41a09 100644 67 | --- a/tensorflow/core/util/gpu_device_functions.h 68 | +++ b/tensorflow/core/util/gpu_device_functions.h 69 | @@ -140,11 +140,11 @@ __device__ const unsigned kGpuWarpAll = 0xffffffff; 70 | __device__ inline unsigned GpuLaneId() { 71 | unsigned int lane_id; 72 | #if GOOGLE_CUDA 73 | -#if __clang__ 74 | - return __nvvm_read_ptx_sreg_laneid(); 75 | -#else // __clang__ 76 | - asm("mov.u32 %0, %%laneid;" : "=r"(lane_id)); 77 | -#endif // __clang__ 78 | + //#if __clang__ 79 | + // return __nvvm_read_ptx_sreg_laneid(); 80 | + //#else // __clang__ 81 | + asm("mov.u32 %0, %%laneid;" : "=r"(lane_id)); 82 | + //#endif // __clang__ 83 | #elif TENSORFLOW_USE_ROCM 84 | lane_id = __lane_id(); 85 | #endif 86 | diff --git a/tensorflow/core/util/gpu_kernel_helper.h b/tensorflow/core/util/gpu_kernel_helper.h 87 | index 51fd2a8..2a9d8cb 100644 88 | --- a/tensorflow/core/util/gpu_kernel_helper.h 89 | +++ b/tensorflow/core/util/gpu_kernel_helper.h 90 | @@ -57,7 +57,7 @@ using gpuError_t = hipError_t; 91 | #if GOOGLE_CUDA 92 | 93 | #define GPU_DYNAMIC_SHARED_MEM_DECL(ALIGN, TYPE, NAME) \ 94 | - extern __shared__ __align__(ALIGN) TYPE NAME[] 95 | + extern __shared__ TYPE NAME[] 96 | 97 | #elif TENSORFLOW_USE_ROCM 98 | 99 | diff --git a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 100 | index a9289e3..db727bb 100644 101 | --- a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 102 | +++ b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 103 | @@ -195,7 +195,7 @@ static string GetBinaryDir(bool strip_exe) { 104 | _NSGetExecutablePath(nullptr, &buffer_size); 105 | char unresolved_path[buffer_size]; 106 | _NSGetExecutablePath(unresolved_path, &buffer_size); 107 | - CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1); 108 | + //CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1); 109 | #else 110 | #if defined(PLATFORM_WINDOWS) 111 | HMODULE hModule = GetModuleHandle(NULL); 112 | diff --git a/third_party/gpus/cuda_configure.bzl b/third_party/gpus/cuda_configure.bzl 113 | index cf63adc..418a4c6 100644 114 | --- a/third_party/gpus/cuda_configure.bzl 115 | +++ b/third_party/gpus/cuda_configure.bzl 116 | @@ -553,8 +553,9 @@ def find_lib(repository_ctx, paths, check_soname = True): 117 | continue 118 | if check_soname and objdump != None and not _is_windows(repository_ctx): 119 | output = repository_ctx.execute([objdump, "-p", str(path)]).stdout 120 | - output = [line for line in output.splitlines() if "SONAME" in line] 121 | - sonames = [line.strip().split(" ")[-1] for line in output] 122 | + output = [line for line in output.splitlines() if "name @rpath/" in line] 123 | + sonames = [line.strip().split("/")[-1] for line in output] 124 | + sonames = [sonames[0].strip().split(" ")[0] for line in output] 125 | if not any([soname == path.basename for soname in sonames]): 126 | mismatches.append(str(path)) 127 | continue 128 | @@ -603,7 +604,7 @@ def _find_libs(repository_ctx, cuda_config): 129 | Map of library names to structs of filename and path. 130 | """ 131 | cpu_value = cuda_config.cpu_value 132 | - stub_dir = "" if _is_windows(repository_ctx) else "/stubs" 133 | + stub_dir = "" if _is_windows(repository_ctx) else "" 134 | return { 135 | "cuda": _find_cuda_lib( 136 | "cuda", 137 | @@ -932,7 +933,7 @@ def make_copy_dir_rule(repository_ctx, name, src_dir, out_dir): 138 | outs = [ 139 | %s 140 | ], 141 | - cmd = \"""cp -rLf "%s/." "%s/" \""", 142 | + cmd = \"""cp -r -f "%s/." "%s/" \""", 143 | )""" % (name, "\n".join(outs), src_dir, out_dir) 144 | 145 | def _read_dir(repository_ctx, src_dir): 146 | -------------------------------------------------------------------------------- /source_patches/v2.1.0_macos.patch: -------------------------------------------------------------------------------- 1 | diff --git a/tensorflow/core/kernels/conv_grad_filter_ops.cc b/tensorflow/core/kernels/conv_grad_filter_ops.cc 2 | index 594dbd0..a533e6f 100644 3 | --- a/tensorflow/core/kernels/conv_grad_filter_ops.cc 4 | +++ b/tensorflow/core/kernels/conv_grad_filter_ops.cc 5 | @@ -839,10 +839,10 @@ void LaunchConv2DBackpropFilterOp::operator()( 6 | << " data_format=" << ToString(data_format) 7 | << " compute_data_format=" << ToString(compute_data_format); 8 | 9 | - constexpr auto kComputeInNHWC = 10 | + auto kComputeInNHWC = 11 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 12 | se::dnn::FilterLayout::kOutputYXInput); 13 | - constexpr auto kComputeInNCHW = 14 | + auto kComputeInNCHW = 15 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 16 | se::dnn::FilterLayout::kOutputInputYX); 17 | 18 | diff --git a/tensorflow/core/kernels/conv_grad_input_ops.cc b/tensorflow/core/kernels/conv_grad_input_ops.cc 19 | index 2f6200e..c9e17c9 100644 20 | --- a/tensorflow/core/kernels/conv_grad_input_ops.cc 21 | +++ b/tensorflow/core/kernels/conv_grad_input_ops.cc 22 | @@ -997,10 +997,10 @@ void LaunchConv2DBackpropInputOp::operator()( 23 | << " data_format=" << ToString(data_format) 24 | << " compute_data_format=" << ToString(compute_data_format); 25 | 26 | - constexpr auto kComputeInNHWC = 27 | + auto kComputeInNHWC = 28 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 29 | se::dnn::FilterLayout::kOutputYXInput); 30 | - constexpr auto kComputeInNCHW = 31 | + auto kComputeInNCHW = 32 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 33 | se::dnn::FilterLayout::kOutputInputYX); 34 | 35 | diff --git a/tensorflow/core/kernels/conv_ops.cc b/tensorflow/core/kernels/conv_ops.cc 36 | index d5ce7de..5a36c53 100644 37 | --- a/tensorflow/core/kernels/conv_ops.cc 38 | +++ b/tensorflow/core/kernels/conv_ops.cc 39 | @@ -859,10 +859,10 @@ void LaunchConv2DOp::operator()( 40 | << "Negative row or col paddings: (" << common_padding_rows << ", " 41 | << common_padding_cols << ")"; 42 | 43 | - constexpr auto kComputeInNHWC = 44 | + auto kComputeInNHWC = 45 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 46 | se::dnn::FilterLayout::kOutputYXInput); 47 | - constexpr auto kComputeInNCHW = 48 | + auto kComputeInNCHW = 49 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 50 | se::dnn::FilterLayout::kOutputInputYX); 51 | 52 | diff --git a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 53 | index 4899cd8..12d9705 100644 54 | --- a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 55 | +++ b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 56 | @@ -40,7 +40,7 @@ static const char kNotInvertibleScalarMsg[] = 57 | "The matrix is not invertible: it is a scalar with value zero."; 58 | 59 | template 60 | -__global__ void SolveForSizeOneOrTwoKernel(const int m, 61 | +__device__ void SolveForSizeOneOrTwoKernel(const int m, 62 | const Scalar* __restrict__ diags, 63 | const Scalar* __restrict__ rhs, 64 | const int num_rhs, 65 | diff --git a/tensorflow/core/util/gpu_device_functions.h b/tensorflow/core/util/gpu_device_functions.h 66 | index 7c54294..b5ac3a6 100644 67 | --- a/tensorflow/core/util/gpu_device_functions.h 68 | +++ b/tensorflow/core/util/gpu_device_functions.h 69 | @@ -140,11 +140,11 @@ __device__ const unsigned kGpuWarpAll = 0xffffffff; 70 | __device__ inline unsigned GpuLaneId() { 71 | unsigned int lane_id; 72 | #if GOOGLE_CUDA 73 | -#if __clang__ 74 | - return __nvvm_read_ptx_sreg_laneid(); 75 | -#else // __clang__ 76 | - asm("mov.u32 %0, %%laneid;" : "=r"(lane_id)); 77 | -#endif // __clang__ 78 | + //#if __clang__ 79 | + // return __nvvm_read_ptx_sreg_laneid(); 80 | + //#else // __clang__ 81 | + asm("mov.u32 %0, %%laneid;" : "=r"(lane_id)); 82 | + //#endif // __clang__ 83 | #elif TENSORFLOW_USE_ROCM 84 | lane_id = __lane_id(); 85 | #endif 86 | diff --git a/tensorflow/core/util/gpu_kernel_helper.h b/tensorflow/core/util/gpu_kernel_helper.h 87 | index 51fd2a8..2a9d8cb 100644 88 | --- a/tensorflow/core/util/gpu_kernel_helper.h 89 | +++ b/tensorflow/core/util/gpu_kernel_helper.h 90 | @@ -57,7 +57,7 @@ using gpuError_t = hipError_t; 91 | #if GOOGLE_CUDA 92 | 93 | #define GPU_DYNAMIC_SHARED_MEM_DECL(ALIGN, TYPE, NAME) \ 94 | - extern __shared__ __align__(ALIGN) TYPE NAME[] 95 | + extern __shared__ TYPE NAME[] 96 | 97 | #elif TENSORFLOW_USE_ROCM 98 | 99 | diff --git a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 100 | index 44bb359..6bb31fe 100644 101 | --- a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 102 | +++ b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 103 | @@ -195,7 +195,7 @@ static string GetBinaryDir(bool strip_exe) { 104 | _NSGetExecutablePath(nullptr, &buffer_size); 105 | char unresolved_path[buffer_size]; 106 | _NSGetExecutablePath(unresolved_path, &buffer_size); 107 | - CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1); 108 | + //CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1); 109 | #else 110 | #if defined(PLATFORM_WINDOWS) 111 | HMODULE hModule = GetModuleHandle(NULL); 112 | diff --git a/third_party/gpus/cuda_configure.bzl b/third_party/gpus/cuda_configure.bzl 113 | index af1bc96..9719a83 100644 114 | --- a/third_party/gpus/cuda_configure.bzl 115 | +++ b/third_party/gpus/cuda_configure.bzl 116 | @@ -568,8 +568,9 @@ def find_lib(repository_ctx, paths, check_soname = True): 117 | continue 118 | if check_soname and objdump != None and not _is_windows(repository_ctx): 119 | output = repository_ctx.execute([objdump, "-p", str(path)]).stdout 120 | - output = [line for line in output.splitlines() if "SONAME" in line] 121 | - sonames = [line.strip().split(" ")[-1] for line in output] 122 | + output = [line for line in output.splitlines() if "name @rpath/" in line] 123 | + sonames = [line.strip().split("/")[-1] for line in output] 124 | + sonames = [sonames[0].strip().split(" ")[0] for line in output] 125 | if not any([soname == path.basename for soname in sonames]): 126 | mismatches.append(str(path)) 127 | continue 128 | @@ -618,7 +619,7 @@ def _find_libs(repository_ctx, cuda_config): 129 | Map of library names to structs of filename and path. 130 | """ 131 | cpu_value = cuda_config.cpu_value 132 | - stub_dir = "" if _is_windows(repository_ctx) else "/stubs" 133 | + stub_dir = "" if _is_windows(repository_ctx) else "" 134 | return { 135 | "cuda": _find_cuda_lib( 136 | "cuda", 137 | @@ -947,7 +948,7 @@ def make_copy_dir_rule(repository_ctx, name, src_dir, out_dir): 138 | outs = [ 139 | %s 140 | ], 141 | - cmd = \"""cp -rLf "%s/." "%s/" \""", 142 | + cmd = \"""cp -r -f "%s/." "%s/" \""", 143 | )""" % (name, "\n".join(outs), src_dir, out_dir) 144 | 145 | def _read_dir(repository_ctx, src_dir): 146 | -------------------------------------------------------------------------------- /source_patches/v2.2.0_macos.patch: -------------------------------------------------------------------------------- 1 | diff --git a/configure b/configure 2 | index 66b66ba..e43908e 100755 3 | --- a/configure 4 | +++ b/configure 5 | @@ -4,7 +4,7 @@ set -e 6 | set -o pipefail 7 | 8 | if [ -z "$PYTHON_BIN_PATH" ]; then 9 | - PYTHON_BIN_PATH=$(which python || which python3 || true) 10 | + PYTHON_BIN_PATH=$(which python3 || which python || true) 11 | fi 12 | 13 | # Set all env variables 14 | diff --git a/tensorflow/core/kernels/conv_grad_filter_ops.cc b/tensorflow/core/kernels/conv_grad_filter_ops.cc 15 | index f9bf64f..eb9803c 100644 16 | --- a/tensorflow/core/kernels/conv_grad_filter_ops.cc 17 | +++ b/tensorflow/core/kernels/conv_grad_filter_ops.cc 18 | @@ -839,10 +839,10 @@ void LaunchConv2DBackpropFilterOp::operator()( 19 | << " data_format=" << ToString(data_format) 20 | << " compute_data_format=" << ToString(compute_data_format); 21 | 22 | - constexpr auto kComputeInNHWC = 23 | + auto kComputeInNHWC = 24 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 25 | se::dnn::FilterLayout::kOutputYXInput); 26 | - constexpr auto kComputeInNCHW = 27 | + auto kComputeInNCHW = 28 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 29 | se::dnn::FilterLayout::kOutputInputYX); 30 | 31 | diff --git a/tensorflow/core/kernels/conv_grad_input_ops.cc b/tensorflow/core/kernels/conv_grad_input_ops.cc 32 | index be5d821..dd17d4b 100644 33 | --- a/tensorflow/core/kernels/conv_grad_input_ops.cc 34 | +++ b/tensorflow/core/kernels/conv_grad_input_ops.cc 35 | @@ -997,10 +997,10 @@ void LaunchConv2DBackpropInputOp::operator()( 36 | << " data_format=" << ToString(data_format) 37 | << " compute_data_format=" << ToString(compute_data_format); 38 | 39 | - constexpr auto kComputeInNHWC = 40 | + auto kComputeInNHWC = 41 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 42 | se::dnn::FilterLayout::kOutputYXInput); 43 | - constexpr auto kComputeInNCHW = 44 | + auto kComputeInNCHW = 45 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 46 | se::dnn::FilterLayout::kOutputInputYX); 47 | 48 | diff --git a/tensorflow/core/kernels/conv_ops.cc b/tensorflow/core/kernels/conv_ops.cc 49 | index d265e9d..354c9e9 100644 50 | --- a/tensorflow/core/kernels/conv_ops.cc 51 | +++ b/tensorflow/core/kernels/conv_ops.cc 52 | @@ -863,10 +863,10 @@ void LaunchConv2DOp::operator()( 53 | << "Negative row or col paddings: (" << common_padding_rows << ", " 54 | << common_padding_cols << ")"; 55 | 56 | - constexpr auto kComputeInNHWC = 57 | + auto kComputeInNHWC = 58 | std::make_tuple(se::dnn::DataLayout::kBatchYXDepth, 59 | se::dnn::FilterLayout::kOutputYXInput); 60 | - constexpr auto kComputeInNCHW = 61 | + auto kComputeInNCHW = 62 | std::make_tuple(se::dnn::DataLayout::kBatchDepthYX, 63 | se::dnn::FilterLayout::kOutputInputYX); 64 | 65 | diff --git a/tensorflow/core/kernels/data/experimental/snapshot_util.cc b/tensorflow/core/kernels/data/experimental/snapshot_util.cc 66 | index 391ece3..1df8c82 100644 67 | --- a/tensorflow/core/kernels/data/experimental/snapshot_util.cc 68 | +++ b/tensorflow/core/kernels/data/experimental/snapshot_util.cc 69 | @@ -32,6 +32,12 @@ limitations under the License. 70 | namespace tensorflow { 71 | namespace data { 72 | namespace experimental { 73 | +// Tom Added to solve symbol not found error on macOS 74 | +static constexpr const int64 kSnappyReaderInputBufferSizeBytes = 75 | + 1 << 30; // 1 GiB 76 | + // TODO(b/148804377): Set this in a smarter fashion. 77 | +static constexpr const int64 kSnappyReaderOutputBufferSizeBytes = 78 | + 32 << 20; // 32 MiB 79 | 80 | SnapshotWriter::SnapshotWriter(WritableFile* dest, 81 | const string& compression_type, int version, 82 | diff --git a/tensorflow/core/kernels/data/experimental/snapshot_util.h b/tensorflow/core/kernels/data/experimental/snapshot_util.h 83 | index a2df3cc..43eda32 100644 84 | --- a/tensorflow/core/kernels/data/experimental/snapshot_util.h 85 | +++ b/tensorflow/core/kernels/data/experimental/snapshot_util.h 86 | @@ -70,11 +70,11 @@ class SnapshotReader { 87 | // The reader input buffer size is deliberately large because the input reader 88 | // will throw an error if the compressed block length cannot fit in the input 89 | // buffer. 90 | - static constexpr const int64 kSnappyReaderInputBufferSizeBytes = 91 | - 1 << 30; // 1 GiB 92 | + //static constexpr const int64 kSnappyReaderInputBufferSizeBytes = 93 | + // 1 << 30; // 1 GiB 94 | // TODO(b/148804377): Set this in a smarter fashion. 95 | - static constexpr const int64 kSnappyReaderOutputBufferSizeBytes = 96 | - 32 << 20; // 32 MiB 97 | + //static constexpr const int64 kSnappyReaderOutputBufferSizeBytes = 98 | + // 32 << 20; // 32 MiB 99 | static constexpr const size_t kHeaderSize = sizeof(uint64); 100 | 101 | static constexpr const char* const kClassName = "SnapshotReader"; 102 | diff --git a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 103 | index 3825e29..c75fca7 100644 104 | --- a/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 105 | +++ b/tensorflow/core/kernels/tridiagonal_solve_op_gpu.cu.cc 106 | @@ -40,7 +40,7 @@ static const char kNotInvertibleScalarMsg[] = 107 | "The matrix is not invertible: it is a scalar with value zero."; 108 | 109 | template 110 | -__global__ void SolveForSizeOneOrTwoKernel(const int m, 111 | +__device__ void SolveForSizeOneOrTwoKernel(const int m, 112 | const Scalar* __restrict__ diags, 113 | const Scalar* __restrict__ rhs, 114 | const int num_rhs, 115 | diff --git a/tensorflow/core/platform/tstring.h b/tensorflow/core/platform/tstring.h 116 | index 3fe1be2..515dbf7 100644 117 | --- a/tensorflow/core/platform/tstring.h 118 | +++ b/tensorflow/core/platform/tstring.h 119 | @@ -15,7 +15,7 @@ limitations under the License. 120 | 121 | #ifndef TENSORFLOW_CORE_PLATFORM_TSTRING_H_ 122 | #define TENSORFLOW_CORE_PLATFORM_TSTRING_H_ 123 | - 124 | +#include // Tom added 125 | #include 126 | 127 | #include 128 | @@ -225,7 +225,7 @@ class tstring { 129 | friend bool operator==(const std::string& a, const tstring& b); 130 | friend tstring operator+(const tstring& a, const tstring& b); 131 | friend std::ostream& operator<<(std::ostream& o, const tstring& str); 132 | - friend std::hash; 133 | + //friend struct std::hash; //Tom modified 134 | }; 135 | 136 | // Non-member function overloads 137 | diff --git a/tensorflow/core/util/gpu_device_functions.h b/tensorflow/core/util/gpu_device_functions.h 138 | index 7c54294..e648517 100644 139 | --- a/tensorflow/core/util/gpu_device_functions.h 140 | +++ b/tensorflow/core/util/gpu_device_functions.h 141 | @@ -140,11 +140,11 @@ __device__ const unsigned kGpuWarpAll = 0xffffffff; 142 | __device__ inline unsigned GpuLaneId() { 143 | unsigned int lane_id; 144 | #if GOOGLE_CUDA 145 | -#if __clang__ 146 | - return __nvvm_read_ptx_sreg_laneid(); 147 | -#else // __clang__ 148 | +//#if __clang__ 149 | +// return __nvvm_read_ptx_sreg_laneid(); 150 | +//#else // __clang__ 151 | asm("mov.u32 %0, %%laneid;" : "=r"(lane_id)); 152 | -#endif // __clang__ 153 | +//#endif // __clang__ 154 | #elif TENSORFLOW_USE_ROCM 155 | lane_id = __lane_id(); 156 | #endif 157 | diff --git a/tensorflow/core/util/gpu_kernel_helper.h b/tensorflow/core/util/gpu_kernel_helper.h 158 | index 51fd2a8..cff59b6 100644 159 | --- a/tensorflow/core/util/gpu_kernel_helper.h 160 | +++ b/tensorflow/core/util/gpu_kernel_helper.h 161 | @@ -57,7 +57,8 @@ using gpuError_t = hipError_t; 162 | #if GOOGLE_CUDA 163 | 164 | #define GPU_DYNAMIC_SHARED_MEM_DECL(ALIGN, TYPE, NAME) \ 165 | - extern __shared__ __align__(ALIGN) TYPE NAME[] 166 | +extern __shared__ TYPE NAME[] 167 | +// extern __shared__ __align__(ALIGN) TYPE NAME[] 168 | 169 | #elif TENSORFLOW_USE_ROCM 170 | 171 | diff --git a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 172 | index 44bb359..6bb31fe 100644 173 | --- a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 174 | +++ b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc 175 | @@ -195,7 +195,7 @@ static string GetBinaryDir(bool strip_exe) { 176 | _NSGetExecutablePath(nullptr, &buffer_size); 177 | char unresolved_path[buffer_size]; 178 | _NSGetExecutablePath(unresolved_path, &buffer_size); 179 | - CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1); 180 | + //CHECK_ERR(realpath(unresolved_path, exe_path) ? 1 : -1); 181 | #else 182 | #if defined(PLATFORM_WINDOWS) 183 | HMODULE hModule = GetModuleHandle(NULL); 184 | diff --git a/third_party/gpus/cuda_configure.bzl b/third_party/gpus/cuda_configure.bzl 185 | index bdaaa4a..544d1c9 100644 186 | --- a/third_party/gpus/cuda_configure.bzl 187 | +++ b/third_party/gpus/cuda_configure.bzl 188 | @@ -462,7 +462,7 @@ def _check_cuda_lib_params(lib, cpu_value, basedir, version, static = False): 189 | _should_check_soname(version, static), 190 | ) 191 | 192 | -def _check_cuda_libs(repository_ctx, script_path, libs): 193 | +def _check_cuda_libs_failed(repository_ctx, script_path, libs): 194 | python_bin = get_python_bin(repository_ctx) 195 | contents = repository_ctx.read(script_path).splitlines() 196 | 197 | @@ -476,6 +476,7 @@ def _check_cuda_libs(repository_ctx, script_path, libs): 198 | cmd += "system('%s script.py %s');" % (python_bin, args) 199 | 200 | all_paths = [path for path, _ in libs] 201 | + print('cmd %s' % cmd) 202 | checked_paths = execute(repository_ctx, [python_bin, "-c", cmd]).stdout.splitlines() 203 | 204 | # Filter out empty lines from splitting on '\r\n' on Windows 205 | @@ -483,6 +484,33 @@ def _check_cuda_libs(repository_ctx, script_path, libs): 206 | if all_paths != checked_paths: 207 | auto_configure_fail("Error with installed CUDA libs. Expected '%s'. Actual '%s'." % (all_paths, checked_paths)) 208 | 209 | +def _check_cuda_libs(repository_ctx, script_path, paths, check_soname = True): 210 | + """ 211 | + Finds a library among a list of potential paths. 212 | + Args: 213 | + paths: List of paths to inspect. 214 | + Returns: 215 | + Returns the first path in paths that exist. 216 | + """ 217 | + objdump = repository_ctx.which("objdump") 218 | + mismatches = [] 219 | + for path in paths: 220 | + path = path[0] 221 | + print('mypath', path) 222 | + #if not path.exists: 223 | + # continue 224 | + output = repository_ctx.execute([objdump, "-p", str(path)]).stdout 225 | + output = [line for line in output.splitlines() if "name @rpath/" in line] 226 | + sonames = [line.strip().split("/")[-1] for line in output] 227 | + sonames = [sonames[0].strip().split(" ")[0] for line in output] 228 | + return path 229 | + 230 | + if mismatches: 231 | + auto_configure_fail( 232 | + "None of the libraries match their SONAME: " + ", ".join(mismatches), 233 | + ) 234 | + auto_configure_fail("No library found under: " + ", ".join(paths)) 235 | + 236 | def _find_libs(repository_ctx, check_cuda_libs_script, cuda_config): 237 | """Returns the CUDA and cuDNN libraries on the system. 238 | 239 | @@ -498,7 +526,7 @@ def _find_libs(repository_ctx, check_cuda_libs_script, cuda_config): 240 | Map of library names to structs of filename and path. 241 | """ 242 | cpu_value = cuda_config.cpu_value 243 | - stub_dir = "" if is_windows(repository_ctx) else "/stubs" 244 | + stub_dir = "" if is_windows(repository_ctx) else "" 245 | 246 | check_cuda_libs_params = { 247 | "cuda": _check_cuda_lib_params( 248 | @@ -826,7 +854,7 @@ def make_copy_dir_rule(repository_ctx, name, src_dir, out_dir): 249 | outs = [ 250 | %s 251 | ], 252 | - cmd = \"""cp -rLf "%s/." "%s/" \""", 253 | + cmd = \"""cp -r -f "%s/." "%s/" \""", 254 | )""" % (name, "\n".join(outs), src_dir, out_dir) 255 | 256 | def _flag_enabled(repository_ctx, flag_name): 257 | -------------------------------------------------------------------------------- /usr_local_lib/libgcc_s.1.dylib: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TomHeaven/tensorflow-osx-build/aad96d435c6b91eaeeb841a4a9e070708872f40d/usr_local_lib/libgcc_s.1.dylib -------------------------------------------------------------------------------- /usr_local_lib/libgomp.1.dylib: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TomHeaven/tensorflow-osx-build/aad96d435c6b91eaeeb841a4a9e070708872f40d/usr_local_lib/libgomp.1.dylib --------------------------------------------------------------------------------