├── Dockerfile
├── LICENSE
├── README.md
├── dsub_beast.sh
├── install-apt_packages.sh
├── install-beagle.sh
├── install-beast.sh
└── run_beast.sh

/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM nvidia/cuda:11.0.3-devel-ubuntu20.04
2 |
3 | # CUDA version must be compatible with driver version of host:
4 | #   via: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
5 | #
6 | # Note: The container-optimized OS (COS) images used to host dsub docker containers
7 | #       have NVIDIA driver versions that lag current versions. The nvidia/cuda base image
8 | #       above should use a CUDA version compatible with the driver in the
9 | #       current COS image.
10 | # see: https://cloud.google.com/container-optimized-os/docs/how-to/run-gpus#install
11 | #      https://cloud.google.com/container-optimized-os/docs/release-notes
12 | #
13 | # CUDA Toolkit             Linux x86_64 Driver Version
14 | # -------------------------------------------------
15 | # CUDA 11.5.1              >= 495.29.05
16 | # CUDA 11.4.3              >= 470.82.01
17 | # CUDA 11.4.2              >= 470.57.02
18 | # CUDA 11.4.1              >= 470.57.02
19 | # CUDA 11.4.0              >= 470.42.01
20 | # CUDA 11.3.1              >= 465.19.01
21 | # CUDA 11.3.0              >= 465.19.01
22 | # CUDA 11.2.2              >= 460.32.03
23 | # CUDA 11.2.1              >= 460.32.03
24 | # CUDA 11.2.0              >= 460.27.03
25 | # CUDA 11.1.1              >= 455.32
26 | # CUDA 11.1                >= 455.23
27 | # CUDA 11.0.3              >= 450.51.06
28 | # CUDA 11.0.2              >= 450.51.05
29 | # CUDA 10.2                >= 440.33
30 | # CUDA 10.1                >= 418.39
31 | # CUDA 10.0 (10.0.130)     >= 410.48
32 | # CUDA 9.2 (9.2.88)        >= 396.26
33 | # CUDA 9.1 (9.1.85)        >= 390.46
34 | # CUDA 9.0 (9.0.76)        >= 384.81
35 | # CUDA 8.0 (8.0.61 GA2)    >= 375.26
36 | # CUDA 8.0 (8.0.44)        >= 367.48
37 | # CUDA 7.5 (7.5.16)        >= 352.31
38 | # CUDA 7.0 (7.0.28)        >= 346.46
39 |
40 | LABEL maintainer "Daniel Park "
41 | LABEL maintainer_other "Christopher Tomkins-Tinch "
42 |
43 | COPY install-*.sh /opt/docker/
44 |
45 | # System packages, Google Cloud SDK, and locale
46 | # ca-certificates and wget needed for gosu
47 | # bzip2, liblz4-tool, and pigz are useful for packaging and archival
48 | # google-cloud-sdk needed when using this in GCE
49 | RUN /opt/docker/install-apt_packages.sh
50 |
51 | # Set default locale to en_US.UTF-8
52 | ENV LANG="en_US.UTF-8" LANGUAGE="en_US:en" LC_ALL="en_US.UTF-8"
53 | ENV LD_LIBRARY_PATH /usr/local/lib:${LD_LIBRARY_PATH}
54 | ENV PKG_CONFIG_PATH /usr/local/lib/pkgconfig:$PKG_CONFIG_PATH
55 | ENV LIBRARY_PATH /usr/local/cuda/lib64/stubs:${LIBRARY_PATH}
56 |
57 | RUN /opt/docker/install-beagle.sh
58 |
59 | RUN /opt/docker/install-beast.sh
60 |
61 | ENV BEAST="/usr/local"
62 |
63 | CMD ["/bin/bash"]
64 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2018 Broad Institute
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 | # beast-beagle-cuda
3 | GPU-accelerated Docker images containing [BEAST](http://beast.community/about) and [BEAGLE](http://beast.community/beagle), compiled with [NVIDIA CUDA](https://developer.nvidia.com/cuda-zone) support
4 |
5 | **IMPORTANT:** The XML file provided to BEAST will need to be generated by a version of [BEAUTi](http://beast.community/beauti) compatible with the version of BEAST distributed within the Docker image (currently v1.10.5pre_thorney_v0.1.2)
6 |
7 | ## Instructions
8 | * (Follow the `dsub` instructions to [Get Started on Google Cloud](https://github.com/DataBiosphere/dsub#getting-started-on-google-cloud))
9 | * Install [dsub](https://github.com/DataBiosphere/dsub)
10 | * [Enable the required GCP APIs](https://console.cloud.google.com/flows/enableapi?apiid=lifesciences.googleapis.com,storage_component,compute_component&redirect=https://console.cloud.google.com)
11 | * Create an input XML for BEAST (via [BEAUTi](http://beast.community/beauti))
12 | * Transfer the XML file to a GS bucket
13 | * Call `dsub_beast.sh`:
14 | ```
15 | Usage:
16 |   dsub_beast.sh gs://path/to/in.xml gcp-project-name num_gpus [beagle_order]
17 |
18 |   Note: The version of BEAST used should match the version of BEAUTi
19 |         used to generate the input xml file.
20 |
21 |   Docker images have been built for several versions of BEAST.
22 |   The Docker image to be used can be selected by the BEAST_VERSION environment variable.
23 |   For example:
24 |     BEAST_VERSION='1.10.5pre_thorney_v0.1.2' dsub_beast.sh gs://path/to/in.xml gcp-project-name num_gpus [beagle_order]
25 |   For available versions of BEAST, see the tags on Quay.io:
26 |     https://quay.io/repository/broadinstitute/beast-beagle-cuda?tab=tags
27 |   If BEAST_VERSION is not specified the 'latest' tag will be used.
28 |
29 |   The GPU type can be set via the BEAST_GPU_MODEL environment variable.
30 |   For example:
31 |     BEAST_GPU_MODEL='nvidia-tesla-v100' dsub_beast.sh gs://path/to/in.xml gcp-project-name num_gpus [beagle_order]
32 |   For available GPU models, see:
33 |     https://cloud.google.com/compute/docs/gpus/
34 |
35 |   If 'beagle_order' is not specified, the number of partitions will be read from
36 |   the input xml file and spread across the number of GPUs specified.
37 |   Note: *the entire xml file will be downloaded from its bucket if 'beagle_order' is not specified*
38 |
39 |   Extra arguments for BEAST may be passed via the BEAST_EXTRA_ARGS environment variable.
40 |   For example:
41 |     BEAST_EXTRA_ARGS='-beagle_instances 4' dsub_beast.sh gs://path/to/in.xml gcp-project-name num_gpus [beagle_order]
42 | ```
43 |
--------------------------------------------------------------------------------
/dsub_beast.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # This is only a demonstration of how to run beast via dsub.
4 | # The accelerator (GPU) type and count should be set according to
5 | # the size of the data and number of partitions.
6 |
7 | # --accelerator-count should scale with number of partitions in data
8 | # --nvidia-driver-version must match compatible CUDA version
9 | #
10 |
11 | GPU_TYPE="nvidia-tesla-p4" # see: https://cloud.google.com/compute/docs/gpus/
12 | DOCKER_IMAGE="quay.io/broadinstitute/beast-beagle-cuda"
13 |
14 | # get absolute path for file
15 | function absolute_path() {
16 |     local SOURCE="$1"
17 |     while [ -h "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
18 |         DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
19 |         if [[ "$OSTYPE" == "darwin"* ]]; then
20 |             SOURCE="$(readlink "$SOURCE")"
21 |         else
22 |             SOURCE="$(readlink -f "$SOURCE")"
23 |         fi
24 |         [[ $SOURCE != /* ]] && SOURCE="$DIR/$SOURCE" # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
25 |     done
26 |     echo "$SOURCE"
27 | }
28 | SOURCE="${BASH_SOURCE[0]}"
29 | SCRIPT=$(absolute_path "$SOURCE")
30 | SCRIPT_DIRNAME="$(dirname "$SOURCE")"
31 | SCRIPTPATH="$(cd -P "$(echo $SCRIPT_DIRNAME)" &> /dev/null && pwd)"
32 | SCRIPT="$SCRIPTPATH/$(basename "$SCRIPT")" # absolute path for this script
33 |
34 | function print_usage(){
35 |     echo "Usage: "
36 |     echo " $(basename $0) gs://path/to/in.xml gcp-project-name num_gpus [beagle_order]"
37 |     echo ""
38 |     echo " Note: The version of BEAST used should match the version of BEAUTi"
39 |     echo " used to generate the input xml file."
40 |     echo ""
41 |     echo " Docker images have been built for several versions of BEAST."
42 |     echo " The Docker image to be used can be selected by the BEAST_VERSION environment variable."
43 |     echo " For example:"
44 |     echo " BEAST_VERSION='1.10.5pre_thorney_v0.1.2' $(basename $0) gs://path/to/in.xml gcp-project-name num_gpus [beagle_order]"
45 |     echo " For available versions of BEAST, see the tags on Quay.io:"
46 |     echo " https://quay.io/repository/broadinstitute/beast-beagle-cuda?tab=tags"
47 |     echo " If BEAST_VERSION is not specified the 'latest' tag will be used."
48 |     echo ""
49 |     echo " The GPU type can be set via the BEAST_GPU_MODEL environment variable."
50 |     echo " For example:"
51 |     echo " BEAST_GPU_MODEL='nvidia-tesla-v100' $(basename $0) gs://path/to/in.xml gcp-project-name num_gpus [beagle_order]"
52 |     echo " For available GPU models, see:"
53 |     echo " https://cloud.google.com/compute/docs/gpus/"
54 |     echo ""
55 |     echo " If 'beagle_order' is not specified, the number of partitions will be read from"
56 |     echo " the input xml file and spread across the number of GPUs specified."
57 |     echo " Note: *the entire xml file will be downloaded from its bucket if 'beagle_order' is not specified*"
58 |     echo ""
59 |     echo " Extra arguments for BEAST may be passed via the BEAST_EXTRA_ARGS environment variable."
60 |     echo " For example:"
61 |     echo " BEAST_EXTRA_ARGS='-beagle_instances 4' $(basename $0) gs://path/to/in.xml gcp-project-name num_gpus [beagle_order]"
62 | }
63 |
64 | hash dsub &> /dev/null
65 | if [ $? -ne 0 ]; then
66 |     echo ""
67 |     echo "IMPORTANT: dsub must be installed and available to use this script"
68 |     echo ""
69 |     echo " -> Follow the dsub instructions to Get Started on Google Cloud:"
70 |     echo " https://github.com/DataBiosphere/dsub#getting-started-on-google-cloud"
71 |     echo " -> Install dsub:"
72 |     echo " https://github.com/DataBiosphere/dsub"
73 |     exit 1
74 | fi
75 |
76 | if [ $# -lt 3 ]; then
77 |     print_usage
78 |     exit 1
79 | fi
80 |
81 | # if the user HAS NOT set the BEAST_VERSION environment variable
82 | # use the latest tagged Docker image
83 | if [[ -z "${BEAST_VERSION}" ]]; then
84 |     DOCKER_IMAGE_TAG=":latest"
85 | else
86 |     DOCKER_IMAGE_TAG=":${BEAST_VERSION}"
87 | fi
88 |
89 | # if the user HAS set the BEAST_GPU_MODEL environment variable
90 | if [[ -n "${BEAST_GPU_MODEL}" ]]; then
91 |     GPU_TYPE="${BEAST_GPU_MODEL}"
92 | fi
93 |
94 | # input args for this script
95 | IN_XML="$1"
96 | OUT_BUCKET="$(dirname "$1")"
97 | GCP_PROJECT="$2"
98 | NUM_GPUS="$3"
99 |
100 | # if the user HAS NOT specified a beagle_order
101 | # generate one based on the number of GPUs specified
102 | # and the number of partitions in the input XML file
103 | if [ -z "$4" ]; then
104 |     number_of_partitions=$(gsutil cat "$1" | grep "" | wc -l | awk '{ printf "%d\n", $0 }')
105 |
106 |     if [[ ${NUM_GPUS} -gt ${number_of_partitions} ]]; then # -gt compares numerically; '>' inside [[ ]] compares strings
107 |         echo "More GPUs (${NUM_GPUS}) have been requested than there are partitions (${number_of_partitions})."
108 |         echo "Consider reducing the number of GPUs, or specify the 'beagle_order' yourself."
109 |         echo "Exiting..."
110 |         exit 1
111 |     fi
112 |
113 |     partition_string=""
114 |     if [[ ${NUM_GPUS} -gt 0 ]]; then
115 |         partitions_that_fit="$((${number_of_partitions}/${NUM_GPUS}))"
116 |         extra_partitions="$((${number_of_partitions}%${NUM_GPUS}))"
117 |
118 |         for i in $(seq 1 ${partitions_that_fit}); do
119 |             partition_string="${partition_string}$(echo $(seq 1 ${NUM_GPUS})) "
120 |         done
121 |         if [[ ${extra_partitions} -gt 0 ]]; then
122 |             partition_string="${partition_string} $(echo $(seq 1 ${extra_partitions}))"
123 |         fi
124 |
125 |     else
126 |         # if no GPUs are specified, set all partitions to be on
127 |         # resource 0 (CPU)
128 |         for i in $(seq 1 ${number_of_partitions}); do
129 |             partition_string="${partition_string}0,"
130 |         done
131 |     fi
132 |     partition_string=$(echo "${partition_string}" | sed -E 's/ +/ /g' | sed 's/ /,/g' | sed 's/,$//') # collapse runs of spaces, then convert to a comma-separated list
133 |     BEAGLE_ORDER="${partition_string}"
134 | else
135 |     BEAGLE_ORDER="$4"
136 | fi
137 |
138 | ACCELERATOR_SPEC=""
139 | if [[ ${NUM_GPUS} -gt 0 ]]; then
140 |     ACCELERATOR_SPEC="--accelerator-type ${GPU_TYPE} --accelerator-count ${NUM_GPUS}"
141 | fi
142 |
143 | echo ""
144 | echo "Input file: ${IN_XML}"
145 | echo "OUT_BUCKET: ${OUT_BUCKET}"
146 | echo "NUM_GPUs: ${NUM_GPUS}"
147 | echo "BEAGLE_ORDER: ${BEAGLE_ORDER}"
148 | echo "GPU_TYPE: ${GPU_TYPE}"
149 | echo "DOCKER_IMAGE: ${DOCKER_IMAGE}${DOCKER_IMAGE_TAG}"
150 | echo "BEAST_EXTRA_ARGS: ${BEAST_EXTRA_ARGS}"
151 |
152 | dsub \
153 |     --provider=google-cls-v2 \
154 |     --project "${GCP_PROJECT}" \
155 |     --zone "us*" \
156 |     --image "${DOCKER_IMAGE}${DOCKER_IMAGE_TAG}" \
157 |     --input "INPUT_FILE=${IN_XML}" \
158 |     --output "OUTPUT_FILES=${OUT_BUCKET}/*" \
159 |     --logging "${OUT_BUCKET}" \
160 |     --env BEAGLE_ORDER="${BEAGLE_ORDER}" BEAST_EXTRA_ARGS="${BEAST_EXTRA_ARGS}" \
161 |     --script "${SCRIPTPATH}/run_beast.sh" \
162 |     --boot-disk-size 15 \
163 |     ${ACCELERATOR_SPEC}
164 |     #--wait
165 |
--------------------------------------------------------------------------------
/install-apt_packages.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | set -e -o pipefail
4 |
5 | # Silence some warnings about Readline. For more, see:
6 | # https://github.com/phusion/baseimage-docker/issues/58
7 | export DEBIAN_FRONTEND=noninteractive # exported so the apt-get child processes inherit it
8 | echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
9 |
10 | # Add some basics
11 | apt-get update
12 | #--no-install-recommends
13 | # See here for packages required to build beagle:
14 | # https://github.com/beagle-dev/beagle-lib/wiki/LinuxInstallInstructions
15 |
16 | apt-get install -y -qq \
17 |     lsb-release ca-certificates wget rsync curl \
18 |     less nano vim git locales make \
19 |     dirmngr \
20 |     liblz4-tool pigz bzip2 lbzip2 zstd \
21 |     cmake build-essential autoconf automake libtool git pkg-config \
22 |     ant \
23 |     openjdk-11-jre openjdk-11-jdk
24 | #
25 |
26 | mkdir -p /usr/local/cuda/bin
27 |
28 | ln -s /usr/bin/gcc-9 /usr/local/cuda/bin/gcc
29 | ln -s /usr/bin/g++-9 /usr/local/cuda/bin/g++
30 |
31 | # Auto-detect platform
32 | DEBIAN_PLATFORM="$(lsb_release -c -s)"
33 | echo "Debian platform: $DEBIAN_PLATFORM"
34 |
35 | # Add source for gcloud sdk
36 | echo "deb http://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
37 | # The line below is commented out since there is not a "focal" build of cloud-sdk in the apt repo (20.04);
38 | # remove the line above and uncomment below when available.
39 | # echo "deb http://packages.cloud.google.com/apt cloud-sdk-$DEBIAN_PLATFORM main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
40 |
41 | curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
42 |
43 | # Install gcloud and aws
44 | apt-get update
45 | apt-get install -y -qq --no-install-recommends \
46 |     google-cloud-sdk awscli
47 |
48 | # Upgrade and clean
49 | apt-get upgrade -y
50 | apt-get clean -y
51 |
52 | locale-gen en_US.UTF-8
53 |
--------------------------------------------------------------------------------
/install-beagle.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | set -e -o pipefail
4 |
5 | # report CPU info
6 | lscpu
7 | cat /proc/cpuinfo
8 |
9 | cd /opt/docker
10 |
11 | # beagle 3.1.2, known working with beast 1.10.5pre
12 | #git clone --depth=1 --branch="v3.1.2" https://github.com/beagle-dev/beagle-lib.git
13 | git clone --depth=1 --branch "v4.0.0" https://github.com/beagle-dev/beagle-lib.git
14 | cd beagle-lib
15 |
16 | mkdir build
17 | cd build
18 | #cmake -DBUILD_OPENCL=OFF -DBEAGLE_OPTIMIZE_FOR_NATIVE_ARCH=false -DCMAKE_INSTALL_PREFIX:PATH=/usr/local ..
19 | #cmake -DBUILD_OPENCL=OFF -DBEAGLE_OPTIMIZE_FOR_NATIVE_ARCH=true -DCMAKE_INSTALL_PREFIX:PATH=/usr/local ..
20 | # generic 64-bit cpu
21 | #cmake -DBUILD_OPENCL=OFF -DBEAGLE_OPTIMIZE_FOR_NATIVE_ARCH=false -DCMAKE_CXX_FLAGS="-march=x86-64 -mtune=intel" -DCMAKE_INSTALL_PREFIX:PATH=/usr/local ..
22 | cmake -DBUILD_OPENCL=OFF -DBEAGLE_OPTIMIZE_FOR_NATIVE_ARCH=false -DCMAKE_CXX_FLAGS="-march=haswell -mtune=intel" -DCMAKE_INSTALL_PREFIX:PATH=/usr/local ..
23 | make install
24 |
25 | ldconfig # LD_LIBRARY_PATH is also set in the Dockerfile to include /usr/local/lib
26 |
27 | examples/synthetictest
--------------------------------------------------------------------------------
/install-beast.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | beast_version="v10.5.0-beta3"
4 | beast_name="BEAST_X_${beast_version}"
5 |
6 | wget --quiet https://github.com/beast-dev/beast-mcmc/releases/download/${beast_version}/${beast_name}.tgz -O ${beast_name}.tgz
7 |
8 | tar -xzpf ${beast_name}.tgz && mv BEAST*/ ${beast_name}
9 | rm ${beast_name}.tgz
10 |
11 | mv ${beast_name}/bin/* /usr/local/bin
12 | mv ${beast_name}/lib/* /usr/local/lib
13 |
14 | beast -beagle_info
15 |
--------------------------------------------------------------------------------
/run_beast.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # This script is intended to invoke beast within a Docker container
4 | # running on the Google Cloud Platform via the Pipelines API / dsub
5 |
6 | IN_DIR=$(dirname "${INPUT_FILE}")
7 | OUT_DIR=$(dirname "${OUTPUT_FILES}")
8 | OUTPUT_PREFIX=$(basename "${INPUT_FILE}" .xml)
9 | if [ -z "${BEAGLE_ORDER}" ]; then
10 |     BEAGLE_ORDER=1 # run on first GPU only if BEAGLE_ORDER is not set
11 | fi
12 |
13 | if [ -z "${INPUT_FILE}" ]; then
14 |     echo "Usage: $(basename $0) [beagle_order]"
15 |     echo ' The input xml must be passed via INPUT_FILE=/path/to/beauti_generated_input.xml'
16 |     exit 1
17 | fi
18 |
19 | pwd
20 | cd "$OUT_DIR"
21 | # report beagle info including number of GPUs
22 | beast -beagle_info > "${OUTPUT_PREFIX}.out"
23 | # report CPU info
24 | lscpu | tee -a "${OUTPUT_PREFIX}.out"
25 | pwd
26 | beast -beagle_multipartition off -beagle_GPU -beagle_cuda -beagle_double -beagle_scaling always -beagle_order ${BEAGLE_ORDER} ${BEAST_EXTRA_ARGS} ${INPUT_FILE} >> "${OUTPUT_PREFIX}.out"
27 | ls
28 |
--------------------------------------------------------------------------------
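When `beagle_order` is omitted, `dsub_beast.sh` spreads the partitions across the requested GPUs in round-robin fashion, and places every partition on resource 0 (the CPU) when `num_gpus` is 0. The following is a minimal sketch of that assignment as a standalone function. The name `make_beagle_order` and its two positional parameters are hypothetical and not part of the repository; the sketch assumes the partition count is already known rather than read from the XML file.

```shell
#!/bin/bash

# Hypothetical helper (not part of the repository): print a comma-separated
# BEAGLE resource order for a given number of partitions and GPUs.
# Partitions cycle over resources 1..num_gpus; with zero GPUs every
# partition is placed on resource 0 (the CPU), as in dsub_beast.sh.
make_beagle_order() {
    local num_partitions="$1"
    local num_gpus="$2"
    local order=()
    local i

    for (( i = 0; i < num_partitions; i++ )); do
        if (( num_gpus > 0 )); then
            order+=( $(( i % num_gpus + 1 )) )
        else
            order+=( 0 )
        fi
    done

    # Join the array elements with commas.
    local IFS=,
    echo "${order[*]}"
}

make_beagle_order 5 2   # prints 1,2,1,2,1
make_beagle_order 3 0   # prints 0,0,0
```

A string produced this way could be passed as the optional fourth argument of `dsub_beast.sh`, which avoids downloading the XML file from its bucket just to count partitions.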