├── LICENSE ├── README.md ├── bin ├── generate-graphs.sh ├── sbt └── watch-and-email.sh ├── build.sbt ├── conf └── cluster-sim-env.sh.template ├── project └── build.properties ├── src ├── main │ ├── java │ │ └── ClusterSchedulingSimulation │ │ │ └── ClusterSimulationProtos.java │ ├── protocolbuffers │ │ ├── cluster_simulation_protos.proto │ │ └── compile_protobufs.sh │ ├── python │ │ ├── cluster_simulation_protos_pb2.py │ │ ├── generate-txt-from-protobuff.py │ │ ├── generate-txt-from-protobuff.sh │ │ ├── generate-txt-from-protobuffs-in-dir.sh │ │ └── graphing-scripts │ │ │ ├── README │ │ │ ├── comparison-plot-from-protobuff.py │ │ │ ├── comparison-plot-from-protobuff.sh │ │ │ ├── generate-plots-from-protobuff.py │ │ │ └── utils.py │ └── scala │ │ ├── CoreClusterSimulation.scala │ │ ├── ExperimentRunner.scala │ │ ├── MesosSimulation.scala │ │ ├── MonolithicSimulation.scala │ │ ├── OmegaSimulation.scala │ │ ├── ParseParm.scala │ │ ├── Simulation.scala │ │ ├── Util.scala │ │ └── Workloads.scala └── test │ └── scala │ └── TestSimulations.scala └── traces ├── README.txt ├── example-init-cluster-state.log └── job-distribution-traces ├── README.txt ├── example_csizes_cmb.log ├── example_interarrival_cmb.log └── example_runtimes_cmb.log /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2013, Regents of the University of California 2 | All rights reserved. 3 | 4 | Redistribution and use in source and binary forms, with or without 5 | modification, are permitted provided that the following conditions are met: 6 | 7 | Redistributions of source code must retain the above copyright notice, this 8 | list of conditions and the following disclaimer. Redistributions in binary 9 | form must reproduce the above copyright notice, this list of conditions and the 10 | following disclaimer in the documentation and/or other materials provided with 11 | the distribution. Neither the name of the University of California, Berkeley 12 | nor the names of its contributors may be used to endorse or promote products 13 | derived from this software without specific prior written permission. THIS 14 | SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 15 | EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 16 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 17 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 18 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 19 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 20 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 21 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 22 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 23 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 24 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Cluster scheduler simulator overview 2 | 3 | This simulator can be used to prototype and compare different cluster scheduling strategies and policies. It generates synthetic cluster workloads from empirical parameter distributions (thus generating unique workloads even from a small amount of input data), simulates their scheduling and execution using a discrete event simulator, and finally permits analysis of scheduling performance metrics. 4 | 5 | The simulator was originally written as part of research on the "Omega" shared-state cluster scheduling architecture at Google. A paper on Omega, published at EuroSys 2013, uses of this simulator for the comparative evaluation of Omega and other alternative architectures (referred to as a "lightweight" simulator there) [1]. As such, the simulators design is somewhat geared towards the comparative evaluation needs of this paper, but it does also permit more general experimentation with: 6 | 7 | * scheduling policies and logics (i.e. "what machine should a task be bound to?"), 8 | * resource models (i.e. "how are machines represented for scheduling, and how are they shared between tasks?"), 9 | * shared-cluster scheduling architectures (i.e. "how can multiple independent schedulers be supported for a large shared, multi-framework cluster?"). 10 | 11 | While the simulator will simulate job arrival, scheduler decision making and task placement, it does **not** simulate the actual execution of the tasks or variation in their runtime due to shared resources. 12 | 13 | ## Downloading, building, and running 14 | 15 | The source code for the simulator is available in a Git repository hosted on Google Code. Instructions for downloading can be found at at https://code.google.com/p/cluster-scheduler-simulator/source/checkout. 16 | 17 | The simulator is written in Scala, and requires the Simple Build Tool (`sbt`) to run. A copy of `sbt` is package with the source code, but you will need the following prerequisites in order to run the simulator: 18 | 19 | * a working JVM (`openjdk-6-jre` and `openjdk-6-jdk` packages in mid-2013 Ubuntu packages), 20 | * a working installation of Scala (`scala` Ubuntu package), 21 | * Python 2.6 or above and matplotlib 1.0 or above for generation of graphs (`python-2.7` and `python-matplotlib` Ubuntu packages). 22 | 23 | Once you have ensured that all of these exist, simply type `bin/sbt run` from the project home directory in order to run the simulator: 24 | 25 | $ bin/sbt run 26 | [...] 27 | [info] Compiling 9 Scala sources and 1 Java source to ${WORKING_DIR}/target/scala-2.9.1/classes... 28 | [...] 29 | [info] Running Simulation 30 | 31 | RUNNING CLUSTER SIMULATOR EXPERIMENTS 32 | ------------------------ 33 | [...] 34 | 35 | ### Using command line flags 36 | 37 | The simulator can be passed some command-line arugments via configuration flags, such as `--thread-pool-size NUM_THREADS_INT` and `--random-seed SEED_VAL_INT`. To view all options run: 38 | 39 | $ bin/sbt "run --help" 40 | 41 | Note that when passing command line options to the `sbt run` command you need to include the word `run` and all of the options that follow it within a single set of quotes. `sbt` can also be used via the `sbt` console by simply running `bin/sbt` which will drop you at a prompt. If you are using this `sbt` console option, you do not need to put quotes around the run command and any flags you pass. 42 | 43 | ### Configuration file 44 | If a file `conf/cluster-sim-env.sh` exists, it will be sourced in the shell before the simulator is run. This was added as a way of setting up the JVM (e.g. heap size) for simulator runs. Check out `conf/cluster-sim-env.sh.template` as a starting point; you will need to uncomment and possibly modify the example configuration value set in that template file (and, of course, you will need to create a copy of the file removing the ".template" suffix). 45 | 46 | 47 | ## Configuring experiments 48 | 49 | The simulation is controlled by the experiments configured in the `src/main/scala/Simulation.scala` setup file. Comments in the file explain how to set up different workloads, workload-to-scheduler mappings and simulated cluster and machine sizes. 50 | 51 | Most of the workload setup happens in `src/main/scala/Workloads.scala`, so read through that file and make modifications there to have the simulator read from a trace file of your own (see more below about the type of trace files the simulator uses, and the example files included). 52 | 53 | Workloads in the simulator are generated from *empirical parameter distributions*. These are typically based on cluster *snapshots* (at a point in time) or *traces* (sequences of events over time). We unfortunately cannot provide the full input data used for our experiments with the simulator, but we do provide example input files in the `traces` subdirectory, illustrating the expected data format (further explained in the local README file in `traces`). The following inputs are required: 54 | 55 | * **initial cluster state**: when the simulation starts, the simulated cluster obviously cannot start off empty. Instead, we pre-load it with a set of running jobs (and tasks) at this point in time. These jobs start before the beginning of simulation, and may end during the simulation or after. The example file `traces/example-init-cluster-state.log` shows the input format for the jobs in the initial cluster state, as well as the departure events of those of them which end during the simulation. The resource footprints of tasks generated at simulation runtime will also be sampled from the distribution of resource footprints of tasks in the initial cluster state. 56 | * **job parameters**: the simulator samples three key parameters for each job from empirical distributions (i.e. randomly picks values from a large set): 57 | 1. Job sizes (`traces/job-distribution-traces/example_csizes_cmb.log`): the number of tasks in the generated job. We assume for simplicity that all tasks in a job have the same resource footprint. 58 | 2. Job inter-arrival times (`traces/job-distribution-traces/example_interarrival_cmb.log`): the time in between job arrivals for each workload (in seconds). The value drawn from this distribution indicates how many seconds elapse until another job arrives, i.e. the "gaps" in between jobs. 59 | 3. Job runtimes (`traces/job-distribution-traces/example_runtimes_cmb.log`): total job runtime. For simplicity, we assume that all tasks in a job run for exactly this long (although if a task gets scheduled later, it will also finish later). 60 | 61 | For further details, see `traces/README.txt` and `traces/job-distribution-traces/README.txt`. 62 | 63 | **Please note that the resource amounts specified in the example data files, and the example cluster machines configured in `Simulation.scala` do *not* reflect Google configurations. They are made-up numbers, so please do not quote them or try to interpret them!** 64 | 65 | A possible starting point for generating realistic input data is the public Google cluster trace [2, 3]. It should be straightforward to write scripts that extract the relevant data from the public trace's event logs. Although we do not provide such scripts, it is worth noting that the "cluster C" workload in the EuroSys paper [1] represents the same workload as the public trace. (If you do write scripts for converting the public trace into simulator format, please let us know, and we will happily include them in the simulator code release!) 66 | 67 | ## Experimental results: post-processing 68 | 69 | Experimental results are stored in serialized Protocol Buffers in the `experiment_results` directory at the root of the source tree by default: one subdirectory for each experiment, and with a unique name identifying the experimental setup as well as the start time. The schemas for the `.protobuf` files are stored in `src/main/protocolbuffers`. 70 | 71 | A script for post-processing and graphing experimental results is located in `src/main/python/graphing-scripts`, and `src/main/python` also contains scripts for converting the protobuf-encoded results into ASCII CSV files. See the README file in the `graphing-scripts` directory for detailed explanation. 72 | 73 | ## NOTES 74 | 75 | ### Changing and compiling the protocol buffers 76 | 77 | If you make changes to the protocol buffer file (in `src/main/protocolbuffers`), you will need to recompile them, which will generate updated Java files in `src/main/java`. To do so, you must install the protcol buffer compiler and run `src/main/protocolbuffers/compile_protobufs.sh`, which itself calls `protoc` (which it assumes is on your `PATH`). 78 | 79 | ### Known issues 80 | 81 | - The `schedulePartialJobs` option is used in the current implementation of the `MesosScheduler` class. Partial jobs are always scheduled (even if this flag is set to false). Hence the `mesosSimulatorSingleSchedulerZeroResourceJobsTest` currently fails to pass. 82 | 83 | ## Contributing, Development Status, and Contact Info 84 | 85 | Please use the Google Code [project issue tracker](https://code.google.com/p/cluster-scheduler-simulator/issues/list) for all bug reports, pull requests and patches, although we are unlikely to be able to respond to feature requests. You can also send any kind of feedback to the developers, [Andy Konwinski](http://andykonwinski.com/) and [Malte Schwarzkopf](http://www.cl.cam.ac.uk/~ms705/). 86 | 87 | ## References 88 | 89 | [1] Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek and John Wilkes. **[Omega: flexible, scalable schedulers for large compute clusters](http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf)**. In *Proceedings of the 8th European Conference on Computer Systems (EuroSys 2013)*. 90 | 91 | [2] Charles Reiss, Alexey Tumanov, Gregory Ganger, Randy Katz and Michael Kotzuch. **[Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis](http://www.pdl.cmu.edu/PDL-FTP/CloudComputing/googletrace-socc2012.pdf)**. In *Proceedings of the 3rd ACM Symposium on Cloud Computing (SoCC 2012)*. 92 | 93 | [3] Google public cluster workload traces. [https://code.google.com/p/googleclusterdata/](https://code.google.com/p/googleclusterdata/). 94 | -------------------------------------------------------------------------------- /bin/generate-graphs.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Copyright (c) 2013, Regents of the University of California 4 | # All rights reserved. 5 | 6 | # Redistribution and use in source and binary forms, with or without 7 | # modification, are permitted provided that the following conditions are met: 8 | 9 | # Redistributions of source code must retain the above copyright notice, this 10 | # list of conditions and the following disclaimer. Redistributions in binary 11 | # form must reproduce the above copyright notice, this list of conditions and the 12 | # following disclaimer in the documentation and/or other materials provided with 13 | # the distribution. Neither the name of the University of California, Berkeley 14 | # nor the names of its contributors may be used to endorse or promote products 15 | # derived from this software without specific prior written permission. THIS 16 | # SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 17 | # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 18 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 20 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 21 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 22 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 23 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 24 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 | 27 | bin_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" 28 | CLUSTER_SIM_HOME=$bin_dir/.. 29 | echo CLUSTER_SIM_HOME is $CLUSTER_SIM_HOME/src/main/python/graphing-scripts 30 | cd $CLUSTER_SIM_HOME/src/main/python/graphing-scripts 31 | 32 | function usage 33 | { 34 | echo "usage: `basename $0` ABS_PATH_TO_INPUT_DIR [--env-set ENV_SET_1 --env-set ENV_SET_2 --png --paper-mode]" 35 | } 36 | 37 | if [ $# -eq 0 ]; then 38 | echo "please provide the input-directory containing your protocol buffer files (ending in .protobuf)" 39 | usage 40 | exit 41 | fi 42 | input_dir=$1 43 | shift 44 | echo input_dir set as $input_dir 45 | run_time='86400' 46 | # do_png should be "" or "png" 47 | do_png='' 48 | # 0 = normal, 1 = paper 49 | modes=0 50 | # Used to name directories generated in graph directory. 51 | # Only affects which lines are actually plotted when in paper-mode. 52 | # Must be all caps. 53 | env_sets_to_plot='' 54 | 55 | while [ "$1" != "" ]; do 56 | case $1 in 57 | -e | --env_set ) env_sets_to_plot+=$1 58 | ;; 59 | -p | --png ) do_png="png" 60 | ;; 61 | --paper-mode ) modes+=1 62 | ;; 63 | -h | --help ) usage 64 | exit 65 | ;; 66 | * ) usage 67 | exit 1 68 | esac 69 | shift 70 | done 71 | 72 | case $input_dir in 73 | *vary_C*) vary_dimensions+=c;; 74 | *vary_L*) vary_dimensions+=l;; 75 | *vary_Lambda*) vary_dimensions+=lambda;; 76 | *) echo "Protobuf filename must contain \"vary_[C|L|Lambda]\"." 77 | exit 1 78 | esac 79 | 80 | # Use a default env_set if none was specified. 81 | if [ -z "$env_sets_to_plot" ]; then 82 | env_sets_to_plot='C' 83 | fi 84 | 85 | plotting_script='generate-plots-from-protobuff.py' 86 | 87 | # Assumes runtime is the last token of the filename. 88 | run_time=`echo ${input_dir} | grep '[0-9]\+$' --only-matching` 89 | 90 | function graph_experiment() { 91 | if [ -z "$1" ]; then # Is parameter #1 zero length? 92 | echo "graph_experiment requires 1 parameter (the protobuff file name)." 93 | exit 94 | fi 95 | filename=$1 96 | 97 | for mode in $modes; do 98 | echo mode is ${mode} '(0 = non-paper, 1 = paper)' 99 | if [[ ${mode} -eq 1 ]]; then 100 | out_dir="${input_dir}/graphs/paper" 101 | else 102 | out_dir="${input_dir}/graphs" 103 | fi 104 | 105 | # Figure out which simulator type this protobuff came from. 106 | case $filename in 107 | *omega-resource-fit-incremental*) sim=omega-resource-fit-incremental;; 108 | *omega-resource-fit-all-or-nothing*) sim=omega-resource-fit-all-or-nothing;; 109 | *omega-sequence-numbers-incremental*) sim=omega-sequence-numbers-incremental;; 110 | *omega-sequence-numbers-all-or-nothing*) sim=omega-sequence-numbers-all-or-nothing;; 111 | *monolithic*) sim=monolithic;; 112 | *mesos*) sim=mesos;; 113 | *) echo "Unknown simulator type, in ${filename} exiting." 114 | exit 1 115 | esac 116 | 117 | num_service_scheds=`echo ${filename} | \ 118 | grep --only-matching '[0-9]\+_service' | \ 119 | grep --only-matching '[0-9]\+'` 120 | num_batch_scheds=`echo ${filename} | \ 121 | grep --only-matching '[0-9]\+_batch' | \ 122 | grep --only-matching '[0-9]\+'` 123 | echo "Parsed filename for num service (${num_service_scheds}) and" \ 124 | "batch (${num_batch_scheds}) schedulers." 125 | for vd in ${vary_dimensions}; do 126 | echo generating graphs for dimension ${vd}. 127 | 128 | case $filename in 129 | *single_path*) pathness=single_path;; 130 | *multi_path*) pathness=multi_path;; 131 | *) echo "Protobuf filename must contain" \ 132 | "[single|multi]_path." 133 | exit 1 134 | esac 135 | echo Pathness is ${pathness}. 136 | 137 | for envs_to_plot in ${env_sets_to_plot}; do 138 | complete_out_dir=${out_dir}/${sim}/${pathness}/${envs_to_plot}/${num_service_scheds}_service-${num_batch_scheds}_batch 139 | mkdir -p ${complete_out_dir} 140 | echo 'PYTHONPATH=$PYTHONPATH:.. '"python ${plotting_script}" \ 141 | "${complete_out_dir}" \ 142 | "${input_dir}/${filename}" \ 143 | "${mode} ${vd} ${envs_to_plot} ${do_png}" 144 | PYTHONPATH=$PYTHONPATH:.. python ${plotting_script} \ 145 | ${complete_out_dir} \ 146 | ${input_dir}/${filename} \ 147 | ${mode} ${vd} ${envs_to_plot} ${do_png} 148 | echo -e "\n" 149 | done 150 | done 151 | done 152 | } 153 | 154 | PROTO_LIST='' 155 | echo capturing: ls $input_dir|grep protobuf 156 | ls $input_dir|grep protobuf 157 | for curr_filename in `ls $input_dir|grep protobuf`; do 158 | PROTO_LIST+=curr_filename 159 | echo Calling graph_experiment with $curr_filename 160 | graph_experiment $curr_filename 161 | done 162 | 163 | -------------------------------------------------------------------------------- /bin/sbt: -------------------------------------------------------------------------------- 1 | #! /bin/bash 2 | NOFORMAT="false" 3 | if [ "$1" == "NOFORMAT" ]; then 4 | NOFORMAT="true" 5 | shift 6 | fi 7 | export CLUSTER_SIM_HOME=$(cd "$(dirname $0)/.."; pwd) 8 | if [[ -f $CLUSTER_SIM_HOME/conf/cluster-sim-env.sh ]]; then 9 | source $CLUSTER_SIM_HOME/conf/cluster-sim-env.sh #Sets up JAVA_OPTS env variable 10 | fi 11 | 12 | if [[ ! -f sbt/bin/sbt ]]; then 13 | wget "https://github.com/sbt/sbt/releases/download/v0.13.18/sbt-0.13.18.zip" 14 | unzip sbt-0.13.18.zip 15 | fi 16 | java -Dsbt.log.noformat=$NOFORMAT $JAVA_OPTS -XX:+UseParallelGC -jar `dirname $0`/../sbt/bin/sbt-launch.jar "$@" 17 | -------------------------------------------------------------------------------- /bin/watch-and-email.sh: -------------------------------------------------------------------------------- 1 | #! /bin/bash 2 | 3 | # Copyright (c) 2013, Regents of the University of California 4 | # All rights reserved. 5 | 6 | # Redistribution and use in source and binary forms, with or without 7 | # modification, are permitted provided that the following conditions are met: 8 | 9 | # Redistributions of source code must retain the above copyright notice, this 10 | # list of conditions and the following disclaimer. Redistributions in binary 11 | # form must reproduce the above copyright notice, this list of conditions and the 12 | # following disclaimer in the documentation and/or other materials provided with 13 | # the distribution. Neither the name of the University of California, Berkeley 14 | # nor the names of its contributors may be used to endorse or promote products 15 | # derived from this software without specific prior written permission. THIS 16 | # SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 17 | # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 18 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 20 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 21 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 22 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 23 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 24 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 | 27 | PROCESS_ID=$@ 28 | 29 | echo monitoring process $PROCESS_ID 30 | 31 | IS_FINISHED=0 32 | 33 | function test_is_finished() { 34 | ps -p $PROCESS_ID 35 | IS_FINISHED=$? 36 | } 37 | 38 | test_is_finished 39 | 40 | while [[ $IS_FINISHED -eq 0 ]]; do 41 | echo Process $PROCESS_ID still running, waiting to send email. 42 | sleep 10 43 | test_is_finished 44 | done 45 | 46 | echo Process $PROCESS_ID done running, sending email. 47 | 48 | echo "Cluster Simulation experiments with pid $PROCESS_ID just finished running on `hostname`!" | mail -s "Cluster Simulation pid $PROCESS_ID finished running!" andykonwinski@gmail.com 49 | -------------------------------------------------------------------------------- /build.sbt: -------------------------------------------------------------------------------- 1 | // Copyright (c) 2013, Regents of the University of California 2 | // All rights reserved. 3 | // 4 | // Redistribution and use in source and binary forms, with or without 5 | // modification, are permitted provided that the following conditions are met: 6 | // 7 | // Redistributions of source code must retain the above copyright notice, this 8 | // list of conditions and the following disclaimer. Redistributions in binary 9 | // form must reproduce the above copyright notice, this list of conditions and the 10 | // following disclaimer in the documentation and/or other materials provided with 11 | // the distribution. Neither the name of the University of California, Berkeley 12 | // nor the names of its contributors may be used to endorse or promote products 13 | // derived from this software without specific prior written permission. THIS 14 | // SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 15 | // EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 16 | // WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 17 | // DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 18 | // FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 19 | // DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 20 | // SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 21 | // CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 22 | // OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 23 | // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 24 | 25 | name := "Omega Simulator" 26 | 27 | version := "0.1" 28 | 29 | scalaVersion := "2.10.4" 30 | 31 | organization := "edu.berkeley.cs" 32 | 33 | mainClass := Some("Simulation") 34 | 35 | scalacOptions += "-deprecation" 36 | 37 | // Add a dependency on commons-math for poisson random number generator 38 | libraryDependencies += "org.apache.commons" % "commons-math" % "2.2" 39 | 40 | libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.5" % "test" 41 | 42 | libraryDependencies += "com.google.protobuf" % "protobuf-java" % "2.6.1" 43 | 44 | -------------------------------------------------------------------------------- /conf/cluster-sim-env.sh.template: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | # Copyright (c) 2013, Regents of the University of California 4 | # All rights reserved. 5 | 6 | # Redistribution and use in source and binary forms, with or without 7 | # modification, are permitted provided that the following conditions are met: 8 | 9 | # Redistributions of source code must retain the above copyright notice, this 10 | # list of conditions and the following disclaimer. Redistributions in binary 11 | # form must reproduce the above copyright notice, this list of conditions and the 12 | # following disclaimer in the documentation and/or other materials provided with 13 | # the distribution. Neither the name of the University of California, Berkeley 14 | # nor the names of its contributors may be used to endorse or promote products 15 | # derived from this software without specific prior written permission. THIS 16 | # SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 17 | # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 18 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 20 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 21 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 22 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 23 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 24 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 | 27 | # Copy this file to cluster-sim-env.sh and add configuration settings here. 28 | 29 | # Export environment variables for your site in this file. Some useful 30 | # variables to set are: 31 | 32 | # JAVA_OPTS, used in the Java command line in bin/sbt. For example, to set 33 | # the heap size, make this -Xmx500m for a 500meg heap or 34 | # -Xmx1G for 1gig heap (this example is commented out below). 35 | # export JAVA_OPTS="-Xmx1Gm" 36 | -------------------------------------------------------------------------------- /project/build.properties: -------------------------------------------------------------------------------- 1 | sbt.version=0.13.7 2 | -------------------------------------------------------------------------------- /src/main/protocolbuffers/cluster_simulation_protos.proto: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (c) 2013, Regents of the University of California 3 | * All rights reserved. 4 | * 5 | * Redistribution and use in source and binary forms, with or without 6 | * modification, are permitted provided that the following conditions are met: 7 | * 8 | * Redistributions of source code must retain the above copyright notice, this 9 | * list of conditions and the following disclaimer. Redistributions in binary 10 | * form must reproduce the above copyright notice, this list of conditions and the 11 | * following disclaimer in the documentation and/or other materials provided with 12 | * the distribution. Neither the name of the University of California, Berkeley 13 | * nor the names of its contributors may be used to endorse or promote products 14 | * derived from this software without specific prior written permission. THIS 15 | * SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 16 | * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 17 | * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 18 | * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 19 | * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 20 | * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 21 | * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 22 | * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 23 | * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 24 | * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 25 | */ 26 | 27 | package ClusterSchedulingSimulation; 28 | 29 | message ExperimentResultSet { 30 | repeated ExperimentEnv experiment_env = 1; 31 | 32 | message ExperimentEnv { 33 | optional string cell_name = 1; 34 | optional string workload_split_type = 2; 35 | optional bool is_prefilled = 5 [default = false]; 36 | optional double run_time = 3; 37 | repeated ExperimentResult experiment_result = 4; 38 | // Next field number: 6 39 | 40 | // There is a 1-1 relationship between an ExperimentResult and a WorkloadDesc. 41 | message ExperimentResult { 42 | // Track avg resource utilization attributable to tasks actually running. 43 | optional double cell_state_avg_cpu_utilization = 4; 44 | optional double cell_state_avg_mem_utilization = 5; 45 | // Track avg resource utilization attributable to pessimistic locking 46 | // while schedulers make their scheduling decisions. 47 | optional double cell_state_avg_cpu_locked = 13; 48 | optional double cell_state_avg_mem_locked = 14; 49 | // Track per-workload level stats for this experiment. 50 | repeated WorkloadStats workload_stats = 6; 51 | // Workload specific experiment parameters. 52 | optional string sweep_workload = 8; 53 | optional double avg_job_interarrival_time = 9; 54 | // Track per-scheduler level stats for this experiment. 55 | repeated SchedulerStats scheduler_stats = 7; 56 | // Scheduler specific experiment parameters. 57 | repeated SchedulerWorkload sweep_scheduler_workload = 10; 58 | optional double constant_think_time = 11; 59 | optional double per_task_think_time = 12; 60 | // Next field number: 15 61 | 62 | // Workload-level stats. 63 | message WorkloadStats { 64 | optional string workload_name = 1; 65 | optional int64 num_jobs = 2; 66 | optional int64 num_jobs_scheduled = 3; 67 | optional double job_think_times_90_percentile = 4; 68 | optional double avg_job_queue_times_till_first_scheduled = 5; 69 | optional double avg_job_queue_times_till_fully_scheduled = 6; 70 | optional double job_queue_time_till_first_scheduled_90_percentile = 7; 71 | optional double job_queue_time_till_fully_scheduled_90_percentile = 8; 72 | optional double num_scheduling_attempts_90_percentile = 9; 73 | optional double num_scheduling_attempts_99_percentile = 10; 74 | optional double num_task_scheduling_attempts_90_percentile = 11; 75 | optional double num_task_scheduling_attempts_99_percentile = 12; 76 | } 77 | 78 | message SchedulerStats { 79 | optional string scheduler_name = 1; 80 | optional double useful_busy_time = 3; 81 | optional double wasted_busy_time = 4; 82 | repeated PerDayStats per_day_stats = 15; 83 | repeated PerWorkloadBusyTime per_workload_busy_time = 5; 84 | // These are job level transactions 85 | // TODO(andyk): rename these to include "job" in the name. 86 | optional int64 num_successful_transactions = 6; 87 | optional int64 num_failed_transactions = 7; 88 | optional int64 num_no_resources_found_scheduling_attempts = 13; 89 | optional int64 num_retried_transactions = 11; 90 | optional int64 num_jobs_timed_out_scheduling = 16; 91 | optional int64 num_successful_task_transactions = 9; 92 | optional int64 num_failed_task_transactions = 10; 93 | optional bool is_multi_path = 8; 94 | // Num jobs in schedulers job queue when simulation ended. 95 | optional int64 num_jobs_left_in_queue = 12; 96 | optional int64 failed_find_victim_attempts = 14; 97 | // Next field ID:17 98 | 99 | // Per-day bucketing of important stats to support error bars. 100 | message PerDayStats { 101 | optional int64 day_num = 1; 102 | optional double useful_busy_time = 2; 103 | optional double wasted_busy_time = 3; 104 | optional int64 num_successful_transactions = 4; 105 | optional int64 num_failed_transactions = 5; 106 | } 107 | 108 | 109 | // Track busy time per scheduler, per workload. 110 | message PerWorkloadBusyTime { 111 | optional string workload_name = 1; 112 | optional double useful_busy_time = 2; 113 | optional double wasted_busy_time = 3; 114 | } 115 | } 116 | 117 | // (scheduler, workload) pairs, used to keep track of which 118 | // such pairs the parameter sweep is applied to in an experiment run. 119 | message SchedulerWorkload { 120 | optional string schedulerName = 1; 121 | optional string workloadName = 2; 122 | } 123 | } 124 | } 125 | } 126 | 127 | -------------------------------------------------------------------------------- /src/main/protocolbuffers/compile_protobufs.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Copyright (c) 2013, Regents of the University of California 4 | # All rights reserved. 5 | 6 | # Redistribution and use in source and binary forms, with or without 7 | # modification, are permitted provided that the following conditions are met: 8 | 9 | # Redistributions of source code must retain the above copyright notice, this 10 | # list of conditions and the following disclaimer. Redistributions in binary 11 | # form must reproduce the above copyright notice, this list of conditions and the 12 | # following disclaimer in the documentation and/or other materials provided with 13 | # the distribution. Neither the name of the University of California, Berkeley 14 | # nor the names of its contributors may be used to endorse or promote products 15 | # derived from this software without specific prior written permission. THIS 16 | # SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 17 | # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 18 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 20 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 21 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 22 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 23 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 24 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 | 27 | curr_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" 28 | cd $curr_dir 29 | 30 | protoc --java_out=../java --python_out=../python ./cluster_simulation_protos.proto 31 | -------------------------------------------------------------------------------- /src/main/python/generate-txt-from-protobuff.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | 3 | # Copyright (c) 2013, Regents of the University of California 4 | # All rights reserved. 5 | 6 | # Redistribution and use in source and binary forms, with or without 7 | # modification, are permitted provided that the following conditions are met: 8 | 9 | # Redistributions of source code must retain the above copyright notice, this 10 | # list of conditions and the following disclaimer. Redistributions in binary 11 | # form must reproduce the above copyright notice, this list of conditions and the 12 | # following disclaimer in the documentation and/or other materials provided with 13 | # the distribution. Neither the name of the University of California, Berkeley 14 | # nor the names of its contributors may be used to endorse or promote products 15 | # derived from this software without specific prior written permission. THIS 16 | # SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 17 | # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 18 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 20 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 21 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 22 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 23 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 24 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 | 27 | # For each unique (cell_name, scheduler, metric) tuple, where metric 28 | # is either busy_time_median or conflict_fraction median, print a 29 | # different text file with rows that contain the following fields: 30 | # cell_name 31 | # sched_id 32 | # c 33 | # l 34 | # avg_job_interarrival_time 35 | # median_busy_time (or conflict_fraction) 36 | # err_bar_metric_for_busy_time (or conflict_fraction) 37 | 38 | import sys, os, re 39 | import logging 40 | import numpy as np 41 | from collections import defaultdict 42 | import cluster_simulation_protos_pb2 43 | 44 | logging.basicConfig(level=logging.DEBUG) 45 | 46 | def usage(): 47 | print "usage: generate-txt-from-protobuff.py " 48 | sys.exit(1) 49 | 50 | logging.debug("len(sys.argv): " + str(len(sys.argv))) 51 | 52 | if len(sys.argv) < 2: 53 | logging.error("Not enough arguments provided.") 54 | usage() 55 | 56 | try: 57 | input_protobuff_name = sys.argv[1] 58 | # Start optional args. 59 | if len(sys.argv) == 3: 60 | outfile_name_base = str(sys.argv[2]) 61 | else: 62 | #make the output files the same as the input but add .txt to end 63 | outfile_name_base = input_protobuff_name 64 | 65 | except: 66 | usage() 67 | 68 | logging.info("Input file: %s" % input_protobuff_name) 69 | 70 | def get_mad(median, data): 71 | logging.info("in get_mad, with median %f, data: %s" 72 | % (median, " ".join([str(i) for i in data]))) 73 | devs = [abs(x - median) for x in data] 74 | mad = np.median(devs) 75 | print "returning mad = %f" % mad 76 | return mad 77 | 78 | # Read in the ExperimentResultSet. 79 | experiment_result_set = cluster_simulation_protos_pb2.ExperimentResultSet() 80 | infile = open(input_protobuff_name, "rb") 81 | experiment_result_set.ParseFromString(infile.read()) 82 | infile.close() 83 | 84 | # This dictionary, indexed by 3tuples[String] of 85 | # (cell_name, scheduler_name, metric_name), holds as values strings 86 | # each holding all of the rows that will be written to to a text file 87 | # uniquely identified by the dictionary key. 88 | # This dictionary will be iterated over after being being filled 89 | # to create text files holding its contents. 90 | output_strings = defaultdict(str) 91 | # Loop through each experiment environment. 92 | logging.debug("Processing %d experiment envs." 93 | % len(experiment_result_set.experiment_env)) 94 | for env in experiment_result_set.experiment_env: 95 | logging.debug("Handling experiment env (%s %s)." 96 | % (env.cell_name, env.workload_split_type)) 97 | logging.debug("Processing %d experiment results." 98 | % len(env.experiment_result)) 99 | prev_l_val = -1.0 100 | for exp_result in env.experiment_result: 101 | logging.debug("Handling experiment result with C = %f and L = %f." 102 | % (exp_result.constant_think_time, 103 | exp_result.per_task_think_time)) 104 | for sched_stat in exp_result.scheduler_stats: 105 | logging.debug("Handling scheduler stat for %s." 106 | % sched_stat.scheduler_name) 107 | # Calculate per day busy time and conflict fractions. 108 | daily_busy_fractions = [] 109 | daily_conflict_fractions = [] 110 | for day_stats in sched_stat.per_day_stats: 111 | # Calculate the total busy time for each of the days and then 112 | # take median of all fo them. 113 | run_time_for_day = env.run_time - 86400 * day_stats.day_num 114 | logging.info("setting run_time_for_day = env.run_time - 86400 * " 115 | "day_stats.day_num = %f - 86400 * %d = %f" 116 | % (env.run_time, day_stats.day_num, run_time_for_day)) 117 | if run_time_for_day > 0.0: 118 | daily_busy_fractions.append(((day_stats.useful_busy_time + 119 | day_stats.wasted_busy_time) / 120 | min(86400.0, run_time_for_day))) 121 | logging.info("%s appending daily_conflict_fraction %f." 122 | % (sched_stat.scheduler_name, daily_busy_fractions[-1])) 123 | 124 | if day_stats.num_successful_transactions > 0: 125 | conflict_fraction = (float(day_stats.num_failed_transactions) / 126 | float(day_stats.num_failed_transactions + 127 | day_stats.num_successful_transactions)) 128 | daily_conflict_fractions.append(conflict_fraction) 129 | logging.info("%s appending daily_conflict_fraction %f." 130 | % (sched_stat.scheduler_name, conflict_fraction)) 131 | else: 132 | daily_conflict_fractions.append(0) 133 | logging.info("appending 0 to daily_conflict_fraction") 134 | 135 | logging.info("Done building daily_busy_fractions: %s" 136 | % " ".join([str(i) for i in daily_busy_fractions])) 137 | logging.info("Also done building daily_conflict_fractions: %s" 138 | % " ".join([str(i) for i in daily_conflict_fractions])) 139 | 140 | if prev_l_val != exp_result.per_task_think_time and prev_l_val != -1.0: 141 | opt_extra_newline = "\n" 142 | else: 143 | opt_extra_newline = "" 144 | prev_l_val = exp_result.per_task_think_time 145 | 146 | # Compute the busy_time row and append it to the string 147 | # accumulating output rows for this schedulerName. 148 | daily_busy_fraction_median = np.median(daily_busy_fractions) 149 | busy_frac_key = (env.cell_name, sched_stat.scheduler_name, "busy_frac") 150 | output_strings[busy_frac_key] += \ 151 | "%s%s %s %s %s %s %s %s\n" % (opt_extra_newline, 152 | env.cell_name, 153 | sched_stat.scheduler_name, 154 | exp_result.constant_think_time, 155 | exp_result.per_task_think_time, 156 | exp_result.avg_job_interarrival_time, 157 | daily_busy_fraction_median, 158 | get_mad(daily_busy_fraction_median, 159 | daily_busy_fractions)) 160 | 161 | conflict_fraction_median = np.median(daily_conflict_fractions) 162 | conf_frac_key = (env.cell_name, sched_stat.scheduler_name, "conf_frac") 163 | output_strings[conf_frac_key] += \ 164 | "%s%s %s %s %s %s %s %s\n" % (opt_extra_newline, 165 | env.cell_name, 166 | sched_stat.scheduler_name, 167 | exp_result.constant_think_time, 168 | exp_result.per_task_think_time, 169 | exp_result.avg_job_interarrival_time, 170 | conflict_fraction_median, 171 | get_mad(conflict_fraction_median, 172 | daily_conflict_fractions)) 173 | 174 | # Create output files. 175 | # One output file for each unique (cell_name, scheduler_name, metric) tuple. 176 | for key_tuple, out_str in output_strings.iteritems(): 177 | outfile_name = (outfile_name_base + 178 | "." + "_".join([str(i) for i in key_tuple]) + ".txt") 179 | logging.info("Creating output file: %s" % outfile_name) 180 | outfile = open(outfile_name, "w") 181 | outfile.write(out_str) 182 | outfile.close() 183 | -------------------------------------------------------------------------------- /src/main/python/generate-txt-from-protobuff.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Copyright (c) 2013, Regents of the University of California 4 | # All rights reserved. 5 | 6 | # Redistribution and use in source and binary forms, with or without 7 | # modification, are permitted provided that the following conditions are met: 8 | 9 | # Redistributions of source code must retain the above copyright notice, this 10 | # list of conditions and the following disclaimer. Redistributions in binary 11 | # form must reproduce the above copyright notice, this list of conditions and the 12 | # following disclaimer in the documentation and/or other materials provided with 13 | # the distribution. Neither the name of the University of California, Berkeley 14 | # nor the names of its contributors may be used to endorse or promote products 15 | # derived from this software without specific prior written permission. THIS 16 | # SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 17 | # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 18 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 20 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 21 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 22 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 23 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 24 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 | 27 | curr_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" 28 | cd $curr_dir 29 | 30 | PYTHONPATH=$PYTHONPATH:.. python ./generate-txt-from-protobuff.py $@ 31 | -------------------------------------------------------------------------------- /src/main/python/generate-txt-from-protobuffs-in-dir.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Copyright (c) 2013, Regents of the University of California 4 | # All rights reserved. 5 | 6 | # Redistribution and use in source and binary forms, with or without 7 | # modification, are permitted provided that the following conditions are met: 8 | 9 | # Redistributions of source code must retain the above copyright notice, this 10 | # list of conditions and the following disclaimer. Redistributions in binary 11 | # form must reproduce the above copyright notice, this list of conditions and the 12 | # following disclaimer in the documentation and/or other materials provided with 13 | # the distribution. Neither the name of the University of California, Berkeley 14 | # nor the names of its contributors may be used to endorse or promote products 15 | # derived from this software without specific prior written permission. THIS 16 | # SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 17 | # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 18 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 20 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 21 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 22 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 23 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 24 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 | 27 | curr_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" 28 | cd $curr_dir 29 | 30 | dirname=$1 31 | echo dirname is $dirname 32 | 33 | if [[ ! -d $dirname ]]; then 34 | echo "Error: accepts a single argument which is dirname containing protobuffs." 35 | exit 36 | fi 37 | 38 | # TODO(andyk): make naming convention for protobuf(f) consistent. 39 | for i in `ls $dirname/*.protobuf`; do 40 | echo PYTHONPATH=$PYTHONPATH:.. python ./generate-txt-from-protobuff.py $i 41 | PYTHONPATH=$PYTHONPATH:.. python ./generate-txt-from-protobuff.py $i 42 | done 43 | -------------------------------------------------------------------------------- /src/main/python/graphing-scripts/README: -------------------------------------------------------------------------------- 1 | Use these scripts to generate graphs of the output of the synthetic simulator. 2 | Before you can do that, you have to run some simulations. 3 | 4 | Once you have run some simulations, use CLUSTER_SIM_HOME/bin/generate-graphs.sh 5 | to generate graphs based on the output those simulations generated, which 6 | resides in HOME/experiment_results. 7 | -------------------------------------------------------------------------------- /src/main/python/graphing-scripts/comparison-plot-from-protobuff.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | 3 | # Copyright (c) 2013, Regents of the University of California 4 | # All rights reserved. 5 | 6 | # Redistribution and use in source and binary forms, with or without 7 | # modification, are permitted provided that the following conditions are met: 8 | 9 | # Redistributions of source code must retain the above copyright notice, this 10 | # list of conditions and the following disclaimer. Redistributions in binary 11 | # form must reproduce the above copyright notice, this list of conditions and the 12 | # following disclaimer in the documentation and/or other materials provided with 13 | # the distribution. Neither the name of the University of California, Berkeley 14 | # nor the names of its contributors may be used to endorse or promote products 15 | # derived from this software without specific prior written permission. THIS 16 | # SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 17 | # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 18 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 20 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 21 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 22 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 23 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 24 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 | 27 | # This file generates a set of graphs for a simulator "experiment". 28 | # An experiment is equivalent to the file generated from the run of a 29 | # single Experiment object in the simulator (i.e. a parameter sweep for a 30 | # set of workload_descs), with the added constraint that only one of 31 | # C, L, or lambda can be varied per a single series (the simulator 32 | # currently allows ranges to be provided for more than one of these). 33 | 34 | import sys, os, re 35 | from utils import * 36 | import numpy as np 37 | import matplotlib.pyplot as plt 38 | import math 39 | import operator 40 | import logging 41 | from collections import defaultdict 42 | 43 | import cluster_simulation_protos_pb2 44 | 45 | logging.basicConfig(level=logging.DEBUG, format="%(message)s") 46 | 47 | def usage(): 48 | print "usage: scheduler-business.py " \ 49 | " [png]" 50 | sys.exit(1) 51 | 52 | # if len(sys.argv) < 6: 53 | # logging.error("Not enough arguments provided.") 54 | # usage() 55 | 56 | paper_mode = True 57 | output_formats = ['pdf'] 58 | # try: 59 | # output_prefix = str(sys.argv[1]) 60 | # input_protobuff = sys.argv[2] 61 | # if int(sys.argv[3]) == 1: 62 | # paper_mode = True 63 | # vary_dim = sys.argv[4] 64 | # if vary_dim not in ['c', 'l', 'lambda']: 65 | # logging.error("vary_dim must be c, l, or lambda!") 66 | # sys.exit(1) 67 | # envs_to_plot = sys.argv[5] 68 | # if re.search("[^ABC]",envs_to_plot): 69 | # logging.error("envs_to_plot must be any combination of a, b, and c, without spaces!") 70 | # sys.exit(1) 71 | # if len(sys.argv) == 7: 72 | # if sys.argv[6] == "png": 73 | # output_formats.append('png') 74 | # else: 75 | # logging.error("The only valid optional 5th argument is 'png'") 76 | # sys.exit(1) 77 | # 78 | # except: 79 | # usage() 80 | # 81 | # set_leg_fontsize(11) 82 | 83 | # logging.info("Output prefix: %s" % output_prefix) 84 | # logging.info("Input file: %s" % input_protobuff) 85 | 86 | # google-omega-resfit-allornoth-single_path-vary_l-604800.protobuf 87 | # google-omega-resfit-inc-single_path-vary_l-604800.protobuf 88 | # google-omega-seqnum-allornoth-single_path-vary_l-604800.protobuf 89 | # google-omega-seqnum-inc-single_path-vary_l-604800.protobuf 90 | 91 | envs_to_plot = "C" 92 | 93 | file_dir = '/Users/andyk/omega-7day-simulator-results/' 94 | output_prefix = file_dir + "/graphs" 95 | 96 | file_names = [("Fine/Gang", "google-omega-resfit-allornoth-single_path-vary_c-604800.protobuf"), 97 | ("Fine/Inc", "google-omega-resfit-inc-single_path-vary_c-604800.protobuf"), 98 | ("Coarse/Gang", "google-omega-seqnum-allornoth-single_path-vary_c-604800.protobuf"), 99 | ("Course/Inc", "google-omega-seqnum-inc-single_path-vary_c-604800.protobuf")] 100 | 101 | experiment_result_sets = [] 102 | for title_name_tuple in file_names: 103 | title = title_name_tuple[0] 104 | file_name = title_name_tuple[1] 105 | full_name = file_dir + file_name 106 | # Read in the ExperimentResultSet. 107 | #experiment_result_sets.append((title, cluster_simulation_protos_pb2.ExperimentResultSet())) 108 | res_set = cluster_simulation_protos_pb2.ExperimentResultSet() 109 | experiment_result_sets.append([title, res_set]) 110 | #titles[experiment_result_sets[-1]] = title 111 | f = open(full_name, "rb") 112 | res_set.ParseFromString(f.read()) 113 | f.close() 114 | 115 | 116 | # --------------------------------------- 117 | # Set up some general graphing variables. 118 | if paper_mode: 119 | set_paper_rcs() 120 | fig = plt.figure(figsize=(2,1.33)) 121 | else: 122 | fig = plt.figure() 123 | 124 | prefilled_colors_web = { 'A': 'b', 'B': 'r', 'C': 'c', "synth": 'y' } 125 | colors_web = { 'A': 'b', 'B': 'r', 'C': 'm', "synth": 'y' } 126 | colors_paper = { 'A': 'b', 'B': 'r', 'C': 'c', "synth": 'b' } 127 | per_wl_colors = { 'OmegaBatch': 'b', 128 | 'OmegaService': 'r' } 129 | 130 | title_colors_web = { "Fine/Gang": 'b', "Fine/Inc": 'r', "Coarse/Gang": 'm', "Course/Inc": 'c' } 131 | 132 | prefilled_linestyles_web = { 'Monolithic': 'D-', 133 | 'MonolithicApprox': 's-', 134 | 'MesosBatch': 'D-', 135 | 'MesosService': 'D:', 136 | 'MesosBatchApprox': 's-', 137 | 'MesosServiceApprox': 's:', 138 | 'OmegaBatch': 'D-', 139 | 'OmegaService': 'D:', 140 | 'OmegaBatchApprox': 's-', 141 | 'OmegaServiceApprox': 's:', 142 | 'Batch': 'D-', 143 | 'Service': 'D:' } 144 | 145 | linestyles_web = { 'Monolithic': 'x-', 146 | 'MonolithicApprox': 'o-', 147 | 'MesosBatch': 'x-', 148 | 'MesosService': 'x:', 149 | 'MesosBatchApprox': 'o-', 150 | 'MesosServiceApprox': 'o:', 151 | 'OmegaBatch': 'x-', 152 | 'OmegaService': 'x:', 153 | 'OmegaBatchApprox': 'o-', 154 | 'OmegaServiceApprox': 'o:', 155 | 'Batch': 'x-', 156 | 'Service': 'x:' } 157 | linestyles_paper = { 'Monolithic': '-', 158 | 'MonolithicApprox': '--', 159 | 'MesosBatch': '-', 160 | 'MesosService': ':', 161 | 'MesosBatchApprox': '--', 162 | 'MesosServiceApprox': '-.', 163 | 'OmegaBatch': '-', 164 | 'OmegaService': ':', 165 | 'OmegaBatchApprox': '--', 166 | 'OmegaServiceApprox': '-.', 167 | 'Batch': '-', 168 | 'Service': ':' } 169 | 170 | dashes_paper = { 'Monolithic': (None,None), 171 | 'MonolithicApprox': (3,3), 172 | 'MesosBatch': (None,None), 173 | 'MesosService': (1,1), 174 | 'MesosBatchApprox': (3,3), 175 | 'MesosServiceApprox': (4,2), 176 | 'OmegaBatch': (None,None), 177 | 'OmegaService': (1,1), 178 | 'OmegaBatchApprox': (3,3), 179 | 'OmegaServiceApprox': (4,2), 180 | 'Batch': (None,None), 181 | 'Service': (1,1), 182 | 'Fine/Gang': (1,1), 183 | 'Fine/Inc': (3,3), 184 | 'Coarse/Gang': (4,2) 185 | } 186 | 187 | # Some dictionaries whose values will be dictionaries 188 | # to make 2d dictionaries, which will be indexed by both exp_env 189 | # and either workoad or scheduler name. 190 | # -- 191 | # (cellName, assignmentPolicy, workload_name) -> array of data points 192 | # for the parameter sweep done in the experiment. 193 | workload_queue_time_till_first = {} 194 | workload_queue_time_till_fully = {} 195 | workload_queue_time_till_first_90_ptile = {} 196 | workload_queue_time_till_fully_90_ptile = {} 197 | workload_num_jobs_unscheduled = {} 198 | # (cellName, assignmentPolicy, scheduler_name) -> array of data points 199 | # for the parameter sweep done in the experiment. 200 | sched_total_busy_fraction = {} 201 | sched_daily_busy_fraction = {} 202 | sched_daily_busy_fraction_err = {} 203 | # TODO(andyk): Graph retry_busy_fraction on same graph as total_busy_fraction 204 | # to parallel Malte's graphs. 205 | # sched_retry_busy_fraction = {} 206 | sched_conflict_fraction = {} 207 | sched_daily_conflict_fraction = {} 208 | sched_daily_conflict_fraction_err = {} 209 | sched_task_conflict_fraction = {} 210 | sched_num_retried_transactions = {} 211 | sched_num_jobs_remaining = {} 212 | sched_failed_find_victim_attempts = {} 213 | 214 | # Convenience wrapper to override __str__() 215 | class ExperimentEnv: 216 | def __init__(self, init_exp_env): 217 | self.exp_env = init_exp_env 218 | self.cell_name = init_exp_env.cell_name 219 | self.workload_split_type = init_exp_env.workload_split_type 220 | self.is_prefilled = init_exp_env.is_prefilled 221 | self.run_time = init_exp_env.run_time 222 | 223 | def __str__(self): 224 | return str("%s, %s" % (self.exp_env.cell_name, self.exp_env.workload_split_type)) 225 | 226 | # Figure out if we are varying c, l, or lambda in this experiment. 227 | def vary_dim(self): 228 | env = self.exp_env # Make a convenient short handle. 229 | assert(len(env.experiment_result) > 1) 230 | if (env.experiment_result[0].constant_think_time != 231 | env.experiment_result[1].constant_think_time): 232 | vary_dim = "c" 233 | # logging.debug("Varying %s. The first two experiments' c values were %d, %d " 234 | # % (vary_dim, 235 | # env.experiment_result[0].constant_think_time, 236 | # env.experiment_result[1].constant_think_time)) 237 | elif (env.experiment_result[0].per_task_think_time != 238 | env.experiment_result[1].per_task_think_time): 239 | vary_dim = "l" 240 | # logging.debug("Varying %s. The first two experiments' l values were %d, %d " 241 | # % (vary_dim, 242 | # env.experiment_result[0].per_task_think_time, 243 | # env.experiment_result[1].per_task_think_time)) 244 | else: 245 | vary_dim = "lambda" 246 | # logging.debug("Varying %s." % vary_dim) 247 | return vary_dim 248 | 249 | class Value: 250 | def __init__(self, init_x, init_y): 251 | self.x = init_x 252 | self.y = init_y 253 | def __str__(self): 254 | return str("%f, %f" % (self.x, self.y)) 255 | 256 | def bt_approx(cell_name, sched_name, point, vary_dim_, tt_c, tt_l, runtime): 257 | logging.debug("sched_name is %s " % sched_name) 258 | assert(sched_name == "Batch" or sched_name == "Service") 259 | lbd = {} 260 | n = {} 261 | # This function calculates an approximated scheduler busyness line given 262 | # an average inter-arrival time and job size for each scheduler 263 | # XXX: configure the below parameters and comment out the following 264 | # line in order to 265 | # 1) disable the warning, and 266 | # 2) get a correct no-conflict approximation. 267 | print >> sys.stderr, "*********************************************\n" \ 268 | "WARNING: YOU HAVE NOT CONFIGURED THE PARAMETERS IN THE bt_approx\n" \ 269 | "*********************************************\n" 270 | ################################ 271 | # XXX EDIT BELOW HERE 272 | # hard-coded SAMPLE params for cluster A 273 | lbd['A'] = { "Batch": 0.1, "Service": 0.01 } # lambdas for 0: serv & 1: Batch 274 | n['A'] = { "Batch": 10.0, "Service": 5.0 } # avg num tasks per job 275 | # hard-coded SAMPLE params for cluster B 276 | lbd['B'] = { "Batch": 0.1, "Service": 0.01 } 277 | n['B'] = { "Batch": 10.0, "Service": 5.0 } 278 | # hard-coded SAMPLE params for cluster C 279 | lbd['C'] = { "Batch": 0.1, "Service": 0.01 } 280 | n['C'] = { "Batch": 10.0, "Service": 5.0 } 281 | ################################ 282 | 283 | # approximation formula 284 | if vary_dim_ == 'c': 285 | # busy_time = num_jobs * (per_job_think_time = C + nL) / runtime 286 | return runtime * lbd[cell_name][sched_name] * \ 287 | ((point + n[cell_name][sched_name] * float(tt_l))) / runtime 288 | elif vary_dim_ == 'l': 289 | return runtime * lbd[cell_name][sched_name] * \ 290 | ((float(tt_c) + n[cell_name][sched_name] * point)) / runtime 291 | 292 | def get_mad(median, data): 293 | #print "in get_mad, with median %f, data: %s" % (median, " ".join([str(i) for i in data])) 294 | devs = [abs(x - median) for x in data] 295 | mad = np.median(devs) 296 | #print "returning mad = %f" % mad 297 | return mad 298 | 299 | def sort_labels(handles, labels): 300 | hl = sorted(zip(handles, labels), 301 | key=operator.itemgetter(1)) 302 | handles2, labels2 = zip(*hl) 303 | return (handles2, labels2) 304 | 305 | for experiment_result_set_arry in experiment_result_sets: 306 | title = experiment_result_set_arry[0] 307 | logging.debug("\n\n==========================\nHandling title %s." % title) 308 | experiment_result_set = experiment_result_set_arry[1] 309 | 310 | # Loop through each experiment environment. 311 | logging.debug("Processing %d experiment envs." 312 | % len(experiment_result_set.experiment_env)) 313 | for env in experiment_result_set.experiment_env: 314 | if not re.search(cell_to_anon(env.cell_name), envs_to_plot): 315 | logging.debug(" skipping env/cell " + env.cell_name) 316 | continue 317 | logging.debug("\n\n\n env: " + env.cell_name) 318 | exp_env = ExperimentEnv(env) # Wrap the protobuff object to get __str__() 319 | logging.debug(" Handling experiment env %s." % exp_env) 320 | 321 | # Within this environment, loop through each experiment result 322 | logging.debug(" Processing %d experiment results." % len(env.experiment_result)) 323 | for exp_result in env.experiment_result: 324 | logging.debug(" Handling experiment with per_task_think_time %f, constant_think_time %f" 325 | % (exp_result.per_task_think_time, exp_result.constant_think_time)) 326 | # Record the correct x val depending on which dimension is being 327 | # swept over in this experiment. 328 | vary_dim = exp_env.vary_dim() # This line is unecessary since this value 329 | # is a flag passed as an arg to the script. 330 | if vary_dim == "c": 331 | x_val = exp_result.constant_think_time 332 | elif vary_dim == "l": 333 | x_val = exp_result.per_task_think_time 334 | else: 335 | x_val = exp_result.avg_job_interarrival_time 336 | # logging.debug("Set x_val to %f." % x_val) 337 | 338 | # Build results dictionaries of per-scheduler stats. 339 | for sched_stat in exp_result.scheduler_stats: 340 | # Per day busy time and conflict fractions. 341 | daily_busy_fractions = [] 342 | daily_conflict_fractions = [] 343 | daily_conflicts = [] # counts the mean of daily abs # of conflicts. 344 | daily_successes = [] 345 | logging.debug(" handling scheduler %s" % sched_stat.scheduler_name) 346 | for day_stats in sched_stat.per_day_stats: 347 | # Calculate the total busy time for each of the days and then 348 | # take median of all fo them. 349 | run_time_for_day = exp_env.run_time - 86400 * day_stats.day_num 350 | # logging.debug("setting run_time_for_day = exp_env.run_time - 86400 * " 351 | # "day_stats.day_num = %f - 86400 * %d = %f" 352 | # % (exp_env.run_time, day_stats.day_num, run_time_for_day)) 353 | if run_time_for_day > 0.0: 354 | daily_busy_fractions.append(((day_stats.useful_busy_time + 355 | day_stats.wasted_busy_time) / 356 | min(86400.0, run_time_for_day))) 357 | 358 | if day_stats.num_successful_transactions > 0: 359 | conflict_fraction = (float(day_stats.num_failed_transactions) / 360 | float(day_stats.num_successful_transactions)) 361 | daily_conflict_fractions.append(conflict_fraction) 362 | daily_conflicts.append(float(day_stats.num_failed_transactions)) 363 | daily_successes.append(float(day_stats.num_successful_transactions)) 364 | # logging.debug("appending daily_conflict_fraction %f / %f = %f." 365 | # % (float(day_stats.num_failed_transactions), 366 | # float(day_stats.num_successful_transactions), 367 | # conflict_fraction)) 368 | else: 369 | daily_conflict_fractions.append(0) 370 | 371 | # Daily busy time median. 372 | daily_busy_time_med = np.median(daily_busy_fractions) 373 | logging.debug(" Daily_busy_fractions, med: %f, vals: %s" 374 | % (daily_busy_time_med, 375 | " ".join([str(i) for i in daily_busy_fractions]))) 376 | value = Value(x_val, daily_busy_time_med) 377 | append_or_create_2d(sched_daily_busy_fraction, 378 | title, 379 | sched_stat.scheduler_name, 380 | value) 381 | #logging.debug("sched_daily_busy_fraction[%s %s].append(%s)." 382 | # % (exp_env, sched_stat.scheduler_name, value)) 383 | # Error Bar (MAD) for daily busy time. 384 | value = Value(x_val, get_mad(daily_busy_time_med, 385 | daily_busy_fractions)) 386 | append_or_create_2d(sched_daily_busy_fraction_err, 387 | title, 388 | sched_stat.scheduler_name, 389 | value) 390 | #logging.debug("sched_daily_busy_fraction_err[%s %s].append(%s)." 391 | # % (exp_env, sched_stat.scheduler_name, value)) 392 | # Daily conflict fraction median. 393 | daily_conflict_fraction_med = np.median(daily_conflict_fractions) 394 | logging.debug(" Daily_abs_num_conflicts, med: %f, vals: %s" 395 | % (np.median(daily_conflicts), 396 | " ".join([str(i) for i in daily_conflicts]))) 397 | logging.debug(" Daily_num_successful_conflicts, med: %f, vals: %s" 398 | % (np.median(daily_successes), 399 | " ".join([str(i) for i in daily_successes]))) 400 | logging.debug(" Daily_conflict_fractions, med : %f, vals: %s\n --" 401 | % (daily_conflict_fraction_med, 402 | " ".join([str(i) for i in daily_conflict_fractions]))) 403 | value = Value(x_val, daily_conflict_fraction_med) 404 | append_or_create_2d(sched_daily_conflict_fraction, 405 | title, 406 | sched_stat.scheduler_name, 407 | value) 408 | # logging.debug("sched_daily_conflict_fraction[%s %s].append(%s)." 409 | # % (exp_env, sched_stat.scheduler_name, value)) 410 | # Error Bar (MAD) for daily conflict fraction. 411 | value = Value(x_val, get_mad(daily_conflict_fraction_med, 412 | daily_conflict_fractions)) 413 | append_or_create_2d(sched_daily_conflict_fraction_err, 414 | title, 415 | sched_stat.scheduler_name, 416 | value) 417 | 418 | 419 | def plot_2d_data_set_dict(data_set_2d_dict, 420 | plot_title, 421 | filename_suffix, 422 | y_label, 423 | y_axis_type, 424 | error_bars_data_set_2d_dict = None): 425 | assert(y_axis_type == "0-to-1" or 426 | y_axis_type == "ms-to-day" or 427 | y_axis_type == "abs") 428 | plt.clf() 429 | ax = fig.add_subplot(111) 430 | for title, name_to_val_map in data_set_2d_dict.iteritems(): 431 | for wl_or_sched_name, values in name_to_val_map.iteritems(): 432 | line_label = title 433 | # Hacky: chop MonolithicBatch, MesosBatch, MonolithicService, etc. 434 | # down to "Batch" and "Service" if in paper mode. 435 | updated_wl_or_sched_name = wl_or_sched_name 436 | if paper_mode and re.search("Batch", wl_or_sched_name): 437 | updated_wl_or_sched_name = "Batch" 438 | if paper_mode and re.search("Service", wl_or_sched_name): 439 | updated_wl_or_sched_name = "Service" 440 | 441 | # Don't show lines for service frameworks 442 | if updated_wl_or_sched_name == "Batch": 443 | "Skipping a line for a service scheduler" 444 | continue 445 | x_vals = [value.x for value in values] 446 | # Rewrite zero's for the y_axis_types that will be log. 447 | y_vals = [0.00001 if (value.y == 0 and y_axis_type == "ms-to-day") 448 | else value.y for value in values] 449 | logging.debug("Plotting line for %s %s %s." % 450 | (title, updated_wl_or_sched_name, plot_title)) 451 | #logging.debug("x vals: " + " ".join([str(i) for i in x_vals])) 452 | #logging.debug("y vals: " + " ".join([str(i) for i in y_vals])) 453 | logging.debug("wl_or_sched_name: " + wl_or_sched_name) 454 | logging.debug("title: " + title) 455 | 456 | ax.plot(x_vals, y_vals, 457 | dashes=dashes_paper[wl_or_sched_name], 458 | color=title_colors_web[title], 459 | label=line_label, markersize=4, 460 | mec=title_colors_web[title]) 461 | 462 | setup_graph_details(ax, plot_title, filename_suffix, y_label, y_axis_type) 463 | 464 | def setup_graph_details(ax, plot_title, filename_suffix, y_label, y_axis_type): 465 | assert(y_axis_type == "0-to-1" or 466 | y_axis_type == "ms-to-day" or 467 | y_axis_type == "abs") 468 | 469 | # Paper title. 470 | if not paper_mode: 471 | plt.title(plot_title) 472 | 473 | if paper_mode: 474 | try: 475 | # Set up the legend, for removing the border if in paper mode. 476 | handles, labels = ax.get_legend_handles_labels() 477 | handles2, labels2 = sort_labels(handles, labels) 478 | leg = plt.legend(handles2, labels2, loc=2, labelspacing=0) 479 | fr = leg.get_frame() 480 | fr.set_linewidth(0) 481 | except: 482 | print "Failed to remove frame around legend, legend probably is empty." 483 | 484 | # Axis labels. 485 | if not paper_mode: 486 | ax.set_ylabel(y_label) 487 | if vary_dim == "c": 488 | ax.set_xlabel(u'Scheduler 1 constant processing time [sec]') 489 | elif vary_dim == "l": 490 | ax.set_xlabel(u'Scheduler 1 per-task processing time [sec]') 491 | elif vary_dim == "lambda": 492 | ax.set_xlabel(u'Job arrival rate to scheduler 1, $\lambda_1$') 493 | 494 | # x-axis scale, limit, tics and tic labels. 495 | ax.set_xscale('log') 496 | ax.set_autoscalex_on(False) 497 | if vary_dim == 'c': 498 | plt.xlim(xmin=0.01) 499 | plt.xticks((0.01, 0.1, 1, 10, 100), ('10ms', '0.1s', '1s', '10s', '100s')) 500 | elif vary_dim == 'l': 501 | plt.xlim(xmin=0.001, xmax=1) 502 | plt.xticks((0.001, 0.01, 0.1, 1), ('1ms', '10ms', '0.1s', '1s')) 503 | elif vary_dim == 'lambda': 504 | plt.xlim([0.1, 100]) 505 | plt.xticks((0.1, 1, 10, 100), ('0.1s', '1s', '10s', '100s')) 506 | 507 | # y-axis limit, tics and tic labels. 508 | if y_axis_type == "0-to-1": 509 | logging.debug("Setting up y-axis for '0-to-1' style graph.") 510 | plt.ylim([0, 1]) 511 | plt.yticks((0, 0.2, 0.4, 0.6, 0.8, 1.0), 512 | ('0.0', '0.2', '0.4', '0.6', '0.8', '1.0')) 513 | elif y_axis_type == "ms-to-day": 514 | logging.debug("Setting up y-axis for 'ms-to-day' style graph.") 515 | #ax.set_yscale('symlog', linthreshy=0.001) 516 | ax.set_yscale('log') 517 | plt.ylim(ymin=0.01, ymax=24*3600) 518 | plt.yticks((0.01, 1, 60, 3600, 24*3600), ('10ms', '1s', '1m', '1h', '1d')) 519 | elif y_axis_type == "abs": 520 | plt.ylim(ymin=0) 521 | logging.debug("Setting up y-axis for 'abs' style graph.") 522 | #plt.yticks((0.01, 1, 60, 3600, 24*3600), ('10ms', '1s', '1m', '1h', '1d')) 523 | else: 524 | logging.error('y_axis_label parameter must be either "0-to-1"' 525 | ', "ms-to-day", or "abs".') 526 | sys.exit(1) 527 | 528 | final_filename = (output_prefix + 529 | ('/sisi-vary-%s-vs-' % vary_dim) + 530 | filename_suffix) 531 | logging.debug("Writing plot to %s", final_filename) 532 | writeout(final_filename, output_formats) 533 | 534 | 535 | #SCHEDULER DAILY BUSY AND CONFLICT FRACTION MEDIANS 536 | plot_2d_data_set_dict(sched_daily_busy_fraction, 537 | "Scheduler processing time vs. median(daily busy time fraction)", 538 | "daily-busy-fraction-med", 539 | u'Median(daily busy time fraction)', 540 | "0-to-1") 541 | 542 | plot_2d_data_set_dict(sched_daily_conflict_fraction, 543 | "Scheduler processing time vs. median(daily conflict fraction)", 544 | "daily-conflict-fraction-med", 545 | u'Median(daily conflict fraction)', 546 | "0-to-1") 547 | -------------------------------------------------------------------------------- /src/main/python/graphing-scripts/comparison-plot-from-protobuff.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Copyright (c) 2013, Regents of the University of California 4 | # All rights reserved. 5 | 6 | # Redistribution and use in source and binary forms, with or without 7 | # modification, are permitted provided that the following conditions are met: 8 | 9 | # Redistributions of source code must retain the above copyright notice, this 10 | # list of conditions and the following disclaimer. Redistributions in binary 11 | # form must reproduce the above copyright notice, this list of conditions and the 12 | # following disclaimer in the documentation and/or other materials provided with 13 | # the distribution. Neither the name of the University of California, Berkeley 14 | # nor the names of its contributors may be used to endorse or promote products 15 | # derived from this software without specific prior written permission. THIS 16 | # SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 17 | # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 18 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 19 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 20 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 21 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 22 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 23 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 24 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 | 27 | curr_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" 28 | cd $curr_dir 29 | 30 | PYTHONPATH=$PYTHONPATH:.. python ./comparison-plot-from-protobuff.py $@ 31 | -------------------------------------------------------------------------------- /src/main/python/graphing-scripts/utils.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2013, Regents of the University of California 2 | # All rights reserved. 3 | 4 | # Redistribution and use in source and binary forms, with or without 5 | # modification, are permitted provided that the following conditions are met: 6 | 7 | # Redistributions of source code must retain the above copyright notice, this 8 | # list of conditions and the following disclaimer. Redistributions in binary 9 | # form must reproduce the above copyright notice, this list of conditions and the 10 | # following disclaimer in the documentation and/or other materials provided with 11 | # the distribution. Neither the name of the University of California, Berkeley 12 | # nor the names of its contributors may be used to endorse or promote products 13 | # derived from this software without specific prior written permission. THIS 14 | # SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 15 | # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 16 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 17 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 18 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 19 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 20 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 21 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 22 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 23 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 24 | 25 | from matplotlib import use, rc 26 | use('Agg') 27 | import matplotlib.pyplot as plt 28 | 29 | # plot saving utility function 30 | def writeout(filename_base, formats = ['pdf']): 31 | for fmt in formats: 32 | plt.savefig("%s.%s" % (filename_base, fmt), format=fmt, bbox_inches='tight') 33 | # plt.savefig("%s.%s" % (filename_base, fmt), format=fmt) 34 | 35 | def set_leg_fontsize(size): 36 | rc('legend', fontsize=size) 37 | 38 | def set_paper_rcs(): 39 | rc('font',**{'family':'sans-serif','sans-serif':['Helvetica'], 40 | 'serif':['Helvetica'],'size':8}) 41 | rc('text', usetex=True) 42 | rc('legend', fontsize=7) 43 | rc('figure', figsize=(3.33,2.22)) 44 | # rc('figure.subplot', left=0.10, top=0.90, bottom=0.12, right=0.95) 45 | rc('axes', linewidth=0.5) 46 | rc('lines', linewidth=0.5) 47 | 48 | def set_rcs(): 49 | rc('font',**{'family':'sans-serif','sans-serif':['Helvetica'], 50 | 'serif':['Times'],'size':12}) 51 | rc('text', usetex=True) 52 | rc('legend', fontsize=7) 53 | rc('figure', figsize=(6,4)) 54 | rc('figure.subplot', left=0.10, top=0.90, bottom=0.12, right=0.95) 55 | rc('axes', linewidth=0.5) 56 | rc('lines', linewidth=0.5) 57 | 58 | def append_or_create(d, i, e): 59 | if not i in d: 60 | d[i] = [e] 61 | else: 62 | d[i].append(e) 63 | 64 | # Append e to the array at position (i,k). 65 | # d - a dictionary of dictionaries of arrays, essentially a 2d dictionary. 66 | # i, k - essentially a 2 element tuple to use as the key into this 2d dict. 67 | # e - the value to add to the array indexed by key (i,k). 68 | def append_or_create_2d(d, i, k, e): 69 | if not i in d: 70 | d[i] = {k : [e]} 71 | elif k not in d[i]: 72 | d[i][k] = [e] 73 | else: 74 | d[i][k].append(e) 75 | 76 | def cell_to_anon(cell): 77 | if cell == 'A': 78 | return 'A' 79 | elif cell == 'B': 80 | return 'B' 81 | elif cell == 'C': 82 | return 'C' 83 | else: 84 | return 'SYNTH' 85 | -------------------------------------------------------------------------------- /src/main/scala/ExperimentRunner.scala: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (c) 2013, Regents of the University of California 3 | * All rights reserved. 4 | * 5 | * Redistribution and use in source and binary forms, with or without 6 | * modification, are permitted provided that the following conditions are met: 7 | * 8 | * Redistributions of source code must retain the above copyright notice, this 9 | * list of conditions and the following disclaimer. Redistributions in binary 10 | * form must reproduce the above copyright notice, this list of conditions and the 11 | * following disclaimer in the documentation and/or other materials provided with 12 | * the distribution. Neither the name of the University of California, Berkeley 13 | * nor the names of its contributors may be used to endorse or promote products 14 | * derived from this software without specific prior written permission. THIS 15 | * SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 16 | * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 17 | * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 18 | * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 19 | * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 20 | * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 21 | * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 22 | * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 23 | * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 24 | * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 25 | */ 26 | 27 | package ClusterSchedulingSimulation 28 | 29 | import scala.collection.mutable.HashMap 30 | import scala.collection.mutable.ListBuffer 31 | import ClusterSimulationProtos._ 32 | import java.io._ 33 | 34 | /** 35 | * An experiment represents a series of runs of a simulator, 36 | * across ranges of paramters. Exactly one of {L, C, Lambda} 37 | * can be swept over per experiment, i.e. only one of 38 | * avgJobInterarrivalTimeRange, constantThinkTimeRange, and 39 | * perTaskThinkTimeRange can have size greater than one in a 40 | * single Experiment instance. 41 | */ 42 | class Experiment( 43 | name: String, 44 | // Workloads setup. 45 | workloadToSweepOver: String, 46 | avgJobInterarrivalTimeRange: Option[Seq[Double]] = None, 47 | workloadDescs: Seq[WorkloadDesc], 48 | // Schedulers setup. 49 | schedulerWorkloadsToSweepOver: Map[String, Seq[String]], 50 | constantThinkTimeRange: Seq[Double], 51 | perTaskThinkTimeRange: Seq[Double], 52 | blackListPercentRange: Seq[Double], 53 | // Workload -> scheduler mapping setup. 54 | schedulerWorkloadMap: Map[String, Seq[String]], 55 | // Simulator setup. 56 | simulatorDesc: ClusterSimulatorDesc, 57 | logging: Boolean = false, 58 | outputDirectory: String = "experiment_results", 59 | // Map from workloadName -> max % of cellState this prefill workload 60 | // can account for. Any prefill workload generator with workloadName 61 | // that is not contained in any of these maps will have no prefill 62 | // generated for this experiment, and any with name that is in multiple 63 | // of these maps will use the first limit that actually kicks in. 64 | prefillCpuLimits: Map[String, Double] = Map(), 65 | prefillMemLimits: Map[String, Double] = Map(), 66 | // Default simulations to 10 minute timeout. 67 | simulationTimeout: Double = 60.0*10.0) extends Runnable { 68 | prefillCpuLimits.values.foreach(l => assert(l >= 0.0 && l <= 1.0)) 69 | prefillMemLimits.values.foreach(l => assert(l >= 0.0 && l <= 1.0)) 70 | 71 | var parametersSweepingOver = 0 72 | avgJobInterarrivalTimeRange.foreach{opt: Seq[Double] => { 73 | if (opt.length > 1) { 74 | parametersSweepingOver += 1 75 | } 76 | }} 77 | if (constantThinkTimeRange.length > 1) {parametersSweepingOver += 1} 78 | if (perTaskThinkTimeRange.length > 1) {parametersSweepingOver += 1} 79 | // assert(parametersSweepingOver <= 1) 80 | 81 | override 82 | def toString = name 83 | 84 | def run() { 85 | // Create the output directory if it doesn't exist. 86 | (new File(outputDirectory)).mkdirs() 87 | val output = 88 | new java.io.FileOutputStream("%s/%s-%.0f.protobuf" 89 | .format(outputDirectory, 90 | name, 91 | simulatorDesc.runTime)) 92 | 93 | val experimentResultSet = ExperimentResultSet.newBuilder() 94 | 95 | // Parameter sweep over workloadDescs 96 | workloadDescs.foreach(workloadDesc => { 97 | println("\nSet workloadDesc = %s %s" 98 | .format(workloadDesc.cell, workloadDesc.assignmentPolicy)) 99 | 100 | // Save Experiment level stats into protobuf results. 101 | val experimentEnv = ExperimentResultSet.ExperimentEnv.newBuilder() 102 | experimentEnv.setCellName(workloadDesc.cell) 103 | experimentEnv.setWorkloadSplitType(workloadDesc.assignmentPolicy) 104 | experimentEnv.setIsPrefilled( 105 | workloadDesc.prefillWorkloadGenerators.length > 0) 106 | experimentEnv.setRunTime(simulatorDesc.runTime) 107 | 108 | // Generate preFill workloads. The simulator doesn't modify 109 | // these workloads like it does the workloads that are played during 110 | // the simulation. 111 | var prefillWorkloads = List[Workload]() 112 | workloadDesc.prefillWorkloadGenerators 113 | .filter(wlGen => { 114 | prefillCpuLimits.contains(wlGen.workloadName) || 115 | prefillMemLimits.contains(wlGen.workloadName) 116 | }).foreach(wlGen => { 117 | val cpusMaxOpt = prefillCpuLimits.get(wlGen.workloadName).map(i => { 118 | i * workloadDesc.cellStateDesc.numMachines * 119 | workloadDesc.cellStateDesc.cpusPerMachine 120 | }) 121 | 122 | val memMaxOpt = prefillMemLimits.get(wlGen.workloadName).map(i => { 123 | i * workloadDesc.cellStateDesc.numMachines * 124 | workloadDesc.cellStateDesc.memPerMachine 125 | }) 126 | println(("Creating a new prefill workload from " + 127 | "%s with maxCPU %s and maxMem %s") 128 | .format(wlGen.workloadName, cpusMaxOpt, memMaxOpt)) 129 | val newWorkload = wlGen.newWorkload(simulatorDesc.runTime, 130 | maxCpus = cpusMaxOpt, 131 | maxMem = memMaxOpt) 132 | for(job <- newWorkload.getJobs) { 133 | assert(job.submitted == 0.0) 134 | } 135 | prefillWorkloads ::= newWorkload 136 | }) 137 | 138 | // Parameter sweep over lambda. 139 | // If we have a range for lambda, loop over it, else 140 | // we just loop over a list holding a single element: None 141 | val jobInterarrivalRange = avgJobInterarrivalTimeRange match { 142 | case Some(paramsRange) => paramsRange.map(Some(_)) 143 | case None => List(None) 144 | } 145 | 146 | println("\nSet up avgJobInterarrivalTimeRange: %s\n" 147 | .format(jobInterarrivalRange)) 148 | jobInterarrivalRange.foreach(avgJobInterarrivalTime => { 149 | if (avgJobInterarrivalTime.isEmpty) { 150 | println("Since we're not in a labmda sweep, not overwriting lambda.") 151 | } else { 152 | println("Curr avgJobInterarrivalTime: %s\n" 153 | .format(avgJobInterarrivalTime)) 154 | } 155 | 156 | // Set up a list of workloads 157 | var commonWorkloadSet = ListBuffer[Workload]() 158 | var newAvgJobInterarrivalTime: Option[Double] = None 159 | workloadDesc.workloadGenerators.foreach(workloadGenerator => { 160 | if (workloadToSweepOver.equals( 161 | workloadGenerator.workloadName)) { 162 | // Only update the workload interarrival time if this is the 163 | // workload we are supposed to sweep over. If this is not a 164 | // lambda parameter sweep then updatedAvgJobInterarrivalTime 165 | // will remain None after this line is executed. 166 | newAvgJobInterarrivalTime = avgJobInterarrivalTime 167 | } 168 | println("Generating new Workload %s for window %f seconds long." 169 | .format(workloadGenerator.workloadName, simulatorDesc.runTime)) 170 | val newWorkload = 171 | workloadGenerator 172 | .newWorkload(timeWindow = simulatorDesc.runTime, 173 | updatedAvgJobInterarrivalTime = newAvgJobInterarrivalTime) 174 | commonWorkloadSet.append(newWorkload) 175 | }) 176 | 177 | // Parameter sweep over L. 178 | perTaskThinkTimeRange.foreach(perTaskThinkTime => { 179 | println("\nSet perTaskThinkTime = %f".format(perTaskThinkTime)) 180 | 181 | // Parameter sweep over C. 182 | constantThinkTimeRange.foreach(constantThinkTime => { 183 | println("\nSet constantThinkTime = %f".format(constantThinkTime)) 184 | 185 | // Parameter sweep over BlackListPercent (of cellstate). 186 | blackListPercentRange.foreach(blackListPercent => { 187 | println("\nSet blackListPercent = %f".format(blackListPercent)) 188 | 189 | // Make a copy of the workloads that this run of the simulator 190 | // will modify by using them to track statistics. 191 | val workloads = ListBuffer[Workload]() 192 | commonWorkloadSet.foreach(workload => { 193 | workloads.append(workload.copy) 194 | }) 195 | // Setup and and run the simulator. 196 | val simulator = 197 | simulatorDesc.newSimulator(constantThinkTime, 198 | perTaskThinkTime, 199 | blackListPercent, 200 | schedulerWorkloadsToSweepOver, 201 | schedulerWorkloadMap, 202 | workloadDesc.cellStateDesc, 203 | workloads, 204 | prefillWorkloads, 205 | logging) 206 | 207 | println("Running simulation with run().") 208 | val success: Boolean = simulator.run(Some(simulatorDesc.runTime), 209 | Some(simulationTimeout)) 210 | if (success) { 211 | // Simulation did not time out, so record stats. 212 | /** 213 | * Capture statistics into a protocolbuffer. 214 | */ 215 | val experimentResult = 216 | ExperimentResultSet.ExperimentEnv.ExperimentResult.newBuilder() 217 | 218 | experimentResult.setCellStateAvgCpuUtilization( 219 | simulator.avgCpuUtilization / simulator.cellState.totalCpus) 220 | experimentResult.setCellStateAvgMemUtilization( 221 | simulator.avgMemUtilization / simulator.cellState.totalMem) 222 | 223 | experimentResult.setCellStateAvgCpuLocked( 224 | simulator.avgCpuLocked / simulator.cellState.totalCpus) 225 | experimentResult.setCellStateAvgMemLocked( 226 | simulator.avgMemLocked / simulator.cellState.totalMem) 227 | 228 | // Save repeated stats about workloads. 229 | workloads.foreach(workload => { 230 | val workloadStats = ExperimentResultSet. 231 | ExperimentEnv. 232 | ExperimentResult. 233 | WorkloadStats.newBuilder() 234 | workloadStats.setWorkloadName(workload.name) 235 | workloadStats.setNumJobs(workload.numJobs) 236 | workloadStats.setNumJobsScheduled( 237 | workload.getJobs.filter(_.numSchedulingAttempts > 0).length) 238 | workload 239 | workloadStats.setJobThinkTimes90Percentile( 240 | workload.jobUsefulThinkTimesPercentile(0.9)) 241 | workloadStats.setAvgJobQueueTimesTillFirstScheduled( 242 | workload.avgJobQueueTimeTillFirstScheduled) 243 | workloadStats.setAvgJobQueueTimesTillFullyScheduled( 244 | workload.avgJobQueueTimeTillFullyScheduled) 245 | workloadStats.setJobQueueTimeTillFirstScheduled90Percentile( 246 | workload.jobQueueTimeTillFirstScheduledPercentile(0.9)) 247 | workloadStats.setJobQueueTimeTillFullyScheduled90Percentile( 248 | workload.jobQueueTimeTillFullyScheduledPercentile(0.9)) 249 | workloadStats.setNumSchedulingAttempts90Percentile( 250 | workload.numSchedulingAttemptsPercentile(0.9)) 251 | workloadStats.setNumSchedulingAttempts99Percentile( 252 | workload.numSchedulingAttemptsPercentile(0.99)) 253 | workloadStats.setNumTaskSchedulingAttempts90Percentile( 254 | workload.numTaskSchedulingAttemptsPercentile(0.9)) 255 | workloadStats.setNumTaskSchedulingAttempts99Percentile( 256 | workload.numTaskSchedulingAttemptsPercentile(0.99)) 257 | 258 | experimentResult.addWorkloadStats(workloadStats) 259 | }) 260 | // Record workload specific details about the parameter sweeps. 261 | experimentResult.setSweepWorkload(workloadToSweepOver) 262 | experimentResult.setAvgJobInterarrivalTime( 263 | avgJobInterarrivalTime.getOrElse( 264 | workloads.filter(_.name == workloadToSweepOver) 265 | .head.avgJobInterarrivalTime)) 266 | 267 | // Save repeated stats about schedulers. 268 | simulator.schedulers.values.foreach(scheduler => { 269 | val schedulerStats = 270 | ExperimentResultSet. 271 | ExperimentEnv. 272 | ExperimentResult. 273 | SchedulerStats.newBuilder() 274 | schedulerStats.setSchedulerName(scheduler.name) 275 | schedulerStats.setUsefulBusyTime( 276 | scheduler.totalUsefulTimeScheduling) 277 | schedulerStats.setWastedBusyTime( 278 | scheduler.totalWastedTimeScheduling) 279 | // Per scheduler metrics bucketed by day. 280 | // Use floor since days are zero-indexed. For example, if the 281 | // simulator only runs for 1/2 day, we should only have one 282 | // bucket (day 0), so our range should be 0 to 0. In this example 283 | // we would get floor(runTime / 86400) = floor(0.5) = 0. 284 | val daysRan = math.floor(simulatorDesc.runTime/86400.0).toInt 285 | println("Computing daily stats for days 0 through %d." 286 | .format(daysRan)) 287 | (0 to daysRan).foreach { 288 | day: Int => { 289 | val perDayStats = 290 | ExperimentResultSet. 291 | ExperimentEnv. 292 | ExperimentResult. 293 | SchedulerStats. 294 | PerDayStats.newBuilder() 295 | perDayStats.setDayNum(day) 296 | // Busy and wasted time bucketed by day. 297 | perDayStats.setUsefulBusyTime( 298 | scheduler.dailyUsefulTimeScheduling.getOrElse(day, 0.0)) 299 | println(("Writing dailyUsefulScheduling(day = %d) = %f for " + 300 | "scheduler %s") 301 | .format(day, 302 | scheduler 303 | .dailyUsefulTimeScheduling 304 | .getOrElse(day, 0.0), 305 | scheduler.name)) 306 | perDayStats.setWastedBusyTime( 307 | scheduler.dailyWastedTimeScheduling.getOrElse(day, 0.0)) 308 | // Counters bucketed by day. 309 | perDayStats.setNumSuccessfulTransactions( 310 | scheduler.dailySuccessTransactions.getOrElse[Int](day, 0)) 311 | perDayStats.setNumFailedTransactions( 312 | scheduler.dailyFailedTransactions.getOrElse[Int](day, 0)) 313 | 314 | schedulerStats.addPerDayStats(perDayStats) 315 | }} 316 | 317 | assert(scheduler.perWorkloadUsefulTimeScheduling.size == 318 | scheduler.perWorkloadWastedTimeScheduling.size, 319 | "the maps held by Scheduler to track per workload " + 320 | "useful and wasted time should be the same size " + 321 | "(Scheduler.addJob() should ensure this).") 322 | scheduler.perWorkloadUsefulTimeScheduling.foreach{ 323 | case (workloadName, workloadUsefulBusyTime) => { 324 | val perWorkloadBusyTime = 325 | ExperimentResultSet. 326 | ExperimentEnv. 327 | ExperimentResult. 328 | SchedulerStats. 329 | PerWorkloadBusyTime.newBuilder() 330 | perWorkloadBusyTime.setWorkloadName(workloadName) 331 | perWorkloadBusyTime.setUsefulBusyTime(workloadUsefulBusyTime) 332 | perWorkloadBusyTime.setWastedBusyTime( 333 | scheduler.perWorkloadWastedTimeScheduling(workloadName)) 334 | 335 | schedulerStats.addPerWorkloadBusyTime(perWorkloadBusyTime) 336 | }} 337 | // Counts of sched-level job transaction successes, failures, 338 | // and retries. 339 | schedulerStats.setNumSuccessfulTransactions( 340 | scheduler.numSuccessfulTransactions) 341 | schedulerStats.setNumFailedTransactions( 342 | scheduler.numFailedTransactions) 343 | schedulerStats.setNumNoResourcesFoundSchedulingAttempts( 344 | scheduler.numNoResourcesFoundSchedulingAttempts) 345 | schedulerStats.setNumRetriedTransactions( 346 | scheduler.numRetriedTransactions) 347 | schedulerStats.setNumJobsTimedOutScheduling( 348 | scheduler.numJobsTimedOutScheduling) 349 | // Counts of task transaction successes and failures. 350 | schedulerStats.setNumSuccessfulTaskTransactions( 351 | scheduler.numSuccessfulTaskTransactions) 352 | schedulerStats.setNumFailedTaskTransactions( 353 | scheduler.numFailedTaskTransactions) 354 | 355 | schedulerStats.setIsMultiPath(scheduler.isMultiPath) 356 | schedulerStats.setNumJobsLeftInQueue(scheduler.jobQueueSize) 357 | schedulerStats.setFailedFindVictimAttempts( 358 | scheduler.failedFindVictimAttempts) 359 | 360 | experimentResult.addSchedulerStats(schedulerStats) 361 | }) 362 | // Record scheduler specific details about the parameter sweeps. 363 | schedulerWorkloadsToSweepOver 364 | .foreach{case (schedName, workloadNames) => { 365 | workloadNames.foreach(workloadName => { 366 | val schedulerWorkload = 367 | ExperimentResultSet. 368 | ExperimentEnv. 369 | ExperimentResult. 370 | SchedulerWorkload.newBuilder() 371 | schedulerWorkload.setSchedulerName(schedName) 372 | schedulerWorkload.setWorkloadName(workloadName) 373 | experimentResult.addSweepSchedulerWorkload(schedulerWorkload) 374 | }) 375 | }} 376 | 377 | experimentResult.setConstantThinkTime(constantThinkTime) 378 | experimentResult.setPerTaskThinkTime(perTaskThinkTime) 379 | 380 | // Save our results as a protocol buffer. 381 | experimentEnv.addExperimentResult(experimentResult.build()) 382 | 383 | 384 | /** 385 | * TODO(andyk): Once protocol buffer support is finished, 386 | * remove this. 387 | */ 388 | 389 | // Create a sorted list of schedulers and workloads to compute 390 | // a lot of the stats below, so that the we can be sure 391 | // which column is which when we print the stats. 392 | val sortedSchedulers = simulator 393 | .schedulers.values.toList.sortWith(_.name < _.name) 394 | val sortedWorkloads = workloads.toList.sortWith(_.name < _.name) 395 | 396 | // Sorted names of workloads. 397 | var workloadNames = sortedWorkloads.map(_.name).mkString(" ") 398 | 399 | // Count the jobs in each workload. 400 | var numJobs = sortedWorkloads.map(_.numJobs).mkString(" ") 401 | 402 | // Count the jobs in each workload that were actually scheduled. 403 | val numJobsScheduled = sortedWorkloads.map(workload => { 404 | workload.getJobs.filter(_.numSchedulingAttempts > 0).length 405 | }).mkString(" ") 406 | 407 | // Sorted names of Schedulers. 408 | val schedNames = sortedSchedulers.map(_.name).mkString(" ") 409 | 410 | // Calculate per scheduler successful, failed, retried 411 | // transaction conflict rates. 412 | val schedSuccessfulTransactions = sortedSchedulers.map(sched => { 413 | sched.numSuccessfulTransactions 414 | }).mkString(" ") 415 | val schedFailedTransactions = sortedSchedulers.map(sched => { 416 | sched.numFailedTransactions 417 | }).mkString(" ") 418 | val schedNoResorucesFoundSchedAttempt = sortedSchedulers.map(sched => { 419 | sched.numNoResourcesFoundSchedulingAttempts 420 | }).mkString(" ") 421 | val schedRetriedTransactions = sortedSchedulers.map(sched => { 422 | sched.numRetriedTransactions 423 | }).mkString(" ") 424 | 425 | // Calculate per scheduler task transaction and conflict rates 426 | val schedSuccessfulTaskTransactions = sortedSchedulers.map(sched => { 427 | sched.numSuccessfulTaskTransactions 428 | }).mkString(" ") 429 | val schedFailedTaskTransactions = sortedSchedulers.map(sched => { 430 | sched.numFailedTaskTransactions 431 | }).mkString(" ") 432 | 433 | val schedNumJobsTimedOutScheduling = sortedSchedulers.map(sched => { 434 | sched.numJobsTimedOutScheduling 435 | }).mkString(" ") 436 | 437 | // Calculate per scheduler aggregate (useful + wasted) busy time. 438 | val schedBusyTimes = sortedSchedulers.map(sched => { 439 | println(("calculating busy time for sched %s as " + 440 | "(%f + %f) / %f = %f.") 441 | .format(sched.name, 442 | sched.totalUsefulTimeScheduling, 443 | sched.totalWastedTimeScheduling, 444 | simulator.currentTime, 445 | (sched.totalUsefulTimeScheduling + 446 | sched.totalWastedTimeScheduling) / 447 | simulator.currentTime)) 448 | (sched.totalUsefulTimeScheduling + 449 | sched.totalWastedTimeScheduling) / simulator.currentTime 450 | }).mkString(" ") 451 | 452 | // Calculate per scheduler aggregate (useful + wasted) busy time. 453 | val schedUsefulBusyTimes = sortedSchedulers.map(sched => { 454 | sched.totalUsefulTimeScheduling / simulator.currentTime 455 | }).mkString(" ") 456 | 457 | // Calculate per scheduler aggregate (useful + wasted) busy time. 458 | val schedWastedBusyTimes = sortedSchedulers.map(sched => { 459 | sched.totalWastedTimeScheduling / simulator.currentTime 460 | }).mkString(" ") 461 | 462 | // Calculate per-scheduler per-workload useful + wasted busy time. 463 | val perWorkloadSchedBusyTimes = sortedSchedulers.map(sched => { 464 | // Sort by workload name. 465 | val sortedSchedulingTimes = 466 | sched.perWorkloadUsefulTimeScheduling.toList.sortWith(_._1<_._1) 467 | sortedSchedulingTimes.map(nameTimePair => { 468 | (nameTimePair._2 + 469 | sched.perWorkloadWastedTimeScheduling(nameTimePair._1)) / 470 | simulator.currentTime 471 | }).mkString(" ") 472 | }).mkString(" ") 473 | 474 | // Calculate 90%tile per-workload time-scheduling for 475 | // scheduled jobs. 476 | // sortedWorkloads is a ListBuffer[Workload] 477 | // Workload.jobs is a ListBuffer[Job]. 478 | val jobThinkTimes90Percentile = sortedWorkloads.map(workload => { 479 | workload.jobUsefulThinkTimesPercentile(0.9) 480 | }).mkString(" ") 481 | 482 | // Calculate the average time jobs spent in scheduler's queue before 483 | // its first task was first scheduled. 484 | val avgJobQueueTimesTillFirstScheduled = sortedWorkloads.map(workload => { 485 | workload.avgJobQueueTimeTillFirstScheduled 486 | }).mkString(" ") 487 | 488 | // Calculate the average time jobs spent in scheduler's queue before 489 | // its final task was scheduled.. 490 | val avgJobQueueTimesTillFullyScheduled = sortedWorkloads.map(workload => { 491 | workload.avgJobQueueTimeTillFullyScheduled 492 | }).mkString(" ") 493 | 494 | // Calculate the 90%tile per-workload jobQueueTime*-s for 495 | // scheduled jobs. 496 | val jobQueueTimeTillFirstScheduled90Percentile = 497 | sortedWorkloads.map(workload => { 498 | workload.jobQueueTimeTillFirstScheduledPercentile(0.9) 499 | }).mkString(" ") 500 | 501 | val jobQueueTimeTillFullyScheduled90Percentile = 502 | sortedWorkloads.map(workload => { 503 | workload.jobQueueTimeTillFullyScheduledPercentile(0.9) 504 | }).mkString(" ") 505 | 506 | val numSchedulingAttempts90Percentile = 507 | sortedWorkloads.map(workload => { 508 | workload.numSchedulingAttemptsPercentile(0.9) 509 | }).mkString(" ") 510 | 511 | val numSchedulingAttempts99Percentile = 512 | sortedWorkloads.map(workload => { 513 | workload.numSchedulingAttemptsPercentile(0.99) 514 | }).mkString(" ") 515 | 516 | val numSchedulingAttemptsMax = 517 | sortedWorkloads.map(workload => { 518 | workload.getJobs.map(_.numSchedulingAttempts).max 519 | }).mkString(" ") 520 | 521 | val numTaskSchedulingAttempts90Percentile = 522 | sortedWorkloads.map(workload => { 523 | workload.numTaskSchedulingAttemptsPercentile(0.9) 524 | }).mkString(" ") 525 | 526 | val numTaskSchedulingAttempts99Percentile = 527 | sortedWorkloads.map(workload => { 528 | workload.numTaskSchedulingAttemptsPercentile(0.99) 529 | }).mkString(" ") 530 | 531 | val numTaskSchedulingAttemptsMax = 532 | sortedWorkloads.map(workload => { 533 | workload.getJobs.map(_.numTaskSchedulingAttempts).max 534 | }).mkString(" ") 535 | 536 | // Per-scheduler stats. 537 | val schedulerIsMultiPaths = sortedSchedulers.map(sched => { 538 | if (sched.isMultiPath) "1" 539 | else "0" 540 | }).mkString(" ") 541 | val schedulerJobQueueSizes = 542 | sortedSchedulers.map(_.jobQueueSize).mkString(" ") 543 | 544 | val prettyLine = ("cell: %s \n" + 545 | "assignment policy: %s \n" + 546 | "runtime: %f \n" + 547 | "avg cpu util: %f \n" + 548 | "avg mem util: %f \n" + 549 | "num workloads %d \n" + 550 | "workload names: %s \n" + 551 | "numjobs: %s \n" + 552 | "num jobs scheduled: %s \n" + 553 | "perWorkloadSchedBusyTimes: %s \n" + 554 | "jobThinkTimes90Percentile: %s \n" + 555 | "avgJobQueueTimesTillFirstScheduled: %s \n" + 556 | "avgJobQueueTimesTillFullyScheduled: %s \n" + 557 | "jobQueueTimeTillFirstScheduled90Percentile: %s \n" + 558 | "jobQueueTimeTillFullyScheduled90Percentile: %s \n" + 559 | "numSchedulingAttempts90Percentile: %s \n" + 560 | "numSchedulingAttempts99Percentile: %s \n" + 561 | "numSchedulingAttemptsMax: %s \n" + 562 | "numTaskSchedulingAttempts90Percentile: %s \n" + 563 | "numTaskSchedulingAttempts99Percentile: %s \n" + 564 | "numTaskSchedulingAttemptsMax: %s \n" + 565 | "simulator.schedulers.size: %d \n" + 566 | "schedNames: %s \n" + 567 | "schedBusyTimes: %s \n" + 568 | "schedUsefulBusyTimes: %s \n" + 569 | "schedWastedBusyTimes: %s \n" + 570 | "schedSuccessfulTransactions: %s \n" + 571 | "schedFailedTransactions: %s \n" + 572 | "schedNoResorucesFoundSchedAttempt: %s \n" + 573 | "schedRetriedTransactions: %s \n" + 574 | "schedSuccessfulTaskTransactions: %s \n" + 575 | "schedFailedTaskTransactions: %s \n" + 576 | "schedNumJobsTimedOutScheduling: %s \n" + 577 | "schedulerIsMultiPaths: %s \n" + 578 | "schedulerNumJobsLeftInQueue: %s \n" + 579 | "workloadToSweepOver: %s \n" + 580 | "avgJobInterarrivalTime: %f \n" + 581 | "constantThinkTime: %f \n" + 582 | "perTaskThinkTime %f").format( 583 | workloadDesc.cell, // %s 584 | workloadDesc.assignmentPolicy, // %s 585 | simulatorDesc.runTime, // %f 586 | simulator.avgCpuUtilization / 587 | simulator.cellState.totalCpus, // %f 588 | simulator.avgMemUtilization / 589 | simulator.cellState.totalMem, // %f 590 | workloads.length, // %d 591 | workloadNames, // %s 592 | numJobs, // %s 593 | numJobsScheduled, // %s 594 | perWorkloadSchedBusyTimes, // %s 595 | jobThinkTimes90Percentile, // %s 596 | avgJobQueueTimesTillFirstScheduled, // %s 597 | avgJobQueueTimesTillFullyScheduled, // %s 598 | jobQueueTimeTillFirstScheduled90Percentile, // %s 599 | jobQueueTimeTillFullyScheduled90Percentile, // %s 600 | numSchedulingAttempts90Percentile, // %s 601 | numSchedulingAttempts99Percentile, // %s 602 | numSchedulingAttemptsMax, // %s 603 | numTaskSchedulingAttempts90Percentile, // %s 604 | numTaskSchedulingAttempts99Percentile, // %s 605 | numTaskSchedulingAttemptsMax, // %s 606 | simulator.schedulers.size, // %d 607 | schedNames, // %s 608 | schedBusyTimes, // %s 609 | schedUsefulBusyTimes, // %s 610 | schedWastedBusyTimes, // %s 611 | schedSuccessfulTransactions, // %s 612 | schedFailedTransactions, // %s 613 | schedNoResorucesFoundSchedAttempt, // %s 614 | schedRetriedTransactions, // %s 615 | schedSuccessfulTaskTransactions, // %s 616 | schedFailedTaskTransactions, // %s 617 | schedNumJobsTimedOutScheduling, // %s 618 | schedulerIsMultiPaths, // %s 619 | schedulerJobQueueSizes, 620 | workloadToSweepOver, // %s 621 | avgJobInterarrivalTime.getOrElse( 622 | workloads.filter(_.name == workloadToSweepOver) // %f 623 | .head.avgJobInterarrivalTime), 624 | constantThinkTime, // %f 625 | perTaskThinkTime) // %f 626 | 627 | println(prettyLine + "\n") 628 | } else { // if (success) 629 | println("Simulation timed out.") 630 | } 631 | }) // blackListPercent 632 | }) // C 633 | }) // L 634 | }) // lambda 635 | experimentResultSet.addExperimentEnv(experimentEnv) 636 | }) // WorkloadDescs 637 | experimentResultSet.build().writeTo(output) 638 | output.close() 639 | } 640 | } 641 | -------------------------------------------------------------------------------- /src/main/scala/MesosSimulation.scala: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (c) 2013, Regents of the University of California 3 | * All rights reserved. 4 | * 5 | * Redistribution and use in source and binary forms, with or without 6 | * modification, are permitted provided that the following conditions are met: 7 | * 8 | * Redistributions of source code must retain the above copyright notice, this 9 | * list of conditions and the following disclaimer. Redistributions in binary 10 | * form must reproduce the above copyright notice, this list of conditions and the 11 | * following disclaimer in the documentation and/or other materials provided with 12 | * the distribution. Neither the name of the University of California, Berkeley 13 | * nor the names of its contributors may be used to endorse or promote products 14 | * derived from this software without specific prior written permission. THIS 15 | * SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 16 | * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 17 | * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 18 | * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 19 | * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 20 | * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 21 | * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 22 | * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 23 | * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 24 | * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 25 | */ 26 | 27 | package ClusterSchedulingSimulation 28 | 29 | import collection.mutable.HashMap 30 | import collection.mutable.ListBuffer 31 | 32 | class MesosSimulatorDesc( 33 | schedulerDescs: Seq[MesosSchedulerDesc], 34 | runTime: Double, 35 | val allocatorConstantThinkTime: Double) 36 | extends ClusterSimulatorDesc(runTime){ 37 | override 38 | def newSimulator(constantThinkTime: Double, 39 | perTaskThinkTime: Double, 40 | blackListPercent: Double, 41 | schedulerWorkloadsToSweepOver: Map[String, Seq[String]], 42 | workloadToSchedulerMap: Map[String, Seq[String]], 43 | cellStateDesc: CellStateDesc, 44 | workloads: Seq[Workload], 45 | prefillWorkloads: Seq[Workload], 46 | logging: Boolean = false): ClusterSimulator = { 47 | var schedulers = HashMap[String, MesosScheduler]() 48 | // Create schedulers according to experiment parameters. 49 | schedulerDescs.foreach(schedDesc => { 50 | // If any of the scheduler-workload pairs we're sweeping over 51 | // are for this scheduler, then apply them before 52 | // registering it. 53 | var constantThinkTimes = HashMap[String, Double]( 54 | schedDesc.constantThinkTimes.toSeq: _*) 55 | var perTaskThinkTimes = HashMap[String, Double]( 56 | schedDesc.perTaskThinkTimes.toSeq: _*) 57 | var newBlackListPercent = 0.0 58 | if (schedulerWorkloadsToSweepOver 59 | .contains(schedDesc.name)) { 60 | newBlackListPercent = blackListPercent 61 | schedulerWorkloadsToSweepOver(schedDesc.name) 62 | .foreach(workloadName => { 63 | constantThinkTimes(workloadName) = constantThinkTime 64 | perTaskThinkTimes(workloadName) = perTaskThinkTime 65 | }) 66 | } 67 | schedulers(schedDesc.name) = 68 | new MesosScheduler(schedDesc.name, 69 | constantThinkTimes.toMap, 70 | perTaskThinkTimes.toMap, 71 | schedDesc.schedulePartialJobs, 72 | math.floor(newBlackListPercent * 73 | cellStateDesc.numMachines.toDouble).toInt) 74 | }) 75 | // It shouldn't matter which transactionMode we choose, but it does 76 | // matter that we use "resource-fit" conflictMode or else 77 | // responses to resource offers will likely fail. 78 | val cellState = new CellState(cellStateDesc.numMachines, 79 | cellStateDesc.cpusPerMachine, 80 | cellStateDesc.memPerMachine, 81 | conflictMode = "resource-fit", 82 | transactionMode = "all-or-nothing") 83 | 84 | val allocator = 85 | new MesosAllocator(allocatorConstantThinkTime) 86 | 87 | new MesosSimulator(cellState, 88 | schedulers.toMap, 89 | workloadToSchedulerMap, 90 | workloads, 91 | prefillWorkloads, 92 | allocator, 93 | logging) 94 | } 95 | } 96 | 97 | class MesosSimulator(cellState: CellState, 98 | override val schedulers: Map[String, MesosScheduler], 99 | workloadToSchedulerMap: Map[String, Seq[String]], 100 | workloads: Seq[Workload], 101 | prefillWorkloads: Seq[Workload], 102 | var allocator: MesosAllocator, 103 | logging: Boolean = false, 104 | monitorUtilization: Boolean = true) 105 | extends ClusterSimulator(cellState, 106 | schedulers, 107 | workloadToSchedulerMap, 108 | workloads, 109 | prefillWorkloads, 110 | logging, 111 | monitorUtilization) { 112 | assert(cellState.conflictMode.equals("resource-fit"), 113 | "Mesos requires cellstate to be set up with resource-fit conflictMode") 114 | // Set up a pointer to this simulator in the allocator. 115 | allocator.simulator = this 116 | 117 | log("========================================================") 118 | log("Mesos SIM CONSTRUCTOR - CellState total usage: %fcpus (%.1f%s), %fmem (%.1f%s)." 119 | .format(cellState.totalOccupiedCpus, 120 | cellState.totalOccupiedCpus / 121 | cellState.totalCpus * 100.0, 122 | "%", 123 | cellState.totalOccupiedMem, 124 | cellState.totalOccupiedMem / 125 | cellState.totalMem * 100.0, 126 | "%")) 127 | 128 | // Set up a pointer to this simulator in each scheduler. 129 | schedulers.values.foreach(_.mesosSimulator = this) 130 | } 131 | 132 | class MesosSchedulerDesc(name: String, 133 | constantThinkTimes: Map[String, Double], 134 | perTaskThinkTimes: Map[String, Double], 135 | val schedulePartialJobs: Boolean) 136 | extends SchedulerDesc(name, 137 | constantThinkTimes, 138 | perTaskThinkTimes) 139 | 140 | class MesosScheduler(name: String, 141 | constantThinkTimes: Map[String, Double], 142 | perTaskThinkTimes: Map[String, Double], 143 | val schedulePartialJobs: Boolean, 144 | numMachinesToBlackList: Double = 0) 145 | extends Scheduler(name, 146 | constantThinkTimes, 147 | perTaskThinkTimes, 148 | numMachinesToBlackList) { 149 | println("scheduler-id-info: %d, %s, %d, %s, %s" 150 | .format(Thread.currentThread().getId(), 151 | name, 152 | hashCode(), 153 | constantThinkTimes.mkString(";"), 154 | perTaskThinkTimes.mkString(";"))) 155 | // TODO(andyk): Clean up these Simulator classes 156 | // by templatizing the Scheduler class and having only 157 | // one simulator of the correct type, instead of one 158 | // simulator for each of the parent and child classes. 159 | var mesosSimulator: MesosSimulator = null 160 | val offerQueue = new collection.mutable.Queue[Offer] 161 | 162 | override 163 | def checkRegistered = { 164 | super.checkRegistered 165 | assert(mesosSimulator != null, "This scheduler has not been added to a " + 166 | "simulator yet.") 167 | } 168 | 169 | /** 170 | * How an allocator sends offers to a framework. 171 | */ 172 | def resourceOffer(offer: Offer): Unit = { 173 | offerQueue.enqueue(offer) 174 | handleNextResourceOffer() 175 | } 176 | 177 | def handleNextResourceOffer(): Unit = { 178 | // We essentially synchronize access to this scheduling logic 179 | // via the scheduling variable. We aren't protecting this from real 180 | // parallelism, but rather from discrete-event-simlation style parallelism. 181 | if(!scheduling && !offerQueue.isEmpty) { 182 | scheduling = true 183 | val offer = offerQueue.dequeue() 184 | // Use this offer to attempt to schedule jobs. 185 | simulator.log("------ In %s.resourceOffer(offer %d).".format(name, offer.id)) 186 | val offerResponse = collection.mutable.ListBuffer[ClaimDelta]() 187 | var aggThinkTime: Double = 0.0 188 | // TODO(andyk): add an efficient method to CellState that allows us to 189 | // check the largest slice of available resources to decode 190 | // if we should keep trying to schedule or not. 191 | while (offer.cellState.availableCpus > 0.000001 && 192 | offer.cellState.availableMem > 0.000001 && 193 | !pendingQueue.isEmpty) { 194 | val job = pendingQueue.dequeue 195 | job.updateTimeInQueueStats(simulator.currentTime) 196 | val jobThinkTime = getThinkTime(job) 197 | aggThinkTime += jobThinkTime 198 | job.numSchedulingAttempts += 1 199 | job.numTaskSchedulingAttempts += job.unscheduledTasks 200 | 201 | // Before calling the expensive scheduleJob() function, check 202 | // to see if one of this job's tasks could fit into the sum of 203 | // *all* the currently free resources in the offers' cell state. 204 | // If one can't, then there is no need to call scheduleJob(). If 205 | // one can, we call scheduleJob(), though we still might not fit 206 | // any tasks due to fragmentation. 207 | if (offer.cellState.availableCpus > job.cpusPerTask && 208 | offer.cellState.availableMem > job.cpusPerTask) { 209 | // Schedule the job using the cellstate in the ResourceOffer. 210 | val claimDeltas = scheduleJob(job, offer.cellState) 211 | if(claimDeltas.length > 0) { 212 | numSuccessfulTransactions += 1 213 | recordUsefulTimeScheduling(job, 214 | jobThinkTime, 215 | job.numSchedulingAttempts == 1) 216 | mesosSimulator.log(("Setting up job %d to accept at least " + 217 | "part of offer %d. About to spend %f seconds " + 218 | "scheduling it. Assigning %d tasks to it.") 219 | .format(job.id, offer.id, jobThinkTime, 220 | claimDeltas.length)) 221 | offerResponse ++= claimDeltas 222 | job.unscheduledTasks -= claimDeltas.length 223 | } else { 224 | mesosSimulator.log(("Rejecting all of offer %d for job %d, " + 225 | "which requires tasks with %f cpu, %f mem. " + 226 | "Not counting busy time for this sched attempt.") 227 | .format(offer.id, 228 | job.id, 229 | job.cpusPerTask, 230 | job.memPerTask)) 231 | numNoResourcesFoundSchedulingAttempts += 1 232 | } 233 | } else { 234 | mesosSimulator.log(("Short-path rejecting all of offer %d for " + 235 | "job %d because a single one of its tasks " + 236 | "(%f cpu, %f mem) wouldn't fit into the sum " + 237 | "of the offer's private cell state's " + 238 | "remaining resources (%f cpu, %f mem).") 239 | .format(offer.id, 240 | job.id, 241 | job.cpusPerTask, 242 | job.memPerTask, 243 | offer.cellState.availableCpus, 244 | offer.cellState.availableMem)) 245 | } 246 | 247 | var jobEventType = "" // Set this conditionally below; used in logging. 248 | // If job is only partially scheduled, put it back in the pendingQueue. 249 | if (job.unscheduledTasks > 0) { 250 | mesosSimulator.log(("Job %d is [still] only partially scheduled, " + 251 | "(%d out of %d its tasks remain unscheduled) so " + 252 | "putting it back in the queue.") 253 | .format(job.id, 254 | job.unscheduledTasks, 255 | job.numTasks)) 256 | // Give up on a job if (a) it hasn't scheduled a single task in 257 | // 100 tries or (b) it hasn't finished scheduling after 1000 tries. 258 | if ((job.numSchedulingAttempts > 100 && 259 | job.unscheduledTasks == job.numTasks) || 260 | job.numSchedulingAttempts > 1000) { 261 | println(("Abandoning job %d (%f cpu %f mem) with %d/%d " + 262 | "remaining tasks, after %d scheduling " + 263 | "attempts.").format(job.id, 264 | job.cpusPerTask, 265 | job.memPerTask, 266 | job.unscheduledTasks, 267 | job.numTasks, 268 | job.numSchedulingAttempts)) 269 | numJobsTimedOutScheduling += 1 270 | jobEventType = "abandoned" 271 | } else { 272 | simulator.afterDelay(1) { 273 | addJob(job) 274 | } 275 | } 276 | job.lastEnqueued = simulator.currentTime 277 | } else { 278 | // All tasks in job scheduled so not putting it back in pendingQueue. 279 | jobEventType = "fully-scheduled" 280 | } 281 | if (!jobEventType.equals("")) { 282 | // Print some stats that we can use to generate CDFs of the job 283 | // # scheduling attempts and job-time-till-scheduled. 284 | // println("%s %s %d %s %d %d %f" 285 | // .format(Thread.currentThread().getId(), 286 | // name, 287 | // hashCode(), 288 | // jobEventType, 289 | // job.id, 290 | // job.numSchedulingAttempts, 291 | // simulator.currentTime - job.submitted)) 292 | } 293 | } 294 | 295 | if (pendingQueue.isEmpty) { 296 | // If we have scheduled everything, notify the allocator that we 297 | // don't need resources offers until we request them again (which 298 | // we will do when another job is added to our pendingQueue. 299 | // Do this before we reply to the offer since the allocator may make 300 | // its next round of offers shortly after we respond to this offer. 301 | mesosSimulator.log(("After scheduling, %s's pending queue is " + 302 | "empty, canceling outstanding " + 303 | "resource request.").format(name)) 304 | mesosSimulator.allocator.cancelOfferRequest(this) 305 | } else { 306 | mesosSimulator.log(("%s's pending queue still has %d jobs in it, but " + 307 | "for some reason, they didn't fit into this " + 308 | "offer, so it will patiently wait for more " + 309 | "resource offers.").format(name, pendingQueue.size)) 310 | } 311 | 312 | // Send our response to this offer. 313 | mesosSimulator.afterDelay(aggThinkTime) { 314 | mesosSimulator.log(("Waited %f seconds of aggThinkTime, now " + 315 | "responding to offer %d with %d responses after.") 316 | .format(aggThinkTime, offer.id, offerResponse.length)) 317 | mesosSimulator.allocator.respondToOffer(offer, offerResponse) 318 | } 319 | // Done with this offer, see if we have another one to handle. 320 | scheduling = false 321 | handleNextResourceOffer() 322 | } 323 | } 324 | 325 | // When a job arrives, notify the allocator, so that it can make us offers 326 | // until we notify it that we don't have any more jobs, at which time it 327 | // can stop sending us offers. 328 | override 329 | def addJob(job: Job) = { 330 | assert(simulator != null, "This scheduler has not been added to a " + 331 | "simulator yet.") 332 | simulator.log("========================================================") 333 | simulator.log("addJOB: CellState total usage: %fcpus (%.1f%s), %fmem (%.1f%s)." 334 | .format(simulator.cellState.totalOccupiedCpus, 335 | simulator.cellState.totalOccupiedCpus / 336 | simulator.cellState.totalCpus * 100.0, 337 | "%", 338 | simulator.cellState.totalOccupiedMem, 339 | simulator.cellState.totalOccupiedMem / 340 | simulator.cellState.totalMem * 100.0, 341 | "%")) 342 | super.addJob(job) 343 | pendingQueue.enqueue(job) 344 | simulator.log("Enqueued job %d of workload type %s." 345 | .format(job.id, job.workloadName)) 346 | mesosSimulator.allocator.requestOffer(this) 347 | } 348 | } 349 | 350 | /** 351 | * Decides which scheduler to make resource offer to next, and manages 352 | * the resource offer process. 353 | * 354 | * @param constantThinkTime the time this scheduler takes to sort the 355 | * list of schedulers to decide which to offer to next. This happens 356 | * before each series of resource offers is made. 357 | * @param resources How many resources is managed by this MesosAllocator 358 | */ 359 | class MesosAllocator(constantThinkTime: Double, 360 | minCpuOffer: Double = 100.0, 361 | minMemOffer: Double = 100.0, 362 | // Min time, in seconds, to batch up resources 363 | // before making an offer. 364 | val offerBatchInterval: Double = 1.0) { 365 | var simulator: MesosSimulator = null 366 | var allocating: Boolean = false 367 | var schedulersRequestingResources = collection.mutable.Set[MesosScheduler]() 368 | var timeSpentAllocating: Double = 0.0 369 | var nextOfferId: Long = 0 370 | val offeredDeltas = HashMap[Long, Seq[ClaimDelta]]() 371 | // Are we currently waiting while a resource batch offer builds up 372 | // that has already been scheduled? 373 | var buildAndSendOfferScheduled = false 374 | 375 | def checkRegistered = { 376 | assert(simulator != null, "You must assign a simulator to a " + 377 | "MesosAllocator before you can use it.") 378 | } 379 | 380 | def getThinkTime: Double = { 381 | constantThinkTime 382 | } 383 | 384 | def requestOffer(needySched: MesosScheduler) { 385 | checkRegistered 386 | simulator.log("Received an offerRequest from %s.".format(needySched.name)) 387 | // Adding a scheduler to this list will ensure that it gets included 388 | // in the next round of resource offers. 389 | schedulersRequestingResources += needySched 390 | schedBuildAndSendOffer() 391 | } 392 | 393 | def cancelOfferRequest(needySched: MesosScheduler) = { 394 | simulator.log("Canceling the outstanding resourceRequest for scheduler %s.".format( 395 | needySched.name)) 396 | schedulersRequestingResources -= needySched 397 | } 398 | 399 | /** 400 | * We batch up available resources into periodic offers so 401 | * that we don't send an offer in response to *every* small event, 402 | * which adds latency to the average offer and slows the simulator down. 403 | * This feature was in Mesos for the NSDI paper experiments, but didn't 404 | * get committed to the open source codebase at that time. 405 | */ 406 | def schedBuildAndSendOffer() = { 407 | if (!buildAndSendOfferScheduled) { 408 | buildAndSendOfferScheduled = true 409 | simulator.afterDelay(offerBatchInterval) { 410 | simulator.log("Building and sending a batched offer") 411 | buildAndSendOffer() 412 | // Let another call to buildAndSendOffer() get scheduled, 413 | // giving some time for resources to build up that are 414 | // becoming available due to tasks finishing. 415 | buildAndSendOfferScheduled = false 416 | } 417 | } 418 | } 419 | /** 420 | * Sort schedulers in simulator using DRF, then make an offer to 421 | * the first scheduler in the list. 422 | * 423 | * After any task finishes or scheduler says it wants offers, we 424 | * call this, i.e. buildAndSendOffer(), again. Note that the only 425 | * resources that will be available will be the ones that 426 | * the task that just finished was using). 427 | */ 428 | def buildAndSendOffer(): Unit = { 429 | checkRegistered 430 | simulator.log("========================================================") 431 | simulator.log(("TOP OF BUILD AND SEND. CellState total occupied: " + 432 | "%fcpus (%.1f%%), %fmem (%.1f%%).") 433 | .format(simulator.cellState.totalOccupiedCpus, 434 | simulator.cellState.totalOccupiedCpus / 435 | simulator.cellState.totalCpus * 100.0, 436 | simulator.cellState.totalOccupiedMem, 437 | simulator.cellState.totalOccupiedMem / 438 | simulator.cellState.totalMem * 100.0)) 439 | // Build and send an offer only if: 440 | // (a) there are enough resources in cellstate and 441 | // (b) at least one scheduler wants offers currently 442 | // Else, don't do anything, since this function will be called 443 | // again when a task finishes or a scheduler says it wants offers. 444 | if (!schedulersRequestingResources.isEmpty && 445 | simulator.cellState.availableCpus >= minCpuOffer && 446 | simulator.cellState.availableCpus >= minMemOffer) { 447 | // Use DRF to pick a candidate scheduler to offer resources. 448 | val sortedSchedulers = 449 | drfSortSchedulers(schedulersRequestingResources.toSeq) 450 | sortedSchedulers.headOption.foreach(candidateSched => { 451 | // Create an offer by taking a snapshot of cell state. We might 452 | // discard this without sending it if we find that there are 453 | // no available resources in cell state right now. 454 | val privCellState = simulator.cellState.copy 455 | val offer = Offer(nextOfferId, candidateSched, privCellState) 456 | nextOfferId += 1 457 | 458 | // Call scheduleAllAvailable() which creates deltas, applies them, 459 | // and returns them; all based on common cell state. This doesn't 460 | // affect the privateCellState we created above. Store the deltas 461 | // using the offerID as key until we get a response from the scheduler. 462 | // This has the effect of pessimistally locking the resources in 463 | // common cell state until we hear back from the scheduler (or time 464 | // out and rescind the offer). 465 | val claimDeltas = 466 | candidateSched.scheduleAllAvailable(cellState = simulator.cellState, 467 | locked = true) 468 | // Make sure scheduleAllAvailable() did its job. 469 | assert(simulator.cellState.availableCpus < 0.01 && 470 | simulator.cellState.availableMem < 0.01, 471 | ("After scheduleAllAvailable() is called on a cell state " + 472 | "that cells state should not have any available resources " + 473 | "of any type, but this cell state still has %f cpus and %f " + 474 | "memory available").format(simulator.cellState.availableCpus, 475 | simulator.cellState.availableMem)) 476 | if (!claimDeltas.isEmpty) { 477 | assert(privCellState.totalLockedCpus != 478 | simulator.cellState.totalLockedCpus, 479 | "Since some resources were locked and put into a resource " + 480 | "offer, we expect the number of total lockedCpus to now be " + 481 | "different in the private cell state we created than in the" + 482 | "common cell state.") 483 | offeredDeltas(offer.id) = claimDeltas 484 | 485 | val thinkTime = getThinkTime 486 | simulator.afterDelay(thinkTime) { 487 | timeSpentAllocating += thinkTime 488 | simulator.log(("Allocator done thinking, sending offer to %s. " + 489 | "Offer contains private cell state with " + 490 | "%f cpu, %f mem available.") 491 | .format(candidateSched.name, 492 | offer.cellState.availableCpus, 493 | offer.cellState.availableMem)) 494 | // Send the offer. 495 | candidateSched.resourceOffer(offer) 496 | } 497 | } 498 | }) 499 | } else { 500 | var reason = "" 501 | if (schedulersRequestingResources.isEmpty) 502 | reason = "No schedulers currently want offers." 503 | if (simulator.cellState.availableCpus < minCpuOffer || 504 | simulator.cellState.availableCpus < minMemOffer) 505 | reason = ("Only %f cpus and %f mem available in common cell state " + 506 | "but min offer size is %f cpus and %f mem.") 507 | .format(simulator.cellState.availableCpus, 508 | simulator.cellState.availableCpus, 509 | minCpuOffer, 510 | minMemOffer) 511 | simulator.log("Not sending an offer after all. %s".format(reason)) 512 | } 513 | } 514 | 515 | /** 516 | * Schedulers call this to respond to resource offers. 517 | */ 518 | def respondToOffer(offer: Offer, claimDeltas: Seq[ClaimDelta]) = { 519 | checkRegistered 520 | simulator.log(("------Scheduler %s responded to offer %d with " + 521 | "%d claimDeltas.") 522 | .format(offer.scheduler.name, offer.id, claimDeltas.length)) 523 | 524 | // Look up, unapply, & discard the saved deltas associated with the offerid. 525 | // This will cause the framework to stop being charged for the resources that 526 | // were locked while he made his scheduling decision. 527 | assert(offeredDeltas.contains(offer.id), 528 | "Allocator received response to offer that is not on record.") 529 | offeredDeltas.remove(offer.id).foreach(savedDeltas => { 530 | savedDeltas.foreach(_.unApply(cellState = simulator.cellState, 531 | locked = true)) 532 | }) 533 | simulator.log("========================================================") 534 | simulator.log("AFTER UNAPPLYING SAVED DELTAS") 535 | simulator.log("CellState total usage: %fcpus (%.1f%s), %fmem (%.1f%s)." 536 | .format(simulator.cellState.totalOccupiedCpus, 537 | simulator.cellState.totalOccupiedCpus / 538 | simulator.cellState.totalCpus * 100.0, 539 | "%", 540 | simulator.cellState.totalOccupiedMem, 541 | simulator.cellState.totalOccupiedMem / 542 | simulator.cellState.totalMem * 100.0, 543 | "%")) 544 | simulator.log("Committing all %d deltas that were part of response %d " 545 | .format(claimDeltas.length, offer.id)) 546 | // commit() all deltas that were part of the offer response, don't use 547 | // the option of having cell state create the end events for us since we 548 | // want to add code to the end event that triggers another resource offer. 549 | if (claimDeltas.length > 0) { 550 | val commitResult = simulator.cellState.commit(claimDeltas, false) 551 | assert(commitResult.conflictedDeltas.length == 0, 552 | "Expecting no conflicts, but there were %d." 553 | .format(commitResult.conflictedDeltas.length)) 554 | 555 | // Create end events for all tasks committed. 556 | commitResult.committedDeltas.foreach(delta => { 557 | simulator.afterDelay(delta.duration) { 558 | delta.unApply(simulator.cellState) 559 | simulator.log(("A task started by scheduler %s finished. " + 560 | "Freeing %f cpus, %f mem. Available: %f cpus, %f " + 561 | "mem. Also, triggering a new batched offer round.") 562 | .format(delta.scheduler.name, 563 | delta.cpus, 564 | delta.mem, 565 | simulator.cellState.availableCpus, 566 | simulator.cellState.availableMem)) 567 | schedBuildAndSendOffer() 568 | } 569 | }) 570 | } 571 | schedBuildAndSendOffer() 572 | } 573 | 574 | /** 575 | * 1/N multi-resource fair sharing. 576 | */ 577 | def drfSortSchedulers(schedulers: Seq[MesosScheduler]): Seq[MesosScheduler] = { 578 | val schedulerDominantShares = schedulers.map(scheduler => { 579 | val shareOfCpus = 580 | simulator.cellState.occupiedCpus.getOrElse(scheduler.name, 0.0) 581 | val shareOfMem = 582 | simulator.cellState.occupiedMem.getOrElse(scheduler.name, 0.0) 583 | val domShare = math.max(shareOfCpus / simulator.cellState.totalCpus, 584 | shareOfMem / simulator.cellState.totalMem) 585 | var nameOfDomShare = "" 586 | if (shareOfCpus > shareOfMem) nameOfDomShare = "cpus" 587 | else nameOfDomShare = "mem" 588 | simulator.log("%s's dominant share is %s (%f%s)." 589 | .format(scheduler.name, nameOfDomShare, domShare, "%")) 590 | (scheduler, domShare) 591 | }) 592 | schedulerDominantShares.sortBy(_._2).map(_._1) 593 | } 594 | } 595 | 596 | case class Offer(id: Long, scheduler: MesosScheduler, cellState: CellState) 597 | -------------------------------------------------------------------------------- /src/main/scala/MonolithicSimulation.scala: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (c) 2013, Regents of the University of California 3 | * All rights reserved. 4 | * 5 | * Redistribution and use in source and binary forms, with or without 6 | * modification, are permitted provided that the following conditions are met: 7 | * 8 | * Redistributions of source code must retain the above copyright notice, this 9 | * list of conditions and the following disclaimer. Redistributions in binary 10 | * form must reproduce the above copyright notice, this list of conditions and the 11 | * following disclaimer in the documentation and/or other materials provided with 12 | * the distribution. Neither the name of the University of California, Berkeley 13 | * nor the names of its contributors may be used to endorse or promote products 14 | * derived from this software without specific prior written permission. THIS 15 | * SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 16 | * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 17 | * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 18 | * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 19 | * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 20 | * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 21 | * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 22 | * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 23 | * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 24 | * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 25 | */ 26 | 27 | package ClusterSchedulingSimulation 28 | 29 | import scala.collection.mutable.HashMap 30 | 31 | /* This class and its subclasses are used by factory method 32 | * ClusterSimulator.newScheduler() to determine which type of Simulator 33 | * to create and also to carry any extra fields that the factory needs to 34 | * construct the simulator. 35 | */ 36 | class MonolithicSimulatorDesc(schedulerDescs: Seq[SchedulerDesc], 37 | runTime: Double) 38 | extends ClusterSimulatorDesc(runTime){ 39 | override 40 | def newSimulator(constantThinkTime: Double, 41 | perTaskThinkTime: Double, 42 | blackListPercent: Double, 43 | schedulerWorkloadsToSweepOver: Map[String, Seq[String]], 44 | workloadToSchedulerMap: Map[String, Seq[String]], 45 | cellStateDesc: CellStateDesc, 46 | workloads: Seq[Workload], 47 | prefillWorkloads: Seq[Workload], 48 | logging: Boolean = false): ClusterSimulator = { 49 | var schedulers = HashMap[String, Scheduler]() 50 | // Create schedulers according to experiment parameters. 51 | schedulerDescs.foreach(schedDesc => { 52 | // If any of the scheduler-workload pairs we're sweeping over 53 | // are for this scheduler, then apply them before 54 | // registering it. 55 | var constantThinkTimes = HashMap[String, Double]( 56 | schedDesc.constantThinkTimes.toSeq: _*) 57 | var perTaskThinkTimes = HashMap[String, Double]( 58 | schedDesc.perTaskThinkTimes.toSeq: _*) 59 | var newBlackListPercent = 0.0 60 | if (schedulerWorkloadsToSweepOver 61 | .contains(schedDesc.name)) { 62 | newBlackListPercent = blackListPercent 63 | schedulerWorkloadsToSweepOver(schedDesc.name) 64 | .foreach(workloadName => { 65 | constantThinkTimes(workloadName) = constantThinkTime 66 | perTaskThinkTimes(workloadName) = perTaskThinkTime 67 | }) 68 | } 69 | schedulers(schedDesc.name) = 70 | new MonolithicScheduler(schedDesc.name, 71 | constantThinkTimes.toMap, 72 | perTaskThinkTimes.toMap, 73 | math.floor(newBlackListPercent * 74 | cellStateDesc.numMachines.toDouble).toInt) 75 | }) 76 | 77 | val cellState = new CellState(cellStateDesc.numMachines, 78 | cellStateDesc.cpusPerMachine, 79 | cellStateDesc.memPerMachine, 80 | conflictMode = "resource-fit", 81 | transactionMode = "all-or-nothing") 82 | 83 | new ClusterSimulator(cellState, 84 | schedulers.toMap, 85 | workloadToSchedulerMap, 86 | workloads, 87 | prefillWorkloads, 88 | logging) 89 | } 90 | } 91 | 92 | class MonolithicScheduler(name: String, 93 | constantThinkTimes: Map[String, Double], 94 | perTaskThinkTimes: Map[String, Double], 95 | numMachinesToBlackList: Double = 0) 96 | extends Scheduler(name, 97 | constantThinkTimes, 98 | perTaskThinkTimes, 99 | numMachinesToBlackList) { 100 | 101 | println("scheduler-id-info: %d, %s, %d, %s, %s" 102 | .format(Thread.currentThread().getId(), 103 | name, 104 | hashCode(), 105 | constantThinkTimes.mkString(";"), 106 | perTaskThinkTimes.mkString(";"))) 107 | 108 | override 109 | def addJob(job: Job) = { 110 | assert(simulator != null, "This scheduler has not been added to a " + 111 | "simulator yet.") 112 | super.addJob(job) 113 | job.lastEnqueued = simulator.currentTime 114 | pendingQueue.enqueue(job) 115 | simulator.log("enqueued job " + job.id) 116 | if (!scheduling) 117 | scheduleNextJobAction() 118 | } 119 | 120 | /** 121 | * Checks to see if there is currently a job in this scheduler's job queue. 122 | * If there is, and this scheduler is not currently scheduling a job, then 123 | * pop that job off of the queue and "begin scheduling it". Scheduling a 124 | * job consists of setting this scheduler's state to scheduling = true, and 125 | * adding a finishSchedulingJobAction to the simulators event queue by 126 | * calling afterDelay(). 127 | */ 128 | def scheduleNextJobAction(): Unit = { 129 | assert(simulator != null, "This scheduler has not been added to a " + 130 | "simulator yet.") 131 | if (!scheduling && !pendingQueue.isEmpty) { 132 | scheduling = true 133 | val job = pendingQueue.dequeue 134 | job.updateTimeInQueueStats(simulator.currentTime) 135 | job.lastSchedulingStartTime = simulator.currentTime 136 | val thinkTime = getThinkTime(job) 137 | simulator.log("getThinkTime returned " + thinkTime) 138 | simulator.afterDelay(thinkTime) { 139 | simulator.log(("Scheduler %s finished scheduling job %d. " + 140 | "Attempting to schedule next job in scheduler's " + 141 | "pendingQueue.").format(name, job.id)) 142 | job.numSchedulingAttempts += 1 143 | job.numTaskSchedulingAttempts += job.unscheduledTasks 144 | val claimDeltas = scheduleJob(job, simulator.cellState) 145 | if(claimDeltas.length > 0) { 146 | simulator.cellState.scheduleEndEvents(claimDeltas) 147 | job.unscheduledTasks -= claimDeltas.length 148 | simulator.log("scheduled %d tasks of job %d's, %d remaining." 149 | .format(claimDeltas.length, job.id, job.unscheduledTasks)) 150 | numSuccessfulTransactions += 1 151 | recordUsefulTimeScheduling(job, 152 | thinkTime, 153 | job.numSchedulingAttempts == 1) 154 | } else { 155 | simulator.log(("No tasks scheduled for job %d (%f cpu %f mem) " + 156 | "during this scheduling attempt, not recording " + 157 | "any busy time. %d unscheduled tasks remaining.") 158 | .format(job.id, 159 | job.cpusPerTask, 160 | job.memPerTask, 161 | job.unscheduledTasks)) 162 | } 163 | var jobEventType = "" // Set this conditionally below; used in logging. 164 | // If the job isn't yet fully scheduled, put it back in the queue. 165 | if (job.unscheduledTasks > 0) { 166 | simulator.log(("Job %s didn't fully schedule, %d / %d tasks remain " + 167 | "(shape: %f cpus, %f mem). Putting it " + 168 | "back in the queue").format(job.id, 169 | job.unscheduledTasks, 170 | job.numTasks, 171 | job.cpusPerTask, 172 | job.memPerTask)) 173 | // Give up on a job if (a) it hasn't scheduled a single task in 174 | // 100 tries or (b) it hasn't finished scheduling after 1000 tries. 175 | if ((job.numSchedulingAttempts > 100 && 176 | job.unscheduledTasks == job.numTasks) || 177 | job.numSchedulingAttempts > 1000) { 178 | println(("Abandoning job %d (%f cpu %f mem) with %d/%d " + 179 | "remaining tasks, after %d scheduling " + 180 | "attempts.").format(job.id, 181 | job.cpusPerTask, 182 | job.memPerTask, 183 | job.unscheduledTasks, 184 | job.numTasks, 185 | job.numSchedulingAttempts)) 186 | numJobsTimedOutScheduling += 1 187 | jobEventType = "abandoned" 188 | } else { 189 | simulator.afterDelay(1) { 190 | addJob(job) 191 | } 192 | } 193 | } else { 194 | // All tasks in job scheduled so don't put it back in pendingQueue. 195 | jobEventType = "fully-scheduled" 196 | } 197 | if (!jobEventType.equals("")) { 198 | // println("%s %s %d %s %d %d %f" 199 | // .format(Thread.currentThread().getId(), 200 | // name, 201 | // hashCode(), 202 | // jobEventType, 203 | // job.id, 204 | // job.numSchedulingAttempts, 205 | // simulator.currentTime - job.submitted)) 206 | } 207 | 208 | scheduling = false 209 | scheduleNextJobAction() 210 | } 211 | simulator.log("Scheduler named '%s' started scheduling job %d " 212 | .format(name,job.id)) 213 | } 214 | } 215 | } 216 | -------------------------------------------------------------------------------- /src/main/scala/OmegaSimulation.scala: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (c) 2013, Regents of the University of California 3 | * All rights reserved. 4 | * 5 | * Redistribution and use in source and binary forms, with or without 6 | * modification, are permitted provided that the following conditions are met: 7 | * 8 | * Redistributions of source code must retain the above copyright notice, this 9 | * list of conditions and the following disclaimer. Redistributions in binary 10 | * form must reproduce the above copyright notice, this list of conditions and the 11 | * following disclaimer in the documentation and/or other materials provided with 12 | * the distribution. Neither the name of the University of California, Berkeley 13 | * nor the names of its contributors may be used to endorse or promote products 14 | * derived from this software without specific prior written permission. THIS 15 | * SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 16 | * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 17 | * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 18 | * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 19 | * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 20 | * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 21 | * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 22 | * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 23 | * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 24 | * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 25 | */ 26 | 27 | package ClusterSchedulingSimulation 28 | 29 | import collection.mutable.HashMap 30 | import collection.mutable.ListBuffer 31 | 32 | class OmegaSimulatorDesc( 33 | val schedulerDescs: Seq[OmegaSchedulerDesc], 34 | runTime: Double, 35 | val conflictMode: String, 36 | val transactionMode: String) 37 | extends ClusterSimulatorDesc(runTime){ 38 | override 39 | def newSimulator(constantThinkTime: Double, 40 | perTaskThinkTime: Double, 41 | blackListPercent: Double, 42 | schedulerWorkloadsToSweepOver: Map[String, Seq[String]], 43 | workloadToSchedulerMap: Map[String, Seq[String]], 44 | cellStateDesc: CellStateDesc, 45 | workloads: Seq[Workload], 46 | prefillWorkloads: Seq[Workload], 47 | logging: Boolean = false): ClusterSimulator = { 48 | assert(blackListPercent >= 0.0 && blackListPercent <= 1.0) 49 | var schedulers = HashMap[String, OmegaScheduler]() 50 | // Create schedulers according to experiment parameters. 51 | println("Creating %d schedulers.".format(schedulerDescs.length)) 52 | schedulerDescs.foreach(schedDesc => { 53 | // If any of the scheduler-workload pairs we're sweeping over 54 | // are for this scheduler, then apply them before 55 | // registering it. 56 | var constantThinkTimes = HashMap[String, Double]( 57 | schedDesc.constantThinkTimes.toSeq: _*) 58 | var perTaskThinkTimes = HashMap[String, Double]( 59 | schedDesc.perTaskThinkTimes.toSeq: _*) 60 | var newBlackListPercent = 0.0 61 | if (schedulerWorkloadsToSweepOver 62 | .contains(schedDesc.name)) { 63 | newBlackListPercent = blackListPercent 64 | schedulerWorkloadsToSweepOver(schedDesc.name) 65 | .foreach(workloadName => { 66 | constantThinkTimes(workloadName) = constantThinkTime 67 | perTaskThinkTimes(workloadName) = perTaskThinkTime 68 | }) 69 | } 70 | println("Creating new scheduler %s".format(schedDesc.name)) 71 | schedulers(schedDesc.name) = 72 | new OmegaScheduler(schedDesc.name, 73 | constantThinkTimes.toMap, 74 | perTaskThinkTimes.toMap, 75 | math.floor(newBlackListPercent * 76 | cellStateDesc.numMachines.toDouble).toInt) 77 | }) 78 | val cellState = new CellState(cellStateDesc.numMachines, 79 | cellStateDesc.cpusPerMachine, 80 | cellStateDesc.memPerMachine, 81 | conflictMode, 82 | transactionMode) 83 | println("Creating new OmegaSimulator with schedulers %s." 84 | .format(schedulers.values.map(_.toString).mkString(", "))) 85 | println("Setting OmegaSimulator(%s, %s)'s common cell state to %d" 86 | .format(conflictMode, 87 | transactionMode, 88 | cellState.hashCode)) 89 | new OmegaSimulator(cellState, 90 | schedulers.toMap, 91 | workloadToSchedulerMap, 92 | workloads, 93 | prefillWorkloads, 94 | logging) 95 | } 96 | } 97 | 98 | /** 99 | * A simple subclass of SchedulerDesc for extensibility to 100 | * for symmetry in the naming of the type so that we don't 101 | * have to use a SchedulerDesc for an OmegaSimulator. 102 | */ 103 | class OmegaSchedulerDesc(name: String, 104 | constantThinkTimes: Map[String, Double], 105 | perTaskThinkTimes: Map[String, Double]) 106 | extends SchedulerDesc(name, 107 | constantThinkTimes, 108 | perTaskThinkTimes) 109 | 110 | class OmegaSimulator(cellState: CellState, 111 | override val schedulers: Map[String, OmegaScheduler], 112 | workloadToSchedulerMap: Map[String, Seq[String]], 113 | workloads: Seq[Workload], 114 | prefillWorkloads: Seq[Workload], 115 | logging: Boolean = false, 116 | monitorUtilization: Boolean = true) 117 | extends ClusterSimulator(cellState, 118 | schedulers, 119 | workloadToSchedulerMap, 120 | workloads, 121 | prefillWorkloads, 122 | logging, 123 | monitorUtilization) { 124 | // Set up a pointer to this simulator in each scheduler. 125 | schedulers.values.foreach(_.omegaSimulator = this) 126 | } 127 | 128 | /** 129 | * While an Omega Scheduler has jobs in its job queue, it: 130 | * 1: Syncs with cell state by getting a new copy of common cell state 131 | * 2: Schedules the next job j in the queue, using getThinkTime(j) seconds 132 | * and assigning creating and applying one delta per task in the job. 133 | * 3: submits the job to CellState 134 | * 4: if any tasks failed to schedule: insert job at back of queue 135 | * 5: rolls back its changes 136 | * 6: repeat, starting at 1 137 | */ 138 | class OmegaScheduler(name: String, 139 | constantThinkTimes: Map[String, Double], 140 | perTaskThinkTimes: Map[String, Double], 141 | numMachinesToBlackList: Double = 0) 142 | extends Scheduler(name, 143 | constantThinkTimes, 144 | perTaskThinkTimes, 145 | numMachinesToBlackList) { 146 | println("scheduler-id-info: %d, %s, %d, %s, %s" 147 | .format(Thread.currentThread().getId(), 148 | name, 149 | hashCode(), 150 | constantThinkTimes.mkString(";"), 151 | perTaskThinkTimes.mkString(";"))) 152 | // TODO(andyk): Clean up these Simulator classes 153 | // by templatizing the Scheduler class and having only 154 | // one simulator of the correct type, instead of one 155 | // simulator for each of the parent and child classes. 156 | var omegaSimulator: OmegaSimulator = null 157 | var privateCellState: CellState = null 158 | 159 | override 160 | def checkRegistered = { 161 | super.checkRegistered 162 | assert(omegaSimulator != null, "This scheduler has not been added to a " + 163 | "simulator yet.") 164 | } 165 | 166 | def incrementDailycounter(counter: HashMap[Int, Int]) = { 167 | val index: Int = math.floor(simulator.currentTime / 86400).toInt 168 | val currCount: Int = counter.getOrElse(index, 0) 169 | counter(index) = currCount + 1 170 | } 171 | 172 | // When a job arrives, start scheduling, or make sure we already are. 173 | override 174 | def addJob(job: Job) = { 175 | assert(simulator != null, "This scheduler has not been added to a " + 176 | "simulator yet.") 177 | 178 | assert(job.unscheduledTasks > 0) 179 | super.addJob(job) 180 | pendingQueue.enqueue(job) 181 | simulator.log("Scheduler %s enqueued job %d of workload type %s." 182 | .format(name, job.id, job.workloadName)) 183 | if (!scheduling) { 184 | omegaSimulator.log("Set %s scheduling to TRUE to schedule job %d." 185 | .format(name, job.id)) 186 | scheduling = true 187 | handleJob(pendingQueue.dequeue) 188 | } 189 | } 190 | 191 | /** 192 | * Schedule job and submit a transaction to common cellstate for 193 | * it. If not all tasks in the job are successfully committed, 194 | * put it back in the pendingQueue to be scheduled again. 195 | */ 196 | def handleJob(job: Job): Unit = { 197 | job.updateTimeInQueueStats(simulator.currentTime) 198 | syncCellState 199 | val jobThinkTime = getThinkTime(job) 200 | omegaSimulator.afterDelay(jobThinkTime) { 201 | job.numSchedulingAttempts += 1 202 | job.numTaskSchedulingAttempts += job.unscheduledTasks 203 | // Schedule the job in private cellstate. 204 | assert(job.unscheduledTasks > 0) 205 | val claimDeltas = scheduleJob(job, privateCellState) 206 | simulator.log(("Job %d (%s) finished %f seconds of scheduling " + 207 | "thinktime; now trying to claim resources for %d " + 208 | "tasks with %f cpus and %f mem each.") 209 | .format(job.id, 210 | job.workloadName, 211 | jobThinkTime, 212 | job.numTasks, 213 | job.cpusPerTask, 214 | job.memPerTask)) 215 | if (claimDeltas.length > 0) { 216 | // Attempt to claim resources in common cellstate by committing 217 | // a transaction. 218 | omegaSimulator.log("Submitting a transaction for %d tasks for job %d." 219 | .format(claimDeltas.length, job.id)) 220 | val commitResult = omegaSimulator.cellState.commit(claimDeltas, true) 221 | job.unscheduledTasks -= commitResult.committedDeltas.length 222 | omegaSimulator.log("%d tasks successfully committed for job %d." 223 | .format(commitResult.committedDeltas.length, job.id)) 224 | numSuccessfulTaskTransactions += commitResult.committedDeltas.length 225 | numFailedTaskTransactions += commitResult.conflictedDeltas.length 226 | if (job.numSchedulingAttempts > 1) 227 | numRetriedTransactions += 1 228 | 229 | // Record job-level stats. 230 | if (commitResult.conflictedDeltas.length == 0) { 231 | numSuccessfulTransactions += 1 232 | incrementDailycounter(dailySuccessTransactions) 233 | recordUsefulTimeScheduling(job, 234 | jobThinkTime, 235 | job.numSchedulingAttempts == 1) 236 | } else { 237 | numFailedTransactions += 1 238 | incrementDailycounter(dailyFailedTransactions) 239 | // omegaSimulator.log("adding %f seconds to wastedThinkTime counter." 240 | // .format(jobThinkTime)) 241 | recordWastedTimeScheduling(job, 242 | jobThinkTime, 243 | job.numSchedulingAttempts == 1) 244 | // omegaSimulator.log(("Transaction task CONFLICTED for job-%d on " + 245 | // "machines %s.") 246 | // .format(job.id, 247 | // commitResult.conflictedDeltas.map(_.machineID) 248 | // .mkString(", "))) 249 | } 250 | } else { 251 | simulator.log(("Not enough resources of the right shape were " + 252 | "available to schedule even one task of job %d, " + 253 | "so not submitting a transaction.").format(job.id)) 254 | numNoResourcesFoundSchedulingAttempts += 1 255 | } 256 | 257 | var jobEventType = "" // Set this conditionally below; used in logging. 258 | // If the job isn't yet fully scheduled, put it back in the queue. 259 | if (job.unscheduledTasks > 0) { 260 | // Give up on a job if (a) it hasn't scheduled a single task in 261 | // 100 tries or (b) it hasn't finished scheduling after 1000 tries. 262 | if ((job.numSchedulingAttempts > 100 && 263 | job.unscheduledTasks == job.numTasks) || 264 | job.numSchedulingAttempts > 1000) { 265 | println(("Abandoning job %d (%f cpu %f mem) with %d/%d " + 266 | "remaining tasks, after %d scheduling " + 267 | "attempts.").format(job.id, 268 | job.cpusPerTask, 269 | job.memPerTask, 270 | job.unscheduledTasks, 271 | job.numTasks, 272 | job.numSchedulingAttempts)) 273 | numJobsTimedOutScheduling += 1 274 | jobEventType = "abandoned" 275 | } else { 276 | simulator.log(("Job %d still has %d unscheduled tasks, adding it " + 277 | "back to scheduler %s's job queue.") 278 | .format(job.id, job.unscheduledTasks, name)) 279 | simulator.afterDelay(1) { 280 | addJob(job) 281 | } 282 | } 283 | } else { 284 | // All tasks in job scheduled so don't put it back in pendingQueue. 285 | jobEventType = "fully-scheduled" 286 | } 287 | if (!jobEventType.equals("")) { 288 | // println("%s %s %d %s %d %d %f" 289 | // .format(Thread.currentThread().getId(), 290 | // name, 291 | // hashCode(), 292 | // jobEventType, 293 | // job.id, 294 | // job.numSchedulingAttempts, 295 | // simulator.currentTime - job.submitted)) 296 | } 297 | 298 | omegaSimulator.log("Set " + name + " scheduling to FALSE") 299 | scheduling = false 300 | // Keep trying to schedule as long as we have jobs in the queue. 301 | if (!pendingQueue.isEmpty) { 302 | scheduling = true 303 | handleJob(pendingQueue.dequeue) 304 | } 305 | } 306 | } 307 | 308 | def syncCellState { 309 | checkRegistered 310 | privateCellState = omegaSimulator.cellState.copy 311 | simulator.log("%s synced private cellstate.".format(name)) 312 | // println("Scheduler %s (%d) has new private cell state %d" 313 | // .format(name, hashCode, privateCellState.hashCode)) 314 | } 315 | } 316 | -------------------------------------------------------------------------------- /src/main/scala/ParseParm.scala: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (c) 2013, Regents of the University of California 3 | * All rights reserved. 4 | * 5 | * Redistribution and use in source and binary forms, with or without 6 | * modification, are permitted provided that the following conditions are met: 7 | * 8 | * Redistributions of source code must retain the above copyright notice, this 9 | * list of conditions and the following disclaimer. Redistributions in binary 10 | * form must reproduce the above copyright notice, this list of conditions and the 11 | * following disclaimer in the documentation and/or other materials provided with 12 | * the distribution. Neither the name of the University of California, Berkeley 13 | * nor the names of its contributors may be used to endorse or promote products 14 | * derived from this software without specific prior written permission. THIS 15 | * SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 16 | * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 17 | * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 18 | * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 19 | * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 20 | * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 21 | * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 22 | * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 23 | * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 24 | * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 25 | */ 26 | 27 | /* 28 | * ParseParms.scala 29 | * 30 | * Copied from: http://code.google.com/p/parse-cmd/wiki/AScalaParserClass 31 | * 32 | * ParseParms is an implementation of a command-line argument parser in Scala. 33 | * 34 | * It allows you to define: 35 | * 36 | * A Help String 37 | * Parameter entries each including: 38 | * name 39 | * default value 40 | * regular expression used for validation; defaults are used if not stated 41 | * tag to indicate a required parameter; defaults to not-required 42 | * A validate method to test passed arguments against defined parameters 43 | * 44 | * A validate method parses the arguments and returns a Scala tuple object 45 | * of the form (Boolean, String, Map) 46 | * 47 | * The Boolean indicates success and Map contains merged values: e.g. 48 | * supplied command-line arguments compared and merged agains defined parms. 49 | * 50 | * Failure, false, includes an error message in String; Map is empty 51 | * 52 | * The String includes an error message indicating missing required parms 53 | * and/or incorrect values: failed regular expression test 54 | * 55 | * The Map object contains the merged arguments and parameter default values 56 | * 57 | * Usage example is included below under Main 58 | * 59 | * jf.zarama at gmail dot com 60 | * 61 | * 2009.07.24 62 | */ 63 | 64 | package ca.zmatrix.utils 65 | 66 | class ParseParms(val help: String) { 67 | 68 | private var parms = Map[String,(String,String,Boolean)]() 69 | private var cache: Option[String] = None // save parm name across calls 70 | // used by req and rex methods 71 | def parm(name: String) = { 72 | parms += name -> ("", "^.*$", false ) ;cache = Some(name) 73 | this 74 | } 75 | 76 | def parm(name: String, default: String) = { 77 | parms += name -> (default, defRex(default), false); cache = Some(name) 78 | this 79 | } 80 | 81 | def parm(name: String, default: String, rex: String) = { 82 | parms += name -> (default, rex, false); cache = Some(name) 83 | this 84 | } 85 | 86 | def parm(name: String, default: String, rex: String, req: Boolean) = { 87 | parms += name -> (default, rex, req); cache = Some(name) 88 | this 89 | } 90 | 91 | def parm(name: String, default: String, req: Boolean) = { 92 | parms += name -> (default, defRex(default), req); cache = Some(name) 93 | this 94 | } 95 | 96 | def req(value: Boolean) = { // update required flag 97 | val k = checkName // for current parameter name 98 | if( k.length > 0 ) { // stored in cache 99 | val pvalue = parms(k) // parmeter tuple value 100 | val ntuple = (pvalue._1,pvalue._2,value) // new tuple 101 | parms += cache.get -> ntuple // update entry in parms 102 | } // .parm("-p1","1").req(true) 103 | this // enables chained calls 104 | } 105 | 106 | def rex(value: String) = { // update regular-expression 107 | val k = checkName // for current name 108 | if( k.length > 0 ) { // stored in cache 109 | val pvalue = parms(k) // parameter tuple value 110 | val ntuple = (pvalue._1,value,pvalue._3) // new tuple 111 | parms += cache.get -> ntuple // update tuple for key in parms 112 | } // .parm("-p1","1").rex(".+") 113 | this // enables chained calls 114 | } 115 | 116 | private def checkName = { // checks name stored in cache 117 | cache match { // to be a parm-name used for 118 | case Some(key) => key // req and rex methods 119 | case _ => "" // req & rex will not update 120 | } // entries if cache other than 121 | } // Some(key) 122 | 123 | private def defRex(default: String): String = { 124 | if( default.matches("^\\d+$") ) "^\\d+$" else "^.*$" 125 | } 126 | 127 | private def genMap(args: List[String] ) = { // return a Map of args 128 | var argsMap = Map[String,String]() // result object 129 | if( ( args.length % 2 ) != 0 ) argsMap // must have pairs: -name value 130 | else { // to return a valid Map 131 | for( i <- 0.until(args.length,2) ){ // iterate through args by 2 132 | argsMap += args(i) -> args(i+1) // add -name value pair 133 | } 134 | argsMap // return -name value Map 135 | } 136 | } 137 | 138 | private def testRequired( args: Map[String,String] ) = { 139 | val ParmsNotSupplied = new collection.mutable.ListBuffer[String] 140 | for{ (key,value) <- parms // iterate trough parms 141 | if value._3 // if parm is required 142 | if !args.contains(key) // and it is not in args 143 | } ParmsNotSupplied += key // add it to List 144 | ParmsNotSupplied.toList // empty: all required present 145 | } 146 | 147 | private def validParms( args: Map[String,String] ) = { 148 | val invalidParms = new collection.mutable.ListBuffer[String] 149 | for{ (key,value) <- args // iterate through args 150 | if parms.contains(key) // if it is a defined parm 151 | rex = parms(key)._2 // parm defined rex 152 | if !value.matches(rex) // if regex does not match 153 | } invalidParms += key // add invalid arg 154 | invalidParms.toList // empty: all parms valid 155 | } 156 | 157 | private def mergeParms( args: Map[String,String] ) = { 158 | //val mergedMap = collection.mutable.Map[String,String]() 159 | var mergedMap = Map[String,String]() // name value Map of results 160 | for{ (key,value) <- parms // iterate through parms 161 | //mValue = if( args.contains(key) ) args(key) else value(0) 162 | mValue = args.getOrElse(key,value._1) // args(key) or default 163 | } mergedMap += key -> mValue // update result Map 164 | mergedMap // return mergedMap 165 | } 166 | 167 | private def mkString(l1: List[String],l2: List[String]) = { 168 | "\nhelp: " + help + "\n\trequired parms missing: " + 169 | ( if( !l1.isEmpty ) l1.mkString(" ") else "" ) + 170 | ( if( !l2.isEmpty ) "\n\tinvalid parms: " + 171 | l2.mkString(" ") + "\n" else "" ) 172 | } 173 | 174 | def validate( args: List[String] ) = { // validate args to parms 175 | val argsMap = genMap( args ) // Map of args: -name value 176 | val reqList = testRequired( argsMap ) // List of missing required 177 | val validList = validParms( argsMap ) // List of (in)valid args 178 | if( reqList.isEmpty && validList.isEmpty ) {// successful return 179 | (true,"",mergeParms( argsMap )) // true, "", mergedParms 180 | } else (false,mkString(reqList,validList),Map[String,String]()) 181 | } 182 | } 183 | 184 | // object Main { 185 | // 186 | // /** 187 | // * @param args the command line arguments 188 | // */ 189 | // def main(args: Array[String]) = { 190 | // val helpString = " -p1 out.txt -p2 22 [ -p3 100 -p4 1200 ] " 191 | // val pp = new ParseParms( helpString ) 192 | // pp.parm("-p1", "output.txt").rex("^.*\\.txt$").req(true) // required 193 | // .parm("-p2", "22","^\\d{2}$",true) // alternate form, required 194 | // .parm("-p3","100").rex("^\\d{3}$") // optional 195 | // .parm("-p4","1200").rex("^\\d{4}$").req(false) // optional 196 | // 197 | // val result = pp.validate( args.toList ) 198 | // println( if( result._1 ) result._3 else result._2 ) 199 | // // result is a tuple (Boolean, String, Map) 200 | // // ._1 Boolean; false: error String contained in ._2, Map in ._3 is empty 201 | // // true: successful, Map of parsed & merged parms in ._3 202 | // 203 | // System.exit(0) 204 | // } 205 | // 206 | // } 207 | 208 | -------------------------------------------------------------------------------- /src/main/scala/Util.scala: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (c) 2013, Regents of the University of California 3 | * All rights reserved. 4 | * 5 | * Redistribution and use in source and binary forms, with or without 6 | * modification, are permitted provided that the following conditions are met: 7 | * 8 | * Redistributions of source code must retain the above copyright notice, this 9 | * list of conditions and the following disclaimer. Redistributions in binary 10 | * form must reproduce the above copyright notice, this list of conditions and the 11 | * following disclaimer in the documentation and/or other materials provided with 12 | * the distribution. Neither the name of the University of California, Berkeley 13 | * nor the names of its contributors may be used to endorse or promote products 14 | * derived from this software without specific prior written permission. THIS 15 | * SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 16 | * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 17 | * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 18 | * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 19 | * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 20 | * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 21 | * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 22 | * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 23 | * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 24 | * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 25 | */ 26 | 27 | package ClusterSchedulingSimulation 28 | 29 | object Seed { 30 | private var seed: Long = 0 31 | def set(newSeed: Long) = {seed = newSeed} 32 | def apply() = seed 33 | } 34 | -------------------------------------------------------------------------------- /src/main/scala/Workloads.scala: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (c) 2013, Regents of the University of California 3 | * All rights reserved. 4 | * 5 | * Redistribution and use in source and binary forms, with or without 6 | * modification, are permitted provided that the following conditions are met: 7 | * 8 | * Redistributions of source code must retain the above copyright notice, this 9 | * list of conditions and the following disclaimer. Redistributions in binary 10 | * form must reproduce the above copyright notice, this list of conditions and the 11 | * following disclaimer in the documentation and/or other materials provided with 12 | * the distribution. Neither the name of the University of California, Berkeley 13 | * nor the names of its contributors may be used to endorse or promote products 14 | * derived from this software without specific prior written permission. THIS 15 | * SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 16 | * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 17 | * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 18 | * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 19 | * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 20 | * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 21 | * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 22 | * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 23 | * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 24 | * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 25 | */ 26 | 27 | package ClusterSchedulingSimulation 28 | 29 | import java.io.File 30 | 31 | /** 32 | * Set up workloads based on measurements from a real cluster. 33 | * In the Eurosys paper, we used measurements from Google clusters here. 34 | */ 35 | object Workloads { 36 | /** 37 | * Set up CellStateDescs that will go into WorkloadDescs. Fabricated 38 | * numbers are provided as an example. Enter numbers based on your 39 | * own clusters instead. 40 | */ 41 | val exampleCellStateDesc = new CellStateDesc(numMachines = 10000, 42 | cpusPerMachine = 4, 43 | memPerMachine = 16) 44 | 45 | 46 | /** 47 | * Set up WorkloadDescs, containing generators of workloads and 48 | * pre-fill workloads based on measurements of cells/workloads. 49 | */ 50 | val exampleWorkloadGeneratorBatch = 51 | new ExpExpExpWorkloadGenerator(workloadName = "Batch".intern(), 52 | initAvgJobInterarrivalTime = 10.0, 53 | avgTasksPerJob = 100.0, 54 | avgJobDuration = (100.0), 55 | avgCpusPerTask = 1.0, 56 | avgMemPerTask = 2.0) 57 | val exampleWorkloadGeneratorService = 58 | new ExpExpExpWorkloadGenerator(workloadName = "Service".intern(), 59 | initAvgJobInterarrivalTime = 20.0, 60 | avgTasksPerJob = 10.0, 61 | avgJobDuration = (500.0), 62 | avgCpusPerTask = 1.0, 63 | avgMemPerTask = 2.0) 64 | val exampleWorkloadDesc = WorkloadDesc(cell = "example", 65 | assignmentPolicy = "CMB_PBB", 66 | workloadGenerators = 67 | exampleWorkloadGeneratorBatch :: 68 | exampleWorkloadGeneratorService :: Nil, 69 | cellStateDesc = exampleCellStateDesc) 70 | 71 | 72 | // example pre-fill workload generators. 73 | val examplePrefillTraceFileName = "traces/example-init-cluster-state.log" 74 | assert((new File(examplePrefillTraceFileName)).exists()) 75 | val exampleBatchPrefillTraceWLGenerator = 76 | new PrefillPbbTraceWorkloadGenerator("PrefillBatch", 77 | examplePrefillTraceFileName) 78 | val exampleServicePrefillTraceWLGenerator = 79 | new PrefillPbbTraceWorkloadGenerator("PrefillService", 80 | examplePrefillTraceFileName) 81 | val exampleBatchServicePrefillTraceWLGenerator = 82 | new PrefillPbbTraceWorkloadGenerator("PrefillBatchService", 83 | examplePrefillTraceFileName) 84 | 85 | val exampleWorkloadPrefillDesc = 86 | WorkloadDesc(cell = "example", 87 | assignmentPolicy = "CMB_PBB", 88 | workloadGenerators = 89 | exampleWorkloadGeneratorBatch :: 90 | exampleWorkloadGeneratorService :: 91 | Nil, 92 | cellStateDesc = exampleCellStateDesc, 93 | prefillWorkloadGenerators = 94 | List(exampleBatchServicePrefillTraceWLGenerator)) 95 | 96 | 97 | // Set up example workload with jobs that have interarrival times 98 | // from trace-based interarrival times. 99 | val exampleInterarrivalTraceFileName = "traces/job-distribution-traces/" + 100 | "example_interarrival_cmb.log" 101 | val exampleNumTasksTraceFileName = "traces/job-distribution-traces/" + 102 | "example_csizes_cmb.log" 103 | val exampleJobDurationTraceFileName = "traces/job-distribution-traces/" + 104 | "example_runtimes_cmb.log" 105 | assert((new File(exampleInterarrivalTraceFileName)).exists()) 106 | assert((new File(exampleNumTasksTraceFileName)).exists()) 107 | assert((new File(exampleJobDurationTraceFileName)).exists()) 108 | 109 | // A workload based on traces of interarrival times, tasks-per-job, 110 | // and job duration. Task shapes now based on pre-fill traces. 111 | val exampleWorkloadGeneratorTraceAllBatch = 112 | new TraceAllWLGenerator( 113 | "Batch".intern(), 114 | exampleInterarrivalTraceFileName, 115 | exampleNumTasksTraceFileName, 116 | exampleJobDurationTraceFileName, 117 | examplePrefillTraceFileName, 118 | maxCpusPerTask = 3.9, // Machines in example cluster have 4 CPUs. 119 | maxMemPerTask = 15.9) // Machines in example cluster have 16GB mem. 120 | 121 | val exampleWorkloadGeneratorTraceAllService = 122 | new TraceAllWLGenerator( 123 | "Service".intern(), 124 | exampleInterarrivalTraceFileName, 125 | exampleNumTasksTraceFileName, 126 | exampleJobDurationTraceFileName, 127 | examplePrefillTraceFileName, 128 | maxCpusPerTask = 3.9, 129 | maxMemPerTask = 15.9) 130 | 131 | val exampleTraceAllWorkloadPrefillDesc = 132 | WorkloadDesc(cell = "example", 133 | assignmentPolicy = "CMB_PBB", 134 | workloadGenerators = 135 | exampleWorkloadGeneratorTraceAllBatch :: 136 | exampleWorkloadGeneratorTraceAllService :: 137 | Nil, 138 | cellStateDesc = exampleCellStateDesc, 139 | prefillWorkloadGenerators = 140 | List(exampleBatchServicePrefillTraceWLGenerator)) 141 | } 142 | -------------------------------------------------------------------------------- /src/test/scala/TestSimulations.scala: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (c) 2013, Regents of the University of California 3 | * All rights reserved. 4 | * 5 | * Redistribution and use in source and binary forms, with or without 6 | * modification, are permitted provided that the following conditions are met: 7 | * 8 | * Redistributions of source code must retain the above copyright notice, this 9 | * list of conditions and the following disclaimer. Redistributions in binary 10 | * form must reproduce the above copyright notice, this list of conditions and the 11 | * following disclaimer in the documentation and/or other materials provided with 12 | * the distribution. Neither the name of the University of California, Berkeley 13 | * nor the names of its contributors may be used to endorse or promote products 14 | * derived from this software without specific prior written permission. THIS 15 | * SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY 16 | * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 17 | * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 18 | * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 19 | * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 20 | * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 21 | * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 22 | * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 23 | * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 24 | * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 25 | */ 26 | 27 | import org.scalatest.FunSuite 28 | 29 | import ClusterSchedulingSimulation.Workload 30 | import ClusterSchedulingSimulation.WorkloadDesc 31 | import ClusterSchedulingSimulation.Job 32 | import ClusterSchedulingSimulation.UniformWorkloadGenerator 33 | import ClusterSchedulingSimulation.CellState 34 | 35 | import ClusterSchedulingSimulation.ClusterSimulator 36 | import ClusterSchedulingSimulation.MonolithicScheduler 37 | 38 | import ClusterSchedulingSimulation.MesosSimulator 39 | import ClusterSchedulingSimulation.MesosScheduler 40 | import ClusterSchedulingSimulation.MesosAllocator 41 | 42 | import ClusterSchedulingSimulation.ClaimDelta 43 | import ClusterSchedulingSimulation.OmegaSimulator 44 | import ClusterSchedulingSimulation.OmegaScheduler 45 | 46 | import ClusterSchedulingSimulation.PrefillPbbTraceWorkloadGenerator 47 | import ClusterSchedulingSimulation.InterarrivalTimeTraceExpExpWLGenerator 48 | 49 | import collection.mutable.HashMap 50 | import collection.mutable.ListBuffer 51 | import sys.process._ 52 | 53 | class SimulatorsTestSuite extends FunSuite { 54 | /** 55 | * Monolithic simulator tests. 56 | */ 57 | test("MonolithicSimulatorTest") { 58 | println("\n\n\n=====================") 59 | println("Testing Monolithic simulator functionality.") 60 | println("=====================\n\n") 61 | // Build a workload manually. 62 | var workload = new Workload("unif") 63 | val numJobs = 4 // Don't change this unless you update the 64 | // hand calculations used in assert()-s below. 65 | 66 | (1 to numJobs).foreach(i => { 67 | workload.addJob(new Job(id = i, 68 | submitted = 0, 69 | numTasks = i, 70 | taskDuration = i, 71 | workloadName = workload.name, 72 | cpusPerTask = 1.0, 73 | memPerTask = 1.0)) 74 | }) 75 | assert(workload.numJobs == numJobs) 76 | assert(workload.getJobs.last.id == numJobs) 77 | 78 | // Create a simple scheduler. 79 | val scheduler = new MonolithicScheduler("simple_sched", 80 | Map("unif" -> 1), 81 | Map("unif" -> 1)) 82 | 83 | // Set up a CellState, conflictMode and transactionMode shouldn't 84 | // matter for a Monolithic Simulator. 85 | val monolithicCellState = new CellState(1, 10.0, 20.0, 86 | conflictMode = "sequence-numbers", 87 | transactionMode = "all-or-nothing") 88 | 89 | val monolithicSimulator = new ClusterSimulator( 90 | monolithicCellState, 91 | Map(scheduler.name -> scheduler), 92 | Map("unif" -> Seq("simple_sched")), 93 | List(workload), 94 | List(), 95 | logging = true, 96 | monitorUtilization = false) 97 | 98 | assert(monolithicSimulator.schedulers.size == 1) 99 | assert(monolithicSimulator.workloadToSchedulerMap.size == 1) 100 | assert(monolithicSimulator.agendaSize == numJobs) 101 | 102 | // Test that run() empties the Simulator's priority queue of WorkItem-s 103 | monolithicSimulator.run() 104 | println(monolithicSimulator.currentTime) 105 | println(numJobs + (1 to numJobs).sum) 106 | assert(monolithicSimulator.agendaSize == 0) 107 | // Last task finishes scheduling at (numJobs + (1 to numJobs).sum), 108 | // and runs for 3 seconds (since jobs are numbered 0 to 3 and each job has 109 | // task duration set to its id number). 110 | assert(monolithicSimulator.currentTime == 111 | numJobs + (1 to numJobs).sum + numJobs - 1) 112 | } 113 | 114 | test("testStats") { 115 | var workload = new Workload("unif") 116 | val numJobs = 4 // Don't change this unless you update the 117 | // hand calculations used in assert()-s below. 118 | 119 | workload = new Workload("unif") 120 | (1 to numJobs).foreach(i => { 121 | workload.addJob(Job(id = i, 122 | submitted = i, 123 | numTasks = i, 124 | taskDuration = i, 125 | workloadName = workload.name, 126 | cpusPerTask = 1.0, 127 | memPerTask = 1.0)) 128 | }) 129 | 130 | // Create a simple scheduler. 131 | val scheduler = new MonolithicScheduler("simple_sched", 132 | Map("unif" -> 1), 133 | Map("unif" -> 1)) 134 | 135 | // Set up a CellState 136 | val monolithicCellState = new CellState(1, 10.0, 20.0, 137 | conflictMode = "sequence-numbers", 138 | transactionMode = "all-or-nothing") 139 | 140 | val monolithicSimulator = new ClusterSimulator( 141 | monolithicCellState, 142 | Map(scheduler.name -> scheduler), 143 | Map("unif" -> Seq("simple_sched")), 144 | List(workload), 145 | List(), 146 | logging = true, 147 | monitorUtilization = false) 148 | 149 | monolithicSimulator.run() 150 | 151 | // Test the workload stats. 152 | workload.getJobs.foreach(job => { 153 | println(job.usefulTimeScheduling) 154 | assert(job.usefulTimeScheduling == 1 + job.id * 1) 155 | }) 156 | println(workload.jobUsefulThinkTimesPercentile(0.9)) 157 | assert(workload.jobUsefulThinkTimesPercentile(0.9) == ((numJobs + 1) * 0.9).toInt) 158 | // Job queue times: 159 | // job 1 arrives 1, thinktime 2, finishes at 3, queued 0 160 | // job 2 arrives 2, thinktime 3, finishes at 6, queued 1 161 | // job 3 arrives 3, thinktime 4, finishes at 10, queued 3 162 | // job 4 arrives 4, thinktime 5, finishes at 15 , queued 6 163 | assert(workload.avgJobQueueTimeTillFirstScheduled == 164 | (0.0 + 1.0 + 3.0 + 6.0)/4.0) 165 | println(workload.jobQueueTimeTillFirstScheduledPercentile(0.9)) 166 | val array = Array[Double](0.0, 1.0, 3.0, 6.0) 167 | assert(workload.jobQueueTimeTillFirstScheduledPercentile(0.9) == 168 | array((3 *0.9).toInt)) 169 | } 170 | 171 | /** 172 | * Mesos simulator tests. 173 | */ 174 | 175 | // The following test exercises functionality that is not yet implemented, 176 | // so we currently expect it to fail. 177 | test("mesosSimulatorSingleSchedulerZeroResourceJobsTest") { 178 | println("\n\n\n=====================") 179 | println("Testing Mesos simulator functionality.") 180 | println("=====================\n\n") 181 | 182 | var workload = new Workload("unif") 183 | val numJobs = 40 184 | (1 to numJobs).foreach(i => { 185 | workload.addJob(Job(id = i, 186 | submitted = i, 187 | numTasks = i, 188 | taskDuration = i, 189 | workloadName = workload.name, 190 | cpusPerTask = 1.0, 191 | memPerTask = 1.0)) 192 | }) 193 | 194 | // Create a simple scheduler, turn off partial job scheduling 195 | // So that we can check our think time calculations by just 196 | // assuming that each job only has to be scheduled once. 197 | // An alternative to this would be to just increase the size 198 | // of the test cell (e.g., to 1000cpus and 1000mem), 199 | // which would allow all of the jobs to fit simultaneously. 200 | val scheduler = new MesosScheduler(name = "mesos_test_sched", 201 | constantThinkTimes = Map("unif" -> 1), 202 | perTaskThinkTimes = Map("unif" -> 1), 203 | schedulePartialJobs = false) 204 | 205 | // Set up a CellState with plenty of space so that no jobs 206 | val mesosCellState = new CellState(1, 100.0, 200.0, 207 | conflictMode = "resource-fit", 208 | transactionMode = "all-or-nothing") 209 | 210 | // Create a round-robin allocator 211 | val allocatorConstantThinkTime = 1.0 212 | val mesosDRFAllocator = new MesosAllocator(allocatorConstantThinkTime) 213 | 214 | val mesosSimulator = new MesosSimulator( 215 | mesosCellState, 216 | Map(scheduler.name -> scheduler), 217 | Map("unif" -> Seq("mesos_test_sched")), 218 | List(workload), 219 | List(), 220 | mesosDRFAllocator, 221 | logging = true, 222 | monitorUtilization = false) 223 | 224 | mesosSimulator.run() 225 | assert(mesosSimulator.agendaSize == 0, ("Mesos Agenda should have been " + 226 | "zero when simulator finished " + 227 | "running, but was %d.") 228 | .format(mesosSimulator.agendaSize)) 229 | // Each job has constant think time of 1, and per task think of 1. 230 | // Thus, job i (which has i tasks) will think for 1 + i seconds. 231 | // So we have numJobs seconds from the constant term of each job, and 232 | // and 1 + 2 + ... + i from the i term. 233 | assert(workload.totalJobUsefulThinkTimes == numJobs + (1 to numJobs).sum, 234 | ("totalJobThinkTimes should have been %d, but was %f. THIS TEST " + 235 | "EXERCISES FUNCTIONALITY THAT IS NOT YET IMPLEMENTED, SO WE " + 236 | "CURRENTLY EXPECT IT TO FAIL").format( 237 | numJobs + (1 to numJobs).sum, workload.totalJobUsefulThinkTimes)) 238 | // For .75 percentile we should see 1 + i (see comment above) 239 | // where i = 40 * .75. 240 | assert(workload.jobUsefulThinkTimesPercentile(.75) == 1 + (40 * .75).toInt, 241 | ("Expected jobUsefulThinkTimesPercentil(0.75) to be %d, " + 242 | "but it was %f.") 243 | .format(1 + (40 * .75).toInt, 244 | workload.jobUsefulThinkTimesPercentile(.75))) 245 | } 246 | 247 | test("MesosAllocatorTest") { 248 | val testAllocator = new MesosAllocator(12) 249 | assert(testAllocator.getThinkTime == 12) 250 | } 251 | 252 | /** 253 | * Omega simulator tests. 254 | */ 255 | test("omegaSimulatorCellStateSyncApplyDeltaAndCommitTest") { 256 | println("\n\n\n=====================") 257 | println("omegaSimulatorCellStateSyncApplyDeltaAndCommitTest") 258 | println("=====================\n\n") 259 | println("\nRunning cellstate functionality test.") 260 | // Set up a workload with one job with one task. 261 | var workload = new Workload("unif") 262 | workload.addJob(Job(id = 1, 263 | submitted = 1.0, 264 | numTasks = 1, 265 | taskDuration = 10.0, 266 | workloadName = workload.name, 267 | cpusPerTask = 1.0, 268 | memPerTask = 1.0)) 269 | 270 | // Create an Omega scheduler. 271 | val scheduler = new OmegaScheduler(name = "omega_test_sched", 272 | constantThinkTimes = Map("unif" -> 1), 273 | perTaskThinkTimes = Map("unif" -> 1)) 274 | 275 | // Set up a CellState. 276 | val commonCellState = new CellState(numMachines = 10, 277 | cpusPerMachine = 1.0, 278 | memPerMachine = 2.0, 279 | conflictMode = "sequence-numbers", 280 | transactionMode = "all-or-nothing") 281 | 282 | // Set up a Simulator. 283 | val omegaSimulator = new OmegaSimulator( 284 | commonCellState, 285 | Map(scheduler.name -> scheduler), 286 | Map("unif" -> Seq("omega_test_sched")), 287 | List(workload), 288 | List(), 289 | logging = true, 290 | monitorUtilization = false) 291 | 292 | // Create a private copy of cellstate. 293 | val privateCellState = commonCellState.copy 294 | assert(privateCellState.numMachines == commonCellState.numMachines) 295 | assert(privateCellState.cpusPerMachine == commonCellState.cpusPerMachine) 296 | assert(privateCellState.memPerMachine == commonCellState.memPerMachine) 297 | // Test that the per machine state was successfully copied. 298 | (0 to commonCellState.allocatedCpusPerMachine.length - 1).foreach{ i => { 299 | assert(privateCellState.allocatedCpusPerMachine(i) == 300 | commonCellState.allocatedCpusPerMachine(i)) 301 | }} 302 | assert(privateCellState.machineSeqNums(0) == 0) 303 | // Make changes to the private cellstate by creating and applying a delta. 304 | val claimDelta = new ClaimDelta(scheduler, 305 | machineID = 0, 306 | privateCellState.machineSeqNums(0), 307 | duration = 10, 308 | cpus = 0.25, 309 | mem = 0.75) 310 | claimDelta.apply(privateCellState, false) 311 | // Check that machines sequence number was incremented in private cellstate. 312 | assert(privateCellState.machineSeqNums(0) == 1) 313 | // Check that changes to private cellstate stuck. 314 | assert(privateCellState.availableCpusPerMachine(0) == 1.0 - 0.25) 315 | assert(privateCellState.availableMemPerMachine(0) == 2.0 - 0.75) 316 | assert(privateCellState.allocatedCpusPerMachine(0) == 0.25) 317 | assert(privateCellState.allocatedMemPerMachine(0) == 0.75) 318 | // Check that common cellstate didn't change yet. 319 | assert(commonCellState.availableCpusPerMachine(0) == 1.0, 320 | ("commonCellState should have 1.0 cpus available on machine 0, " + 321 | "but only has %f.").format(commonCellState.availableCpusPerMachine(0))) 322 | assert(commonCellState.availableMemPerMachine(0) == 2.0) 323 | assert(commonCellState.allocatedCpusPerMachine(0) == 0.0) 324 | assert(commonCellState.allocatedMemPerMachine(0) == 0.0) 325 | 326 | // Commit the changes back to common cellstate. 327 | commonCellState.commit(Seq(claimDelta)) 328 | // Check that changes to common cellstate stuck. 329 | assert(commonCellState.availableCpusPerMachine(0) == 1.0 - 0.25) 330 | assert(commonCellState.availableMemPerMachine(0) == 2.0 - 0.75) 331 | assert(commonCellState.allocatedCpusPerMachine(0) == 0.25) 332 | assert(commonCellState.allocatedMemPerMachine(0) == 0.75) 333 | assert(commonCellState.machineSeqNums(0) == 1) 334 | 335 | // Set up two new private cellstates. 336 | val privateCellState1 = commonCellState.copy 337 | val privateCellState2 = commonCellState.copy 338 | assert(privateCellState1.machineSeqNums(0) == 1) 339 | assert(privateCellState2.machineSeqNums(0) == 1) 340 | 341 | // Make parallel changes in both private cellstates that 342 | // should cause a conflict in all-or-nothing conflict-mode. 343 | val claimDelta1 = new ClaimDelta(scheduler, 344 | machineID = 0, 345 | privateCellState1.machineSeqNums(0), 346 | duration = 10, 347 | cpus = 0.25, 348 | mem = 0.75) 349 | claimDelta1.apply(privateCellState1, false) 350 | assert(privateCellState1.machineSeqNums(0) == 2) 351 | 352 | // Check that the other private cellstate didn't change yet. 353 | assert(privateCellState2.availableCpusPerMachine(0) == 1.0 - 0.25) 354 | assert(privateCellState2.availableMemPerMachine(0) == 2.0 - 0.75) 355 | assert(privateCellState2.allocatedCpusPerMachine(0) == 0.25) 356 | assert(privateCellState2.allocatedMemPerMachine(0) == 0.75) 357 | assert(privateCellState2.machineSeqNums(0) == 1) 358 | 359 | val claimDelta2 = new ClaimDelta(scheduler, 360 | machineID = 0, 361 | privateCellState2.machineSeqNums(0), 362 | duration = 10, 363 | cpus = 0.25, 364 | mem = 0.75) 365 | claimDelta2.apply(privateCellState2, false) 366 | 367 | // Commit the changes from the first private cellstate to common cellstate. 368 | assert(commonCellState.commit(Seq(claimDelta1)).conflictedDeltas.length == 0) 369 | // Commit the changes from the second private cellstate and check 370 | // that it conflicts and doesn't change common cellstate. 371 | assert(commonCellState.commit(Seq(claimDelta2)).conflictedDeltas.length > 0) 372 | assert(commonCellState.availableCpusPerMachine(0) == 1.0 - 2 * 0.25) 373 | assert(commonCellState.availableMemPerMachine(0) == 2.0 - 2 * 0.75) 374 | assert(commonCellState.allocatedCpusPerMachine(0) == 2 * 0.25) 375 | assert(commonCellState.allocatedMemPerMachine(0) == 2 * 0.75) 376 | assert(commonCellState.machineSeqNums(0) == 2) 377 | } 378 | 379 | test("omegaSchedulerTest") { 380 | println("===========\nomegaSchedulerTest\n==========") 381 | println("\nRunning cellstate flow test.") 382 | var workload = new Workload("unif") 383 | workload.addJob(Job(id = 1, 384 | submitted = 1.0, 385 | numTasks = 1, 386 | taskDuration = 10.0, 387 | workloadName = "unif", 388 | cpusPerTask = 1.0, 389 | memPerTask = 1.0)) 390 | 391 | val scheduler = new OmegaScheduler(name = "omega_test_sched", 392 | constantThinkTimes = Map("unif" -> 1), 393 | perTaskThinkTimes = Map("unif" -> 1)) 394 | 395 | val commonCellState = new CellState(numMachines = 20, 396 | cpusPerMachine = 1.0, 397 | memPerMachine = 1.0, 398 | conflictMode = "sequence-numbers", 399 | transactionMode = "all-or-nothing") 400 | 401 | val omegaSimulator = new OmegaSimulator(commonCellState, 402 | Map(scheduler.name -> scheduler), 403 | Map("unif" -> Seq("omega_test_sched")), 404 | List(workload), 405 | List(), 406 | logging = true, 407 | monitorUtilization = false) 408 | 409 | // The job should be scheduled as soon as it is added to the scheduler. 410 | println("adding a job to scheduler.") 411 | scheduler.addJob(workload.getJobs.head) 412 | assert(scheduler.scheduling) 413 | assert(scheduler.jobQueueSize == 0) 414 | println("added job to scheduler.") 415 | } 416 | 417 | test("omegaSimulatorRunWithSingleSchedulerTest") { 418 | println("===========\nomegaSimulatorRunWithSingleSchedulerTest\n===========") 419 | println("\nRunning cellstate run w/ single scheduler test.") 420 | // Set up a workload with 40 jobs, each with 1 task. 421 | var workload = new Workload("unif") 422 | val numJobs = 40 423 | (1 to numJobs).foreach(i => { 424 | workload.addJob(Job(id = i, 425 | submitted = i, 426 | numTasks = 1, 427 | taskDuration = i, 428 | workloadName = workload.name, 429 | cpusPerTask = 1.0, 430 | memPerTask = 1.0)) 431 | }) 432 | 433 | // Create an Omega scheduler. 434 | val scheduler = new OmegaScheduler(name = "omega_test_sched", 435 | constantThinkTimes = Map("unif" -> 1), 436 | perTaskThinkTimes = Map("unif" -> 1)) 437 | 438 | // Set up a CellState. 439 | val commonCellState = new CellState(numMachines = 1000, 440 | cpusPerMachine = 1.0, 441 | memPerMachine = 1.0, 442 | conflictMode = "sequence-numbers", 443 | transactionMode = "all-or-nothing") 444 | 445 | // Set up a Simulator. 446 | val omegaSimulator = new OmegaSimulator( 447 | commonCellState, 448 | Map(scheduler.name -> scheduler), 449 | Map("unif" -> Seq("omega_test_sched")), 450 | List(workload), 451 | List(), 452 | logging = true, 453 | monitorUtilization = false) 454 | 455 | omegaSimulator.run() 456 | // Each job is scheduled two seconds after it arrives since all jobs 457 | // have one task so think time = C + L * 1 = 1 + 1 = 2. So job 40 458 | // should be scheduled at time 40 * 2 + 1, and it should run for 40 seconds. 459 | // Thus the simulator should finish at time 121, when the final task 460 | // finishes running. 461 | assert(omegaSimulator.currentTime == 121, 462 | "Simulation ran for %f seconds, but should have run for %d" 463 | .format(omegaSimulator.currentTime, 121)) 464 | } 465 | 466 | test("UniformWorkloadGeneratorTest") { 467 | println("\nRunning Uniform workload generator test.") 468 | // create a new WorkloadGenerator 469 | var workloadGen = 470 | new UniformWorkloadGenerator(workloadName = "test_wl", 471 | initJobInterarrivalTime = 1.0, 472 | tasksPerJob = 2, 473 | jobDuration = 3.0, 474 | cpusPerTask = 4.0, 475 | memPerTask = 5.0) 476 | 477 | // Test newWorkload. 478 | val workload = workloadGen.newWorkload(100.0) 479 | assert(workload.numJobs == 100, "numJobs was %d, should have been %d" 480 | .format(workload.numJobs, 100)) 481 | for(j <- workload.getJobs) { 482 | assert(j.numTasks == 2.0) 483 | assert(j.taskDuration == 3.0) 484 | assert(j.cpusPerTask == 4.0) 485 | assert(j.memPerTask == 5.0) 486 | } 487 | 488 | // Test newJob. 489 | val job = workloadGen.newJob(2003.0) 490 | assert(job.submitted == 2003.0) 491 | assert(job.numTasks == 2.0) 492 | assert(job.taskDuration == 3.0) 493 | assert(job.cpusPerTask == 4.0) 494 | assert(job.memPerTask == 5.0) 495 | } 496 | 497 | test("PrefillWorkloadGeneratorTest") { 498 | println("\nRunning exmaple prefill workload generator test.") 499 | val filename = "traces/example-init-cluster-state.log" 500 | 501 | // Load Service jobs. 502 | val servicePrefillTraceWLGenerator = new PrefillPbbTraceWorkloadGenerator( 503 | "PrefillService", filename) 504 | val prefillServiceWL = servicePrefillTraceWLGenerator.newWorkload(1000.0) 505 | // Cross validated by running at command line: 506 | val numServiceJobsInFile 507 | = Seq("awk", 508 | "$1 == 11 && $4 == 1 && $5 != 0 && $5 != 1", 509 | filename) 510 | .!!.split("\n").length 511 | assert(prefillServiceWL.numJobs == numServiceJobsInFile, 512 | ("Expected to find %d prefill service jobs from tracefile " + 513 | "%s, but found %d.") 514 | .format(numServiceJobsInFile, filename, prefillServiceWL.numJobs)) 515 | for (j <- prefillServiceWL.getJobs) { 516 | assert(j.submitted == 0) 517 | } 518 | 519 | // Load batch jobs. 520 | val batchPrefillTraceWLGenerator = new PrefillPbbTraceWorkloadGenerator( 521 | "PrefillBatch", filename) 522 | val prefillBatchWL = batchPrefillTraceWLGenerator.newWorkload(1000.0) 523 | val numBatchJobsInFile 524 | = Seq("awk", 525 | "$1 == 11 && ($4 != 1 || $5 == 0 || $5 == 1)", 526 | filename) 527 | .!!.split("\n").length 528 | assert(prefillBatchWL.numJobs == numBatchJobsInFile, 529 | ("Expected to find %d prefill batch jobs from tracefile %s, " + 530 | "but found %d.") 531 | .format(numBatchJobsInFile, filename, prefillBatchWL.numJobs)) 532 | } 533 | } 534 | -------------------------------------------------------------------------------- /traces/README.txt: -------------------------------------------------------------------------------- 1 | == Schema of cell input file == 2 | Fields are space delimited. Each row represents a job scheduling event. 3 | 4 | Each row in our traces belongs to one of two schemas. One schema has six columns, and the other has 8 columns. We describe both schema's below. The first five columns are the same in both schemas. 5 | 6 | === Common Columns === 7 | Column 0: possible values are 11 or 12 8 | "11" - something that was there at the beginning of timewindow 9 | "12" - something that was there at beginning of timewindow and ended at [timestamp] (see Column 1) 10 | Column 1: timestamp 11 | Column 2: unique job ID 12 | Column 3: 0 or 1 - prod_job - boolean flag indicating if this job is "production" priority as described in [1] 13 | Column 4: 0, 1, 2, or 3 - sched_class - see description of "Scheduling Class" in [1] 14 | 15 | === 6 column format === 16 | Column 5: UNSPECIFIED/UNUSED 17 | 18 | === 8 column format === 19 | Column 5: number of tasks 20 | Column 6: aggregate CPU usage of job (in num cores) 21 | Column 7: aggregate Ram usage of job (in bytes) 22 | 23 | == CMB_PBB split logic == 24 | For our primary research evaluation in [2] we used a job -> scheduler assignment policy as follows, which we call CMB_PBB. 25 | 26 | The following Python snippet captures the definition of CMB_PBB. It decides a job's scheduler assignment based on its values of the prod_job and sched_class fields (see columns 3 and 4 above). 27 | 28 | elif apol == 'cmb_pbb': 29 | if prod_job and sched_class != 0 and sched_class != 1: 30 | return 1 31 | else: 32 | return 0 33 | 34 | == Example Lines == 35 | 11 0.000000 623486366592 0 2 1 1 1074000000 36 | 12 12.602755 623486366592 0 2 82587 37 | 11 0.000000 158249529602 1 1 1 1 7286400 38 | 39 | == Bibliography == 40 | [1] Google ClusterData 2011 traces, https://github.com/google/cluster-data/blob/master/ClusterData2011_2.md 41 | [2] Omega: flexible, scalable schedulers for large compute clusters, https://research.google.com/pubs/pub41684.html 42 | -------------------------------------------------------------------------------- /traces/example-init-cluster-state.log: -------------------------------------------------------------------------------- 1 | 11 0.000000 049475829738997701 1 3 1 0.17000000178813934 524288000 2 | 11 0.000000 610339428128088085 0 0 1 0.0010000000474974513 10485760 3 | 11 0.000000 856292140335049805 0 0 1000 0.10000000149011612 1074000000000 4 | 12 6.844149 610339428128088085 0 0 5.001143932342529 5 | 12 8.921186 856292140335049805 0 0 20.576528549194336 6 | -------------------------------------------------------------------------------- /traces/job-distribution-traces/README.txt: -------------------------------------------------------------------------------- 1 | == Schema of trace files of job interarrival time, num_tasks, job_duration == 2 | 3 | These trace files should contain distributions of job interarrival time, 4 | num_tasks, job_duration from your cluster. The simulator will build 5 | empirical distributions based on these files. 6 | 7 | === Columns === 8 | Column 0: cluster_name 9 | Column 1: assignment policy ("cmb-new" = "CMB_PBB") 10 | Column 2: scheduler id, values can be 0 or 1. 0 = batch, service = 1 11 | Column 3: depending on which trace file: 12 | interarrival time (seconds since last job arrival) 13 | OR tasks in job 14 | OR job runtime (seconds) 15 | 16 | Traces may mix batch and service jobs, although the provided examples segregate them. 17 | -------------------------------------------------------------------------------- /traces/job-distribution-traces/example_csizes_cmb.log: -------------------------------------------------------------------------------- 1 | example_cluster cmb-new 0 1 2 | example_cluster cmb-new 0 1 3 | example_cluster cmb-new 0 1 4 | example_cluster cmb-new 0 1600 5 | example_cluster cmb-new 0 1 6 | example_cluster cmb-new 0 1 7 | example_cluster cmb-new 0 1 8 | example_cluster cmb-new 0 1 9 | example_cluster cmb-new 0 1 10 | example_cluster cmb-new 0 1 11 | example_cluster cmb-new 1 1 12 | example_cluster cmb-new 1 1 13 | example_cluster cmb-new 1 1 14 | example_cluster cmb-new 1 1 15 | example_cluster cmb-new 1 1 16 | example_cluster cmb-new 1 1 17 | example_cluster cmb-new 1 1 18 | example_cluster cmb-new 1 1 19 | example_cluster cmb-new 1 1 20 | example_cluster cmb-new 1 1 21 | -------------------------------------------------------------------------------- /traces/job-distribution-traces/example_interarrival_cmb.log: -------------------------------------------------------------------------------- 1 | example_cluster cmb-new 0 2.517473 2 | example_cluster cmb-new 0 1.295932 3 | example_cluster cmb-new 0 1.618243 4 | example_cluster cmb-new 0 0.024418 5 | example_cluster cmb-new 0 0.060959 6 | example_cluster cmb-new 0 1.291739 7 | example_cluster cmb-new 0 0.020385 8 | example_cluster cmb-new 0 2.001258 9 | example_cluster cmb-new 0 0.090779 10 | example_cluster cmb-new 0 0.018862 11 | example_cluster cmb-new 1 340.360427 12 | example_cluster cmb-new 1 82.592528 13 | example_cluster cmb-new 1 1068.807106 14 | example_cluster cmb-new 1 50.258920 15 | example_cluster cmb-new 1 1205.063186 16 | example_cluster cmb-new 1 76.170744 17 | example_cluster cmb-new 1 884.422104 18 | example_cluster cmb-new 1 237.976148 19 | example_cluster cmb-new 1 107.145817 20 | example_cluster cmb-new 1 1048.466730 21 | -------------------------------------------------------------------------------- /traces/job-distribution-traces/example_runtimes_cmb.log: -------------------------------------------------------------------------------- 1 | example_cluster cmb-new 0 1.727678 2 | example_cluster cmb-new 0 24.602128 3 | example_cluster cmb-new 0 21.330643 4 | example_cluster cmb-new 0 18.513231 5 | example_cluster cmb-new 0 15.296218 6 | example_cluster cmb-new 0 13.420730 7 | example_cluster cmb-new 0 9.506823 8 | example_cluster cmb-new 0 41.397468 9 | example_cluster cmb-new 0 43.686984 10 | example_cluster cmb-new 0 46.735950 11 | example_cluster cmb-new 1 101.121137 12 | example_cluster cmb-new 1 208.691963 13 | example_cluster cmb-new 1 110.178521 14 | example_cluster cmb-new 1 178.837062 15 | example_cluster cmb-new 1 101.111495 16 | example_cluster cmb-new 1 194.864865 17 | example_cluster cmb-new 1 176.516334 18 | example_cluster cmb-new 1 109.439388 19 | example_cluster cmb-new 1 243.861564 20 | example_cluster cmb-new 1 107.897281 21 | --------------------------------------------------------------------------------