├── README.md
├── dask-ecs-template.yaml
├── dockerfiles
│   ├── cuda
│   │   └── theano8.2
│   │       ├── .env
│   │       ├── Dockerfile
│   │       ├── base
│   │       │   ├── .env
│   │       │   ├── Dockerfile
│   │       │   └── docker-compose.yml
│   │       ├── docker-compose.yml
│   │       └── requirements.txt
│   └── python35
│       ├── .env
│       ├── Dockerfile
│       ├── docker-compose.yml
│       └── requirements.txt
├── docs
│   └── topology.png
└── notebooks
    ├── dask-simple.ipynb
    └── dask-theano.ipynb


/README.md:
--------------------------------------------------------------------------------
# Dask ECS

This is an opinionated CloudFormation template for spinning up a Docker-based Dask cluster on AWS ECS.

## CloudFormation Topology

![](docs/topology.png)


## Install

First clone this repo.

Then navigate to your [AWS CloudFormation console](https://console.aws.amazon.com/cloudformation) -> create stack -> choose a template -> Upload a template to Amazon S3 -> choose file -> then select this repo's dask-ecs-template.yaml file. You can configure the stack parameters to your liking in the web console.


## Example Docker Containers

https://hub.docker.com/r/sayreblades/dask-ecs/

## Example Dask Client

```
from dask import bag as db
import distributed


# Replace the placeholder with the SchedulerTCP value from the stack outputs.
client = distributed.Client(address="[YourDaskServer:Port]")
# Build a small bag, double each element, and compute it on the cluster.
b = db.from_sequence([1, 2, 3, 4, 5, 6])
c = b.map(lambda o: o*2)
f = client.compute(c)
# Block until the future finishes and fetch the result.
f.result()
```

## Logs

The stack writes logs to CloudWatch. The log group name is the name you gave the CloudFormation stack.

To view logs, use the CloudWatch web interface or awslogs: https://github.com/jorgebastida/awslogs

```
awslogs get [log group name] -w
```

## Similar Projects

- https://hub.docker.com/r/magsol/distributed-dask/

- https://github.com/ogrisel/docker-distributed


## Warnings

- The cluster is wide open in terms of network connectivity. Use at your own risk.

- The current configuration uses a custom AMI (ami-2505a35f), which is only available in the us-east-1 region.

- If you are using the p2.x class of machines for your cluster, you may need to request a resource limit increase
  for your region: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-resource-limits.html

- If you are using dask locally, take care that your local Python dependencies exactly
  match the versions deployed to the cluster. One convention I'm using to manage this is maintaining a
  requirements.txt file for each docker container.
  See https://github.com/SayreBlades/dask-ecs/tree/master/dockerfiles/python35,
  which was used to build the container sayreblades/dask-ecs:python35.

## Notes

For building AMIs for use with ECS (kept here for future reference):

https://stackoverflow.com/questions/39018180/aws-ecs-agent-wont-start
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-agent-install.html
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/launch_container_instance.html


For building a GPU (p2.xlarge) instance on ECS:

https://github.com/bfolkens/nvidia-docker-bootstrap


## TODO

- add optional security
- add a cuda ready image for each region

--------------------------------------------------------------------------------
/dask-ecs-template.yaml:
--------------------------------------------------------------------------------
AWSTemplateFormatVersion: '2010-09-09'
Description: Dask Cluster

Mappings:
  network:
    internet:
      cidr: 0.0.0.0/0
    vpc:
      cidr: 10.0.0.0/16
    subnet1:
      cidr: 10.0.0.0/24
    subnet2:
      cidr: 10.0.1.0/24
    subnet3:
      cidr: 10.0.2.0/24
  service:
    daskscheduler-app:
      port: 8786
    daskscheduler-bokeh:
      port: 8787
    jupyter-notebook:
      port: 8888
    daskworker:
      port: 5000


Parameters:

  AMI:
    Type: String
    Description: AMI to load on your ec2 instances (use ami-b115a3cb on p2.x)
    Default: ami-20ff515a

  EC2InstanceType:
    Type: String
    Description: EC2 instance type (generally m4.x for non cuda ami; p2.x for cuda ami)
    Default: m4.xlarge

  EC2KeyPairName:
    Type: AWS::EC2::KeyPair::KeyName
    Description: Used for ssh access to ec2 instances, create in ec2 dash -> key pairs

  ClusterSize:
    Type: Number
    Description: Number of nodes in the cluster
    Default: 3

  DaskSchedulerImage:
    Type: String
    Description: Docker image with dask scheduler, must be runnable with command `dask-scheduler`
    Default: sayreblades/dask-ecs:python35

  DaskWorkerImage:
    Type: String
    Description: Docker image with dask worker, must be runnable with command `dask-worker [host] --worker-port [port]`
    Default: sayreblades/dask-ecs:python35

  JupyterImage:
    Type: String
    Description: Docker image with jupyter installed, must be runnable with command `jupyter notebook --ip=0.0.0.0 --allow-root`
    Default: sayreblades/dask-ecs:python35

  EBSVolumeSize:
    Type: Number
    Description: Size of EC2 Disk Volume in gigs
    Default: 50

Outputs:

  SchedulerHTTP:
    Value: !Sub
      - http://${Domain}
      - { Domain: !GetAtt DaskSchedulerServiceELB.DNSName }

  SchedulerTCP:
    Value: !Sub
      - ${Domain}:${Port}
      - { Domain: !GetAtt DaskSchedulerServiceELB.DNSName, Port: 8786 }

  JupyterHTTP:
    Value: !Sub
      - http://${Domain}:8888
      - { Domain: !GetAtt DaskSchedulerServiceELB.DNSName }

  Logs:
    Value: !Sub https://console.aws.amazon.com/cloudwatch/home?region=${AWS::Region}#logStream:group=${DaskLogGroup}

Resources:

  #####################################################################################
  # NETWORKING
  #####################################################################################

  Vpc:
    Type: AWS::EC2::VPC
    Properties:
      EnableDnsSupport: 'true'
      EnableDnsHostnames: 'true'
      CidrBlock: !FindInMap [network, vpc, cidr]

  InternetGateway:
    Type: AWS::EC2::InternetGateway

  AttachGateway:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref Vpc
      InternetGatewayId: !Ref InternetGateway

  Subnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref Vpc
      CidrBlock: !FindInMap [network, subnet1, cidr]
      AvailabilityZone: !Select [0, !GetAZs { "Ref": "AWS::Region" } ]

  Subnet1RouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref Vpc

  Subnet1RouteTableAttachGateway:
    DependsOn: AttachGateway
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref Subnet1RouteTable
      GatewayId: !Ref InternetGateway
      DestinationCidrBlock: !FindInMap [ network, internet, cidr ]

  Subnet1RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref Subnet1
      RouteTableId: !Ref Subnet1RouteTable

  Subnet2:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref Vpc
      CidrBlock: !FindInMap [network, subnet2, cidr]
      AvailabilityZone: !Select [1, !GetAZs { "Ref": "AWS::Region" } ]

  Subnet2RouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref Vpc

  Subnet2RouteTableAttachGateway:
    DependsOn: AttachGateway
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref Subnet2RouteTable
      GatewayId: !Ref InternetGateway
      DestinationCidrBlock: !FindInMap [ network, internet, cidr ]

  Subnet2RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref Subnet2
      RouteTableId: !Ref Subnet2RouteTable

  Subnet3:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref Vpc
      CidrBlock: !FindInMap [network, subnet3, cidr]
      AvailabilityZone: !Select [2, !GetAZs { "Ref": "AWS::Region" } ]

  Subnet3RouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref Vpc

  Subnet3RouteTableAttachGateway:
    DependsOn: AttachGateway
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref Subnet3RouteTable
      GatewayId: !Ref InternetGateway
      DestinationCidrBlock: !FindInMap [ network, internet, cidr ]

  Subnet3RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref Subnet3
      RouteTableId: !Ref Subnet3RouteTable


  #####################################################################################
  # ECS
  #####################################################################################

  EcsCluster:
    Type: AWS::ECS::Cluster

  EcsServiceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - ecs.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: "/"
      Policies:
        - PolicyName: ecs-service
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - elasticloadbalancing:DeregisterInstancesFromLoadBalancer
                  - elasticloadbalancing:DeregisterTargets
                  - elasticloadbalancing:Describe*
                  - elasticloadbalancing:RegisterInstancesWithLoadBalancer
                  - elasticloadbalancing:RegisterTargets
                  - ec2:Describe*
                  - ec2:AuthorizeSecurityGroupIngress
                Resource: "*"

  EcsEc2InstanceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - ec2.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: "/"
      Policies:
        - PolicyName: ecs-service
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - ecs:CreateCluster
                  - ecs:DeregisterContainerInstance
                  - ecs:DiscoverPollEndpoint
                  - ecs:Poll
                  - ecs:RegisterContainerInstance
                  - ecs:StartTelemetrySession
                  - ecs:Submit*
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                  - ecr:*
                  - cloudtrail:LookupEvents
                Resource: "*"

  EcsEc2InstanceSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Dask cluster machine security group
      VpcId: !Ref Vpc
      SecurityGroupIngress:
        # SSH is open
        - CidrIp: 0.0.0.0/0
          IpProtocol: "tcp"
          FromPort: 22
          ToPort: 22
        # Internal requests
        - CidrIp: !FindInMap [network, vpc, cidr]
          IpProtocol: "tcp"
          FromPort: 0
          ToPort: 65535

  EcsEc2InstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Path: "/"
      Roles:
        - Ref: EcsEc2InstanceRole

  EcsEc2InstanceLaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: !Ref AMI
      InstanceType: !Ref EC2InstanceType
      AssociatePublicIpAddress: "true"
      IamInstanceProfile: !Ref EcsEc2InstanceProfile
      KeyName: !Ref EC2KeyPairName
      SecurityGroups:
        - !Ref EcsEc2InstanceSecurityGroup
      EbsOptimized: false
      UserData:
        "Fn::Base64": !Sub |
          #!/bin/bash
          echo ECS_CLUSTER=${EcsCluster} > /etc/ecs/ecs.config
      BlockDeviceMappings:
        - DeviceName: /dev/xvda
          Ebs:
            VolumeSize: !Ref EBSVolumeSize
            VolumeType: gp2
            DeleteOnTermination: "true"

  EcsEc2InstanceAutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      VPCZoneIdentifier:
        - !Ref Subnet1
        - !Ref Subnet2
        - !Ref Subnet3
      LaunchConfigurationName: !Ref EcsEc2InstanceLaunchConfig
      MinSize: !Ref ClusterSize
      MaxSize: !Ref ClusterSize
      DesiredCapacity: !Ref ClusterSize

  DaskLogGroup:
    Type: "AWS::Logs::LogGroup"
    Properties:
      LogGroupName: !Sub "${AWS::StackName}"
      RetentionInDays: 7

  #####################################################################################
  # Dask Scheduler
  #####################################################################################

  DaskSchedulerTask:
    Type: AWS::ECS::TaskDefinition
    Properties:
      NetworkMode: host
      ContainerDefinitions:
        - MemoryReservation: 1024
          Cpu: 0
          Essential: true
          Name: dask-scheduler
          Image: !Ref DaskSchedulerImage
          Command:
            - dask-scheduler
          PortMappings:
            - ContainerPort: !FindInMap [service, daskscheduler-app, port]
              HostPort: !FindInMap [service, daskscheduler-app, port]
            - ContainerPort: !FindInMap [service, daskscheduler-bokeh, port]
              HostPort: !FindInMap [service, daskscheduler-bokeh, port]
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-region: !Ref AWS::Region
              awslogs-group: !Ref DaskLogGroup
              awslogs-stream-prefix: dask-scheduler
        - MemoryReservation: 1024
          Cpu: 0
          Essential: true
          Name: jupyter-notebook
          Image: !Ref JupyterImage
          Command:
            - jupyter
            - notebook
            - --ip=0.0.0.0
            - --allow-root
          PortMappings:
            - ContainerPort: !FindInMap [service, jupyter-notebook, port]
              HostPort: !FindInMap [service, jupyter-notebook, port]
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-region: !Ref AWS::Region
              awslogs-group: !Ref DaskLogGroup
              awslogs-stream-prefix: jupyter-notebook

  DaskSchedulerServiceSG:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Dask Scheduler Allowed Ports
      VpcId: !Ref Vpc
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: !FindInMap [ service, daskscheduler-app, port ]
          ToPort: !FindInMap [ service, daskscheduler-app, port ]
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: !FindInMap [ service, jupyter-notebook, port ]
          ToPort: !FindInMap [ service, jupyter-notebook, port ]
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0

  DaskSchedulerServiceELB:
    Type: AWS::ElasticLoadBalancing::LoadBalancer
    Properties:
      LoadBalancerName: !Sub "dask-scheduler-${AWS::StackName}"
      SecurityGroups:
        - !Ref DaskSchedulerServiceSG
      Subnets:
        - !Ref Subnet1
        - !Ref Subnet2
        - !Ref Subnet3
      Scheme: internet-facing
      HealthCheck:
        HealthyThreshold: 10
        Interval: 30
        Target: !Join ['', [ 'HTTP:', !FindInMap [ service, daskscheduler-bokeh, port ], '/' ] ]
        Timeout: 5
        UnhealthyThreshold: 2
      Listeners:
        - InstancePort: !FindInMap [ service, daskscheduler-app, port ]
          LoadBalancerPort: !FindInMap [ service, daskscheduler-app, port ]
          Protocol: TCP
          InstanceProtocol: TCP
        - InstancePort: !FindInMap [ service, daskscheduler-bokeh, port ]
          LoadBalancerPort: 80
          Protocol: TCP
          InstanceProtocol: TCP
        - InstancePort: !FindInMap [ service, jupyter-notebook, port ]
          LoadBalancerPort: 8888
          Protocol: TCP
          InstanceProtocol: TCP
      ConnectionSettings:
        IdleTimeout: 3600

  DaskSchedulerService:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref EcsCluster
      DesiredCount: 1
      DeploymentConfiguration:
        MaximumPercent: 100
        MinimumHealthyPercent: 0
      TaskDefinition: !Ref DaskSchedulerTask
      Role: !Ref EcsServiceRole
      LoadBalancers:
        - ContainerName: dask-scheduler
          ContainerPort: !FindInMap [service, daskscheduler-app, port]
          LoadBalancerName: !Ref DaskSchedulerServiceELB


  #####################################################################################
  # Dask Worker
  #####################################################################################

  DaskWorkerTask:
    Type: AWS::ECS::TaskDefinition
    Properties:
      NetworkMode: host
      ContainerDefinitions:
        - MemoryReservation: 1024
          Privileged: true
          Essential: true
          Name: dask-worker
          Image:
            !Ref DaskWorkerImage
          Cpu: 0
          PortMappings:
            - ContainerPort: !FindInMap [service, daskworker, port]
              HostPort: !FindInMap [service, daskworker, port]
          Command:
            - dask-worker
            - !Join [ ":", [ !GetAtt DaskSchedulerServiceELB.DNSName, !FindInMap [service, daskscheduler-app, port] ] ]
            - --worker-port
            - !FindInMap [service, daskworker, port]
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-region: !Ref AWS::Region
              awslogs-group: !Ref DaskLogGroup
              awslogs-stream-prefix: dask-worker

  DaskWorkerService:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref EcsCluster
      DesiredCount: !Ref ClusterSize
      DeploymentConfiguration:
        MaximumPercent: 200
        MinimumHealthyPercent: 0
      TaskDefinition: !Ref DaskWorkerTask


--------------------------------------------------------------------------------
/dockerfiles/cuda/theano8.2/.env:
--------------------------------------------------------------------------------
VERSION=cuda8.0-theano8.2

--------------------------------------------------------------------------------
/dockerfiles/cuda/theano8.2/Dockerfile:
--------------------------------------------------------------------------------
FROM sayreblades/dask-ecs:cuda8.0-theano8.2-base

WORKDIR /root/

RUN pip install jupyter==1.0.0
RUN jupyter nbextension enable --py widgetsnbextension
ADD https://raw.githubusercontent.com/SayreBlades/dask-ecs/master/notebooks/dask-simple.ipynb .
ADD https://raw.githubusercontent.com/SayreBlades/dask-ecs/master/notebooks/dask-theano.ipynb .

COPY requirements.txt .
RUN pip install -r requirements.txt

ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8


--------------------------------------------------------------------------------
/dockerfiles/cuda/theano8.2/base/.env:
--------------------------------------------------------------------------------
VERSION=cuda8.0-theano8.2-base

--------------------------------------------------------------------------------
/dockerfiles/cuda/theano8.2/base/Dockerfile:
--------------------------------------------------------------------------------
# Start with cuDNN base image
FROM sayreblades/dask-ecs:cuda8.0-cudnn5-base

# Install git, wget, python-dev, pip, BLAS + LAPACK and other dependencies
RUN apt-get update && apt-get install -y \
  gfortran \
  git \
  wget \
  liblapack-dev \
  libopenblas-dev \
  python3-dev \
  python3-pip

RUN ln -s /usr/bin/python3 /usr/local/bin/python
RUN pip3 --no-cache-dir install --upgrade pip

# Set CUDA_ROOT
ENV CUDA_ROOT /usr/local/cuda/bin

# Install Theano, pinned to the rel-0.8.2 tag
RUN pip install --upgrade git+https://github.com/Theano/Theano.git@rel-0.8.2
RUN pip install --upgrade six

# Set up .theanorc for CUDA
RUN printf "[global]\ndevice=gpu\nfloatX=float32\noptimizer_including=cudnn\n[lib]\ncnmem=0.1\n[nvcc]\nfastmath=True" > /root/.theanorc

--------------------------------------------------------------------------------
/dockerfiles/cuda/theano8.2/base/docker-compose.yml:
--------------------------------------------------------------------------------
version: '2.1'

services:

  scheduler:
    image: sayreblades/dask-ecs:${VERSION:-latest}
    build: .
    command: dask-scheduler
    ports:
      - 8786
      - 8787

  jupyter:
    image: sayreblades/dask-ecs:${VERSION:-latest}
    build: .
    links:
      - scheduler
    command: jupyter notebook --ip=0.0.0.0 --allow-root
    ports:
      - "8887:8888"

  worker:
    image: sayreblades/dask-ecs:${VERSION:-latest}
    build: .
    command: dask-worker scheduler:8786
    links:
      - scheduler

--------------------------------------------------------------------------------
/dockerfiles/cuda/theano8.2/docker-compose.yml:
--------------------------------------------------------------------------------
version: '2.1'

services:

  scheduler:
    image: sayreblades/dask-ecs:${VERSION:-latest}
    build: .
    command: dask-scheduler
    ports:
      - "8786:8786"
      - "8787:8787"

  jupyter:
    image: sayreblades/dask-ecs:${VERSION:-latest}
    build: .
    links:
      - scheduler
    environment:
      - SCHEDULER=scheduler
    command: jupyter notebook --ip=0.0.0.0 --allow-root
    ports:
      - "8888:8888"

  worker:
    image: sayreblades/dask-ecs:${VERSION:-latest}
    build: .
    command: dask-worker scheduler:8786
    links:
      - scheduler

--------------------------------------------------------------------------------
/dockerfiles/cuda/theano8.2/requirements.txt:
--------------------------------------------------------------------------------
bokeh==0.12.9
cloudpickle==0.4.1
dask-searchcv==0.1.0
dask==0.15.2
distributed==1.18.3
numpy==1.13.3
scikit-learn==0.19.1
scipy==1.0.0
tornado==4.5.2
ipywidgets==7.0.3

--------------------------------------------------------------------------------
/dockerfiles/python35/.env:
--------------------------------------------------------------------------------
VERSION=python35

--------------------------------------------------------------------------------
/dockerfiles/python35/Dockerfile:
--------------------------------------------------------------------------------
FROM python:3.5

WORKDIR /root/

RUN pip install jupyter==1.0.0 ipywidgets==7.0.3
RUN jupyter nbextension enable --py widgetsnbextension
ADD https://raw.githubusercontent.com/SayreBlades/dask-ecs/master/notebooks/dask-simple.ipynb .

COPY requirements.txt .
RUN pip install -r requirements.txt

ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8

--------------------------------------------------------------------------------
/dockerfiles/python35/docker-compose.yml:
--------------------------------------------------------------------------------
version: '2.1'

services:

  scheduler:
    image: sayreblades/dask-ecs:${VERSION:-latest}
    build: .
    command: dask-scheduler
    ports:
      - "8786:8786"
      - "8787:8787"

  jupyter:
    image: sayreblades/dask-ecs:${VERSION:-latest}
    build: .
    links:
      - scheduler
    environment:
      - SCHEDULER=scheduler
    command: jupyter notebook --ip=0.0.0.0 --allow-root
    ports:
      - "8888:8888"

  worker:
    image: sayreblades/dask-ecs:${VERSION:-latest}
    build: .
    command: dask-worker scheduler:8786
    links:
      - scheduler

--------------------------------------------------------------------------------
/dockerfiles/python35/requirements.txt:
--------------------------------------------------------------------------------
bokeh==0.12.9
cloudpickle==0.5.1
dask==0.15.4
numpy==1.13.3
pandas==0.21.0
toolz==0.8.2
distributed==1.18.3


--------------------------------------------------------------------------------
/docs/topology.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SayreBlades/dask-ecs/1b75eb93a5d8ac5f0c49bbc375d87340a16b9b2f/docs/topology.png

--------------------------------------------------------------------------------
/notebooks/dask-simple.ipynb:
--------------------------------------------------------------------------------
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from dask import bag as db\n",
    "import distributed\n",
    "from dask.distributed import progress"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "scheduler = os.environ.get('SCHEDULER', 'localhost')\n",
    "client = distributed.Client(address=scheduler + \":8786\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "afa821b529a147c4a1c9b3f93e20d629",
       "version_major": 2,
       "version_minor": 0
      },
      "text/html": [
       "Failed to display Jupyter Widget of type VBox.\n",
       "\n",
       "  If you're reading this message in Jupyter Notebook or JupyterLab, it may mean\n",
       "  that the widgets JavaScript is still loading. If this message persists, it\n",
       "  likely means that the widgets JavaScript library is either not installed or\n",
       "  not enabled. See the Jupyter\n",
       "  Widgets Documentation for setup instructions.\n",
       "\n",
       "\n",
       "  If you're reading this message in another notebook frontend (for example, a static\n",
       "  rendering on GitHub or NBViewer),\n",
       "  it may mean that your frontend doesn't currently support widgets.\n",
       "\n"
      ],
      "text/plain": [
       "VBox()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "b = db.from_sequence([1, 2, 3, 4, 5, 6])\n",
    "c = b.map(lambda o: o*2)\n",
    "f = client.compute(c)\n",
    "progress(f)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2, 4, 6, 8, 10, 12]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f.result()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}

--------------------------------------------------------------------------------
/notebooks/dask-theano.ipynb:
--------------------------------------------------------------------------------
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from dask import bag as db\n",
    "import distributed\n",
    "from dask.distributed import progress"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "scheduler = os.environ.get('SCHEDULER', 'localhost')\n",
    "client = distributed.Client(address=scheduler + \":8786\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "def add_remote():\n",
    "    from theano import tensor as T\n",
    "    i = T.scalar()\n",
    "    o = i + 1\n",
    "    return o.eval({i:2})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b65e8c06ece348f1bd8a14e95ca9e929",
       "version_major": 2,
       "version_minor": 0
      },
      "text/html": [
       "Failed to display Jupyter Widget of type VBox.\n",
       "\n",
       "  If you're reading this message in Jupyter Notebook or JupyterLab, it may mean\n",
       "  that the widgets JavaScript is still loading. If this message persists, it\n",
       "  likely means that the widgets JavaScript library is either not installed or\n",
       "  not enabled. See the Jupyter\n",
       "  Widgets Documentation for setup instructions.\n",
       "\n",
       "\n",
       "  If you're reading this message in another notebook frontend (for example, a static\n",
       "  rendering on GitHub or NBViewer),\n",
       "  it may mean that your frontend doesn't currently support widgets.\n",
       "\n"
      ],
      "text/plain": [
       "VBox()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "f = client.submit(func=add_remote)\n",
    "progress(f)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(3.0, dtype=float32)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f.result()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}

--------------------------------------------------------------------------------