├── .gitmodules
├── models
    ├── __init__.py
    ├── resnet_cifar.py
    └── util.py
├── get_data
    ├── __init__.py
    ├── __init__.pyc
    ├── get_data.pyc
    ├── download_and_convert_cifar_10.py
    └── util.py
├── slurm_tf_helper
    ├── __init__.py
    └── setup_clusters.py
├── train.sl
├── README.md
└── main.py


/.gitmodules:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/models/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/get_data/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/slurm_tf_helper/__init__.py:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/get_data/__init__.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NERSC/cori-tf-distributed-examples/master/get_data/__init__.pyc

--------------------------------------------------------------------------------
/get_data/get_data.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NERSC/cori-tf-distributed-examples/master/get_data/get_data.pyc

--------------------------------------------------------------------------------
/train.sl:
--------------------------------------------------------------------------------
#!/bin/bash -l

# Slurm directives: partition, node type, log files, and QOS.
# stdout and stderr share one log file per job (%N = node name, %j = job ID).
#SBATCH -p regular
#SBATCH -C haswell
#SBATCH -o batch_outputs/slurm_%N.%j.out
#SBATCH -e batch_outputs/slurm_%N.%j.out
#SBATCH --qos=premium

module load deeplearning

# Runs once in the batch script, before the tasks are launched.
python get_data/download_and_convert_cifar_10.py

# Launch main.py through srun, forwarding any arguments given after the
# script name on the sbatch command line. "$@" is quoted so that arguments
# containing spaces are passed through intact.
srun python -u main.py "$@"

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# cori-tf-distributed-examples
Scripts/Benchmarks for Running TensorFlow Distributed on Cori
### Running Code
#### Running In Batch
sbatch -N \ -t \