├── .gitignore ├── ADD_NOISE_INSTRUCTIONS.md ├── EVALUATION_INSTRUCTIONS.md ├── LICENSE ├── README.md ├── TRAINING_INSTRUCTIONS.md ├── datasets ├── __init__.py ├── beyond_gauss.py ├── build_darktable_data.py ├── build_imagenet_data.py ├── build_pixel_isp_data.py ├── classes.py ├── dataset_factory.py ├── dataset_utils.py ├── imagenet.py ├── imagenet_2012_bounding_boxes.csv ├── imagenet_labels.txt ├── imagenet_lsvrc_2015_synsets.txt ├── imagenet_metadata.txt ├── imnet_reg.py ├── labels.txt ├── number_synsets.txt ├── raw.py ├── raw_metadata.txt ├── synset_labels.txt └── training_synsets.txt ├── deployment ├── __init__.py ├── model_deploy.py └── model_deploy_test.py ├── environment.yml ├── loss_functions ├── __init__.py └── loss_factory.py ├── nets ├── __init__.py ├── inception.py ├── inception_utils.py ├── isp.py ├── mobilenet_isp.py ├── mobilenet_v1.py ├── nets_factory.py ├── nets_factory_test.py └── unet.py ├── preprocessing ├── __init__.py ├── inception_preprocessing.py ├── isp_pretrain_preprocessing.py ├── joint_isp_preprocessing.py ├── no_preprocessing.py ├── preprocessing_factory.py ├── sensor_model.py └── writeout_preprocessing.py ├── run_test_captured_images.sh ├── run_test_synthetic_images.sh ├── run_train_joint_models.sh ├── simulate_raw_images.py ├── teaser ├── architecture_2.jpg └── teaser_v4.png ├── test_captured_images.py ├── test_synthetic_images.py └── train_image_classifier.py /.gitignore: -------------------------------------------------------------------------------- 1 | *__pycache__* 2 | *.idea 3 | *$py.class 4 | *.egg-info 5 | -------------------------------------------------------------------------------- /ADD_NOISE_INSTRUCTIONS.md: -------------------------------------------------------------------------------- 1 | # Simulating noisy raw images from ImageNet 2 | In order to evaluate and train new ISP or perception 3 | models on noisy images, we provide the noisy images 4 | that we used for evaluating the hardware ISP of the Movidius Myriad 2 5 | evaluation board: [Noisy-ImageNet](https://drive.google.com/drive/folders/1f9B319TDtFpZSi7HEXnrPa31rtPm54iH?usp=sharing). 6 | 7 | We also provide the code to simulate noisy raw images from 8 | the ImageNet dataset, using the image formation model 9 | described in the manuscript. 10 | 11 | In order to introduce `2to20lux` noise to the ImageNet dataset, run 12 | 13 | ``` 14 | python simulate_raw_images.py --ll_low=0.001 --ll_high=0.010 \ 15 | --input_dir=$IMAGENET_DIR --output_dir=$OUT_DIR 16 | ``` 17 | where `$IMAGENET_DIR` is the ImageNet directory (training or evaluation set), 18 | `$OUT_DIR` is the directory where the noisy images are written to, and 19 | `ll_low` and `ll_high` are the lowest and highest light levels, respectively. 20 | To generate images with other noise profiles, adapt `ll_low` and 21 | `ll_high` accordingly (see more examples in the `run_train_joint_models.sh` script). 22 | -------------------------------------------------------------------------------- /EVALUATION_INSTRUCTIONS.md: -------------------------------------------------------------------------------- 1 | # Evaluating pre-trained models 2 | In order to reproduce the results presented in the 3 | paper, first download the [pre-trained models](https://drive.google.com/file/d/1kBTRAS2W5Ayf2DOxKIgIBmPv5OHaMbCD/view?usp=sharing). 4 | 5 | ## Evaluate our joint models on real data 6 | Download and extract the real captured (low-light) images [dataset](https://drive.google.com/file/d/1fj2u8t_wVdNVUmcjyeK8VuqDfTAd7RJA/view?usp=sharing).
7 | 8 | To run our `2to200lux` joint model over the captured data (Table 2 of the paper), 9 | run 10 | ``` 11 | python test_captured_images.py --device=1 --dataset_dir=$DATASET_DIR --dataset_name=imagenet \ 12 | --checkpoint_path=$CHECKPOINTS/joint128/2to200lux/model.ckpt-232721 \ 13 | --model_name=mobilenet_isp --noise_channel=True --use_anscombe=True \ 14 | --isp_model_name=isp --eval_image_size=224 --sensor=Pixel --eval_dir $OUT_DIR 15 | ``` 16 | where `--device` is the GPU on which the model will run, 17 | and `$DATASET_DIR` and `$CHECKPOINTS` should be set to the downloaded dataset 18 | and checkpoint directories, respectively. `$OUT_DIR` can be set to an 19 | arbitrary output directory path. See `run_test_captured_images.sh` for 20 | additional parameters to evaluate baseline models. 21 | 22 | ## Evaluate our joint models on synthetic data 23 | Download the [ImageNet][in] (validation) dataset. 24 | To evaluate our joint model over noisy images with a `6lux` noise profile, run 25 | ``` 26 | python test_synthetic_images.py --device=1 --checkpoint_path=$CHECKPOINTS/joint128/6lux/model.ckpt-222267 \ 27 | --dataset_dir=$IMAGENET_DATASET_DIR --dataset_name=imagenet --mode=6lux \ 28 | --model_name=mobilenet_isp --eval_dir=$OUT_DIR 29 | ``` 30 | 31 | where `$IMAGENET_DATASET_DIR` is the path to the ImageNet (validation) dataset, 32 | `$CHECKPOINTS` is set to the downloaded checkpoints directory, 33 | `--device` is the GPU on which the model will run, 34 | and `$OUT_DIR` is an arbitrary directory where the results are written to. 35 | 36 | For both the synthetic and captured image evaluation scripts, generated results include 37 | the noisy input raw images, anscombe network output images, 38 | and lists of correctly and wrongly classified images. 39 | To run the trained models over different noise profiles, 40 | modify the checkpoint paths and `--mode` parameter (3lux, 6lux, 2to20lux, or 2to200lux) 41 | accordingly. See `run_test_synthetic_images.sh` for the specific parameters for each noise profile. 42 | 43 | 44 | 45 | 46 | 49 | 50 | 67 | 68 | [in]: http://image-net.org/index -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 princeton-computational-imaging 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE.
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Dirty Pixels: Towards End-to-End Image Processing and Perception 2 | This repository contains the code for the paper 3 | 4 | **[Dirty Pixels: Towards End-to-End Image Processing and Perception][1]** 5 | [Steven Diamond][sd], [Vincent Sitzmann][vs], [Frank Julca-Aguilar][fj], [Stephen Boyd][sb], [Gordon Wetzstein][gw], [Felix Heide][fh] 6 | Transactions on Graphics, 2021 | To be presented at SIGGRAPH, 2021 7 | 8 |
9 | 10 |
11 | 12 |
13 | 14 |
15 | 16 | 17 |
18 | 19 | ## Installation 20 | Clone this repository: 21 | ``` 22 | git clone git@github.com:princeton-computational-imaging/DirtyPixels.git 23 | ``` 24 | 25 | The project was developed using Python 3.6, TensorFlow (v1.12), and Slim. 26 | We provide an environment file to install all dependencies (creating an environment called dirtypix): 27 | 28 | ``` 29 | conda env create -f environment.yml 30 | conda activate dirtypix 31 | ``` 32 | 33 | 34 | 35 | ## Running Experiments 36 | We provide code, data, and trained models to reproduce the main results presented in the paper, as well as instructions on how to use this project for further research: 37 | - [EVALUATION_INSTRUCTIONS.md](EVALUATION_INSTRUCTIONS.md) provides instructions 38 | on how to evaluate our proposed models and reproduce the results of the paper. 39 | - [TRAINING_INSTRUCTIONS.md](TRAINING_INSTRUCTIONS.md) gives instructions on how to train new models following our proposed approach. 40 | - [ADD_NOISE_INSTRUCTIONS.md](ADD_NOISE_INSTRUCTIONS.md) explains how to simulate 41 | noisy raw images following the image formation model defined in the 42 | manuscript. 43 | 44 | ## Citation 45 | If you find our work useful in your research, please cite: 46 | 47 | ``` 48 | @article{steven:dirtypixels2021, 49 | title={Dirty Pixels: Towards End-to-End Image Processing and Perception}, 50 | author={Diamond, Steven and Sitzmann, Vincent and Julca-Aguilar, Frank and Boyd, Stephen and Wetzstein, Gordon and Heide, Felix}, 51 | journal={ACM Transactions on Graphics (SIGGRAPH)}, 52 | year={2021}, 53 | publisher={ACM} 54 | } 55 | ``` 56 | 57 | ## License 58 | 59 | This project is released under the [MIT License](LICENSE). 60 | 61 | 62 | [1]: https://arxiv.org/abs/1701.06487 63 | [sd]: https://stevendiamond.me 64 | [vs]: https://vsitzmann.github.io 65 | [fj]: https://github.com/fjulca-aguilar 66 | [sb]: https://web.stanford.edu/~boyd/ 67 | [gw]: https://stanford.edu/~gordonwz/ 68 | [fh]: https://www.cs.princeton.edu/~fheide/ 69 | 70 | -------------------------------------------------------------------------------- /TRAINING_INSTRUCTIONS.md: -------------------------------------------------------------------------------- 1 | 2 | ## Training new models over noisy RAW data 3 | Download the [ImageNet][in] (training) dataset. 4 | As described in the supplemental document, 5 | our joint models were trained in two stages. 6 | In the first stage, we train the Anscombe 7 | and MobileNet components separately on ImageNet. 8 | In this stage, we use the L1 norm to train the 9 | Anscombe networks. In the second stage, 10 | the joint (MobileNet + Anscombe) model is trained using only 11 | the high-level (classification) 12 | loss and the checkpoints obtained in the first stage. 13 | To facilitate training new models, we provide the 14 | checkpoints obtained from the first stage. The checkpoints 15 | can be downloaded following the instructions in 16 | [EVALUATION_INSTRUCTIONS.md](EVALUATION_INSTRUCTIONS.md).
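For intuition, the Anscombe transform referred to above is a variance-stabilizing transform: it maps (approximately Poisson-distributed) photon counts to values with roughly unit-variance Gaussian noise, which is what makes the first-stage L1 training of the denoising component well-behaved. The snippet below is only a minimal NumPy sketch of the textbook transform and its algebraic inverse; the generalized variant actually learned/used by the networks in this repository may differ, so treat it as a reference rather than the project's implementation.

```
import numpy as np

def anscombe(x):
    # Textbook Anscombe transform: Poisson(x) -> approximately unit-variance Gaussian.
    return 2.0 * np.sqrt(x + 3.0 / 8.0)

def inverse_anscombe(y):
    # Simple algebraic inverse; unbiased inverses are usually preferred in practice.
    return (y / 2.0) ** 2 - 3.0 / 8.0
```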
17 | 18 | ## Generating TFRecords for training 19 | In order to generate TFRecord files for training, 20 | run the `build_imagenet_data.py` script in the `datasets` 21 | folder: 22 | 23 | ``` 24 | cd datasets 25 | python build_imagenet_data.py --train_directory=$IMAGENET_TRAIN_DIR \ 26 | --output_directory=$OUT_DIR \ 27 | --num_threads 8 28 | ``` 29 | where `$IMAGENET_TRAIN_DIR` is the path to the ImageNet training dataset, 30 | `$OUT_DIR` is the path to the directory where the TFRecord files will 31 | be exported, and `--num_threads` defines the number of threads used to 32 | preprocess the images. 33 | 34 | 35 | 36 | ## Training command example 37 | In order to train our proposed joint architecture 38 | on a `6lux` noise profile, run: 39 | 40 | ``` 41 | python train_image_classifier.py --train_dir=$TRAIN_DIR \ 42 | --dataset_dir=$IMAGENET_TFRECORDS --ll_low=0.003 \ 43 | --ll_high=0.003 --batch_size=256 --model_name=mobilenet_isp --num_readers=8 \ 44 | --num_preprocessing_threads=8 --isp_checkpoint_path=$CHECKPOINTS/multires128/6lux/model.ckpt-27000 \ 45 | --checkpoint_path=$CHECKPOINTS/mobilenet_v1_128/mobilenet_v1_1.0_128.ckpt --noise_channel=True \ 46 | --use_anscombe=True --num_clones=2 --isp_model_name=isp --num_iters=1 --device=0,1 \ 47 | --learning_rate=0.00045 --num_epochs_per_decay=2 --train_image_size=128 48 | ``` 49 | where `$IMAGENET_TFRECORDS` is set to the directory with the ImageNet TFRecords, and `$CHECKPOINTS` is set to the downloaded checkpoints directory. The parameters `--checkpoint_path` 50 | and `--isp_checkpoint_path` are set to the checkpoints obtained in the first training stage. 51 | For training over other noise profiles, see 52 | `run_train_joint_models.sh`. For more details about the specific training parameters, 53 | see the main manuscript and supplemental document. To visualise the training 54 | progress, run `tensorboard --logdir=$TRAIN_DIR`. 55 | 56 | [in]: http://image-net.org/index 57 | 58 | -------------------------------------------------------------------------------- /datasets/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /datasets/beyond_gauss.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Provides data for the Cifar10 dataset.
16 | 17 | The dataset scripts used to create the dataset can be found at: 18 | tensorflow/models/slim/data/create_cifar10_dataset.py 19 | """ 20 | 21 | from __future__ import absolute_import 22 | from __future__ import division 23 | from __future__ import print_function 24 | 25 | import os 26 | import tensorflow as tf 27 | 28 | from datasets import dataset_utils 29 | 30 | slim = tf.contrib.slim 31 | 32 | _FILE_PATTERN = 'beyond_gauss_%s.tfrecord' 33 | 34 | # for bw patches 35 | # SPLITS_TO_SIZES = {'train': 212096, 'test': 10000} 36 | # for color patches 37 | SPLITS_TO_SIZES = {'train': 252416, 'test': 10000} 38 | _ITEMS_TO_DESCRIPTIONS = { 39 | 'input': 'The input image for the model', 40 | 'ground_truth': 'The ground truth image to regress on.', 41 | } 42 | 43 | 44 | def get_split(split_name, dataset_dir, file_pattern=None, reader=None): 45 | """Gets a dataset tuple with instructions for reading cifar10. 46 | 47 | Args: 48 | split_name: A train/test split name. 49 | dataset_dir: The base directory of the dataset sources. 50 | file_pattern: The file pattern to use when matching the dataset sources. 51 | It is assumed that the pattern contains a '%s' string so that the split 52 | name can be inserted. 53 | reader: The TensorFlow reader type. 54 | 55 | Returns: 56 | A `Dataset` namedtuple. 57 | 58 | Raises: 59 | ValueError: if `split_name` is not a valid train/test split. 60 | """ 61 | if split_name not in SPLITS_TO_SIZES: 62 | raise ValueError('split name %s was not recognized.' % split_name) 63 | 64 | if not file_pattern: 65 | file_pattern = _FILE_PATTERN 66 | file_pattern = os.path.join(dataset_dir, file_pattern % split_name) 67 | 68 | # Allowing None in the signature so that dataset_factory can use the default. 69 | if not reader: 70 | reader = tf.TFRecordReader 71 | 72 | keys_to_features = { 73 | 'input_img/encoded': tf.FixedLenFeature((), tf.string, default_value=''), 74 | 'gt_img/encoded': tf.FixedLenFeature((), tf.string, default_value=''), 75 | 'image/format': tf.FixedLenFeature((), tf.string, default_value='png'), 76 | } 77 | 78 | items_to_handlers = { 79 | 'input': slim.tfexample_decoder.Image('input_img/encoded', format_key='image/format'), 80 | 'ground_truth': slim.tfexample_decoder.Image('gt_img/encoded', format_key='image/format'), 81 | } 82 | 83 | decoder = slim.tfexample_decoder.TFExampleDecoder( 84 | keys_to_features, items_to_handlers) 85 | 86 | return slim.dataset.Dataset( 87 | data_sources=file_pattern, 88 | reader=reader, 89 | decoder=decoder, 90 | num_samples=SPLITS_TO_SIZES[split_name], 91 | items_to_descriptions=_ITEMS_TO_DESCRIPTIONS) 92 | -------------------------------------------------------------------------------- /datasets/classes.py: -------------------------------------------------------------------------------- 1 | import os 2 | from glob import glob 3 | from distutils.dir_util import copy_tree 4 | 5 | base = '/media/data/dirty_pix_v3/validation_RAW/' 6 | root = base + 'RAW_human_ISO8000_EXP10000' 7 | target = base + 'RAW_synset_ISO8000_EXP10000' 8 | 9 | human_labels = glob(os.path.join(root,'*/')) 10 | human_labels = [label.split('/')[-2] for label in human_labels] 11 | print(human_labels) 12 | 13 | human_to_synset = {} 14 | with open('raw_metadata.txt', 'r') as synset_human_file: 15 | for line in synset_human_file: 16 | synset = line[:9] 17 | human = line[9:].strip().lower() 18 | for label in human_labels: 19 | for match in human.split(','): 20 | if label.strip() == match.strip().lower(): 21 | human_to_synset[label] = synset 22 | 23 | 
print(human_to_synset) 24 | missing = False 25 | for h in human_labels: 26 | if h not in human_to_synset: 27 | print(h) 28 | missing = True 29 | if missing: 30 | print("Missing synsets!") 31 | else: 32 | print("All synsets mapped!") 33 | 34 | #print len(human_labels) 35 | #print len(human_to_synset) 36 | 37 | all_dirs = glob(os.path.join(root,'*/')) 38 | for subdir in all_dirs: 39 | no_imgs = len(glob(os.path.join(subdir, '*.dng'))) 40 | if not no_imgs: 41 | print(subdir + " is empty") 42 | continue 43 | 44 | subdir = subdir[len(root)+1:-1] 45 | print(subdir) 46 | 47 | if subdir not in human_to_synset: 48 | print("Skipping %s"%subdir) 49 | continue 50 | 51 | print("Copying %d files from class %s"%(no_imgs, subdir)) 52 | 53 | synset = human_to_synset[subdir] 54 | new_dir = os.path.join(target, synset) 55 | old_dir = os.path.join(root,subdir) 56 | copy_tree(old_dir, new_dir) 57 | -------------------------------------------------------------------------------- /datasets/dataset_factory.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """A factory-pattern class which returns classification image/label pairs.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | from datasets import imagenet 22 | from datasets import imnet_reg 23 | from datasets import beyond_gauss 24 | from datasets import raw 25 | 26 | datasets_map = { 27 | 'imnet_reg': imnet_reg, 28 | 'imagenet': imagenet, 29 | 'beyond_gauss': beyond_gauss, 30 | 'raw': raw, 31 | } 32 | 33 | 34 | def get_dataset(name, split_name, dataset_dir, file_pattern=None, reader=None): 35 | """Given a dataset name and a split_name returns a Dataset. 36 | 37 | Args: 38 | name: String, the name of the dataset. 39 | split_name: A train/test split name. 40 | dataset_dir: The directory where the dataset files are stored. 41 | file_pattern: The file pattern to use for matching the dataset source files. 42 | reader: The subclass of tf.ReaderBase. If left as `None`, then the default 43 | reader defined by each dataset is used. 44 | 45 | Returns: 46 | A `Dataset` class. 47 | 48 | Raises: 49 | ValueError: If the dataset `name` is unknown. 50 | """ 51 | if name not in datasets_map: 52 | raise ValueError('Name of dataset unknown %s' % name) 53 | return datasets_map[name].get_split( 54 | split_name, 55 | dataset_dir, 56 | file_pattern, 57 | reader) 58 | -------------------------------------------------------------------------------- /datasets/dataset_utils.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Contains utilities for downloading and converting datasets.""" 16 | from __future__ import absolute_import 17 | from __future__ import division 18 | from __future__ import print_function 19 | 20 | import os 21 | import sys 22 | import tarfile 23 | 24 | from six.moves import urllib 25 | import tensorflow as tf 26 | 27 | LABELS_FILENAME = 'labels.txt' 28 | 29 | 30 | def int64_feature(values): 31 | """Returns a TF-Feature of int64s. 32 | 33 | Args: 34 | values: A scalar or list of values. 35 | 36 | Returns: 37 | a TF-Feature. 38 | """ 39 | if not isinstance(values, (tuple, list)): 40 | values = [values] 41 | return tf.train.Feature(int64_list=tf.train.Int64List(value=values)) 42 | 43 | 44 | def bytes_feature(values): 45 | """Returns a TF-Feature of bytes. 46 | 47 | Args: 48 | values: A string. 49 | 50 | Returns: 51 | a TF-Feature. 52 | """ 53 | return tf.train.Feature(bytes_list=tf.train.BytesList(value=[values])) 54 | 55 | 56 | def image_to_tfexample(image_data, image_format, height, width, class_id): 57 | return tf.train.Example(features=tf.train.Features(feature={ 58 | 'image/encoded': bytes_feature(image_data), 59 | 'image/format': bytes_feature(image_format), 60 | 'image/class/label': int64_feature(class_id), 61 | 'image/height': int64_feature(height), 62 | 'image/width': int64_feature(width), 63 | })) 64 | 65 | 66 | def image_to_tfexample_for_regression(input_img, gt_img, image_format, height, width): 67 | return tf.train.Example(features=tf.train.Features(feature={ 68 | 'input_img/encoded': bytes_feature(input_img), 69 | 'gt_img/encoded': bytes_feature(gt_img), 70 | 'imgs/format': bytes_feature(image_format), 71 | 'imgs/height': int64_feature(height), 72 | 'imgs/width': int64_feature(width), 73 | })) 74 | 75 | 76 | def download_and_uncompress_tarball(tarball_url, dataset_dir): 77 | """Downloads the `tarball_url` and uncompresses it locally. 78 | 79 | Args: 80 | tarball_url: The URL of a tarball file. 81 | dataset_dir: The directory where the temporary files are stored. 82 | """ 83 | filename = tarball_url.split('/')[-1] 84 | filepath = os.path.join(dataset_dir, filename) 85 | 86 | def _progress(count, block_size, total_size): 87 | sys.stdout.write('\r>> Downloading %s %.1f%%' % ( 88 | filename, float(count * block_size) / float(total_size) * 100.0)) 89 | sys.stdout.flush() 90 | filepath, _ = urllib.request.urlretrieve(tarball_url, filepath, _progress) 91 | print() 92 | statinfo = os.stat(filepath) 93 | print('Successfully downloaded', filename, statinfo.st_size, 'bytes.') 94 | tarfile.open(filepath, 'r:gz').extractall(dataset_dir) 95 | 96 | 97 | def write_label_file(labels_to_class_names, dataset_dir, 98 | filename=LABELS_FILENAME): 99 | """Writes a file with the list of class names. 100 | 101 | Args: 102 | labels_to_class_names: A map of (integer) labels to class names. 
103 | dataset_dir: The directory in which the labels file should be written. 104 | filename: The filename where the class names are written. 105 | """ 106 | labels_filename = os.path.join(dataset_dir, filename) 107 | with tf.gfile.Open(labels_filename, 'w') as f: 108 | for label in labels_to_class_names: 109 | class_name = labels_to_class_names[label] 110 | f.write('%d:%s\n' % (label, class_name)) 111 | 112 | 113 | def has_labels(dataset_dir, filename=LABELS_FILENAME): 114 | """Specifies whether or not the dataset directory contains a label map file. 115 | 116 | Args: 117 | dataset_dir: The directory in which the labels file is found. 118 | filename: The filename where the class names are written. 119 | 120 | Returns: 121 | `True` if the labels file exists and `False` otherwise. 122 | """ 123 | return tf.gfile.Exists(os.path.join(dataset_dir, filename)) 124 | 125 | 126 | def read_label_file(dataset_dir, filename=LABELS_FILENAME): 127 | """Reads the labels file and returns a mapping from ID to class name. 128 | 129 | Args: 130 | dataset_dir: The directory in which the labels file is found. 131 | filename: The filename where the class names are written. 132 | 133 | Returns: 134 | A map from a label (integer) to class name. 135 | """ 136 | labels_filename = os.path.join(dataset_dir, filename) 137 | with tf.gfile.Open(labels_filename, 'r') as f: 138 | lines = f.read() #f.read().decode() 139 | lines = lines.split('\n') 140 | print('lines ', lines) 141 | lines = filter(None, lines) 142 | 143 | labels_to_class_names = {} 144 | for line in lines: 145 | index = line.index(':') 146 | labels_to_class_names[int(line[:index])] = line[index+1:] 147 | return labels_to_class_names 148 | -------------------------------------------------------------------------------- /datasets/imagenet.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Provides data for the ImageNet ILSVRC 2012 Dataset plus some bounding boxes. 16 | 17 | Some images have one or more bounding boxes associated with the label of the 18 | image. See details here: http://image-net.org/download-bboxes 19 | 20 | ImageNet is based upon WordNet 3.0. To uniquely identify a synset, we use 21 | "WordNet ID" (wnid), which is a concatenation of POS ( i.e. part of speech ) 22 | and SYNSET OFFSET of WordNet. For more information, please refer to the 23 | WordNet documentation[http://wordnet.princeton.edu/wordnet/documentation/]. 24 | 25 | "There are bounding boxes for over 3000 popular synsets available. 26 | For each synset, there are on average 150 images with bounding boxes." 27 | 28 | WARNING: Don't use for object detection, in this case all the bounding boxes 29 | of the image belong to just one class. 
30 | """ 31 | from __future__ import absolute_import 32 | from __future__ import division 33 | from __future__ import print_function 34 | 35 | import os 36 | from six.moves import urllib 37 | import tensorflow as tf 38 | 39 | from datasets import dataset_utils 40 | 41 | slim = tf.contrib.slim 42 | 43 | # TODO(nsilberman): Add tfrecord file type once the script is updated. 44 | _FILE_PATTERN = '%s-*' 45 | 46 | _SPLITS_TO_SIZES = { 47 | 'train': 600000,#1281167, 48 | 'validation': 50000, 49 | } 50 | 51 | _ITEMS_TO_DESCRIPTIONS = { 52 | 'image': 'A color image of varying height and width.', 53 | 'label': 'The label id of the image, integer between 0 and 999', 54 | 'label_text': 'The text of the label.', 55 | 'object/bbox': 'A list of bounding boxes.', 56 | 'object/label': 'A list of labels, one per each object.', 57 | } 58 | 59 | _NUM_CLASSES = 1001 60 | 61 | 62 | def create_readable_names_for_imagenet_labels(): 63 | """Create a dict mapping label id to human readable string. 64 | 65 | Returns: 66 | labels_to_names: dictionary where keys are integers from to 1000 67 | and values are human-readable names. 68 | 69 | We retrieve a synset file, which contains a list of valid synset labels used 70 | by ILSVRC competition. There is one synset one per line, eg. 71 | # n01440764 72 | # n01443537 73 | We also retrieve a synset_to_human_file, which contains a mapping from synsets 74 | to human-readable names for every synset in Imagenet. These are stored in a 75 | tsv format, as follows: 76 | # n02119247 black fox 77 | # n02119359 silver fox 78 | We assign each synset (in alphabetical order) an integer, starting from 1 79 | (since 0 is reserved for the background class). 80 | 81 | Code is based on 82 | https://github.com/tensorflow/models/blob/master/inception/inception/data/build_imagenet_data.py#L463 83 | """ 84 | 85 | # pylint: disable=g-line-too-long 86 | base_url = 'https://raw.githubusercontent.com/tensorflow/models/master/inception/inception/data/' 87 | synset_url = '{}/imagenet_lsvrc_2015_synsets.txt'.format(base_url) 88 | synset_to_human_url = '{}/imagenet_metadata.txt'.format(base_url) 89 | 90 | #filename, _ = urllib.request.urlretrieve(synset_url) 91 | filename = './datasets/imagenet_lsvrc_2015_synsets.txt' 92 | synset_list = [s.strip() for s in open(filename).readlines()] 93 | num_synsets_in_ilsvrc = len(synset_list) 94 | assert num_synsets_in_ilsvrc == 1000 95 | 96 | #filename, _ = urllib.request.urlretrieve(synset_to_human_url) 97 | filename = './datasets/imagenet_metadata.txt' 98 | synset_to_human_list = open(filename).readlines() 99 | num_synsets_in_all_imagenet = len(synset_to_human_list) 100 | assert num_synsets_in_all_imagenet == 21842 101 | 102 | synset_to_human = {} 103 | for s in synset_to_human_list: 104 | parts = s.strip().split('\t') 105 | assert len(parts) == 2 106 | synset = parts[0] 107 | human = parts[1] 108 | synset_to_human[synset] = human 109 | 110 | label_index = 1 111 | labels_to_names = {0: 'background'} 112 | for synset in synset_list: 113 | name = synset_to_human[synset] 114 | labels_to_names[label_index] = name 115 | label_index += 1 116 | 117 | return labels_to_names 118 | 119 | 120 | def get_split(split_name, dataset_dir, file_pattern=None, reader=None): 121 | """Gets a dataset tuple with instructions for reading ImageNet. 122 | 123 | Args: 124 | split_name: A train/test split name. 125 | dataset_dir: The base directory of the dataset sources. 126 | file_pattern: The file pattern to use when matching the dataset sources. 
127 | It is assumed that the pattern contains a '%s' string so that the split 128 | name can be inserted. 129 | reader: The TensorFlow reader type. 130 | 131 | Returns: 132 | A `Dataset` namedtuple. 133 | 134 | Raises: 135 | ValueError: if `split_name` is not a valid train/test split. 136 | """ 137 | if split_name not in _SPLITS_TO_SIZES: 138 | raise ValueError('split name %s was not recognized.' % split_name) 139 | 140 | if not file_pattern: 141 | file_pattern = _FILE_PATTERN 142 | file_pattern = os.path.join(dataset_dir, file_pattern % split_name) 143 | 144 | # Allowing None in the signature so that dataset_factory can use the default. 145 | if reader is None: 146 | reader = tf.TFRecordReader 147 | 148 | keys_to_features = { 149 | 'image/encoded': tf.FixedLenFeature( 150 | (), tf.string, default_value=''), 151 | 'image/format': tf.FixedLenFeature( 152 | (), tf.string, default_value='jpeg'), 153 | 'image/class/label': tf.FixedLenFeature( 154 | [], dtype=tf.int64, default_value=-1), 155 | 'image/class/text': tf.FixedLenFeature( 156 | [], dtype=tf.string, default_value=''), 157 | 'image/object/bbox/xmin': tf.VarLenFeature( 158 | dtype=tf.float32), 159 | 'image/object/bbox/ymin': tf.VarLenFeature( 160 | dtype=tf.float32), 161 | 'image/object/bbox/xmax': tf.VarLenFeature( 162 | dtype=tf.float32), 163 | 'image/object/bbox/ymax': tf.VarLenFeature( 164 | dtype=tf.float32), 165 | 'image/object/class/label': tf.VarLenFeature( 166 | dtype=tf.int64), 167 | } 168 | 169 | items_to_handlers = { 170 | 'image': slim.tfexample_decoder.Image('image/encoded', 'image/format'), 171 | 'label': slim.tfexample_decoder.Tensor('image/class/label'), 172 | 'label_text': slim.tfexample_decoder.Tensor('image/class/text'), 173 | 'object/bbox': slim.tfexample_decoder.BoundingBox( 174 | ['ymin', 'xmin', 'ymax', 'xmax'], 'image/object/bbox/'), 175 | 'object/label': slim.tfexample_decoder.Tensor('image/object/class/label'), 176 | } 177 | 178 | decoder = slim.tfexample_decoder.TFExampleDecoder( 179 | keys_to_features, items_to_handlers) 180 | 181 | labels_to_names = None 182 | if dataset_utils.has_labels(dataset_dir): 183 | labels_to_names = dataset_utils.read_label_file(dataset_dir) 184 | else: 185 | labels_to_names = create_readable_names_for_imagenet_labels() 186 | dataset_utils.write_label_file(labels_to_names, dataset_dir) 187 | 188 | return slim.dataset.Dataset( 189 | data_sources=file_pattern, 190 | reader=reader, 191 | decoder=decoder, 192 | num_samples=_SPLITS_TO_SIZES[split_name], 193 | items_to_descriptions=_ITEMS_TO_DESCRIPTIONS, 194 | num_classes=_NUM_CLASSES, 195 | labels_to_names=labels_to_names) 196 | -------------------------------------------------------------------------------- /datasets/imagenet_lsvrc_2015_synsets.txt: -------------------------------------------------------------------------------- 1 | n01440764 2 | n01443537 3 | n01484850 4 | n01491361 5 | n01494475 6 | n01496331 7 | n01498041 8 | n01514668 9 | n01514859 10 | n01518878 11 | n01530575 12 | n01531178 13 | n01532829 14 | n01534433 15 | n01537544 16 | n01558993 17 | n01560419 18 | n01580077 19 | n01582220 20 | n01592084 21 | n01601694 22 | n01608432 23 | n01614925 24 | n01616318 25 | n01622779 26 | n01629819 27 | n01630670 28 | n01631663 29 | n01632458 30 | n01632777 31 | n01641577 32 | n01644373 33 | n01644900 34 | n01664065 35 | n01665541 36 | n01667114 37 | n01667778 38 | n01669191 39 | n01675722 40 | n01677366 41 | n01682714 42 | n01685808 43 | n01687978 44 | n01688243 45 | n01689811 46 | n01692333 47 | n01693334 48 | n01694178 49 | 
n01695060 50 | n01697457 51 | n01698640 52 | n01704323 53 | n01728572 54 | n01728920 55 | n01729322 56 | n01729977 57 | n01734418 58 | n01735189 59 | n01737021 60 | n01739381 61 | n01740131 62 | n01742172 63 | n01744401 64 | n01748264 65 | n01749939 66 | n01751748 67 | n01753488 68 | n01755581 69 | n01756291 70 | n01768244 71 | n01770081 72 | n01770393 73 | n01773157 74 | n01773549 75 | n01773797 76 | n01774384 77 | n01774750 78 | n01775062 79 | n01776313 80 | n01784675 81 | n01795545 82 | n01796340 83 | n01797886 84 | n01798484 85 | n01806143 86 | n01806567 87 | n01807496 88 | n01817953 89 | n01818515 90 | n01819313 91 | n01820546 92 | n01824575 93 | n01828970 94 | n01829413 95 | n01833805 96 | n01843065 97 | n01843383 98 | n01847000 99 | n01855032 100 | n01855672 101 | n01860187 102 | n01871265 103 | n01872401 104 | n01873310 105 | n01877812 106 | n01882714 107 | n01883070 108 | n01910747 109 | n01914609 110 | n01917289 111 | n01924916 112 | n01930112 113 | n01943899 114 | n01944390 115 | n01945685 116 | n01950731 117 | n01955084 118 | n01968897 119 | n01978287 120 | n01978455 121 | n01980166 122 | n01981276 123 | n01983481 124 | n01984695 125 | n01985128 126 | n01986214 127 | n01990800 128 | n02002556 129 | n02002724 130 | n02006656 131 | n02007558 132 | n02009229 133 | n02009912 134 | n02011460 135 | n02012849 136 | n02013706 137 | n02017213 138 | n02018207 139 | n02018795 140 | n02025239 141 | n02027492 142 | n02028035 143 | n02033041 144 | n02037110 145 | n02051845 146 | n02056570 147 | n02058221 148 | n02066245 149 | n02071294 150 | n02074367 151 | n02077923 152 | n02085620 153 | n02085782 154 | n02085936 155 | n02086079 156 | n02086240 157 | n02086646 158 | n02086910 159 | n02087046 160 | n02087394 161 | n02088094 162 | n02088238 163 | n02088364 164 | n02088466 165 | n02088632 166 | n02089078 167 | n02089867 168 | n02089973 169 | n02090379 170 | n02090622 171 | n02090721 172 | n02091032 173 | n02091134 174 | n02091244 175 | n02091467 176 | n02091635 177 | n02091831 178 | n02092002 179 | n02092339 180 | n02093256 181 | n02093428 182 | n02093647 183 | n02093754 184 | n02093859 185 | n02093991 186 | n02094114 187 | n02094258 188 | n02094433 189 | n02095314 190 | n02095570 191 | n02095889 192 | n02096051 193 | n02096177 194 | n02096294 195 | n02096437 196 | n02096585 197 | n02097047 198 | n02097130 199 | n02097209 200 | n02097298 201 | n02097474 202 | n02097658 203 | n02098105 204 | n02098286 205 | n02098413 206 | n02099267 207 | n02099429 208 | n02099601 209 | n02099712 210 | n02099849 211 | n02100236 212 | n02100583 213 | n02100735 214 | n02100877 215 | n02101006 216 | n02101388 217 | n02101556 218 | n02102040 219 | n02102177 220 | n02102318 221 | n02102480 222 | n02102973 223 | n02104029 224 | n02104365 225 | n02105056 226 | n02105162 227 | n02105251 228 | n02105412 229 | n02105505 230 | n02105641 231 | n02105855 232 | n02106030 233 | n02106166 234 | n02106382 235 | n02106550 236 | n02106662 237 | n02107142 238 | n02107312 239 | n02107574 240 | n02107683 241 | n02107908 242 | n02108000 243 | n02108089 244 | n02108422 245 | n02108551 246 | n02108915 247 | n02109047 248 | n02109525 249 | n02109961 250 | n02110063 251 | n02110185 252 | n02110341 253 | n02110627 254 | n02110806 255 | n02110958 256 | n02111129 257 | n02111277 258 | n02111500 259 | n02111889 260 | n02112018 261 | n02112137 262 | n02112350 263 | n02112706 264 | n02113023 265 | n02113186 266 | n02113624 267 | n02113712 268 | n02113799 269 | n02113978 270 | n02114367 271 | n02114548 272 | n02114712 273 | n02114855 274 | 
n02115641 275 | n02115913 276 | n02116738 277 | n02117135 278 | n02119022 279 | n02119789 280 | n02120079 281 | n02120505 282 | n02123045 283 | n02123159 284 | n02123394 285 | n02123597 286 | n02124075 287 | n02125311 288 | n02127052 289 | n02128385 290 | n02128757 291 | n02128925 292 | n02129165 293 | n02129604 294 | n02130308 295 | n02132136 296 | n02133161 297 | n02134084 298 | n02134418 299 | n02137549 300 | n02138441 301 | n02165105 302 | n02165456 303 | n02167151 304 | n02168699 305 | n02169497 306 | n02172182 307 | n02174001 308 | n02177972 309 | n02190166 310 | n02206856 311 | n02219486 312 | n02226429 313 | n02229544 314 | n02231487 315 | n02233338 316 | n02236044 317 | n02256656 318 | n02259212 319 | n02264363 320 | n02268443 321 | n02268853 322 | n02276258 323 | n02277742 324 | n02279972 325 | n02280649 326 | n02281406 327 | n02281787 328 | n02317335 329 | n02319095 330 | n02321529 331 | n02325366 332 | n02326432 333 | n02328150 334 | n02342885 335 | n02346627 336 | n02356798 337 | n02361337 338 | n02363005 339 | n02364673 340 | n02389026 341 | n02391049 342 | n02395406 343 | n02396427 344 | n02397096 345 | n02398521 346 | n02403003 347 | n02408429 348 | n02410509 349 | n02412080 350 | n02415577 351 | n02417914 352 | n02422106 353 | n02422699 354 | n02423022 355 | n02437312 356 | n02437616 357 | n02441942 358 | n02442845 359 | n02443114 360 | n02443484 361 | n02444819 362 | n02445715 363 | n02447366 364 | n02454379 365 | n02457408 366 | n02480495 367 | n02480855 368 | n02481823 369 | n02483362 370 | n02483708 371 | n02484975 372 | n02486261 373 | n02486410 374 | n02487347 375 | n02488291 376 | n02488702 377 | n02489166 378 | n02490219 379 | n02492035 380 | n02492660 381 | n02493509 382 | n02493793 383 | n02494079 384 | n02497673 385 | n02500267 386 | n02504013 387 | n02504458 388 | n02509815 389 | n02510455 390 | n02514041 391 | n02526121 392 | n02536864 393 | n02606052 394 | n02607072 395 | n02640242 396 | n02641379 397 | n02643566 398 | n02655020 399 | n02666196 400 | n02667093 401 | n02669723 402 | n02672831 403 | n02676566 404 | n02687172 405 | n02690373 406 | n02692877 407 | n02699494 408 | n02701002 409 | n02704792 410 | n02708093 411 | n02727426 412 | n02730930 413 | n02747177 414 | n02749479 415 | n02769748 416 | n02776631 417 | n02777292 418 | n02782093 419 | n02783161 420 | n02786058 421 | n02787622 422 | n02788148 423 | n02790996 424 | n02791124 425 | n02791270 426 | n02793495 427 | n02794156 428 | n02795169 429 | n02797295 430 | n02799071 431 | n02802426 432 | n02804414 433 | n02804610 434 | n02807133 435 | n02808304 436 | n02808440 437 | n02814533 438 | n02814860 439 | n02815834 440 | n02817516 441 | n02823428 442 | n02823750 443 | n02825657 444 | n02834397 445 | n02835271 446 | n02837789 447 | n02840245 448 | n02841315 449 | n02843684 450 | n02859443 451 | n02860847 452 | n02865351 453 | n02869837 454 | n02870880 455 | n02871525 456 | n02877765 457 | n02879718 458 | n02883205 459 | n02892201 460 | n02892767 461 | n02894605 462 | n02895154 463 | n02906734 464 | n02909870 465 | n02910353 466 | n02916936 467 | n02917067 468 | n02927161 469 | n02930766 470 | n02939185 471 | n02948072 472 | n02950826 473 | n02951358 474 | n02951585 475 | n02963159 476 | n02965783 477 | n02966193 478 | n02966687 479 | n02971356 480 | n02974003 481 | n02977058 482 | n02978881 483 | n02979186 484 | n02980441 485 | n02981792 486 | n02988304 487 | n02992211 488 | n02992529 489 | n02999410 490 | n03000134 491 | n03000247 492 | n03000684 493 | n03014705 494 | n03016953 495 | n03017168 496 | 
n03018349 497 | n03026506 498 | n03028079 499 | n03032252 500 | n03041632 501 | n03042490 502 | n03045698 503 | n03047690 504 | n03062245 505 | n03063599 506 | n03063689 507 | n03065424 508 | n03075370 509 | n03085013 510 | n03089624 511 | n03095699 512 | n03100240 513 | n03109150 514 | n03110669 515 | n03124043 516 | n03124170 517 | n03125729 518 | n03126707 519 | n03127747 520 | n03127925 521 | n03131574 522 | n03133878 523 | n03134739 524 | n03141823 525 | n03146219 526 | n03160309 527 | n03179701 528 | n03180011 529 | n03187595 530 | n03188531 531 | n03196217 532 | n03197337 533 | n03201208 534 | n03207743 535 | n03207941 536 | n03208938 537 | n03216828 538 | n03218198 539 | n03220513 540 | n03223299 541 | n03240683 542 | n03249569 543 | n03250847 544 | n03255030 545 | n03259280 546 | n03271574 547 | n03272010 548 | n03272562 549 | n03290653 550 | n03291819 551 | n03297495 552 | n03314780 553 | n03325584 554 | n03337140 555 | n03344393 556 | n03345487 557 | n03347037 558 | n03355925 559 | n03372029 560 | n03376595 561 | n03379051 562 | n03384352 563 | n03388043 564 | n03388183 565 | n03388549 566 | n03393912 567 | n03394916 568 | n03400231 569 | n03404251 570 | n03417042 571 | n03424325 572 | n03425413 573 | n03443371 574 | n03444034 575 | n03445777 576 | n03445924 577 | n03447447 578 | n03447721 579 | n03450230 580 | n03452741 581 | n03457902 582 | n03459775 583 | n03461385 584 | n03467068 585 | n03476684 586 | n03476991 587 | n03478589 588 | n03481172 589 | n03482405 590 | n03483316 591 | n03485407 592 | n03485794 593 | n03492542 594 | n03494278 595 | n03495258 596 | n03496892 597 | n03498962 598 | n03527444 599 | n03529860 600 | n03530642 601 | n03532672 602 | n03534580 603 | n03535780 604 | n03538406 605 | n03544143 606 | n03584254 607 | n03584829 608 | n03590841 609 | n03594734 610 | n03594945 611 | n03595614 612 | n03598930 613 | n03599486 614 | n03602883 615 | n03617480 616 | n03623198 617 | n03627232 618 | n03630383 619 | n03633091 620 | n03637318 621 | n03642806 622 | n03649909 623 | n03657121 624 | n03658185 625 | n03661043 626 | n03662601 627 | n03666591 628 | n03670208 629 | n03673027 630 | n03676483 631 | n03680355 632 | n03690938 633 | n03691459 634 | n03692522 635 | n03697007 636 | n03706229 637 | n03709823 638 | n03710193 639 | n03710637 640 | n03710721 641 | n03717622 642 | n03720891 643 | n03721384 644 | n03724870 645 | n03729826 646 | n03733131 647 | n03733281 648 | n03733805 649 | n03742115 650 | n03743016 651 | n03759954 652 | n03761084 653 | n03763968 654 | n03764736 655 | n03769881 656 | n03770439 657 | n03770679 658 | n03773504 659 | n03775071 660 | n03775546 661 | n03776460 662 | n03777568 663 | n03777754 664 | n03781244 665 | n03782006 666 | n03785016 667 | n03786901 668 | n03787032 669 | n03788195 670 | n03788365 671 | n03791053 672 | n03792782 673 | n03792972 674 | n03793489 675 | n03794056 676 | n03796401 677 | n03803284 678 | n03804744 679 | n03814639 680 | n03814906 681 | n03825788 682 | n03832673 683 | n03837869 684 | n03838899 685 | n03840681 686 | n03841143 687 | n03843555 688 | n03854065 689 | n03857828 690 | n03866082 691 | n03868242 692 | n03868863 693 | n03871628 694 | n03873416 695 | n03874293 696 | n03874599 697 | n03876231 698 | n03877472 699 | n03877845 700 | n03884397 701 | n03887697 702 | n03888257 703 | n03888605 704 | n03891251 705 | n03891332 706 | n03895866 707 | n03899768 708 | n03902125 709 | n03903868 710 | n03908618 711 | n03908714 712 | n03916031 713 | n03920288 714 | n03924679 715 | n03929660 716 | n03929855 717 | n03930313 718 | 
n03930630 719 | n03933933 720 | n03935335 721 | n03937543 722 | n03938244 723 | n03942813 724 | n03944341 725 | n03947888 726 | n03950228 727 | n03954731 728 | n03956157 729 | n03958227 730 | n03961711 731 | n03967562 732 | n03970156 733 | n03976467 734 | n03976657 735 | n03977966 736 | n03980874 737 | n03982430 738 | n03983396 739 | n03991062 740 | n03992509 741 | n03995372 742 | n03998194 743 | n04004767 744 | n04005630 745 | n04008634 746 | n04009552 747 | n04019541 748 | n04023962 749 | n04026417 750 | n04033901 751 | n04033995 752 | n04037443 753 | n04039381 754 | n04040759 755 | n04041544 756 | n04044716 757 | n04049303 758 | n04065272 759 | n04067472 760 | n04069434 761 | n04070727 762 | n04074963 763 | n04081281 764 | n04086273 765 | n04090263 766 | n04099969 767 | n04111531 768 | n04116512 769 | n04118538 770 | n04118776 771 | n04120489 772 | n04125021 773 | n04127249 774 | n04131690 775 | n04133789 776 | n04136333 777 | n04141076 778 | n04141327 779 | n04141975 780 | n04146614 781 | n04147183 782 | n04149813 783 | n04152593 784 | n04153751 785 | n04154565 786 | n04162706 787 | n04179913 788 | n04192698 789 | n04200800 790 | n04201297 791 | n04204238 792 | n04204347 793 | n04208210 794 | n04209133 795 | n04209239 796 | n04228054 797 | n04229816 798 | n04235860 799 | n04238763 800 | n04239074 801 | n04243546 802 | n04251144 803 | n04252077 804 | n04252225 805 | n04254120 806 | n04254680 807 | n04254777 808 | n04258138 809 | n04259630 810 | n04263257 811 | n04264628 812 | n04265275 813 | n04266014 814 | n04270147 815 | n04273569 816 | n04275548 817 | n04277352 818 | n04285008 819 | n04286575 820 | n04296562 821 | n04310018 822 | n04311004 823 | n04311174 824 | n04317175 825 | n04325704 826 | n04326547 827 | n04328186 828 | n04330267 829 | n04332243 830 | n04335435 831 | n04336792 832 | n04344873 833 | n04346328 834 | n04347754 835 | n04350905 836 | n04355338 837 | n04355933 838 | n04356056 839 | n04357314 840 | n04366367 841 | n04367480 842 | n04370456 843 | n04371430 844 | n04371774 845 | n04372370 846 | n04376876 847 | n04380533 848 | n04389033 849 | n04392985 850 | n04398044 851 | n04399382 852 | n04404412 853 | n04409515 854 | n04417672 855 | n04418357 856 | n04423845 857 | n04428191 858 | n04429376 859 | n04435653 860 | n04442312 861 | n04443257 862 | n04447861 863 | n04456115 864 | n04458633 865 | n04461696 866 | n04462240 867 | n04465501 868 | n04467665 869 | n04476259 870 | n04479046 871 | n04482393 872 | n04483307 873 | n04485082 874 | n04486054 875 | n04487081 876 | n04487394 877 | n04493381 878 | n04501370 879 | n04505470 880 | n04507155 881 | n04509417 882 | n04515003 883 | n04517823 884 | n04522168 885 | n04523525 886 | n04525038 887 | n04525305 888 | n04532106 889 | n04532670 890 | n04536866 891 | n04540053 892 | n04542943 893 | n04548280 894 | n04548362 895 | n04550184 896 | n04552348 897 | n04553703 898 | n04554684 899 | n04557648 900 | n04560804 901 | n04562935 902 | n04579145 903 | n04579432 904 | n04584207 905 | n04589890 906 | n04590129 907 | n04591157 908 | n04591713 909 | n04592741 910 | n04596742 911 | n04597913 912 | n04599235 913 | n04604644 914 | n04606251 915 | n04612504 916 | n04613696 917 | n06359193 918 | n06596364 919 | n06785654 920 | n06794110 921 | n06874185 922 | n07248320 923 | n07565083 924 | n07579787 925 | n07583066 926 | n07584110 927 | n07590611 928 | n07613480 929 | n07614500 930 | n07615774 931 | n07684084 932 | n07693725 933 | n07695742 934 | n07697313 935 | n07697537 936 | n07711569 937 | n07714571 938 | n07714990 939 | n07715103 940 | 
n07716358 941 | n07716906 942 | n07717410 943 | n07717556 944 | n07718472 945 | n07718747 946 | n07720875 947 | n07730033 948 | n07734744 949 | n07742313 950 | n07745940 951 | n07747607 952 | n07749582 953 | n07753113 954 | n07753275 955 | n07753592 956 | n07754684 957 | n07760859 958 | n07768694 959 | n07802026 960 | n07831146 961 | n07836838 962 | n07860988 963 | n07871810 964 | n07873807 965 | n07875152 966 | n07880968 967 | n07892512 968 | n07920052 969 | n07930864 970 | n07932039 971 | n09193705 972 | n09229709 973 | n09246464 974 | n09256479 975 | n09288635 976 | n09332890 977 | n09399592 978 | n09421951 979 | n09428293 980 | n09468604 981 | n09472597 982 | n09835506 983 | n10148035 984 | n10565667 985 | n11879895 986 | n11939491 987 | n12057211 988 | n12144580 989 | n12267677 990 | n12620546 991 | n12768682 992 | n12985857 993 | n12998815 994 | n13037406 995 | n13040303 996 | n13044778 997 | n13052670 998 | n13054560 999 | n13133613 1000 | n15075141 1001 | -------------------------------------------------------------------------------- /datasets/imnet_reg.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Provides data for the ImageNet ILSVRC 2012 Dataset plus some bounding boxes. 16 | 17 | Some images have one or more bounding boxes associated with the label of the 18 | image. See details here: http://image-net.org/download-bboxes 19 | 20 | ImageNet is based upon WordNet 3.0. To uniquely identify a synset, we use 21 | "WordNet ID" (wnid), which is a concatenation of POS ( i.e. part of speech ) 22 | and SYNSET OFFSET of WordNet. For more information, please refer to the 23 | WordNet documentation[http://wordnet.princeton.edu/wordnet/documentation/]. 24 | 25 | "There are bounding boxes for over 3000 popular synsets available. 26 | For each synset, there are on average 150 images with bounding boxes." 27 | 28 | WARNING: Don't use for object detection, in this case all the bounding boxes 29 | of the image belong to just one class. 30 | """ 31 | from __future__ import absolute_import 32 | from __future__ import division 33 | from __future__ import print_function 34 | 35 | import os 36 | from six.moves import urllib 37 | import tensorflow as tf 38 | 39 | from datasets import dataset_utils 40 | 41 | slim = tf.contrib.slim 42 | 43 | # TODO(nsilberman): Add tfrecord file type once the script is updated. 
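# Note: in addition to the standard ImageNet fields, this provider also decodes paired
# 'input' and 'ground_truth' images (see items_to_handlers below), so a split can be read
# as input/target pairs for image-to-image regression training; the '%s-*' pattern simply
# globs every TFRecord shard whose filename starts with the split name.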
44 | _FILE_PATTERN = '%s-*' 45 | 46 | _SPLITS_TO_SIZES = { 47 | 'train': 1156644, 48 | 'validation': 50000, 49 | } 50 | 51 | _ITEMS_TO_DESCRIPTIONS = { 52 | 'input_img': 'A color image of varying height and width.', 53 | 'gt_img': 'A color image of varying height and width.', 54 | 'label': 'The label id of the image, integer between 0 and 999', 55 | 'label_text': 'The text of the label.', 56 | 'object/bbox': 'A list of bounding boxes.', 57 | 'object/label': 'A list of labels, one per each object.', 58 | } 59 | 60 | _NUM_CLASSES = 1001 61 | 62 | 63 | def create_readable_names_for_imagenet_labels(): 64 | """Create a dict mapping label id to human readable string. 65 | 66 | Returns: 67 | labels_to_names: dictionary where keys are integers from to 1000 68 | and values are human-readable names. 69 | 70 | We retrieve a synset file, which contains a list of valid synset labels used 71 | by ILSVRC competition. There is one synset one per line, eg. 72 | # n01440764 73 | # n01443537 74 | We also retrieve a synset_to_human_file, which contains a mapping from synsets 75 | to human-readable names for every synset in Imagenet. These are stored in a 76 | tsv format, as follows: 77 | # n02119247 black fox 78 | # n02119359 silver fox 79 | We assign each synset (in alphabetical order) an integer, starting from 1 80 | (since 0 is reserved for the background class). 81 | 82 | Code is based on 83 | https://github.com/tensorflow/models/blob/master/inception/inception/data/convert_reg_imgnet.py#L463 84 | """ 85 | 86 | # pylint: disable=g-line-too-long 87 | base_url = 'https://raw.githubusercontent.com/tensorflow/models/master/inception/inception/data/' 88 | synset_url = '{}/imagenet_lsvrc_2015_synsets.txt'.format(base_url) 89 | synset_to_human_url = '{}/imagenet_metadata.txt'.format(base_url) 90 | 91 | #filename, _ = urllib.request.urlretrieve(synset_url) 92 | filename = 'imagenet_lsvrc_2015_synsets.txt' 93 | synset_list = [s.strip() for s in open(filename).readlines()] 94 | num_synsets_in_ilsvrc = len(synset_list) 95 | assert num_synsets_in_ilsvrc == 1000 96 | 97 | #filename, _ = urllib.request.urlretrieve(synset_to_human_url) 98 | filename = 'imagenet_metadata.txt' 99 | synset_to_human_list = open(filename).readlines() 100 | num_synsets_in_all_imagenet = len(synset_to_human_list) 101 | assert num_synsets_in_all_imagenet == 21842 102 | 103 | synset_to_human = {} 104 | for s in synset_to_human_list: 105 | parts = s.strip().split('\t') 106 | assert len(parts) == 2 107 | synset = parts[0] 108 | human = parts[1] 109 | synset_to_human[synset] = human 110 | 111 | label_index = 1 112 | labels_to_names = {0: 'background'} 113 | for synset in synset_list: 114 | name = synset_to_human[synset] 115 | labels_to_names[label_index] = name 116 | label_index += 1 117 | 118 | return labels_to_names 119 | 120 | 121 | def get_split(split_name, dataset_dir, file_pattern=None, reader=None): 122 | """Gets a dataset tuple with instructions for reading ImageNet. 123 | 124 | Args: 125 | split_name: A train/test split name. 126 | dataset_dir: The base directory of the dataset sources. 127 | file_pattern: The file pattern to use when matching the dataset sources. 128 | It is assumed that the pattern contains a '%s' string so that the split 129 | name can be inserted. 130 | reader: The TensorFlow reader type. 131 | 132 | Returns: 133 | A `Dataset` namedtuple. 134 | 135 | Raises: 136 | ValueError: if `split_name` is not a valid train/test split. 
137 | """ 138 | if split_name not in _SPLITS_TO_SIZES: 139 | raise ValueError('split name %s was not recognized.' % split_name) 140 | 141 | if not file_pattern: 142 | file_pattern = _FILE_PATTERN 143 | file_pattern = os.path.join(dataset_dir, file_pattern % split_name) 144 | 145 | # Allowing None in the signature so that dataset_factory can use the default. 146 | if reader is None: 147 | reader = tf.TFRecordReader 148 | 149 | keys_to_features = { 150 | 'input_img/encoded': tf.FixedLenFeature( 151 | (), tf.string, default_value=''), 152 | 'gt_img/encoded': tf.FixedLenFeature( 153 | (), tf.string, default_value=''), 154 | 'image/format': tf.FixedLenFeature( 155 | (), tf.string, default_value='jpeg'), 156 | 'image/class/label': tf.FixedLenFeature( 157 | [], dtype=tf.int64, default_value=-1), 158 | 'image/class/text': tf.FixedLenFeature( 159 | [], dtype=tf.string, default_value=''), 160 | 'image/object/bbox/xmin': tf.VarLenFeature( 161 | dtype=tf.float32), 162 | 'image/object/bbox/ymin': tf.VarLenFeature( 163 | dtype=tf.float32), 164 | 'image/object/bbox/xmax': tf.VarLenFeature( 165 | dtype=tf.float32), 166 | 'image/object/bbox/ymax': tf.VarLenFeature( 167 | dtype=tf.float32), 168 | 'image/object/class/label': tf.VarLenFeature( 169 | dtype=tf.int64), 170 | } 171 | 172 | items_to_handlers = { 173 | 'input': slim.tfexample_decoder.Image('input_img/encoded', 'image/format'), 174 | 'ground_truth': slim.tfexample_decoder.Image('gt_img/encoded', 'image/format'), 175 | 'label': slim.tfexample_decoder.Tensor('image/class/label'), 176 | 'label_text': slim.tfexample_decoder.Tensor('image/class/text'), 177 | 'object/bbox': slim.tfexample_decoder.BoundingBox( 178 | ['ymin', 'xmin', 'ymax', 'xmax'], 'image/object/bbox/'), 179 | 'object/label': slim.tfexample_decoder.Tensor('image/object/class/label'), 180 | } 181 | 182 | decoder = slim.tfexample_decoder.TFExampleDecoder( 183 | keys_to_features, items_to_handlers) 184 | 185 | return slim.dataset.Dataset( 186 | data_sources=file_pattern, 187 | reader=reader, 188 | decoder=decoder, 189 | num_samples=_SPLITS_TO_SIZES[split_name], 190 | items_to_descriptions=_ITEMS_TO_DESCRIPTIONS) 191 | -------------------------------------------------------------------------------- /datasets/raw.py: -------------------------------------------------------------------------------- 1 | """ 2 | data for the ImageNet ILSVRC 2012 Dataset plus some bounding boxes. 3 | 4 | Some images have one or more bounding boxes associated with the label of the 5 | image. See details here: http://image-net.org/download-bboxes 6 | 7 | ImageNet is based upon WordNet 3.0. To uniquely identify a synset, we use 8 | "WordNet ID" (wnid), which is a concatenation of POS ( i.e. part of speech ) 9 | and SYNSET OFFSET of WordNet. For more information, please refer to the 10 | WordNet documentation[http://wordnet.princeton.edu/wordnet/documentation/]. 11 | 12 | "There are bounding boxes for over 3000 popular synsets available. 13 | For each synset, there are on average 150 images with bounding boxes." 14 | 15 | WARNING: Don't use for object detection, in this case all the bounding boxes 16 | of the image belong to just one class. 17 | """ 18 | from __future__ import absolute_import 19 | from __future__ import division 20 | from __future__ import print_function 21 | 22 | import os 23 | from six.moves import urllib 24 | import tensorflow as tf 25 | 26 | from datasets import dataset_utils 27 | 28 | slim = tf.contrib.slim 29 | 30 | # TODO(nsilberman): Add tfrecord file type once the script is updated. 
31 | _FILE_PATTERN = '%s-*' 32 | 33 | _SPLITS_TO_SIZES = { 34 | 'train': 1281167, 35 | 'validation': 1103, # low 844, medium 1103 36 | } 37 | 38 | _ITEMS_TO_DESCRIPTIONS = { 39 | 'image': 'A color image of varying height and width.', 40 | 'label': 'The label id of the image, integer between 0 and 999', 41 | 'label_text': 'The text of the label.', 42 | 'object/bbox': 'A list of bounding boxes.', 43 | 'object/label': 'A list of labels, one per each object.', 44 | } 45 | 46 | _NUM_CLASSES = 1001 47 | 48 | 49 | def create_readable_names_for_imagenet_labels(): 50 | """Create a dict mapping label id to human readable string. 51 | 52 | Returns: 53 | labels_to_names: dictionary where keys are integers from to 1000 54 | and values are human-readable names. 55 | 56 | We retrieve a synset file, which contains a list of valid synset labels used 57 | by ILSVRC competition. There is one synset one per line, eg. 58 | # n01440764 59 | # n01443537 60 | We also retrieve a synset_to_human_file, which contains a mapping from synsets 61 | to human-readable names for every synset in Imagenet. These are stored in a 62 | tsv format, as follows: 63 | # n02119247 black fox 64 | # n02119359 silver fox 65 | We assign each synset (in alphabetical order) an integer, starting from 1 66 | (since 0 is reserved for the background class). 67 | 68 | Code is based on 69 | https://github.com/tensorflow/models/blob/master/inception/inception/data/build_imagenet_data.py#L463 70 | """ 71 | 72 | # pylint: disable=g-line-too-long 73 | base_url = 'https://raw.githubusercontent.com/tensorflow/models/master/inception/inception/data/' 74 | synset_url = '{}/imagenet_lsvrc_2015_synsets.txt'.format(base_url) 75 | synset_to_human_url = '{}/imagenet_metadata.txt'.format(base_url) 76 | 77 | filename, _ = urllib.request.urlretrieve(synset_url) 78 | synset_list = [s.strip() for s in open(filename).readlines()] 79 | num_synsets_in_ilsvrc = len(synset_list) 80 | assert num_synsets_in_ilsvrc == 1000 81 | 82 | filename, _ = urllib.request.urlretrieve(synset_to_human_url) 83 | synset_to_human_list = open(filename).readlines() 84 | num_synsets_in_all_imagenet = len(synset_to_human_list) 85 | assert num_synsets_in_all_imagenet == 21842 86 | 87 | synset_to_human = {} 88 | for s in synset_to_human_list: 89 | parts = s.strip().split('\t') 90 | assert len(parts) == 2 91 | synset = parts[0] 92 | human = parts[1] 93 | synset_to_human[synset] = human 94 | 95 | label_index = 1 96 | labels_to_names = {0: 'background'} 97 | for synset in synset_list: 98 | name = synset_to_human[synset] 99 | labels_to_names[label_index] = name 100 | label_index += 1 101 | 102 | return labels_to_names 103 | 104 | 105 | def get_split(split_name, dataset_dir, file_pattern=None, reader=None): 106 | """Gets a dataset tuple with instructions for reading ImageNet. 107 | 108 | Args: 109 | split_name: A train/test split name. 110 | dataset_dir: The base directory of the dataset sources. 111 | file_pattern: The file pattern to use when matching the dataset sources. 112 | It is assumed that the pattern contains a '%s' string so that the split 113 | name can be inserted. 114 | reader: The TensorFlow reader type. 115 | 116 | Returns: 117 | A `Dataset` namedtuple. 118 | 119 | Raises: 120 | ValueError: if `split_name` is not a valid train/test split. 121 | """ 122 | if split_name not in _SPLITS_TO_SIZES: 123 | raise ValueError('split name %s was not recognized.' 
% split_name) 124 | 125 | if not file_pattern: 126 | file_pattern = _FILE_PATTERN 127 | file_pattern = os.path.join(dataset_dir, file_pattern % split_name) 128 | 129 | # Allowing None in the signature so that dataset_factory can use the default. 130 | if reader is None: 131 | reader = tf.TFRecordReader 132 | 133 | keys_to_features = { 134 | 'image/encoded': tf.FixedLenFeature( 135 | (), tf.string, default_value=''), 136 | 'image/format': tf.FixedLenFeature( 137 | (), tf.string, default_value='jpeg'), 138 | 'image/class/label': tf.FixedLenFeature( 139 | [], dtype=tf.int64, default_value=-1), 140 | 'image/class/text': tf.FixedLenFeature( 141 | [], dtype=tf.string, default_value=''), 142 | 'image/filename': tf.FixedLenFeature( 143 | [], dtype=tf.string, default_value=''), 144 | 'image/object/bbox/xmin': tf.VarLenFeature( 145 | dtype=tf.float32), 146 | 'image/object/bbox/ymin': tf.VarLenFeature( 147 | dtype=tf.float32), 148 | 'image/object/bbox/xmax': tf.VarLenFeature( 149 | dtype=tf.float32), 150 | 'image/object/bbox/ymax': tf.VarLenFeature( 151 | dtype=tf.float32), 152 | 'image/object/class/label': tf.VarLenFeature( 153 | dtype=tf.int64), 154 | } 155 | 156 | items_to_handlers = { 157 | 'image': slim.tfexample_decoder.Image('image/encoded', 'image/format'), 158 | 'label': slim.tfexample_decoder.Tensor('image/class/label'), 159 | 'filename': slim.tfexample_decoder.Tensor('image/filename'), 160 | 'label_text': slim.tfexample_decoder.Tensor('image/class/text'), 161 | 'object/bbox': slim.tfexample_decoder.BoundingBox( 162 | ['ymin', 'xmin', 'ymax', 'xmax'], 'image/object/bbox/'), 163 | 'object/label': slim.tfexample_decoder.Tensor('image/object/class/label'), 164 | } 165 | 166 | decoder = slim.tfexample_decoder.TFExampleDecoder( 167 | keys_to_features, items_to_handlers) 168 | 169 | labels_to_names = None 170 | if dataset_utils.has_labels(dataset_dir): 171 | labels_to_names = dataset_utils.read_label_file(dataset_dir) 172 | else: 173 | labels_to_names = create_readable_names_for_imagenet_labels() 174 | dataset_utils.write_label_file(labels_to_names, dataset_dir) 175 | 176 | return slim.dataset.Dataset( 177 | data_sources=file_pattern, 178 | reader=reader, 179 | decoder=decoder, 180 | num_samples=_SPLITS_TO_SIZES[split_name], 181 | items_to_descriptions=_ITEMS_TO_DESCRIPTIONS, 182 | num_classes=_NUM_CLASSES, 183 | labels_to_names=labels_to_names) 184 | 185 | -------------------------------------------------------------------------------- /datasets/raw_metadata.txt: -------------------------------------------------------------------------------- 1 | n04116512 rubber eraser, rubber, pencil eraser 2 | n03995372 power drill 3 | n03983396 pop bottle, soda bottle 4 | n03291819 envelope 5 | n03063599 coffee mug 6 | n03891251 park bench 7 | n07753592 banana 8 | n02870880 bookcase 9 | n02965783 car mirror 10 | n02823428 beer bottle 11 | n02974003 car wheel 12 | n04254777 sock, socks 13 | n03085013 computer keyboard, keypad 14 | n03793489 mouse, computer mouse 15 | n02783161 ballpoint, ballpoint pen, ballpen, Biro 16 | n04485082 tripod 17 | n02877765 bottlecap 18 | n03792782 mountain bike, all-terrain bike, off-roader 19 | n03782006 monitor 20 | n04131690 saltshaker, salt shaker 21 | n03761084 microwave, microwave oven 22 | n04557648 water bottle 23 | n03208938 disk brake, disc brake, disk brakes 24 | n04507155 umbrella 25 | n02786058 Band Aid 26 | n04153751 screw 27 | n04548362 wallet, billfold, notecase, pocketbook 28 | n04254120 soap dispenser 29 | n04356056 sunglasses, dark glasses, shades 30 | 
n04548280 wall clock 31 | n04447861 toilet seat 32 | n03958227 plastic bag 33 | n03717622 manhole cover 34 | n03481172 hammer 35 | n15075141 toilet tissue, toilet paper, bathroom tissue 36 | n04004767 printer 37 | n03924679 photocopier 38 | n03657121 lens cap, lens cover 39 | n04118776 rule, ruler 40 | n04009552 projector 41 | n03857828 oscilloscope, scope, cathode-ray oscilloscope, CRO 42 | n03492542 hard disc, hard disk, fixed disk 43 | n03388183 fountain pen 44 | n02840245 binder, ring-binder 45 | n02769748 backpack, back pack, knapsack, packsack, rucksack, haversack 46 | n03832673 notebook, notebook computer 47 | n03297495 espresso maker 48 | n02782093 balloon 49 | n03887697 paper towel 50 | n04069434 reflex camera 51 | n03180011 desktop computer 52 | n03179701 desk 53 | n02992529 cellular telephone, cellular phone, cellphone, cell, mobile phone 54 | n03637318 lampshade, lamp shade 55 | n03929660 pick, plectrum, plectron 56 | n03445777 golf ball 57 | n03666591 lighter, light, igniter, ignitor 58 | n04591713 wine bottle 59 | n02747177 trash can 60 | n04409515 tennis ball 61 | n03223299 doormat 62 | n04554684 washing machine 63 | n04557648 water bottle 64 | n04553703 washbasin 65 | 66 | -------------------------------------------------------------------------------- /datasets/synset_labels.txt: -------------------------------------------------------------------------------- 1 | n02823428:441 2 | n04507155:880 3 | n04485082:873 4 | n04557648:899 5 | n03782006:665 6 | n02769748:415 7 | n03291819:550 8 | n03793489:674 9 | n03085013:509 10 | n02840245:447 11 | n02974003:480 12 | n03887697:701 13 | n03761084:652 14 | n04153751:784 15 | n02870880:454 16 | n03924679:714 17 | n03983396:738 18 | n15075141:1000 19 | n03717622:641 20 | n03208938:536 21 | n04254120:805 22 | n03063599:505 23 | n03637318:620 24 | n03891251:704 25 | n03832673:682 26 | n04356056:838 27 | n04118776:770 28 | n04009552:746 29 | n02965783:476 30 | n04004767:743 31 | n03180011:528 32 | n07753592:955 33 | n04548280:893 34 | n04447861:862 35 | n04548362:894 36 | n03995372:741 37 | n03857828:689 38 | n02783161:419 39 | n03388183:564 40 | n03297495:551 41 | n03085013:509 42 | n03657121:623 43 | n07753592:955 44 | n04548280:893 45 | n04254777:807 46 | n02783161:419 47 | n03887697:701 48 | n02823428:441 49 | n04447861:862 50 | n03782006:665 51 | n03063599:505 52 | n15075141:1000 53 | n03891251:704 54 | n04153751:784 55 | n03857828:689 56 | n03291819:550 57 | n02786058:420 58 | n04548362:894 59 | n04118776:770 60 | n02974003:480 61 | n03761084:652 62 | n04485082:873 63 | n04254120:805 64 | n03924679:714 65 | n03637318:620 66 | n02769748:415 67 | n02870880:454 68 | n03793489:674 69 | n03995372:741 70 | n03717622:641 71 | n02965783:476 72 | n03887697:701 73 | n03717622:641 74 | n03063599:505 75 | n04507155:880 76 | n04447861:862 77 | n07753592:955 78 | n04153751:784 79 | n15075141:1000 80 | n03793489:674 81 | n03782006:665 82 | n03291819:550 83 | n02870880:454 84 | n04254120:805 85 | n04118776:770 86 | n03657121:623 87 | n03208938:536 88 | n03983396:738 89 | n02783161:419 90 | n03085013:509 91 | n04548280:893 92 | n03297495:551 93 | n04485082:873 94 | n04004767:743 95 | n03857828:689 96 | n02974003:480 97 | n04548362:894 98 | n03761084:652 99 | n02769748:415 100 | n03891251:704 101 | n04254777:807 102 | n03924679:714 103 | n03995372:741 104 | n02823428:441 105 | n03832673:682 106 | n02786058:420 107 | n04557648:899 108 | n02965783:476 109 | n02840245:447 110 | n04009552:746 111 | n04356056:838 112 | n03637318:620 113 | n03180011:528 
114 | n03388183:564 115 | n03929660:715 116 | n04591713:908 117 | n03445777:575 118 | n03666591:627 119 | n02747177:413 120 | n04409515:853 121 | n03223299:540 122 | n04554684:898 123 | n04557648:899 124 | n04553703:897 125 | n04116512:768 126 | -------------------------------------------------------------------------------- /datasets/training_synsets.txt: -------------------------------------------------------------------------------- 1 | n01440764 2 | n01443537 3 | n01484850 4 | n01491361 5 | n01494475 6 | n01496331 7 | n01498041 8 | n01514668 9 | n01514859 10 | n01518878 11 | n01530575 12 | n01531178 13 | n01532829 14 | n01534433 15 | n01537544 16 | n01558993 17 | n01560419 18 | n01580077 19 | n01582220 20 | n01592084 21 | n01601694 22 | n01608432 23 | n01614925 24 | n01616318 25 | n01622779 26 | n01629819 27 | n01630670 28 | n01631663 29 | n01632458 30 | n01632777 31 | n01641577 32 | n01644373 33 | n01644900 34 | n01664065 35 | n01665541 36 | n01667114 37 | n01667778 38 | n01669191 39 | n01675722 40 | n01677366 41 | n01682714 42 | n01685808 43 | n01687978 44 | n01688243 45 | n01689811 46 | n01692333 47 | n01693334 48 | n01694178 49 | n01695060 50 | n01697457 51 | n01698640 52 | n01704323 53 | n01728572 54 | n01728920 55 | n01729322 56 | n01729977 57 | n01734418 58 | n01735189 59 | n01737021 60 | n01739381 61 | n01740131 62 | n01742172 63 | n01744401 64 | n01748264 65 | n01749939 66 | n01751748 67 | n01753488 68 | n01755581 69 | n01756291 70 | n01768244 71 | n01770081 72 | n01770393 73 | n01773157 74 | n01773549 75 | n01773797 76 | n01774384 77 | n01774750 78 | n01775062 79 | n01776313 80 | n01784675 81 | n01795545 82 | n01796340 83 | n01797886 84 | n01798484 85 | n01806143 86 | n01806567 87 | n01807496 88 | n01817953 89 | n01818515 90 | n01819313 91 | n01820546 92 | n01824575 93 | n01828970 94 | n01829413 95 | n01833805 96 | n01843065 97 | n01843383 98 | n01847000 99 | n01855032 100 | n01855672 101 | n01860187 102 | n01871265 103 | n01872772 104 | n01873310 105 | n01877812 106 | n01882714 107 | n01883070 108 | n01910747 109 | n01914609 110 | n01917289 111 | n01924916 112 | n01930112 113 | n01943899 114 | n01944390 115 | n07922607 116 | n01950731 117 | n01955084 118 | n01968897 119 | n01978287 120 | n01978455 121 | n01980166 122 | n01981276 123 | n01983481 124 | n01984695 125 | n01985128 126 | n01986214 127 | n01990800 128 | n02002556 129 | n02002724 130 | n02006656 131 | n02007558 132 | n02009229 133 | n02009912 134 | n02011460 135 | n02012849 136 | n02013706 137 | n02017213 138 | n02018207 139 | n02018795 140 | n02025239 141 | n02027492 142 | n02028035 143 | n02033041 144 | n02037110 145 | n02051845 146 | n02056570 147 | n02058221 148 | n02066245 149 | n02071294 150 | n02074367 151 | n02077923 152 | n02085620 153 | n02085782 154 | n02085936 155 | n02086079 156 | n02086240 157 | n02086646 158 | n02086910 159 | n02087046 160 | n02087394 161 | n02088094 162 | n02088238 163 | n02088364 164 | n02088466 165 | n02088632 166 | n02089078 167 | n02089867 168 | n02089973 169 | n02090379 170 | n02090622 171 | n02090721 172 | n02091032 173 | n02091134 174 | n02091244 175 | n02091467 176 | n02091635 177 | n02091831 178 | n02092002 179 | n02092339 180 | n02093256 181 | n02093428 182 | n02093647 183 | n02093754 184 | n02093859 185 | n02093991 186 | n02094114 187 | n02094258 188 | n02094433 189 | n02095314 190 | n02095570 191 | n02095889 192 | n02096051 193 | n02096177 194 | n02096294 195 | n02096437 196 | n02096585 197 | n02097047 198 | n02097130 199 | n02097209 200 | n02097298 201 | n02097474 202 | 
n02097658 203 | n02098105 204 | n02098286 205 | n02098413 206 | n02099267 207 | n02099429 208 | n02099601 209 | n02099712 210 | n02099849 211 | n02100236 212 | n02100583 213 | n02100735 214 | n02100877 215 | n02101006 216 | n02101388 217 | n02101556 218 | n02102040 219 | n02102177 220 | n02102318 221 | n02102480 222 | n02102973 223 | n02104029 224 | n02104365 225 | n02105056 226 | n02105162 227 | n02105251 228 | n02105412 229 | n02105505 230 | n02105641 231 | n02105855 232 | n02106030 233 | n02106166 234 | n02106382 235 | n02106550 236 | n02106662 237 | n02107142 238 | n02107312 239 | n02107574 240 | n02107683 241 | n02107908 242 | n02108000 243 | n02108089 244 | n02108422 245 | n02108551 246 | n02108915 247 | n02109047 248 | n02109525 249 | n02109961 250 | n02110063 251 | n02110185 252 | n02110341 253 | n02110627 254 | n02110806 255 | n02110958 256 | n02111129 257 | n02111277 258 | n02111500 259 | n02111889 260 | n02112018 261 | n02112137 262 | n02112350 263 | n02112706 264 | n02113023 265 | n02113186 266 | n02113624 267 | n02113712 268 | n02113799 269 | n02113978 270 | n02114367 271 | n02114548 272 | n02114712 273 | n02114855 274 | n02115641 275 | n02115913 276 | n02116738 277 | n02117135 278 | n02119022 279 | n02119789 280 | n02120079 281 | n02120505 282 | n02123045 283 | n02123159 284 | n02123394 285 | n02123597 286 | n02124075 287 | n02125311 288 | n02127052 289 | n02128385 290 | n02128757 291 | n02128925 292 | n02129165 293 | n02129604 294 | n02130308 295 | n02132136 296 | n02133161 297 | n02134084 298 | n02134418 299 | n02137549 300 | n02138441 301 | n02165105 302 | n02165456 303 | n02167151 304 | n02168699 305 | n02169497 306 | n02172182 307 | n02174001 308 | n02177972 309 | n03373237 310 | n02206856 311 | n02219486 312 | n02226429 313 | n02229544 314 | n02231487 315 | n02233338 316 | n02236044 317 | n02256656 318 | n02259212 319 | n02264363 320 | n02268443 321 | n02268853 322 | n02276258 323 | n02277742 324 | n02279972 325 | n02280649 326 | n02281406 327 | n02281787 328 | n02317335 329 | n02319095 330 | n02321529 331 | n02325366 332 | n02326432 333 | n02328150 334 | n02342885 335 | n02346627 336 | n02356798 337 | n02361337 338 | n02818254 339 | n02364673 340 | n02389026 341 | n02391049 342 | n02395406 343 | n02396427 344 | n02397096 345 | n02398521 346 | n02403003 347 | n02408429 348 | n02410509 349 | n02412080 350 | n02415577 351 | n02417914 352 | n02422106 353 | n02422699 354 | n02423022 355 | n02437312 356 | n02437616 357 | n02441942 358 | n02442845 359 | n02443114 360 | n02443484 361 | n02444819 362 | n02445715 363 | n02447366 364 | n02454379 365 | n02457408 366 | n02480495 367 | n02480855 368 | n02481823 369 | n02483362 370 | n02483708 371 | n02484975 372 | n02486261 373 | n02486410 374 | n02487347 375 | n02488291 376 | n02488702 377 | n02489166 378 | n02490219 379 | n02492035 380 | n02492660 381 | n02493509 382 | n02493793 383 | n02494079 384 | n02497673 385 | n02500267 386 | n02504013 387 | n02504458 388 | n02509815 389 | n02510455 390 | n02514041 391 | n02526121 392 | n02536864 393 | n02606052 394 | n02607072 395 | n02640242 396 | n02641379 397 | n02643566 398 | n02655020 399 | n02666196 400 | n02667093 401 | n02669723 402 | n02672831 403 | n02676566 404 | n02687172 405 | n02690373 406 | n02692877 407 | n02699494 408 | n02701002 409 | n02704792 410 | n02708093 411 | n02727426 412 | n08496334 413 | n02747177 414 | n02749479 415 | n02769748 416 | n02776631 417 | n02777292 418 | n02782093 419 | n02783161 420 | n02786058 421 | n02787622 422 | n02788148 423 | n02790996 424 | 
n02791124 425 | n02791270 426 | n02793495 427 | n02794156 428 | n02795169 429 | n02797295 430 | n02799071 431 | n02802426 432 | n02804515 433 | n02804610 434 | n02807133 435 | n02808304 436 | n02808440 437 | n02814533 438 | n02814860 439 | n02815834 440 | n02817516 441 | n02823428 442 | n02823750 443 | n02825657 444 | n02834397 445 | n02835271 446 | n02837789 447 | n02840245 448 | n02841315 449 | n02843684 450 | n02859443 451 | n02860847 452 | n02865351 453 | n02869837 454 | n02870880 455 | n02871525 456 | n02877765 457 | n02879718 458 | n02883205 459 | n02892201 460 | n02892767 461 | n02894605 462 | n02895154 463 | n02906734 464 | n02909870 465 | n02910353 466 | n02916936 467 | n02917067 468 | n02927161 469 | n02930766 470 | n02939185 471 | n02948072 472 | n02950826 473 | n02951358 474 | n02951585 475 | n02963159 476 | n02965783 477 | n02966193 478 | n02966687 479 | n02971356 480 | n02974003 481 | n02977058 482 | n02978881 483 | n02979186 484 | n02980441 485 | n02981792 486 | n02988304 487 | n02992211 488 | n02992529 489 | n02999936 490 | n03000134 491 | n03000247 492 | n03000684 493 | n03014705 494 | n03016953 495 | n03017168 496 | n03018349 497 | n03026506 498 | n03028079 499 | n03032252 500 | n03041632 501 | n03042490 502 | n03045698 503 | n03047690 504 | n03062245 505 | n03063599 506 | n03063689 507 | n03065424 508 | n03075370 509 | n03085013 510 | n03089624 511 | n03095699 512 | n03100240 513 | n03109150 514 | n03110669 515 | n03124043 516 | n03124170 517 | n03125729 518 | n03126707 519 | n03127747 520 | n03127925 521 | n03131574 522 | n03133878 523 | n03134739 524 | n03141823 525 | n03146219 526 | n03160309 527 | n03179701 528 | n03180011 529 | n03187595 530 | n03188531 531 | n03196217 532 | n03197337 533 | n03201208 534 | n03207743 535 | n03207941 536 | n03208938 537 | n03216828 538 | n03218198 539 | n03220513 540 | n03223299 541 | n03240683 542 | n03249569 543 | n03250847 544 | n03255030 545 | n03259401 546 | n03271574 547 | n03272010 548 | n03272562 549 | n03290653 550 | n13869788 551 | n03297495 552 | n03314780 553 | n03325584 554 | n03337140 555 | n03344393 556 | n03345487 557 | n03347037 558 | n03355925 559 | n03372029 560 | n03376595 561 | n03379051 562 | n03384352 563 | n03388043 564 | n03388183 565 | n03388549 566 | n03393912 567 | n03394916 568 | n03400231 569 | n03404251 570 | n03417042 571 | n03424325 572 | n03425413 573 | n03443371 574 | n03444034 575 | n03445777 576 | n03445924 577 | n03447447 578 | n03447721 579 | n03450230 580 | n03452741 581 | n03457902 582 | n03459775 583 | n03461385 584 | n03467068 585 | n03476684 586 | n03476991 587 | n03478589 588 | n03482001 589 | n03482405 590 | n03483316 591 | n03485407 592 | n03485794 593 | n03492542 594 | n03494278 595 | n03495570 596 | n03496892 597 | n03498962 598 | n03527565 599 | n03529860 600 | n09218315 601 | n03532672 602 | n03534580 603 | n03535780 604 | n03538406 605 | n03544143 606 | n03584254 607 | n03584829 608 | n03590841 609 | n03594734 610 | n03594945 611 | n03595614 612 | n03598930 613 | n03599486 614 | n03602883 615 | n03617480 616 | n03623198 617 | n03627232 618 | n03630383 619 | n03633091 620 | n03637318 621 | n03642806 622 | n03649909 623 | n03657121 624 | n03658185 625 | n07977870 626 | n03662601 627 | n03666591 628 | n03670208 629 | n03673027 630 | n03676483 631 | n03680355 632 | n03690938 633 | n03691459 634 | n03692522 635 | n03697007 636 | n03706229 637 | n03709823 638 | n03710193 639 | n03710637 640 | n03710721 641 | n03717622 642 | n03720891 643 | n03721384 644 | n03725035 645 | n03729826 646 | 
n03733131 647 | n03733281 648 | n03733805 649 | n03742115 650 | n03743016 651 | n03759954 652 | n03761084 653 | n03763968 654 | n03764736 655 | n03769881 656 | n03770439 657 | n03770679 658 | n03773504 659 | n03775071 660 | n03775546 661 | n03776460 662 | n03777568 663 | n03777754 664 | n03781244 665 | n03782006 666 | n03785016 667 | n03786901 668 | n03787032 669 | n03788195 670 | n03788365 671 | n03791053 672 | n03792782 673 | n03792972 674 | n03793489 675 | n03794056 676 | n03796401 677 | n03803284 678 | n03804744 679 | n03814639 680 | n03814906 681 | n03825788 682 | n03832673 683 | n03837869 684 | n03838899 685 | n03840681 686 | n03841143 687 | n03843555 688 | n03854065 689 | n03857828 690 | n03866082 691 | n03868242 692 | n03868863 693 | n03871628 694 | n03873416 695 | n03874293 696 | n03874599 697 | n03876231 698 | n03877472 699 | n03878211 700 | n03884397 701 | n03887697 702 | n03888257 703 | n03888605 704 | n03891251 705 | n03891332 706 | n03895866 707 | n03899768 708 | n03902125 709 | n03903868 710 | n03908618 711 | n03908714 712 | n03916031 713 | n03920288 714 | n03924679 715 | n03929660 716 | n03929855 717 | n03930313 718 | n03930630 719 | n03934042 720 | n03935335 721 | n03937543 722 | n03938244 723 | n03942813 724 | n03944341 725 | n03947888 726 | n03950228 727 | n03954731 728 | n03956157 729 | n03958227 730 | n03961711 731 | n03967562 732 | n03970156 733 | n03976467 734 | n03977158 735 | n03977966 736 | n03980874 737 | n03982430 738 | n03983396 739 | n03991062 740 | n03992509 741 | n03995372 742 | n03998194 743 | n04004767 744 | n04005630 745 | n04008634 746 | n04009801 747 | n04019541 748 | n04023962 749 | n04026417 750 | n04033901 751 | n04033995 752 | n04037443 753 | n04039381 754 | n09403211 755 | n04041544 756 | n04044716 757 | n04049303 758 | n04065272 759 | n04067658 760 | n04069434 761 | n04070727 762 | n04074963 763 | n04081281 764 | n04086273 765 | n04090263 766 | n04099969 767 | n04111531 768 | n04116512 769 | n04118538 770 | n04118776 771 | n04120489 772 | n04125116 773 | n04127249 774 | n04131690 775 | n04133789 776 | n04136333 777 | n04141076 778 | n04141327 779 | n04141975 780 | n04146614 781 | n04147291 782 | n04149813 783 | n04152593 784 | n04154340 785 | n07917272 786 | n04162706 787 | n04179913 788 | n04192698 789 | n04200800 790 | n04201297 791 | n04204238 792 | n04204347 793 | n04208427 794 | n04209133 795 | n04209239 796 | n04228054 797 | n04229816 798 | n04235860 799 | n04238763 800 | n04239074 801 | n04243546 802 | n04251144 803 | n04252077 804 | n04252225 805 | n04254120 806 | n04254680 807 | n04254777 808 | n04258138 809 | n04259630 810 | n04263257 811 | n04264628 812 | n04265275 813 | n04266014 814 | n04270147 815 | n04273569 816 | n04275548 817 | n04277669 818 | n04285008 819 | n04286575 820 | n04296562 821 | n04310018 822 | n04311004 823 | n04311174 824 | n04317175 825 | n04325704 826 | n04326547 827 | n04328186 828 | n04330267 829 | n04332243 830 | n04335435 831 | n04337157 832 | n04344873 833 | n04346328 834 | n04347754 835 | n04350905 836 | n04355338 837 | n04355933 838 | n04356056 839 | n04357314 840 | n04366367 841 | n04367480 842 | n04370456 843 | n04371430 844 | n04371774 845 | n04372370 846 | n04376876 847 | n04380533 848 | n04389033 849 | n04392985 850 | n04398044 851 | n04399382 852 | n04404412 853 | n04409515 854 | n04417672 855 | n04418357 856 | n04423845 857 | n04428191 858 | n04429376 859 | n04435653 860 | n04442312 861 | n04443257 862 | n04447861 863 | n04456115 864 | n04458633 865 | n04461696 866 | n04462240 867 | n04465666 868 | 
n04467665 869 | n04476259 870 | n04479046 871 | n04482393 872 | n04483307 873 | n04485082 874 | n04486054 875 | n04487081 876 | n04487394 877 | n04493381 878 | n04501370 879 | n04505470 880 | n04507155 881 | n04509417 882 | n04515003 883 | n04517823 884 | n04522168 885 | n04523525 886 | n04525038 887 | n04525305 888 | n04532106 889 | n04532670 890 | n04536866 891 | n04540053 892 | n04542943 893 | n04548280 894 | n04548362 895 | n04550184 896 | n04552348 897 | n04553703 898 | n04554684 899 | n04557648 900 | n04560804 901 | n04562935 902 | n04579145 903 | n04579667 904 | n04584207 905 | n04589890 906 | n04590129 907 | n04591157 908 | n04591713 909 | n04592741 910 | n04596742 911 | n04597913 912 | n04599235 913 | n04604644 914 | n04606251 915 | n04612504 916 | n04613696 917 | n06359193 918 | n06596364 919 | n06785654 920 | n06794110 921 | n06874185 922 | n07248320 923 | n07565083 924 | n07579787 925 | n07583066 926 | n07584110 927 | n07590611 928 | n07613480 929 | n07614500 930 | n07615774 931 | n07684084 932 | n07693725 933 | n07695742 934 | n07697313 935 | n07697537 936 | n07711569 937 | n07714571 938 | n07714990 939 | n07715103 940 | n12159804 941 | n12160303 942 | n12160857 943 | n07717556 944 | n07718472 945 | n07718747 946 | n07720875 947 | n07730033 948 | n13001041 949 | n07742313 950 | n07745940 951 | n07747607 952 | n07749582 953 | n07753113 954 | n07753275 955 | n07753592 956 | n07754684 957 | n07760859 958 | n07768694 959 | n07802026 960 | n07831146 961 | n07836838 962 | n07860988 963 | n07871810 964 | n07873807 965 | n07875152 966 | n07880968 967 | n07892512 968 | n07920052 969 | n07930864 970 | n07932039 971 | n09193705 972 | n09229709 973 | n09246464 974 | n09256479 975 | n09288635 976 | n09332890 977 | n09399592 978 | n09421951 979 | n09428293 980 | n09468604 981 | n09472597 982 | n09835506 983 | n10148035 984 | n10565667 985 | n11879895 986 | n11939491 987 | n12057211 988 | n12144580 989 | n12267677 990 | n12620546 991 | n12768682 992 | n12985857 993 | n12998815 994 | n13037406 995 | n13040303 996 | n13044778 997 | n13052670 998 | n13054560 999 | n13133613 1000 | n15075141 1001 | -------------------------------------------------------------------------------- /deployment/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | name: dirtypix 2 | channels: 3 | - defaults 4 | dependencies: 5 | - _libgcc_mutex=0.1=main 6 | - _tflow_select=2.1.0=gpu 7 | - absl-py=0.12.0=py36h06a4308_0 8 | - astor=0.8.1=py36h06a4308_0 9 | - blas=1.0=mkl 10 | - blosc=1.21.0=h8c45485_0 11 | - brotli=1.0.9=he6710b0_2 12 | - bzip2=1.0.8=h7b6447c_0 13 | - c-ares=1.17.1=h27cfd23_0 14 | - ca-certificates=2021.4.13=h06a4308_1 15 | - certifi=2020.12.5=py36h06a4308_0 16 | - charls=2.1.0=he6710b0_2 17 | - cloudpickle=1.6.0=py_0 18 | - coverage=5.5=py36h27cfd23_2 19 | - cudatoolkit=9.2=0 20 | - cudnn=7.6.5=cuda9.2_0 21 | - cupti=9.2.148=0 22 | - cycler=0.10.0=py36_0 23 | - cython=0.29.23=py36h2531618_0 24 | - cytoolz=0.11.0=py36h7b6447c_0 25 | - dask-core=2021.3.0=pyhd3eb1b0_0 26 | - decorator=5.0.6=pyhd3eb1b0_0 27 | - freetype=2.10.4=h5ab3b9f_0 28 | - gast=0.4.0=py_0 29 | - giflib=5.1.4=h14c3975_1 30 | - grpcio=1.36.1=py36h2157cd5_1 31 | - h5py=2.10.0=py36hd6299e0_1 32 | - hdf5=1.10.6=hb1b8bf9_0 33 | - imagecodecs=2020.5.30=py36hfa7d478_2 34 | - imageio=2.9.0=pyhd3eb1b0_0 35 | - 
importlib-metadata=3.10.0=py36h06a4308_0 36 | - intel-openmp=2021.2.0=h06a4308_610 37 | - jpeg=9b=h024ee3a_2 38 | - jxrlib=1.1=h7b6447c_2 39 | - keras-applications=1.0.8=py_1 40 | - keras-preprocessing=1.1.2=pyhd3eb1b0_0 41 | - kiwisolver=1.3.1=py36h2531618_0 42 | - lcms2=2.12=h3be6417_0 43 | - ld_impl_linux-64=2.33.1=h53a641e_7 44 | - libaec=1.0.4=he6710b0_1 45 | - libffi=3.3=he6710b0_2 46 | - libgcc-ng=9.1.0=hdf63c60_0 47 | - libgfortran-ng=7.3.0=hdf63c60_0 48 | - libpng=1.6.37=hbc83047_0 49 | - libprotobuf=3.14.0=h8c45485_0 50 | - libstdcxx-ng=9.1.0=hdf63c60_0 51 | - libtiff=4.1.0=h2733197_1 52 | - libwebp=1.0.1=h8e7db2f_0 53 | - libzopfli=1.0.3=he6710b0_0 54 | - lz4-c=1.9.3=h2531618_0 55 | - markdown=3.3.4=py36h06a4308_0 56 | - matplotlib-base=3.3.4=py36h62a2d02_0 57 | - mkl=2020.2=256 58 | - mkl-service=2.3.0=py36he8ac12f_0 59 | - mkl_fft=1.3.0=py36h54f3939_0 60 | - mkl_random=1.1.1=py36h0573a6f_0 61 | - ncurses=6.2=he6710b0_1 62 | - networkx=2.5=py_0 63 | - numpy=1.19.2=py36h54aff64_0 64 | - numpy-base=1.19.2=py36hfa32c7d_0 65 | - olefile=0.46=py_0 66 | - openjpeg=2.3.0=h05c96fa_1 67 | - openssl=1.1.1k=h27cfd23_0 68 | - pillow=8.2.0=py36he98fc37_0 69 | - pip=21.0.1=py36h06a4308_0 70 | - protobuf=3.14.0=py36h2531618_1 71 | - pyparsing=2.4.7=pyhd3eb1b0_0 72 | - python=3.6.13=hdb3f193_0 73 | - python-dateutil=2.8.1=pyhd3eb1b0_0 74 | - pywavelets=1.1.1=py36h7b6447c_2 75 | - pyyaml=5.4.1=py36h27cfd23_1 76 | - readline=8.1=h27cfd23_0 77 | - scikit-image=0.17.2=py36hdf5156a_0 78 | - scipy=1.5.2=py36h0b6359f_0 79 | - setuptools=52.0.0=py36h06a4308_0 80 | - six=1.15.0=pyhd3eb1b0_0 81 | - snappy=1.1.8=he6710b0_0 82 | - sqlite=3.35.4=hdfb4753_0 83 | - tensorboard=1.12.2=py36he6710b0_0 84 | - tensorflow=1.12.0=gpu_py36he74679b_0 85 | - tensorflow-base=1.12.0=gpu_py36had579c0_0 86 | - tensorflow-gpu=1.12.0=h0d30ee6_0 87 | - termcolor=1.1.0=py36h06a4308_1 88 | - tifffile=2021.3.31=pyhd3eb1b0_1 89 | - tk=8.6.10=hbc83047_0 90 | - toolz=0.11.1=pyhd3eb1b0_0 91 | - tornado=6.1=py36h27cfd23_0 92 | - typing_extensions=3.7.4.3=pyha847dfd_0 93 | - werkzeug=1.0.1=pyhd3eb1b0_0 94 | - wheel=0.36.2=pyhd3eb1b0_0 95 | - xz=5.2.5=h7b6447c_0 96 | - yaml=0.2.5=h7b6447c_0 97 | - zipp=3.4.1=pyhd3eb1b0_0 98 | - zlib=1.2.11=h7b6447c_3 99 | - zstd=1.4.5=h9ceee32_0 100 | prefix: /home/frank.julca-aguilar/anaconda3/envs/dirtypix 101 | -------------------------------------------------------------------------------- /loss_functions/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/princeton-computational-imaging/DirtyPixels/6c82b124c9e32bbf5fa7d6adf8db8103132e4e5e/loss_functions/__init__.py -------------------------------------------------------------------------------- /loss_functions/loss_factory.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | # ============================================================================== 15 | """Contains a factory for building various models.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | import tensorflow as tf 22 | 23 | from preprocessing import cifarnet_preprocessing 24 | from preprocessing import inception_preprocessing 25 | from preprocessing import lenet_preprocessing 26 | from preprocessing import vgg_preprocessing 27 | 28 | slim = tf.contrib.slim 29 | 30 | 31 | 32 | def get_loss(name): 33 | """Returns loss_fn(outputs, ground_truths, **kwargs), where "outputs" are the model outputs. 34 | 35 | Args: 36 | name: The name of the loss function. 37 | 38 | Returns: 39 | loss_fn: A function that computes the loss between the inputs and the ground_truths 40 | 41 | Raises: 42 | ValueError: If Preprocessing `name` is not recognized. 43 | """ 44 | loss_fn_map = { 45 | 'mean_squared_error':slim.losses.mean_squared_error, 46 | 'absolute_difference':slim.losses.absolute_difference 47 | } 48 | 49 | if name not in loss_fn_map: 50 | raise ValueError('Loss function name [%s] was not recognized' % name) 51 | 52 | def loss_fn(outputs, ground_truths, **kwargs): 53 | return loss_fn_map[name]( 54 | outputs, ground_truths, **kwargs) 55 | 56 | return loss_fn 57 | -------------------------------------------------------------------------------- /nets/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /nets/inception.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | # ============================================================================== 15 | """Brings all inception models under one namespace.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | # pylint: disable=unused-import 22 | from nets.inception_resnet_v2 import inception_resnet_v2 23 | from nets.inception_resnet_v2 import inception_resnet_v2_arg_scope 24 | from nets.inception_v1 import inception_v1 25 | from nets.inception_v1 import inception_v1_arg_scope 26 | from nets.inception_v1 import inception_v1_base 27 | from nets.inception_v2 import inception_v2 28 | from nets.inception_v2 import inception_v2_arg_scope 29 | from nets.inception_v2 import inception_v2_base 30 | from nets.inception_v3 import inception_v3 31 | from nets.inception_v3 import inception_v3_arg_scope 32 | from nets.inception_v3 import inception_v3_base 33 | from nets.inception_v4 import inception_v4 34 | from nets.inception_v4 import inception_v4_arg_scope 35 | from nets.inception_v4 import inception_v4_base 36 | # pylint: enable=unused-import 37 | -------------------------------------------------------------------------------- /nets/inception_utils.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Contains common code shared by all inception models. 16 | 17 | Usage of arg scope: 18 | with slim.arg_scope(inception_arg_scope()): 19 | logits, end_points = inception.inception_v3(images, num_classes, 20 | is_training=is_training) 21 | 22 | """ 23 | from __future__ import absolute_import 24 | from __future__ import division 25 | from __future__ import print_function 26 | 27 | import tensorflow as tf 28 | 29 | slim = tf.contrib.slim 30 | 31 | 32 | def inception_arg_scope(weight_decay=0.00004, 33 | use_batch_norm=True, 34 | batch_norm_decay=0.9997, 35 | batch_norm_epsilon=0.001): 36 | """Defines the default arg scope for inception models. 37 | 38 | Args: 39 | weight_decay: The weight decay to use for regularizing the model. 40 | use_batch_norm: "If `True`, batch_norm is applied after each convolution. 41 | batch_norm_decay: Decay for batch norm moving average. 42 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero 43 | in batch norm. 44 | 45 | Returns: 46 | An `arg_scope` to use for the inception models. 47 | """ 48 | print("weight decay = ", weight_decay) 49 | batch_norm_params = { 50 | # Decay for the moving averages. 51 | 'decay': batch_norm_decay, 52 | # epsilon to prevent 0s in variance. 53 | 'epsilon': batch_norm_epsilon, 54 | # collection containing update_ops. 
55 | 'updates_collections': tf.GraphKeys.UPDATE_OPS, 56 | } 57 | if use_batch_norm: 58 | normalizer_fn = slim.batch_norm 59 | normalizer_params = batch_norm_params 60 | else: 61 | normalizer_fn = None 62 | normalizer_params = {} 63 | # Set weight_decay for weights in Conv and FC layers. 64 | with slim.arg_scope([slim.conv2d, slim.fully_connected], 65 | weights_regularizer=slim.l2_regularizer(weight_decay)): 66 | with slim.arg_scope( 67 | [slim.conv2d], 68 | weights_initializer=slim.variance_scaling_initializer(), 69 | activation_fn=tf.nn.relu, 70 | normalizer_fn=normalizer_fn, 71 | normalizer_params=normalizer_params) as sc: 72 | return sc 73 | -------------------------------------------------------------------------------- /nets/isp.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Contains the definition of the Inception V4 architecture. 16 | 17 | As described in http://arxiv.org/abs/1602.07261. 18 | 19 | Inception-v4, Inception-ResNet and the Impact of Residual Connections 20 | on Learning 21 | Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi 22 | """ 23 | from __future__ import absolute_import 24 | from __future__ import division 25 | from __future__ import print_function 26 | 27 | import tensorflow as tf 28 | 29 | from nets import inception_utils 30 | 31 | slim = tf.contrib.slim 32 | 33 | def isp_arg_scope(weight_decay=0.00004, 34 | use_batch_norm=True, 35 | batch_norm_decay=0.95, 36 | batch_norm_epsilon=0.0001): 37 | """Defines the default arg scope for inception models. 38 | 39 | Args: 40 | weight_decay: The weight decay to use for regularizing the model. 41 | use_batch_norm: "If `True`, batch_norm is applied after each convolution. 42 | batch_norm_decay: Decay for batch norm moving average. 43 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero 44 | in batch norm. 45 | 46 | Returns: 47 | An `arg_scope` to use for the inception models. 48 | """ 49 | print("weight decay = ", weight_decay) 50 | print("batch norm decay = ", batch_norm_decay) 51 | batch_norm_params = { 52 | # Decay for the moving averages. 53 | 'decay': batch_norm_decay, 54 | # epsilon to prevent 0s in variance. 55 | 'epsilon': batch_norm_epsilon, 56 | # collection containing update_ops. 57 | 'updates_collections': tf.GraphKeys.UPDATE_OPS, 58 | 'center': True, 59 | 'scale': False, 60 | } 61 | if use_batch_norm: 62 | normalizer_fn = slim.batch_norm 63 | normalizer_params = batch_norm_params 64 | else: 65 | normalizer_fn = None 66 | normalizer_params = {} 67 | # Set weight_decay for weights in Conv and FC layers. 
68 | with slim.arg_scope([slim.conv2d, slim.fully_connected], 69 | weights_regularizer=slim.l2_regularizer(weight_decay)): 70 | with slim.arg_scope( 71 | [slim.conv2d], 72 | weights_initializer=slim.variance_scaling_initializer(), 73 | activation_fn=tf.nn.relu, 74 | normalizer_fn=normalizer_fn, 75 | normalizer_params=normalizer_params) as sc: 76 | return sc 77 | 78 | def anscombe(data, sigma, alpha, scale=255.0, is_real_data=False): 79 | """Transform N(mu,sigma^2) + \alpha Pois(y) into N(0,scale^2) noise.""" 80 | if is_real_data: 81 | z = data/alpha[:,None,None,:] 82 | sigma_hat = sigma/alpha 83 | sqrt_term = z + 3./8. + tf.square(sigma_hat)[:,None,None,:] 84 | else: 85 | z = data/alpha[:,None,None,None] 86 | sigma_hat = sigma/alpha 87 | sqrt_term = z + 3./8. + tf.square(sigma_hat)[:,None,None,None] 88 | 89 | sqrt_term = tf.maximum(sqrt_term, 0.0) 90 | 91 | return 2*tf.sqrt(sqrt_term) 92 | 93 | 94 | def inv_anscombe(data, sigma, alpha, scale=1.0, unbiased=False, is_real_data=False): 95 | """Invert anscombe transform.""" 96 | sigma_hat = sigma/alpha 97 | if is_real_data: 98 | z = .25* tf.square(data) - 1./8 - tf.square(sigma_hat)[:,None,None,:] 99 | if unbiased: 100 | z = z + .25*tf.sqrt(3./2)*data**-1 - 11./8.*data**-2 + 5./8.*tf.sqrt(3./2)*data**-3 101 | result = z*alpha[:,None,None,:] 102 | else: 103 | z = .25* tf.square(data) - 1./8 - tf.square(sigma_hat)[:,None,None,None] 104 | #data = tf.Print(data, ["data", tf.reduce_max(data), tf.reduce_min(data)]) 105 | 106 | #z = tf.maximum(z, 0) 107 | if unbiased: 108 | z = z + .25*tf.sqrt(3./2)*data**-1 - 11./8.*data**-2 + 5./8.*tf.sqrt(3./2)*data**-3 109 | result = z*alpha[:,None,None,None] 110 | return result 111 | #return tf.clip_by_value(result, 0.0, scale) 112 | 113 | def prox_grad_isp(inputs, 114 | alpha, 115 | sigma, 116 | bayer_mask, 117 | num_iters=4, 118 | num_channels=3, 119 | num_layers=5, 120 | kernel=None, 121 | num_classes=1001, 122 | is_training=True, 123 | scale=1.0, 124 | use_anscombe=True, 125 | noise_channel=True, 126 | use_chen_unet=False, 127 | is_real_data=True): 128 | 129 | end_points = {} 130 | end_points['inputs'] = inputs 131 | if use_anscombe and alpha is not None: 132 | print(("USING THE ANCOMB TRANSFORM with scale %f" % scale) + "!"*10) 133 | true_img = anscombe(inputs, alpha=alpha, sigma=sigma, scale=scale, is_real_data=is_real_data) 134 | min_offset = tf.reduce_min(true_img, [1,2,3], keep_dims=True) 135 | max_scale = tf.reduce_max(true_img, [1,2,3], keep_dims=True) 136 | noise_scale = scale/(max_scale - min_offset) 137 | true_img = (true_img - min_offset)*noise_scale 138 | noise_ch = noise_scale 139 | end_points['post_anscombe'] = true_img 140 | else: 141 | true_img = inputs 142 | noise_ch = sigma[:,None,None,None] 143 | 144 | if not noise_channel: 145 | noise_ch = None 146 | else: 147 | print(("USING NOISE CHANNEL")) 148 | dims = [d.value for d in inputs.get_shape()] 149 | noise_ch = tf.tile(noise_ch, [1, dims[1], dims[2], 1]) 150 | 151 | if use_chen_unet: 152 | print('USING UNET AS ISP (NON-PROX GRAD)') 153 | from nets import unet 154 | ans_x_out = unet.unet(true_img) 155 | end_points = {} 156 | 157 | else: 158 | ans_x_out, end_points = prox_grad(true_img, bayer_mask, end_points, num_layers=num_layers, 159 | num_iters=num_iters, noise_channel=noise_ch, is_training=is_training) 160 | # ans_x_out, end_points = prox_grad(true_img, bayer_mask, end_points, num_layers=num_layers, 161 | # num_iters=num_iters, noise_channel=noise_ch, is_training=is_training) 162 | 163 | if use_anscombe and alpha is not None: 164 | 
end_points['pre_inv_anscombe'] = ans_x_out 165 | ans_x_out = ans_x_out/noise_scale + min_offset 166 | ans_x_out = inv_anscombe(ans_x_out, alpha=alpha, sigma=sigma, scale=scale, is_real_data=is_real_data) 167 | end_points['outputs'] = ans_x_out 168 | return ans_x_out, end_points 169 | 170 | 171 | def prox_grad(inputs, bayer_mask, end_points, num_layers=5, num_iters=4, lambda_init=1.0, 172 | is_training=True, scope='gauss_den', noise_channel=None): 173 | flat_inputs = tf.reduce_sum(inputs, 3, keep_dims=True) 174 | with tf.variable_scope(scope, 'gauss_den', [inputs]) as sc: 175 | xk = inputs 176 | lam = slim.variable(name='lambda', shape=[], initializer=tf.constant_initializer(lambda_init)) 177 | end_points['lambda'] = lam 178 | beta_init = 1.0 179 | for t in range(num_iters): 180 | with tf.variable_scope('iter_%i'% t): 181 | with slim.arg_scope([slim.batch_norm, slim.dropout], 182 | is_training=is_training): 183 | # Collect outputs for conv2d, fully_connected and max_pool2d. 184 | beta_init *= 2.0 # Continuation scheme as proposed in http://www.caam.rice.edu/~yzhang/reports/tr0710_rev.pdf, algorithm 2 185 | beta = slim.variable(name='beta', shape=[], initializer=tf.constant_initializer(beta_init)) 186 | end_points['beta%s'%t] = beta 187 | with tf.variable_scope('prior_grad') as prior_scope: 188 | #curr_z = cnn_proximal(xk, num_layers, 3, noise_channel, width=12, rate=1) 189 | if noise_channel is None: 190 | concat_xk = xk 191 | else: 192 | concat_xk = tf.concat([xk, noise_channel], 3) 193 | curr_z = unet_res(concat_xk, 0, 'unet') 194 | #end_points['prior_grad_%i' % t] = curr_z 195 | tmp = xk - curr_z 196 | xk = (lam*bayer_mask*inputs + beta*tmp)/(lam*bayer_mask + beta) 197 | #end_points['iter_%i' % t] = xk 198 | 199 | return xk, end_points 200 | 201 | def unet_res(inputs, depth, scope, max_depth=2): 202 | # U-NET operating at a given resolution. 203 | shape = [d.value for d in inputs.get_shape()] 204 | print(depth, shape) 205 | ch = max(shape[3]*2, 8) 206 | with tf.variable_scope('depth_%s' % depth, values=[inputs]) as scope: 207 | if depth == 0: 208 | outputs = slim.conv2d(inputs, ch, [3, 3], rate=2, scope='conv_in', normalizer_fn=None) 209 | else: 210 | outputs = slim.conv2d(inputs, ch, [3, 3], scope='conv_in') 211 | outputs = slim.conv2d(outputs, ch, [3, 3], scope='conv_1') 212 | downsamp = slim.avg_pool2d(outputs, [2, 2]) 213 | if depth < max_depth: 214 | lower = unet_res(downsamp, depth+1, scope, max_depth) 215 | outputs = tf.concat([outputs, lower], 3) 216 | with tf.variable_scope('depth_%s' % depth, values=[outputs]) as scope: 217 | outputs = slim.conv2d(outputs, ch, [3, 3], scope='conv_2') 218 | if depth > 0: 219 | outputs = slim.conv2d(outputs, ch, [3, 3], scope='out_conv') 220 | outputs = slim.conv2d_transpose(outputs, ch//2, [2,2], stride=2, scope='up_conv', 221 | activation_fn=None, normalizer_fn=None) 222 | else: 223 | outputs = slim.conv2d(outputs, 3, [3, 3], scope='out_conv', 224 | activation_fn=None, normalizer_fn=None) 225 | return outputs 226 | -------------------------------------------------------------------------------- /nets/nets_factory.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 
5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Contains a factory for building various models.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | import functools 21 | 22 | import tensorflow as tf 23 | 24 | from nets import isp 25 | from nets import mobilenet_v1 26 | from nets import mobilenet_isp 27 | 28 | slim = tf.contrib.slim 29 | 30 | networks_map = {'isp': isp.prox_grad_isp, 31 | 'mobilenet_isp': mobilenet_isp.mobilenet_v1, 32 | 'mobilenet_v1': mobilenet_v1.mobilenet_v1, 33 | 'deeper_mobilenet_v1': mobilenet_v1.deeper_mobile_net_v1, 34 | } 35 | 36 | arg_scopes_map = {'isp': isp.isp_arg_scope, 37 | 'mobilenet_isp': mobilenet_isp.mobilenet_v1_arg_scope, 38 | 'mobilenet_v1': mobilenet_v1.mobilenet_v1_arg_scope, 39 | 'deeper_mobilenet_v1': mobilenet_v1.mobilenet_v1_arg_scope, 40 | } 41 | 42 | 43 | def get_network_fn(name, num_classes, weight_decay, batch_norm_decay, is_training): 44 | """Returns a network_fn such as `logits, end_points = network_fn(images)`. 45 | 46 | Args: 47 | name: The name of the network. 48 | num_classes: The number of classes to use for classification. 49 | weight_decay: The l2 coefficient for the model weights. 50 | is_training: `True` if the model is being used for training and `False` 51 | otherwise. 52 | 53 | Returns: 54 | network_fn: A function that applies the model to a batch of images. It has 55 | the following signature: 56 | logits, end_points = network_fn(images) 57 | Raises: 58 | ValueError: If network `name` is not recognized. 59 | """ 60 | if name not in networks_map: 61 | raise ValueError('Name of network unknown %s' % name) 62 | 63 | func = networks_map[name] 64 | 65 | @functools.wraps(func) 66 | def network_fn(images, **kwargs): 67 | arg_scope = arg_scopes_map[name](weight_decay=weight_decay) 68 | 69 | with slim.arg_scope(arg_scope): 70 | return func(images, is_training=is_training, **kwargs) 71 | 72 | if hasattr(func, 'default_image_size'): 73 | network_fn.default_image_size = func.default_image_size 74 | 75 | return network_fn 76 | -------------------------------------------------------------------------------- /nets/nets_factory_test.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | # ============================================================================== 15 | 16 | """Tests for slim.inception.""" 17 | 18 | from __future__ import absolute_import 19 | from __future__ import division 20 | from __future__ import print_function 21 | 22 | import tensorflow as tf 23 | 24 | from nets import nets_factory 25 | 26 | slim = tf.contrib.slim 27 | 28 | 29 | class NetworksTest(tf.test.TestCase): 30 | 31 | def testGetNetworkFn(self): 32 | batch_size = 5 33 | num_classes = 1000 34 | for net in nets_factory.networks_map: 35 | with self.test_session(): 36 | net_fn = nets_factory.get_network_fn(net, num_classes) 37 | # Most networks use 224 as their default_image_size 38 | image_size = getattr(net_fn, 'default_image_size', 224) 39 | inputs = tf.random_uniform((batch_size, image_size, image_size, 3)) 40 | logits, end_points = net_fn(inputs) 41 | self.assertTrue(isinstance(logits, tf.Tensor)) 42 | self.assertTrue(isinstance(end_points, dict)) 43 | self.assertEqual(logits.get_shape().as_list()[0], batch_size) 44 | self.assertEqual(logits.get_shape().as_list()[-1], num_classes) 45 | 46 | def testGetNetworkFnArgScope(self): 47 | batch_size = 5 48 | num_classes = 10 49 | net = 'cifarnet' 50 | with self.test_session(use_gpu=True): 51 | net_fn = nets_factory.get_network_fn(net, num_classes) 52 | image_size = getattr(net_fn, 'default_image_size', 224) 53 | with slim.arg_scope([slim.model_variable, slim.variable], 54 | device='/CPU:0'): 55 | inputs = tf.random_uniform((batch_size, image_size, image_size, 3)) 56 | net_fn(inputs) 57 | weights = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, 'CifarNet/conv1')[0] 58 | self.assertDeviceEqual('/CPU:0', weights.device) 59 | 60 | if __name__ == '__main__': 61 | tf.test.main() 62 | -------------------------------------------------------------------------------- /nets/unet.py: -------------------------------------------------------------------------------- 1 | # Tensorflow mandates these. 
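# --- Note (added, illustrative): this module defines the encoder/decoder that
# nets/isp.py falls back to when use_chen_unet is enabled; unet() maps a noisy
# image batch to a 3-channel restored image at the input resolution. A minimal,
# hypothetical call sketch (the placeholder shape is an example only):
#
#   images = tf.placeholder(tf.float32, [None, 224, 224, 3])
#   restored = unet(images)  # -> [None, 224, 224, 3]
# ---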
2 | from __future__ import absolute_import 3 | from __future__ import division 4 | from __future__ import print_function 5 | 6 | from collections import namedtuple 7 | import functools 8 | 9 | import tensorflow as tf 10 | 11 | slim = tf.contrib.slim 12 | 13 | def lrelu(x): 14 | return tf.maximum(x * 0.2, x) 15 | 16 | def upsample_and_concat(x1, x2, output_channels, in_channels): 17 | pool_size = 2 18 | deconv_filter = tf.Variable(tf.truncated_normal([pool_size, pool_size, output_channels, in_channels], stddev=0.02)) 19 | deconv = tf.nn.conv2d_transpose(x1, deconv_filter, tf.shape(x2), strides=[1, pool_size, pool_size, 1]) 20 | 21 | deconv_output = tf.concat([deconv, x2], 3) 22 | deconv_output.set_shape([None, None, None, output_channels * 2]) 23 | 24 | return deconv_output 25 | 26 | 27 | def unet(input, scope=None): 28 | with tf.variable_scope(scope, 'gauss_den_chen_unet', [input]) as sc: 29 | conv1 = slim.conv2d(input, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv1_1') 30 | conv1 = slim.conv2d(conv1, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv1_2') 31 | pool1 = slim.max_pool2d(conv1, [2, 2], padding='SAME') 32 | 33 | conv2 = slim.conv2d(pool1, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv2_1') 34 | conv2 = slim.conv2d(conv2, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv2_2') 35 | pool2 = slim.max_pool2d(conv2, [2, 2], padding='SAME') 36 | 37 | conv3 = slim.conv2d(pool2, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv3_1') 38 | conv3 = slim.conv2d(conv3, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv3_2') 39 | pool3 = slim.max_pool2d(conv3, [2, 2], padding='SAME') 40 | 41 | conv4 = slim.conv2d(pool3, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv4_1') 42 | conv4 = slim.conv2d(conv4, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv4_2') 43 | pool4 = slim.max_pool2d(conv4, [2, 2], padding='SAME') 44 | 45 | conv5 = slim.conv2d(pool4, 128, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv5_1') 46 | conv5 = slim.conv2d(conv5, 128, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv5_2') 47 | 48 | up6 = upsample_and_concat(conv5, conv4, 64, 128) 49 | conv6 = slim.conv2d(up6, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv6_1') 50 | conv6 = slim.conv2d(conv6, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv6_2') 51 | 52 | up7 = upsample_and_concat(conv6, conv3, 32, 64) 53 | conv7 = slim.conv2d(up7, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv7_1') 54 | conv7 = slim.conv2d(conv7, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv7_2') 55 | 56 | up8 = upsample_and_concat(conv7, conv2, 16, 32) 57 | conv8 = slim.conv2d(up8, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv8_1') 58 | conv8 = slim.conv2d(conv8, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv8_2') 59 | 60 | up9 = upsample_and_concat(conv8, conv1, 8, 16) 61 | conv9 = slim.conv2d(up9, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv9_1') 62 | conv9 = slim.conv2d(conv9, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv9_2') 63 | 64 | # conv10 = slim.conv2d(conv9, 12, [1, 1], rate=1, activation_fn=None, scope='g_conv10') 65 | # out = tf.depth_to_space(conv10, 2) 66 | out = slim.conv2d(conv9, 3, [1, 1], rate=1, activation_fn=None, scope='g_conv10') 67 | return out 68 | 69 | # def unet(input, scope=None): 70 | # with tf.variable_scope(scope, 'gauss_den_chen_unet', [input]) as sc: 71 | # conv1 = slim.conv2d(input, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv1_1') 72 | # conv1 = slim.conv2d(conv1, 8, [3, 3], rate=1, 
activation_fn=lrelu, scope='g_conv1_2') 73 | # pool1 = slim.max_pool2d(conv1, [2, 2], padding='SAME') 74 | 75 | # conv2 = slim.conv2d(pool1, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv2_1') 76 | # conv2 = slim.conv2d(conv2, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv2_2') 77 | # pool2 = slim.max_pool2d(conv2, [2, 2], padding='SAME') 78 | 79 | # conv3 = slim.conv2d(pool2, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv3_1') 80 | # conv3 = slim.conv2d(conv3, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv3_2') 81 | # pool3 = slim.max_pool2d(conv3, [2, 2], padding='SAME') 82 | 83 | # conv4 = slim.conv2d(pool3, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv4_1') 84 | # conv4 = slim.conv2d(conv4, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv4_2') 85 | # pool4 = slim.max_pool2d(conv4, [2, 2], padding='SAME') 86 | 87 | # conv5 = slim.conv2d(pool4, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv5_1') 88 | # conv5 = slim.conv2d(conv5, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv5_2') 89 | 90 | # up6 = upsample_and_concat(conv5, conv4, 32, 64) 91 | # conv6 = slim.conv2d(up6, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv6_1') 92 | # conv6 = slim.conv2d(conv6, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv6_2') 93 | 94 | # up7 = upsample_and_concat(conv6, conv3, 32, 64) 95 | # conv7 = slim.conv2d(up7, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv7_1') 96 | # conv7 = slim.conv2d(conv7, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv7_2') 97 | 98 | # up8 = upsample_and_concat(conv7, conv2, 16, 32) 99 | # conv8 = slim.conv2d(up8, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv8_1') 100 | # conv8 = slim.conv2d(conv8, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv8_2') 101 | 102 | # up9 = upsample_and_concat(conv8, conv1, 8, 16) 103 | # conv9 = slim.conv2d(up9, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv9_1') 104 | # conv9 = slim.conv2d(conv9, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv9_2') 105 | 106 | # # conv10 = slim.conv2d(conv9, 12, [1, 1], rate=1, activation_fn=None, scope='g_conv10') 107 | # # out = tf.depth_to_space(conv10, 2) 108 | # out = slim.conv2d(conv9, 3, [1, 1], rate=1, activation_fn=None, scope='g_conv10') 109 | # return out -------------------------------------------------------------------------------- /preprocessing/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /preprocessing/inception_preprocessing.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | # ============================================================================== 15 | """Provides utilities to preprocess images for the Inception networks.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | from preprocessing import sensor_model 22 | 23 | import tensorflow as tf 24 | import numpy as np 25 | 26 | from tensorflow.python.ops import control_flow_ops 27 | 28 | 29 | def apply_with_random_selector(x, func, num_cases): 30 | """Computes func(x, sel), with sel sampled from [0...num_cases-1]. 31 | 32 | Args: 33 | x: input Tensor. 34 | func: Python function to apply. 35 | num_cases: Python int32, number of cases to sample sel from. 36 | 37 | Returns: 38 | The result of func(x, sel), where func receives the value of the 39 | selector as a python integer, but sel is sampled dynamically. 40 | """ 41 | sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32) 42 | # Pass the real x only to one of the func calls. 43 | return control_flow_ops.merge([ 44 | func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case) 45 | for case in range(num_cases)])[0] 46 | 47 | 48 | def distort_color(image, color_ordering=0, fast_mode=True, scope=None): 49 | """Distort the color of a Tensor image. 50 | 51 | Each color distortion is non-commutative and thus ordering of the color ops 52 | matters. Ideally we would randomly permute the ordering of the color ops. 53 | Rather then adding that level of complication, we select a distinct ordering 54 | of color ops for each preprocessing thread. 55 | 56 | Args: 57 | image: 3-D Tensor containing single image in [0, 1]. 58 | color_ordering: Python int, a type of distortion (valid values: 0-3). 59 | fast_mode: Avoids slower ops (random_hue and random_contrast) 60 | scope: Optional scope for name_scope. 61 | Returns: 62 | 3-D Tensor color-distorted image on range [0, 1] 63 | Raises: 64 | ValueError: if color_ordering not in [0, 3] 65 | """ 66 | with tf.name_scope(scope, 'distort_color', [image]): 67 | if fast_mode: 68 | if color_ordering == 0: 69 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 70 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 71 | else: 72 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 73 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 74 | else: 75 | if color_ordering == 0: 76 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 77 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 78 | image = tf.image.random_hue(image, max_delta=0.2) 79 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 80 | elif color_ordering == 1: 81 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 82 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 83 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 84 | image = tf.image.random_hue(image, max_delta=0.2) 85 | elif color_ordering == 2: 86 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 87 | image = tf.image.random_hue(image, max_delta=0.2) 88 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 
89 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 90 | elif color_ordering == 3: 91 | image = tf.image.random_hue(image, max_delta=0.2) 92 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 93 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 94 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 95 | else: 96 | raise ValueError('color_ordering must be in [0, 3]') 97 | 98 | # The random_* ops do not necessarily clamp. 99 | return tf.clip_by_value(image, 0.0, 1.0) 100 | 101 | 102 | def distorted_bounding_box_crop(image, 103 | bbox, 104 | min_object_covered=0.1, 105 | aspect_ratio_range=(0.75, 1.33), 106 | area_range=(0.05, 1.0), 107 | max_attempts=100, 108 | scope=None): 109 | """Generates cropped_image using a one of the bboxes randomly distorted. 110 | 111 | See `tf.image.sample_distorted_bounding_box` for more documentation. 112 | 113 | Args: 114 | image: 3-D Tensor of image (it will be converted to floats in [0, 1]). 115 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 116 | where each coordinate is [0, 1) and the coordinates are arranged 117 | as [ymin, xmin, ymax, xmax]. If num_boxes is 0 then it would use the whole 118 | image. 119 | min_object_covered: An optional `float`. Defaults to `0.1`. The cropped 120 | area of the image must contain at least this fraction of any bounding box 121 | supplied. 122 | aspect_ratio_range: An optional list of `floats`. The cropped area of the 123 | image must have an aspect ratio = width / height within this range. 124 | area_range: An optional list of `floats`. The cropped area of the image 125 | must contain a fraction of the supplied image within in this range. 126 | max_attempts: An optional `int`. Number of attempts at generating a cropped 127 | region of the image of the specified constraints. After `max_attempts` 128 | failures, return the entire image. 129 | scope: Optional scope for name_scope. 130 | Returns: 131 | A tuple, a 3-D Tensor cropped_image and the distorted bbox 132 | """ 133 | with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]): 134 | # Each bounding box has shape [1, num_boxes, box coords] and 135 | # the coordinates are ordered [ymin, xmin, ymax, xmax]. 136 | 137 | # A large fraction of image datasets contain a human-annotated bounding 138 | # box delineating the region of the image containing the object of interest. 139 | # We choose to create a new bounding box for the object which is a randomly 140 | # distorted version of the human-annotated bounding box that obeys an 141 | # allowed range of aspect ratios, sizes and overlap with the human-annotated 142 | # bounding box. If no box is supplied, then we assume the bounding box is 143 | # the entire image. 144 | sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box( 145 | tf.shape(image), 146 | bounding_boxes=bbox, 147 | min_object_covered=min_object_covered, 148 | aspect_ratio_range=aspect_ratio_range, 149 | area_range=area_range, 150 | max_attempts=max_attempts, 151 | use_image_if_no_bounding_boxes=True) 152 | bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box 153 | 154 | # Crop the image to the specified bounding box. 155 | cropped_image = tf.slice(image, bbox_begin, bbox_size) 156 | return cropped_image, distort_bbox 157 | 158 | 159 | def preprocess_for_train(image, height, width, bbox, 160 | fast_mode=True, 161 | light_level=None, 162 | scope=None): 163 | """Distort one image for training a netwo. 
164 | 165 | Distorting images provides a useful technique for augmenting the data 166 | set during training in order to make the network invariant to aspects 167 | of the image that do not effect the label. 168 | 169 | Additionally it would create image_summaries to display the different 170 | transformations applied to the image. 171 | 172 | Args: 173 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be 174 | [0, 1], otherwise it would converted to tf.float32 assuming that the range 175 | is [0, MAX], where MAX is largest positive representable number for 176 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details). 177 | height: integer 178 | width: integer 179 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 180 | where each coordinate is [0, 1) and the coordinates are arranged 181 | as [ymin, xmin, ymax, xmax]. 182 | fast_mode: Optional boolean, if True avoids slower transformations (i.e. 183 | bi-cubic resizing, random_hue or random_contrast). 184 | scope: Optional scope for name_scope. 185 | Returns: 186 | 3-D float Tensor of distorted image used for training with range [-1, 1]. 187 | """ 188 | with tf.name_scope(scope, 'distort_image', [image, height, width, bbox]): 189 | if bbox is None: 190 | bbox = tf.constant([0.0, 0.0, 1.0, 1.0], 191 | dtype=tf.float32, 192 | shape=[1, 1, 4]) 193 | if image.dtype != tf.float32: 194 | image = tf.image.convert_image_dtype(image, dtype=tf.float32) 195 | 196 | # Each bounding box has shape [1, num_boxes, box coords] and 197 | # the coordinates are ordered [ymin, xmin, ymax, xmax]. 198 | image_with_box = tf.image.draw_bounding_boxes(tf.expand_dims(image, 0), 199 | bbox) 200 | tf.summary.image('image_with_bounding_boxes', image_with_box) 201 | 202 | distorted_image, distorted_bbox = distorted_bounding_box_crop(image, bbox) 203 | 204 | # Restore the shape since the dynamic slice based upon the bbox_size loses 205 | # the third dimension. 206 | distorted_image.set_shape([None, None, 3]) 207 | image_with_distorted_box = tf.image.draw_bounding_boxes( 208 | tf.expand_dims(image, 0), distorted_bbox) 209 | tf.summary.image('images_with_distorted_bounding_box', 210 | image_with_distorted_box) 211 | 212 | # Use nearest neighbor subsampling. 213 | print("USING NEAREST NEIGHBOR SUBSAMPLING") 214 | distorted_image = tf.image.resize_images(distorted_image, [height, width], 215 | method=tf.image.ResizeMethod.NEAREST_NEIGHBOR) 216 | 217 | tf.summary.image('cropped_resized_image', 218 | tf.expand_dims(distorted_image, 0)) 219 | 220 | # Add noise - this is only relevant when training the model from scratch. For training with an ISP, there's a small subset of images that are already noised up. 221 | #distorted_image, a, gauss_std = sensor_model.sensor_noise_rand_light_level(distorted_image, light_level) 222 | #tf.summary.image('noisy_image', tf.expand_dims(distorted_image,0)) 223 | #bayer_mask = sensor_model.get_bayer_mask(height, width) 224 | #tf.summary.image('bayer_mask', tf.expand_dims(bayer_mask*255, 0)) 225 | #distorted_image = distorted_image*bayer_mask 226 | 227 | # Randomly flip the image horizontally. 
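# --- Editor's note (illustrative sketch, not part of the original file) -----
# The block commented out above is where simulated sensor noise and the Bayer
# mask would be applied when training the classifier directly on raw-like
# data. A minimal, hedged sketch of re-enabling it, assuming `light_level` is
# an (ll_low, ll_high) pair and that the single image is expanded to a batch
# of one so it matches the 4-D batch input used by
# sensor_model.sensor_noise_rand_light_level further below in this repo:
#
#   batched = tf.expand_dims(distorted_image, 0)
#   noisy, a, gauss_std = sensor_model.sensor_noise_rand_light_level(
#       batched, light_level)
#   noisy = tf.squeeze(noisy, [0])
#   bayer_mask = sensor_model.get_bayer_mask(height, width)
#   distorted_image = noisy * bayer_mask
#
# The horizontal flip below is unchanged.
# ----------------------------------------------------------------------------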
228 | distorted_image = tf.image.random_flip_left_right(distorted_image) 229 | 230 | tf.summary.image('final_distorted_image', 231 | tf.expand_dims(distorted_image, 0)) 232 | distorted_image = tf.subtract(distorted_image, 0.5) 233 | distorted_image = tf.multiply(distorted_image, 2.0) 234 | return distorted_image 235 | 236 | 237 | def preprocess_for_eval(image, height, width, light_level=None, 238 | central_fraction=0.875, scope=None): 239 | """Prepare one image for evaluation. 240 | 241 | If height and width are specified it would output an image with that size by 242 | applying resize_bilinear. 243 | 244 | If central_fraction is specified it would cropt the central fraction of the 245 | input image. 246 | 247 | Args: 248 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be 249 | [0, 1], otherwise it would converted to tf.float32 assuming that the range 250 | is [0, MAX], where MAX is largest positive representable number for 251 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details) 252 | height: integer 253 | width: integer 254 | central_fraction: Optional Float, fraction of the image to crop. 255 | scope: Optional scope for name_scope. 256 | Returns: 257 | 3-D float Tensor of prepared image. 258 | """ 259 | with tf.name_scope(scope, 'eval_image', [image, height, width]): 260 | if image.dtype != tf.float32: 261 | image = tf.image.convert_image_dtype(image, dtype=tf.float32) 262 | 263 | # Crop the central region of the image with an area containing 87.5% of 264 | # the original image. 265 | if central_fraction: 266 | image = tf.image.central_crop(image, central_fraction=central_fraction) 267 | 268 | #image = tf.py_func(sensor_model.sensor_model, [image], tf.float32, stateful=True) 269 | if height and width: 270 | # Resize the image to the specified height and width. 271 | image = tf.expand_dims(image, 0) 272 | image = tf.image.resize_images(image, [height, width], method=tf.image.ResizeMethod.NEAREST_NEIGHBOR) 273 | 274 | image = tf.squeeze(image, [0]) 275 | 276 | # Add noise (only for our ISP) 277 | #image, a, gauss_std = sensor_model.sensor_noise_rand_light_level(image, light_level) 278 | #image = image*sensor_model.get_bayer_mask(height, width) 279 | 280 | image = tf.subtract(image, 0.5) 281 | image = tf.multiply(image, 2.0) 282 | image.set_shape([height, width, 3]) 283 | return image 284 | 285 | 286 | def preprocess_image(image, ground_truth, height, width, 287 | is_training=False, 288 | bbox=None, 289 | fast_mode=True, 290 | light_level=None): 291 | """Pre-process one image for training or evaluation. 292 | 293 | Args: 294 | image: 3-D Tensor [height, width, channels] with the image. 295 | height: integer, image expected height. 296 | width: integer, image expected width. 297 | is_training: Boolean. If true it would transform an image for train, 298 | otherwise it would transform it for evaluation. 299 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 300 | where each coordinate is [0, 1) and the coordinates are arranged as 301 | [ymin, xmin, ymax, xmax]. 302 | fast_mode: Optional boolean, if True avoids slower transformations. 
303 | 304 | Returns: 305 | 3-D float Tensor containing an appropriately scaled image 306 | 307 | Raises: 308 | ValueError: if user does not provide bounding box 309 | """ 310 | if is_training: 311 | return preprocess_for_train(image, height, width, bbox, fast_mode, light_level) 312 | else: 313 | return preprocess_for_eval(image, height, width, light_level) 314 | -------------------------------------------------------------------------------- /preprocessing/joint_isp_preprocessing.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Provides utilities to preprocess images for the Inception networks.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | from preprocessing import sensor_model 22 | 23 | import tensorflow as tf 24 | import numpy as np 25 | 26 | from tensorflow.python.ops import control_flow_ops 27 | 28 | 29 | def apply_with_random_selector(x, func, num_cases): 30 | """Computes func(x, sel), with sel sampled from [0...num_cases-1]. 31 | 32 | Args: 33 | x: input Tensor. 34 | func: Python function to apply. 35 | num_cases: Python int32, number of cases to sample sel from. 36 | 37 | Returns: 38 | The result of func(x, sel), where func receives the value of the 39 | selector as a python integer, but sel is sampled dynamically. 40 | """ 41 | sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32) 42 | # Pass the real x only to one of the func calls. 43 | return control_flow_ops.merge([ 44 | func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case) 45 | for case in range(num_cases)])[0] 46 | 47 | 48 | def distort_color(image, color_ordering=0, fast_mode=True, scope=None): 49 | """Distort the color of a Tensor image. 50 | 51 | Each color distortion is non-commutative and thus ordering of the color ops 52 | matters. Ideally we would randomly permute the ordering of the color ops. 53 | Rather then adding that level of complication, we select a distinct ordering 54 | of color ops for each preprocessing thread. 55 | 56 | Args: 57 | image: 3-D Tensor containing single image in [0, 1]. 58 | color_ordering: Python int, a type of distortion (valid values: 0-3). 59 | fast_mode: Avoids slower ops (random_hue and random_contrast) 60 | scope: Optional scope for name_scope. 61 | Returns: 62 | 3-D Tensor color-distorted image on range [0, 1] 63 | Raises: 64 | ValueError: if color_ordering not in [0, 3] 65 | """ 66 | with tf.name_scope(scope, 'distort_color', [image]): 67 | if fast_mode: 68 | if color_ordering == 0: 69 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 
70 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 71 | else: 72 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 73 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 74 | else: 75 | if color_ordering == 0: 76 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 77 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 78 | image = tf.image.random_hue(image, max_delta=0.2) 79 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 80 | elif color_ordering == 1: 81 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 82 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 83 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 84 | image = tf.image.random_hue(image, max_delta=0.2) 85 | elif color_ordering == 2: 86 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 87 | image = tf.image.random_hue(image, max_delta=0.2) 88 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 89 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 90 | elif color_ordering == 3: 91 | image = tf.image.random_hue(image, max_delta=0.2) 92 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 93 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 94 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 95 | else: 96 | raise ValueError('color_ordering must be in [0, 3]') 97 | 98 | # The random_* ops do not necessarily clamp. 99 | return tf.clip_by_value(image, 0.0, 1.0) 100 | 101 | 102 | def distorted_bounding_box_crop(image, 103 | bbox, 104 | min_object_covered=0.1, 105 | aspect_ratio_range=(0.75, 1.33), 106 | area_range=(0.05, 1.0), 107 | max_attempts=100, 108 | scope=None): 109 | """Generates cropped_image using a one of the bboxes randomly distorted. 110 | 111 | See `tf.image.sample_distorted_bounding_box` for more documentation. 112 | 113 | Args: 114 | image: 3-D Tensor of image (it will be converted to floats in [0, 1]). 115 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 116 | where each coordinate is [0, 1) and the coordinates are arranged 117 | as [ymin, xmin, ymax, xmax]. If num_boxes is 0 then it would use the whole 118 | image. 119 | min_object_covered: An optional `float`. Defaults to `0.1`. The cropped 120 | area of the image must contain at least this fraction of any bounding box 121 | supplied. 122 | aspect_ratio_range: An optional list of `floats`. The cropped area of the 123 | image must have an aspect ratio = width / height within this range. 124 | area_range: An optional list of `floats`. The cropped area of the image 125 | must contain a fraction of the supplied image within in this range. 126 | max_attempts: An optional `int`. Number of attempts at generating a cropped 127 | region of the image of the specified constraints. After `max_attempts` 128 | failures, return the entire image. 129 | scope: Optional scope for name_scope. 130 | Returns: 131 | A tuple, a 3-D Tensor cropped_image and the distorted bbox 132 | """ 133 | with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]): 134 | # Each bounding box has shape [1, num_boxes, box coords] and 135 | # the coordinates are ordered [ymin, xmin, ymax, xmax]. 136 | 137 | # A large fraction of image datasets contain a human-annotated bounding 138 | # box delineating the region of the image containing the object of interest. 
139 | # We choose to create a new bounding box for the object which is a randomly 140 | # distorted version of the human-annotated bounding box that obeys an 141 | # allowed range of aspect ratios, sizes and overlap with the human-annotated 142 | # bounding box. If no box is supplied, then we assume the bounding box is 143 | # the entire image. 144 | sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box( 145 | tf.shape(image), 146 | bounding_boxes=bbox, 147 | min_object_covered=min_object_covered, 148 | aspect_ratio_range=aspect_ratio_range, 149 | area_range=area_range, 150 | max_attempts=max_attempts, 151 | use_image_if_no_bounding_boxes=True) 152 | bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box 153 | 154 | # Crop the image to the specified bounding box. 155 | cropped_image = tf.slice(image, bbox_begin, bbox_size) 156 | return cropped_image, distort_bbox 157 | 158 | 159 | def preprocess_for_train(image, height, width, bbox, 160 | fast_mode=True, 161 | light_level=None, 162 | scope=None): 163 | """Distort one image for training a netwo. 164 | 165 | Distorting images provides a useful technique for augmenting the data 166 | set during training in order to make the network invariant to aspects 167 | of the image that do not effect the label. 168 | 169 | Additionally it would create image_summaries to display the different 170 | transformations applied to the image. 171 | 172 | Args: 173 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be 174 | [0, 1], otherwise it would converted to tf.float32 assuming that the range 175 | is [0, MAX], where MAX is largest positive representable number for 176 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details). 177 | height: integer 178 | width: integer 179 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 180 | where each coordinate is [0, 1) and the coordinates are arranged 181 | as [ymin, xmin, ymax, xmax]. 182 | fast_mode: Optional boolean, if True avoids slower transformations (i.e. 183 | bi-cubic resizing, random_hue or random_contrast). 184 | scope: Optional scope for name_scope. 185 | Returns: 186 | 3-D float Tensor of distorted image used for training with range [-1, 1]. 187 | """ 188 | with tf.name_scope(scope, 'distort_image', [image, height, width, bbox]): 189 | 190 | 191 | if bbox is None: 192 | bbox = tf.constant([0.0, 0.0, 1.0, 1.0], 193 | dtype=tf.float32, 194 | shape=[1, 1, 4]) 195 | if image.dtype != tf.float32: 196 | image = tf.image.convert_image_dtype(image, dtype=tf.float32) 197 | 198 | # Each bounding box has shape [1, num_boxes, box coords] and 199 | # the coordinates are ordered [ymin, xmin, ymax, xmax]. 200 | image_with_box = tf.image.draw_bounding_boxes(tf.expand_dims(image, 0), 201 | bbox) 202 | tf.summary.image('image_with_bounding_boxes', image_with_box) 203 | 204 | distorted_image, distorted_bbox = distorted_bounding_box_crop(image, bbox) 205 | # Restore the shape since the dynamic slice based upon the bbox_size loses 206 | # the third dimension. 207 | distorted_image.set_shape([None, None, 3]) 208 | image_with_distorted_box = tf.image.draw_bounding_boxes( 209 | tf.expand_dims(image, 0), distorted_bbox) 210 | tf.summary.image('images_with_distorted_bounding_box', 211 | image_with_distorted_box) 212 | 213 | # This resizing operation may distort the images because the aspect 214 | # ratio is not respected. We select a resize method in a round robin 215 | # fashion based on the thread number. 
216 | # Note that ResizeMethod contains 4 enumerated resizing methods. 217 | 218 | 219 | # We select only 1 case for fast_mode bilinear. 220 | #num_resize_cases = 1 221 | #distorted_image = apply_with_random_selector( 222 | # distorted_image, 223 | # lambda x, method: tf.image.resize_images(x, [height, width], method=method), 224 | # num_cases=num_resize_cases) 225 | 226 | # Use nearest neighbor subsampling. 227 | print("USING NEAREST NEIGHBOR SUBSAMPLING") 228 | distorted_image = tf.image.resize_images(distorted_image, [height, width], 229 | method=tf.image.ResizeMethod.NEAREST_NEIGHBOR) 230 | 231 | tf.summary.image('cropped_resized_image', 232 | tf.expand_dims(distorted_image, 0)) 233 | 234 | # Randomly flip the image horizontally. 235 | distorted_image = tf.image.random_flip_left_right(distorted_image) 236 | 237 | tf.summary.image('final_distorted_image', 238 | tf.expand_dims(distorted_image, 0)) 239 | return distorted_image 240 | 241 | 242 | def preprocess_for_eval(image, height, width, light_level=None, 243 | central_fraction=0.875, scope=None, sensor='Nexus_6P_rear'): 244 | """Prepare one image for evaluation. 245 | 246 | If height and width are specified it would output an image with that size by 247 | applying resize_bilinear. 248 | 249 | If central_fraction is specified it would cropt the central fraction of the 250 | input image. 251 | 252 | Args: 253 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be 254 | [0, 1], otherwise it would converted to tf.float32 assuming that the range 255 | is [0, MAX], where MAX is largest positive representable number for 256 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details) 257 | height: integer 258 | width: integer 259 | central_fraction: Optional Float, fraction of the image to crop. 260 | scope: Optional scope for name_scope. 261 | Returns: 262 | 3-D float Tensor of prepared image. 263 | """ 264 | with tf.name_scope(scope, 'eval_image', [image, height, width]): 265 | if image.dtype != tf.float32: 266 | image = tf.image.convert_image_dtype(image, dtype=tf.float32) 267 | 268 | # Crop the central region of the image with an area containing 87.5% of 269 | # the original image. 270 | if central_fraction: 271 | image = tf.image.central_crop(image, central_fraction=central_fraction) 272 | 273 | #image = tf.py_func(sensor_model.sensor_model, [image], tf.float32, stateful=True) 274 | if height and width: 275 | # Resize the image to the specified height and width. 276 | image = tf.expand_dims(image, 0) 277 | image = tf.image.resize_images(image, [height, width], method=tf.image.ResizeMethod.NEAREST_NEIGHBOR) 278 | 279 | image = tf.squeeze(image, [0]) 280 | 281 | 282 | B = image[::2,::2,2] 283 | R = image[1::2,1::2,0] 284 | G1 = image[1::2,::2,1] 285 | G2 = image[::2,1::2,1] 286 | stacked = tf.stack([R, B, G1, G2], axis=2) 287 | mean = tf.reduce_mean(stacked) 288 | std = tf.py_func(noise_est, [stacked], tf.float32) 289 | light_level = sensor_model.std2ll(std, mean=mean, sensor=sensor) 290 | light_level.set_shape([]) 291 | image.set_shape([height, width, 3]) 292 | return image, light_level 293 | 294 | def noise_est(img): 295 | stds = sensor_model.estimate_std(img) 296 | return np.float32(np.mean(stds)) 297 | 298 | def preprocess_image(image, ground_truth, height, width, 299 | is_training=False, 300 | bbox=None, 301 | fast_mode=True, 302 | light_level=None, 303 | sensor='Nexus_6P_rear'): 304 | """Pre-process one image for training or evaluation. 
305 | 306 | Args: 307 | image: 3-D Tensor [height, width, channels] with the image. 308 | height: integer, image expected height. 309 | width: integer, image expected width. 310 | is_training: Boolean. If true it would transform an image for train, 311 | otherwise it would transform it for evaluation. 312 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 313 | where each coordinate is [0, 1) and the coordinates are arranged as 314 | [ymin, xmin, ymax, xmax]. 315 | fast_mode: Optional boolean, if True avoids slower transformations. 316 | 317 | Returns: 318 | 3-D float Tensor containing an appropriately scaled image 319 | 320 | Raises: 321 | ValueError: if user does not provide bounding box 322 | """ 323 | if is_training: 324 | return preprocess_for_train(image, height, width, bbox, fast_mode, light_level) 325 | else: 326 | return preprocess_for_eval(image, height, width, light_level, sensor=sensor) 327 | -------------------------------------------------------------------------------- /preprocessing/no_preprocessing.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Provides utilities to preprocess images for the Inception networks.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | from preprocessing import sensor_model 22 | 23 | import tensorflow as tf 24 | import numpy as np 25 | 26 | from tensorflow.python.ops import control_flow_ops 27 | 28 | 29 | def apply_with_random_selector(x, func, num_cases): 30 | """Computes func(x, sel), with sel sampled from [0...num_cases-1]. 31 | 32 | Args: 33 | x: input Tensor. 34 | func: Python function to apply. 35 | num_cases: Python int32, number of cases to sample sel from. 36 | 37 | Returns: 38 | The result of func(x, sel), where func receives the value of the 39 | selector as a python integer, but sel is sampled dynamically. 40 | """ 41 | sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32) 42 | # Pass the real x only to one of the func calls. 43 | return control_flow_ops.merge([ 44 | func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case) 45 | for case in range(num_cases)])[0] 46 | 47 | 48 | def distort_color(image, color_ordering=0, fast_mode=True, scope=None): 49 | """Distort the color of a Tensor image. 50 | 51 | Each color distortion is non-commutative and thus ordering of the color ops 52 | matters. Ideally we would randomly permute the ordering of the color ops. 53 | Rather then adding that level of complication, we select a distinct ordering 54 | of color ops for each preprocessing thread. 55 | 56 | Args: 57 | image: 3-D Tensor containing single image in [0, 1]. 58 | color_ordering: Python int, a type of distortion (valid values: 0-3). 
59 | fast_mode: Avoids slower ops (random_hue and random_contrast) 60 | scope: Optional scope for name_scope. 61 | Returns: 62 | 3-D Tensor color-distorted image on range [0, 1] 63 | Raises: 64 | ValueError: if color_ordering not in [0, 3] 65 | """ 66 | with tf.name_scope(scope, 'distort_color', [image]): 67 | if fast_mode: 68 | if color_ordering == 0: 69 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 70 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 71 | else: 72 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 73 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 74 | else: 75 | if color_ordering == 0: 76 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 77 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 78 | image = tf.image.random_hue(image, max_delta=0.2) 79 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 80 | elif color_ordering == 1: 81 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 82 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 83 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 84 | image = tf.image.random_hue(image, max_delta=0.2) 85 | elif color_ordering == 2: 86 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 87 | image = tf.image.random_hue(image, max_delta=0.2) 88 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 89 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 90 | elif color_ordering == 3: 91 | image = tf.image.random_hue(image, max_delta=0.2) 92 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 93 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 94 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 95 | else: 96 | raise ValueError('color_ordering must be in [0, 3]') 97 | 98 | # The random_* ops do not necessarily clamp. 99 | return tf.clip_by_value(image, 0.0, 1.0) 100 | 101 | 102 | def distorted_bounding_box_crop(image, 103 | bbox, 104 | min_object_covered=0.1, 105 | aspect_ratio_range=(0.75, 1.33), 106 | area_range=(0.05, 1.0), 107 | max_attempts=100, 108 | scope=None): 109 | """Generates cropped_image using a one of the bboxes randomly distorted. 110 | 111 | See `tf.image.sample_distorted_bounding_box` for more documentation. 112 | 113 | Args: 114 | image: 3-D Tensor of image (it will be converted to floats in [0, 1]). 115 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 116 | where each coordinate is [0, 1) and the coordinates are arranged 117 | as [ymin, xmin, ymax, xmax]. If num_boxes is 0 then it would use the whole 118 | image. 119 | min_object_covered: An optional `float`. Defaults to `0.1`. The cropped 120 | area of the image must contain at least this fraction of any bounding box 121 | supplied. 122 | aspect_ratio_range: An optional list of `floats`. The cropped area of the 123 | image must have an aspect ratio = width / height within this range. 124 | area_range: An optional list of `floats`. The cropped area of the image 125 | must contain a fraction of the supplied image within in this range. 126 | max_attempts: An optional `int`. Number of attempts at generating a cropped 127 | region of the image of the specified constraints. After `max_attempts` 128 | failures, return the entire image. 129 | scope: Optional scope for name_scope. 
130 | Returns: 131 | A tuple, a 3-D Tensor cropped_image and the distorted bbox 132 | """ 133 | with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]): 134 | # Each bounding box has shape [1, num_boxes, box coords] and 135 | # the coordinates are ordered [ymin, xmin, ymax, xmax]. 136 | 137 | # A large fraction of image datasets contain a human-annotated bounding 138 | # box delineating the region of the image containing the object of interest. 139 | # We choose to create a new bounding box for the object which is a randomly 140 | # distorted version of the human-annotated bounding box that obeys an 141 | # allowed range of aspect ratios, sizes and overlap with the human-annotated 142 | # bounding box. If no box is supplied, then we assume the bounding box is 143 | # the entire image. 144 | sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box( 145 | tf.shape(image), 146 | bounding_boxes=bbox, 147 | min_object_covered=min_object_covered, 148 | aspect_ratio_range=aspect_ratio_range, 149 | area_range=area_range, 150 | max_attempts=max_attempts, 151 | use_image_if_no_bounding_boxes=True) 152 | bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box 153 | 154 | # Crop the image to the specified bounding box. 155 | cropped_image = tf.slice(image, bbox_begin, bbox_size) 156 | return cropped_image, distort_bbox 157 | 158 | 159 | def preprocess_for_train(image, height, width, bbox, 160 | fast_mode=True, 161 | light_level=None, 162 | scope=None): 163 | """Distort one image for training a netwo. 164 | 165 | Distorting images provides a useful technique for augmenting the data 166 | set during training in order to make the network invariant to aspects 167 | of the image that do not effect the label. 168 | 169 | Additionally it would create image_summaries to display the different 170 | transformations applied to the image. 171 | 172 | Args: 173 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be 174 | [0, 1], otherwise it would converted to tf.float32 assuming that the range 175 | is [0, MAX], where MAX is largest positive representable number for 176 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details). 177 | height: integer 178 | width: integer 179 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 180 | where each coordinate is [0, 1) and the coordinates are arranged 181 | as [ymin, xmin, ymax, xmax]. 182 | fast_mode: Optional boolean, if True avoids slower transformations (i.e. 183 | bi-cubic resizing, random_hue or random_contrast). 184 | scope: Optional scope for name_scope. 185 | Returns: 186 | 3-D float Tensor of distorted image used for training with range [-1, 1]. 187 | """ 188 | with tf.name_scope(scope, 'distort_image', [image, height, width, bbox]): 189 | 190 | 191 | if image.dtype != tf.float32: 192 | image = tf.image.convert_image_dtype(image, dtype=tf.float32) 193 | 194 | # Randomly flip the image horizontally. 195 | distorted_image = tf.image.random_flip_left_right(image) 196 | 197 | tf.summary.image('final_distorted_image', 198 | tf.expand_dims(distorted_image, 0)) 199 | distorted_image.set_shape([height, width, 3]) 200 | return 2*(distorted_image - 0.5) 201 | 202 | 203 | def preprocess_for_eval(image, height, width, light_level=None, 204 | central_fraction=0.875, scope=None): 205 | """Prepare one image for evaluation. 206 | 207 | If height and width are specified it would output an image with that size by 208 | applying resize_bilinear. 
209 | 210 | If central_fraction is specified it would cropt the central fraction of the 211 | input image. 212 | 213 | Args: 214 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be 215 | [0, 1], otherwise it would converted to tf.float32 assuming that the range 216 | is [0, MAX], where MAX is largest positive representable number for 217 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details) 218 | height: integer 219 | width: integer 220 | central_fraction: Optional Float, fraction of the image to crop. 221 | scope: Optional scope for name_scope. 222 | Returns: 223 | 3-D float Tensor of prepared image. 224 | """ 225 | with tf.name_scope(scope, 'eval_image', [image, height, width]): 226 | if image.dtype != tf.float32: 227 | image = tf.image.convert_image_dtype(image, dtype=tf.float32) 228 | image.set_shape([height, width, 3]) 229 | 230 | # Break into colors and then upsample. 231 | #B = tf.image.resize_images(image[::2,::2,2:3], [height, width]) 232 | #R = tf.image.resize_images(image[1::2,1::2,0:1], [height, width]) 233 | #G1 = tf.image.resize_images(image[1::2,::2,1:2], [height, width]) 234 | #G2 = tf.image.resize_images(image[::2,1::2,1:2], [height, width]) 235 | #image = tf.concat([R,(G1+G2)/2,B], axis=2) 236 | 237 | #image = tf.Print(image, [tf.reduce_min(image), tf.reduce_max(image)]) 238 | return 2*(image - 0.5) 239 | 240 | def noise_est(img): 241 | stds = sensor_model.estimate_std(img) 242 | return np.float32(np.mean(stds)) 243 | 244 | def preprocess_image(image, ground_truth, height, width, 245 | is_training=False, 246 | bbox=None, 247 | fast_mode=True, 248 | light_level=None, sensor=None): 249 | """Pre-process one image for training or evaluation. 250 | 251 | Args: 252 | image: 3-D Tensor [height, width, channels] with the image. 253 | height: integer, image expected height. 254 | width: integer, image expected width. 255 | is_training: Boolean. If true it would transform an image for train, 256 | otherwise it would transform it for evaluation. 257 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 258 | where each coordinate is [0, 1) and the coordinates are arranged as 259 | [ymin, xmin, ymax, xmax]. 260 | fast_mode: Optional boolean, if True avoids slower transformations. 261 | 262 | Returns: 263 | 3-D float Tensor containing an appropriately scaled image 264 | 265 | Raises: 266 | ValueError: if user does not provide bounding box 267 | """ 268 | if is_training: 269 | return preprocess_for_train(image, height, width, bbox, fast_mode, light_level) 270 | else: 271 | return preprocess_for_eval(image, height, width, light_level) 272 | -------------------------------------------------------------------------------- /preprocessing/preprocessing_factory.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | # ============================================================================== 15 | """Contains a factory for building various models.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | import tensorflow as tf 22 | 23 | from preprocessing import inception_preprocessing 24 | from preprocessing import isp_pretrain_preprocessing 25 | from preprocessing import joint_isp_preprocessing 26 | from preprocessing import writeout_preprocessing 27 | from preprocessing import no_preprocessing 28 | 29 | slim = tf.contrib.slim 30 | 31 | 32 | def get_preprocessing(name, is_training): 33 | """Returns preprocessing_fn(image, height, width, **kwargs). 34 | 35 | Args: 36 | name: The name of the preprocessing function. 37 | is_training: `True` if the model is being used for training and `False` 38 | otherwise. 39 | 40 | Returns: 41 | preprocessing_fn: A function that preprocessing a single image (pre-batch). 42 | It has the following signature: 43 | image = preprocessing_fn(image, output_height, output_width, ...). 44 | 45 | Raises: 46 | ValueError: If Preprocessing `name` is not recognized. 47 | """ 48 | preprocessing_fn_map = { 49 | 'isp': isp_pretrain_preprocessing, 50 | 'mobilenet_v1': inception_preprocessing, 51 | 'mobilenet_isp': joint_isp_preprocessing, 52 | 'resnet_isp': isp_pretrain_preprocessing, 53 | 'gharbi_isp': isp_pretrain_preprocessing, 54 | 'writeout': writeout_preprocessing, 55 | 'none': no_preprocessing, 56 | 'deeper_mobilenet_v1': inception_preprocessing, 57 | } 58 | 59 | if name not in preprocessing_fn_map: 60 | raise ValueError('Preprocessing name [%s] was not recognized' % name) 61 | 62 | def preprocessing_fn(image, ground_truth, output_height, output_width, **kwargs): 63 | return preprocessing_fn_map[name].preprocess_image( 64 | image, ground_truth, output_height, output_width, is_training=is_training, **kwargs) 65 | 66 | return preprocessing_fn 67 | -------------------------------------------------------------------------------- /preprocessing/sensor_model.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | import scipy.io as sio 4 | from scipy.stats import poisson 5 | 6 | ############################################################################### 7 | # Sensor model 8 | ############################################################################### 9 | 10 | # Sensors calibrated for ISO100 (format is Poissonian scale, Gaussian std) 11 | # Iso 100, 200, 400, 800, 1600, 3200 12 | sensors = {'Nexus_6P_rear': [0.00018724, 0.0004733], 13 | 'Nexus_6P_front': [0.00015, 0.0003875], 14 | 'SEMCO': [0.000388, 0.0025], 15 | 'OV2740': [0.000088021, 0.00022673], 16 | #'GAUSSIAN': [0,0.005], 17 | 'GAUSS': [0,1], 18 | 'POISSON': [1,0], 19 | 'Pixel': [0.0153, 0.0328], #[0.00019856, 0.0017], 20 | 'Pixel3x3': [2.2682e-4, 0.0017], 21 | 'Pixel5x5': [1.2361e-4, 0.0043], 22 | 'Pixel7x7': [7.3344e-05, 0.0077], 23 | } 24 | 25 | sensorpositions = {'center': 0.5, 26 | 'offaxis': 0.9, 27 | 'periphery': 1.0} 28 | 29 | light_levels = 3 * np.array([2 ** i for i in range(6)]) / 2000.0 30 | 31 | def std2ll(std, mean=0.5, sensor='Nexus_6P_rear'): 32 | #light_level = sensors[sensor][1]/std 33 | #print('Sensor', sensor) 34 | alpha, beta = sensors[sensor] 35 | alpha_mean = alpha*mean 36 | num = np.sqrt(alpha_mean**2 + 4*beta**2*std**2) - alpha_mean 37 | light_level = (2*beta**2)/num 38 | return light_level 39 | 40 | def 
get_bayer_mask(height, width): 41 | # Mask based on Bayer pattern. (assume RGB order of colors) 42 | # B G 43 | # G R 44 | bayer_mask = np.zeros([height, width, 3]) 45 | bayer_mask[1::2,1::2,0:1] = 1 # R 46 | bayer_mask[1::2,::2,1:2] = 1 # G 47 | bayer_mask[::2,1::2,1:2] = 1 # G 48 | bayer_mask[::2,::2,2:3] = 1 # B 49 | return bayer_mask 50 | 51 | def optics_model(psfs, sensorpos='center', visualize=True ): 52 | #Expects calibrated PSFs (in matlab format) as input 53 | 54 | #Compute positions on grid 55 | psf_shape = np.array(psfs.shape) 56 | selected_pos = (psf_shape*sensorpositions[sensorpos]).astype(int) 57 | 58 | #Extract the position 59 | psf_sel = psfs[selected_pos[0] - 1,selected_pos[1] - 1]['PSF'][0,0] 60 | psf_sel = np.maximum(psf_sel, 0.0) 61 | 62 | #Normalize 63 | for ch in range(psf_sel.shape[2]): 64 | psf_sel[:,:,ch] = psf_sel[:,:,ch]/np.sum(psf_sel[:,:,ch]) 65 | 66 | return psf_sel 67 | 68 | def psf_iterator(): 69 | sensor_positions = ['center', 'offaxis', 'periphery'] 70 | psfs = sio.loadmat('PSFs/bloc_256_Nexus_defective.mat')['bloc'] 71 | for sensor_position in sensor_positions: 72 | psf_kernel = np.asfortranarray(optics_model(psfs, sensorpos=sensor_position, visualize=False).astype(np.float32)) 73 | yield sensor_position, psf_kernel 74 | 75 | def load_psfs(): 76 | sensor_positions = ['center', 'offaxis', 'periphery'] 77 | psfs = sio.loadmat('PSFs/bloc_256_Nexus_defective.mat')['bloc'] 78 | 79 | kernels = [] 80 | for sensor_position in sensor_positions: 81 | kernel = np.asfortranarray(optics_model(psfs, sensorpos=sensor_position, visualize=False).astype(np.float32)) 82 | for channel in xrange(3): 83 | kernel[:,:,channel] /= np.sum(kernel[:,:,channel]) 84 | kernels.append(kernel) 85 | 86 | return kernels 87 | 88 | def get_noise_params(iso, sensor): 89 | sensor = 'Nexus_6P_rear' 90 | poisson = sensors[sensor][0] 91 | sigma = sensors[sensor][1] 92 | 93 | a = poisson * iso / 100.0 #Poisson scale 94 | b = (sigma * iso / 100.0)**2 95 | 96 | return a, np.sqrt(b) 97 | 98 | def sensor_model(y): 99 | # Invalid sensor 100 | iso = 1.0 / 0.0015 * 100 101 | sensor='Nexus_6P_rear' 102 | 103 | poisson = sensors[sensor][0] 104 | sigma = sensors[sensor][1] 105 | 106 | #Output stats 107 | #print( 'Sensor {0} ISO {1} Poisson {2} Gaussian {3}'.format(sensor, iso, poisson, sigma) ) 108 | 109 | # Assume linear ISO model 110 | a = poisson * iso / 100.0 #Poisson scale 111 | b = (sigma * iso / 100.0)**2 112 | 113 | #Return Poissonian-Gaussian response 114 | #noisy_img = poisson_gaussian_np(y, a, b, True, True) 115 | noisy_img = poisson_gaussian_np(y, a, b, True, True) 116 | return noisy_img.astype(np.float32) 117 | 118 | def sensor_noise_rand_sigma(img_batch, sigma_range, scale=1.0, sensor='Nexus_6P_rear'): 119 | # Define in terms of Gaussian noise after Anscombe. 120 | batch_size = img_batch.get_shape()[0].value 121 | poisson = sensors[sensor][0] 122 | gauss = sensors[sensor][1] 123 | sigma = tf.random_uniform([batch_size], sigma_range[0], sigma_range[1])*scale/255.0 124 | if poisson == 0: 125 | noisy_batch = img_batch + sigma[:,None,None,None] * tf.random_normal(shape=img_batch.get_shape(), dtype=tf.float32) 126 | noisy_batch = tf.clip_by_value(noisy_batch, 0.0, scale) 127 | return noisy_batch, None, sigma 128 | sigma_hat = gauss/poisson 129 | offset = 2*tf.sqrt(3./8. + sigma_hat**2) 130 | tmp = (1./sigma + offset)**2/4 - 3./8. - sigma_hat**2 131 | light_level = poisson*tmp 132 | iso = 1.0 / light_level * 100. 
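# --- Editor's note (derivation added for clarity; not in the original) ------
# The lines above invert the generalized Anscombe transform to map the target
# noise level `sigma` to a light level. With Poisson scale `poisson`, Gaussian
# std `gauss`, and sigma_hat = gauss/poisson, a stabilized pixel is roughly
#     f(z) = 2*sqrt(z + 3/8 + sigma_hat**2),   z = image/a in photon units,
# and has approximately unit standard deviation. The stabilized signal range is
#     upper - lower,  with upper = 2*sqrt(light_level/poisson + 3/8 + sigma_hat**2)
#                     and  lower = offset = 2*sqrt(3/8 + sigma_hat**2)
# (the same `upper`/`lower` computed a few lines below). Requiring
#     upper - lower = 1/sigma,
# i.e. unit post-Anscombe noise equals a fraction `sigma` of the usable range,
# and solving for the light level gives
#     light_level = poisson * ((1/sigma + offset)**2 / 4 - 3/8 - sigma_hat**2),
# which is exactly `poisson * tmp` above. `iso = 100 / light_level` then
# follows the linear ISO model used throughout this file.
# ----------------------------------------------------------------------------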
133 | #iso = tf.Print(iso, [light_level]) 134 | 135 | # Assume linear ISO model 136 | a = poisson * iso / 100.0 * scale #Poisson scale 137 | gauss_var = tf.square(gauss * iso / 100.0) * scale**2 138 | 139 | upper = 2*tf.sqrt(light_level/poisson + 3./8. + sigma_hat**2) 140 | lower = 2*tf.sqrt(3./8. + sigma_hat**2) 141 | tf.summary.scalar('noise_level', 255./(upper - lower)[0]) 142 | tf.summary.scalar('iso', tf.reduce_mean(iso)) 143 | tf.summary.scalar('light_level', tf.reduce_mean(light_level)) 144 | tf.summary.scalar('a', tf.reduce_mean(a)/scale) 145 | tf.summary.scalar('gauss_variance', tf.reduce_mean(gauss_var)/scale**2) 146 | 147 | # a = tf.Print(a, [255./(upper - lower)]) 148 | print("Simulating sensor {0}.".format(sensor)) 149 | 150 | noisy_batch = poisson_gauss_tf(img_batch, a, gauss_var, clip=(0.,scale)) 151 | # Return Poissonian-Gaussian response 152 | return noisy_batch, a, tf.sqrt(gauss_var) 153 | 154 | def get_coeffs(light_levels, sensor='Nexus_6P_rear'): 155 | #print('Sensor', sensor) 156 | poisson = sensors[sensor][0] 157 | gauss = sensors[sensor][1] 158 | iso = 1.0 / light_levels * 100. 159 | a = poisson * iso / 100.0 #Poisson scale 160 | b = (gauss * iso / 100.0) 161 | return a, b 162 | 163 | def sensor_noise_rand_light_level(img_batch, ll_range, scale=1.0, sensor='Nexus_6P_rear'): 164 | print("Sensor = %s, scale = %s" % (sensor, scale)) 165 | batch_size = img_batch.get_shape()[0].value 166 | poisson = sensors[sensor][0] 167 | gauss = sensors[sensor][1] 168 | 169 | # Sample uniformly in logspace. 170 | # low ll * exp(u), u ~ [0, log(high ll/low ll)] 171 | ll_ratio = ll_range[1]/ll_range[0] 172 | ll_factor = tf.random_uniform([batch_size], minval=0, maxval=tf.log(ll_ratio), dtype=tf.float32) 173 | light_level = ll_range[0]*tf.exp(ll_factor) 174 | iso = 1.0 / light_level * 100. 175 | 176 | # Assume linear ISO model 177 | a = poisson * iso / 100.0 * scale #Poisson scale 178 | 179 | gauss_var = tf.square(gauss * iso / 100.0) * scale**2 180 | if poisson == 0: 181 | noisy_batch = img_batch + tf.sqrt(gauss_var[:,None,None,None]) * tf.random_normal(shape=img_batch.get_shape(), dtype=tf.float32) 182 | noisy_batch = tf.clip_by_value(noisy_batch, 0.0, scale) 183 | return noisy_batch, np.zeros(batch_size), tf.sqrt(gauss_var) 184 | 185 | tf.summary.scalar('iso', tf.reduce_mean(iso)) 186 | tf.summary.scalar('light_level', tf.reduce_mean(light_level)) 187 | tf.summary.scalar('a', tf.reduce_mean(a)/scale) 188 | tf.summary.scalar('gauss_variance', tf.reduce_mean(gauss_var)/scale**2) 189 | 190 | print("Simulating sensor {0}.".format(sensor)) 191 | 192 | noisy_batch = poisson_gauss_tf(img_batch, a, gauss_var, clip=(0.,scale)) 193 | sigma_hat = gauss/poisson 194 | 195 | return noisy_batch, a, tf.sqrt(gauss_var) 196 | 197 | def poisson_gauss_tf(img_batch, a, gauss_var, clip=(0.,1.)): 198 | # Apply poissonian-gaussian noise model following A.Foi et al. 199 | # Foi, A., "Practical denoising of clipped or overexposed noisy images", 200 | # Proc. 16th European Signal Process. Conf., EUSIPCO 2008, Lausanne, Switzerland, August 2008. 
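# --- Editor's note (summary comment, not part of the original file) ---------
# Per batch element b, the code below draws
#     out[b] = a[b] * Poisson(img[b] / a[b]) + N(0, gauss_var[b]),
# broadcasting a and gauss_var over the spatial and channel dimensions, and
# then clips the result to `clip` (by default [0, 1]). Before clipping, the
# simulated measurement is unbiased with signal-dependent variance:
#     E[out] = img,    Var[out] = a * img + gauss_var.
# ----------------------------------------------------------------------------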
201 | batch_shape = tf.shape(img_batch) 202 | 203 | a_p = a[:,None,None,None] 204 | out = tf.random_poisson(shape=[], lam=tf.maximum(img_batch/a_p, 0.0), dtype=tf.float32) * a_p 205 | #out = tf.Print(out, [tf.reduce_max(out), tf.reduce_min(out)]) 206 | gauss_var = tf.maximum(gauss_var, 0.0) 207 | 208 | gauss_noise = tf.sqrt(gauss_var[:,None,None,None]) * tf.random_normal(shape=batch_shape, dtype=tf.float32) #Gaussian component 209 | 210 | out += gauss_noise 211 | 212 | # Clipping 213 | if clip is not None: 214 | out = tf.clip_by_value(out, clip[0], clip[1]) 215 | 216 | # Return the simulated image 217 | return out 218 | 219 | 220 | def poisson_gaussian_np(y, a, b, clip_below=True, clip_above=True): 221 | # Apply poissonian-gaussian noise model following A.Foi et al. 222 | # Foi, A., "Practical denoising of clipped or overexposed noisy images", 223 | # Proc. 16th European Signal Process. Conf., EUSIPCO 2008, Lausanne, Switzerland, August 2008. 224 | 225 | # Check method 226 | if(a==0): # no Poissonian component 227 | z = y 228 | else: # Poissonian component 229 | z = np.random.poisson( np.maximum(y/a,0.0) )*a; 230 | 231 | if(b<0): 232 | raise warnings.warn('The Gaussian noise parameter b has to be non-negative (setting b=0)') 233 | b = 0.0 234 | 235 | z = z + np.sqrt(b) * np.random.randn(*y.shape) #Gaussian component 236 | 237 | # Clipping 238 | if(clip_above): 239 | z = np.minimum(z, 1.0); 240 | 241 | if(clip_below): 242 | z = np.maximum(z, 0.0); 243 | 244 | # Return the simulated image 245 | return z 246 | 247 | # Currently only implements one method 248 | NoiseEstMethod = {'daub_reflect': 0, 'daub_replicate': 1} 249 | 250 | 251 | def estimate_std(z, method='daub_reflect'): 252 | import cv2 253 | # Estimates noise standard deviation assuming additive gaussian noise 254 | 255 | # Check method 256 | if (method not in NoiseEstMethod.values()) and (method in NoiseEstMethod.keys()): 257 | method = NoiseEstMethod[method] 258 | else: 259 | raise Exception("Invalid noise estimation method.") 260 | 261 | # Check shape 262 | if len(z.shape) == 2: 263 | z = z[..., np.newaxis] 264 | elif len(z.shape) != 3: 265 | raise Exception("Supports only up to 3D images.") 266 | 267 | # Run on multichannel image 268 | channels = z.shape[2] 269 | dev = np.zeros(channels) 270 | 271 | # Iterate over channels 272 | for ch in range(channels): 273 | 274 | # Daubechies denoising method 275 | if method == NoiseEstMethod['daub_reflect'] or method == NoiseEstMethod['daub_replicate']: 276 | daub6kern = np.array([0.03522629188571, 0.08544127388203, -0.13501102001025, 277 | -0.45987750211849, 0.80689150931109, -0.33267055295008], 278 | dtype=np.float32, order='F') 279 | 280 | if method == NoiseEstMethod['daub_reflect']: 281 | wav_det = cv2.sepFilter2D(z, -1, daub6kern, daub6kern, 282 | borderType=cv2.BORDER_REFLECT_101) 283 | else: 284 | wav_det = cv2.sepFilter2D(z, -1, daub6kern, daub6kern, 285 | borderType=cv2.BORDER_REPLICATE) 286 | 287 | dev[ch] = np.median(np.absolute(wav_det)) / 0.6745 288 | 289 | # Return standard deviation 290 | return dev 291 | -------------------------------------------------------------------------------- /preprocessing/writeout_preprocessing.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 
5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | """Provides utilities to preprocess images for the Inception networks.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | from preprocessing import sensor_model 22 | 23 | import tensorflow as tf 24 | import numpy as np 25 | 26 | from tensorflow.python.ops import control_flow_ops 27 | 28 | 29 | def apply_with_random_selector(x, func, num_cases): 30 | """Computes func(x, sel), with sel sampled from [0...num_cases-1]. 31 | 32 | Args: 33 | x: input Tensor. 34 | func: Python function to apply. 35 | num_cases: Python int32, number of cases to sample sel from. 36 | 37 | Returns: 38 | The result of func(x, sel), where func receives the value of the 39 | selector as a python integer, but sel is sampled dynamically. 40 | """ 41 | sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32) 42 | # Pass the real x only to one of the func calls. 43 | return control_flow_ops.merge([ 44 | func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case) 45 | for case in range(num_cases)])[0] 46 | 47 | 48 | def distort_color(image, color_ordering=0, fast_mode=True, scope=None): 49 | """Distort the color of a Tensor image. 50 | 51 | Each color distortion is non-commutative and thus ordering of the color ops 52 | matters. Ideally we would randomly permute the ordering of the color ops. 53 | Rather then adding that level of complication, we select a distinct ordering 54 | of color ops for each preprocessing thread. 55 | 56 | Args: 57 | image: 3-D Tensor containing single image in [0, 1]. 58 | color_ordering: Python int, a type of distortion (valid values: 0-3). 59 | fast_mode: Avoids slower ops (random_hue and random_contrast) 60 | scope: Optional scope for name_scope. 61 | Returns: 62 | 3-D Tensor color-distorted image on range [0, 1] 63 | Raises: 64 | ValueError: if color_ordering not in [0, 3] 65 | """ 66 | with tf.name_scope(scope, 'distort_color', [image]): 67 | if fast_mode: 68 | if color_ordering == 0: 69 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 70 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 71 | else: 72 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 73 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 74 | else: 75 | if color_ordering == 0: 76 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 77 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 78 | image = tf.image.random_hue(image, max_delta=0.2) 79 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 80 | elif color_ordering == 1: 81 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 82 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 
83 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 84 | image = tf.image.random_hue(image, max_delta=0.2) 85 | elif color_ordering == 2: 86 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 87 | image = tf.image.random_hue(image, max_delta=0.2) 88 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 89 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 90 | elif color_ordering == 3: 91 | image = tf.image.random_hue(image, max_delta=0.2) 92 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5) 93 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5) 94 | image = tf.image.random_brightness(image, max_delta=32. / 255.) 95 | else: 96 | raise ValueError('color_ordering must be in [0, 3]') 97 | 98 | # The random_* ops do not necessarily clamp. 99 | return tf.clip_by_value(image, 0.0, 1.0) 100 | 101 | 102 | def distorted_bounding_box_crop(image, 103 | bbox, 104 | min_object_covered=0.1, 105 | aspect_ratio_range=(0.75, 1.33), 106 | area_range=(0.05, 1.0), 107 | max_attempts=100, 108 | scope=None): 109 | """Generates cropped_image using a one of the bboxes randomly distorted. 110 | 111 | See `tf.image.sample_distorted_bounding_box` for more documentation. 112 | 113 | Args: 114 | image: 3-D Tensor of image (it will be converted to floats in [0, 1]). 115 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 116 | where each coordinate is [0, 1) and the coordinates are arranged 117 | as [ymin, xmin, ymax, xmax]. If num_boxes is 0 then it would use the whole 118 | image. 119 | min_object_covered: An optional `float`. Defaults to `0.1`. The cropped 120 | area of the image must contain at least this fraction of any bounding box 121 | supplied. 122 | aspect_ratio_range: An optional list of `floats`. The cropped area of the 123 | image must have an aspect ratio = width / height within this range. 124 | area_range: An optional list of `floats`. The cropped area of the image 125 | must contain a fraction of the supplied image within in this range. 126 | max_attempts: An optional `int`. Number of attempts at generating a cropped 127 | region of the image of the specified constraints. After `max_attempts` 128 | failures, return the entire image. 129 | scope: Optional scope for name_scope. 130 | Returns: 131 | A tuple, a 3-D Tensor cropped_image and the distorted bbox 132 | """ 133 | with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]): 134 | # Each bounding box has shape [1, num_boxes, box coords] and 135 | # the coordinates are ordered [ymin, xmin, ymax, xmax]. 136 | 137 | # A large fraction of image datasets contain a human-annotated bounding 138 | # box delineating the region of the image containing the object of interest. 139 | # We choose to create a new bounding box for the object which is a randomly 140 | # distorted version of the human-annotated bounding box that obeys an 141 | # allowed range of aspect ratios, sizes and overlap with the human-annotated 142 | # bounding box. If no box is supplied, then we assume the bounding box is 143 | # the entire image. 
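# Note on the call below: tf.image.sample_distorted_bounding_box returns a
# (begin, size, bboxes) triple; `begin` and `size` are fed directly to
# tf.slice to extract the crop, while `bboxes` holds the sampled box in
# normalized coordinates so it can be visualized with
# tf.image.draw_bounding_boxes.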
144 | sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box( 145 | tf.shape(image), 146 | bounding_boxes=bbox, 147 | min_object_covered=min_object_covered, 148 | aspect_ratio_range=aspect_ratio_range, 149 | area_range=area_range, 150 | max_attempts=max_attempts, 151 | use_image_if_no_bounding_boxes=True) 152 | bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box 153 | 154 | # Crop the image to the specified bounding box. 155 | cropped_image = tf.slice(image, bbox_begin, bbox_size) 156 | return cropped_image, distort_bbox 157 | 158 | 159 | def preprocess_for_train(image, height, width, bbox, 160 | fast_mode=True, 161 | light_level=None, 162 | scope=None): 163 | """Distort one image for training a network. 164 | 165 | Distorting images provides a useful technique for augmenting the data 166 | set during training in order to make the network invariant to aspects 167 | of the image that do not affect the label. 168 | 169 | Additionally, it creates image summaries to display the different 170 | transformations applied to the image. 171 | 172 | Args: 173 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be 174 | [0, 1], otherwise it would be converted to tf.float32 assuming that the range 175 | is [0, MAX], where MAX is the largest positive representable number for 176 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details). 177 | height: integer 178 | width: integer 179 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 180 | where each coordinate is [0, 1) and the coordinates are arranged 181 | as [ymin, xmin, ymax, xmax]. 182 | fast_mode: Optional boolean, if True avoids slower transformations (i.e. 183 | bi-cubic resizing, random_hue or random_contrast). 184 | scope: Optional scope for name_scope. 185 | Returns: 186 | 3-D float Tensor of distorted image used for training with range [0, 1]. 187 | """ 188 | with tf.name_scope(scope, 'distort_image', [image, height, width, bbox]): 189 | 190 | 191 | if bbox is None: 192 | bbox = tf.constant([0.0, 0.0, 1.0, 1.0], 193 | dtype=tf.float32, 194 | shape=[1, 1, 4]) 195 | if image.dtype != tf.float32: 196 | image = tf.image.convert_image_dtype(image, dtype=tf.float32) 197 | 198 | # Each bounding box has shape [1, num_boxes, box coords] and 199 | # the coordinates are ordered [ymin, xmin, ymax, xmax]. 200 | image_with_box = tf.image.draw_bounding_boxes(tf.expand_dims(image, 0), 201 | bbox) 202 | tf.summary.image('image_with_bounding_boxes', image_with_box) 203 | 204 | distorted_image, distorted_bbox = distorted_bounding_box_crop(image, bbox) 205 | # Restore the shape since the dynamic slice based upon the bbox_size loses 206 | # the third dimension. 207 | distorted_image.set_shape([None, None, 3]) 208 | image_with_distorted_box = tf.image.draw_bounding_boxes( 209 | tf.expand_dims(image, 0), distorted_bbox) 210 | tf.summary.image('images_with_distorted_bounding_box', 211 | image_with_distorted_box) 212 | 213 | # This resizing operation may distort the images because the aspect 214 | # ratio is not respected. We select a resize method in a round robin 215 | # fashion based on the thread number. 216 | # Note that ResizeMethod contains 4 enumerated resizing methods. 217 | 218 | 219 | # We select only 1 case for fast_mode bilinear.
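# The round-robin resize method selection below is kept for reference but
# disabled; nearest-neighbor resizing is used instead, presumably so that
# pixel values are subsampled rather than interpolated before the raw/Bayer
# sensor simulation is applied downstream.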
220 | #num_resize_cases = 1 221 | #distorted_image = apply_with_random_selector( 222 | # distorted_image, 223 | # lambda x, method: tf.image.resize_images(x, [height, width], method=method), 224 | # num_cases=num_resize_cases) 225 | 226 | # Use nearest neighbor subsampling. 227 | print("USING NEAREST NEIGHBOR SUBSAMPLING") 228 | distorted_image = tf.image.resize_images(distorted_image, [height, width], 229 | method=tf.image.ResizeMethod.NEAREST_NEIGHBOR) 230 | 231 | tf.summary.image('cropped_resized_image', 232 | tf.expand_dims(distorted_image, 0)) 233 | 234 | # Randomly flip the image horizontally. 235 | distorted_image = tf.image.random_flip_left_right(distorted_image) 236 | 237 | tf.summary.image('final_distorted_image', 238 | tf.expand_dims(distorted_image, 0)) 239 | return distorted_image 240 | 241 | 242 | def preprocess_for_eval(image, height, width, light_level=None, 243 | central_fraction=0.875, scope=None): 244 | """Prepare one image for evaluation. 245 | 246 | If height and width are specified it would output an image with that size by 247 | resizing (nearest-neighbor subsampling in this implementation). 248 | 249 | If central_fraction is specified it would crop the central fraction of the 250 | input image. 251 | 252 | Args: 253 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be 254 | [0, 1], otherwise it would be converted to tf.float32 assuming that the range 255 | is [0, MAX], where MAX is the largest positive representable number for 256 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details) 257 | height: integer 258 | width: integer 259 | central_fraction: Optional Float, fraction of the image to crop. 260 | scope: Optional scope for name_scope. 261 | Returns: 262 | 3-D float Tensor of prepared image. 263 | """ 264 | with tf.name_scope(scope, 'eval_image', [image, height, width]): 265 | if image.dtype != tf.float32: 266 | image = tf.image.convert_image_dtype(image, dtype=tf.float32) 267 | 268 | # Crop the central region of the image with an area containing 87.5% of 269 | # the original image. 270 | if central_fraction: 271 | image = tf.image.central_crop(image, central_fraction=central_fraction) 272 | 273 | #image = tf.py_func(sensor_model.sensor_model, [image], tf.float32, stateful=True) 274 | if height and width: 275 | # Resize the image to the specified height and width. 276 | image = tf.expand_dims(image, 0) 277 | image = tf.image.resize_images(image, [height, width], method=tf.image.ResizeMethod.NEAREST_NEIGHBOR) 278 | 279 | image = tf.squeeze(image, [0]) 280 | 281 | image.set_shape([height, width, 3]) 282 | return image 283 | 284 | def preprocess_image(image, ground_truth, height, width, 285 | is_training=False, 286 | bbox=None, 287 | fast_mode=True, 288 | light_level=None): 289 | """Pre-process one image for training or evaluation. 290 | 291 | Args: 292 | image: 3-D Tensor [height, width, channels] with the image. 293 | height: integer, image expected height. 294 | width: integer, image expected width. 295 | is_training: Boolean. If true it would transform an image for training, 296 | otherwise it would transform it for evaluation. 297 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords] 298 | where each coordinate is [0, 1) and the coordinates are arranged as 299 | [ymin, xmin, ymax, xmax]. 300 | fast_mode: Optional boolean, if True avoids slower transformations.
301 | 302 | Returns: 303 | 3-D float Tensor containing an appropriately scaled image 304 | 305 | Raises: 306 | ValueError: if user does not provide bounding box 307 | """ 308 | if is_training: 309 | return preprocess_for_train(image, height, width, bbox, fast_mode, light_level) 310 | else: 311 | return preprocess_for_eval(image, height, width, light_level) 312 | -------------------------------------------------------------------------------- /run_test_captured_images.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Set the checkpoint and dataset paths 4 | checkpoints=/path/to/checkpoints 5 | dataset_dir=/path/to/dataset/RAW_synset_ISO8000_EXP10000/ 6 | 7 | # Change --eval_dir paramater if needed 8 | # Proposed Joint Architecture 9 | python test_captured_images.py --device=1 --dataset_dir=$dataset_dir --dataset_name=imagenet \ 10 | --checkpoint_path=$checkpoints/joint128/2to200lux/model.ckpt-232721 \ 11 | --model_name=mobilenet_isp --noise_channel=True --use_anscombe=True \ 12 | --isp_model_name=isp --eval_image_size=224 --sensor=Pixel --eval_dir joint_real_2to200lux 13 | 14 | # Proposed Joint Architecture (no Anscombe layers) 15 | python test_captured_images.py --device=1 --dataset_dir=$dataset_dir --dataset_name=imagenet \ 16 | --checkpoint_path=$checkpoints/joint128/2to200lux_no_ansc/model.ckpt-215307 \ 17 | --model_name=mobilenet_isp --noise_channel=False --use_anscombe=False \ 18 | --isp_model_name=isp --eval_image_size=224 --sensor=Pixel --eval_dir joint_no_anscombe_real_2to200lux 19 | 20 | # # From Scratch MobileNet-v1 21 | python test_captured_images.py --device=1 --dataset_dir=$dataset_dir --dataset_name=imagenet \ 22 | --checkpoint_path=$checkpoints/mobilenet_v1_128/2to200lux/model.ckpt-325357 \ 23 | --model_name=mobilenet_v1 --eval_image_size=224 --preprocessing_name=mobilenet_isp \ 24 | --eval_dir mobilenet_v1_real_2to200lux -------------------------------------------------------------------------------- /run_test_synthetic_images.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | checkpoints_dir=/path/to/checkpoints 4 | dataset_dir=/path/to/imagenet_validation 5 | eval_dir=/path/to/output_dir 6 | 7 | noise=3lux 8 | python test_synthetic_images.py --device=1 \ 9 | --checkpoint_path=$checkpoints_dir'/joint128/'$noise'/model.ckpt-216759' \ 10 | --dataset_dir=$dataset_dir --dataset_name=imagenet --mode=$noise \ 11 | --model_name=mobilenet_isp --eval_dir=$eval_dir/$noise 12 | 13 | noise=6lux 14 | python test_synthetic_images.py --device=1 \ 15 | --checkpoint_path=$checkpoints_dir'/joint128/'$noise'/model.ckpt-222267' \ 16 | --dataset_dir=$dataset_dir --dataset_name=imagenet --mode=$noise \ 17 | --model_name=mobilenet_isp --eval_dir=$eval_dir/$noise 18 | 19 | noise=2to20lux 20 | python test_synthetic_images.py --device=1 \ 21 | --checkpoint_path=$checkpoints_dir'/joint128/'$noise'/model.ckpt-232718' \ 22 | --dataset_dir=$dataset_dir --dataset_name=imagenet --mode=$noise \ 23 | --model_name=mobilenet_isp --eval_dir=$eval_dir/$noise 24 | 25 | noise=2to200lux 26 | python test_synthetic_images.py --device=1 \ 27 | --checkpoint_path=$checkpoints_dir'/joint128/'$noise'/model.ckpt-232721' \ 28 | --dataset_dir=$dataset_dir --dataset_name=imagenet --mode=$noise \ 29 | --model_name=mobilenet_isp --eval_dir=$eval_dir/$noise 30 | 31 | -------------------------------------------------------------------------------- /run_train_joint_models.sh: 
-------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | TRAIN_DIR=/path/to/train_dir 4 | IMAGENET_TFRECORDS=/path/to/imagenetTFRecords 5 | CHECKPOINTS=/path/to/checkpoints 6 | 7 | # Train with 3lux noisy images 8 | # Set number of clones and device according to machine resources 9 | python train_image_classifier.py --train_dir=$TRAIN_DIR/3lux \ 10 | --dataset_dir=$IMAGENET_TFRECORDS --ll_low=0.0015 \ 11 | --ll_high=0.0015 --batch_size=256 --model_name=mobilenet_isp --num_readers=8 \ 12 | --num_preprocessing_threads=8 --isp_checkpoint_path=$CHECKPOINTS/multires128/6lux/model.ckpt-27000 \ 13 | --checkpoint_path=$CHECKPOINTS/mobilenet_v1_128/mobilenet_v1_1.0_128.ckpt --noise_channel=True \ 14 | --use_anscombe=True --num_clones=4 --isp_model_name=isp --num_iters=1 --device=0,1,2,3 \ 15 | --learning_rate=0.00045 --num_epochs_per_decay=2 --train_image_size=128 16 | 17 | 18 | # Train with 6lux noisy images 19 | python train_image_classifier.py --train_dir=$TRAIN_DIR/6lux \ 20 | --dataset_dir=$IMAGENET_TFRECORDS --ll_low=0.003 \ 21 | --ll_high=0.003 --batch_size=256 --model_name=mobilenet_isp --num_readers=8 \ 22 | --num_preprocessing_threads=8 --isp_checkpoint_path=$CHECKPOINTS/multires128/6lux/model.ckpt-27000 \ 23 | --checkpoint_path=$CHECKPOINTS/mobilenet_v1_128/mobilenet_v1_1.0_128.ckpt --noise_channel=True \ 24 | --use_anscombe=True --num_clones=4 --isp_model_name=isp --num_iters=1 --device=0,1,3,4 \ 25 | --learning_rate=0.00045 --num_epochs_per_decay=2 --train_image_size=128 26 | 27 | 28 | # Train with 2to20lux noisy images 29 | python train_image_classifier.py --train_dir=$TRAIN_DIR/2to20lux \ 30 | --dataset_dir=$IMAGENET_TFRECORDS --ll_low=0.001 \ 31 | --ll_high=0.010 --batch_size=256 --model_name=mobilenet_isp --num_readers=8 \ 32 | --num_preprocessing_threads=8 --isp_checkpoint_path=$CHECKPOINTS/multires128/6lux/model.ckpt-27000 \ 33 | --checkpoint_path=$CHECKPOINTS/mobilenet_v1_128/mobilenet_v1_1.0_128.ckpt --noise_channel=True \ 34 | --use_anscombe=True --num_clones=4 --isp_model_name=isp --num_iters=1 --device=0,1,2,3 \ 35 | --learning_rate=0.00045 --num_epochs_per_decay=2 --train_image_size=128 36 | 37 | # Train with 2to200lux noisy images 38 | python train_image_classifier.py --train_dir=$TRAIN_DIR/2to200lux \ 39 | --dataset_dir=$IMAGENET_TFRECORDS --ll_low=0.001 \ 40 | --ll_high=0.100 --batch_size=256 --model_name=mobilenet_isp --num_readers=8 \ 41 | --num_preprocessing_threads=8 --isp_checkpoint_path=$CHECKPOINTS/multires128/6lux/model.ckpt-27000 \ 42 | --checkpoint_path=$CHECKPOINTS/mobilenet_v1_128/mobilenet_v1_1.0_128.ckpt --noise_channel=True \ 43 | --use_anscombe=True --num_clones=4 --isp_model_name=isp --num_iters=1 --device=0,1,3,4 \ 44 | --learning_rate=0.00045 --num_epochs_per_decay=2 --train_image_size=128 -------------------------------------------------------------------------------- /simulate_raw_images.py: -------------------------------------------------------------------------------- 1 | """Script for adding noise to ImageNet-like dataset.""" 2 | 3 | from __future__ import absolute_import 4 | from __future__ import division 5 | from __future__ import print_function 6 | 7 | import math 8 | import tensorflow as tf 9 | import os 10 | import cv2 11 | from datasets import dataset_factory, build_imagenet_data 12 | import numpy as np 13 | from preprocessing import preprocessing_factory, sensor_model 14 | from pprint import pprint 15 | from glob import glob 16 | 17 | 18 | tf.app.flags.DEFINE_float( 19 | 'll_low', None, 20 | 
'Lowest light level.') 21 | 22 | tf.app.flags.DEFINE_float( 23 | 'll_high', None, 24 | 'Highest light level.') 25 | 26 | tf.app.flags.DEFINE_string( 27 | 'sensor', 'Nexus_6P_rear', 'The sensor.') 28 | 29 | tf.app.flags.DEFINE_string( 30 | 'output_dir', None, 'Directory where the results are saved to.') 31 | 32 | tf.app.flags.DEFINE_string( 33 | 'input_dir', None, 'The directory where the dataset files are stored.') 34 | 35 | tf.app.flags.DEFINE_string( 36 | 'preprocessing_name', 'mobilenet_v1', 'The name of the preprocessing to use. If left as `None`, then the model_name flag is used.') 37 | 38 | tf.app.flags.DEFINE_integer( 39 | 'eval_image_size', 128, 'Eval image size') 40 | 41 | FLAGS = tf.app.flags.FLAGS 42 | 43 | def main(_): 44 | if not FLAGS.input_dir: 45 | raise ValueError('You must supply the input directory with --input_dir') 46 | if not FLAGS.output_dir: 47 | raise ValueError('You must supply the dataset directory with --output_dir') 48 | 49 | tf.logging.set_verbosity(tf.logging.INFO) 50 | with tf.Graph().as_default(): 51 | 52 | # Preprocess the images so that they all have the same size 53 | preprocessing_name = FLAGS.preprocessing_name or FLAGS.model_name 54 | image_preprocessing_fn = preprocessing_factory.get_preprocessing( 55 | preprocessing_name, 56 | is_training=False) 57 | 58 | eval_image_size = FLAGS.eval_image_size 59 | orig_image = tf.placeholder(tf.uint8, shape=(None, None, 3)) 60 | image = image_preprocessing_fn(orig_image, orig_image, eval_image_size, eval_image_size) 61 | images = tf.expand_dims(image, 0) 62 | 63 | # Add noise. 64 | noisy_batch, alpha, sigma = sensor_model.sensor_noise_rand_light_level(images, 65 | [FLAGS.ll_low, FLAGS.ll_high], 66 | scale=1.0, sensor=FLAGS.sensor) 67 | 68 | bayer_mask = sensor_model.get_bayer_mask(eval_image_size, eval_image_size) 69 | inputs = noisy_batch*bayer_mask 70 | 71 | if not os.path.isdir(FLAGS.output_dir): 72 | os.mkdir(FLAGS.output_dir) 73 | 74 | with tf.Session() as sess: 75 | count = 0 76 | synsets = [path for path in os.listdir(FLAGS.input_dir) if not '.' in path] 77 | 78 | for synset in synsets: 79 | path = os.path.join(FLAGS.input_dir, synset) 80 | image_names = os.listdir(path) 81 | print("Found %d images in %s"%(len(image_names), synset)) 82 | 83 | synset_path = os.path.join(FLAGS.output_dir, synset) 84 | if not os.path.isdir(synset_path): 85 | os.mkdir(synset_path) 86 | 87 | for imagename in image_names: 88 | output_imgfn = os.path.join(FLAGS.output_dir, synset, imagename.split('.')[0]+'.png') 89 | if os.path.isfile(output_imgfn): 90 | continue 91 | loaded_image = cv2.imread(os.path.join(path, imagename)) 92 | 93 | # BGR to RGB 94 | loaded_image = loaded_image[..., ::-1] 95 | images, alpha_val, sigma_val = sess.run( 96 | [inputs, alpha, sigma], 97 | feed_dict={orig_image:loaded_image}) 98 | img = (255.0*images[0,:,:,:]).astype(np.uint8) 99 | 100 | # RGB to BGR 101 | img = img[..., ::-1] 102 | 103 | if count % 1000 == 0: 104 | print("%d processed images." 
% (count)) 105 | cv2.imwrite(output_imgfn, img) 106 | count += 1 107 | 108 | print('Total images processed:', count) 109 | 110 | 111 | if __name__ == '__main__': 112 | tf.app.run() 113 | -------------------------------------------------------------------------------- /teaser/architecture_2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/princeton-computational-imaging/DirtyPixels/6c82b124c9e32bbf5fa7d6adf8db8103132e4e5e/teaser/architecture_2.jpg -------------------------------------------------------------------------------- /teaser/teaser_v4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/princeton-computational-imaging/DirtyPixels/6c82b124c9e32bbf5fa7d6adf8db8103132e4e5e/teaser/teaser_v4.png -------------------------------------------------------------------------------- /test_captured_images.py: -------------------------------------------------------------------------------- 1 | """Generic evaluation script that evaluates a model using the Dirty-Pixels captured dataset.""" 2 | 3 | from __future__ import absolute_import 4 | from __future__ import division 5 | from __future__ import print_function 6 | 7 | import math 8 | import skimage.measure 9 | import scipy.ndimage.filters 10 | import tensorflow as tf 11 | import scipy.io 12 | import os 13 | import cv2 14 | from datasets import dataset_factory, build_imagenet_data 15 | import numpy as np 16 | from nets import nets_factory 17 | from preprocessing import preprocessing_factory, sensor_model 18 | import matplotlib.pyplot as plt 19 | from pprint import pprint 20 | from nets.isp import anscombe 21 | import rawpy 22 | import pyexifinfo 23 | 24 | slim = tf.contrib.slim 25 | 26 | tf.app.flags.DEFINE_string( 27 | 'device', '0', 'GPU device to use.') 28 | 29 | tf.app.flags.DEFINE_string( 30 | 'sensor', 'Nexus_6P_rear', 'The sensor.') 31 | 32 | tf.app.flags.DEFINE_string( 33 | 'isp_model_name', None, 'The name of the ISP architecture to train.') 34 | 35 | tf.app.flags.DEFINE_boolean('use_anscombe', True, 36 | 'Use Anscombe transform.') 37 | 38 | tf.app.flags.DEFINE_boolean('noise_channel', True, 39 | 'Use noise channel.') 40 | 41 | tf.app.flags.DEFINE_integer( 42 | 'num_iters', 1, 43 | 'Number of iterations for the unrolled Proximal Gradient Network.') 44 | 45 | tf.app.flags.DEFINE_integer( 46 | 'num_layers', 17, 'Number of layers to be used in the HQS ISP prior -- DEPRECATED') 47 | 48 | tf.app.flags.DEFINE_string( 49 | 'checkpoint_path', '/tmp/tfmodel/', 50 | 'The directory where the model was written to or an absolute path to a ' 51 | 'checkpoint file.') 52 | 53 | tf.app.flags.DEFINE_string( 54 | 'eval_dir', '/tmp/tfmodel/', 'Directory where the results are saved to.') 55 | 56 | tf.app.flags.DEFINE_string( 57 | 'dataset_dir', None, 'The directory where the dataset files are stored.') 58 | 59 | tf.app.flags.DEFINE_string( 60 | 'model_name', 'inception_v3', 'The name of the architecture to evaluate.') 61 | 62 | tf.app.flags.DEFINE_string( 63 | 'preprocessing_name', None, 'The name of the preprocessing to use. 
If left ' 64 | 'as `None`, then the model_name flag is used.') 65 | 66 | tf.app.flags.DEFINE_integer( 67 | 'eval_image_size', None, 'Eval image size') 68 | 69 | 70 | FLAGS = tf.app.flags.FLAGS 71 | 72 | 73 | def crop_and_subsample(img, target_size, average=None): 74 | factor = int(np.floor(min(img.shape) / target_size)) 75 | ch = (img.shape[0] - factor * target_size) / 2 76 | cw = (img.shape[1] - factor * target_size) / 2 77 | cropped = img[int(np.floor(ch)):-int(np.ceil(ch)), 78 | int(np.floor(cw)):-int(np.ceil(cw))] 79 | if average is not None: 80 | cropped = scipy.ndimage.filters.convolve(cropped, np.ones((average, average))) 81 | return cropped[::factor, ::factor] 82 | 83 | 84 | def main(_): 85 | if not FLAGS.dataset_dir: 86 | raise ValueError('You must supply the dataset directory with --dataset_dir') 87 | 88 | os.environ['CUDA_VISIBLE_DEVICES'] = FLAGS.device 89 | 90 | tf.logging.set_verbosity(tf.logging.INFO) 91 | with tf.Graph().as_default(): 92 | 93 | #################### 94 | # Select the model # 95 | #################### 96 | num_classes = 1001 97 | network_fn = nets_factory.get_network_fn( 98 | FLAGS.model_name, 99 | num_classes, 100 | weight_decay=0.0, 101 | batch_norm_decay=0.95, 102 | is_training=False) 103 | 104 | ##################################### 105 | # Select the preprocessing function # 106 | ##################################### 107 | preprocessing_name = FLAGS.preprocessing_name or FLAGS.model_name 108 | image_preprocessing_fn = preprocessing_factory.get_preprocessing( 109 | preprocessing_name, 110 | is_training=False) 111 | 112 | eval_image_size = FLAGS.eval_image_size or network_fn.default_image_size 113 | 114 | orig_image = tf.placeholder(tf.float32, shape=(eval_image_size, eval_image_size, 3)) 115 | alpha = tf.placeholder(tf.float32, shape=[1, 3]) 116 | sigma = tf.placeholder(tf.float32, shape=[1, 3]) 117 | bayer_mask = sensor_model.get_bayer_mask(eval_image_size, eval_image_size) 118 | # image = image_preprocessing_fn(orig_image, orig_image, eval_image_size, eval_image_size, sensor=FLAGS.sensor) 119 | image = orig_image * bayer_mask 120 | # alpha, sigma = sensor_model.get_coeffs(light_level[None], sensor=FLAGS.sensor) 121 | # Scale to [-1, 1] 122 | if FLAGS.isp_model_name is None: 123 | image = 2 * (image - 0.5) 124 | 125 | images = tf.expand_dims(image, 0) 126 | 127 | #################### 128 | # Define the model # 129 | #################### 130 | inputs = images 131 | 132 | network_ops = network_fn(images=inputs, alpha=alpha, sigma=sigma, 133 | bayer_mask=bayer_mask, use_anscombe=FLAGS.use_anscombe, 134 | noise_channel=FLAGS.noise_channel, 135 | num_classes=num_classes, 136 | num_iters=FLAGS.num_iters, num_layers=FLAGS.num_layers, 137 | isp_model_name=FLAGS.isp_model_name, is_real_data=True) 138 | logits, end_points = network_ops[:2] 139 | 140 | variables_to_restore = slim.get_variables_to_restore() 141 | saver = tf.train.Saver() 142 | 143 | if tf.gfile.IsDirectory(FLAGS.checkpoint_path): 144 | checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path) 145 | else: 146 | checkpoint_path = FLAGS.checkpoint_path 147 | 148 | synset2label = {} 149 | with open("datasets/synset_labels.txt", "r") as f: 150 | for line in f: 151 | synset, label = line.split(':') 152 | synset2label[synset] = int(label) 153 | 154 | if not os.path.isdir(FLAGS.eval_dir): 155 | os.mkdir(FLAGS.eval_dir) 156 | 157 | with tf.Session() as sess: 158 | saver.restore(sess, FLAGS.checkpoint_path) 159 | synsets = os.listdir(FLAGS.dataset_dir) 160 | number_to_human = {int(i[0]):i[1] for i 
in np.genfromtxt('datasets/imagenet_labels.txt', delimiter=':', dtype=np.string_)} 161 | 162 | # estimated alpha and gama 163 | alpha_val = 0.0153 164 | sigma_val = 0.0328 165 | count = 0 166 | top1 = 0 167 | top5 = 0 168 | correct_paths = [] 169 | wrong_paths = [] 170 | for synset in synsets: 171 | if synset == 'labels.txt': 172 | continue 173 | synset_top5 = 0 174 | path = os.path.join(FLAGS.dataset_dir, synset) 175 | image_names = [name for name in sorted(os.listdir(path)) if '.dng' in name] 176 | for imagename in image_names: 177 | try: 178 | loaded_image = rawpy.imread(os.path.join(path, imagename)) 179 | info = pyexifinfo.get_json(os.path.join(path, imagename))[0] 180 | black_level = float(info['EXIF:BlackLevel'].split(' ')[0]) 181 | awb = [float(x) for x in info['EXIF:AsShotNeutral'].split(' ')] 182 | raw_img = (loaded_image.raw_image_visible - black_level) / 1023. 183 | except Exception as e: 184 | print(synset, imagename, e) 185 | continue 186 | 187 | B = raw_img[::2, ::2] / awb[2] 188 | R = raw_img[1::2, 1::2] / awb[0] 189 | G1 = raw_img[1::2, ::2] / awb[1] 190 | G2 = raw_img[::2, 1::2] / awb[1] 191 | B, R, G1, G2 = (crop_and_subsample(img, eval_image_size // 2) 192 | for img in [B, R, G1, G2]) 193 | scale_factor = 1.0 / np.percentile(np.stack([B, R, G1, G2], axis=2), 98) 194 | 195 | mosaiced = np.zeros((224, 224, 3)) 196 | mosaiced[::2, ::2, 2] = B 197 | mosaiced[1::2, 1::2, 0] = R 198 | mosaiced[1::2, ::2, 1] = G1 199 | mosaiced[::2, 1::2, 1] = G2 200 | 201 | img_scaled = mosaiced * scale_factor 202 | input_img = np.clip(img_scaled, 0, 1) 203 | scaling = (scale_factor / np.array(awb))[None, :] 204 | logits_vals, clean_image = sess.run( 205 | [logits[0, :], end_points.get('mobilenet_input', alpha)], 206 | feed_dict={orig_image: input_img, 207 | alpha: alpha_val * scaling, 208 | sigma: sigma_val * scaling}) 209 | correct = synset2label[synset] 210 | predictions = np.argsort(-logits_vals) 211 | rank = np.nonzero(predictions == correct)[0] 212 | clean_image = clean_image.squeeze() 213 | 214 | if count % 100 == 0: 215 | print("%d images out of 1000" % (count)) 216 | 217 | trgt_path = os.path.join(FLAGS.eval_dir, 'clean', synset) 218 | raw_path = os.path.join(FLAGS.eval_dir, 'raw', synset) 219 | 220 | if not os.path.exists(raw_path): 221 | os.makedirs(raw_path) 222 | 223 | if not os.path.exists(trgt_path): 224 | os.makedirs(trgt_path) 225 | cv2.imwrite(os.path.join(raw_path, imagename[:-4]+'.png'), (input_img*255).astype(np.uint8)) 226 | if FLAGS.isp_model_name == 'isp': 227 | trgt_path = os.path.join(trgt_path, imagename[:-4]+'.png') 228 | plt.imsave(trgt_path, clean_image) 229 | 230 | if rank == 0: 231 | correct_paths.append("%s \"%s\" \"%s\""%(os.path.join(trgt_path, imagename[:-4]+'.png'), number_to_human[correct], number_to_human[predictions[0]])) 232 | top1 += 1.0 233 | else: 234 | wrong_paths.append("%s \"%s\" \"%s\""%(os.path.join(trgt_path, imagename[:-4]+'.png'), number_to_human[correct], number_to_human[predictions[0]])) 235 | 236 | if rank <= 5: 237 | top5 += 1.0 238 | synset_top5 += 1.0 239 | count += 1 240 | 241 | print("Synset %s, Top 5 %f" % (synset, synset_top5 / len(image_names))) 242 | 243 | print("Top-1 %f, Top-5 %f" % (top1 / count, top5 / count)) 244 | 245 | with open(os.path.join(FLAGS.eval_dir, 'correct.txt'), 'w') as f: 246 | for item in correct_paths: 247 | f.write("%s\n" % item) 248 | with open(os.path.join(FLAGS.eval_dir, 'wrong.txt'), 'w') as f: 249 | for item in wrong_paths: 250 | f.write("%s\n" % item) 251 | 252 | 253 | if __name__ == '__main__': 254 | 
tf.app.run() 255 | 256 | 257 | 258 | -------------------------------------------------------------------------------- /test_synthetic_images.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | 16 | 17 | # This file evaluates a trained network on a test dataset and saves the filenames of images 18 | # that were correctly / falsely classified into a text file, so that the images that different 19 | # classifiers got right / wrong can be compared. 20 | 21 | """Generic evaluation script that evaluates a model using a given dataset. 22 | Noise is introduced before images are input to the classifier, and it is defined by the 23 | mode parameter, and the Camera Image Formation 24 | model defined in the Dirty-Pixels manuscript. 25 | """ 26 | 27 | from __future__ import absolute_import 28 | from __future__ import division 29 | from __future__ import print_function 30 | 31 | import math 32 | import tensorflow as tf 33 | import os 34 | from glob import glob 35 | 36 | import cv2 37 | 38 | from preprocessing import preprocessing_factory, sensor_model 39 | from datasets import dataset_factory 40 | from nets import nets_factory 41 | import numpy as np 42 | 43 | slim = tf.contrib.slim 44 | 45 | tf.app.flags.DEFINE_string( 46 | 'device', '', 'The address of the TensorFlow master to use.') 47 | 48 | tf.app.flags.DEFINE_string( 49 | 'mode', '3lux', 'Noise profile: 3lux, 6lux, 2to20lux, or 2to200lux.') 50 | 51 | 52 | tf.app.flags.DEFINE_string( 53 | 'checkpoint_path', '/tmp/tfmodel/', 54 | 'The directory where the model was written to or an absolute path to a ' 55 | 'checkpoint file.') 56 | 57 | tf.app.flags.DEFINE_string( 58 | 'dataset_name', 'imagenet', 'The name of the dataset to load.') 59 | 60 | tf.app.flags.DEFINE_string( 61 | 'dataset_dir', None, 'The directory where the dataset files are stored.') 62 | 63 | tf.app.flags.DEFINE_string( 64 | 'model_name', None, 'The name of the architecture to evaluate.') 65 | 66 | tf.app.flags.DEFINE_string('eval_dir', 'output_synthetic_images', 'Output directory') 67 | 68 | FLAGS = tf.app.flags.FLAGS 69 | 70 | 71 | def imnet_generator(root_directory): 72 | # list all directories 73 | dirs = sorted(glob(os.path.join(root_directory, "*/"))) 74 | print("#### num dirs", len(dirs)) 75 | 76 | # Build the label lookup table 77 | synset_to_label = {synset.decode('utf-8'):i+1 for i, synset in enumerate(np.genfromtxt('datasets/imagenet_lsvrc_2015_synsets.txt', dtype=np.string_))} 78 | # print(synset_to_label.items()) 79 | 80 | # loop through directories and glob all images 81 | for idx, dir in enumerate(dirs): 82 | # Glob all image files in this directory 83 | img_files = glob(os.path.join(dir, '*.png')) 84 | img_files += glob(os.path.join(dir, '*.jpg')) 85 | img_files += glob(os.path.join(dir, '*.jpeg')) 86 
| img_files += glob(os.path.join(dir, '*.JPEG')) 87 | 88 | for img_file in img_files: 89 | yield img_file, synset_to_label[os.path.basename(os.path.normpath(dir))], os.path.basename(os.path.normpath(dir)) 90 | 91 | def parse_img(img_path): 92 | rgb_string = tf.read_file(img_path) 93 | rgb_decoded = tf.image.decode_jpeg(rgb_string) # uint8 94 | rgb_decoded = tf.cast(rgb_decoded, tf.float32) 95 | rgb_decoded /= 255. 96 | return rgb_decoded 97 | 98 | def main(_): 99 | if not FLAGS.dataset_dir: 100 | raise ValueError('You must supply the dataset directory with --dataset_dir') 101 | 102 | os.environ['CUDA_VISIBLE_DEVICES'] = FLAGS.device 103 | eval_dir = FLAGS.eval_dir 104 | 105 | tf.logging.set_verbosity(tf.logging.INFO) 106 | with tf.Graph().as_default(): 107 | tf_global_step = slim.get_or_create_global_step() 108 | 109 | num_classes = 1001 110 | eval_image_size = 128 111 | 112 | image_path_graph = tf.placeholder(tf.string) 113 | label_graph = tf.placeholder(tf.int32) 114 | 115 | image = parse_img(image_path_graph) 116 | 117 | image = tf.image.central_crop(image, central_fraction=0.875) 118 | 119 | image = tf.expand_dims(image, 0) 120 | image = tf.image.resize_images(image, [eval_image_size, eval_image_size], method=tf.image.ResizeMethod.NEAREST_NEIGHBOR) 121 | image = tf.squeeze(image, [0]) 122 | 123 | #################### 124 | # Select the model # 125 | #################### 126 | network_fn = nets_factory.get_network_fn( 127 | FLAGS.model_name, 128 | num_classes=num_classes, 129 | batch_norm_decay=0.9, 130 | weight_decay=0.0, 131 | is_training=False) 132 | 133 | image.set_shape([128,128,3]) 134 | image = tf.expand_dims(image, 0) 135 | 136 | if FLAGS.mode == '2to20lux': 137 | ll_low = 0.001 138 | ll_high = 0.01 139 | elif FLAGS.mode == '2to200lux': 140 | ll_low = 0.001 141 | ll_high = 0.1 142 | elif FLAGS.mode == '3lux': 143 | ll_low = 0.0015 144 | ll_high = 0.0015 145 | elif FLAGS.mode == '6lux': 146 | ll_low = 0.003 147 | ll_high = 0.003 148 | 149 | noisy_batch, alpha, sigma = \ 150 | sensor_model.sensor_noise_rand_light_level(image, [ll_low, ll_high], scale=1.0, sensor='Nexus_6P_rear') 151 | bayer_mask = sensor_model.get_bayer_mask(128, 128) 152 | 153 | raw_image_graph = noisy_batch * bayer_mask 154 | 155 | #################### 156 | # Define the model # 157 | #################### 158 | logits, end_points, cleaned_image_graph = network_fn(images=raw_image_graph, alpha=alpha, sigma=sigma, 159 | bayer_mask=bayer_mask, use_anscombe=True, 160 | noise_channel=True, 161 | num_classes=num_classes, 162 | num_iters=1, num_layers=17, 163 | isp_model_name='isp') 164 | 165 | predictions = tf.argmax(logits, 1) 166 | 167 | if tf.gfile.IsDirectory(FLAGS.checkpoint_path): 168 | print('###### Loading last checkpoint of directory', FLAGS.checkpoint_path) 169 | checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path) 170 | else: 171 | print('###### Loading checkpoint', FLAGS.checkpoint_path) 172 | checkpoint_path = FLAGS.checkpoint_path 173 | 174 | 175 | tf.logging.info('Evaluating %s' % FLAGS.checkpoint_path) 176 | 177 | correct_paths = [] 178 | wrong_paths = [] 179 | 180 | # Restore variables from checkpoint 181 | variables_to_restore = slim.get_variables_to_restore() # slim.get_model_variables() 182 | saver = tf.train.Saver(variables_to_restore) 183 | 184 | number_to_human = {int(i[0]):i[1] for i in np.genfromtxt('datasets/imagenet_labels.txt', delimiter=':', dtype=np.string_)} 185 | 186 | eval_dir= FLAGS.eval_dir 187 | os.makedirs(eval_dir, exist_ok=True) 188 | 189 | with tf.Session() as 
sess: 190 | sess.run(tf.global_variables_initializer()) 191 | saver.restore(sess, checkpoint_path) 192 | 193 | count = 0 194 | for img_file, label, synset in imnet_generator(FLAGS.dataset_dir): 195 | preds_value, cleaned_image, raw_image = sess.run([predictions, cleaned_image_graph, raw_image_graph], 196 | feed_dict={image_path_graph:img_file, label_graph:label}) 197 | 198 | cleaned_image = np.clip(cleaned_image, 0.0, 1.0).squeeze()[:,:,::-1] 199 | raw_image = raw_image.squeeze()[:,:,::-1] 200 | img_filename = os.path.basename(os.path.normpath(img_file)) 201 | 202 | our_path = os.path.join(eval_dir, 'anscombe_output', FLAGS.mode, synset) 203 | raw_path = os.path.join(eval_dir, 'raw', FLAGS.mode, synset) 204 | 205 | if not os.path.exists(our_path): 206 | os.makedirs(our_path) 207 | if not os.path.exists(raw_path): 208 | os.makedirs(raw_path) 209 | 210 | if count % 10000 == 0: 211 | print('num. processed ', count) 212 | print('num. correct paths', len(correct_paths)) 213 | count += 1 214 | img_filename = os.path.splitext(img_filename)[0] + '.png' 215 | 216 | cv2.imwrite(os.path.join(our_path, img_filename), (cleaned_image*255).astype(np.uint8)) 217 | cv2.imwrite(os.path.join(raw_path, img_filename), (raw_image*255).astype(np.uint8)) 218 | 219 | if preds_value.squeeze() == label: 220 | correct_paths.append("%s \"%s\" \"%s\""%(os.path.join(our_path, img_filename), number_to_human[label], number_to_human[preds_value[0]])) 221 | else: 222 | wrong_paths.append("%s \"%s\" \"%s\""%(os.path.join(our_path, img_filename), number_to_human[label], number_to_human[preds_value[0]])) 223 | 224 | print('Top-1 accuracy', float(len(correct_paths))/float(len(wrong_paths)+len(correct_paths))) 225 | correct_paths_fn = os.path.join(eval_dir, FLAGS.mode + '_correct.txt') 226 | with open(correct_paths_fn, 'w') as f: 227 | for item in correct_paths: 228 | f.write("%s\n" % item) 229 | wrong_paths_fn = os.path.join(eval_dir, FLAGS.mode + '_wrong.txt') 230 | with open(wrong_paths_fn, 'w') as f: 231 | for item in wrong_paths: 232 | f.write("%s\n" % item) 233 | 234 | if __name__ == '__main__': 235 | tf.app.run() 236 | --------------------------------------------------------------------------------
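For reference, the noise-profile names accepted by `test_synthetic_images.py` (`--mode`) correspond to the light-level ranges passed to `train_image_classifier.py` and `simulate_raw_images.py` via `--ll_low`/`--ll_high` in the scripts above. A minimal sketch of that mapping follows; the `NOISE_PROFILES` dict and `light_level_range` helper are convenience names for illustration, not part of the repository:

```
# Mapping between noise-profile names and (ll_low, ll_high) light levels,
# as hard-coded in test_synthetic_images.py and run_train_joint_models.sh.
NOISE_PROFILES = {
    '3lux':      (0.0015, 0.0015),
    '6lux':      (0.003,  0.003),
    '2to20lux':  (0.001,  0.010),
    '2to200lux': (0.001,  0.100),
}

def light_level_range(mode):
    """Return (ll_low, ll_high) for a noise-profile name, e.g. '6lux'."""
    if mode not in NOISE_PROFILES:
        raise ValueError('Unknown noise profile: %s' % mode)
    return NOISE_PROFILES[mode]

# Example: build the flags for a 2to20lux run of simulate_raw_images.py.
ll_low, ll_high = light_level_range('2to20lux')
print('--ll_low=%g --ll_high=%g' % (ll_low, ll_high))
```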
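The captured-image evaluation script above unpacks the raw Bayer planes (after black-level subtraction and white balance) and re-packs them into a 3-channel mosaic in which every pixel keeps only its own Bayer color. A small sketch of that packing step, mirroring the index pattern in `test_captured_images.py`; the `pack_bayer_planes` name is hypothetical, and the layout is presumed to match `sensor_model.get_bayer_mask`:

```
import numpy as np

def pack_bayer_planes(R, G1, G2, B, size=224):
    """Hypothetical helper: pack half-resolution Bayer planes into a
    size x size x 3 mosaic, as done in test_captured_images.py."""
    mosaic = np.zeros((size, size, 3), dtype=np.float32)
    mosaic[::2, ::2, 2] = B      # blue sites
    mosaic[1::2, 1::2, 0] = R    # red sites
    mosaic[1::2, ::2, 1] = G1    # first green sites
    mosaic[::2, 1::2, 1] = G2    # second green sites
    return mosaic

# Example with random half-resolution planes (R, G1, G2, B):
planes = [np.random.rand(112, 112).astype(np.float32) for _ in range(4)]
mosaic = pack_bayer_planes(*planes)
```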