├── .gitignore
├── ADD_NOISE_INSTRUCTIONS.md
├── EVALUATION_INSTRUCTIONS.md
├── LICENSE
├── README.md
├── TRAINING_INSTRUCTIONS.md
├── datasets
├── __init__.py
├── beyond_gauss.py
├── build_darktable_data.py
├── build_imagenet_data.py
├── build_pixel_isp_data.py
├── classes.py
├── dataset_factory.py
├── dataset_utils.py
├── imagenet.py
├── imagenet_2012_bounding_boxes.csv
├── imagenet_labels.txt
├── imagenet_lsvrc_2015_synsets.txt
├── imagenet_metadata.txt
├── imnet_reg.py
├── labels.txt
├── number_synsets.txt
├── raw.py
├── raw_metadata.txt
├── synset_labels.txt
└── training_synsets.txt
├── deployment
├── __init__.py
├── model_deploy.py
└── model_deploy_test.py
├── environment.yml
├── loss_functions
├── __init__.py
└── loss_factory.py
├── nets
├── __init__.py
├── inception.py
├── inception_utils.py
├── isp.py
├── mobilenet_isp.py
├── mobilenet_v1.py
├── nets_factory.py
├── nets_factory_test.py
└── unet.py
├── preprocessing
├── __init__.py
├── inception_preprocessing.py
├── isp_pretrain_preprocessing.py
├── joint_isp_preprocessing.py
├── no_preprocessing.py
├── preprocessing_factory.py
├── sensor_model.py
└── writeout_preprocessing.py
├── run_test_captured_images.sh
├── run_test_synthetic_images.sh
├── run_train_joint_models.sh
├── simulate_raw_images.py
├── teaser
├── architecture_2.jpg
└── teaser_v4.png
├── test_captured_images.py
├── test_synthetic_images.py
└── train_image_classifier.py
/.gitignore:
--------------------------------------------------------------------------------
1 | *__pycache__*
2 | *.idea
3 | *$py.class
4 | *.egg-info
5 |
--------------------------------------------------------------------------------
/ADD_NOISE_INSTRUCTIONS.md:
--------------------------------------------------------------------------------
1 | # Simulating noisy raw images from Imagenet
2 | In order to evaluate and train new ISP or perception
3 | models on noisy images, we provide the noisy images
4 | that we used for evaluating the hardware ISP of the Movidius Myriad 2
5 | evaluation board: [Noisy-ImageNet](https://drive.google.com/drive/folders/1f9B319TDtFpZSi7HEXnrPa31rtPm54iH?usp=sharing).
6 |
7 | We also provide the code to simulate noisy raw images from
8 | the ImageNet dataset, using the image formation model
9 | described in the manuscript.
10 |
11 | In order to introduce `2to20lux` noise to the ImageNet dataset, run
12 |
13 | ```
14 | python simulate_raw_images.py --ll_low=0.001 --ll_high=0.010 \
15 | --input_dir=$IMAGENET_DIR --output_dir=$OUT_DIR
16 | ```
17 | where `$IMAGENET_DIR` is the ImageNet directory (training or evaluation set),
18 | `$OUT_DIR` is the directory to which the noisy images are written, and
19 | `ll_low` and `ll_high` are the lowest and highest light levels, respectively.
20 | To generate images with other noise profiles, adapt the `ll_low` and
21 | `ll_high` accordingly (see more examples in the `run_train_joint_models.sh` script).
22 |
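23 | For intuition only, below is a minimal, self-contained sketch of a simplified
24 | Poisson-Gaussian image formation model of the kind described in the manuscript.
25 | It is not the implementation in `simulate_raw_images.py`, and the `full_well`
26 | and `read_noise_std` constants are illustrative assumptions, not the values
27 | used in the paper.
28 |
29 | ```
30 | import numpy as np
31 |
32 | def simulate_noisy_raw(img, light_level, full_well=1000.0, read_noise_std=2.0,
33 |                        rng=None):
34 |     """Simplified sensor model: photon shot noise plus Gaussian read noise.
35 |
36 |     img: linear image in [0, 1]; light_level: scene brightness (cf. ll_low and
37 |     ll_high above); lower light levels yield noisier images.
38 |     """
39 |     rng = rng or np.random.default_rng()
40 |     photons = img * light_level * full_well             # expected photon count
41 |     shot = rng.poisson(photons).astype(np.float64)       # photon shot noise
42 |     read = rng.normal(0.0, read_noise_std, img.shape)    # sensor read noise
43 |     raw = (shot + read) / (light_level * full_well)      # back to [0, 1] range
44 |     return np.clip(raw, 0.0, 1.0)
45 | ```
46 |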
--------------------------------------------------------------------------------
/EVALUATION_INSTRUCTIONS.md:
--------------------------------------------------------------------------------
1 | # Evaluating pre-trained models
2 | In order to reproduce the results presented in the
3 | paper, first download the [pre-trained models](https://drive.google.com/file/d/1kBTRAS2W5Ayf2DOxKIgIBmPv5OHaMbCD/view?usp=sharing).
4 |
5 | ## Evaluate our joint models on real data
6 | Download and extract the real captured (low-light) images [dataset](https://drive.google.com/file/d/1fj2u8t_wVdNVUmcjyeK8VuqDfTAd7RJA/view?usp=sharing).
7 |
8 | To run our `2to200lux` joint model over the captured data (Table 2 of the paper),
9 | run
10 | ```
11 | python test_captured_images.py --device=1 --dataset_dir=$DATASET_DIR --dataset_name=imagenet \
12 | --checkpoint_path=$CHECKPOINTS/joint128/2to200lux/model.ckpt-232721 \
13 | --model_name=mobilenet_isp --noise_channel=True --use_anscombe=True \
14 | --isp_model_name=isp --eval_image_size=224 --sensor=Pixel --eval_dir $OUT_DIR
15 | ```
16 | where `--device` specifies the GPU on which the model runs,
17 | and `$DATASET_DIR` and `$CHECKPOINTS` should be set to the downloaded dataset
18 | and checkpoint directories, respectively. `$OUT_DIR` can be set to an
19 | arbitrary output directory path. See `run_test_captured_images.sh` for
20 | additional parameters to evaluate baseline models.
21 |
22 | ## Evaluate our joint models on synthetic data
23 | Download the [Imagenet][in] (validation) dataset.
24 | To evaluate our joint model over noisy images with a `6lux` noise profile, run
25 | ```
26 | python test_synthetic_images.py --device=1 --checkpoint_path=$CHECKPOINTS/joint128/6lux/model.ckpt-222267 \
27 | --dataset_dir=$IMAGENET_DATASET_DIR --dataset_name=imagenet --mode=6lux \
28 | --model_name=mobilenet_isp --eval_dir=$OUT_DIR
29 | ```
30 |
31 | where `$IMAGENET_DATASET_DIR` is the path to the ImageNet (validation) dataset,
32 | `$CHECKPOINTS` is set to the downloaded checkpoints directory,
33 | `--device` specifies the GPU on which the model runs,
34 | and `$OUT_DIR` is an arbitrary directory to which the results are written.
35 |
36 | For both the synthetic and captured image evaluation scripts, the generated results include
37 | the noisy raw input images, the Anscombe network output images,
38 | and lists of correctly and wrongly classified images.
39 | To run the trained models over different noise profiles,
40 | modify the checkpoint paths and `--mode` parameter (3lux, 6lux, 2to20lux, or 2to200lux)
41 | accordingly. See `run_test_synthetic_images.sh` for the specific parameters for each noise profile.
42 |
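43 | As a convenience, the sketch below shows one way to turn the generated lists of
44 | correctly and wrongly classified images into a top-1 accuracy number. It is not
45 | part of the repository; the `correct.txt` and `wrong.txt` file names are
46 | placeholders, so adjust them to the list files the evaluation script actually
47 | writes into `$OUT_DIR`.
48 |
49 | ```
50 | import os
51 |
52 | def top1_accuracy(eval_dir, correct_file='correct.txt', wrong_file='wrong.txt'):
53 |     """Compute top-1 accuracy from the per-image result lists in eval_dir."""
54 |     def count_lines(name):
55 |         # Each non-empty line is assumed to name one evaluated image.
56 |         with open(os.path.join(eval_dir, name)) as f:
57 |             return sum(1 for line in f if line.strip())
58 |     n_correct = count_lines(correct_file)
59 |     n_wrong = count_lines(wrong_file)
60 |     return n_correct / float(n_correct + n_wrong)
61 | ```
62 |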
68 | [in]: http://image-net.org/index
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2021 princeton-computational-imaging
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Dirty Pixels: Towards End-to-End Image Processing and Perception
2 | This repository contains the code for the paper
3 |
4 | **[Dirty Pixels: Towards End-to-End Image Processing and Perception][1]**
5 | [Steven Diamond][sd], [Vincent Sitzmann][vs], [Frank Julca-Aguilar][fj], [Stephen Boyd][sb], [Gordon Wetzstein][gw], [Felix Heide][fh]
6 | Transactions on Graphics, 2021 | To be presented at SIGGRAPH, 2021
7 |
8 |
9 |

10 |
11 |
12 |
13 |
14 |
15 |
16 |

17 |
18 |
19 | ## Installation
20 | Clone this repository:
21 | ```
22 | git clone git@github.com:princeton-computational-imaging/DirtyPixels.git
23 | ```
24 |
25 | The project was developed using Python 3.6, TensorFlow (v1.12) and Slim.
26 | We provide an environment file to install all dependencies (creating a conda environment called dirtypix):
27 |
28 | ```
29 | conda env create -f environment.yml
30 | conda activate dirtypix
31 | ```
32 |
33 |
34 |
35 | ## Running Experiments
36 | We provide code, data, and trained models to reproduce the main results presented in the paper, as well as instructions on how to use this project for further research:
37 | - [EVALUATION_INSTRUCTIONS.md](EVALUATION_INSTRUCTIONS.md) provides instructions
38 | on how to evaluate our proposed models and reproduce results of the paper.
39 | - [TRAINING_INSTRUCTIONS.md](TRAINING_INSTRUCTIONS.md) gives instructions on how to train new models following our proposed approach.
40 | - [ADD_NOISE_INSTRUCTIONS.md](ADD_NOISE_INSTRUCTIONS.md) explains how to simulate
41 | noisy raw images following the image formation model defined in the
42 | manuscript.
43 |
44 | ## Citation
45 | If you find our work useful in your research, please cite:
46 |
47 | ```
48 | @article{steven:dirtypixels2021,
49 | title={Dirty Pixels: Towards End-to-End Image Processing and Perception},
50 | author={Diamond, Steven and Sitzmann, Vincent and Julca-Aguilar, Frank and Boyd, Stephen and Wetzstein, Gordon and Heide, Felix},
51 | journal={ACM Transactions on Graphics (SIGGRAPH)},
52 | year={2021},
53 | publisher={ACM}
54 | }
55 | ```
56 |
57 | ## License
58 |
59 | This project is released under the [MIT License](LICENSE).
60 |
61 |
62 | [1]: https://arxiv.org/abs/1701.06487
63 | [sd]: https://stevendiamond.me
64 | [vs]: https://vsitzmann.github.io
65 | [fj]: https://github.com/fjulca-aguilar
66 | [sb]: https://web.stanford.edu/~boyd/
67 | [gw]: https://stanford.edu/~gordonwz/
68 | [fh]: https://www.cs.princeton.edu/~fheide/
69 |
70 |
--------------------------------------------------------------------------------
/TRAINING_INSTRUCTIONS.md:
--------------------------------------------------------------------------------
1 |
2 | ## Training new models over noisy RAW data
3 | Download the [Imagenet][in] (training) dataset.
4 | As described in the supplemental document,
5 | our joint models were trained in two stages.
6 | In the first stage, we train the Anscombe
7 | and MobileNet components separately on ImageNet.
8 | In this stage, we use an L1 loss to train the
9 | Anscombe networks. In the second stage,
10 | the joint (MobileNet + Anscombe) model is trained using only
11 | the high-level (classification)
12 | loss, starting from the checkpoints obtained in the first stage.
13 | To facilitate training new models, we provide the
14 | checkpoints obtained from the first stage. The checkpoints
15 | can be downloaded following the instructions in
16 | [EVALUATION_INSTRUCTIONS.md](EVALUATION_INSTRUCTIONS.md).
17 |
18 | ## Generating TFRecords for training
19 | In order to generate TFRecord files for training,
20 | run the `build_imagenet_data.py` script in the `datasets`
21 | folder:
22 |
23 | ```
24 | cd datasets
25 | python build_imagenet_data.py --train_directory=$IMAGENET_TRAIN_DIR \
26 | --output_directory=$OUT_DIR \
27 | --num_threads 8
28 | ```
29 | where `$IMAGENET_TRAIN_DIR` is the path to the Imagenet training dataset,
30 | `$OUT_DIR` is the path to the directory where the TFRecord files will
31 | be exported, and `--num_threads` sets the number of threads used to
32 | preprocess the images.
33 |
34 |
35 |
36 | ## Training command example
37 | In order to train our proposed joint architecture
38 | on a `6lux` noise profile, run:
39 |
40 | ```
41 | python train_image_classifier.py --train_dir=$TRAIN_DIR \
42 | --dataset_dir=$IMAGENET_TFRECORDS --ll_low=0.003 \
43 | --ll_high=0.003 --batch_size=256 --model_name=mobilenet_isp --num_readers=8 \
44 | --num_preprocessing_threads=8 --isp_checkpoint_path=$CHECKPOINTS/multires128/6lux/model.ckpt-27000 \
45 | --checkpoint_path=$CHECKPOINTS/mobilenet_v1_128/mobilenet_v1_1.0_128.ckpt --noise_channel=True \
46 | --use_anscombe=True --num_clones=2 --isp_model_name=isp --num_iters=1 --device=0,1 \
47 | --learning_rate=0.00045 --num_epochs_per_decay=2 --train_image_size=128
48 | ```
49 | where `$IMAGENET_TFRECORDS` is set to the directory with the Imagenet TFRecords, and `$CHECKPOINTS` is set to the downloaded checkpoints directory. The parameters `--checkpoint_path`
50 | and `--isp_checkpoint_path` are set to the checkpoints obtained in the first training stage.
51 | For training over other noise profiles, see
52 | `run_train_joint_models.sh`. For more details about the specific training parameters,
53 | see the main manuscript and supplemental document. To visualise the training
54 | progress run `tensorboard --logdir=$TRAIN_DIR`.
55 |
56 | [in]: http://image-net.org/index
57 |
58 |
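59 | As a final note on the TFRecord generation step above: before launching a long
60 | training run, it can help to sanity-check the generated shards. The sketch below
61 | is one way to do so; it assumes TensorFlow 1.x, the `train-*` shard naming
62 | produced by `build_imagenet_data.py`, and the standard `image/encoded` /
63 | `image/class/label` feature keys used by the dataset readers in `datasets/`.
64 |
65 | ```
66 | import glob
67 | import os
68 | import tensorflow as tf  # TensorFlow 1.x
69 |
70 | def inspect_tfrecords(tfrecord_dir, max_records=5):
71 |     """Print the label and encoded size of the first few training examples."""
72 |     shards = sorted(glob.glob(os.path.join(tfrecord_dir, 'train-*')))
73 |     count = 0
74 |     for shard in shards:
75 |         for record in tf.python_io.tf_record_iterator(shard):
76 |             example = tf.train.Example()
77 |             example.ParseFromString(record)
78 |             feat = example.features.feature
79 |             label = feat['image/class/label'].int64_list.value[0]
80 |             n_bytes = len(feat['image/encoded'].bytes_list.value[0])
81 |             print('label=%d, encoded bytes=%d' % (label, n_bytes))
82 |             count += 1
83 |             if count >= max_records:
84 |                 return
85 | ```
86 |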
--------------------------------------------------------------------------------
/datasets/__init__.py:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/datasets/beyond_gauss.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Provides data for the Cifar10 dataset.
16 |
17 | The dataset scripts used to create the dataset can be found at:
18 | tensorflow/models/slim/data/create_cifar10_dataset.py
19 | """
20 |
21 | from __future__ import absolute_import
22 | from __future__ import division
23 | from __future__ import print_function
24 |
25 | import os
26 | import tensorflow as tf
27 |
28 | from datasets import dataset_utils
29 |
30 | slim = tf.contrib.slim
31 |
32 | _FILE_PATTERN = 'beyond_gauss_%s.tfrecord'
33 |
34 | # for bw patches
35 | # SPLITS_TO_SIZES = {'train': 212096, 'test': 10000}
36 | # for color patches
37 | SPLITS_TO_SIZES = {'train': 252416, 'test': 10000}
38 | _ITEMS_TO_DESCRIPTIONS = {
39 | 'input': 'The input image for the model',
40 | 'ground_truth': 'The ground truth image to regress on.',
41 | }
42 |
43 |
44 | def get_split(split_name, dataset_dir, file_pattern=None, reader=None):
45 | """Gets a dataset tuple with instructions for reading cifar10.
46 |
47 | Args:
48 | split_name: A train/test split name.
49 | dataset_dir: The base directory of the dataset sources.
50 | file_pattern: The file pattern to use when matching the dataset sources.
51 | It is assumed that the pattern contains a '%s' string so that the split
52 | name can be inserted.
53 | reader: The TensorFlow reader type.
54 |
55 | Returns:
56 | A `Dataset` namedtuple.
57 |
58 | Raises:
59 | ValueError: if `split_name` is not a valid train/test split.
60 | """
61 | if split_name not in SPLITS_TO_SIZES:
62 | raise ValueError('split name %s was not recognized.' % split_name)
63 |
64 | if not file_pattern:
65 | file_pattern = _FILE_PATTERN
66 | file_pattern = os.path.join(dataset_dir, file_pattern % split_name)
67 |
68 | # Allowing None in the signature so that dataset_factory can use the default.
69 | if not reader:
70 | reader = tf.TFRecordReader
71 |
72 | keys_to_features = {
73 | 'input_img/encoded': tf.FixedLenFeature((), tf.string, default_value=''),
74 | 'gt_img/encoded': tf.FixedLenFeature((), tf.string, default_value=''),
75 | 'image/format': tf.FixedLenFeature((), tf.string, default_value='png'),
76 | }
77 |
78 | items_to_handlers = {
79 | 'input': slim.tfexample_decoder.Image('input_img/encoded', format_key='image/format'),
80 | 'ground_truth': slim.tfexample_decoder.Image('gt_img/encoded', format_key='image/format'),
81 | }
82 |
83 | decoder = slim.tfexample_decoder.TFExampleDecoder(
84 | keys_to_features, items_to_handlers)
85 |
86 | return slim.dataset.Dataset(
87 | data_sources=file_pattern,
88 | reader=reader,
89 | decoder=decoder,
90 | num_samples=SPLITS_TO_SIZES[split_name],
91 | items_to_descriptions=_ITEMS_TO_DESCRIPTIONS)
92 |
--------------------------------------------------------------------------------
/datasets/classes.py:
--------------------------------------------------------------------------------
1 | import os
2 | from glob import glob
3 | from distutils.dir_util import copy_tree
4 |
5 | base = '/media/data/dirty_pix_v3/validation_RAW/'
6 | root = base + 'RAW_human_ISO8000_EXP10000'
7 | target = base + 'RAW_synset_ISO8000_EXP10000'
8 |
9 | human_labels = glob(os.path.join(root,'*/'))
10 | human_labels = [label.split('/')[-2] for label in human_labels]
11 | print(human_labels)
12 |
13 | human_to_synset = {}
14 | with open('raw_metadata.txt', 'r') as synset_human_file:
15 | for line in synset_human_file:
16 | synset = line[:9]
17 | human = line[9:].strip().lower()
18 | for label in human_labels:
19 | for match in human.split(','):
20 | if label.strip() == match.strip().lower():
21 | human_to_synset[label] = synset
22 |
23 | print(human_to_synset)
24 | missing = False
25 | for h in human_labels:
26 | if h not in human_to_synset:
27 | print(h)
28 | missing = True
29 | if missing:
30 | print("Missing synsets!")
31 | else:
32 | print("All synsets mapped!")
33 |
34 | #print len(human_labels)
35 | #print len(human_to_synset)
36 |
37 | all_dirs = glob(os.path.join(root,'*/'))
38 | for subdir in all_dirs:
39 | no_imgs = len(glob(os.path.join(subdir, '*.dng')))
40 | if not no_imgs:
41 | print(subdir + " is empty")
42 | continue
43 |
44 | subdir = subdir[len(root)+1:-1]
45 | print(subdir)
46 |
47 | if subdir not in human_to_synset:
48 | print("Skipping %s"%subdir)
49 | continue
50 |
51 | print("Copying %d files from class %s"%(no_imgs, subdir))
52 |
53 | synset = human_to_synset[subdir]
54 | new_dir = os.path.join(target, synset)
55 | old_dir = os.path.join(root,subdir)
56 | copy_tree(old_dir, new_dir)
57 |
--------------------------------------------------------------------------------
/datasets/dataset_factory.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """A factory-pattern class which returns classification image/label pairs."""
16 |
17 | from __future__ import absolute_import
18 | from __future__ import division
19 | from __future__ import print_function
20 |
21 | from datasets import imagenet
22 | from datasets import imnet_reg
23 | from datasets import beyond_gauss
24 | from datasets import raw
25 |
26 | datasets_map = {
27 | 'imnet_reg': imnet_reg,
28 | 'imagenet': imagenet,
29 | 'beyond_gauss': beyond_gauss,
30 | 'raw': raw,
31 | }
32 |
33 |
34 | def get_dataset(name, split_name, dataset_dir, file_pattern=None, reader=None):
35 | """Given a dataset name and a split_name returns a Dataset.
36 |
37 | Args:
38 | name: String, the name of the dataset.
39 | split_name: A train/test split name.
40 | dataset_dir: The directory where the dataset files are stored.
41 | file_pattern: The file pattern to use for matching the dataset source files.
42 | reader: The subclass of tf.ReaderBase. If left as `None`, then the default
43 | reader defined by each dataset is used.
44 |
45 | Returns:
46 | A `Dataset` class.
47 |
48 | Raises:
49 | ValueError: If the dataset `name` is unknown.
50 | """
51 | if name not in datasets_map:
52 | raise ValueError('Name of dataset unknown %s' % name)
53 | return datasets_map[name].get_split(
54 | split_name,
55 | dataset_dir,
56 | file_pattern,
57 | reader)
58 |
--------------------------------------------------------------------------------
/datasets/dataset_utils.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Contains utilities for downloading and converting datasets."""
16 | from __future__ import absolute_import
17 | from __future__ import division
18 | from __future__ import print_function
19 |
20 | import os
21 | import sys
22 | import tarfile
23 |
24 | from six.moves import urllib
25 | import tensorflow as tf
26 |
27 | LABELS_FILENAME = 'labels.txt'
28 |
29 |
30 | def int64_feature(values):
31 | """Returns a TF-Feature of int64s.
32 |
33 | Args:
34 | values: A scalar or list of values.
35 |
36 | Returns:
37 | a TF-Feature.
38 | """
39 | if not isinstance(values, (tuple, list)):
40 | values = [values]
41 | return tf.train.Feature(int64_list=tf.train.Int64List(value=values))
42 |
43 |
44 | def bytes_feature(values):
45 | """Returns a TF-Feature of bytes.
46 |
47 | Args:
48 | values: A string.
49 |
50 | Returns:
51 | a TF-Feature.
52 | """
53 | return tf.train.Feature(bytes_list=tf.train.BytesList(value=[values]))
54 |
55 |
56 | def image_to_tfexample(image_data, image_format, height, width, class_id):
57 | return tf.train.Example(features=tf.train.Features(feature={
58 | 'image/encoded': bytes_feature(image_data),
59 | 'image/format': bytes_feature(image_format),
60 | 'image/class/label': int64_feature(class_id),
61 | 'image/height': int64_feature(height),
62 | 'image/width': int64_feature(width),
63 | }))
64 |
65 |
66 | def image_to_tfexample_for_regression(input_img, gt_img, image_format, height, width):
67 | return tf.train.Example(features=tf.train.Features(feature={
68 | 'input_img/encoded': bytes_feature(input_img),
69 | 'gt_img/encoded': bytes_feature(gt_img),
70 | 'imgs/format': bytes_feature(image_format),
71 | 'imgs/height': int64_feature(height),
72 | 'imgs/width': int64_feature(width),
73 | }))
74 |
75 |
76 | def download_and_uncompress_tarball(tarball_url, dataset_dir):
77 | """Downloads the `tarball_url` and uncompresses it locally.
78 |
79 | Args:
80 | tarball_url: The URL of a tarball file.
81 | dataset_dir: The directory where the temporary files are stored.
82 | """
83 | filename = tarball_url.split('/')[-1]
84 | filepath = os.path.join(dataset_dir, filename)
85 |
86 | def _progress(count, block_size, total_size):
87 | sys.stdout.write('\r>> Downloading %s %.1f%%' % (
88 | filename, float(count * block_size) / float(total_size) * 100.0))
89 | sys.stdout.flush()
90 | filepath, _ = urllib.request.urlretrieve(tarball_url, filepath, _progress)
91 | print()
92 | statinfo = os.stat(filepath)
93 | print('Successfully downloaded', filename, statinfo.st_size, 'bytes.')
94 | tarfile.open(filepath, 'r:gz').extractall(dataset_dir)
95 |
96 |
97 | def write_label_file(labels_to_class_names, dataset_dir,
98 | filename=LABELS_FILENAME):
99 | """Writes a file with the list of class names.
100 |
101 | Args:
102 | labels_to_class_names: A map of (integer) labels to class names.
103 | dataset_dir: The directory in which the labels file should be written.
104 | filename: The filename where the class names are written.
105 | """
106 | labels_filename = os.path.join(dataset_dir, filename)
107 | with tf.gfile.Open(labels_filename, 'w') as f:
108 | for label in labels_to_class_names:
109 | class_name = labels_to_class_names[label]
110 | f.write('%d:%s\n' % (label, class_name))
111 |
112 |
113 | def has_labels(dataset_dir, filename=LABELS_FILENAME):
114 | """Specifies whether or not the dataset directory contains a label map file.
115 |
116 | Args:
117 | dataset_dir: The directory in which the labels file is found.
118 | filename: The filename where the class names are written.
119 |
120 | Returns:
121 | `True` if the labels file exists and `False` otherwise.
122 | """
123 | return tf.gfile.Exists(os.path.join(dataset_dir, filename))
124 |
125 |
126 | def read_label_file(dataset_dir, filename=LABELS_FILENAME):
127 | """Reads the labels file and returns a mapping from ID to class name.
128 |
129 | Args:
130 | dataset_dir: The directory in which the labels file is found.
131 | filename: The filename where the class names are written.
132 |
133 | Returns:
134 | A map from a label (integer) to class name.
135 | """
136 | labels_filename = os.path.join(dataset_dir, filename)
137 | with tf.gfile.Open(labels_filename, 'r') as f:
138 | lines = f.read() #f.read().decode()
139 | lines = lines.split('\n')
140 | print('lines ', lines)
141 | lines = filter(None, lines)
142 |
143 | labels_to_class_names = {}
144 | for line in lines:
145 | index = line.index(':')
146 | labels_to_class_names[int(line[:index])] = line[index+1:]
147 | return labels_to_class_names
148 |
--------------------------------------------------------------------------------
/datasets/imagenet.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Provides data for the ImageNet ILSVRC 2012 Dataset plus some bounding boxes.
16 |
17 | Some images have one or more bounding boxes associated with the label of the
18 | image. See details here: http://image-net.org/download-bboxes
19 |
20 | ImageNet is based upon WordNet 3.0. To uniquely identify a synset, we use
21 | "WordNet ID" (wnid), which is a concatenation of POS ( i.e. part of speech )
22 | and SYNSET OFFSET of WordNet. For more information, please refer to the
23 | WordNet documentation[http://wordnet.princeton.edu/wordnet/documentation/].
24 |
25 | "There are bounding boxes for over 3000 popular synsets available.
26 | For each synset, there are on average 150 images with bounding boxes."
27 |
28 | WARNING: Don't use for object detection, in this case all the bounding boxes
29 | of the image belong to just one class.
30 | """
31 | from __future__ import absolute_import
32 | from __future__ import division
33 | from __future__ import print_function
34 |
35 | import os
36 | from six.moves import urllib
37 | import tensorflow as tf
38 |
39 | from datasets import dataset_utils
40 |
41 | slim = tf.contrib.slim
42 |
43 | # TODO(nsilberman): Add tfrecord file type once the script is updated.
44 | _FILE_PATTERN = '%s-*'
45 |
46 | _SPLITS_TO_SIZES = {
47 | 'train': 600000,#1281167,
48 | 'validation': 50000,
49 | }
50 |
51 | _ITEMS_TO_DESCRIPTIONS = {
52 | 'image': 'A color image of varying height and width.',
53 | 'label': 'The label id of the image, integer between 0 and 999',
54 | 'label_text': 'The text of the label.',
55 | 'object/bbox': 'A list of bounding boxes.',
56 | 'object/label': 'A list of labels, one per each object.',
57 | }
58 |
59 | _NUM_CLASSES = 1001
60 |
61 |
62 | def create_readable_names_for_imagenet_labels():
63 | """Create a dict mapping label id to human readable string.
64 |
65 | Returns:
66 | labels_to_names: dictionary where keys are integers from 0 to 1000
67 | and values are human-readable names.
68 |
69 | We retrieve a synset file, which contains a list of valid synset labels used
70 | by the ILSVRC competition. There is one synset per line, e.g.
71 | # n01440764
72 | # n01443537
73 | We also retrieve a synset_to_human_file, which contains a mapping from synsets
74 | to human-readable names for every synset in Imagenet. These are stored in a
75 | tsv format, as follows:
76 | # n02119247 black fox
77 | # n02119359 silver fox
78 | We assign each synset (in alphabetical order) an integer, starting from 1
79 | (since 0 is reserved for the background class).
80 |
81 | Code is based on
82 | https://github.com/tensorflow/models/blob/master/inception/inception/data/build_imagenet_data.py#L463
83 | """
84 |
85 | # pylint: disable=g-line-too-long
86 | base_url = 'https://raw.githubusercontent.com/tensorflow/models/master/inception/inception/data/'
87 | synset_url = '{}/imagenet_lsvrc_2015_synsets.txt'.format(base_url)
88 | synset_to_human_url = '{}/imagenet_metadata.txt'.format(base_url)
89 |
90 | #filename, _ = urllib.request.urlretrieve(synset_url)
91 | filename = './datasets/imagenet_lsvrc_2015_synsets.txt'
92 | synset_list = [s.strip() for s in open(filename).readlines()]
93 | num_synsets_in_ilsvrc = len(synset_list)
94 | assert num_synsets_in_ilsvrc == 1000
95 |
96 | #filename, _ = urllib.request.urlretrieve(synset_to_human_url)
97 | filename = './datasets/imagenet_metadata.txt'
98 | synset_to_human_list = open(filename).readlines()
99 | num_synsets_in_all_imagenet = len(synset_to_human_list)
100 | assert num_synsets_in_all_imagenet == 21842
101 |
102 | synset_to_human = {}
103 | for s in synset_to_human_list:
104 | parts = s.strip().split('\t')
105 | assert len(parts) == 2
106 | synset = parts[0]
107 | human = parts[1]
108 | synset_to_human[synset] = human
109 |
110 | label_index = 1
111 | labels_to_names = {0: 'background'}
112 | for synset in synset_list:
113 | name = synset_to_human[synset]
114 | labels_to_names[label_index] = name
115 | label_index += 1
116 |
117 | return labels_to_names
118 |
119 |
120 | def get_split(split_name, dataset_dir, file_pattern=None, reader=None):
121 | """Gets a dataset tuple with instructions for reading ImageNet.
122 |
123 | Args:
124 | split_name: A train/test split name.
125 | dataset_dir: The base directory of the dataset sources.
126 | file_pattern: The file pattern to use when matching the dataset sources.
127 | It is assumed that the pattern contains a '%s' string so that the split
128 | name can be inserted.
129 | reader: The TensorFlow reader type.
130 |
131 | Returns:
132 | A `Dataset` namedtuple.
133 |
134 | Raises:
135 | ValueError: if `split_name` is not a valid train/test split.
136 | """
137 | if split_name not in _SPLITS_TO_SIZES:
138 | raise ValueError('split name %s was not recognized.' % split_name)
139 |
140 | if not file_pattern:
141 | file_pattern = _FILE_PATTERN
142 | file_pattern = os.path.join(dataset_dir, file_pattern % split_name)
143 |
144 | # Allowing None in the signature so that dataset_factory can use the default.
145 | if reader is None:
146 | reader = tf.TFRecordReader
147 |
148 | keys_to_features = {
149 | 'image/encoded': tf.FixedLenFeature(
150 | (), tf.string, default_value=''),
151 | 'image/format': tf.FixedLenFeature(
152 | (), tf.string, default_value='jpeg'),
153 | 'image/class/label': tf.FixedLenFeature(
154 | [], dtype=tf.int64, default_value=-1),
155 | 'image/class/text': tf.FixedLenFeature(
156 | [], dtype=tf.string, default_value=''),
157 | 'image/object/bbox/xmin': tf.VarLenFeature(
158 | dtype=tf.float32),
159 | 'image/object/bbox/ymin': tf.VarLenFeature(
160 | dtype=tf.float32),
161 | 'image/object/bbox/xmax': tf.VarLenFeature(
162 | dtype=tf.float32),
163 | 'image/object/bbox/ymax': tf.VarLenFeature(
164 | dtype=tf.float32),
165 | 'image/object/class/label': tf.VarLenFeature(
166 | dtype=tf.int64),
167 | }
168 |
169 | items_to_handlers = {
170 | 'image': slim.tfexample_decoder.Image('image/encoded', 'image/format'),
171 | 'label': slim.tfexample_decoder.Tensor('image/class/label'),
172 | 'label_text': slim.tfexample_decoder.Tensor('image/class/text'),
173 | 'object/bbox': slim.tfexample_decoder.BoundingBox(
174 | ['ymin', 'xmin', 'ymax', 'xmax'], 'image/object/bbox/'),
175 | 'object/label': slim.tfexample_decoder.Tensor('image/object/class/label'),
176 | }
177 |
178 | decoder = slim.tfexample_decoder.TFExampleDecoder(
179 | keys_to_features, items_to_handlers)
180 |
181 | labels_to_names = None
182 | if dataset_utils.has_labels(dataset_dir):
183 | labels_to_names = dataset_utils.read_label_file(dataset_dir)
184 | else:
185 | labels_to_names = create_readable_names_for_imagenet_labels()
186 | dataset_utils.write_label_file(labels_to_names, dataset_dir)
187 |
188 | return slim.dataset.Dataset(
189 | data_sources=file_pattern,
190 | reader=reader,
191 | decoder=decoder,
192 | num_samples=_SPLITS_TO_SIZES[split_name],
193 | items_to_descriptions=_ITEMS_TO_DESCRIPTIONS,
194 | num_classes=_NUM_CLASSES,
195 | labels_to_names=labels_to_names)
196 |
--------------------------------------------------------------------------------
/datasets/imagenet_lsvrc_2015_synsets.txt:
--------------------------------------------------------------------------------
1 | n01440764
2 | n01443537
3 | n01484850
4 | n01491361
5 | n01494475
6 | n01496331
7 | n01498041
8 | n01514668
9 | n01514859
10 | n01518878
11 | n01530575
12 | n01531178
13 | n01532829
14 | n01534433
15 | n01537544
16 | n01558993
17 | n01560419
18 | n01580077
19 | n01582220
20 | n01592084
21 | n01601694
22 | n01608432
23 | n01614925
24 | n01616318
25 | n01622779
26 | n01629819
27 | n01630670
28 | n01631663
29 | n01632458
30 | n01632777
31 | n01641577
32 | n01644373
33 | n01644900
34 | n01664065
35 | n01665541
36 | n01667114
37 | n01667778
38 | n01669191
39 | n01675722
40 | n01677366
41 | n01682714
42 | n01685808
43 | n01687978
44 | n01688243
45 | n01689811
46 | n01692333
47 | n01693334
48 | n01694178
49 | n01695060
50 | n01697457
51 | n01698640
52 | n01704323
53 | n01728572
54 | n01728920
55 | n01729322
56 | n01729977
57 | n01734418
58 | n01735189
59 | n01737021
60 | n01739381
61 | n01740131
62 | n01742172
63 | n01744401
64 | n01748264
65 | n01749939
66 | n01751748
67 | n01753488
68 | n01755581
69 | n01756291
70 | n01768244
71 | n01770081
72 | n01770393
73 | n01773157
74 | n01773549
75 | n01773797
76 | n01774384
77 | n01774750
78 | n01775062
79 | n01776313
80 | n01784675
81 | n01795545
82 | n01796340
83 | n01797886
84 | n01798484
85 | n01806143
86 | n01806567
87 | n01807496
88 | n01817953
89 | n01818515
90 | n01819313
91 | n01820546
92 | n01824575
93 | n01828970
94 | n01829413
95 | n01833805
96 | n01843065
97 | n01843383
98 | n01847000
99 | n01855032
100 | n01855672
101 | n01860187
102 | n01871265
103 | n01872401
104 | n01873310
105 | n01877812
106 | n01882714
107 | n01883070
108 | n01910747
109 | n01914609
110 | n01917289
111 | n01924916
112 | n01930112
113 | n01943899
114 | n01944390
115 | n01945685
116 | n01950731
117 | n01955084
118 | n01968897
119 | n01978287
120 | n01978455
121 | n01980166
122 | n01981276
123 | n01983481
124 | n01984695
125 | n01985128
126 | n01986214
127 | n01990800
128 | n02002556
129 | n02002724
130 | n02006656
131 | n02007558
132 | n02009229
133 | n02009912
134 | n02011460
135 | n02012849
136 | n02013706
137 | n02017213
138 | n02018207
139 | n02018795
140 | n02025239
141 | n02027492
142 | n02028035
143 | n02033041
144 | n02037110
145 | n02051845
146 | n02056570
147 | n02058221
148 | n02066245
149 | n02071294
150 | n02074367
151 | n02077923
152 | n02085620
153 | n02085782
154 | n02085936
155 | n02086079
156 | n02086240
157 | n02086646
158 | n02086910
159 | n02087046
160 | n02087394
161 | n02088094
162 | n02088238
163 | n02088364
164 | n02088466
165 | n02088632
166 | n02089078
167 | n02089867
168 | n02089973
169 | n02090379
170 | n02090622
171 | n02090721
172 | n02091032
173 | n02091134
174 | n02091244
175 | n02091467
176 | n02091635
177 | n02091831
178 | n02092002
179 | n02092339
180 | n02093256
181 | n02093428
182 | n02093647
183 | n02093754
184 | n02093859
185 | n02093991
186 | n02094114
187 | n02094258
188 | n02094433
189 | n02095314
190 | n02095570
191 | n02095889
192 | n02096051
193 | n02096177
194 | n02096294
195 | n02096437
196 | n02096585
197 | n02097047
198 | n02097130
199 | n02097209
200 | n02097298
201 | n02097474
202 | n02097658
203 | n02098105
204 | n02098286
205 | n02098413
206 | n02099267
207 | n02099429
208 | n02099601
209 | n02099712
210 | n02099849
211 | n02100236
212 | n02100583
213 | n02100735
214 | n02100877
215 | n02101006
216 | n02101388
217 | n02101556
218 | n02102040
219 | n02102177
220 | n02102318
221 | n02102480
222 | n02102973
223 | n02104029
224 | n02104365
225 | n02105056
226 | n02105162
227 | n02105251
228 | n02105412
229 | n02105505
230 | n02105641
231 | n02105855
232 | n02106030
233 | n02106166
234 | n02106382
235 | n02106550
236 | n02106662
237 | n02107142
238 | n02107312
239 | n02107574
240 | n02107683
241 | n02107908
242 | n02108000
243 | n02108089
244 | n02108422
245 | n02108551
246 | n02108915
247 | n02109047
248 | n02109525
249 | n02109961
250 | n02110063
251 | n02110185
252 | n02110341
253 | n02110627
254 | n02110806
255 | n02110958
256 | n02111129
257 | n02111277
258 | n02111500
259 | n02111889
260 | n02112018
261 | n02112137
262 | n02112350
263 | n02112706
264 | n02113023
265 | n02113186
266 | n02113624
267 | n02113712
268 | n02113799
269 | n02113978
270 | n02114367
271 | n02114548
272 | n02114712
273 | n02114855
274 | n02115641
275 | n02115913
276 | n02116738
277 | n02117135
278 | n02119022
279 | n02119789
280 | n02120079
281 | n02120505
282 | n02123045
283 | n02123159
284 | n02123394
285 | n02123597
286 | n02124075
287 | n02125311
288 | n02127052
289 | n02128385
290 | n02128757
291 | n02128925
292 | n02129165
293 | n02129604
294 | n02130308
295 | n02132136
296 | n02133161
297 | n02134084
298 | n02134418
299 | n02137549
300 | n02138441
301 | n02165105
302 | n02165456
303 | n02167151
304 | n02168699
305 | n02169497
306 | n02172182
307 | n02174001
308 | n02177972
309 | n02190166
310 | n02206856
311 | n02219486
312 | n02226429
313 | n02229544
314 | n02231487
315 | n02233338
316 | n02236044
317 | n02256656
318 | n02259212
319 | n02264363
320 | n02268443
321 | n02268853
322 | n02276258
323 | n02277742
324 | n02279972
325 | n02280649
326 | n02281406
327 | n02281787
328 | n02317335
329 | n02319095
330 | n02321529
331 | n02325366
332 | n02326432
333 | n02328150
334 | n02342885
335 | n02346627
336 | n02356798
337 | n02361337
338 | n02363005
339 | n02364673
340 | n02389026
341 | n02391049
342 | n02395406
343 | n02396427
344 | n02397096
345 | n02398521
346 | n02403003
347 | n02408429
348 | n02410509
349 | n02412080
350 | n02415577
351 | n02417914
352 | n02422106
353 | n02422699
354 | n02423022
355 | n02437312
356 | n02437616
357 | n02441942
358 | n02442845
359 | n02443114
360 | n02443484
361 | n02444819
362 | n02445715
363 | n02447366
364 | n02454379
365 | n02457408
366 | n02480495
367 | n02480855
368 | n02481823
369 | n02483362
370 | n02483708
371 | n02484975
372 | n02486261
373 | n02486410
374 | n02487347
375 | n02488291
376 | n02488702
377 | n02489166
378 | n02490219
379 | n02492035
380 | n02492660
381 | n02493509
382 | n02493793
383 | n02494079
384 | n02497673
385 | n02500267
386 | n02504013
387 | n02504458
388 | n02509815
389 | n02510455
390 | n02514041
391 | n02526121
392 | n02536864
393 | n02606052
394 | n02607072
395 | n02640242
396 | n02641379
397 | n02643566
398 | n02655020
399 | n02666196
400 | n02667093
401 | n02669723
402 | n02672831
403 | n02676566
404 | n02687172
405 | n02690373
406 | n02692877
407 | n02699494
408 | n02701002
409 | n02704792
410 | n02708093
411 | n02727426
412 | n02730930
413 | n02747177
414 | n02749479
415 | n02769748
416 | n02776631
417 | n02777292
418 | n02782093
419 | n02783161
420 | n02786058
421 | n02787622
422 | n02788148
423 | n02790996
424 | n02791124
425 | n02791270
426 | n02793495
427 | n02794156
428 | n02795169
429 | n02797295
430 | n02799071
431 | n02802426
432 | n02804414
433 | n02804610
434 | n02807133
435 | n02808304
436 | n02808440
437 | n02814533
438 | n02814860
439 | n02815834
440 | n02817516
441 | n02823428
442 | n02823750
443 | n02825657
444 | n02834397
445 | n02835271
446 | n02837789
447 | n02840245
448 | n02841315
449 | n02843684
450 | n02859443
451 | n02860847
452 | n02865351
453 | n02869837
454 | n02870880
455 | n02871525
456 | n02877765
457 | n02879718
458 | n02883205
459 | n02892201
460 | n02892767
461 | n02894605
462 | n02895154
463 | n02906734
464 | n02909870
465 | n02910353
466 | n02916936
467 | n02917067
468 | n02927161
469 | n02930766
470 | n02939185
471 | n02948072
472 | n02950826
473 | n02951358
474 | n02951585
475 | n02963159
476 | n02965783
477 | n02966193
478 | n02966687
479 | n02971356
480 | n02974003
481 | n02977058
482 | n02978881
483 | n02979186
484 | n02980441
485 | n02981792
486 | n02988304
487 | n02992211
488 | n02992529
489 | n02999410
490 | n03000134
491 | n03000247
492 | n03000684
493 | n03014705
494 | n03016953
495 | n03017168
496 | n03018349
497 | n03026506
498 | n03028079
499 | n03032252
500 | n03041632
501 | n03042490
502 | n03045698
503 | n03047690
504 | n03062245
505 | n03063599
506 | n03063689
507 | n03065424
508 | n03075370
509 | n03085013
510 | n03089624
511 | n03095699
512 | n03100240
513 | n03109150
514 | n03110669
515 | n03124043
516 | n03124170
517 | n03125729
518 | n03126707
519 | n03127747
520 | n03127925
521 | n03131574
522 | n03133878
523 | n03134739
524 | n03141823
525 | n03146219
526 | n03160309
527 | n03179701
528 | n03180011
529 | n03187595
530 | n03188531
531 | n03196217
532 | n03197337
533 | n03201208
534 | n03207743
535 | n03207941
536 | n03208938
537 | n03216828
538 | n03218198
539 | n03220513
540 | n03223299
541 | n03240683
542 | n03249569
543 | n03250847
544 | n03255030
545 | n03259280
546 | n03271574
547 | n03272010
548 | n03272562
549 | n03290653
550 | n03291819
551 | n03297495
552 | n03314780
553 | n03325584
554 | n03337140
555 | n03344393
556 | n03345487
557 | n03347037
558 | n03355925
559 | n03372029
560 | n03376595
561 | n03379051
562 | n03384352
563 | n03388043
564 | n03388183
565 | n03388549
566 | n03393912
567 | n03394916
568 | n03400231
569 | n03404251
570 | n03417042
571 | n03424325
572 | n03425413
573 | n03443371
574 | n03444034
575 | n03445777
576 | n03445924
577 | n03447447
578 | n03447721
579 | n03450230
580 | n03452741
581 | n03457902
582 | n03459775
583 | n03461385
584 | n03467068
585 | n03476684
586 | n03476991
587 | n03478589
588 | n03481172
589 | n03482405
590 | n03483316
591 | n03485407
592 | n03485794
593 | n03492542
594 | n03494278
595 | n03495258
596 | n03496892
597 | n03498962
598 | n03527444
599 | n03529860
600 | n03530642
601 | n03532672
602 | n03534580
603 | n03535780
604 | n03538406
605 | n03544143
606 | n03584254
607 | n03584829
608 | n03590841
609 | n03594734
610 | n03594945
611 | n03595614
612 | n03598930
613 | n03599486
614 | n03602883
615 | n03617480
616 | n03623198
617 | n03627232
618 | n03630383
619 | n03633091
620 | n03637318
621 | n03642806
622 | n03649909
623 | n03657121
624 | n03658185
625 | n03661043
626 | n03662601
627 | n03666591
628 | n03670208
629 | n03673027
630 | n03676483
631 | n03680355
632 | n03690938
633 | n03691459
634 | n03692522
635 | n03697007
636 | n03706229
637 | n03709823
638 | n03710193
639 | n03710637
640 | n03710721
641 | n03717622
642 | n03720891
643 | n03721384
644 | n03724870
645 | n03729826
646 | n03733131
647 | n03733281
648 | n03733805
649 | n03742115
650 | n03743016
651 | n03759954
652 | n03761084
653 | n03763968
654 | n03764736
655 | n03769881
656 | n03770439
657 | n03770679
658 | n03773504
659 | n03775071
660 | n03775546
661 | n03776460
662 | n03777568
663 | n03777754
664 | n03781244
665 | n03782006
666 | n03785016
667 | n03786901
668 | n03787032
669 | n03788195
670 | n03788365
671 | n03791053
672 | n03792782
673 | n03792972
674 | n03793489
675 | n03794056
676 | n03796401
677 | n03803284
678 | n03804744
679 | n03814639
680 | n03814906
681 | n03825788
682 | n03832673
683 | n03837869
684 | n03838899
685 | n03840681
686 | n03841143
687 | n03843555
688 | n03854065
689 | n03857828
690 | n03866082
691 | n03868242
692 | n03868863
693 | n03871628
694 | n03873416
695 | n03874293
696 | n03874599
697 | n03876231
698 | n03877472
699 | n03877845
700 | n03884397
701 | n03887697
702 | n03888257
703 | n03888605
704 | n03891251
705 | n03891332
706 | n03895866
707 | n03899768
708 | n03902125
709 | n03903868
710 | n03908618
711 | n03908714
712 | n03916031
713 | n03920288
714 | n03924679
715 | n03929660
716 | n03929855
717 | n03930313
718 | n03930630
719 | n03933933
720 | n03935335
721 | n03937543
722 | n03938244
723 | n03942813
724 | n03944341
725 | n03947888
726 | n03950228
727 | n03954731
728 | n03956157
729 | n03958227
730 | n03961711
731 | n03967562
732 | n03970156
733 | n03976467
734 | n03976657
735 | n03977966
736 | n03980874
737 | n03982430
738 | n03983396
739 | n03991062
740 | n03992509
741 | n03995372
742 | n03998194
743 | n04004767
744 | n04005630
745 | n04008634
746 | n04009552
747 | n04019541
748 | n04023962
749 | n04026417
750 | n04033901
751 | n04033995
752 | n04037443
753 | n04039381
754 | n04040759
755 | n04041544
756 | n04044716
757 | n04049303
758 | n04065272
759 | n04067472
760 | n04069434
761 | n04070727
762 | n04074963
763 | n04081281
764 | n04086273
765 | n04090263
766 | n04099969
767 | n04111531
768 | n04116512
769 | n04118538
770 | n04118776
771 | n04120489
772 | n04125021
773 | n04127249
774 | n04131690
775 | n04133789
776 | n04136333
777 | n04141076
778 | n04141327
779 | n04141975
780 | n04146614
781 | n04147183
782 | n04149813
783 | n04152593
784 | n04153751
785 | n04154565
786 | n04162706
787 | n04179913
788 | n04192698
789 | n04200800
790 | n04201297
791 | n04204238
792 | n04204347
793 | n04208210
794 | n04209133
795 | n04209239
796 | n04228054
797 | n04229816
798 | n04235860
799 | n04238763
800 | n04239074
801 | n04243546
802 | n04251144
803 | n04252077
804 | n04252225
805 | n04254120
806 | n04254680
807 | n04254777
808 | n04258138
809 | n04259630
810 | n04263257
811 | n04264628
812 | n04265275
813 | n04266014
814 | n04270147
815 | n04273569
816 | n04275548
817 | n04277352
818 | n04285008
819 | n04286575
820 | n04296562
821 | n04310018
822 | n04311004
823 | n04311174
824 | n04317175
825 | n04325704
826 | n04326547
827 | n04328186
828 | n04330267
829 | n04332243
830 | n04335435
831 | n04336792
832 | n04344873
833 | n04346328
834 | n04347754
835 | n04350905
836 | n04355338
837 | n04355933
838 | n04356056
839 | n04357314
840 | n04366367
841 | n04367480
842 | n04370456
843 | n04371430
844 | n04371774
845 | n04372370
846 | n04376876
847 | n04380533
848 | n04389033
849 | n04392985
850 | n04398044
851 | n04399382
852 | n04404412
853 | n04409515
854 | n04417672
855 | n04418357
856 | n04423845
857 | n04428191
858 | n04429376
859 | n04435653
860 | n04442312
861 | n04443257
862 | n04447861
863 | n04456115
864 | n04458633
865 | n04461696
866 | n04462240
867 | n04465501
868 | n04467665
869 | n04476259
870 | n04479046
871 | n04482393
872 | n04483307
873 | n04485082
874 | n04486054
875 | n04487081
876 | n04487394
877 | n04493381
878 | n04501370
879 | n04505470
880 | n04507155
881 | n04509417
882 | n04515003
883 | n04517823
884 | n04522168
885 | n04523525
886 | n04525038
887 | n04525305
888 | n04532106
889 | n04532670
890 | n04536866
891 | n04540053
892 | n04542943
893 | n04548280
894 | n04548362
895 | n04550184
896 | n04552348
897 | n04553703
898 | n04554684
899 | n04557648
900 | n04560804
901 | n04562935
902 | n04579145
903 | n04579432
904 | n04584207
905 | n04589890
906 | n04590129
907 | n04591157
908 | n04591713
909 | n04592741
910 | n04596742
911 | n04597913
912 | n04599235
913 | n04604644
914 | n04606251
915 | n04612504
916 | n04613696
917 | n06359193
918 | n06596364
919 | n06785654
920 | n06794110
921 | n06874185
922 | n07248320
923 | n07565083
924 | n07579787
925 | n07583066
926 | n07584110
927 | n07590611
928 | n07613480
929 | n07614500
930 | n07615774
931 | n07684084
932 | n07693725
933 | n07695742
934 | n07697313
935 | n07697537
936 | n07711569
937 | n07714571
938 | n07714990
939 | n07715103
940 | n07716358
941 | n07716906
942 | n07717410
943 | n07717556
944 | n07718472
945 | n07718747
946 | n07720875
947 | n07730033
948 | n07734744
949 | n07742313
950 | n07745940
951 | n07747607
952 | n07749582
953 | n07753113
954 | n07753275
955 | n07753592
956 | n07754684
957 | n07760859
958 | n07768694
959 | n07802026
960 | n07831146
961 | n07836838
962 | n07860988
963 | n07871810
964 | n07873807
965 | n07875152
966 | n07880968
967 | n07892512
968 | n07920052
969 | n07930864
970 | n07932039
971 | n09193705
972 | n09229709
973 | n09246464
974 | n09256479
975 | n09288635
976 | n09332890
977 | n09399592
978 | n09421951
979 | n09428293
980 | n09468604
981 | n09472597
982 | n09835506
983 | n10148035
984 | n10565667
985 | n11879895
986 | n11939491
987 | n12057211
988 | n12144580
989 | n12267677
990 | n12620546
991 | n12768682
992 | n12985857
993 | n12998815
994 | n13037406
995 | n13040303
996 | n13044778
997 | n13052670
998 | n13054560
999 | n13133613
1000 | n15075141
1001 |
--------------------------------------------------------------------------------
/datasets/imnet_reg.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Provides data for the ImageNet ILSVRC 2012 Dataset plus some bounding boxes.
16 |
17 | Some images have one or more bounding boxes associated with the label of the
18 | image. See details here: http://image-net.org/download-bboxes
19 |
20 | ImageNet is based upon WordNet 3.0. To uniquely identify a synset, we use
21 | "WordNet ID" (wnid), which is a concatenation of POS ( i.e. part of speech )
22 | and SYNSET OFFSET of WordNet. For more information, please refer to the
23 | WordNet documentation[http://wordnet.princeton.edu/wordnet/documentation/].
24 |
25 | "There are bounding boxes for over 3000 popular synsets available.
26 | For each synset, there are on average 150 images with bounding boxes."
27 |
28 | WARNING: Don't use for object detection, in this case all the bounding boxes
29 | of the image belong to just one class.
30 | """
31 | from __future__ import absolute_import
32 | from __future__ import division
33 | from __future__ import print_function
34 |
35 | import os
36 | from six.moves import urllib
37 | import tensorflow as tf
38 |
39 | from datasets import dataset_utils
40 |
41 | slim = tf.contrib.slim
42 |
43 | # TODO(nsilberman): Add tfrecord file type once the script is updated.
44 | _FILE_PATTERN = '%s-*'
45 |
46 | _SPLITS_TO_SIZES = {
47 | 'train': 1156644,
48 | 'validation': 50000,
49 | }
50 |
51 | _ITEMS_TO_DESCRIPTIONS = {
52 | 'input_img': 'A color image of varying height and width.',
53 | 'gt_img': 'A color image of varying height and width.',
54 | 'label': 'The label id of the image, integer between 0 and 999',
55 | 'label_text': 'The text of the label.',
56 | 'object/bbox': 'A list of bounding boxes.',
57 | 'object/label': 'A list of labels, one per each object.',
58 | }
59 |
60 | _NUM_CLASSES = 1001
61 |
62 |
63 | def create_readable_names_for_imagenet_labels():
64 | """Create a dict mapping label id to human readable string.
65 |
66 | Returns:
67 | labels_to_names: dictionary where keys are integers from 0 to 1000
68 | and values are human-readable names.
69 |
70 | We retrieve a synset file, which contains a list of valid synset labels used
71 | by the ILSVRC competition. There is one synset per line, e.g.
72 | # n01440764
73 | # n01443537
74 | We also retrieve a synset_to_human_file, which contains a mapping from synsets
75 | to human-readable names for every synset in Imagenet. These are stored in a
76 | tsv format, as follows:
77 | # n02119247 black fox
78 | # n02119359 silver fox
79 | We assign each synset (in alphabetical order) an integer, starting from 1
80 | (since 0 is reserved for the background class).
81 |
82 | Code is based on
83 | https://github.com/tensorflow/models/blob/master/inception/inception/data/convert_reg_imgnet.py#L463
84 | """
85 |
86 | # pylint: disable=g-line-too-long
87 | base_url = 'https://raw.githubusercontent.com/tensorflow/models/master/inception/inception/data/'
88 | synset_url = '{}/imagenet_lsvrc_2015_synsets.txt'.format(base_url)
89 | synset_to_human_url = '{}/imagenet_metadata.txt'.format(base_url)
90 |
91 | #filename, _ = urllib.request.urlretrieve(synset_url)
92 | filename = 'imagenet_lsvrc_2015_synsets.txt'
93 | synset_list = [s.strip() for s in open(filename).readlines()]
94 | num_synsets_in_ilsvrc = len(synset_list)
95 | assert num_synsets_in_ilsvrc == 1000
96 |
97 | #filename, _ = urllib.request.urlretrieve(synset_to_human_url)
98 | filename = 'imagenet_metadata.txt'
99 | synset_to_human_list = open(filename).readlines()
100 | num_synsets_in_all_imagenet = len(synset_to_human_list)
101 | assert num_synsets_in_all_imagenet == 21842
102 |
103 | synset_to_human = {}
104 | for s in synset_to_human_list:
105 | parts = s.strip().split('\t')
106 | assert len(parts) == 2
107 | synset = parts[0]
108 | human = parts[1]
109 | synset_to_human[synset] = human
110 |
111 | label_index = 1
112 | labels_to_names = {0: 'background'}
113 | for synset in synset_list:
114 | name = synset_to_human[synset]
115 | labels_to_names[label_index] = name
116 | label_index += 1
117 |
118 | return labels_to_names
119 |
120 |
121 | def get_split(split_name, dataset_dir, file_pattern=None, reader=None):
122 | """Gets a dataset tuple with instructions for reading ImageNet.
123 |
124 | Args:
125 | split_name: A train/test split name.
126 | dataset_dir: The base directory of the dataset sources.
127 | file_pattern: The file pattern to use when matching the dataset sources.
128 | It is assumed that the pattern contains a '%s' string so that the split
129 | name can be inserted.
130 | reader: The TensorFlow reader type.
131 |
132 | Returns:
133 | A `Dataset` namedtuple.
134 |
135 | Raises:
136 | ValueError: if `split_name` is not a valid train/test split.
137 | """
138 | if split_name not in _SPLITS_TO_SIZES:
139 | raise ValueError('split name %s was not recognized.' % split_name)
140 |
141 | if not file_pattern:
142 | file_pattern = _FILE_PATTERN
143 | file_pattern = os.path.join(dataset_dir, file_pattern % split_name)
144 |
145 | # Allowing None in the signature so that dataset_factory can use the default.
146 | if reader is None:
147 | reader = tf.TFRecordReader
148 |
149 | keys_to_features = {
150 | 'input_img/encoded': tf.FixedLenFeature(
151 | (), tf.string, default_value=''),
152 | 'gt_img/encoded': tf.FixedLenFeature(
153 | (), tf.string, default_value=''),
154 | 'image/format': tf.FixedLenFeature(
155 | (), tf.string, default_value='jpeg'),
156 | 'image/class/label': tf.FixedLenFeature(
157 | [], dtype=tf.int64, default_value=-1),
158 | 'image/class/text': tf.FixedLenFeature(
159 | [], dtype=tf.string, default_value=''),
160 | 'image/object/bbox/xmin': tf.VarLenFeature(
161 | dtype=tf.float32),
162 | 'image/object/bbox/ymin': tf.VarLenFeature(
163 | dtype=tf.float32),
164 | 'image/object/bbox/xmax': tf.VarLenFeature(
165 | dtype=tf.float32),
166 | 'image/object/bbox/ymax': tf.VarLenFeature(
167 | dtype=tf.float32),
168 | 'image/object/class/label': tf.VarLenFeature(
169 | dtype=tf.int64),
170 | }
171 |
172 | items_to_handlers = {
173 | 'input': slim.tfexample_decoder.Image('input_img/encoded', 'image/format'),
174 | 'ground_truth': slim.tfexample_decoder.Image('gt_img/encoded', 'image/format'),
175 | 'label': slim.tfexample_decoder.Tensor('image/class/label'),
176 | 'label_text': slim.tfexample_decoder.Tensor('image/class/text'),
177 | 'object/bbox': slim.tfexample_decoder.BoundingBox(
178 | ['ymin', 'xmin', 'ymax', 'xmax'], 'image/object/bbox/'),
179 | 'object/label': slim.tfexample_decoder.Tensor('image/object/class/label'),
180 | }
181 |
182 | decoder = slim.tfexample_decoder.TFExampleDecoder(
183 | keys_to_features, items_to_handlers)
184 |
185 | return slim.dataset.Dataset(
186 | data_sources=file_pattern,
187 | reader=reader,
188 | decoder=decoder,
189 | num_samples=_SPLITS_TO_SIZES[split_name],
190 | items_to_descriptions=_ITEMS_TO_DESCRIPTIONS)
191 |
--------------------------------------------------------------------------------
/datasets/raw.py:
--------------------------------------------------------------------------------
1 | """
2 | Provides data for the ImageNet ILSVRC 2012 Dataset plus some bounding boxes.
3 |
4 | Some images have one or more bounding boxes associated with the label of the
5 | image. See details here: http://image-net.org/download-bboxes
6 |
7 | ImageNet is based upon WordNet 3.0. To uniquely identify a synset, we use
8 | "WordNet ID" (wnid), which is a concatenation of POS ( i.e. part of speech )
9 | and SYNSET OFFSET of WordNet. For more information, please refer to the
10 | WordNet documentation[http://wordnet.princeton.edu/wordnet/documentation/].
11 |
12 | "There are bounding boxes for over 3000 popular synsets available.
13 | For each synset, there are on average 150 images with bounding boxes."
14 |
15 | WARNING: Don't use for object detection; in this case all the bounding boxes
16 | of the image belong to just one class.
17 | """
18 | from __future__ import absolute_import
19 | from __future__ import division
20 | from __future__ import print_function
21 |
22 | import os
23 | from six.moves import urllib
24 | import tensorflow as tf
25 |
26 | from datasets import dataset_utils
27 |
28 | slim = tf.contrib.slim
29 |
30 | # TODO(nsilberman): Add tfrecord file type once the script is updated.
31 | _FILE_PATTERN = '%s-*'
32 |
33 | _SPLITS_TO_SIZES = {
34 | 'train': 1281167,
35 | 'validation': 1103, # low 844, medium 1103
36 | }
37 |
38 | _ITEMS_TO_DESCRIPTIONS = {
39 | 'image': 'A color image of varying height and width.',
40 | 'label': 'The label id of the image, integer between 0 and 999',
41 | 'label_text': 'The text of the label.',
42 | 'object/bbox': 'A list of bounding boxes.',
43 | 'object/label': 'A list of labels, one per each object.',
44 | }
45 |
46 | _NUM_CLASSES = 1001
47 |
48 |
49 | def create_readable_names_for_imagenet_labels():
50 | """Create a dict mapping label id to human readable string.
51 |
52 | Returns:
53 |     labels_to_names: dictionary where keys are integers from 0 to 1000
54 | and values are human-readable names.
55 |
56 | We retrieve a synset file, which contains a list of valid synset labels used
57 |   by the ILSVRC competition. There is one synset per line, e.g.:
58 | # n01440764
59 | # n01443537
60 | We also retrieve a synset_to_human_file, which contains a mapping from synsets
61 | to human-readable names for every synset in Imagenet. These are stored in a
62 | tsv format, as follows:
63 | # n02119247 black fox
64 | # n02119359 silver fox
65 | We assign each synset (in alphabetical order) an integer, starting from 1
66 | (since 0 is reserved for the background class).
67 |
68 | Code is based on
69 | https://github.com/tensorflow/models/blob/master/inception/inception/data/build_imagenet_data.py#L463
70 | """
71 |
72 | # pylint: disable=g-line-too-long
73 | base_url = 'https://raw.githubusercontent.com/tensorflow/models/master/inception/inception/data/'
74 | synset_url = '{}/imagenet_lsvrc_2015_synsets.txt'.format(base_url)
75 | synset_to_human_url = '{}/imagenet_metadata.txt'.format(base_url)
76 |
77 | filename, _ = urllib.request.urlretrieve(synset_url)
78 | synset_list = [s.strip() for s in open(filename).readlines()]
79 | num_synsets_in_ilsvrc = len(synset_list)
80 | assert num_synsets_in_ilsvrc == 1000
81 |
82 | filename, _ = urllib.request.urlretrieve(synset_to_human_url)
83 | synset_to_human_list = open(filename).readlines()
84 | num_synsets_in_all_imagenet = len(synset_to_human_list)
85 | assert num_synsets_in_all_imagenet == 21842
86 |
87 | synset_to_human = {}
88 | for s in synset_to_human_list:
89 | parts = s.strip().split('\t')
90 | assert len(parts) == 2
91 | synset = parts[0]
92 | human = parts[1]
93 | synset_to_human[synset] = human
94 |
95 | label_index = 1
96 | labels_to_names = {0: 'background'}
97 | for synset in synset_list:
98 | name = synset_to_human[synset]
99 | labels_to_names[label_index] = name
100 | label_index += 1
101 |
102 | return labels_to_names
103 |
104 |
105 | def get_split(split_name, dataset_dir, file_pattern=None, reader=None):
106 | """Gets a dataset tuple with instructions for reading ImageNet.
107 |
108 | Args:
109 | split_name: A train/test split name.
110 | dataset_dir: The base directory of the dataset sources.
111 | file_pattern: The file pattern to use when matching the dataset sources.
112 | It is assumed that the pattern contains a '%s' string so that the split
113 | name can be inserted.
114 | reader: The TensorFlow reader type.
115 |
116 | Returns:
117 | A `Dataset` namedtuple.
118 |
119 | Raises:
120 | ValueError: if `split_name` is not a valid train/test split.
121 | """
122 | if split_name not in _SPLITS_TO_SIZES:
123 | raise ValueError('split name %s was not recognized.' % split_name)
124 |
125 | if not file_pattern:
126 | file_pattern = _FILE_PATTERN
127 | file_pattern = os.path.join(dataset_dir, file_pattern % split_name)
128 |
129 | # Allowing None in the signature so that dataset_factory can use the default.
130 | if reader is None:
131 | reader = tf.TFRecordReader
132 |
133 | keys_to_features = {
134 | 'image/encoded': tf.FixedLenFeature(
135 | (), tf.string, default_value=''),
136 | 'image/format': tf.FixedLenFeature(
137 | (), tf.string, default_value='jpeg'),
138 | 'image/class/label': tf.FixedLenFeature(
139 | [], dtype=tf.int64, default_value=-1),
140 | 'image/class/text': tf.FixedLenFeature(
141 | [], dtype=tf.string, default_value=''),
142 | 'image/filename': tf.FixedLenFeature(
143 | [], dtype=tf.string, default_value=''),
144 | 'image/object/bbox/xmin': tf.VarLenFeature(
145 | dtype=tf.float32),
146 | 'image/object/bbox/ymin': tf.VarLenFeature(
147 | dtype=tf.float32),
148 | 'image/object/bbox/xmax': tf.VarLenFeature(
149 | dtype=tf.float32),
150 | 'image/object/bbox/ymax': tf.VarLenFeature(
151 | dtype=tf.float32),
152 | 'image/object/class/label': tf.VarLenFeature(
153 | dtype=tf.int64),
154 | }
155 |
156 | items_to_handlers = {
157 | 'image': slim.tfexample_decoder.Image('image/encoded', 'image/format'),
158 | 'label': slim.tfexample_decoder.Tensor('image/class/label'),
159 | 'filename': slim.tfexample_decoder.Tensor('image/filename'),
160 | 'label_text': slim.tfexample_decoder.Tensor('image/class/text'),
161 | 'object/bbox': slim.tfexample_decoder.BoundingBox(
162 | ['ymin', 'xmin', 'ymax', 'xmax'], 'image/object/bbox/'),
163 | 'object/label': slim.tfexample_decoder.Tensor('image/object/class/label'),
164 | }
165 |
166 | decoder = slim.tfexample_decoder.TFExampleDecoder(
167 | keys_to_features, items_to_handlers)
168 |
169 | labels_to_names = None
170 | if dataset_utils.has_labels(dataset_dir):
171 | labels_to_names = dataset_utils.read_label_file(dataset_dir)
172 | else:
173 | labels_to_names = create_readable_names_for_imagenet_labels()
174 | dataset_utils.write_label_file(labels_to_names, dataset_dir)
175 |
176 | return slim.dataset.Dataset(
177 | data_sources=file_pattern,
178 | reader=reader,
179 | decoder=decoder,
180 | num_samples=_SPLITS_TO_SIZES[split_name],
181 | items_to_descriptions=_ITEMS_TO_DESCRIPTIONS,
182 | num_classes=_NUM_CLASSES,
183 | labels_to_names=labels_to_names)
184 |
185 |
--------------------------------------------------------------------------------
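For reference, a minimal usage sketch of `raw.get_split` (not part of the repository). It assumes TensorFlow 1.x with `tf.contrib.slim`, and a hypothetical directory `/data/raw_validation` holding `validation-*` TFRecord shards.

```python
# Minimal sketch (TF 1.x / tf.contrib.slim); '/data/raw_validation' is a
# hypothetical directory holding 'validation-*' TFRecord shards.
import tensorflow as tf
from datasets import raw

slim = tf.contrib.slim

# Note: if no labels file exists in the directory, get_split may download the
# synset metadata to build the label map (see create_readable_names_for_imagenet_labels).
dataset = raw.get_split('validation', '/data/raw_validation')
provider = slim.dataset_data_provider.DatasetDataProvider(
    dataset, shuffle=False, num_readers=1)
image, label = provider.get(['image', 'label'])

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])
    with slim.queues.QueueRunners(sess):
        img_val, label_val = sess.run([image, label])
        print(img_val.shape, label_val)
```
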
/datasets/raw_metadata.txt:
--------------------------------------------------------------------------------
1 | n04116512 rubber eraser, rubber, pencil eraser
2 | n03995372 power drill
3 | n03983396 pop bottle, soda bottle
4 | n03291819 envelope
5 | n03063599 coffee mug
6 | n03891251 park bench
7 | n07753592 banana
8 | n02870880 bookcase
9 | n02965783 car mirror
10 | n02823428 beer bottle
11 | n02974003 car wheel
12 | n04254777 sock, socks
13 | n03085013 computer keyboard, keypad
14 | n03793489 mouse, computer mouse
15 | n02783161 ballpoint, ballpoint pen, ballpen, Biro
16 | n04485082 tripod
17 | n02877765 bottlecap
18 | n03792782 mountain bike, all-terrain bike, off-roader
19 | n03782006 monitor
20 | n04131690 saltshaker, salt shaker
21 | n03761084 microwave, microwave oven
22 | n04557648 water bottle
23 | n03208938 disk brake, disc brake, disk brakes
24 | n04507155 umbrella
25 | n02786058 Band Aid
26 | n04153751 screw
27 | n04548362 wallet, billfold, notecase, pocketbook
28 | n04254120 soap dispenser
29 | n04356056 sunglasses, dark glasses, shades
30 | n04548280 wall clock
31 | n04447861 toilet seat
32 | n03958227 plastic bag
33 | n03717622 manhole cover
34 | n03481172 hammer
35 | n15075141 toilet tissue, toilet paper, bathroom tissue
36 | n04004767 printer
37 | n03924679 photocopier
38 | n03657121 lens cap, lens cover
39 | n04118776 rule, ruler
40 | n04009552 projector
41 | n03857828 oscilloscope, scope, cathode-ray oscilloscope, CRO
42 | n03492542 hard disc, hard disk, fixed disk
43 | n03388183 fountain pen
44 | n02840245 binder, ring-binder
45 | n02769748 backpack, back pack, knapsack, packsack, rucksack, haversack
46 | n03832673 notebook, notebook computer
47 | n03297495 espresso maker
48 | n02782093 balloon
49 | n03887697 paper towel
50 | n04069434 reflex camera
51 | n03180011 desktop computer
52 | n03179701 desk
53 | n02992529 cellular telephone, cellular phone, cellphone, cell, mobile phone
54 | n03637318 lampshade, lamp shade
55 | n03929660 pick, plectrum, plectron
56 | n03445777 golf ball
57 | n03666591 lighter, light, igniter, ignitor
58 | n04591713 wine bottle
59 | n02747177 trash can
60 | n04409515 tennis ball
61 | n03223299 doormat
62 | n04554684 washing machine
63 | n04557648 water bottle
64 | n04553703 washbasin
65 |
66 |
--------------------------------------------------------------------------------
/datasets/synset_labels.txt:
--------------------------------------------------------------------------------
1 | n02823428:441
2 | n04507155:880
3 | n04485082:873
4 | n04557648:899
5 | n03782006:665
6 | n02769748:415
7 | n03291819:550
8 | n03793489:674
9 | n03085013:509
10 | n02840245:447
11 | n02974003:480
12 | n03887697:701
13 | n03761084:652
14 | n04153751:784
15 | n02870880:454
16 | n03924679:714
17 | n03983396:738
18 | n15075141:1000
19 | n03717622:641
20 | n03208938:536
21 | n04254120:805
22 | n03063599:505
23 | n03637318:620
24 | n03891251:704
25 | n03832673:682
26 | n04356056:838
27 | n04118776:770
28 | n04009552:746
29 | n02965783:476
30 | n04004767:743
31 | n03180011:528
32 | n07753592:955
33 | n04548280:893
34 | n04447861:862
35 | n04548362:894
36 | n03995372:741
37 | n03857828:689
38 | n02783161:419
39 | n03388183:564
40 | n03297495:551
41 | n03085013:509
42 | n03657121:623
43 | n07753592:955
44 | n04548280:893
45 | n04254777:807
46 | n02783161:419
47 | n03887697:701
48 | n02823428:441
49 | n04447861:862
50 | n03782006:665
51 | n03063599:505
52 | n15075141:1000
53 | n03891251:704
54 | n04153751:784
55 | n03857828:689
56 | n03291819:550
57 | n02786058:420
58 | n04548362:894
59 | n04118776:770
60 | n02974003:480
61 | n03761084:652
62 | n04485082:873
63 | n04254120:805
64 | n03924679:714
65 | n03637318:620
66 | n02769748:415
67 | n02870880:454
68 | n03793489:674
69 | n03995372:741
70 | n03717622:641
71 | n02965783:476
72 | n03887697:701
73 | n03717622:641
74 | n03063599:505
75 | n04507155:880
76 | n04447861:862
77 | n07753592:955
78 | n04153751:784
79 | n15075141:1000
80 | n03793489:674
81 | n03782006:665
82 | n03291819:550
83 | n02870880:454
84 | n04254120:805
85 | n04118776:770
86 | n03657121:623
87 | n03208938:536
88 | n03983396:738
89 | n02783161:419
90 | n03085013:509
91 | n04548280:893
92 | n03297495:551
93 | n04485082:873
94 | n04004767:743
95 | n03857828:689
96 | n02974003:480
97 | n04548362:894
98 | n03761084:652
99 | n02769748:415
100 | n03891251:704
101 | n04254777:807
102 | n03924679:714
103 | n03995372:741
104 | n02823428:441
105 | n03832673:682
106 | n02786058:420
107 | n04557648:899
108 | n02965783:476
109 | n02840245:447
110 | n04009552:746
111 | n04356056:838
112 | n03637318:620
113 | n03180011:528
114 | n03388183:564
115 | n03929660:715
116 | n04591713:908
117 | n03445777:575
118 | n03666591:627
119 | n02747177:413
120 | n04409515:853
121 | n03223299:540
122 | n04554684:898
123 | n04557648:899
124 | n04553703:897
125 | n04116512:768
126 |
--------------------------------------------------------------------------------
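Each `synset:integer` line above pairs a WordNet ID with what appears to be its 1-based ImageNet label id (matching the ids produced by `create_readable_names_for_imagenet_labels`). A small parsing sketch, not part of the repository; `load_synset_labels` is a hypothetical helper name.

```python
# Sketch: parse datasets/synset_labels.txt into {wnid: label_id}.
# 'load_synset_labels' is a hypothetical helper, not a repository function.
def load_synset_labels(path='datasets/synset_labels.txt'):
    wnid_to_label = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            wnid, label = line.split(':')
            wnid_to_label[wnid] = int(label)  # duplicate wnids carry the same id
    return wnid_to_label

print(load_synset_labels().get('n07753592'))  # banana -> 955
```
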
/datasets/training_synsets.txt:
--------------------------------------------------------------------------------
1 | n01440764
2 | n01443537
3 | n01484850
4 | n01491361
5 | n01494475
6 | n01496331
7 | n01498041
8 | n01514668
9 | n01514859
10 | n01518878
11 | n01530575
12 | n01531178
13 | n01532829
14 | n01534433
15 | n01537544
16 | n01558993
17 | n01560419
18 | n01580077
19 | n01582220
20 | n01592084
21 | n01601694
22 | n01608432
23 | n01614925
24 | n01616318
25 | n01622779
26 | n01629819
27 | n01630670
28 | n01631663
29 | n01632458
30 | n01632777
31 | n01641577
32 | n01644373
33 | n01644900
34 | n01664065
35 | n01665541
36 | n01667114
37 | n01667778
38 | n01669191
39 | n01675722
40 | n01677366
41 | n01682714
42 | n01685808
43 | n01687978
44 | n01688243
45 | n01689811
46 | n01692333
47 | n01693334
48 | n01694178
49 | n01695060
50 | n01697457
51 | n01698640
52 | n01704323
53 | n01728572
54 | n01728920
55 | n01729322
56 | n01729977
57 | n01734418
58 | n01735189
59 | n01737021
60 | n01739381
61 | n01740131
62 | n01742172
63 | n01744401
64 | n01748264
65 | n01749939
66 | n01751748
67 | n01753488
68 | n01755581
69 | n01756291
70 | n01768244
71 | n01770081
72 | n01770393
73 | n01773157
74 | n01773549
75 | n01773797
76 | n01774384
77 | n01774750
78 | n01775062
79 | n01776313
80 | n01784675
81 | n01795545
82 | n01796340
83 | n01797886
84 | n01798484
85 | n01806143
86 | n01806567
87 | n01807496
88 | n01817953
89 | n01818515
90 | n01819313
91 | n01820546
92 | n01824575
93 | n01828970
94 | n01829413
95 | n01833805
96 | n01843065
97 | n01843383
98 | n01847000
99 | n01855032
100 | n01855672
101 | n01860187
102 | n01871265
103 | n01872772
104 | n01873310
105 | n01877812
106 | n01882714
107 | n01883070
108 | n01910747
109 | n01914609
110 | n01917289
111 | n01924916
112 | n01930112
113 | n01943899
114 | n01944390
115 | n07922607
116 | n01950731
117 | n01955084
118 | n01968897
119 | n01978287
120 | n01978455
121 | n01980166
122 | n01981276
123 | n01983481
124 | n01984695
125 | n01985128
126 | n01986214
127 | n01990800
128 | n02002556
129 | n02002724
130 | n02006656
131 | n02007558
132 | n02009229
133 | n02009912
134 | n02011460
135 | n02012849
136 | n02013706
137 | n02017213
138 | n02018207
139 | n02018795
140 | n02025239
141 | n02027492
142 | n02028035
143 | n02033041
144 | n02037110
145 | n02051845
146 | n02056570
147 | n02058221
148 | n02066245
149 | n02071294
150 | n02074367
151 | n02077923
152 | n02085620
153 | n02085782
154 | n02085936
155 | n02086079
156 | n02086240
157 | n02086646
158 | n02086910
159 | n02087046
160 | n02087394
161 | n02088094
162 | n02088238
163 | n02088364
164 | n02088466
165 | n02088632
166 | n02089078
167 | n02089867
168 | n02089973
169 | n02090379
170 | n02090622
171 | n02090721
172 | n02091032
173 | n02091134
174 | n02091244
175 | n02091467
176 | n02091635
177 | n02091831
178 | n02092002
179 | n02092339
180 | n02093256
181 | n02093428
182 | n02093647
183 | n02093754
184 | n02093859
185 | n02093991
186 | n02094114
187 | n02094258
188 | n02094433
189 | n02095314
190 | n02095570
191 | n02095889
192 | n02096051
193 | n02096177
194 | n02096294
195 | n02096437
196 | n02096585
197 | n02097047
198 | n02097130
199 | n02097209
200 | n02097298
201 | n02097474
202 | n02097658
203 | n02098105
204 | n02098286
205 | n02098413
206 | n02099267
207 | n02099429
208 | n02099601
209 | n02099712
210 | n02099849
211 | n02100236
212 | n02100583
213 | n02100735
214 | n02100877
215 | n02101006
216 | n02101388
217 | n02101556
218 | n02102040
219 | n02102177
220 | n02102318
221 | n02102480
222 | n02102973
223 | n02104029
224 | n02104365
225 | n02105056
226 | n02105162
227 | n02105251
228 | n02105412
229 | n02105505
230 | n02105641
231 | n02105855
232 | n02106030
233 | n02106166
234 | n02106382
235 | n02106550
236 | n02106662
237 | n02107142
238 | n02107312
239 | n02107574
240 | n02107683
241 | n02107908
242 | n02108000
243 | n02108089
244 | n02108422
245 | n02108551
246 | n02108915
247 | n02109047
248 | n02109525
249 | n02109961
250 | n02110063
251 | n02110185
252 | n02110341
253 | n02110627
254 | n02110806
255 | n02110958
256 | n02111129
257 | n02111277
258 | n02111500
259 | n02111889
260 | n02112018
261 | n02112137
262 | n02112350
263 | n02112706
264 | n02113023
265 | n02113186
266 | n02113624
267 | n02113712
268 | n02113799
269 | n02113978
270 | n02114367
271 | n02114548
272 | n02114712
273 | n02114855
274 | n02115641
275 | n02115913
276 | n02116738
277 | n02117135
278 | n02119022
279 | n02119789
280 | n02120079
281 | n02120505
282 | n02123045
283 | n02123159
284 | n02123394
285 | n02123597
286 | n02124075
287 | n02125311
288 | n02127052
289 | n02128385
290 | n02128757
291 | n02128925
292 | n02129165
293 | n02129604
294 | n02130308
295 | n02132136
296 | n02133161
297 | n02134084
298 | n02134418
299 | n02137549
300 | n02138441
301 | n02165105
302 | n02165456
303 | n02167151
304 | n02168699
305 | n02169497
306 | n02172182
307 | n02174001
308 | n02177972
309 | n03373237
310 | n02206856
311 | n02219486
312 | n02226429
313 | n02229544
314 | n02231487
315 | n02233338
316 | n02236044
317 | n02256656
318 | n02259212
319 | n02264363
320 | n02268443
321 | n02268853
322 | n02276258
323 | n02277742
324 | n02279972
325 | n02280649
326 | n02281406
327 | n02281787
328 | n02317335
329 | n02319095
330 | n02321529
331 | n02325366
332 | n02326432
333 | n02328150
334 | n02342885
335 | n02346627
336 | n02356798
337 | n02361337
338 | n02818254
339 | n02364673
340 | n02389026
341 | n02391049
342 | n02395406
343 | n02396427
344 | n02397096
345 | n02398521
346 | n02403003
347 | n02408429
348 | n02410509
349 | n02412080
350 | n02415577
351 | n02417914
352 | n02422106
353 | n02422699
354 | n02423022
355 | n02437312
356 | n02437616
357 | n02441942
358 | n02442845
359 | n02443114
360 | n02443484
361 | n02444819
362 | n02445715
363 | n02447366
364 | n02454379
365 | n02457408
366 | n02480495
367 | n02480855
368 | n02481823
369 | n02483362
370 | n02483708
371 | n02484975
372 | n02486261
373 | n02486410
374 | n02487347
375 | n02488291
376 | n02488702
377 | n02489166
378 | n02490219
379 | n02492035
380 | n02492660
381 | n02493509
382 | n02493793
383 | n02494079
384 | n02497673
385 | n02500267
386 | n02504013
387 | n02504458
388 | n02509815
389 | n02510455
390 | n02514041
391 | n02526121
392 | n02536864
393 | n02606052
394 | n02607072
395 | n02640242
396 | n02641379
397 | n02643566
398 | n02655020
399 | n02666196
400 | n02667093
401 | n02669723
402 | n02672831
403 | n02676566
404 | n02687172
405 | n02690373
406 | n02692877
407 | n02699494
408 | n02701002
409 | n02704792
410 | n02708093
411 | n02727426
412 | n08496334
413 | n02747177
414 | n02749479
415 | n02769748
416 | n02776631
417 | n02777292
418 | n02782093
419 | n02783161
420 | n02786058
421 | n02787622
422 | n02788148
423 | n02790996
424 | n02791124
425 | n02791270
426 | n02793495
427 | n02794156
428 | n02795169
429 | n02797295
430 | n02799071
431 | n02802426
432 | n02804515
433 | n02804610
434 | n02807133
435 | n02808304
436 | n02808440
437 | n02814533
438 | n02814860
439 | n02815834
440 | n02817516
441 | n02823428
442 | n02823750
443 | n02825657
444 | n02834397
445 | n02835271
446 | n02837789
447 | n02840245
448 | n02841315
449 | n02843684
450 | n02859443
451 | n02860847
452 | n02865351
453 | n02869837
454 | n02870880
455 | n02871525
456 | n02877765
457 | n02879718
458 | n02883205
459 | n02892201
460 | n02892767
461 | n02894605
462 | n02895154
463 | n02906734
464 | n02909870
465 | n02910353
466 | n02916936
467 | n02917067
468 | n02927161
469 | n02930766
470 | n02939185
471 | n02948072
472 | n02950826
473 | n02951358
474 | n02951585
475 | n02963159
476 | n02965783
477 | n02966193
478 | n02966687
479 | n02971356
480 | n02974003
481 | n02977058
482 | n02978881
483 | n02979186
484 | n02980441
485 | n02981792
486 | n02988304
487 | n02992211
488 | n02992529
489 | n02999936
490 | n03000134
491 | n03000247
492 | n03000684
493 | n03014705
494 | n03016953
495 | n03017168
496 | n03018349
497 | n03026506
498 | n03028079
499 | n03032252
500 | n03041632
501 | n03042490
502 | n03045698
503 | n03047690
504 | n03062245
505 | n03063599
506 | n03063689
507 | n03065424
508 | n03075370
509 | n03085013
510 | n03089624
511 | n03095699
512 | n03100240
513 | n03109150
514 | n03110669
515 | n03124043
516 | n03124170
517 | n03125729
518 | n03126707
519 | n03127747
520 | n03127925
521 | n03131574
522 | n03133878
523 | n03134739
524 | n03141823
525 | n03146219
526 | n03160309
527 | n03179701
528 | n03180011
529 | n03187595
530 | n03188531
531 | n03196217
532 | n03197337
533 | n03201208
534 | n03207743
535 | n03207941
536 | n03208938
537 | n03216828
538 | n03218198
539 | n03220513
540 | n03223299
541 | n03240683
542 | n03249569
543 | n03250847
544 | n03255030
545 | n03259401
546 | n03271574
547 | n03272010
548 | n03272562
549 | n03290653
550 | n13869788
551 | n03297495
552 | n03314780
553 | n03325584
554 | n03337140
555 | n03344393
556 | n03345487
557 | n03347037
558 | n03355925
559 | n03372029
560 | n03376595
561 | n03379051
562 | n03384352
563 | n03388043
564 | n03388183
565 | n03388549
566 | n03393912
567 | n03394916
568 | n03400231
569 | n03404251
570 | n03417042
571 | n03424325
572 | n03425413
573 | n03443371
574 | n03444034
575 | n03445777
576 | n03445924
577 | n03447447
578 | n03447721
579 | n03450230
580 | n03452741
581 | n03457902
582 | n03459775
583 | n03461385
584 | n03467068
585 | n03476684
586 | n03476991
587 | n03478589
588 | n03482001
589 | n03482405
590 | n03483316
591 | n03485407
592 | n03485794
593 | n03492542
594 | n03494278
595 | n03495570
596 | n03496892
597 | n03498962
598 | n03527565
599 | n03529860
600 | n09218315
601 | n03532672
602 | n03534580
603 | n03535780
604 | n03538406
605 | n03544143
606 | n03584254
607 | n03584829
608 | n03590841
609 | n03594734
610 | n03594945
611 | n03595614
612 | n03598930
613 | n03599486
614 | n03602883
615 | n03617480
616 | n03623198
617 | n03627232
618 | n03630383
619 | n03633091
620 | n03637318
621 | n03642806
622 | n03649909
623 | n03657121
624 | n03658185
625 | n07977870
626 | n03662601
627 | n03666591
628 | n03670208
629 | n03673027
630 | n03676483
631 | n03680355
632 | n03690938
633 | n03691459
634 | n03692522
635 | n03697007
636 | n03706229
637 | n03709823
638 | n03710193
639 | n03710637
640 | n03710721
641 | n03717622
642 | n03720891
643 | n03721384
644 | n03725035
645 | n03729826
646 | n03733131
647 | n03733281
648 | n03733805
649 | n03742115
650 | n03743016
651 | n03759954
652 | n03761084
653 | n03763968
654 | n03764736
655 | n03769881
656 | n03770439
657 | n03770679
658 | n03773504
659 | n03775071
660 | n03775546
661 | n03776460
662 | n03777568
663 | n03777754
664 | n03781244
665 | n03782006
666 | n03785016
667 | n03786901
668 | n03787032
669 | n03788195
670 | n03788365
671 | n03791053
672 | n03792782
673 | n03792972
674 | n03793489
675 | n03794056
676 | n03796401
677 | n03803284
678 | n03804744
679 | n03814639
680 | n03814906
681 | n03825788
682 | n03832673
683 | n03837869
684 | n03838899
685 | n03840681
686 | n03841143
687 | n03843555
688 | n03854065
689 | n03857828
690 | n03866082
691 | n03868242
692 | n03868863
693 | n03871628
694 | n03873416
695 | n03874293
696 | n03874599
697 | n03876231
698 | n03877472
699 | n03878211
700 | n03884397
701 | n03887697
702 | n03888257
703 | n03888605
704 | n03891251
705 | n03891332
706 | n03895866
707 | n03899768
708 | n03902125
709 | n03903868
710 | n03908618
711 | n03908714
712 | n03916031
713 | n03920288
714 | n03924679
715 | n03929660
716 | n03929855
717 | n03930313
718 | n03930630
719 | n03934042
720 | n03935335
721 | n03937543
722 | n03938244
723 | n03942813
724 | n03944341
725 | n03947888
726 | n03950228
727 | n03954731
728 | n03956157
729 | n03958227
730 | n03961711
731 | n03967562
732 | n03970156
733 | n03976467
734 | n03977158
735 | n03977966
736 | n03980874
737 | n03982430
738 | n03983396
739 | n03991062
740 | n03992509
741 | n03995372
742 | n03998194
743 | n04004767
744 | n04005630
745 | n04008634
746 | n04009801
747 | n04019541
748 | n04023962
749 | n04026417
750 | n04033901
751 | n04033995
752 | n04037443
753 | n04039381
754 | n09403211
755 | n04041544
756 | n04044716
757 | n04049303
758 | n04065272
759 | n04067658
760 | n04069434
761 | n04070727
762 | n04074963
763 | n04081281
764 | n04086273
765 | n04090263
766 | n04099969
767 | n04111531
768 | n04116512
769 | n04118538
770 | n04118776
771 | n04120489
772 | n04125116
773 | n04127249
774 | n04131690
775 | n04133789
776 | n04136333
777 | n04141076
778 | n04141327
779 | n04141975
780 | n04146614
781 | n04147291
782 | n04149813
783 | n04152593
784 | n04154340
785 | n07917272
786 | n04162706
787 | n04179913
788 | n04192698
789 | n04200800
790 | n04201297
791 | n04204238
792 | n04204347
793 | n04208427
794 | n04209133
795 | n04209239
796 | n04228054
797 | n04229816
798 | n04235860
799 | n04238763
800 | n04239074
801 | n04243546
802 | n04251144
803 | n04252077
804 | n04252225
805 | n04254120
806 | n04254680
807 | n04254777
808 | n04258138
809 | n04259630
810 | n04263257
811 | n04264628
812 | n04265275
813 | n04266014
814 | n04270147
815 | n04273569
816 | n04275548
817 | n04277669
818 | n04285008
819 | n04286575
820 | n04296562
821 | n04310018
822 | n04311004
823 | n04311174
824 | n04317175
825 | n04325704
826 | n04326547
827 | n04328186
828 | n04330267
829 | n04332243
830 | n04335435
831 | n04337157
832 | n04344873
833 | n04346328
834 | n04347754
835 | n04350905
836 | n04355338
837 | n04355933
838 | n04356056
839 | n04357314
840 | n04366367
841 | n04367480
842 | n04370456
843 | n04371430
844 | n04371774
845 | n04372370
846 | n04376876
847 | n04380533
848 | n04389033
849 | n04392985
850 | n04398044
851 | n04399382
852 | n04404412
853 | n04409515
854 | n04417672
855 | n04418357
856 | n04423845
857 | n04428191
858 | n04429376
859 | n04435653
860 | n04442312
861 | n04443257
862 | n04447861
863 | n04456115
864 | n04458633
865 | n04461696
866 | n04462240
867 | n04465666
868 | n04467665
869 | n04476259
870 | n04479046
871 | n04482393
872 | n04483307
873 | n04485082
874 | n04486054
875 | n04487081
876 | n04487394
877 | n04493381
878 | n04501370
879 | n04505470
880 | n04507155
881 | n04509417
882 | n04515003
883 | n04517823
884 | n04522168
885 | n04523525
886 | n04525038
887 | n04525305
888 | n04532106
889 | n04532670
890 | n04536866
891 | n04540053
892 | n04542943
893 | n04548280
894 | n04548362
895 | n04550184
896 | n04552348
897 | n04553703
898 | n04554684
899 | n04557648
900 | n04560804
901 | n04562935
902 | n04579145
903 | n04579667
904 | n04584207
905 | n04589890
906 | n04590129
907 | n04591157
908 | n04591713
909 | n04592741
910 | n04596742
911 | n04597913
912 | n04599235
913 | n04604644
914 | n04606251
915 | n04612504
916 | n04613696
917 | n06359193
918 | n06596364
919 | n06785654
920 | n06794110
921 | n06874185
922 | n07248320
923 | n07565083
924 | n07579787
925 | n07583066
926 | n07584110
927 | n07590611
928 | n07613480
929 | n07614500
930 | n07615774
931 | n07684084
932 | n07693725
933 | n07695742
934 | n07697313
935 | n07697537
936 | n07711569
937 | n07714571
938 | n07714990
939 | n07715103
940 | n12159804
941 | n12160303
942 | n12160857
943 | n07717556
944 | n07718472
945 | n07718747
946 | n07720875
947 | n07730033
948 | n13001041
949 | n07742313
950 | n07745940
951 | n07747607
952 | n07749582
953 | n07753113
954 | n07753275
955 | n07753592
956 | n07754684
957 | n07760859
958 | n07768694
959 | n07802026
960 | n07831146
961 | n07836838
962 | n07860988
963 | n07871810
964 | n07873807
965 | n07875152
966 | n07880968
967 | n07892512
968 | n07920052
969 | n07930864
970 | n07932039
971 | n09193705
972 | n09229709
973 | n09246464
974 | n09256479
975 | n09288635
976 | n09332890
977 | n09399592
978 | n09421951
979 | n09428293
980 | n09468604
981 | n09472597
982 | n09835506
983 | n10148035
984 | n10565667
985 | n11879895
986 | n11939491
987 | n12057211
988 | n12144580
989 | n12267677
990 | n12620546
991 | n12768682
992 | n12985857
993 | n12998815
994 | n13037406
995 | n13040303
996 | n13044778
997 | n13052670
998 | n13054560
999 | n13133613
1000 | n15075141
1001 |
--------------------------------------------------------------------------------
/deployment/__init__.py:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/environment.yml:
--------------------------------------------------------------------------------
1 | name: dirtypix
2 | channels:
3 | - defaults
4 | dependencies:
5 | - _libgcc_mutex=0.1=main
6 | - _tflow_select=2.1.0=gpu
7 | - absl-py=0.12.0=py36h06a4308_0
8 | - astor=0.8.1=py36h06a4308_0
9 | - blas=1.0=mkl
10 | - blosc=1.21.0=h8c45485_0
11 | - brotli=1.0.9=he6710b0_2
12 | - bzip2=1.0.8=h7b6447c_0
13 | - c-ares=1.17.1=h27cfd23_0
14 | - ca-certificates=2021.4.13=h06a4308_1
15 | - certifi=2020.12.5=py36h06a4308_0
16 | - charls=2.1.0=he6710b0_2
17 | - cloudpickle=1.6.0=py_0
18 | - coverage=5.5=py36h27cfd23_2
19 | - cudatoolkit=9.2=0
20 | - cudnn=7.6.5=cuda9.2_0
21 | - cupti=9.2.148=0
22 | - cycler=0.10.0=py36_0
23 | - cython=0.29.23=py36h2531618_0
24 | - cytoolz=0.11.0=py36h7b6447c_0
25 | - dask-core=2021.3.0=pyhd3eb1b0_0
26 | - decorator=5.0.6=pyhd3eb1b0_0
27 | - freetype=2.10.4=h5ab3b9f_0
28 | - gast=0.4.0=py_0
29 | - giflib=5.1.4=h14c3975_1
30 | - grpcio=1.36.1=py36h2157cd5_1
31 | - h5py=2.10.0=py36hd6299e0_1
32 | - hdf5=1.10.6=hb1b8bf9_0
33 | - imagecodecs=2020.5.30=py36hfa7d478_2
34 | - imageio=2.9.0=pyhd3eb1b0_0
35 | - importlib-metadata=3.10.0=py36h06a4308_0
36 | - intel-openmp=2021.2.0=h06a4308_610
37 | - jpeg=9b=h024ee3a_2
38 | - jxrlib=1.1=h7b6447c_2
39 | - keras-applications=1.0.8=py_1
40 | - keras-preprocessing=1.1.2=pyhd3eb1b0_0
41 | - kiwisolver=1.3.1=py36h2531618_0
42 | - lcms2=2.12=h3be6417_0
43 | - ld_impl_linux-64=2.33.1=h53a641e_7
44 | - libaec=1.0.4=he6710b0_1
45 | - libffi=3.3=he6710b0_2
46 | - libgcc-ng=9.1.0=hdf63c60_0
47 | - libgfortran-ng=7.3.0=hdf63c60_0
48 | - libpng=1.6.37=hbc83047_0
49 | - libprotobuf=3.14.0=h8c45485_0
50 | - libstdcxx-ng=9.1.0=hdf63c60_0
51 | - libtiff=4.1.0=h2733197_1
52 | - libwebp=1.0.1=h8e7db2f_0
53 | - libzopfli=1.0.3=he6710b0_0
54 | - lz4-c=1.9.3=h2531618_0
55 | - markdown=3.3.4=py36h06a4308_0
56 | - matplotlib-base=3.3.4=py36h62a2d02_0
57 | - mkl=2020.2=256
58 | - mkl-service=2.3.0=py36he8ac12f_0
59 | - mkl_fft=1.3.0=py36h54f3939_0
60 | - mkl_random=1.1.1=py36h0573a6f_0
61 | - ncurses=6.2=he6710b0_1
62 | - networkx=2.5=py_0
63 | - numpy=1.19.2=py36h54aff64_0
64 | - numpy-base=1.19.2=py36hfa32c7d_0
65 | - olefile=0.46=py_0
66 | - openjpeg=2.3.0=h05c96fa_1
67 | - openssl=1.1.1k=h27cfd23_0
68 | - pillow=8.2.0=py36he98fc37_0
69 | - pip=21.0.1=py36h06a4308_0
70 | - protobuf=3.14.0=py36h2531618_1
71 | - pyparsing=2.4.7=pyhd3eb1b0_0
72 | - python=3.6.13=hdb3f193_0
73 | - python-dateutil=2.8.1=pyhd3eb1b0_0
74 | - pywavelets=1.1.1=py36h7b6447c_2
75 | - pyyaml=5.4.1=py36h27cfd23_1
76 | - readline=8.1=h27cfd23_0
77 | - scikit-image=0.17.2=py36hdf5156a_0
78 | - scipy=1.5.2=py36h0b6359f_0
79 | - setuptools=52.0.0=py36h06a4308_0
80 | - six=1.15.0=pyhd3eb1b0_0
81 | - snappy=1.1.8=he6710b0_0
82 | - sqlite=3.35.4=hdfb4753_0
83 | - tensorboard=1.12.2=py36he6710b0_0
84 | - tensorflow=1.12.0=gpu_py36he74679b_0
85 | - tensorflow-base=1.12.0=gpu_py36had579c0_0
86 | - tensorflow-gpu=1.12.0=h0d30ee6_0
87 | - termcolor=1.1.0=py36h06a4308_1
88 | - tifffile=2021.3.31=pyhd3eb1b0_1
89 | - tk=8.6.10=hbc83047_0
90 | - toolz=0.11.1=pyhd3eb1b0_0
91 | - tornado=6.1=py36h27cfd23_0
92 | - typing_extensions=3.7.4.3=pyha847dfd_0
93 | - werkzeug=1.0.1=pyhd3eb1b0_0
94 | - wheel=0.36.2=pyhd3eb1b0_0
95 | - xz=5.2.5=h7b6447c_0
96 | - yaml=0.2.5=h7b6447c_0
97 | - zipp=3.4.1=pyhd3eb1b0_0
98 | - zlib=1.2.11=h7b6447c_3
99 | - zstd=1.4.5=h9ceee32_0
100 | prefix: /home/frank.julca-aguilar/anaconda3/envs/dirtypix
101 |
--------------------------------------------------------------------------------
/loss_functions/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/princeton-computational-imaging/DirtyPixels/6c82b124c9e32bbf5fa7d6adf8db8103132e4e5e/loss_functions/__init__.py
--------------------------------------------------------------------------------
/loss_functions/loss_factory.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Contains a factory for building various models."""
16 |
17 | from __future__ import absolute_import
18 | from __future__ import division
19 | from __future__ import print_function
20 |
21 | import tensorflow as tf
22 |
23 | # Note: this factory only wraps the slim losses defined below; it does not
24 | # depend on any preprocessing module, so none are imported here (the
25 | # cifarnet/lenet/vgg preprocessing modules referenced by the original slim
26 | # code are not part of this repository).
27 |
28 | slim = tf.contrib.slim
29 |
30 |
31 |
32 | def get_loss(name):
33 | """Returns loss_fn(outputs, ground_truths, **kwargs), where "outputs" are the model outputs.
34 |
35 | Args:
36 | name: The name of the loss function.
37 |
38 | Returns:
39 |     loss_fn: A function that computes the loss between the outputs and the ground truths.
40 |
41 | Raises:
42 |     ValueError: If loss function `name` is not recognized.
43 | """
44 | loss_fn_map = {
45 | 'mean_squared_error':slim.losses.mean_squared_error,
46 | 'absolute_difference':slim.losses.absolute_difference
47 | }
48 |
49 | if name not in loss_fn_map:
50 | raise ValueError('Loss function name [%s] was not recognized' % name)
51 |
52 | def loss_fn(outputs, ground_truths, **kwargs):
53 | return loss_fn_map[name](
54 | outputs, ground_truths, **kwargs)
55 |
56 | return loss_fn
57 |
--------------------------------------------------------------------------------
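A minimal sketch of how `get_loss` is typically wired into a graph (TF 1.x; the tensor shapes below are illustrative only, not values prescribed by the repository).

```python
# Minimal sketch (TF 1.x); the shapes below are illustrative only.
import tensorflow as tf
from loss_functions import loss_factory

outputs = tf.placeholder(tf.float32, [None, 224, 224, 3])        # e.g. ISP output
ground_truths = tf.placeholder(tf.float32, [None, 224, 224, 3])  # clean targets

loss_fn = loss_factory.get_loss('mean_squared_error')
loss = loss_fn(outputs, ground_truths)   # registered in the slim losses collection
total_loss = tf.losses.get_total_loss()  # picks up this loss plus regularizers
```
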
/nets/__init__.py:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/nets/inception.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Brings all inception models under one namespace."""
16 |
17 | from __future__ import absolute_import
18 | from __future__ import division
19 | from __future__ import print_function
20 |
21 | # pylint: disable=unused-import
22 | from nets.inception_resnet_v2 import inception_resnet_v2
23 | from nets.inception_resnet_v2 import inception_resnet_v2_arg_scope
24 | from nets.inception_v1 import inception_v1
25 | from nets.inception_v1 import inception_v1_arg_scope
26 | from nets.inception_v1 import inception_v1_base
27 | from nets.inception_v2 import inception_v2
28 | from nets.inception_v2 import inception_v2_arg_scope
29 | from nets.inception_v2 import inception_v2_base
30 | from nets.inception_v3 import inception_v3
31 | from nets.inception_v3 import inception_v3_arg_scope
32 | from nets.inception_v3 import inception_v3_base
33 | from nets.inception_v4 import inception_v4
34 | from nets.inception_v4 import inception_v4_arg_scope
35 | from nets.inception_v4 import inception_v4_base
36 | # pylint: enable=unused-import
37 |
--------------------------------------------------------------------------------
/nets/inception_utils.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Contains common code shared by all inception models.
16 |
17 | Usage of arg scope:
18 | with slim.arg_scope(inception_arg_scope()):
19 | logits, end_points = inception.inception_v3(images, num_classes,
20 | is_training=is_training)
21 |
22 | """
23 | from __future__ import absolute_import
24 | from __future__ import division
25 | from __future__ import print_function
26 |
27 | import tensorflow as tf
28 |
29 | slim = tf.contrib.slim
30 |
31 |
32 | def inception_arg_scope(weight_decay=0.00004,
33 | use_batch_norm=True,
34 | batch_norm_decay=0.9997,
35 | batch_norm_epsilon=0.001):
36 | """Defines the default arg scope for inception models.
37 |
38 | Args:
39 | weight_decay: The weight decay to use for regularizing the model.
40 |     use_batch_norm: If `True`, batch_norm is applied after each convolution.
41 | batch_norm_decay: Decay for batch norm moving average.
42 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero
43 | in batch norm.
44 |
45 | Returns:
46 | An `arg_scope` to use for the inception models.
47 | """
48 | print("weight decay = ", weight_decay)
49 | batch_norm_params = {
50 | # Decay for the moving averages.
51 | 'decay': batch_norm_decay,
52 | # epsilon to prevent 0s in variance.
53 | 'epsilon': batch_norm_epsilon,
54 | # collection containing update_ops.
55 | 'updates_collections': tf.GraphKeys.UPDATE_OPS,
56 | }
57 | if use_batch_norm:
58 | normalizer_fn = slim.batch_norm
59 | normalizer_params = batch_norm_params
60 | else:
61 | normalizer_fn = None
62 | normalizer_params = {}
63 | # Set weight_decay for weights in Conv and FC layers.
64 | with slim.arg_scope([slim.conv2d, slim.fully_connected],
65 | weights_regularizer=slim.l2_regularizer(weight_decay)):
66 | with slim.arg_scope(
67 | [slim.conv2d],
68 | weights_initializer=slim.variance_scaling_initializer(),
69 | activation_fn=tf.nn.relu,
70 | normalizer_fn=normalizer_fn,
71 | normalizer_params=normalizer_params) as sc:
72 | return sc
73 |
--------------------------------------------------------------------------------
/nets/isp.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Contains the definition of the Inception V4 architecture.
16 |
17 | As described in http://arxiv.org/abs/1602.07261.
18 |
19 | Inception-v4, Inception-ResNet and the Impact of Residual Connections
20 | on Learning
21 | Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi
22 | """
23 | from __future__ import absolute_import
24 | from __future__ import division
25 | from __future__ import print_function
26 |
27 | import tensorflow as tf
28 |
29 | from nets import inception_utils
30 |
31 | slim = tf.contrib.slim
32 |
33 | def isp_arg_scope(weight_decay=0.00004,
34 | use_batch_norm=True,
35 | batch_norm_decay=0.95,
36 | batch_norm_epsilon=0.0001):
37 | """Defines the default arg scope for inception models.
38 |
39 | Args:
40 | weight_decay: The weight decay to use for regularizing the model.
41 |     use_batch_norm: If `True`, batch_norm is applied after each convolution.
42 | batch_norm_decay: Decay for batch norm moving average.
43 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero
44 | in batch norm.
45 |
46 | Returns:
47 | An `arg_scope` to use for the inception models.
48 | """
49 | print("weight decay = ", weight_decay)
50 | print("batch norm decay = ", batch_norm_decay)
51 | batch_norm_params = {
52 | # Decay for the moving averages.
53 | 'decay': batch_norm_decay,
54 | # epsilon to prevent 0s in variance.
55 | 'epsilon': batch_norm_epsilon,
56 | # collection containing update_ops.
57 | 'updates_collections': tf.GraphKeys.UPDATE_OPS,
58 | 'center': True,
59 | 'scale': False,
60 | }
61 | if use_batch_norm:
62 | normalizer_fn = slim.batch_norm
63 | normalizer_params = batch_norm_params
64 | else:
65 | normalizer_fn = None
66 | normalizer_params = {}
67 | # Set weight_decay for weights in Conv and FC layers.
68 | with slim.arg_scope([slim.conv2d, slim.fully_connected],
69 | weights_regularizer=slim.l2_regularizer(weight_decay)):
70 | with slim.arg_scope(
71 | [slim.conv2d],
72 | weights_initializer=slim.variance_scaling_initializer(),
73 | activation_fn=tf.nn.relu,
74 | normalizer_fn=normalizer_fn,
75 | normalizer_params=normalizer_params) as sc:
76 | return sc
77 |
78 | def anscombe(data, sigma, alpha, scale=255.0, is_real_data=False):
79 | """Transform N(mu,sigma^2) + \alpha Pois(y) into N(0,scale^2) noise."""
80 | if is_real_data:
81 | z = data/alpha[:,None,None,:]
82 | sigma_hat = sigma/alpha
83 | sqrt_term = z + 3./8. + tf.square(sigma_hat)[:,None,None,:]
84 | else:
85 | z = data/alpha[:,None,None,None]
86 | sigma_hat = sigma/alpha
87 | sqrt_term = z + 3./8. + tf.square(sigma_hat)[:,None,None,None]
88 |
89 | sqrt_term = tf.maximum(sqrt_term, 0.0)
90 |
91 | return 2*tf.sqrt(sqrt_term)
92 |
93 |
94 | def inv_anscombe(data, sigma, alpha, scale=1.0, unbiased=False, is_real_data=False):
95 | """Invert anscombe transform."""
96 | sigma_hat = sigma/alpha
97 | if is_real_data:
98 | z = .25* tf.square(data) - 1./8 - tf.square(sigma_hat)[:,None,None,:]
99 | if unbiased:
100 | z = z + .25*tf.sqrt(3./2)*data**-1 - 11./8.*data**-2 + 5./8.*tf.sqrt(3./2)*data**-3
101 | result = z*alpha[:,None,None,:]
102 | else:
103 | z = .25* tf.square(data) - 1./8 - tf.square(sigma_hat)[:,None,None,None]
104 | #data = tf.Print(data, ["data", tf.reduce_max(data), tf.reduce_min(data)])
105 |
106 | #z = tf.maximum(z, 0)
107 | if unbiased:
108 | z = z + .25*tf.sqrt(3./2)*data**-1 - 11./8.*data**-2 + 5./8.*tf.sqrt(3./2)*data**-3
109 | result = z*alpha[:,None,None,None]
110 | return result
111 | #return tf.clip_by_value(result, 0.0, scale)
112 |
113 | def prox_grad_isp(inputs,
114 | alpha,
115 | sigma,
116 | bayer_mask,
117 | num_iters=4,
118 | num_channels=3,
119 | num_layers=5,
120 | kernel=None,
121 | num_classes=1001,
122 | is_training=True,
123 | scale=1.0,
124 | use_anscombe=True,
125 | noise_channel=True,
126 | use_chen_unet=False,
127 | is_real_data=True):
128 |
129 | end_points = {}
130 | end_points['inputs'] = inputs
131 | if use_anscombe and alpha is not None:
132 | print(("USING THE ANCOMB TRANSFORM with scale %f" % scale) + "!"*10)
133 | true_img = anscombe(inputs, alpha=alpha, sigma=sigma, scale=scale, is_real_data=is_real_data)
134 | min_offset = tf.reduce_min(true_img, [1,2,3], keep_dims=True)
135 | max_scale = tf.reduce_max(true_img, [1,2,3], keep_dims=True)
136 | noise_scale = scale/(max_scale - min_offset)
137 | true_img = (true_img - min_offset)*noise_scale
138 | noise_ch = noise_scale
139 | end_points['post_anscombe'] = true_img
140 | else:
141 | true_img = inputs
142 | noise_ch = sigma[:,None,None,None]
143 |
144 | if not noise_channel:
145 | noise_ch = None
146 | else:
147 | print(("USING NOISE CHANNEL"))
148 | dims = [d.value for d in inputs.get_shape()]
149 | noise_ch = tf.tile(noise_ch, [1, dims[1], dims[2], 1])
150 |
151 | if use_chen_unet:
152 | print('USING UNET AS ISP (NON-PROX GRAD)')
153 | from nets import unet
154 | ans_x_out = unet.unet(true_img)
155 | end_points = {}
156 |
157 | else:
158 | ans_x_out, end_points = prox_grad(true_img, bayer_mask, end_points, num_layers=num_layers,
159 | num_iters=num_iters, noise_channel=noise_ch, is_training=is_training)
160 | # ans_x_out, end_points = prox_grad(true_img, bayer_mask, end_points, num_layers=num_layers,
161 | # num_iters=num_iters, noise_channel=noise_ch, is_training=is_training)
162 |
163 | if use_anscombe and alpha is not None:
164 | end_points['pre_inv_anscombe'] = ans_x_out
165 | ans_x_out = ans_x_out/noise_scale + min_offset
166 | ans_x_out = inv_anscombe(ans_x_out, alpha=alpha, sigma=sigma, scale=scale, is_real_data=is_real_data)
167 | end_points['outputs'] = ans_x_out
168 | return ans_x_out, end_points
169 |
170 |
171 | def prox_grad(inputs, bayer_mask, end_points, num_layers=5, num_iters=4, lambda_init=1.0,
172 | is_training=True, scope='gauss_den', noise_channel=None):
173 | flat_inputs = tf.reduce_sum(inputs, 3, keep_dims=True)
174 | with tf.variable_scope(scope, 'gauss_den', [inputs]) as sc:
175 | xk = inputs
176 | lam = slim.variable(name='lambda', shape=[], initializer=tf.constant_initializer(lambda_init))
177 | end_points['lambda'] = lam
178 | beta_init = 1.0
179 | for t in range(num_iters):
180 | with tf.variable_scope('iter_%i'% t):
181 | with slim.arg_scope([slim.batch_norm, slim.dropout],
182 | is_training=is_training):
183 | # Collect outputs for conv2d, fully_connected and max_pool2d.
184 | beta_init *= 2.0 # Continuation scheme as proposed in http://www.caam.rice.edu/~yzhang/reports/tr0710_rev.pdf, algorithm 2
185 | beta = slim.variable(name='beta', shape=[], initializer=tf.constant_initializer(beta_init))
186 | end_points['beta%s'%t] = beta
187 | with tf.variable_scope('prior_grad') as prior_scope:
188 | #curr_z = cnn_proximal(xk, num_layers, 3, noise_channel, width=12, rate=1)
189 | if noise_channel is None:
190 | concat_xk = xk
191 | else:
192 | concat_xk = tf.concat([xk, noise_channel], 3)
193 | curr_z = unet_res(concat_xk, 0, 'unet')
194 | #end_points['prior_grad_%i' % t] = curr_z
195 | tmp = xk - curr_z
196 | xk = (lam*bayer_mask*inputs + beta*tmp)/(lam*bayer_mask + beta)
197 | #end_points['iter_%i' % t] = xk
198 |
199 | return xk, end_points
200 |
201 | def unet_res(inputs, depth, scope, max_depth=2):
202 | # U-NET operating at a given resolution.
203 | shape = [d.value for d in inputs.get_shape()]
204 | print(depth, shape)
205 | ch = max(shape[3]*2, 8)
206 | with tf.variable_scope('depth_%s' % depth, values=[inputs]) as scope:
207 | if depth == 0:
208 | outputs = slim.conv2d(inputs, ch, [3, 3], rate=2, scope='conv_in', normalizer_fn=None)
209 | else:
210 | outputs = slim.conv2d(inputs, ch, [3, 3], scope='conv_in')
211 | outputs = slim.conv2d(outputs, ch, [3, 3], scope='conv_1')
212 | downsamp = slim.avg_pool2d(outputs, [2, 2])
213 | if depth < max_depth:
214 | lower = unet_res(downsamp, depth+1, scope, max_depth)
215 | outputs = tf.concat([outputs, lower], 3)
216 | with tf.variable_scope('depth_%s' % depth, values=[outputs]) as scope:
217 | outputs = slim.conv2d(outputs, ch, [3, 3], scope='conv_2')
218 | if depth > 0:
219 | outputs = slim.conv2d(outputs, ch, [3, 3], scope='out_conv')
220 | outputs = slim.conv2d_transpose(outputs, ch//2, [2,2], stride=2, scope='up_conv',
221 | activation_fn=None, normalizer_fn=None)
222 | else:
223 | outputs = slim.conv2d(outputs, 3, [3, 3], scope='out_conv',
224 | activation_fn=None, normalizer_fn=None)
225 | return outputs
226 |
--------------------------------------------------------------------------------
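The `anscombe`/`inv_anscombe` pair above implements a generalized Anscombe variance-stabilizing transform: mixed Poisson-Gaussian measurements are mapped to roughly unit-variance Gaussian data before denoising, then mapped back. A small NumPy sketch of the same forward/inverse formulas (scalar `alpha` and `sigma`, illustrative values, unbiased correction omitted; not part of the repository).

```python
# NumPy sketch of the generalized Anscombe transform used in nets/isp.py
# (scalar alpha/sigma; illustrative values; unbiased inverse correction omitted).
import numpy as np

def anscombe(y, sigma, alpha):
    z = y / alpha
    sigma_hat = sigma / alpha
    return 2.0 * np.sqrt(np.maximum(z + 3.0 / 8.0 + sigma_hat ** 2, 0.0))

def inv_anscombe(d, sigma, alpha):
    sigma_hat = sigma / alpha
    return (0.25 * d ** 2 - 1.0 / 8.0 - sigma_hat ** 2) * alpha

rng = np.random.default_rng(0)
x, alpha, sigma = 50.0, 0.01, 0.002            # photon count, gain, read noise
y = alpha * rng.poisson(x, 10000) + rng.normal(0.0, sigma, 10000)
d = anscombe(y, sigma, alpha)
print(np.std(d))                               # close to 1: noise is stabilized
print(inv_anscombe(d, sigma, alpha).mean(), alpha * x)  # approximately recovers the signal
```
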
/nets/nets_factory.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Contains a factory for building various models."""
16 |
17 | from __future__ import absolute_import
18 | from __future__ import division
19 | from __future__ import print_function
20 | import functools
21 |
22 | import tensorflow as tf
23 |
24 | from nets import isp
25 | from nets import mobilenet_v1
26 | from nets import mobilenet_isp
27 |
28 | slim = tf.contrib.slim
29 |
30 | networks_map = {'isp': isp.prox_grad_isp,
31 | 'mobilenet_isp': mobilenet_isp.mobilenet_v1,
32 | 'mobilenet_v1': mobilenet_v1.mobilenet_v1,
33 | 'deeper_mobilenet_v1': mobilenet_v1.deeper_mobile_net_v1,
34 | }
35 |
36 | arg_scopes_map = {'isp': isp.isp_arg_scope,
37 | 'mobilenet_isp': mobilenet_isp.mobilenet_v1_arg_scope,
38 | 'mobilenet_v1': mobilenet_v1.mobilenet_v1_arg_scope,
39 | 'deeper_mobilenet_v1': mobilenet_v1.mobilenet_v1_arg_scope,
40 | }
41 |
42 |
43 | def get_network_fn(name, num_classes, weight_decay, batch_norm_decay, is_training):
44 | """Returns a network_fn such as `logits, end_points = network_fn(images)`.
45 |
46 | Args:
47 | name: The name of the network.
48 | num_classes: The number of classes to use for classification.
49 | weight_decay: The l2 coefficient for the model weights.
50 | is_training: `True` if the model is being used for training and `False`
51 | otherwise.
52 |
53 | Returns:
54 | network_fn: A function that applies the model to a batch of images. It has
55 | the following signature:
56 | logits, end_points = network_fn(images)
57 | Raises:
58 | ValueError: If network `name` is not recognized.
59 | """
60 | if name not in networks_map:
61 |     raise ValueError('Unknown network name: %s' % name)
62 |
63 | func = networks_map[name]
64 |
65 | @functools.wraps(func)
66 | def network_fn(images, **kwargs):
67 | arg_scope = arg_scopes_map[name](weight_decay=weight_decay)
68 |
69 | with slim.arg_scope(arg_scope):
70 | return func(images, is_training=is_training, **kwargs)
71 |
72 | if hasattr(func, 'default_image_size'):
73 | network_fn.default_image_size = func.default_image_size
74 |
75 | return network_fn
76 |
--------------------------------------------------------------------------------
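A minimal graph-construction sketch for the `'isp'` entry of the factory (TF 1.x; the placeholder shapes and noise parameters are illustrative assumptions, not values prescribed by the repository).

```python
# Minimal sketch (TF 1.x): build the learned ISP via nets_factory.
# All shapes and noise parameters below are illustrative assumptions.
import tensorflow as tf
from nets import nets_factory

images = tf.placeholder(tf.float32, [1, 224, 224, 3])      # Bayer-masked RGB input
bayer_mask = tf.placeholder(tf.float32, [1, 224, 224, 3])  # one active channel per pixel
alpha = tf.placeholder(tf.float32, [1])                     # per-image shot-noise gain
sigma = tf.placeholder(tf.float32, [1])                     # per-image read-noise std

network_fn = nets_factory.get_network_fn(
    'isp', num_classes=1001, weight_decay=4e-5,
    batch_norm_decay=0.95, is_training=False)
denoised, end_points = network_fn(
    images, alpha=alpha, sigma=sigma, bayer_mask=bayer_mask, is_real_data=False)
```
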
/nets/nets_factory_test.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 Google Inc. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 |
16 | """Tests for slim.inception."""
17 |
18 | from __future__ import absolute_import
19 | from __future__ import division
20 | from __future__ import print_function
21 |
22 | import tensorflow as tf
23 |
24 | from nets import nets_factory
25 |
26 | slim = tf.contrib.slim
27 |
28 |
29 | class NetworksTest(tf.test.TestCase):
30 |
31 | def testGetNetworkFn(self):
32 | batch_size = 5
33 | num_classes = 1000
34 | for net in nets_factory.networks_map:
35 | with self.test_session():
36 | net_fn = nets_factory.get_network_fn(net, num_classes)
37 | # Most networks use 224 as their default_image_size
38 | image_size = getattr(net_fn, 'default_image_size', 224)
39 | inputs = tf.random_uniform((batch_size, image_size, image_size, 3))
40 | logits, end_points = net_fn(inputs)
41 | self.assertTrue(isinstance(logits, tf.Tensor))
42 | self.assertTrue(isinstance(end_points, dict))
43 | self.assertEqual(logits.get_shape().as_list()[0], batch_size)
44 | self.assertEqual(logits.get_shape().as_list()[-1], num_classes)
45 |
46 | def testGetNetworkFnArgScope(self):
47 | batch_size = 5
48 | num_classes = 10
49 | net = 'cifarnet'
50 | with self.test_session(use_gpu=True):
51 | net_fn = nets_factory.get_network_fn(net, num_classes)
52 | image_size = getattr(net_fn, 'default_image_size', 224)
53 | with slim.arg_scope([slim.model_variable, slim.variable],
54 | device='/CPU:0'):
55 | inputs = tf.random_uniform((batch_size, image_size, image_size, 3))
56 | net_fn(inputs)
57 | weights = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, 'CifarNet/conv1')[0]
58 | self.assertDeviceEqual('/CPU:0', weights.device)
59 |
60 | if __name__ == '__main__':
61 | tf.test.main()
62 |
--------------------------------------------------------------------------------
/nets/unet.py:
--------------------------------------------------------------------------------
1 | # Tensorflow mandates these.
2 | from __future__ import absolute_import
3 | from __future__ import division
4 | from __future__ import print_function
5 |
6 | from collections import namedtuple
7 | import functools
8 |
9 | import tensorflow as tf
10 |
11 | slim = tf.contrib.slim
12 |
13 | def lrelu(x):
14 | return tf.maximum(x * 0.2, x)
15 |
16 | def upsample_and_concat(x1, x2, output_channels, in_channels):
17 | pool_size = 2
18 | deconv_filter = tf.Variable(tf.truncated_normal([pool_size, pool_size, output_channels, in_channels], stddev=0.02))
19 | deconv = tf.nn.conv2d_transpose(x1, deconv_filter, tf.shape(x2), strides=[1, pool_size, pool_size, 1])
20 |
21 | deconv_output = tf.concat([deconv, x2], 3)
22 | deconv_output.set_shape([None, None, None, output_channels * 2])
23 |
24 | return deconv_output
25 |
26 |
27 | def unet(input, scope=None):
28 | with tf.variable_scope(scope, 'gauss_den_chen_unet', [input]) as sc:
29 | conv1 = slim.conv2d(input, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv1_1')
30 | conv1 = slim.conv2d(conv1, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv1_2')
31 | pool1 = slim.max_pool2d(conv1, [2, 2], padding='SAME')
32 |
33 | conv2 = slim.conv2d(pool1, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv2_1')
34 | conv2 = slim.conv2d(conv2, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv2_2')
35 | pool2 = slim.max_pool2d(conv2, [2, 2], padding='SAME')
36 |
37 | conv3 = slim.conv2d(pool2, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv3_1')
38 | conv3 = slim.conv2d(conv3, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv3_2')
39 | pool3 = slim.max_pool2d(conv3, [2, 2], padding='SAME')
40 |
41 | conv4 = slim.conv2d(pool3, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv4_1')
42 | conv4 = slim.conv2d(conv4, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv4_2')
43 | pool4 = slim.max_pool2d(conv4, [2, 2], padding='SAME')
44 |
45 | conv5 = slim.conv2d(pool4, 128, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv5_1')
46 | conv5 = slim.conv2d(conv5, 128, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv5_2')
47 |
48 | up6 = upsample_and_concat(conv5, conv4, 64, 128)
49 | conv6 = slim.conv2d(up6, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv6_1')
50 | conv6 = slim.conv2d(conv6, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv6_2')
51 |
52 | up7 = upsample_and_concat(conv6, conv3, 32, 64)
53 | conv7 = slim.conv2d(up7, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv7_1')
54 | conv7 = slim.conv2d(conv7, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv7_2')
55 |
56 | up8 = upsample_and_concat(conv7, conv2, 16, 32)
57 | conv8 = slim.conv2d(up8, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv8_1')
58 | conv8 = slim.conv2d(conv8, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv8_2')
59 |
60 | up9 = upsample_and_concat(conv8, conv1, 8, 16)
61 | conv9 = slim.conv2d(up9, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv9_1')
62 | conv9 = slim.conv2d(conv9, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv9_2')
63 |
64 | # conv10 = slim.conv2d(conv9, 12, [1, 1], rate=1, activation_fn=None, scope='g_conv10')
65 | # out = tf.depth_to_space(conv10, 2)
66 | out = slim.conv2d(conv9, 3, [1, 1], rate=1, activation_fn=None, scope='g_conv10')
67 | return out
68 |
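# Minimal usage sketch (illustrative only; the placeholder name and size are
# assumptions, not part of this file). With four 2x poolings the input
# height/width should be divisible by 16; the output has 3 channels at the
# input resolution:
#   x = tf.placeholder(tf.float32, [1, 224, 224, 3])
#   y = unet(x)  # -> [1, 224, 224, 3]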
69 | # def unet(input, scope=None):
70 | # with tf.variable_scope(scope, 'gauss_den_chen_unet', [input]) as sc:
71 | # conv1 = slim.conv2d(input, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv1_1')
72 | # conv1 = slim.conv2d(conv1, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv1_2')
73 | # pool1 = slim.max_pool2d(conv1, [2, 2], padding='SAME')
74 |
75 | # conv2 = slim.conv2d(pool1, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv2_1')
76 | # conv2 = slim.conv2d(conv2, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv2_2')
77 | # pool2 = slim.max_pool2d(conv2, [2, 2], padding='SAME')
78 |
79 | # conv3 = slim.conv2d(pool2, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv3_1')
80 | # conv3 = slim.conv2d(conv3, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv3_2')
81 | # pool3 = slim.max_pool2d(conv3, [2, 2], padding='SAME')
82 |
83 | # conv4 = slim.conv2d(pool3, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv4_1')
84 | # conv4 = slim.conv2d(conv4, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv4_2')
85 | # pool4 = slim.max_pool2d(conv4, [2, 2], padding='SAME')
86 |
87 | # conv5 = slim.conv2d(pool4, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv5_1')
88 | # conv5 = slim.conv2d(conv5, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv5_2')
89 |
90 | # up6 = upsample_and_concat(conv5, conv4, 32, 64)
91 | # conv6 = slim.conv2d(up6, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv6_1')
92 | # conv6 = slim.conv2d(conv6, 64, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv6_2')
93 |
94 | # up7 = upsample_and_concat(conv6, conv3, 32, 64)
95 | # conv7 = slim.conv2d(up7, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv7_1')
96 | # conv7 = slim.conv2d(conv7, 32, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv7_2')
97 |
98 | # up8 = upsample_and_concat(conv7, conv2, 16, 32)
99 | # conv8 = slim.conv2d(up8, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv8_1')
100 | # conv8 = slim.conv2d(conv8, 16, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv8_2')
101 |
102 | # up9 = upsample_and_concat(conv8, conv1, 8, 16)
103 | # conv9 = slim.conv2d(up9, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv9_1')
104 | # conv9 = slim.conv2d(conv9, 8, [3, 3], rate=1, activation_fn=lrelu, scope='g_conv9_2')
105 |
106 | # # conv10 = slim.conv2d(conv9, 12, [1, 1], rate=1, activation_fn=None, scope='g_conv10')
107 | # # out = tf.depth_to_space(conv10, 2)
108 | # out = slim.conv2d(conv9, 3, [1, 1], rate=1, activation_fn=None, scope='g_conv10')
109 | # return out
--------------------------------------------------------------------------------
/preprocessing/__init__.py:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/preprocessing/inception_preprocessing.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Provides utilities to preprocess images for the Inception networks."""
16 |
17 | from __future__ import absolute_import
18 | from __future__ import division
19 | from __future__ import print_function
20 |
21 | from preprocessing import sensor_model
22 |
23 | import tensorflow as tf
24 | import numpy as np
25 |
26 | from tensorflow.python.ops import control_flow_ops
27 |
28 |
29 | def apply_with_random_selector(x, func, num_cases):
30 | """Computes func(x, sel), with sel sampled from [0...num_cases-1].
31 |
32 | Args:
33 | x: input Tensor.
34 | func: Python function to apply.
35 | num_cases: Python int32, number of cases to sample sel from.
36 |
37 | Returns:
38 | The result of func(x, sel), where func receives the value of the
39 | selector as a python integer, but sel is sampled dynamically.
40 | """
41 | sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32)
42 | # Pass the real x only to one of the func calls.
43 | return control_flow_ops.merge([
44 | func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case)
45 | for case in range(num_cases)])[0]
46 |
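# Illustrative example (assumed usage, mirroring the commented-out resize
# selection in joint_isp_preprocessing.py): randomly pick one of the four
# tf.image.ResizeMethod values per call.
#   resized = apply_with_random_selector(
#       image,
#       lambda x, method: tf.image.resize_images(x, [224, 224], method=method),
#       num_cases=4)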
47 |
48 | def distort_color(image, color_ordering=0, fast_mode=True, scope=None):
49 | """Distort the color of a Tensor image.
50 |
51 | Each color distortion is non-commutative and thus ordering of the color ops
52 | matters. Ideally we would randomly permute the ordering of the color ops.
53 |   Rather than adding that level of complication, we select a distinct ordering
54 | of color ops for each preprocessing thread.
55 |
56 | Args:
57 | image: 3-D Tensor containing single image in [0, 1].
58 | color_ordering: Python int, a type of distortion (valid values: 0-3).
59 | fast_mode: Avoids slower ops (random_hue and random_contrast)
60 | scope: Optional scope for name_scope.
61 | Returns:
62 | 3-D Tensor color-distorted image on range [0, 1]
63 | Raises:
64 | ValueError: if color_ordering not in [0, 3]
65 | """
66 | with tf.name_scope(scope, 'distort_color', [image]):
67 | if fast_mode:
68 | if color_ordering == 0:
69 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
70 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
71 | else:
72 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
73 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
74 | else:
75 | if color_ordering == 0:
76 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
77 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
78 | image = tf.image.random_hue(image, max_delta=0.2)
79 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
80 | elif color_ordering == 1:
81 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
82 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
83 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
84 | image = tf.image.random_hue(image, max_delta=0.2)
85 | elif color_ordering == 2:
86 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
87 | image = tf.image.random_hue(image, max_delta=0.2)
88 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
89 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
90 | elif color_ordering == 3:
91 | image = tf.image.random_hue(image, max_delta=0.2)
92 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
93 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
94 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
95 | else:
96 | raise ValueError('color_ordering must be in [0, 3]')
97 |
98 | # The random_* ops do not necessarily clamp.
99 | return tf.clip_by_value(image, 0.0, 1.0)
100 |
101 |
102 | def distorted_bounding_box_crop(image,
103 | bbox,
104 | min_object_covered=0.1,
105 | aspect_ratio_range=(0.75, 1.33),
106 | area_range=(0.05, 1.0),
107 | max_attempts=100,
108 | scope=None):
109 |   """Generates cropped_image using one of the bboxes, randomly distorted.
110 |
111 | See `tf.image.sample_distorted_bounding_box` for more documentation.
112 |
113 | Args:
114 | image: 3-D Tensor of image (it will be converted to floats in [0, 1]).
115 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
116 | where each coordinate is [0, 1) and the coordinates are arranged
117 | as [ymin, xmin, ymax, xmax]. If num_boxes is 0 then it would use the whole
118 | image.
119 | min_object_covered: An optional `float`. Defaults to `0.1`. The cropped
120 | area of the image must contain at least this fraction of any bounding box
121 | supplied.
122 | aspect_ratio_range: An optional list of `floats`. The cropped area of the
123 | image must have an aspect ratio = width / height within this range.
124 | area_range: An optional list of `floats`. The cropped area of the image
125 |       must contain a fraction of the supplied image within this range.
126 | max_attempts: An optional `int`. Number of attempts at generating a cropped
127 | region of the image of the specified constraints. After `max_attempts`
128 | failures, return the entire image.
129 | scope: Optional scope for name_scope.
130 | Returns:
131 | A tuple, a 3-D Tensor cropped_image and the distorted bbox
132 | """
133 | with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]):
134 | # Each bounding box has shape [1, num_boxes, box coords] and
135 | # the coordinates are ordered [ymin, xmin, ymax, xmax].
136 |
137 | # A large fraction of image datasets contain a human-annotated bounding
138 | # box delineating the region of the image containing the object of interest.
139 | # We choose to create a new bounding box for the object which is a randomly
140 | # distorted version of the human-annotated bounding box that obeys an
141 | # allowed range of aspect ratios, sizes and overlap with the human-annotated
142 | # bounding box. If no box is supplied, then we assume the bounding box is
143 | # the entire image.
144 | sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box(
145 | tf.shape(image),
146 | bounding_boxes=bbox,
147 | min_object_covered=min_object_covered,
148 | aspect_ratio_range=aspect_ratio_range,
149 | area_range=area_range,
150 | max_attempts=max_attempts,
151 | use_image_if_no_bounding_boxes=True)
152 | bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box
153 |
154 | # Crop the image to the specified bounding box.
155 | cropped_image = tf.slice(image, bbox_begin, bbox_size)
156 | return cropped_image, distort_bbox
157 |
158 |
159 | def preprocess_for_train(image, height, width, bbox,
160 | fast_mode=True,
161 | light_level=None,
162 | scope=None):
163 |   """Distort one image for training a network.
164 |
165 | Distorting images provides a useful technique for augmenting the data
166 | set during training in order to make the network invariant to aspects
167 |   of the image that do not affect the label.
168 |
169 | Additionally it would create image_summaries to display the different
170 | transformations applied to the image.
171 |
172 | Args:
173 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be
174 |     [0, 1], otherwise it would be converted to tf.float32 assuming that the range
175 | is [0, MAX], where MAX is largest positive representable number for
176 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details).
177 | height: integer
178 | width: integer
179 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
180 | where each coordinate is [0, 1) and the coordinates are arranged
181 | as [ymin, xmin, ymax, xmax].
182 | fast_mode: Optional boolean, if True avoids slower transformations (i.e.
183 | bi-cubic resizing, random_hue or random_contrast).
184 | scope: Optional scope for name_scope.
185 | Returns:
186 | 3-D float Tensor of distorted image used for training with range [-1, 1].
187 | """
188 | with tf.name_scope(scope, 'distort_image', [image, height, width, bbox]):
189 | if bbox is None:
190 | bbox = tf.constant([0.0, 0.0, 1.0, 1.0],
191 | dtype=tf.float32,
192 | shape=[1, 1, 4])
193 | if image.dtype != tf.float32:
194 | image = tf.image.convert_image_dtype(image, dtype=tf.float32)
195 |
196 | # Each bounding box has shape [1, num_boxes, box coords] and
197 | # the coordinates are ordered [ymin, xmin, ymax, xmax].
198 | image_with_box = tf.image.draw_bounding_boxes(tf.expand_dims(image, 0),
199 | bbox)
200 | tf.summary.image('image_with_bounding_boxes', image_with_box)
201 |
202 | distorted_image, distorted_bbox = distorted_bounding_box_crop(image, bbox)
203 |
204 | # Restore the shape since the dynamic slice based upon the bbox_size loses
205 | # the third dimension.
206 | distorted_image.set_shape([None, None, 3])
207 | image_with_distorted_box = tf.image.draw_bounding_boxes(
208 | tf.expand_dims(image, 0), distorted_bbox)
209 | tf.summary.image('images_with_distorted_bounding_box',
210 | image_with_distorted_box)
211 |
212 | # Use nearest neighbor subsampling.
213 | print("USING NEAREST NEIGHBOR SUBSAMPLING")
214 | distorted_image = tf.image.resize_images(distorted_image, [height, width],
215 | method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
216 |
217 | tf.summary.image('cropped_resized_image',
218 | tf.expand_dims(distorted_image, 0))
219 |
220 | # Add noise - this is only relevant when training the model from scratch. For training with an ISP, there's a small subset of images that are already noised up.
221 | #distorted_image, a, gauss_std = sensor_model.sensor_noise_rand_light_level(distorted_image, light_level)
222 | #tf.summary.image('noisy_image', tf.expand_dims(distorted_image,0))
223 | #bayer_mask = sensor_model.get_bayer_mask(height, width)
224 | #tf.summary.image('bayer_mask', tf.expand_dims(bayer_mask*255, 0))
225 | #distorted_image = distorted_image*bayer_mask
226 |
227 | # Randomly flip the image horizontally.
228 | distorted_image = tf.image.random_flip_left_right(distorted_image)
229 |
230 | tf.summary.image('final_distorted_image',
231 | tf.expand_dims(distorted_image, 0))
232 | distorted_image = tf.subtract(distorted_image, 0.5)
233 | distorted_image = tf.multiply(distorted_image, 2.0)
234 | return distorted_image
235 |
236 |
237 | def preprocess_for_eval(image, height, width, light_level=None,
238 | central_fraction=0.875, scope=None):
239 | """Prepare one image for evaluation.
240 |
241 | If height and width are specified it would output an image with that size by
242 | applying resize_bilinear.
243 |
244 |   If central_fraction is specified it would crop the central fraction of the
245 | input image.
246 |
247 | Args:
248 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be
249 |     [0, 1], otherwise it would be converted to tf.float32 assuming that the range
250 | is [0, MAX], where MAX is largest positive representable number for
251 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details)
252 | height: integer
253 | width: integer
254 | central_fraction: Optional Float, fraction of the image to crop.
255 | scope: Optional scope for name_scope.
256 | Returns:
257 | 3-D float Tensor of prepared image.
258 | """
259 | with tf.name_scope(scope, 'eval_image', [image, height, width]):
260 | if image.dtype != tf.float32:
261 | image = tf.image.convert_image_dtype(image, dtype=tf.float32)
262 |
263 | # Crop the central region of the image with an area containing 87.5% of
264 | # the original image.
265 | if central_fraction:
266 | image = tf.image.central_crop(image, central_fraction=central_fraction)
267 |
268 | #image = tf.py_func(sensor_model.sensor_model, [image], tf.float32, stateful=True)
269 | if height and width:
270 | # Resize the image to the specified height and width.
271 | image = tf.expand_dims(image, 0)
272 | image = tf.image.resize_images(image, [height, width], method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
273 |
274 | image = tf.squeeze(image, [0])
275 |
276 | # Add noise (only for our ISP)
277 | #image, a, gauss_std = sensor_model.sensor_noise_rand_light_level(image, light_level)
278 | #image = image*sensor_model.get_bayer_mask(height, width)
279 |
280 | image = tf.subtract(image, 0.5)
281 | image = tf.multiply(image, 2.0)
282 | image.set_shape([height, width, 3])
283 | return image
284 |
285 |
286 | def preprocess_image(image, ground_truth, height, width,
287 | is_training=False,
288 | bbox=None,
289 | fast_mode=True,
290 | light_level=None):
291 | """Pre-process one image for training or evaluation.
292 |
293 | Args:
294 | image: 3-D Tensor [height, width, channels] with the image.
295 | height: integer, image expected height.
296 | width: integer, image expected width.
297 | is_training: Boolean. If true it would transform an image for train,
298 | otherwise it would transform it for evaluation.
299 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
300 | where each coordinate is [0, 1) and the coordinates are arranged as
301 | [ymin, xmin, ymax, xmax].
302 | fast_mode: Optional boolean, if True avoids slower transformations.
303 |
304 | Returns:
305 | 3-D float Tensor containing an appropriately scaled image
306 |
307 | Raises:
308 | ValueError: if user does not provide bounding box
309 | """
310 | if is_training:
311 | return preprocess_for_train(image, height, width, bbox, fast_mode, light_level)
312 | else:
313 | return preprocess_for_eval(image, height, width, light_level)
314 |
--------------------------------------------------------------------------------
/preprocessing/joint_isp_preprocessing.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Provides utilities to preprocess images for the Inception networks."""
16 |
17 | from __future__ import absolute_import
18 | from __future__ import division
19 | from __future__ import print_function
20 |
21 | from preprocessing import sensor_model
22 |
23 | import tensorflow as tf
24 | import numpy as np
25 |
26 | from tensorflow.python.ops import control_flow_ops
27 |
28 |
29 | def apply_with_random_selector(x, func, num_cases):
30 | """Computes func(x, sel), with sel sampled from [0...num_cases-1].
31 |
32 | Args:
33 | x: input Tensor.
34 | func: Python function to apply.
35 | num_cases: Python int32, number of cases to sample sel from.
36 |
37 | Returns:
38 | The result of func(x, sel), where func receives the value of the
39 | selector as a python integer, but sel is sampled dynamically.
40 | """
41 | sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32)
42 | # Pass the real x only to one of the func calls.
43 | return control_flow_ops.merge([
44 | func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case)
45 | for case in range(num_cases)])[0]
46 |
47 |
48 | def distort_color(image, color_ordering=0, fast_mode=True, scope=None):
49 | """Distort the color of a Tensor image.
50 |
51 | Each color distortion is non-commutative and thus ordering of the color ops
52 | matters. Ideally we would randomly permute the ordering of the color ops.
53 |   Rather than adding that level of complication, we select a distinct ordering
54 | of color ops for each preprocessing thread.
55 |
56 | Args:
57 | image: 3-D Tensor containing single image in [0, 1].
58 | color_ordering: Python int, a type of distortion (valid values: 0-3).
59 | fast_mode: Avoids slower ops (random_hue and random_contrast)
60 | scope: Optional scope for name_scope.
61 | Returns:
62 | 3-D Tensor color-distorted image on range [0, 1]
63 | Raises:
64 | ValueError: if color_ordering not in [0, 3]
65 | """
66 | with tf.name_scope(scope, 'distort_color', [image]):
67 | if fast_mode:
68 | if color_ordering == 0:
69 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
70 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
71 | else:
72 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
73 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
74 | else:
75 | if color_ordering == 0:
76 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
77 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
78 | image = tf.image.random_hue(image, max_delta=0.2)
79 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
80 | elif color_ordering == 1:
81 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
82 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
83 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
84 | image = tf.image.random_hue(image, max_delta=0.2)
85 | elif color_ordering == 2:
86 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
87 | image = tf.image.random_hue(image, max_delta=0.2)
88 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
89 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
90 | elif color_ordering == 3:
91 | image = tf.image.random_hue(image, max_delta=0.2)
92 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
93 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
94 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
95 | else:
96 | raise ValueError('color_ordering must be in [0, 3]')
97 |
98 | # The random_* ops do not necessarily clamp.
99 | return tf.clip_by_value(image, 0.0, 1.0)
100 |
101 |
102 | def distorted_bounding_box_crop(image,
103 | bbox,
104 | min_object_covered=0.1,
105 | aspect_ratio_range=(0.75, 1.33),
106 | area_range=(0.05, 1.0),
107 | max_attempts=100,
108 | scope=None):
109 |   """Generates cropped_image using one of the bboxes, randomly distorted.
110 |
111 | See `tf.image.sample_distorted_bounding_box` for more documentation.
112 |
113 | Args:
114 | image: 3-D Tensor of image (it will be converted to floats in [0, 1]).
115 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
116 | where each coordinate is [0, 1) and the coordinates are arranged
117 | as [ymin, xmin, ymax, xmax]. If num_boxes is 0 then it would use the whole
118 | image.
119 | min_object_covered: An optional `float`. Defaults to `0.1`. The cropped
120 | area of the image must contain at least this fraction of any bounding box
121 | supplied.
122 | aspect_ratio_range: An optional list of `floats`. The cropped area of the
123 | image must have an aspect ratio = width / height within this range.
124 | area_range: An optional list of `floats`. The cropped area of the image
125 |       must contain a fraction of the supplied image within this range.
126 | max_attempts: An optional `int`. Number of attempts at generating a cropped
127 | region of the image of the specified constraints. After `max_attempts`
128 | failures, return the entire image.
129 | scope: Optional scope for name_scope.
130 | Returns:
131 | A tuple, a 3-D Tensor cropped_image and the distorted bbox
132 | """
133 | with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]):
134 | # Each bounding box has shape [1, num_boxes, box coords] and
135 | # the coordinates are ordered [ymin, xmin, ymax, xmax].
136 |
137 | # A large fraction of image datasets contain a human-annotated bounding
138 | # box delineating the region of the image containing the object of interest.
139 | # We choose to create a new bounding box for the object which is a randomly
140 | # distorted version of the human-annotated bounding box that obeys an
141 | # allowed range of aspect ratios, sizes and overlap with the human-annotated
142 | # bounding box. If no box is supplied, then we assume the bounding box is
143 | # the entire image.
144 | sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box(
145 | tf.shape(image),
146 | bounding_boxes=bbox,
147 | min_object_covered=min_object_covered,
148 | aspect_ratio_range=aspect_ratio_range,
149 | area_range=area_range,
150 | max_attempts=max_attempts,
151 | use_image_if_no_bounding_boxes=True)
152 | bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box
153 |
154 | # Crop the image to the specified bounding box.
155 | cropped_image = tf.slice(image, bbox_begin, bbox_size)
156 | return cropped_image, distort_bbox
157 |
158 |
159 | def preprocess_for_train(image, height, width, bbox,
160 | fast_mode=True,
161 | light_level=None,
162 | scope=None):
163 |   """Distort one image for training a network.
164 |
165 | Distorting images provides a useful technique for augmenting the data
166 | set during training in order to make the network invariant to aspects
167 |   of the image that do not affect the label.
168 |
169 | Additionally it would create image_summaries to display the different
170 | transformations applied to the image.
171 |
172 | Args:
173 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be
174 |     [0, 1], otherwise it would be converted to tf.float32 assuming that the range
175 | is [0, MAX], where MAX is largest positive representable number for
176 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details).
177 | height: integer
178 | width: integer
179 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
180 | where each coordinate is [0, 1) and the coordinates are arranged
181 | as [ymin, xmin, ymax, xmax].
182 | fast_mode: Optional boolean, if True avoids slower transformations (i.e.
183 | bi-cubic resizing, random_hue or random_contrast).
184 | scope: Optional scope for name_scope.
185 | Returns:
186 | 3-D float Tensor of distorted image used for training with range [-1, 1].
187 | """
188 | with tf.name_scope(scope, 'distort_image', [image, height, width, bbox]):
189 |
190 |
191 | if bbox is None:
192 | bbox = tf.constant([0.0, 0.0, 1.0, 1.0],
193 | dtype=tf.float32,
194 | shape=[1, 1, 4])
195 | if image.dtype != tf.float32:
196 | image = tf.image.convert_image_dtype(image, dtype=tf.float32)
197 |
198 | # Each bounding box has shape [1, num_boxes, box coords] and
199 | # the coordinates are ordered [ymin, xmin, ymax, xmax].
200 | image_with_box = tf.image.draw_bounding_boxes(tf.expand_dims(image, 0),
201 | bbox)
202 | tf.summary.image('image_with_bounding_boxes', image_with_box)
203 |
204 | distorted_image, distorted_bbox = distorted_bounding_box_crop(image, bbox)
205 | # Restore the shape since the dynamic slice based upon the bbox_size loses
206 | # the third dimension.
207 | distorted_image.set_shape([None, None, 3])
208 | image_with_distorted_box = tf.image.draw_bounding_boxes(
209 | tf.expand_dims(image, 0), distorted_bbox)
210 | tf.summary.image('images_with_distorted_bounding_box',
211 | image_with_distorted_box)
212 |
213 | # This resizing operation may distort the images because the aspect
214 | # ratio is not respected. We select a resize method in a round robin
215 | # fashion based on the thread number.
216 | # Note that ResizeMethod contains 4 enumerated resizing methods.
217 |
218 |
219 | # We select only 1 case for fast_mode bilinear.
220 | #num_resize_cases = 1
221 | #distorted_image = apply_with_random_selector(
222 | # distorted_image,
223 | # lambda x, method: tf.image.resize_images(x, [height, width], method=method),
224 | # num_cases=num_resize_cases)
225 |
226 | # Use nearest neighbor subsampling.
227 | print("USING NEAREST NEIGHBOR SUBSAMPLING")
228 | distorted_image = tf.image.resize_images(distorted_image, [height, width],
229 | method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
230 |
231 | tf.summary.image('cropped_resized_image',
232 | tf.expand_dims(distorted_image, 0))
233 |
234 | # Randomly flip the image horizontally.
235 | distorted_image = tf.image.random_flip_left_right(distorted_image)
236 |
237 | tf.summary.image('final_distorted_image',
238 | tf.expand_dims(distorted_image, 0))
239 | return distorted_image
240 |
241 |
242 | def preprocess_for_eval(image, height, width, light_level=None,
243 | central_fraction=0.875, scope=None, sensor='Nexus_6P_rear'):
244 | """Prepare one image for evaluation.
245 |
246 | If height and width are specified it would output an image with that size by
247 | applying resize_bilinear.
248 |
249 |   If central_fraction is specified it would crop the central fraction of the
250 | input image.
251 |
252 | Args:
253 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be
254 |     [0, 1], otherwise it would be converted to tf.float32 assuming that the range
255 | is [0, MAX], where MAX is largest positive representable number for
256 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details)
257 | height: integer
258 | width: integer
259 | central_fraction: Optional Float, fraction of the image to crop.
260 | scope: Optional scope for name_scope.
261 | Returns:
262 | 3-D float Tensor of prepared image.
263 | """
264 | with tf.name_scope(scope, 'eval_image', [image, height, width]):
265 | if image.dtype != tf.float32:
266 | image = tf.image.convert_image_dtype(image, dtype=tf.float32)
267 |
268 | # Crop the central region of the image with an area containing 87.5% of
269 | # the original image.
270 | if central_fraction:
271 | image = tf.image.central_crop(image, central_fraction=central_fraction)
272 |
273 | #image = tf.py_func(sensor_model.sensor_model, [image], tf.float32, stateful=True)
274 | if height and width:
275 | # Resize the image to the specified height and width.
276 | image = tf.expand_dims(image, 0)
277 | image = tf.image.resize_images(image, [height, width], method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
278 |
279 | image = tf.squeeze(image, [0])
280 |
281 |
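    # Estimate the light level directly from the (Bayer-mosaicked) input: split
    # the image into its Bayer color planes, estimate the per-plane noise std
    # with a wavelet-based estimator (noise_est below, wrapping
    # sensor_model.estimate_std), and map that std back to a light level for
    # the given sensor with sensor_model.std2ll.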
282 |     B = image[::2,::2,2]
283 |     R = image[1::2,1::2,0]
284 |     G1 = image[1::2,::2,1]
285 |     G2 = image[::2,1::2,1]
286 |     stacked = tf.stack([R, B, G1, G2], axis=2)
287 |     mean = tf.reduce_mean(stacked)
288 |     std = tf.py_func(noise_est, [stacked], tf.float32)
289 |     light_level = sensor_model.std2ll(std, mean=mean, sensor=sensor)
290 |     light_level.set_shape([])
291 |     image.set_shape([height, width, 3])
292 |     return image, light_level
293 |
294 | def noise_est(img):
295 | stds = sensor_model.estimate_std(img)
296 | return np.float32(np.mean(stds))
297 |
298 | def preprocess_image(image, ground_truth, height, width,
299 | is_training=False,
300 | bbox=None,
301 | fast_mode=True,
302 | light_level=None,
303 | sensor='Nexus_6P_rear'):
304 | """Pre-process one image for training or evaluation.
305 |
306 | Args:
307 | image: 3-D Tensor [height, width, channels] with the image.
308 | height: integer, image expected height.
309 | width: integer, image expected width.
310 | is_training: Boolean. If true it would transform an image for train,
311 | otherwise it would transform it for evaluation.
312 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
313 | where each coordinate is [0, 1) and the coordinates are arranged as
314 | [ymin, xmin, ymax, xmax].
315 | fast_mode: Optional boolean, if True avoids slower transformations.
316 |
317 | Returns:
318 | 3-D float Tensor containing an appropriately scaled image
319 |
320 | Raises:
321 | ValueError: if user does not provide bounding box
322 | """
323 | if is_training:
324 | return preprocess_for_train(image, height, width, bbox, fast_mode, light_level)
325 | else:
326 | return preprocess_for_eval(image, height, width, light_level, sensor=sensor)
327 |
--------------------------------------------------------------------------------
/preprocessing/no_preprocessing.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Provides utilities to preprocess images for the Inception networks."""
16 |
17 | from __future__ import absolute_import
18 | from __future__ import division
19 | from __future__ import print_function
20 |
21 | from preprocessing import sensor_model
22 |
23 | import tensorflow as tf
24 | import numpy as np
25 |
26 | from tensorflow.python.ops import control_flow_ops
27 |
28 |
29 | def apply_with_random_selector(x, func, num_cases):
30 | """Computes func(x, sel), with sel sampled from [0...num_cases-1].
31 |
32 | Args:
33 | x: input Tensor.
34 | func: Python function to apply.
35 | num_cases: Python int32, number of cases to sample sel from.
36 |
37 | Returns:
38 | The result of func(x, sel), where func receives the value of the
39 | selector as a python integer, but sel is sampled dynamically.
40 | """
41 | sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32)
42 | # Pass the real x only to one of the func calls.
43 | return control_flow_ops.merge([
44 | func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case)
45 | for case in range(num_cases)])[0]
46 |
47 |
48 | def distort_color(image, color_ordering=0, fast_mode=True, scope=None):
49 | """Distort the color of a Tensor image.
50 |
51 | Each color distortion is non-commutative and thus ordering of the color ops
52 | matters. Ideally we would randomly permute the ordering of the color ops.
53 |   Rather than adding that level of complication, we select a distinct ordering
54 | of color ops for each preprocessing thread.
55 |
56 | Args:
57 | image: 3-D Tensor containing single image in [0, 1].
58 | color_ordering: Python int, a type of distortion (valid values: 0-3).
59 | fast_mode: Avoids slower ops (random_hue and random_contrast)
60 | scope: Optional scope for name_scope.
61 | Returns:
62 | 3-D Tensor color-distorted image on range [0, 1]
63 | Raises:
64 | ValueError: if color_ordering not in [0, 3]
65 | """
66 | with tf.name_scope(scope, 'distort_color', [image]):
67 | if fast_mode:
68 | if color_ordering == 0:
69 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
70 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
71 | else:
72 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
73 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
74 | else:
75 | if color_ordering == 0:
76 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
77 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
78 | image = tf.image.random_hue(image, max_delta=0.2)
79 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
80 | elif color_ordering == 1:
81 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
82 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
83 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
84 | image = tf.image.random_hue(image, max_delta=0.2)
85 | elif color_ordering == 2:
86 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
87 | image = tf.image.random_hue(image, max_delta=0.2)
88 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
89 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
90 | elif color_ordering == 3:
91 | image = tf.image.random_hue(image, max_delta=0.2)
92 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
93 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
94 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
95 | else:
96 | raise ValueError('color_ordering must be in [0, 3]')
97 |
98 | # The random_* ops do not necessarily clamp.
99 | return tf.clip_by_value(image, 0.0, 1.0)
100 |
101 |
102 | def distorted_bounding_box_crop(image,
103 | bbox,
104 | min_object_covered=0.1,
105 | aspect_ratio_range=(0.75, 1.33),
106 | area_range=(0.05, 1.0),
107 | max_attempts=100,
108 | scope=None):
109 |   """Generates cropped_image using one of the bboxes, randomly distorted.
110 |
111 | See `tf.image.sample_distorted_bounding_box` for more documentation.
112 |
113 | Args:
114 | image: 3-D Tensor of image (it will be converted to floats in [0, 1]).
115 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
116 | where each coordinate is [0, 1) and the coordinates are arranged
117 | as [ymin, xmin, ymax, xmax]. If num_boxes is 0 then it would use the whole
118 | image.
119 | min_object_covered: An optional `float`. Defaults to `0.1`. The cropped
120 | area of the image must contain at least this fraction of any bounding box
121 | supplied.
122 | aspect_ratio_range: An optional list of `floats`. The cropped area of the
123 | image must have an aspect ratio = width / height within this range.
124 | area_range: An optional list of `floats`. The cropped area of the image
125 |       must contain a fraction of the supplied image within this range.
126 | max_attempts: An optional `int`. Number of attempts at generating a cropped
127 | region of the image of the specified constraints. After `max_attempts`
128 | failures, return the entire image.
129 | scope: Optional scope for name_scope.
130 | Returns:
131 | A tuple, a 3-D Tensor cropped_image and the distorted bbox
132 | """
133 | with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]):
134 | # Each bounding box has shape [1, num_boxes, box coords] and
135 | # the coordinates are ordered [ymin, xmin, ymax, xmax].
136 |
137 | # A large fraction of image datasets contain a human-annotated bounding
138 | # box delineating the region of the image containing the object of interest.
139 | # We choose to create a new bounding box for the object which is a randomly
140 | # distorted version of the human-annotated bounding box that obeys an
141 | # allowed range of aspect ratios, sizes and overlap with the human-annotated
142 | # bounding box. If no box is supplied, then we assume the bounding box is
143 | # the entire image.
144 | sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box(
145 | tf.shape(image),
146 | bounding_boxes=bbox,
147 | min_object_covered=min_object_covered,
148 | aspect_ratio_range=aspect_ratio_range,
149 | area_range=area_range,
150 | max_attempts=max_attempts,
151 | use_image_if_no_bounding_boxes=True)
152 | bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box
153 |
154 | # Crop the image to the specified bounding box.
155 | cropped_image = tf.slice(image, bbox_begin, bbox_size)
156 | return cropped_image, distort_bbox
157 |
158 |
159 | def preprocess_for_train(image, height, width, bbox,
160 | fast_mode=True,
161 | light_level=None,
162 | scope=None):
163 |   """Distort one image for training a network.
164 |
165 | Distorting images provides a useful technique for augmenting the data
166 | set during training in order to make the network invariant to aspects
167 |   of the image that do not affect the label.
168 |
169 | Additionally it would create image_summaries to display the different
170 | transformations applied to the image.
171 |
172 | Args:
173 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be
174 |     [0, 1], otherwise it would be converted to tf.float32 assuming that the range
175 | is [0, MAX], where MAX is largest positive representable number for
176 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details).
177 | height: integer
178 | width: integer
179 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
180 | where each coordinate is [0, 1) and the coordinates are arranged
181 | as [ymin, xmin, ymax, xmax].
182 | fast_mode: Optional boolean, if True avoids slower transformations (i.e.
183 | bi-cubic resizing, random_hue or random_contrast).
184 | scope: Optional scope for name_scope.
185 | Returns:
186 | 3-D float Tensor of distorted image used for training with range [-1, 1].
187 | """
188 | with tf.name_scope(scope, 'distort_image', [image, height, width, bbox]):
189 |
190 |
191 | if image.dtype != tf.float32:
192 | image = tf.image.convert_image_dtype(image, dtype=tf.float32)
193 |
194 | # Randomly flip the image horizontally.
195 | distorted_image = tf.image.random_flip_left_right(image)
196 |
197 | tf.summary.image('final_distorted_image',
198 | tf.expand_dims(distorted_image, 0))
199 | distorted_image.set_shape([height, width, 3])
200 | return 2*(distorted_image - 0.5)
201 |
202 |
203 | def preprocess_for_eval(image, height, width, light_level=None,
204 | central_fraction=0.875, scope=None):
205 | """Prepare one image for evaluation.
206 |
207 | If height and width are specified it would output an image with that size by
208 | applying resize_bilinear.
209 |
210 |   If central_fraction is specified it would crop the central fraction of the
211 | input image.
212 |
213 | Args:
214 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be
215 |     [0, 1], otherwise it would be converted to tf.float32 assuming that the range
216 | is [0, MAX], where MAX is largest positive representable number for
217 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details)
218 | height: integer
219 | width: integer
220 | central_fraction: Optional Float, fraction of the image to crop.
221 | scope: Optional scope for name_scope.
222 | Returns:
223 | 3-D float Tensor of prepared image.
224 | """
225 | with tf.name_scope(scope, 'eval_image', [image, height, width]):
226 | if image.dtype != tf.float32:
227 | image = tf.image.convert_image_dtype(image, dtype=tf.float32)
228 | image.set_shape([height, width, 3])
229 |
230 | # Break into colors and then upsample.
231 | #B = tf.image.resize_images(image[::2,::2,2:3], [height, width])
232 | #R = tf.image.resize_images(image[1::2,1::2,0:1], [height, width])
233 | #G1 = tf.image.resize_images(image[1::2,::2,1:2], [height, width])
234 | #G2 = tf.image.resize_images(image[::2,1::2,1:2], [height, width])
235 | #image = tf.concat([R,(G1+G2)/2,B], axis=2)
236 |
237 | #image = tf.Print(image, [tf.reduce_min(image), tf.reduce_max(image)])
238 | return 2*(image - 0.5)
239 |
240 | def noise_est(img):
241 | stds = sensor_model.estimate_std(img)
242 | return np.float32(np.mean(stds))
243 |
244 | def preprocess_image(image, ground_truth, height, width,
245 | is_training=False,
246 | bbox=None,
247 | fast_mode=True,
248 | light_level=None, sensor=None):
249 | """Pre-process one image for training or evaluation.
250 |
251 | Args:
252 | image: 3-D Tensor [height, width, channels] with the image.
253 | height: integer, image expected height.
254 | width: integer, image expected width.
255 | is_training: Boolean. If true it would transform an image for train,
256 | otherwise it would transform it for evaluation.
257 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
258 | where each coordinate is [0, 1) and the coordinates are arranged as
259 | [ymin, xmin, ymax, xmax].
260 | fast_mode: Optional boolean, if True avoids slower transformations.
261 |
262 | Returns:
263 | 3-D float Tensor containing an appropriately scaled image
264 |
265 | Raises:
266 | ValueError: if user does not provide bounding box
267 | """
268 | if is_training:
269 | return preprocess_for_train(image, height, width, bbox, fast_mode, light_level)
270 | else:
271 | return preprocess_for_eval(image, height, width, light_level)
272 |
--------------------------------------------------------------------------------
/preprocessing/preprocessing_factory.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Contains a factory for building preprocessing functions."""
16 |
17 | from __future__ import absolute_import
18 | from __future__ import division
19 | from __future__ import print_function
20 |
21 | import tensorflow as tf
22 |
23 | from preprocessing import inception_preprocessing
24 | from preprocessing import isp_pretrain_preprocessing
25 | from preprocessing import joint_isp_preprocessing
26 | from preprocessing import writeout_preprocessing
27 | from preprocessing import no_preprocessing
28 |
29 | slim = tf.contrib.slim
30 |
31 |
32 | def get_preprocessing(name, is_training):
33 | """Returns preprocessing_fn(image, height, width, **kwargs).
34 |
35 | Args:
36 | name: The name of the preprocessing function.
37 | is_training: `True` if the model is being used for training and `False`
38 | otherwise.
39 |
40 | Returns:
41 |     preprocessing_fn: A function that preprocesses a single image (pre-batch).
42 | It has the following signature:
43 | image = preprocessing_fn(image, output_height, output_width, ...).
44 |
45 | Raises:
46 | ValueError: If Preprocessing `name` is not recognized.
47 | """
48 | preprocessing_fn_map = {
49 | 'isp': isp_pretrain_preprocessing,
50 | 'mobilenet_v1': inception_preprocessing,
51 | 'mobilenet_isp': joint_isp_preprocessing,
52 | 'resnet_isp': isp_pretrain_preprocessing,
53 | 'gharbi_isp': isp_pretrain_preprocessing,
54 | 'writeout': writeout_preprocessing,
55 | 'none': no_preprocessing,
56 | 'deeper_mobilenet_v1': inception_preprocessing,
57 | }
58 |
59 | if name not in preprocessing_fn_map:
60 | raise ValueError('Preprocessing name [%s] was not recognized' % name)
61 |
62 | def preprocessing_fn(image, ground_truth, output_height, output_width, **kwargs):
63 | return preprocessing_fn_map[name].preprocess_image(
64 | image, ground_truth, output_height, output_width, is_training=is_training, **kwargs)
65 |
66 | return preprocessing_fn
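# Minimal usage sketch (illustrative only; the tensor names are assumptions):
#   preprocessing_fn = get_preprocessing('mobilenet_isp', is_training=False)
#   image, light_level = preprocessing_fn(image, ground_truth, 224, 224)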
67 |
--------------------------------------------------------------------------------
/preprocessing/sensor_model.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import scipy.io as sio
4 | from scipy.stats import poisson
5 |
6 | ###############################################################################
7 | # Sensor model
8 | ###############################################################################
9 |
10 | # Sensors calibrated for ISO100 (format is Poissonian scale, Gaussian std)
11 | # Iso 100, 200, 400, 800, 1600, 3200
12 | sensors = {'Nexus_6P_rear': [0.00018724, 0.0004733],
13 | 'Nexus_6P_front': [0.00015, 0.0003875],
14 | 'SEMCO': [0.000388, 0.0025],
15 | 'OV2740': [0.000088021, 0.00022673],
16 | #'GAUSSIAN': [0,0.005],
17 | 'GAUSS': [0,1],
18 | 'POISSON': [1,0],
19 | 'Pixel': [0.0153, 0.0328], #[0.00019856, 0.0017],
20 | 'Pixel3x3': [2.2682e-4, 0.0017],
21 | 'Pixel5x5': [1.2361e-4, 0.0043],
22 | 'Pixel7x7': [7.3344e-05, 0.0077],
23 | }
24 |
25 | sensorpositions = {'center': 0.5,
26 | 'offaxis': 0.9,
27 | 'periphery': 1.0}
28 |
29 | light_levels = 3 * np.array([2 ** i for i in range(6)]) / 2000.0
30 |
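# std2ll inverts the Poisson-Gaussian noise model: at light level ll the noise
# parameters are a = alpha/ll (Poisson scale) and b = beta/ll (Gaussian std),
# so the observed variance is std^2 = alpha*mean/ll + beta^2/ll^2. Solving the
# quadratic beta^2*x^2 + alpha*mean*x - std^2 = 0 for x = 1/ll and keeping the
# positive root gives the closed form below.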
31 | def std2ll(std, mean=0.5, sensor='Nexus_6P_rear'):
32 | #light_level = sensors[sensor][1]/std
33 | #print('Sensor', sensor)
34 | alpha, beta = sensors[sensor]
35 | alpha_mean = alpha*mean
36 | num = np.sqrt(alpha_mean**2 + 4*beta**2*std**2) - alpha_mean
37 | light_level = (2*beta**2)/num
38 | return light_level
39 |
40 | def get_bayer_mask(height, width):
41 | # Mask based on Bayer pattern. (assume RGB order of colors)
42 | # B G
43 | # G R
44 | bayer_mask = np.zeros([height, width, 3])
45 | bayer_mask[1::2,1::2,0:1] = 1 # R
46 | bayer_mask[1::2,::2,1:2] = 1 # G
47 | bayer_mask[::2,1::2,1:2] = 1 # G
48 | bayer_mask[::2,::2,2:3] = 1 # B
49 | return bayer_mask
50 |
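# Illustrative use (as in the commented-out training path of
# inception_preprocessing.py): multiply an RGB image by the mask to keep only
# the sampled Bayer sites, e.g. mosaic = image * get_bayer_mask(h, w).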
51 | def optics_model(psfs, sensorpos='center', visualize=True ):
52 | #Expects calibrated PSFs (in matlab format) as input
53 |
54 | #Compute positions on grid
55 | psf_shape = np.array(psfs.shape)
56 | selected_pos = (psf_shape*sensorpositions[sensorpos]).astype(int)
57 |
58 | #Extract the position
59 | psf_sel = psfs[selected_pos[0] - 1,selected_pos[1] - 1]['PSF'][0,0]
60 | psf_sel = np.maximum(psf_sel, 0.0)
61 |
62 | #Normalize
63 | for ch in range(psf_sel.shape[2]):
64 | psf_sel[:,:,ch] = psf_sel[:,:,ch]/np.sum(psf_sel[:,:,ch])
65 |
66 | return psf_sel
67 |
68 | def psf_iterator():
69 | sensor_positions = ['center', 'offaxis', 'periphery']
70 | psfs = sio.loadmat('PSFs/bloc_256_Nexus_defective.mat')['bloc']
71 | for sensor_position in sensor_positions:
72 | psf_kernel = np.asfortranarray(optics_model(psfs, sensorpos=sensor_position, visualize=False).astype(np.float32))
73 | yield sensor_position, psf_kernel
74 |
75 | def load_psfs():
76 | sensor_positions = ['center', 'offaxis', 'periphery']
77 | psfs = sio.loadmat('PSFs/bloc_256_Nexus_defective.mat')['bloc']
78 |
79 | kernels = []
80 | for sensor_position in sensor_positions:
81 | kernel = np.asfortranarray(optics_model(psfs, sensorpos=sensor_position, visualize=False).astype(np.float32))
82 |         for channel in range(3):
83 | kernel[:,:,channel] /= np.sum(kernel[:,:,channel])
84 | kernels.append(kernel)
85 |
86 | return kernels
87 |
88 | def get_noise_params(iso, sensor):
89 | sensor = 'Nexus_6P_rear'
90 | poisson = sensors[sensor][0]
91 | sigma = sensors[sensor][1]
92 |
93 | a = poisson * iso / 100.0 #Poisson scale
94 | b = (sigma * iso / 100.0)**2
95 |
96 | return a, np.sqrt(b)
97 |
98 | def sensor_model(y):
99 | # Invalid sensor
100 | iso = 1.0 / 0.0015 * 100
101 | sensor='Nexus_6P_rear'
102 |
103 | poisson = sensors[sensor][0]
104 | sigma = sensors[sensor][1]
105 |
106 | #Output stats
107 | #print( 'Sensor {0} ISO {1} Poisson {2} Gaussian {3}'.format(sensor, iso, poisson, sigma) )
108 |
109 | # Assume linear ISO model
110 | a = poisson * iso / 100.0 #Poisson scale
111 | b = (sigma * iso / 100.0)**2
112 |
113 | #Return Poissonian-Gaussian response
114 | #noisy_img = poisson_gaussian_np(y, a, b, True, True)
115 | noisy_img = poisson_gaussian_np(y, a, b, True, True)
116 | return noisy_img.astype(np.float32)
117 |
118 | def sensor_noise_rand_sigma(img_batch, sigma_range, scale=1.0, sensor='Nexus_6P_rear'):
119 | # Define in terms of Gaussian noise after Anscombe.
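    # sigma_range is interpreted as the target noise std (in 1/255 units,
    # scaled by `scale`) after the generalized Anscombe transform; the matching
    # light level is recovered below by inverting the transformed dynamic range
    # [lower, upper] of the variance-stabilized signal.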
120 | batch_size = img_batch.get_shape()[0].value
121 | poisson = sensors[sensor][0]
122 | gauss = sensors[sensor][1]
123 | sigma = tf.random_uniform([batch_size], sigma_range[0], sigma_range[1])*scale/255.0
124 | if poisson == 0:
125 | noisy_batch = img_batch + sigma[:,None,None,None] * tf.random_normal(shape=img_batch.get_shape(), dtype=tf.float32)
126 | noisy_batch = tf.clip_by_value(noisy_batch, 0.0, scale)
127 | return noisy_batch, None, sigma
128 | sigma_hat = gauss/poisson
129 | offset = 2*tf.sqrt(3./8. + sigma_hat**2)
130 | tmp = (1./sigma + offset)**2/4 - 3./8. - sigma_hat**2
131 | light_level = poisson*tmp
132 | iso = 1.0 / light_level * 100.
133 | #iso = tf.Print(iso, [light_level])
134 |
135 | # Assume linear ISO model
136 | a = poisson * iso / 100.0 * scale #Poisson scale
137 | gauss_var = tf.square(gauss * iso / 100.0) * scale**2
138 |
139 | upper = 2*tf.sqrt(light_level/poisson + 3./8. + sigma_hat**2)
140 | lower = 2*tf.sqrt(3./8. + sigma_hat**2)
141 | tf.summary.scalar('noise_level', 255./(upper - lower)[0])
142 | tf.summary.scalar('iso', tf.reduce_mean(iso))
143 | tf.summary.scalar('light_level', tf.reduce_mean(light_level))
144 | tf.summary.scalar('a', tf.reduce_mean(a)/scale)
145 | tf.summary.scalar('gauss_variance', tf.reduce_mean(gauss_var)/scale**2)
146 |
147 | # a = tf.Print(a, [255./(upper - lower)])
148 | print("Simulating sensor {0}.".format(sensor))
149 |
150 | noisy_batch = poisson_gauss_tf(img_batch, a, gauss_var, clip=(0.,scale))
151 | # Return Poissonian-Gaussian response
152 | return noisy_batch, a, tf.sqrt(gauss_var)
153 |
154 | def get_coeffs(light_levels, sensor='Nexus_6P_rear'):
155 | #print('Sensor', sensor)
156 | poisson = sensors[sensor][0]
157 | gauss = sensors[sensor][1]
158 | iso = 1.0 / light_levels * 100.
159 | a = poisson * iso / 100.0 #Poisson scale
160 | b = (gauss * iso / 100.0)
161 | return a, b
162 |
163 | def sensor_noise_rand_light_level(img_batch, ll_range, scale=1.0, sensor='Nexus_6P_rear'):
164 | print("Sensor = %s, scale = %s" % (sensor, scale))
165 | batch_size = img_batch.get_shape()[0].value
166 | poisson = sensors[sensor][0]
167 | gauss = sensors[sensor][1]
168 |
169 | # Sample uniformly in logspace.
170 | # low ll * exp(u), u ~ [0, log(high ll/low ll)]
171 | ll_ratio = ll_range[1]/ll_range[0]
172 | ll_factor = tf.random_uniform([batch_size], minval=0, maxval=tf.log(ll_ratio), dtype=tf.float32)
173 | light_level = ll_range[0]*tf.exp(ll_factor)
174 | iso = 1.0 / light_level * 100.
175 |
176 | # Assume linear ISO model
177 | a = poisson * iso / 100.0 * scale #Poisson scale
178 |
179 | gauss_var = tf.square(gauss * iso / 100.0) * scale**2
180 | if poisson == 0:
181 | noisy_batch = img_batch + tf.sqrt(gauss_var[:,None,None,None]) * tf.random_normal(shape=img_batch.get_shape(), dtype=tf.float32)
182 | noisy_batch = tf.clip_by_value(noisy_batch, 0.0, scale)
183 | return noisy_batch, np.zeros(batch_size), tf.sqrt(gauss_var)
184 |
185 | tf.summary.scalar('iso', tf.reduce_mean(iso))
186 | tf.summary.scalar('light_level', tf.reduce_mean(light_level))
187 | tf.summary.scalar('a', tf.reduce_mean(a)/scale)
188 | tf.summary.scalar('gauss_variance', tf.reduce_mean(gauss_var)/scale**2)
189 |
190 | print("Simulating sensor {0}.".format(sensor))
191 |
192 | noisy_batch = poisson_gauss_tf(img_batch, a, gauss_var, clip=(0.,scale))
193 | sigma_hat = gauss/poisson
194 |
195 | return noisy_batch, a, tf.sqrt(gauss_var)
196 |
197 | def poisson_gauss_tf(img_batch, a, gauss_var, clip=(0.,1.)):
198 | # Apply poissonian-gaussian noise model following A.Foi et al.
199 | # Foi, A., "Practical denoising of clipped or overexposed noisy images",
200 | # Proc. 16th European Signal Process. Conf., EUSIPCO 2008, Lausanne, Switzerland, August 2008.
201 | batch_shape = tf.shape(img_batch)
202 |
203 | a_p = a[:,None,None,None]
204 | out = tf.random_poisson(shape=[], lam=tf.maximum(img_batch/a_p, 0.0), dtype=tf.float32) * a_p
205 | #out = tf.Print(out, [tf.reduce_max(out), tf.reduce_min(out)])
206 | gauss_var = tf.maximum(gauss_var, 0.0)
207 |
208 | gauss_noise = tf.sqrt(gauss_var[:,None,None,None]) * tf.random_normal(shape=batch_shape, dtype=tf.float32) #Gaussian component
209 |
210 | out += gauss_noise
211 |
212 | # Clipping
213 | if clip is not None:
214 | out = tf.clip_by_value(out, clip[0], clip[1])
215 |
216 | # Return the simulated image
217 | return out
218 |
219 |
220 | def poisson_gaussian_np(y, a, b, clip_below=True, clip_above=True):
221 | # Apply poissonian-gaussian noise model following A.Foi et al.
222 | # Foi, A., "Practical denoising of clipped or overexposed noisy images",
223 | # Proc. 16th European Signal Process. Conf., EUSIPCO 2008, Lausanne, Switzerland, August 2008.
224 |
225 | # Check method
226 | if(a==0): # no Poissonian component
227 | z = y
228 | else: # Poissonian component
229 |         z = np.random.poisson(np.maximum(y/a, 0.0))*a
230 |
231 | if(b<0):
232 |         warnings.warn('The Gaussian noise parameter b has to be non-negative (setting b=0)')
233 | b = 0.0
234 |
235 | z = z + np.sqrt(b) * np.random.randn(*y.shape) #Gaussian component
236 |
237 | # Clipping
238 | if(clip_above):
239 |         z = np.minimum(z, 1.0)
240 | 
241 |     if(clip_below):
242 |         z = np.maximum(z, 0.0)
243 |
244 | # Return the simulated image
245 | return z
246 |
247 | # Currently only implements one method
248 | NoiseEstMethod = {'daub_reflect': 0, 'daub_replicate': 1}
249 |
250 |
251 | def estimate_std(z, method='daub_reflect'):
252 | import cv2
253 | # Estimates noise standard deviation assuming additive gaussian noise
254 |
255 | # Check method
256 |     if method in NoiseEstMethod.keys():
257 |         method = NoiseEstMethod[method]
258 |     elif method not in NoiseEstMethod.values():
259 |         raise Exception("Invalid noise estimation method.")
260 |
261 | # Check shape
262 | if len(z.shape) == 2:
263 | z = z[..., np.newaxis]
264 | elif len(z.shape) != 3:
265 | raise Exception("Supports only up to 3D images.")
266 |
267 | # Run on multichannel image
268 | channels = z.shape[2]
269 | dev = np.zeros(channels)
270 |
271 | # Iterate over channels
272 | for ch in range(channels):
273 |
274 | # Daubechies denoising method
275 | if method == NoiseEstMethod['daub_reflect'] or method == NoiseEstMethod['daub_replicate']:
276 | daub6kern = np.array([0.03522629188571, 0.08544127388203, -0.13501102001025,
277 | -0.45987750211849, 0.80689150931109, -0.33267055295008],
278 | dtype=np.float32, order='F')
279 |
280 | if method == NoiseEstMethod['daub_reflect']:
281 | wav_det = cv2.sepFilter2D(z, -1, daub6kern, daub6kern,
282 | borderType=cv2.BORDER_REFLECT_101)
283 | else:
284 | wav_det = cv2.sepFilter2D(z, -1, daub6kern, daub6kern,
285 | borderType=cv2.BORDER_REPLICATE)
286 |
287 | dev[ch] = np.median(np.absolute(wav_det)) / 0.6745
288 |
289 | # Return standard deviation
290 | return dev
291 |
--------------------------------------------------------------------------------
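The functions above implement the Poissonian-Gaussian sensor model of Foi et al.: a signal-dependent Poisson component with scale `a = poisson * iso / 100` plus additive Gaussian noise with variance `b = (gauss * iso / 100)^2`, followed by clipping to the valid range. The `upper`/`lower` terms are the generalized Anscombe variance-stabilizing transform `2*sqrt(z/a + 3/8 + sigma_hat^2)` evaluated at the brightest and darkest signal levels, and the `noise_level` summary reports 255 divided by their difference. The sketch below replays the forward model and the log-uniform light-level sampling in plain NumPy; the sensor constants are placeholders, not the calibrated entries of the `sensors` table.

```python
import numpy as np

# Placeholder sensor constants; the repository keeps calibrated per-sensor
# values in the `sensors` dict (e.g. sensors['Nexus_6P_rear']).
poisson_coeff, gauss_coeff = 2e-4, 4e-4

def sample_light_level(ll_low, ll_high):
    """Log-uniform sampling as in sensor_noise_rand_light_level:
    ll = ll_low * exp(u), with u ~ Uniform(0, log(ll_high / ll_low))."""
    u = np.random.uniform(0.0, np.log(ll_high / ll_low))
    return ll_low * np.exp(u)

def poisson_gaussian_sketch(y, light_level):
    """NumPy replay of the Foi et al. forward model used above:
    z = a * Poisson(y / a) + sqrt(b) * N(0, 1), clipped to [0, 1]."""
    iso = 100.0 / light_level              # linear ISO model
    a = poisson_coeff * iso / 100.0        # Poisson scale
    b = (gauss_coeff * iso / 100.0) ** 2   # Gaussian variance
    z = np.random.poisson(np.maximum(y / a, 0.0)) * a
    z = z + np.sqrt(b) * np.random.randn(*y.shape)
    return np.clip(z, 0.0, 1.0)

# Example: a mid-gray patch at a light level drawn from the 2to200lux range.
ll = sample_light_level(0.001, 0.1)
noisy = poisson_gaussian_sketch(np.full((8, 8), 0.5), ll)
print(ll, noisy.mean(), noisy.std())

# estimate_std() above recovers the Gaussian standard deviation of such an
# image from the median absolute deviation of a Daubechies wavelet detail
# band: sigma_hat = median(|detail|) / 0.6745.
```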
/preprocessing/writeout_preprocessing.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | """Provides utilities to preprocess images for the Inception networks."""
16 |
17 | from __future__ import absolute_import
18 | from __future__ import division
19 | from __future__ import print_function
20 |
21 | from preprocessing import sensor_model
22 |
23 | import tensorflow as tf
24 | import numpy as np
25 |
26 | from tensorflow.python.ops import control_flow_ops
27 |
28 |
29 | def apply_with_random_selector(x, func, num_cases):
30 | """Computes func(x, sel), with sel sampled from [0...num_cases-1].
31 |
32 | Args:
33 | x: input Tensor.
34 | func: Python function to apply.
35 | num_cases: Python int32, number of cases to sample sel from.
36 |
37 | Returns:
38 | The result of func(x, sel), where func receives the value of the
39 | selector as a python integer, but sel is sampled dynamically.
40 | """
41 | sel = tf.random_uniform([], maxval=num_cases, dtype=tf.int32)
42 | # Pass the real x only to one of the func calls.
43 | return control_flow_ops.merge([
44 | func(control_flow_ops.switch(x, tf.equal(sel, case))[1], case)
45 | for case in range(num_cases)])[0]
46 |
47 |
48 | def distort_color(image, color_ordering=0, fast_mode=True, scope=None):
49 | """Distort the color of a Tensor image.
50 |
51 | Each color distortion is non-commutative and thus ordering of the color ops
52 | matters. Ideally we would randomly permute the ordering of the color ops.
53 |   Rather than adding that level of complication, we select a distinct ordering
54 | of color ops for each preprocessing thread.
55 |
56 | Args:
57 | image: 3-D Tensor containing single image in [0, 1].
58 | color_ordering: Python int, a type of distortion (valid values: 0-3).
59 | fast_mode: Avoids slower ops (random_hue and random_contrast)
60 | scope: Optional scope for name_scope.
61 | Returns:
62 | 3-D Tensor color-distorted image on range [0, 1]
63 | Raises:
64 | ValueError: if color_ordering not in [0, 3]
65 | """
66 | with tf.name_scope(scope, 'distort_color', [image]):
67 | if fast_mode:
68 | if color_ordering == 0:
69 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
70 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
71 | else:
72 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
73 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
74 | else:
75 | if color_ordering == 0:
76 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
77 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
78 | image = tf.image.random_hue(image, max_delta=0.2)
79 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
80 | elif color_ordering == 1:
81 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
82 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
83 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
84 | image = tf.image.random_hue(image, max_delta=0.2)
85 | elif color_ordering == 2:
86 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
87 | image = tf.image.random_hue(image, max_delta=0.2)
88 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
89 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
90 | elif color_ordering == 3:
91 | image = tf.image.random_hue(image, max_delta=0.2)
92 | image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
93 | image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
94 | image = tf.image.random_brightness(image, max_delta=32. / 255.)
95 | else:
96 | raise ValueError('color_ordering must be in [0, 3]')
97 |
98 | # The random_* ops do not necessarily clamp.
99 | return tf.clip_by_value(image, 0.0, 1.0)
100 |
101 |
102 | def distorted_bounding_box_crop(image,
103 | bbox,
104 | min_object_covered=0.1,
105 | aspect_ratio_range=(0.75, 1.33),
106 | area_range=(0.05, 1.0),
107 | max_attempts=100,
108 | scope=None):
109 |   """Generates cropped_image using one of the bboxes randomly distorted.
110 |
111 | See `tf.image.sample_distorted_bounding_box` for more documentation.
112 |
113 | Args:
114 | image: 3-D Tensor of image (it will be converted to floats in [0, 1]).
115 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
116 | where each coordinate is [0, 1) and the coordinates are arranged
117 | as [ymin, xmin, ymax, xmax]. If num_boxes is 0 then it would use the whole
118 | image.
119 | min_object_covered: An optional `float`. Defaults to `0.1`. The cropped
120 | area of the image must contain at least this fraction of any bounding box
121 | supplied.
122 | aspect_ratio_range: An optional list of `floats`. The cropped area of the
123 | image must have an aspect ratio = width / height within this range.
124 | area_range: An optional list of `floats`. The cropped area of the image
125 |       must contain a fraction of the supplied image within this range.
126 | max_attempts: An optional `int`. Number of attempts at generating a cropped
127 | region of the image of the specified constraints. After `max_attempts`
128 | failures, return the entire image.
129 | scope: Optional scope for name_scope.
130 | Returns:
131 | A tuple, a 3-D Tensor cropped_image and the distorted bbox
132 | """
133 | with tf.name_scope(scope, 'distorted_bounding_box_crop', [image, bbox]):
134 | # Each bounding box has shape [1, num_boxes, box coords] and
135 | # the coordinates are ordered [ymin, xmin, ymax, xmax].
136 |
137 | # A large fraction of image datasets contain a human-annotated bounding
138 | # box delineating the region of the image containing the object of interest.
139 | # We choose to create a new bounding box for the object which is a randomly
140 | # distorted version of the human-annotated bounding box that obeys an
141 | # allowed range of aspect ratios, sizes and overlap with the human-annotated
142 | # bounding box. If no box is supplied, then we assume the bounding box is
143 | # the entire image.
144 | sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box(
145 | tf.shape(image),
146 | bounding_boxes=bbox,
147 | min_object_covered=min_object_covered,
148 | aspect_ratio_range=aspect_ratio_range,
149 | area_range=area_range,
150 | max_attempts=max_attempts,
151 | use_image_if_no_bounding_boxes=True)
152 | bbox_begin, bbox_size, distort_bbox = sample_distorted_bounding_box
153 |
154 | # Crop the image to the specified bounding box.
155 | cropped_image = tf.slice(image, bbox_begin, bbox_size)
156 | return cropped_image, distort_bbox
157 |
158 |
159 | def preprocess_for_train(image, height, width, bbox,
160 | fast_mode=True,
161 | light_level=None,
162 | scope=None):
163 |   """Distort one image for training a network.
164 |
165 | Distorting images provides a useful technique for augmenting the data
166 | set during training in order to make the network invariant to aspects
167 |   of the image that do not affect the label.
168 |
169 | Additionally it would create image_summaries to display the different
170 | transformations applied to the image.
171 |
172 | Args:
173 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be
174 |       [0, 1], otherwise it would be converted to tf.float32 assuming that the range
175 |       is [0, MAX], where MAX is the largest positive representable number for
176 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details).
177 | height: integer
178 | width: integer
179 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
180 | where each coordinate is [0, 1) and the coordinates are arranged
181 | as [ymin, xmin, ymax, xmax].
182 | fast_mode: Optional boolean, if True avoids slower transformations (i.e.
183 | bi-cubic resizing, random_hue or random_contrast).
184 | scope: Optional scope for name_scope.
185 | Returns:
186 | 3-D float Tensor of distorted image used for training with range [-1, 1].
187 | """
188 | with tf.name_scope(scope, 'distort_image', [image, height, width, bbox]):
189 |
190 |
191 | if bbox is None:
192 | bbox = tf.constant([0.0, 0.0, 1.0, 1.0],
193 | dtype=tf.float32,
194 | shape=[1, 1, 4])
195 | if image.dtype != tf.float32:
196 | image = tf.image.convert_image_dtype(image, dtype=tf.float32)
197 |
198 | # Each bounding box has shape [1, num_boxes, box coords] and
199 | # the coordinates are ordered [ymin, xmin, ymax, xmax].
200 | image_with_box = tf.image.draw_bounding_boxes(tf.expand_dims(image, 0),
201 | bbox)
202 | tf.summary.image('image_with_bounding_boxes', image_with_box)
203 |
204 | distorted_image, distorted_bbox = distorted_bounding_box_crop(image, bbox)
205 | # Restore the shape since the dynamic slice based upon the bbox_size loses
206 | # the third dimension.
207 | distorted_image.set_shape([None, None, 3])
208 | image_with_distorted_box = tf.image.draw_bounding_boxes(
209 | tf.expand_dims(image, 0), distorted_bbox)
210 | tf.summary.image('images_with_distorted_bounding_box',
211 | image_with_distorted_box)
212 |
213 |     # This resizing operation may distort the images because the aspect
214 |     # ratio is not respected. The original Inception preprocessing picks a
215 |     # resize method per thread (see the commented-out selector below); here
216 |     # we always subsample with nearest neighbor.
217 |
218 |
219 | # We select only 1 case for fast_mode bilinear.
220 | #num_resize_cases = 1
221 | #distorted_image = apply_with_random_selector(
222 | # distorted_image,
223 | # lambda x, method: tf.image.resize_images(x, [height, width], method=method),
224 | # num_cases=num_resize_cases)
225 |
226 | # Use nearest neighbor subsampling.
227 | print("USING NEAREST NEIGHBOR SUBSAMPLING")
228 | distorted_image = tf.image.resize_images(distorted_image, [height, width],
229 | method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
230 |
231 | tf.summary.image('cropped_resized_image',
232 | tf.expand_dims(distorted_image, 0))
233 |
234 | # Randomly flip the image horizontally.
235 | distorted_image = tf.image.random_flip_left_right(distorted_image)
236 |
237 | tf.summary.image('final_distorted_image',
238 | tf.expand_dims(distorted_image, 0))
239 | return distorted_image
240 |
241 |
242 | def preprocess_for_eval(image, height, width, light_level=None,
243 | central_fraction=0.875, scope=None):
244 | """Prepare one image for evaluation.
245 |
246 | If height and width are specified it would output an image with that size by
247 | applying resize_bilinear.
248 |
249 |   If central_fraction is specified it would crop the central fraction of the
250 | input image.
251 |
252 | Args:
253 | image: 3-D Tensor of image. If dtype is tf.float32 then the range should be
254 |       [0, 1], otherwise it would be converted to tf.float32 assuming that the range
255 |       is [0, MAX], where MAX is the largest positive representable number for
256 | int(8/16/32) data type (see `tf.image.convert_image_dtype` for details)
257 | height: integer
258 | width: integer
259 | central_fraction: Optional Float, fraction of the image to crop.
260 | scope: Optional scope for name_scope.
261 | Returns:
262 | 3-D float Tensor of prepared image.
263 | """
264 | with tf.name_scope(scope, 'eval_image', [image, height, width]):
265 | if image.dtype != tf.float32:
266 | image = tf.image.convert_image_dtype(image, dtype=tf.float32)
267 |
268 | # Crop the central region of the image with an area containing 87.5% of
269 | # the original image.
270 | if central_fraction:
271 | image = tf.image.central_crop(image, central_fraction=central_fraction)
272 |
273 | #image = tf.py_func(sensor_model.sensor_model, [image], tf.float32, stateful=True)
274 | if height and width:
275 | # Resize the image to the specified height and width.
276 | image = tf.expand_dims(image, 0)
277 | image = tf.image.resize_images(image, [height, width], method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
278 |
279 | image = tf.squeeze(image, [0])
280 |
281 | image.set_shape([height, width, 3])
282 | return image
283 |
284 | def preprocess_image(image, ground_truth, height, width,
285 | is_training=False,
286 | bbox=None,
287 | fast_mode=True,
288 | light_level=None):
289 | """Pre-process one image for training or evaluation.
290 |
291 | Args:
292 | image: 3-D Tensor [height, width, channels] with the image.
293 | height: integer, image expected height.
294 | width: integer, image expected width.
295 |     is_training: Boolean. If true it would transform an image for training,
296 | otherwise it would transform it for evaluation.
297 | bbox: 3-D float Tensor of bounding boxes arranged [1, num_boxes, coords]
298 | where each coordinate is [0, 1) and the coordinates are arranged as
299 | [ymin, xmin, ymax, xmax].
300 | fast_mode: Optional boolean, if True avoids slower transformations.
301 |
302 | Returns:
303 | 3-D float Tensor containing an appropriately scaled image
304 |
305 | Raises:
306 | ValueError: if user does not provide bounding box
307 | """
308 | if is_training:
309 | return preprocess_for_train(image, height, width, bbox, fast_mode, light_level)
310 | else:
311 | return preprocess_for_eval(image, height, width, light_level)
312 |
--------------------------------------------------------------------------------
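In this file, `apply_with_random_selector` and `distort_color` are carried over from the Inception preprocessing but are not exercised on the active code path (the random resize-method selector is commented out, and training uses nearest-neighbor subsampling followed by a random horizontal flip). A sketch of how the two helpers are conventionally combined; this exact call does not appear in the file and is shown only for illustration:

```python
import tensorflow as tf  # TF 1.x, as used throughout the repository
from preprocessing.writeout_preprocessing import (apply_with_random_selector,
                                                  distort_color)

image = tf.placeholder(tf.float32, shape=(None, None, 3))  # RGB in [0, 1]

# Route the image through exactly one of the four color-distortion orderings,
# chosen at random each time the graph is run.
distorted = apply_with_random_selector(
    image,
    lambda x, ordering: distort_color(x, ordering, fast_mode=False),
    num_cases=4)
```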
/run_test_captured_images.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Set the checkpoint and dataset paths
4 | checkpoints=/path/to/checkpoints
5 | dataset_dir=/path/to/dataset/RAW_synset_ISO8000_EXP10000/
6 |
7 | # Change the --eval_dir parameter if needed
8 | # Proposed Joint Architecture
9 | python test_captured_images.py --device=1 --dataset_dir=$dataset_dir --dataset_name=imagenet \
10 | --checkpoint_path=$checkpoints/joint128/2to200lux/model.ckpt-232721 \
11 | --model_name=mobilenet_isp --noise_channel=True --use_anscombe=True \
12 | --isp_model_name=isp --eval_image_size=224 --sensor=Pixel --eval_dir joint_real_2to200lux
13 |
14 | # Proposed Joint Architecture (no Anscombe layers)
15 | python test_captured_images.py --device=1 --dataset_dir=$dataset_dir --dataset_name=imagenet \
16 | --checkpoint_path=$checkpoints/joint128/2to200lux_no_ansc/model.ckpt-215307 \
17 | --model_name=mobilenet_isp --noise_channel=False --use_anscombe=False \
18 | --isp_model_name=isp --eval_image_size=224 --sensor=Pixel --eval_dir joint_no_anscombe_real_2to200lux
19 |
20 | # From-scratch MobileNet-v1
21 | python test_captured_images.py --device=1 --dataset_dir=$dataset_dir --dataset_name=imagenet \
22 | --checkpoint_path=$checkpoints/mobilenet_v1_128/2to200lux/model.ckpt-325357 \
23 | --model_name=mobilenet_v1 --eval_image_size=224 --preprocessing_name=mobilenet_isp \
24 | --eval_dir mobilenet_v1_real_2to200lux
--------------------------------------------------------------------------------
/run_test_synthetic_images.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | checkpoints_dir=/path/to/checkpoints
4 | dataset_dir=/path/to/imagenet_validation
5 | eval_dir=/path/to/output_dir
6 |
7 | noise=3lux
8 | python test_synthetic_images.py --device=1 \
9 | --checkpoint_path=$checkpoints_dir'/joint128/'$noise'/model.ckpt-216759' \
10 | --dataset_dir=$dataset_dir --dataset_name=imagenet --mode=$noise \
11 | --model_name=mobilenet_isp --eval_dir=$eval_dir/$noise
12 |
13 | noise=6lux
14 | python test_synthetic_images.py --device=1 \
15 | --checkpoint_path=$checkpoints_dir'/joint128/'$noise'/model.ckpt-222267' \
16 | --dataset_dir=$dataset_dir --dataset_name=imagenet --mode=$noise \
17 | --model_name=mobilenet_isp --eval_dir=$eval_dir/$noise
18 |
19 | noise=2to20lux
20 | python test_synthetic_images.py --device=1 \
21 | --checkpoint_path=$checkpoints_dir'/joint128/'$noise'/model.ckpt-232718' \
22 | --dataset_dir=$dataset_dir --dataset_name=imagenet --mode=$noise \
23 | --model_name=mobilenet_isp --eval_dir=$eval_dir/$noise
24 |
25 | noise=2to200lux
26 | python test_synthetic_images.py --device=1 \
27 | --checkpoint_path=$checkpoints_dir'/joint128/'$noise'/model.ckpt-232721' \
28 | --dataset_dir=$dataset_dir --dataset_name=imagenet --mode=$noise \
29 | --model_name=mobilenet_isp --eval_dir=$eval_dir/$noise
30 |
31 |
--------------------------------------------------------------------------------
/run_train_joint_models.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | TRAIN_DIR=/path/to/train_dir
4 | IMAGENET_TFRECORDS=/path/to/imagenetTFRecords
5 | CHECKPOINTS=/path/to/checkpoints
6 |
7 | # Train with 3lux noisy images
8 | # Set number of clones and device according to machine resources
9 | python train_image_classifier.py --train_dir=$TRAIN_DIR/3lux \
10 | --dataset_dir=$IMAGENET_TFRECORDS --ll_low=0.0015 \
11 | --ll_high=0.0015 --batch_size=256 --model_name=mobilenet_isp --num_readers=8 \
12 | --num_preprocessing_threads=8 --isp_checkpoint_path=$CHECKPOINTS/multires128/6lux/model.ckpt-27000 \
13 | --checkpoint_path=$CHECKPOINTS/mobilenet_v1_128/mobilenet_v1_1.0_128.ckpt --noise_channel=True \
14 | --use_anscombe=True --num_clones=4 --isp_model_name=isp --num_iters=1 --device=0,1,2,3 \
15 | --learning_rate=0.00045 --num_epochs_per_decay=2 --train_image_size=128
16 |
17 |
18 | # Train with 6lux noisy images
19 | python train_image_classifier.py --train_dir=$TRAIN_DIR/6lux \
20 | --dataset_dir=$IMAGENET_TFRECORDS --ll_low=0.003 \
21 | --ll_high=0.003 --batch_size=256 --model_name=mobilenet_isp --num_readers=8 \
22 | --num_preprocessing_threads=8 --isp_checkpoint_path=$CHECKPOINTS/multires128/6lux/model.ckpt-27000 \
23 | --checkpoint_path=$CHECKPOINTS/mobilenet_v1_128/mobilenet_v1_1.0_128.ckpt --noise_channel=True \
24 | --use_anscombe=True --num_clones=4 --isp_model_name=isp --num_iters=1 --device=0,1,3,4 \
25 | --learning_rate=0.00045 --num_epochs_per_decay=2 --train_image_size=128
26 |
27 |
28 | # Train with 2to20lux noisy images
29 | python train_image_classifier.py --train_dir=$TRAIN_DIR/2to20lux \
30 | --dataset_dir=$IMAGENET_TFRECORDS --ll_low=0.001 \
31 | --ll_high=0.010 --batch_size=256 --model_name=mobilenet_isp --num_readers=8 \
32 | --num_preprocessing_threads=8 --isp_checkpoint_path=$CHECKPOINTS/multires128/6lux/model.ckpt-27000 \
33 | --checkpoint_path=$CHECKPOINTS/mobilenet_v1_128/mobilenet_v1_1.0_128.ckpt --noise_channel=True \
34 | --use_anscombe=True --num_clones=4 --isp_model_name=isp --num_iters=1 --device=0,1,2,3 \
35 | --learning_rate=0.00045 --num_epochs_per_decay=2 --train_image_size=128
36 |
37 | # Train with 2to200lux noisy images
38 | python train_image_classifier.py --train_dir=$TRAIN_DIR/2to200lux \
39 | --dataset_dir=$IMAGENET_TFRECORDS --ll_low=0.001 \
40 | --ll_high=0.100 --batch_size=256 --model_name=mobilenet_isp --num_readers=8 \
41 | --num_preprocessing_threads=8 --isp_checkpoint_path=$CHECKPOINTS/multires128/6lux/model.ckpt-27000 \
42 | --checkpoint_path=$CHECKPOINTS/mobilenet_v1_128/mobilenet_v1_1.0_128.ckpt --noise_channel=True \
43 | --use_anscombe=True --num_clones=4 --isp_model_name=isp --num_iters=1 --device=0,1,3,4 \
44 | --learning_rate=0.00045 --num_epochs_per_decay=2 --train_image_size=128
--------------------------------------------------------------------------------
/simulate_raw_images.py:
--------------------------------------------------------------------------------
1 | """Script for adding noise to ImageNet-like dataset."""
2 |
3 | from __future__ import absolute_import
4 | from __future__ import division
5 | from __future__ import print_function
6 |
7 | import math
8 | import tensorflow as tf
9 | import os
10 | import cv2
11 | from datasets import dataset_factory, build_imagenet_data
12 | import numpy as np
13 | from preprocessing import preprocessing_factory, sensor_model
14 | from pprint import pprint
15 | from glob import glob
16 |
17 |
18 | tf.app.flags.DEFINE_float(
19 | 'll_low', None,
20 | 'Lowest light level.')
21 |
22 | tf.app.flags.DEFINE_float(
23 | 'll_high', None,
24 | 'Highest light level.')
25 |
26 | tf.app.flags.DEFINE_string(
27 | 'sensor', 'Nexus_6P_rear', 'The sensor.')
28 |
29 | tf.app.flags.DEFINE_string(
30 | 'output_dir', None, 'Directory where the results are saved to.')
31 |
32 | tf.app.flags.DEFINE_string(
33 | 'input_dir', None, 'The directory where the dataset files are stored.')
34 |
35 | tf.app.flags.DEFINE_string(
36 |     'preprocessing_name', 'mobilenet_v1', 'The name of the preprocessing to use.')
37 |
38 | tf.app.flags.DEFINE_integer(
39 | 'eval_image_size', 128, 'Eval image size')
40 |
41 | FLAGS = tf.app.flags.FLAGS
42 |
43 | def main(_):
44 | if not FLAGS.input_dir:
45 | raise ValueError('You must supply the input directory with --input_dir')
46 | if not FLAGS.output_dir:
47 |         raise ValueError('You must supply the output directory with --output_dir')
48 |
49 | tf.logging.set_verbosity(tf.logging.INFO)
50 | with tf.Graph().as_default():
51 |
52 | # Preprocess the images so that they all have the same size
53 |         preprocessing_name = FLAGS.preprocessing_name
54 | image_preprocessing_fn = preprocessing_factory.get_preprocessing(
55 | preprocessing_name,
56 | is_training=False)
57 |
58 | eval_image_size = FLAGS.eval_image_size
59 | orig_image = tf.placeholder(tf.uint8, shape=(None, None, 3))
60 | image = image_preprocessing_fn(orig_image, orig_image, eval_image_size, eval_image_size)
61 | images = tf.expand_dims(image, 0)
62 |
63 | # Add noise.
64 | noisy_batch, alpha, sigma = sensor_model.sensor_noise_rand_light_level(images,
65 | [FLAGS.ll_low, FLAGS.ll_high],
66 | scale=1.0, sensor=FLAGS.sensor)
67 |
68 | bayer_mask = sensor_model.get_bayer_mask(eval_image_size, eval_image_size)
69 | inputs = noisy_batch*bayer_mask
70 |
71 | if not os.path.isdir(FLAGS.output_dir):
72 | os.mkdir(FLAGS.output_dir)
73 |
74 | with tf.Session() as sess:
75 | count = 0
76 |             synsets = [path for path in os.listdir(FLAGS.input_dir) if '.' not in path]
77 |
78 | for synset in synsets:
79 | path = os.path.join(FLAGS.input_dir, synset)
80 | image_names = os.listdir(path)
81 | print("Found %d images in %s"%(len(image_names), synset))
82 |
83 | synset_path = os.path.join(FLAGS.output_dir, synset)
84 | if not os.path.isdir(synset_path):
85 | os.mkdir(synset_path)
86 |
87 | for imagename in image_names:
88 | output_imgfn = os.path.join(FLAGS.output_dir, synset, imagename.split('.')[0]+'.png')
89 | if os.path.isfile(output_imgfn):
90 | continue
91 | loaded_image = cv2.imread(os.path.join(path, imagename))
92 |
93 | # BGR to RGB
94 | loaded_image = loaded_image[..., ::-1]
95 | images, alpha_val, sigma_val = sess.run(
96 | [inputs, alpha, sigma],
97 | feed_dict={orig_image:loaded_image})
98 | img = (255.0*images[0,:,:,:]).astype(np.uint8)
99 |
100 | # RGB to BGR
101 | img = img[..., ::-1]
102 |
103 | if count % 1000 == 0:
104 | print("%d processed images." % (count))
105 | cv2.imwrite(output_imgfn, img)
106 | count += 1
107 |
108 | print('Total images processed:', count)
109 |
110 |
111 | if __name__ == '__main__':
112 | tf.app.run()
113 |
--------------------------------------------------------------------------------
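`simulate_raw_images.py` mosaics the noisy batch by multiplying it with `sensor_model.get_bayer_mask`, whose definition is not part of this excerpt. From the channel layout used in `test_captured_images.py` (blue at even rows/columns, red at odd rows/columns, green elsewhere), a mask of that form could look like the sketch below; the actual implementation may differ.

```python
import numpy as np

def bayer_mask_sketch(height, width):
    """BGGR-style Bayer mask: one nonzero color channel per pixel site.
    Layout inferred from test_captured_images.py; get_bayer_mask itself
    is not shown in this excerpt and may differ."""
    mask = np.zeros((height, width, 3), dtype=np.float32)
    mask[1::2, 1::2, 0] = 1.0  # R at odd rows / odd columns
    mask[1::2, 0::2, 1] = 1.0  # G1
    mask[0::2, 1::2, 1] = 1.0  # G2
    mask[0::2, 0::2, 2] = 1.0  # B at even rows / even columns
    return mask

# Multiplying an RGB image by the mask keeps a single color sample per pixel,
# which is how the script turns the noisy RGB batch into a mosaiced raw image.
```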
/teaser/architecture_2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/princeton-computational-imaging/DirtyPixels/6c82b124c9e32bbf5fa7d6adf8db8103132e4e5e/teaser/architecture_2.jpg
--------------------------------------------------------------------------------
/teaser/teaser_v4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/princeton-computational-imaging/DirtyPixels/6c82b124c9e32bbf5fa7d6adf8db8103132e4e5e/teaser/teaser_v4.png
--------------------------------------------------------------------------------
/test_captured_images.py:
--------------------------------------------------------------------------------
1 | """Generic evaluation script that evaluates a model using the Dirty-Pixels captured dataset."""
2 |
3 | from __future__ import absolute_import
4 | from __future__ import division
5 | from __future__ import print_function
6 |
7 | import math
8 | import skimage.measure
9 | import scipy.ndimage.filters
10 | import tensorflow as tf
11 | import scipy.io
12 | import os
13 | import cv2
14 | from datasets import dataset_factory, build_imagenet_data
15 | import numpy as np
16 | from nets import nets_factory
17 | from preprocessing import preprocessing_factory, sensor_model
18 | import matplotlib.pyplot as plt
19 | from pprint import pprint
20 | from nets.isp import anscombe
21 | import rawpy
22 | import pyexifinfo
23 |
24 | slim = tf.contrib.slim
25 |
26 | tf.app.flags.DEFINE_string(
27 | 'device', '0', 'GPU device to use.')
28 |
29 | tf.app.flags.DEFINE_string(
30 | 'sensor', 'Nexus_6P_rear', 'The sensor.')
31 |
32 | tf.app.flags.DEFINE_string(
33 | 'isp_model_name', None, 'The name of the ISP architecture to train.')
34 |
35 | tf.app.flags.DEFINE_boolean('use_anscombe', True,
36 | 'Use Anscombe transform.')
37 |
38 | tf.app.flags.DEFINE_boolean('noise_channel', True,
39 | 'Use noise channel.')
40 |
41 | tf.app.flags.DEFINE_integer(
42 | 'num_iters', 1,
43 | 'Number of iterations for the unrolled Proximal Gradient Network.')
44 |
45 | tf.app.flags.DEFINE_integer(
46 | 'num_layers', 17, 'Number of layers to be used in the HQS ISP prior -- DEPRECATED')
47 |
48 | tf.app.flags.DEFINE_string(
49 | 'checkpoint_path', '/tmp/tfmodel/',
50 | 'The directory where the model was written to or an absolute path to a '
51 | 'checkpoint file.')
52 |
53 | tf.app.flags.DEFINE_string(
54 | 'eval_dir', '/tmp/tfmodel/', 'Directory where the results are saved to.')
55 |
56 | tf.app.flags.DEFINE_string(
57 | 'dataset_dir', None, 'The directory where the dataset files are stored.')
58 |
59 | tf.app.flags.DEFINE_string(
60 | 'model_name', 'inception_v3', 'The name of the architecture to evaluate.')
61 |
62 | tf.app.flags.DEFINE_string(
63 | 'preprocessing_name', None, 'The name of the preprocessing to use. If left '
64 | 'as `None`, then the model_name flag is used.')
65 |
66 | tf.app.flags.DEFINE_integer(
67 | 'eval_image_size', None, 'Eval image size')
68 |
69 |
70 | FLAGS = tf.app.flags.FLAGS
71 |
72 |
73 | def crop_and_subsample(img, target_size, average=None):
74 | factor = int(np.floor(min(img.shape) / target_size))
75 | ch = (img.shape[0] - factor * target_size) / 2
76 | cw = (img.shape[1] - factor * target_size) / 2
77 | cropped = img[int(np.floor(ch)):-int(np.ceil(ch)),
78 | int(np.floor(cw)):-int(np.ceil(cw))]
79 | if average is not None:
80 | cropped = scipy.ndimage.filters.convolve(cropped, np.ones((average, average)))
81 | return cropped[::factor, ::factor]
82 |
83 |
84 | def main(_):
85 | if not FLAGS.dataset_dir:
86 | raise ValueError('You must supply the dataset directory with --dataset_dir')
87 |
88 | os.environ['CUDA_VISIBLE_DEVICES'] = FLAGS.device
89 |
90 | tf.logging.set_verbosity(tf.logging.INFO)
91 | with tf.Graph().as_default():
92 |
93 | ####################
94 | # Select the model #
95 | ####################
96 | num_classes = 1001
97 | network_fn = nets_factory.get_network_fn(
98 | FLAGS.model_name,
99 | num_classes,
100 | weight_decay=0.0,
101 | batch_norm_decay=0.95,
102 | is_training=False)
103 |
104 | #####################################
105 | # Select the preprocessing function #
106 | #####################################
107 | preprocessing_name = FLAGS.preprocessing_name or FLAGS.model_name
108 | image_preprocessing_fn = preprocessing_factory.get_preprocessing(
109 | preprocessing_name,
110 | is_training=False)
111 |
112 | eval_image_size = FLAGS.eval_image_size or network_fn.default_image_size
113 |
114 | orig_image = tf.placeholder(tf.float32, shape=(eval_image_size, eval_image_size, 3))
115 | alpha = tf.placeholder(tf.float32, shape=[1, 3])
116 | sigma = tf.placeholder(tf.float32, shape=[1, 3])
117 | bayer_mask = sensor_model.get_bayer_mask(eval_image_size, eval_image_size)
118 | # image = image_preprocessing_fn(orig_image, orig_image, eval_image_size, eval_image_size, sensor=FLAGS.sensor)
119 | image = orig_image * bayer_mask
120 | # alpha, sigma = sensor_model.get_coeffs(light_level[None], sensor=FLAGS.sensor)
121 | # Scale to [-1, 1]
122 | if FLAGS.isp_model_name is None:
123 | image = 2 * (image - 0.5)
124 |
125 | images = tf.expand_dims(image, 0)
126 |
127 | ####################
128 | # Define the model #
129 | ####################
130 | inputs = images
131 |
132 | network_ops = network_fn(images=inputs, alpha=alpha, sigma=sigma,
133 | bayer_mask=bayer_mask, use_anscombe=FLAGS.use_anscombe,
134 | noise_channel=FLAGS.noise_channel,
135 | num_classes=num_classes,
136 | num_iters=FLAGS.num_iters, num_layers=FLAGS.num_layers,
137 | isp_model_name=FLAGS.isp_model_name, is_real_data=True)
138 | logits, end_points = network_ops[:2]
139 |
140 | variables_to_restore = slim.get_variables_to_restore()
141 | saver = tf.train.Saver()
142 |
143 | if tf.gfile.IsDirectory(FLAGS.checkpoint_path):
144 | checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
145 | else:
146 | checkpoint_path = FLAGS.checkpoint_path
147 |
148 | synset2label = {}
149 | with open("datasets/synset_labels.txt", "r") as f:
150 | for line in f:
151 | synset, label = line.split(':')
152 | synset2label[synset] = int(label)
153 |
154 | if not os.path.isdir(FLAGS.eval_dir):
155 | os.mkdir(FLAGS.eval_dir)
156 |
157 | with tf.Session() as sess:
158 |             saver.restore(sess, checkpoint_path)
159 | synsets = os.listdir(FLAGS.dataset_dir)
160 | number_to_human = {int(i[0]):i[1] for i in np.genfromtxt('datasets/imagenet_labels.txt', delimiter=':', dtype=np.string_)}
161 |
162 |             # Estimated noise parameters alpha and sigma for the captured data
163 | alpha_val = 0.0153
164 | sigma_val = 0.0328
165 | count = 0
166 | top1 = 0
167 | top5 = 0
168 | correct_paths = []
169 | wrong_paths = []
170 | for synset in synsets:
171 | if synset == 'labels.txt':
172 | continue
173 | synset_top5 = 0
174 | path = os.path.join(FLAGS.dataset_dir, synset)
175 | image_names = [name for name in sorted(os.listdir(path)) if '.dng' in name]
176 | for imagename in image_names:
177 | try:
178 | loaded_image = rawpy.imread(os.path.join(path, imagename))
179 | info = pyexifinfo.get_json(os.path.join(path, imagename))[0]
180 | black_level = float(info['EXIF:BlackLevel'].split(' ')[0])
181 | awb = [float(x) for x in info['EXIF:AsShotNeutral'].split(' ')]
182 | raw_img = (loaded_image.raw_image_visible - black_level) / 1023.
183 | except Exception as e:
184 | print(synset, imagename, e)
185 | continue
186 |
187 | B = raw_img[::2, ::2] / awb[2]
188 | R = raw_img[1::2, 1::2] / awb[0]
189 | G1 = raw_img[1::2, ::2] / awb[1]
190 | G2 = raw_img[::2, 1::2] / awb[1]
191 | B, R, G1, G2 = (crop_and_subsample(img, eval_image_size // 2)
192 | for img in [B, R, G1, G2])
193 | scale_factor = 1.0 / np.percentile(np.stack([B, R, G1, G2], axis=2), 98)
194 |
195 | mosaiced = np.zeros((224, 224, 3))
196 | mosaiced[::2, ::2, 2] = B
197 | mosaiced[1::2, 1::2, 0] = R
198 | mosaiced[1::2, ::2, 1] = G1
199 | mosaiced[::2, 1::2, 1] = G2
200 |
201 | img_scaled = mosaiced * scale_factor
202 | input_img = np.clip(img_scaled, 0, 1)
203 | scaling = (scale_factor / np.array(awb))[None, :]
204 | logits_vals, clean_image = sess.run(
205 | [logits[0, :], end_points.get('mobilenet_input', alpha)],
206 | feed_dict={orig_image: input_img,
207 | alpha: alpha_val * scaling,
208 | sigma: sigma_val * scaling})
209 | correct = synset2label[synset]
210 | predictions = np.argsort(-logits_vals)
211 | rank = np.nonzero(predictions == correct)[0]
212 | clean_image = clean_image.squeeze()
213 |
214 | if count % 100 == 0:
215 | print("%d images out of 1000" % (count))
216 |
217 | trgt_path = os.path.join(FLAGS.eval_dir, 'clean', synset)
218 | raw_path = os.path.join(FLAGS.eval_dir, 'raw', synset)
219 |
220 | if not os.path.exists(raw_path):
221 | os.makedirs(raw_path)
222 |
223 | if not os.path.exists(trgt_path):
224 | os.makedirs(trgt_path)
225 | cv2.imwrite(os.path.join(raw_path, imagename[:-4]+'.png'), (input_img*255).astype(np.uint8))
226 | if FLAGS.isp_model_name == 'isp':
227 | trgt_path = os.path.join(trgt_path, imagename[:-4]+'.png')
228 | plt.imsave(trgt_path, clean_image)
229 |
230 | if rank == 0:
231 | correct_paths.append("%s \"%s\" \"%s\""%(os.path.join(trgt_path, imagename[:-4]+'.png'), number_to_human[correct], number_to_human[predictions[0]]))
232 | top1 += 1.0
233 | else:
234 | wrong_paths.append("%s \"%s\" \"%s\""%(os.path.join(trgt_path, imagename[:-4]+'.png'), number_to_human[correct], number_to_human[predictions[0]]))
235 |
236 |                     if rank < 5:
237 | top5 += 1.0
238 | synset_top5 += 1.0
239 | count += 1
240 |
241 | print("Synset %s, Top 5 %f" % (synset, synset_top5 / len(image_names)))
242 |
243 | print("Top-1 %f, Top-5 %f" % (top1 / count, top5 / count))
244 |
245 | with open(os.path.join(FLAGS.eval_dir, 'correct.txt'), 'w') as f:
246 | for item in correct_paths:
247 | f.write("%s\n" % item)
248 | with open(os.path.join(FLAGS.eval_dir, 'wrong.txt'), 'w') as f:
249 | for item in wrong_paths:
250 | f.write("%s\n" % item)
251 |
252 |
253 | if __name__ == '__main__':
254 | tf.app.run()
255 |
256 |
257 |
258 |
--------------------------------------------------------------------------------
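Accuracy in `test_captured_images.py` is computed from the rank of the ground-truth label in the logits sorted in descending order: rank 0 is a top-1 hit and a rank below 5 is a top-5 hit. A compact NumPy restatement of that bookkeeping; the logits below are hypothetical:

```python
import numpy as np

def label_rank(logits, correct_label):
    """Rank of the correct label among the descending-sorted logits;
    0 means the top prediction is correct."""
    predictions = np.argsort(-logits)
    return int(np.nonzero(predictions == correct_label)[0][0])

# Hypothetical logits over six classes, ground-truth class 2:
logits = np.array([0.1, 0.3, 2.0, -1.0, 0.5, 0.0])
rank = label_rank(logits, 2)
print(rank == 0, rank < 5)  # top-1 hit, top-5 hit -> True True
```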
/test_synthetic_images.py:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 |
16 |
17 | # This file evaluates a trained network on a test dataset and saves the filenames of images
18 | # that were correctly / falsely classified into a text file, so that the images that different
19 | # classifiers got right / wrong can be compared.
20 |
21 | """Generic evaluation script that evaluates a model on a given dataset.
22 | Noise is introduced before the images are fed to the classifier; it is determined by
23 | the --mode parameter and the camera image formation
24 | model described in the Dirty-Pixels manuscript.
25 | """
26 |
27 | from __future__ import absolute_import
28 | from __future__ import division
29 | from __future__ import print_function
30 |
31 | import math
32 | import tensorflow as tf
33 | import os
34 | from glob import glob
35 |
36 | import cv2
37 |
38 | from preprocessing import preprocessing_factory, sensor_model
39 | from datasets import dataset_factory
40 | from nets import nets_factory
41 | import numpy as np
42 |
43 | slim = tf.contrib.slim
44 |
45 | tf.app.flags.DEFINE_string(
46 |     'device', '', 'GPU device to use.')
47 |
48 | tf.app.flags.DEFINE_string(
49 | 'mode', '3lux', 'Noise profile: 3lux, 6lux, 2to20lux, or 2to200lux.')
50 |
51 |
52 | tf.app.flags.DEFINE_string(
53 | 'checkpoint_path', '/tmp/tfmodel/',
54 | 'The directory where the model was written to or an absolute path to a '
55 | 'checkpoint file.')
56 |
57 | tf.app.flags.DEFINE_string(
58 | 'dataset_name', 'imagenet', 'The name of the dataset to load.')
59 |
60 | tf.app.flags.DEFINE_string(
61 | 'dataset_dir', None, 'The directory where the dataset files are stored.')
62 |
63 | tf.app.flags.DEFINE_string(
64 | 'model_name', None, 'The name of the architecture to evaluate.')
65 |
66 | tf.app.flags.DEFINE_string('eval_dir', 'output_synthetic_images', 'Output directory')
67 |
68 | FLAGS = tf.app.flags.FLAGS
69 |
70 |
71 | def imnet_generator(root_directory):
72 | # list all directories
73 | dirs = sorted(glob(os.path.join(root_directory, "*/")))
74 | print("#### num dirs", len(dirs))
75 |
76 | # Build the label lookup table
77 | synset_to_label = {synset.decode('utf-8'):i+1 for i, synset in enumerate(np.genfromtxt('datasets/imagenet_lsvrc_2015_synsets.txt', dtype=np.string_))}
78 | # print(synset_to_label.items())
79 |
80 | # loop through directories and glob all images
81 | for idx, dir in enumerate(dirs):
82 | # Glob all image files in this directory
83 | img_files = glob(os.path.join(dir, '*.png'))
84 | img_files += glob(os.path.join(dir, '*.jpg'))
85 | img_files += glob(os.path.join(dir, '*.jpeg'))
86 | img_files += glob(os.path.join(dir, '*.JPEG'))
87 |
88 | for img_file in img_files:
89 | yield img_file, synset_to_label[os.path.basename(os.path.normpath(dir))], os.path.basename(os.path.normpath(dir))
90 |
91 | def parse_img(img_path):
92 | rgb_string = tf.read_file(img_path)
93 | rgb_decoded = tf.image.decode_jpeg(rgb_string) # uint8
94 | rgb_decoded = tf.cast(rgb_decoded, tf.float32)
95 | rgb_decoded /= 255.
96 | return rgb_decoded
97 |
98 | def main(_):
99 | if not FLAGS.dataset_dir:
100 | raise ValueError('You must supply the dataset directory with --dataset_dir')
101 |
102 | os.environ['CUDA_VISIBLE_DEVICES'] = FLAGS.device
103 | eval_dir = FLAGS.eval_dir
104 |
105 | tf.logging.set_verbosity(tf.logging.INFO)
106 | with tf.Graph().as_default():
107 | tf_global_step = slim.get_or_create_global_step()
108 |
109 | num_classes = 1001
110 | eval_image_size = 128
111 |
112 | image_path_graph = tf.placeholder(tf.string)
113 | label_graph = tf.placeholder(tf.int32)
114 |
115 | image = parse_img(image_path_graph)
116 |
117 | image = tf.image.central_crop(image, central_fraction=0.875)
118 |
119 | image = tf.expand_dims(image, 0)
120 | image = tf.image.resize_images(image, [eval_image_size, eval_image_size], method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
121 | image = tf.squeeze(image, [0])
122 |
123 | ####################
124 | # Select the model #
125 | ####################
126 | network_fn = nets_factory.get_network_fn(
127 | FLAGS.model_name,
128 | num_classes=num_classes,
129 | batch_norm_decay=0.9,
130 | weight_decay=0.0,
131 | is_training=False)
132 |
133 | image.set_shape([128,128,3])
134 | image = tf.expand_dims(image, 0)
135 |
136 | if FLAGS.mode == '2to20lux':
137 | ll_low = 0.001
138 | ll_high = 0.01
139 | elif FLAGS.mode == '2to200lux':
140 | ll_low = 0.001
141 | ll_high = 0.1
142 | elif FLAGS.mode == '3lux':
143 | ll_low = 0.0015
144 | ll_high = 0.0015
145 | elif FLAGS.mode == '6lux':
146 | ll_low = 0.003
147 | ll_high = 0.003
148 |
149 | noisy_batch, alpha, sigma = \
150 | sensor_model.sensor_noise_rand_light_level(image, [ll_low, ll_high], scale=1.0, sensor='Nexus_6P_rear')
151 | bayer_mask = sensor_model.get_bayer_mask(128, 128)
152 |
153 | raw_image_graph = noisy_batch * bayer_mask
154 |
155 | ####################
156 | # Define the model #
157 | ####################
158 | logits, end_points, cleaned_image_graph = network_fn(images=raw_image_graph, alpha=alpha, sigma=sigma,
159 | bayer_mask=bayer_mask, use_anscombe=True,
160 | noise_channel=True,
161 | num_classes=num_classes,
162 | num_iters=1, num_layers=17,
163 | isp_model_name='isp')
164 |
165 | predictions = tf.argmax(logits, 1)
166 |
167 | if tf.gfile.IsDirectory(FLAGS.checkpoint_path):
168 | print('###### Loading last checkpoint of directory', FLAGS.checkpoint_path)
169 | checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
170 | else:
171 | print('###### Loading checkpoint', FLAGS.checkpoint_path)
172 | checkpoint_path = FLAGS.checkpoint_path
173 |
174 |
175 | tf.logging.info('Evaluating %s' % FLAGS.checkpoint_path)
176 |
177 | correct_paths = []
178 | wrong_paths = []
179 |
180 | # Restore variables from checkpoint
181 | variables_to_restore = slim.get_variables_to_restore() # slim.get_model_variables()
182 | saver = tf.train.Saver(variables_to_restore)
183 |
184 | number_to_human = {int(i[0]):i[1] for i in np.genfromtxt('datasets/imagenet_labels.txt', delimiter=':', dtype=np.string_)}
185 |
186 |     eval_dir = FLAGS.eval_dir
187 | os.makedirs(eval_dir, exist_ok=True)
188 |
189 | with tf.Session() as sess:
190 | sess.run(tf.global_variables_initializer())
191 | saver.restore(sess, checkpoint_path)
192 |
193 | count = 0
194 | for img_file, label, synset in imnet_generator(FLAGS.dataset_dir):
195 | preds_value, cleaned_image, raw_image = sess.run([predictions, cleaned_image_graph, raw_image_graph],
196 | feed_dict={image_path_graph:img_file, label_graph:label})
197 |
198 | cleaned_image = np.clip(cleaned_image, 0.0, 1.0).squeeze()[:,:,::-1]
199 | raw_image = raw_image.squeeze()[:,:,::-1]
200 | img_filename = os.path.basename(os.path.normpath(img_file))
201 |
202 | our_path = os.path.join(eval_dir, 'anscombe_output', FLAGS.mode, synset)
203 | raw_path = os.path.join(eval_dir, 'raw', FLAGS.mode, synset)
204 |
205 | if not os.path.exists(our_path):
206 | os.makedirs(our_path)
207 | if not os.path.exists(raw_path):
208 | os.makedirs(raw_path)
209 |
210 | if count % 10000 == 0:
211 | print('num. processed ', count)
212 | print('num. correct paths', len(correct_paths))
213 | count += 1
214 | img_filename = os.path.splitext(img_filename)[0] + '.png'
215 |
216 | cv2.imwrite(os.path.join(our_path, img_filename), (cleaned_image*255).astype(np.uint8))
217 | cv2.imwrite(os.path.join(raw_path, img_filename), (raw_image*255).astype(np.uint8))
218 |
219 | if preds_value.squeeze() == label:
220 | correct_paths.append("%s \"%s\" \"%s\""%(os.path.join(our_path, img_filename), number_to_human[label], number_to_human[preds_value[0]]))
221 | else:
222 | wrong_paths.append("%s \"%s\" \"%s\""%(os.path.join(our_path, img_filename), number_to_human[label], number_to_human[preds_value[0]]))
223 |
224 | print('Top-1 accuracy', float(len(correct_paths))/float(len(wrong_paths)+len(correct_paths)))
225 | correct_paths_fn = os.path.join(eval_dir, FLAGS.mode + '_correct.txt')
226 | with open(correct_paths_fn, 'w') as f:
227 | for item in correct_paths:
228 | f.write("%s\n" % item)
229 | wrong_paths_fn = os.path.join(eval_dir, FLAGS.mode + '_wrong.txt')
230 | with open(wrong_paths_fn, 'w') as f:
231 | for item in wrong_paths:
232 | f.write("%s\n" % item)
233 |
234 | if __name__ == '__main__':
235 | tf.app.run()
236 |
--------------------------------------------------------------------------------
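For reference, the light-level ranges behind the `--mode` values accepted by `test_synthetic_images.py`, collected into a single lookup table; this is a convenience sketch with values taken from the script and the run scripts above, not a file in the repository:

```python
# mode -> (ll_low, ll_high)
NOISE_MODES = {
    '3lux':      (0.0015, 0.0015),
    '6lux':      (0.003,  0.003),
    '2to20lux':  (0.001,  0.010),
    '2to200lux': (0.001,  0.100),
}

ll_low, ll_high = NOISE_MODES['2to200lux']
```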