├── LICENSE.txt
├── README.md
├── img
│   └── metfaces-teaser.png
└── metfaces.py
/LICENSE.txt:
--------------------------------------------------------------------------------
1 | MetFaces is an image dataset of human faces extracted from works of art.
2 |
3 | The source images are made available under the Creative Commons Zero (CC0)
4 | (https://creativecommons.org/publicdomain/zero/1.0/) license by the
5 | Metropolitan Museum of Art (https://www.metmuseum.org/).
6 | For more information about their Open Access policy, please refer to
7 | https://www.metmuseum.org/about-the-met/policies-and-documents/image-resources.
8 |
9 | The dataset itself (including JSON metadata, processed images, and documentation) is
10 | made available under Creative Commons BY-NC 2.0 (https://creativecommons.org/licenses/by-nc/2.0/)
11 | license by NVIDIA Corporation. You can use, redistribute, and adapt it
12 | for non-commercial purposes, as long as you (a) give appropriate credit by
13 | citing our paper, and (b) indicate any changes that you've made.
14 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## MetFaces Dataset
2 |
3 | 
4 |
5 | MetFaces is an image dataset of human faces extracted from works of art, originally created as part of our work on:
6 |
7 | > **Training Generative Adversarial Networks with Limited Data**
8 | > Tero Karras (NVIDIA), Miika Aittala (NVIDIA), Janne Hellsten (NVIDIA), Samuli Laine (NVIDIA), Jaakko Lehtinen (NVIDIA and Aalto University), Timo Aila (NVIDIA)
9 | > https://arxiv.org/abs/2006.06676
10 |
11 | The dataset consists of 1336 high-quality PNG images at 1024×1024 resolution. The images were downloaded via the [Metropolitan Museum of Art Collection API](https://metmuseum.github.io/), and automatically aligned and cropped using [dlib](http://dlib.net/). Various automatic filters were used to prune the set.
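As a quick sanity check of a downloaded copy, the resolution of each PNG can be read straight from its IHDR header without decoding the pixels. A minimal sketch using only the standard library (the helper name `png_size` is illustrative, not part of the dataset tooling); every file under `images/` should report `(1024, 1024)`:

```python
import struct

def png_size(path):
    """Read (width, height) from a PNG's IHDR chunk without decoding pixels."""
    with open(path, "rb") as f:
        header = f.read(24)
    # Layout: 8-byte PNG signature, 4-byte chunk length, b"IHDR",
    # then big-endian 4-byte width and height.
    assert header[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG file"
    assert header[12:16] == b"IHDR", "IHDR is not the first chunk"
    return struct.unpack(">II", header[16:24])
```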
12 |
13 | For business inquiries, please contact [researchinquiries@nvidia.com](mailto:researchinquiries@nvidia.com)
14 |
15 | For press and other inquiries, please contact Hector Marinez at [hmarinez@nvidia.com](mailto:hmarinez@nvidia.com)
16 |
17 | ## Licenses
18 |
19 | The source images are made available under the [Creative Commons Zero (CC0)](https://creativecommons.org/publicdomain/zero/1.0/) license by the [Metropolitan Museum of Art](https://www.metmuseum.org/). Please [read here](https://www.metmuseum.org/about-the-met/policies-and-documents/image-resources) for more information about their Open Access policy.
20 |
21 | The dataset itself (including JSON metadata, processed images, and documentation) is made available under [Creative Commons BY-NC 2.0](https://creativecommons.org/licenses/by-nc/2.0/) license by NVIDIA Corporation. You can **use, redistribute, and adapt it for non-commercial purposes**, as long as you (a) give appropriate credit by **citing our paper**, and (b) **indicate any changes** that you've made.
22 |
23 |
24 | ## Overview
25 |
26 | All data is hosted on Google Drive:
27 |
28 | | Path | Size | Files | Format | Description
29 | | :--- | :--: | ----: | :----: | :----------
30 | | [metfaces-dataset](https://drive.google.com/open?id=1w-Os4uERBmXwCm7Oo_kW6X3Sd2YHpJMC) | 14.5 GB | 2621 | | Main folder
31 | | ├ [metfaces.json](https://drive.google.com/open?id=1o11-JkkwBbZW61w03O7qGrhkydNALDSH) | 1.8 MB | 1 | JSON | Image metadata including original download URL.
32 | | ├ [images](https://drive.google.com/open?id=1iChdwdW7mZFUyivKtDwL8ehCNhYKQz6D) | 1.6 GB | 1336 | PNG | Aligned and cropped images at 1024×1024
33 | | └ [unprocessed](https://drive.google.com/open?id=1lut1g1oASGsipQQB67EFqVhjt4UgC5JW) | 13 GB | 1284 | PNG | Original images
34 |
35 | ## Reproducing the dataset
36 |
37 | The MetFaces 1024×1024 images can be reproduced with the `metfaces.py` script as follows:
38 |
39 | 1. Download the contents of the `metfaces-dataset` Google Drive folder into a local directory (e.g., `data`), retaining the original folder structure: you should end up with `data/metfaces.json` and `data/unprocessed`.
40 | 2. Run `python metfaces.py --json data/metfaces.json --source-images data --output-dir out`
41 |
42 | To reproduce the MetFaces-U dataset ("unaligned MetFaces"), use the following command:
43 |
44 | ```
45 | python metfaces.py --json data/metfaces.json --source-images data \
46 | --random-shift=0.2 --retry-crops --no-rotation \
47 | --output-dir out-unaligned
48 | ```
49 |
50 |
51 | ## Metadata
52 |
53 | The `metfaces.json` file contains the following information for each image:
54 |
55 | ```
56 | [
57 | {
58 | "obj_id": "11713", # Metmuseum object ID
59 | "meta_url": "https://collectionapi.metmuseum.org/public/collection/v1/objects/11713",
60 | "source_url": "https://images.metmuseum.org/CRDImages/ad/original/ap26.129.1.jpg",
61 | "source_path": "unprocessed/image-11713.png", # Original raw image file under local dataset copy
62 | "source_md5": "c1e4c5a42de6a4d6909d3820c16f9eb5", # MD5 checksum of the raw image file
63 | "image_path": "images/11713-00.png", # Processed 1024x1024 image
64 | "image_md5": "605a90ab744bdbc9737da5620f2777ab", # MD5 checksum of the processed image
65 | "title": "Portrait of a Gentleman", # Metmuseum object's title
66 | "artist_display_name": "Charles Willson Peale", # Metmuseum object's artist's display name
67 | "face_spec": { # Info about the raw image:
68 | "rect": [404, 238, 775, 610], # - Axis-aligned rectangle of the face region
69 | "landmarks": [...], # - 68 face landmarks reported by dlib
70 | "shrink": 2
71 | },
72 | "face_idx": 0
73 | },
74 | ...
75 | ]
76 | ```
77 |
78 | For the full Metmuseum metadata of an object, you can fetch the `meta_url` contents, e.g., `curl https://collectionapi.metmuseum.org/public/collection/v1/objects/11713`.
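The `source_md5` and `image_md5` fields also make it easy to validate a local copy of the dataset. A minimal sketch (the function name `verify_checksums` and the `dataset_root` argument are illustrative, not part of the dataset tooling):

```python
import hashlib
import json
import os

def verify_checksums(json_path, dataset_root):
    """Return the relative paths whose MD5 does not match the metadata."""
    with open(json_path, encoding="utf8") as fin:
        faces = json.load(fin)
    mismatches = []
    for face in faces:
        # Each entry records a raw image and a processed image with checksums.
        for path_key, md5_key in (("source_path", "source_md5"),
                                  ("image_path", "image_md5")):
            path = os.path.join(dataset_root, face[path_key])
            with open(path, "rb") as f:
                digest = hashlib.md5(f.read()).hexdigest()
            if digest != face[md5_key]:
                mismatches.append(face[path_key])
    return mismatches
```

An empty return value means every file matches; `dataset_root` should be the directory containing `images/` and `unprocessed/`.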
79 |
--------------------------------------------------------------------------------
/img/metfaces-teaser.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVlabs/metfaces-dataset/f362dae010cfdb2027bf57486d46823c99f048ee/img/metfaces-teaser.png
--------------------------------------------------------------------------------
/metfaces.py:
--------------------------------------------------------------------------------
1 | # Copyright 2020 NVIDIA Corporation
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | import argparse
16 | import json
17 | import numpy as np
18 | import os
19 | import PIL.Image
20 | import scipy.ndimage
21 | from tqdm import tqdm
22 |
23 | _examples = '''examples:
24 |
25 | # Reproduce the aligned 1024x1024 MetFaces images
26 | python %(prog)s --json data/metfaces.json --source-images data --output-dir out
27 | '''
28 |
29 | def extract_face(face, source_images, output_dir, rng, target_size=1024, supersampling=4, enable_padding=True, random_shift=0.0, retry_crops=False, rotate_level=True):
30 | def rot90(v) -> np.ndarray:
31 | return np.array([-v[1], v[0]])
32 |
33 | # Sanitize facial landmarks.
34 | face_spec = face['face_spec']
35 | landmarks = (np.float32(face_spec['landmarks']) + 0.5) * face_spec['shrink']
36 | assert landmarks.shape == (68, 2)
37 | lm_eye_left = landmarks[36 : 42] # left-clockwise
38 | lm_eye_right = landmarks[42 : 48] # left-clockwise
39 | lm_mouth_outer = landmarks[48 : 60] # left-clockwise
40 |
41 | # Calculate auxiliary vectors.
42 | eye_left = np.mean(lm_eye_left, axis=0)
43 | eye_right = np.mean(lm_eye_right, axis=0)
44 | eye_avg = (eye_left + eye_right) * 0.5
45 | eye_to_eye = eye_right - eye_left
46 | mouth_left = lm_mouth_outer[0]
47 | mouth_right = lm_mouth_outer[6]
48 | mouth_avg = (mouth_left + mouth_right) * 0.5
49 | eye_to_mouth = mouth_avg - eye_avg
50 |
51 | # Choose oriented crop rectangle.
52 | if rotate_level:
53 | # Orient according to tilt of the input image
54 | x = eye_to_eye - rot90(eye_to_mouth)
55 | x /= np.hypot(*x)
56 | x *= max(np.hypot(*eye_to_eye) * 2.0, np.hypot(*eye_to_mouth) * 1.8)
57 | y = rot90(x)
58 | c0 = eye_avg + eye_to_mouth * 0.1
59 | else:
60 | # Do not match the tilt in the source data, i.e., use an axis-aligned rectangle
61 | x = np.array([1, 0], dtype=np.float64)
62 | x *= max(np.hypot(*eye_to_eye) * 2.0, np.hypot(*eye_to_mouth) * 1.8)
63 | y = np.flipud(x) * [-1, 1]
64 | c0 = eye_avg + eye_to_mouth * 0.1
65 |
66 | # Load.
67 | img = PIL.Image.open(os.path.join(source_images, face['source_path'])).convert('RGB')
68 |
69 | # Calculate auxiliary data.
70 | qsize = np.hypot(*x) * 2
71 | quad = np.stack([c0 - x - y, c0 - x + y, c0 + x + y, c0 + x - y])
72 |
73 | # Keep drawing new random crop offsets until we find one that is contained in the image
74 | # and does not require padding
75 | if random_shift != 0:
76 | for _ in range(1000):
77 | # Offset the crop rectangle center by a random shift proportional to image dimension
78 | # and the requested standard deviation (by default 0)
79 | c = (c0 + np.hypot(*x)*2 * random_shift * rng.normal(0, 1, c0.shape))
80 | quad = np.stack([c - x - y, c - x + y, c + x + y, c + x - y])
81 | crop = (int(np.floor(min(quad[:,0]))), int(np.floor(min(quad[:,1]))), int(np.ceil(max(quad[:,0]))), int(np.ceil(max(quad[:,1]))))
82 | if not retry_crops or not (crop[0] < 0 or crop[1] < 0 or crop[2] >= img.width or crop[3] >= img.height):
83 | # We're happy with this crop (either it fits within the image, or retries are disabled)
84 | break
85 | else:
86 | # rejected N times, give up and move to next image
87 | # (does not happen in practice with the MetFaces data)
88 | print('rejected image %s' % face['source_path'])
89 | return
90 |
91 | # Shrink.
92 | shrink = int(np.floor(qsize / target_size * 0.5))
93 | if shrink > 1:
94 | rsize = (int(np.rint(float(img.size[0]) / shrink)), int(np.rint(float(img.size[1]) / shrink)))
95 | img = img.resize(rsize, PIL.Image.LANCZOS)  # ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
96 | quad /= shrink
97 | qsize /= shrink
98 |
99 | # Crop.
100 | border = max(int(np.rint(qsize * 0.1)), 3)
101 | crop = (int(np.floor(min(quad[:,0]))), int(np.floor(min(quad[:,1]))), int(np.ceil(max(quad[:,0]))), int(np.ceil(max(quad[:,1]))))
102 | crop = (max(crop[0] - border, 0), max(crop[1] - border, 0), min(crop[2] + border, img.size[0]), min(crop[3] + border, img.size[1]))
103 | if crop[2] - crop[0] < img.size[0] or crop[3] - crop[1] < img.size[1]:
104 | img = img.crop(crop)
105 | quad -= crop[0:2]
106 |
107 | # Pad.
108 | pad = (int(np.floor(min(quad[:,0]))), int(np.floor(min(quad[:,1]))), int(np.ceil(max(quad[:,0]))), int(np.ceil(max(quad[:,1]))))
109 | pad = (max(-pad[0] + border, 0), max(-pad[1] + border, 0), max(pad[2] - img.size[0] + border, 0), max(pad[3] - img.size[1] + border, 0))
110 | if enable_padding and max(pad) > border - 4:
111 | pad = np.maximum(pad, int(np.rint(qsize * 0.3)))
112 | img = np.pad(np.float32(img), ((pad[1], pad[3]), (pad[0], pad[2]), (0, 0)), 'reflect')
113 | h, w, _ = img.shape
114 | y, x, _ = np.ogrid[:h, :w, :1]
115 | mask = np.maximum(1.0 - np.minimum(np.float32(x) / pad[0], np.float32(w-1-x) / pad[2]), 1.0 - np.minimum(np.float32(y) / pad[1], np.float32(h-1-y) / pad[3]))
116 | blur = qsize * 0.02
117 | img += (scipy.ndimage.gaussian_filter(img, [blur, blur, 0]) - img) * np.clip(mask * 3.0 + 1.0, 0.0, 1.0)
118 | img += (np.median(img, axis=(0,1)) - img) * np.clip(mask, 0.0, 1.0)
119 | img = PIL.Image.fromarray(np.uint8(np.clip(np.rint(img), 0, 255)), 'RGB')
120 | quad += pad[:2]
121 |
122 | # Transform.
123 | super_size = target_size * supersampling
124 | img = img.transform((super_size, super_size), PIL.Image.QUAD, (quad + 0.5).flatten(), PIL.Image.BILINEAR)
125 | if target_size < super_size:
126 | img = img.resize((target_size, target_size), PIL.Image.LANCZOS)
127 |
128 | # Save face image.
129 | img.save(os.path.join(output_dir, f"{face['obj_id']}-{face['face_idx']:02d}.png"))
130 |
131 |
132 | def main():
133 | parser = argparse.ArgumentParser(
134 | description='MetFaces dataset processing tool',
135 | epilog=_examples,
136 | formatter_class=argparse.RawDescriptionHelpFormatter
137 | )
138 | parser.add_argument('--json', help='MetFaces metadata json file path', required=True)
139 | parser.add_argument('--source-images', help='Location of MetFaces raw image data', required=True)
140 | parser.add_argument('--output-dir', help='Where to save output files', required=True)
141 | parser.add_argument('--random-shift', help='Standard deviation of random crop rectangle jitter', type=float, default=0.0, metavar='SHIFT')
142 | parser.add_argument('--retry-crops', help='Retry random shift if crop rectangle falls outside image (up to 1000 times)', dest='retry_crops', default=False, action='store_true')
143 | parser.add_argument('--no-rotation', help='Keep the original orientation of images', dest='no_rotation', default=False, action='store_true')
144 | args = parser.parse_args()
145 |
146 | os.makedirs(args.output_dir, exist_ok=True)
147 |
148 | rng = np.random.RandomState(12345) # fix the random seed for reproducibility
149 |
150 | with open(args.json, encoding="utf8") as fin:
151 | faces = json.load(fin)
152 | for f in tqdm(faces):
153 | extract_face(f, source_images=args.source_images, output_dir=args.output_dir, rng=rng,
154 | random_shift=args.random_shift, retry_crops=args.retry_crops, rotate_level=not args.no_rotation)
155 |
156 | if __name__ == "__main__":
157 | main()
158 |
--------------------------------------------------------------------------------