├── LICENSE.txt
├── README.md
├── img
│   └── metfaces-teaser.png
└── metfaces.py

--------------------------------------------------------------------------------
/LICENSE.txt:
--------------------------------------------------------------------------------
MetFaces is an image dataset of human faces extracted from works of art.

The source images are made available under the Creative Commons Zero (CC0)
(https://creativecommons.org/publicdomain/zero/1.0/) license by the
Metropolitan Museum of Art (https://www.metmuseum.org/).
For more information about their Open Access policy, please refer to
https://www.metmuseum.org/about-the-met/policies-and-documents/image-resources.

The dataset itself (including JSON metadata, processed images, and documentation)
is made available under the Creative Commons BY-NC 2.0
(https://creativecommons.org/licenses/by-nc/2.0/) license by NVIDIA Corporation.
You can use, redistribute, and adapt it for non-commercial purposes, as long as
you (a) give appropriate credit by citing our paper, and (b) indicate any
changes that you've made.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
## MetFaces Dataset

![Teaser image](./img/metfaces-teaser.png)

MetFaces is an image dataset of human faces extracted from works of art, originally created as part of our work on:

> **Training Generative Adversarial Networks with Limited Data**<br>
> Tero Karras (NVIDIA), Miika Aittala (NVIDIA), Janne Hellsten (NVIDIA), Samuli Laine (NVIDIA), Jaakko Lehtinen (NVIDIA and Aalto University), Timo Aila (NVIDIA)<br>
> https://arxiv.org/abs/2006.06676

The dataset consists of 1336 high-quality PNG images at 1024×1024 resolution. The images were downloaded via the [Metropolitan Museum of Art Collection API](https://metmuseum.github.io/), and automatically aligned and cropped using [dlib](http://dlib.net/). Various automatic filters were used to prune the set.

For business inquiries, please contact [researchinquiries@nvidia.com](mailto:researchinquiries@nvidia.com)

For press and other inquiries, please contact Hector Marinez at [hmarinez@nvidia.com](mailto:hmarinez@nvidia.com)

## Licenses

The source images are made available under the [Creative Commons Zero (CC0)](https://creativecommons.org/publicdomain/zero/1.0/) license by the [Metropolitan Museum of Art](https://www.metmuseum.org/). Please [read here](https://www.metmuseum.org/about-the-met/policies-and-documents/image-resources) for more information about their Open Access policy.

The dataset itself (including JSON metadata, processed images, and documentation) is made available under the [Creative Commons BY-NC 2.0](https://creativecommons.org/licenses/by-nc/2.0/) license by NVIDIA Corporation. You can **use, redistribute, and adapt it for non-commercial purposes**, as long as you (a) give appropriate credit by **citing our paper**, and (b) **indicate any changes** that you've made.

## Overview

All data is hosted on Google Drive:

| Path | Size | Files | Format | Description
| :--- | :--: | ----: | :----: | :----------
| [metfaces-dataset](https://drive.google.com/open?id=1w-Os4uERBmXwCm7Oo_kW6X3Sd2YHpJMC) | 14.5 GB | 2621 | | Main folder
| ├ [metfaces.json](https://drive.google.com/open?id=1o11-JkkwBbZW61w03O7qGrhkydNALDSH) | 1.8 MB | 1 | JSON | Image metadata, including each image's original download URL
| ├ [images](https://drive.google.com/open?id=1iChdwdW7mZFUyivKtDwL8ehCNhYKQz6D) | 1.6 GB | 1336 | PNG | Aligned and cropped images at 1024×1024
| └ [unprocessed](https://drive.google.com/open?id=1lut1g1oASGsipQQB67EFqVhjt4UgC5JW) | 13 GB | 1284 | PNG | Original images

## Reproducing the dataset

The MetFaces 1024×1024 images can be reproduced with the `metfaces.py` script as follows:

1. Download the contents of the metfaces-dataset Google Drive folder, retaining the original folder structure (e.g., downloading into `data` should give you `data/metfaces.json` and `data/unprocessed`).
2. Run `python metfaces.py --json data/metfaces.json --source-images data --output-dir out`

To reproduce the MetFaces-U dataset ("unaligned MetFaces"), use the following command:

```
python metfaces.py --json data/metfaces.json --source-images data \
  --random-shift=0.2 --retry-crops --no-rotation \
  --output-dir out-unaligned
```

## Metadata

The `metfaces.json` file contains the following information for each image:

```
[
  {
    "obj_id": "11713",                                # Metmuseum object ID
    "meta_url": "https://collectionapi.metmuseum.org/public/collection/v1/objects/11713",
    "source_url": "https://images.metmuseum.org/CRDImages/ad/original/ap26.129.1.jpg",
    "source_path": "unprocessed/image-11713.png",     # Original raw image file under the local dataset copy
    "source_md5": "c1e4c5a42de6a4d6909d3820c16f9eb5", # MD5 checksum of the raw image file
    "image_path": "images/11713-00.png",              # Processed 1024x1024 image
    "image_md5": "605a90ab744bdbc9737da5620f2777ab",  # MD5 checksum of the processed image
    "title": "Portrait of a Gentleman",               # Metmuseum object's title
    "artist_display_name": "Charles Willson Peale",   # Metmuseum object's artist's display name
    "face_spec": {                                    # Info about the face region in the raw image:
      "rect": [404, 238, 775, 610],                   # - Axis-aligned rectangle of the face region
      "landmarks": [...],                             # - 68 face landmarks reported by dlib
      "shrink": 2                                     # - Downscaling factor used when detecting the face
    },
    "face_idx": 0                                     # Index of the face within the source image
  },
  ...
]
```

For full Metmuseum metadata, you can fetch the `meta_url` contents with, e.g., `curl https://collectionapi.metmuseum.org/public/collection/v1/objects/11713`.
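
The same lookup can be done programmatically. Below is a minimal sketch using only the Python standard library; it assumes a local copy of the dataset laid out as in step 1 above (`data/metfaces.json`, `data/images`), and `objectDate` is just one example of the fields the Met API returns:

```
import hashlib
import json
import os
import urllib.request

# Load the per-image metadata (paths assume the layout from step 1).
with open('data/metfaces.json', encoding='utf8') as fin:
    faces = json.load(fin)

face = faces[0]

# Verify a processed image against its recorded MD5 checksum.
with open(os.path.join('data', face['image_path']), 'rb') as fimg:
    assert hashlib.md5(fimg.read()).hexdigest() == face['image_md5']

# Fetch the full Metmuseum record for this object.
with urllib.request.urlopen(face['meta_url']) as resp:
    met_record = json.loads(resp.read().decode('utf8'))
print(face['title'], '/', met_record.get('objectDate'))
```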
--------------------------------------------------------------------------------
/img/metfaces-teaser.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NVlabs/metfaces-dataset/f362dae010cfdb2027bf57486d46823c99f048ee/img/metfaces-teaser.png
--------------------------------------------------------------------------------
/metfaces.py:
--------------------------------------------------------------------------------
# Copyright 2020 NVIDIA Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse
import json
import numpy as np
import os
import PIL.Image
import scipy.ndimage
from tqdm import tqdm

_examples = '''examples:

  # Reproduce the aligned 1024x1024 MetFaces images
  python %(prog)s --json data/metfaces.json --source-images data --output-dir out
'''

def extract_face(face, source_images, output_dir, rng, target_size=1024, supersampling=4, enable_padding=True, random_shift=0.0, retry_crops=False, rotate_level=True):
    def rot90(v) -> np.ndarray:
        return np.array([-v[1], v[0]])

    # Sanitize facial landmarks.
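    # The 68 points follow dlib's iBUG layout: indices 36-41 are the left eye,
    # 42-47 the right eye, and 48-59 the outer mouth contour, which is all the
    # alignment below needs. The landmarks were detected on a copy of the raw
    # image downscaled by face_spec['shrink'], so adding 0.5 (pixel centers) and
    # multiplying by the shrink factor maps them back to raw image coordinates.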
    face_spec = face['face_spec']
    landmarks = (np.float32(face_spec['landmarks']) + 0.5) * face_spec['shrink']
    assert landmarks.shape == (68, 2)
    lm_eye_left = landmarks[36 : 42]    # left-clockwise
    lm_eye_right = landmarks[42 : 48]   # left-clockwise
    lm_mouth_outer = landmarks[48 : 60] # left-clockwise

    # Calculate auxiliary vectors.
    eye_left = np.mean(lm_eye_left, axis=0)
    eye_right = np.mean(lm_eye_right, axis=0)
    eye_avg = (eye_left + eye_right) * 0.5
    eye_to_eye = eye_right - eye_left
    mouth_left = lm_mouth_outer[0]
    mouth_right = lm_mouth_outer[6]
    mouth_avg = (mouth_left + mouth_right) * 0.5
    eye_to_mouth = mouth_avg - eye_avg

    # Choose oriented crop rectangle.
    if rotate_level:
        # Orient according to the tilt of the input image.
        x = eye_to_eye - rot90(eye_to_mouth)
        x /= np.hypot(*x)
        x *= max(np.hypot(*eye_to_eye) * 2.0, np.hypot(*eye_to_mouth) * 1.8)
        y = rot90(x)
        c0 = eye_avg + eye_to_mouth * 0.1
    else:
        # Do not match the tilt in the source data, i.e., use an axis-aligned rectangle.
        x = np.array([1, 0], dtype=np.float64)
        x *= max(np.hypot(*eye_to_eye) * 2.0, np.hypot(*eye_to_mouth) * 1.8)
        y = np.flipud(x) * [-1, 1]
        c0 = eye_avg + eye_to_mouth * 0.1

    # Load.
    img = PIL.Image.open(os.path.join(source_images, face['source_path'])).convert('RGB')

    # Calculate auxiliary data.
    qsize = np.hypot(*x) * 2
    quad = np.stack([c0 - x - y, c0 - x + y, c0 + x + y, c0 + x - y])

    # Keep drawing new random crop offsets until we find one that is contained in the image
    # and does not require padding.
    if random_shift != 0:
        for _ in range(1000):
            # Offset the crop rectangle center by a random shift proportional to the image
            # dimension and the requested standard deviation (0 by default).
            c = c0 + np.hypot(*x) * 2 * random_shift * rng.normal(0, 1, c0.shape)
            quad = np.stack([c - x - y, c - x + y, c + x + y, c + x - y])
            crop = (int(np.floor(min(quad[:,0]))), int(np.floor(min(quad[:,1]))), int(np.ceil(max(quad[:,0]))), int(np.ceil(max(quad[:,1]))))
            if not retry_crops or not (crop[0] < 0 or crop[1] < 0 or crop[2] >= img.width or crop[3] >= img.height):
                # We're happy with this crop (either it fits within the image, or retries are disabled).
                break
        else:
            # Rejected 1000 times; give up and move on to the next image.
            # (Does not happen in practice with the MetFaces data.)
            print('rejected image %s' % face['source_path'])
            return

    # Shrink.
    shrink = int(np.floor(qsize / target_size * 0.5))
    if shrink > 1:
        rsize = (int(np.rint(float(img.size[0]) / shrink)), int(np.rint(float(img.size[1]) / shrink)))
        img = img.resize(rsize, PIL.Image.LANCZOS) # LANCZOS is the resampling filter formerly named ANTIALIAS
        quad /= shrink
        qsize /= shrink

    # Crop.
    border = max(int(np.rint(qsize * 0.1)), 3)
    crop = (int(np.floor(min(quad[:,0]))), int(np.floor(min(quad[:,1]))), int(np.ceil(max(quad[:,0]))), int(np.ceil(max(quad[:,1]))))
    crop = (max(crop[0] - border, 0), max(crop[1] - border, 0), min(crop[2] + border, img.size[0]), min(crop[3] + border, img.size[1]))
    if crop[2] - crop[0] < img.size[0] or crop[3] - crop[1] < img.size[1]:
        img = img.crop(crop)
        quad -= crop[0:2]

    # Pad.
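    # If the quad still extends past the image bounds, synthesize the missing context:
    # reflect-pad the image, then blend the padded border toward a blurred version of
    # itself and toward the median color, so that the transform below does not sample
    # hard reflection seams.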
    pad = (int(np.floor(min(quad[:,0]))), int(np.floor(min(quad[:,1]))), int(np.ceil(max(quad[:,0]))), int(np.ceil(max(quad[:,1]))))
    pad = (max(-pad[0] + border, 0), max(-pad[1] + border, 0), max(pad[2] - img.size[0] + border, 0), max(pad[3] - img.size[1] + border, 0))
    if enable_padding and max(pad) > border - 4:
        pad = np.maximum(pad, int(np.rint(qsize * 0.3)))
        img = np.pad(np.float32(img), ((pad[1], pad[3]), (pad[0], pad[2]), (0, 0)), 'reflect')
        h, w, _ = img.shape
        y, x, _ = np.ogrid[:h, :w, :1]
        mask = np.maximum(1.0 - np.minimum(np.float32(x) / pad[0], np.float32(w-1-x) / pad[2]), 1.0 - np.minimum(np.float32(y) / pad[1], np.float32(h-1-y) / pad[3]))
        blur = qsize * 0.02
        img += (scipy.ndimage.gaussian_filter(img, [blur, blur, 0]) - img) * np.clip(mask * 3.0 + 1.0, 0.0, 1.0)
        img += (np.median(img, axis=(0,1)) - img) * np.clip(mask, 0.0, 1.0)
        img = PIL.Image.fromarray(np.uint8(np.clip(np.rint(img), 0, 255)), 'RGB')
        quad += pad[:2]

    # Transform.
    super_size = target_size * supersampling
    img = img.transform((super_size, super_size), PIL.Image.QUAD, (quad + 0.5).flatten(), PIL.Image.BILINEAR)
    if target_size < super_size:
        img = img.resize((target_size, target_size), PIL.Image.LANCZOS) # LANCZOS is the resampling filter formerly named ANTIALIAS

    # Save face image.
    img.save(os.path.join(output_dir, f"{face['obj_id']}-{face['face_idx']:02d}.png"))


def main():
    parser = argparse.ArgumentParser(
        description='MetFaces dataset processing tool',
        epilog=_examples,
        formatter_class=argparse.RawDescriptionHelpFormatter
    )
    parser.add_argument('--json', help='MetFaces metadata json file path', required=True)
    parser.add_argument('--source-images', help='Location of MetFaces raw image data', required=True)
    parser.add_argument('--output-dir', help='Where to save output files', required=True)
    parser.add_argument('--random-shift', help='Standard deviation of random crop rectangle jitter', type=float, default=0.0, metavar='SHIFT')
    parser.add_argument('--retry-crops', help='Retry random shift if the crop rectangle falls outside the image (up to 1000 times)', dest='retry_crops', default=False, action='store_true')
    parser.add_argument('--no-rotation', help='Keep the original orientation of images', dest='no_rotation', default=False, action='store_true')
    args = parser.parse_args()

    os.makedirs(args.output_dir, exist_ok=True)

    rng = np.random.RandomState(12345) # fix the random seed for reproducibility

    with open(args.json, encoding="utf8") as fin:
        faces = json.load(fin)
    for f in tqdm(faces):
        extract_face(f, source_images=args.source_images, output_dir=args.output_dir, rng=rng,
                     random_shift=args.random_shift, retry_crops=args.retry_crops, rotate_level=not args.no_rotation)

if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------