├── .gitignore ├── LICENSE ├── MANIFEST ├── README.md ├── VERSION ├── camera_utils.py ├── data └── .gitignore ├── mesh_renderer.py ├── mesh_renderer ├── kernels │ ├── rasterize_triangles_grad.cc │ ├── rasterize_triangles_impl.cc │ ├── rasterize_triangles_impl.h │ ├── rasterize_triangles_impl_test.cc │ └── rasterize_triangles_op.cc ├── mesh_renderer_test.py ├── rasterize_triangles_test.py ├── test_data │ ├── BUILD │ ├── Barycentrics_Cube.png │ ├── Colored_Cube_0.png │ ├── Colored_Cube_1.png │ ├── External_Triangle.png │ ├── Gray_Cube_0.png │ ├── Gray_Cube_1.png │ ├── Inside_Box.png │ ├── Perspective_Corrected_Triangle.png │ ├── Simple_Tetrahedron.png │ ├── Simple_Triangle.png │ └── Unlit_Cube_0.png └── test_utils.py ├── notebooks └── FirstCrack.ipynb ├── rasterize_triangles.py ├── setup.py └── third_party ├── lodepng.cpp └── lodepng.h /.gitignore: -------------------------------------------------------------------------------- 1 | bazel* 2 | build 3 | *.egg-info 4 | .ipynb_checkpoints 5 | dist 6 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. 
For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 
48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 
123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. 
In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. 
We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /MANIFEST: -------------------------------------------------------------------------------- 1 | # file GENERATED by distutils, do NOT edit 2 | camera_utils.py 3 | mesh_renderer.py 4 | rasterize_triangles.py 5 | setup.py 6 | mesh_renderer/kernels/rasterize_triangles_grad.cc 7 | mesh_renderer/kernels/rasterize_triangles_impl.cc 8 | mesh_renderer/kernels/rasterize_triangles_op.cc 9 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # TF Mesh Renderer 2 | 3 | This is a differentiable, 3D mesh renderer using TensorFlow. 4 | [Original repository](https://github.com/google/tf_mesh_renderer). 5 | 6 | This fork publishes it to PyPI and removes Bazel as an installation dependency 7 | (e.g. just use `python3 setup.py install`).
8 | 9 | ### Installation 10 | ``` 11 | pip install mesh_renderer 12 | ``` 13 | 14 | ### Usage 15 | 16 | ``` 17 | # Assumes numpy as np, tensorflow as tf, camera_utils, and mesh_renderer are imported. The geometry below is a cube: 18 | object_vertices = np.array([[-1, -1, 1], [-1, -1, -1], [-1, 1, -1], [-1, 1, 1], [1, -1, 1], 19 | [1, -1, -1], [1, 1, -1], [1, 1, 1]]) 20 | object_triangles = np.array([[0, 1, 2], [2, 3, 0], [3, 2, 6], [6, 7, 3], [7, 6, 5], [5, 4, 7], 21 | [4, 5, 1], [1, 0, 4], [5, 6, 2], [2, 1, 5], [7, 4, 0], [0, 3, 7]], dtype=np.int32) 22 | object_vertices = tf.constant(object_vertices, dtype=tf.float32) 23 | object_triangles = tf.constant(object_triangles, dtype=tf.int32) 24 | object_normals = tf.nn.l2_normalize(object_vertices, dim=1) 25 | 26 | # rotate the geometry: 27 | angles = [[-1.16, 0.00, 3.48]] 28 | 29 | model_rotation = camera_utils.euler_matrices(angles)[0, :3, :3] 30 | # camera position: 31 | eye = tf.constant([[0.0, 0.0, 6.0]], dtype=tf.float32) 32 | lightbulb = tf.constant([[0.0, 0.0, 6.0]], dtype=tf.float32) 33 | center = tf.constant([[0.0, 0.0, 0.0]], dtype=tf.float32) 34 | world_up = tf.constant([[0.0, 1.0, 0.0]], dtype=tf.float32) 35 | vertex_diffuse_colors = tf.reshape(tf.ones_like(object_vertices), [1, object_vertices.get_shape()[0].value, 3]) 36 | light_positions = tf.expand_dims(lightbulb, axis=0) 37 | light_intensities = tf.ones([1, 1, 3], dtype=tf.float32) 38 | ambient_color = tf.constant([[0.0, 0.0, 0.0]]) 39 | image_width, image_height = 640, 480 # example output size (placeholder values); choose your own 40 | vertex_positions = tf.reshape( 41 | tf.matmul(object_vertices, model_rotation, transpose_b=True), 42 | [1, object_vertices.get_shape()[0].value, 3]) 43 | desired_normals = tf.reshape( 44 | tf.matmul(object_normals, model_rotation, transpose_b=True), 45 | [1, object_vertices.get_shape()[0].value, 3]) 46 | 47 | # render is a 4-D tf.Tensor of shape [batch_size, height, width, 4] (r, g, b, a); 48 | # you can backpropagate through it.
49 | render = mesh_renderer.mesh_renderer( 50 | vertex_positions, object_triangles, desired_normals, 51 | vertex_diffuse_colors, eye, center, world_up, light_positions, 52 | light_intensities, image_width, image_height, 53 | ambient_color=ambient_color, 54 | ) 55 | ``` 56 | 57 | 58 | # Original Readme 59 | 60 | This is a differentiable, 3D mesh renderer using TensorFlow. 61 | 62 | This is not an official Google product. 63 | 64 | The interface to the renderer is provided by mesh_renderer.py and 65 | rasterize_triangles.py, which provide TensorFlow Ops that can be added to a 66 | TensorFlow graph. The internals of the renderer are handled by a C++ kernel. 67 | 68 | The input to the C++ rendering kernel is a list of 3D vertices and a list of 69 | triangles, where a triangle consists of a list of three vertex ids. The 70 | output of the renderer is a pair of images containing triangle ids and 71 | barycentric weights. Pixel values in the barycentric weight image are the 72 | weights of the pixel center point with respect to the triangle at that pixel 73 | (identified by the triangle id). The renderer provides derivatives of the 74 | barycentric weights of the pixel centers with respect to the vertex 75 | positions. 76 | 77 | Any approximation error stems from the assumption that the triangle id at a 78 | pixel does not change as the vertices are moved. This is a reasonable 79 | approximation for small changes in vertex position. Even when the triangle id 80 | does change, the derivatives will be computed by extrapolating the barycentric 81 | weights of a neighboring triangle, which will produce a good approximation if 82 | the mesh is smooth. The main source of error occurs at occlusion boundaries, and 83 | particularly at the edge of an open mesh, where the background appears opposite 84 | the triangle's edge. 85 | 86 | The algorithm implemented is described by Olano and Greer, "Triangle Scan 87 | Conversion using 2D Homogeneous Coordinates," HWWS 1997.
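The barycentric-weight output described above can be illustrated with a small NumPy sketch. This is not the C++ kernel's code and not the Olano and Greer homogeneous-coordinate formulation; it is a plain 2D signed-area computation, and the function names `barycentric_weights` and `weight_jacobian` are hypothetical helpers introduced only for this illustration. The finite-difference Jacobian holds the triangle id fixed, which mirrors the approximation stated in the text.

```python
import numpy as np


def barycentric_weights(triangle_xy, point_xy):
    """Barycentric weights of a 2D point with respect to a 2D triangle.

    triangle_xy: [3, 2] screen-space vertex positions (assumed layout for
      this sketch, not the kernel's actual interface).
    point_xy: [2] position, e.g. a pixel center.
    """
    a, b, c = np.asarray(triangle_xy, dtype=np.float64)
    p = np.asarray(point_xy, dtype=np.float64)

    def signed_area(u, v, w):
        # Twice the signed area of the triangle (u, v, w).
        return (v[0] - u[0]) * (w[1] - u[1]) - (v[1] - u[1]) * (w[0] - u[0])

    total = signed_area(a, b, c)
    # Each weight is the sub-triangle area opposite its vertex over the total.
    return np.array([
        signed_area(p, b, c),
        signed_area(a, p, c),
        signed_area(a, b, p),
    ]) / total


def weight_jacobian(triangle_xy, point_xy, eps=1e-6):
    """Central-difference d(weights)/d(vertex positions), shape [3, 3, 2].

    The triangle covering the point is held fixed while its vertices move.
    """
    tri = np.asarray(triangle_xy, dtype=np.float64)
    jac = np.zeros((3, 3, 2))
    for vertex in range(3):
        for axis in range(2):
            step = np.zeros_like(tri)
            step[vertex, axis] = eps
            plus = barycentric_weights(tri + step, point_xy)
            minus = barycentric_weights(tri - step, point_xy)
            jac[:, vertex, axis] = (plus - minus) / (2.0 * eps)
    return jac


weights = barycentric_weights([[0, 0], [4, 0], [0, 4]], [1, 1])
jacobian = weight_jacobian([[0, 0], [4, 0], [0, 4]], [1, 1])
```

Because the three weights always sum to one, the Jacobian entries summed over the weight index are zero for every vertex coordinate, which is a convenient sanity check when comparing against gradients from the real op.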
88 | 89 | How to Build 90 | ------------ 91 | 92 | Follow the instructions to [install TensorFlow using virtualenv](https://www.tensorflow.org/install/install_linux#installing_with_virtualenv). 93 | 94 | Build and run tests using Bazel from inside the (tensorflow) virtualenv: 95 | 96 | `(tensorflow)$ ./runtests.sh` 97 | 98 | The script calls the Bazel rules using the Python interpreter at 99 | `$VIRTUAL_ENV/bin/python`. If you aren't using virtualenv, `bazel test ...` may 100 | be sufficient. 101 | 102 | Citation 103 | -------- 104 | 105 | If you use this renderer in your research, please cite [this paper](http://openaccess.thecvf.com/content_cvpr_2018/html/Genova_Unsupervised_Training_for_CVPR_2018_paper.html "CVF Version"): 106 | 107 | *Unsupervised Training for 3D Morphable Model Regression*. Kyle Genova, Forrester Cole, Aaron Maschinot, Aaron Sarna, Daniel Vlasic, and William T. Freeman. CVPR 2018, pp. 8377-8386. 108 | 109 | ``` 110 | @InProceedings{Genova_2018_CVPR, 111 | author = {Genova, Kyle and Cole, Forrester and Maschinot, Aaron and Sarna, Aaron and Vlasic, Daniel and Freeman, William T.}, 112 | title = {Unsupervised Training for 3D Morphable Model Regression}, 113 | booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 114 | month = {June}, 115 | year = {2018} 116 | } 117 | ``` 118 | 119 | 120 | Bust of Sappho: https://cdn.thingiverse.com/zipfiles/ac/39/53/07/80/Bust_of_Sappho_.zip 121 | 122 | -------------------------------------------------------------------------------- /VERSION: -------------------------------------------------------------------------------- 1 | 1.0 2 | -------------------------------------------------------------------------------- /camera_utils.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the
License. 5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | """Collection of TF functions for managing 3D camera matrices.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | import math 22 | import tensorflow as tf 23 | 24 | 25 | def perspective(aspect_ratio, fov_y, near_clip, far_clip): 26 | """Computes perspective transformation matrices. 27 | 28 | Functionality mimes gluPerspective (third_party/GL/glu/include/GLU/glu.h). 29 | 30 | Args: 31 | aspect_ratio: float value specifying the image aspect ratio (width/height). 32 | fov_y: 1-D float32 Tensor with shape [batch_size] specifying output vertical 33 | field of views in degrees. 34 | near_clip: 1-D float32 Tensor with shape [batch_size] specifying near 35 | clipping plane distance. 36 | far_clip: 1-D float32 Tensor with shape [batch_size] specifying far clipping 37 | plane distance. 38 | 39 | Returns: 40 | A [batch_size, 4, 4] float tensor that maps from right-handed points in eye 41 | space to left-handed points in clip space. 42 | """ 43 | # The multiplication of fov_y by pi/360.0 simultaneously converts to radians 44 | # and adds the half-angle factor of .5. 
45 | focal_lengths_y = 1.0 / tf.tan(fov_y * (math.pi / 360.0)) 46 | depth_range = far_clip - near_clip 47 | p_22 = -(far_clip + near_clip) / depth_range 48 | p_23 = -2.0 * (far_clip * near_clip / depth_range) 49 | 50 | zeros = tf.zeros_like(p_23, dtype=tf.float32) 51 | # pyformat: disable 52 | perspective_transform = tf.concat( 53 | [ 54 | focal_lengths_y / aspect_ratio, zeros, zeros, zeros, 55 | zeros, focal_lengths_y, zeros, zeros, 56 | zeros, zeros, p_22, p_23, 57 | zeros, zeros, -tf.ones_like(p_23, dtype=tf.float32), zeros 58 | ], axis=0) 59 | # pyformat: enable 60 | perspective_transform = tf.reshape(perspective_transform, [4, 4, -1]) 61 | return tf.transpose(perspective_transform, [2, 0, 1]) 62 | 63 | 64 | def look_at(eye, center, world_up): 65 | """Computes camera viewing matrices. 66 | 67 | Functionality mimes gluLookAt (third_party/GL/glu/include/GLU/glu.h). 68 | 69 | Args: 70 | eye: 2-D float32 tensor with shape [batch_size, 3] containing the XYZ world 71 | space position of the camera. 72 | center: 2-D float32 tensor with shape [batch_size, 3] containing a position 73 | along the center of the camera's gaze. 74 | world_up: 2-D float32 tensor with shape [batch_size, 3] specifying the 75 | world's up direction; the output camera will have no tilt with respect 76 | to this direction. 77 | 78 | Returns: 79 | A [batch_size, 4, 4] float tensor containing a right-handed camera 80 | extrinsics matrix that maps points from world space to points in eye space. 
81 | """ 82 | batch_size = center.shape[0].value 83 | vector_degeneracy_cutoff = 1e-6 84 | forward = center - eye 85 | forward_norm = tf.norm(forward, ord='euclidean', axis=1, keepdims=True) 86 | tf.assert_greater( 87 | forward_norm, 88 | vector_degeneracy_cutoff, 89 | message='Camera matrix is degenerate because eye and center are close.') 90 | forward = tf.divide(forward, forward_norm) 91 | 92 | to_side = tf.cross(forward, world_up) 93 | to_side_norm = tf.norm(to_side, ord='euclidean', axis=1, keepdims=True) 94 | tf.assert_greater( 95 | to_side_norm, 96 | vector_degeneracy_cutoff, 97 | message='Camera matrix is degenerate because up and gaze are close or ' 98 | 'because up is degenerate.') 99 | to_side = tf.divide(to_side, to_side_norm) 100 | cam_up = tf.cross(to_side, forward) 101 | 102 | w_column = tf.constant( 103 | batch_size * [[0., 0., 0., 1.]], dtype=tf.float32) # [batch_size, 4] 104 | w_column = tf.reshape(w_column, [batch_size, 4, 1]) 105 | view_rotation = tf.stack( 106 | [to_side, cam_up, -forward, 107 | tf.zeros_like(to_side, dtype=tf.float32)], 108 | axis=1) # [batch_size, 4, 3] matrix 109 | view_rotation = tf.concat( 110 | [view_rotation, w_column], axis=2) # [batch_size, 4, 4] 111 | 112 | identity_batch = tf.tile(tf.expand_dims(tf.eye(3), 0), [batch_size, 1, 1]) 113 | view_translation = tf.concat([identity_batch, tf.expand_dims(-eye, 2)], 2) 114 | view_translation = tf.concat( 115 | [view_translation, 116 | tf.reshape(w_column, [batch_size, 1, 4])], 1) 117 | camera_matrices = tf.matmul(view_rotation, view_translation) 118 | return camera_matrices 119 | 120 | 121 | def euler_matrices(angles): 122 | """Computes an XYZ Tait-Bryan (improper Euler angle) rotation. 123 | 124 | Returns 4x4 matrices for convenient multiplication with other transformations. 125 | 126 | Args: 127 | angles: a [batch_size, 3] tensor containing X, Y, and Z angles in radians. 128 | 129 | Returns: 130 | a [batch_size, 4, 4] tensor of matrices.
131 | """ 132 | s = tf.sin(angles) 133 | c = tf.cos(angles) 134 | # Rename variables for readability in the matrix definition below. 135 | c0, c1, c2 = (c[:, 0], c[:, 1], c[:, 2]) 136 | s0, s1, s2 = (s[:, 0], s[:, 1], s[:, 2]) 137 | 138 | zeros = tf.zeros_like(s[:, 0]) 139 | ones = tf.ones_like(s[:, 0]) 140 | 141 | # pyformat: disable 142 | flattened = tf.concat( 143 | [ 144 | c2 * c1, c2 * s1 * s0 - c0 * s2, s2 * s0 + c2 * c0 * s1, zeros, 145 | c1 * s2, c2 * c0 + s2 * s1 * s0, c0 * s2 * s1 - c2 * s0, zeros, 146 | -s1, c1 * s0, c1 * c0, zeros, 147 | zeros, zeros, zeros, ones 148 | ], 149 | axis=0) 150 | # pyformat: enable 151 | reshaped = tf.reshape(flattened, [4, 4, -1]) 152 | return tf.transpose(reshaped, [2, 0, 1]) 153 | 154 | 155 | def transform_homogeneous(matrices, vertices): 156 | """Applies batched 4x4 homogeneous matrix transformations to 3-D vertices. 157 | 158 | The vertices are input and output as row-major, but are interpreted as 159 | column vectors multiplied on the right-hand side of the matrices. More 160 | explicitly, this function computes (MV^T)^T. 161 | Vertices are assumed to be xyz, and are extended to xyzw with w=1. 162 | 163 | Args: 164 | matrices: a [batch_size, 4, 4] tensor of matrices. 165 | vertices: a [batch_size, N, 3] tensor of xyz vertices. 166 | 167 | Returns: 168 | a [batch_size, N, 4] tensor of xyzw vertices. 169 | 170 | Raises: 171 | ValueError: if matrices or vertices have the wrong number of dimensions.
172 | """ 173 | if len(matrices.shape) != 3: 174 | raise ValueError( 175 | 'matrices must have 3 dimensions (missing batch dimension?)') 176 | if len(vertices.shape) != 3: 177 | raise ValueError( 178 | 'vertices must have 3 dimensions (missing batch dimension?)') 179 | homogeneous_coord = tf.ones( 180 | [tf.shape(vertices)[0], tf.shape(vertices)[1], 1], dtype=tf.float32) 181 | vertices_homogeneous = tf.concat([vertices, homogeneous_coord], 2) 182 | 183 | return tf.matmul(vertices_homogeneous, matrices, transpose_b=True) 184 | -------------------------------------------------------------------------------- /data/.gitignore: -------------------------------------------------------------------------------- 1 | *.stl 2 | -------------------------------------------------------------------------------- /mesh_renderer.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | 15 | """Differentiable 3-D rendering of a triangle mesh.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | import tensorflow as tf 22 | 23 | import camera_utils 24 | import rasterize_triangles 25 | 26 | 27 | def phong_shader(normals, 28 | alphas, 29 | pixel_positions, 30 | light_positions, 31 | light_intensities, 32 | diffuse_colors=None, 33 | camera_position=None, 34 | specular_colors=None, 35 | shininess_coefficients=None, 36 | ambient_color=None): 37 | """Computes pixelwise lighting from rasterized buffers with the Phong model. 38 | 39 | Args: 40 | normals: a 4D float32 tensor with shape [batch_size, image_height, 41 | image_width, 3]. The inner dimension is the world space XYZ normal for 42 | the corresponding pixel. Should be already normalized. 43 | alphas: a 3D float32 tensor with shape [batch_size, image_height, 44 | image_width]. The inner dimension is the alpha value (transparency) 45 | for the corresponding pixel. 46 | pixel_positions: a 4D float32 tensor with shape [batch_size, image_height, 47 | image_width, 3]. The inner dimension is the world space XYZ position for 48 | the corresponding pixel. 49 | light_positions: a 3D tensor with shape [batch_size, light_count, 3]. The 50 | XYZ position of each light in the scene. In the same coordinate space as 51 | pixel_positions. 52 | light_intensities: a 3D tensor with shape [batch_size, light_count, 3]. The 53 | RGB intensity values for each light. Intensities may be above one. 54 | diffuse_colors: a 4D float32 tensor with shape [batch_size, image_height, 55 | image_width, 3]. The inner dimension is the diffuse RGB coefficients at 56 | a pixel in the range [0, 1]. 57 | camera_position: a 2D tensor with shape [batch_size, 3]. The XYZ camera 58 | position in the scene. If supplied, specular reflections will be 59 | computed. If not supplied, specular_colors and shininess_coefficients 60 | are expected to be None.
In the same coordinate space as 61 | pixel_positions. 62 | specular_colors: a 4D float32 tensor with shape [batch_size, image_height, 63 | image_width, 3]. The inner dimension is the specular RGB coefficients at 64 | a pixel in the range [0, 1]. If None, assumed to be tf.zeros(). 65 | shininess_coefficients: A 3D float32 tensor that is broadcast to shape 66 | [batch_size, image_height, image_width]. The inner dimension is the 67 | shininess coefficient for the object at a pixel. Dimensions that are 68 | constant can be given length 1, so [batch_size, 1, 1] and [1, 1, 1] are 69 | also valid input shapes. 70 | ambient_color: a 2D tensor with shape [batch_size, 3]. The RGB ambient 71 | color, which is added to each pixel before tone mapping. If None, it is 72 | assumed to be tf.zeros(). 73 | Returns: 74 | A 4D float32 tensor of shape [batch_size, image_height, image_width, 4] 75 | containing the lit RGBA color values for each image at each pixel. Colors 76 | are in the range [0,1]. 77 | 78 | Raises: 79 | ValueError: An invalid argument to the method is detected.
80 | """ 81 | batch_size, image_height, image_width = [s.value for s in normals.shape[:-1]] 82 | light_count = light_positions.shape[1].value 83 | pixel_count = image_height * image_width 84 | # Reshape all values to easily do pixelwise computations: 85 | normals = tf.reshape(normals, [batch_size, -1, 3]) 86 | alphas = tf.reshape(alphas, [batch_size, -1, 1]) 87 | diffuse_colors = tf.reshape(diffuse_colors, [batch_size, -1, 3]) 88 | if camera_position is not None: 89 | specular_colors = tf.reshape(specular_colors, [batch_size, -1, 3]) 90 | 91 | # Ambient component 92 | output_colors = tf.zeros([batch_size, image_height * image_width, 3]) 93 | if ambient_color is not None: 94 | ambient_reshaped = tf.expand_dims(ambient_color, axis=1) 95 | output_colors = tf.add(output_colors, ambient_reshaped * diffuse_colors) 96 | 97 | # Diffuse component 98 | pixel_positions = tf.reshape(pixel_positions, [batch_size, -1, 3]) 99 | per_light_pixel_positions = tf.stack( 100 | [pixel_positions] * light_count, 101 | axis=1) # [batch_size, light_count, pixel_count, 3] 102 | directions_to_lights = tf.nn.l2_normalize( 103 | tf.expand_dims(light_positions, axis=2) - per_light_pixel_positions, 104 | dim=3) # [batch_size, light_count, pixel_count, 3] 105 | # The diffuse component should only contribute when the light and normal 106 | # face one another (i.e.
the dot product is nonnegative): 107 | normals_dot_lights = tf.clip_by_value( 108 | tf.reduce_sum( 109 | tf.expand_dims(normals, axis=1) * directions_to_lights, axis=3), 0.0, 110 | 1.0) # [batch_size, light_count, pixel_count] 111 | diffuse_output = tf.expand_dims( 112 | diffuse_colors, axis=1) * tf.expand_dims( 113 | normals_dot_lights, axis=3) * tf.expand_dims( 114 | light_intensities, axis=2) 115 | diffuse_output = tf.reduce_sum( 116 | diffuse_output, axis=1) # [batch_size, pixel_count, 3] 117 | output_colors = tf.add(output_colors, diffuse_output) 118 | 119 | # Specular component 120 | if camera_position is not None: 121 | camera_position = tf.reshape(camera_position, [batch_size, 1, 3]) 122 | mirror_reflection_direction = tf.nn.l2_normalize( 123 | 2.0 * tf.expand_dims(normals_dot_lights, axis=3) * tf.expand_dims( 124 | normals, axis=1) - directions_to_lights, 125 | dim=3) 126 | direction_to_camera = tf.nn.l2_normalize( 127 | camera_position - pixel_positions, dim=2) 128 | reflection_direction_dot_camera_direction = tf.reduce_sum( 129 | tf.expand_dims(direction_to_camera, axis=1) * 130 | mirror_reflection_direction, 131 | axis=3) 132 | # The specular component should only contribute when the reflection is 133 | # external: 134 | reflection_direction_dot_camera_direction = tf.clip_by_value( 135 | tf.nn.l2_normalize(reflection_direction_dot_camera_direction, dim=2), 136 | 0.0, 1.0) 137 | # The specular component should also only contribute when the diffuse 138 | # component contributes: 139 | reflection_direction_dot_camera_direction = tf.where( 140 | normals_dot_lights != 0.0, reflection_direction_dot_camera_direction, 141 | tf.zeros_like( 142 | reflection_direction_dot_camera_direction, dtype=tf.float32)) 143 | # Reshape to support broadcasting the shininess coefficient, which rarely 144 | # varies per-vertex: 145 | reflection_direction_dot_camera_direction = tf.reshape( 146 | reflection_direction_dot_camera_direction, 147 | [batch_size, light_count,
image_height, image_width]) 148 | shininess_coefficients = tf.expand_dims(shininess_coefficients, axis=1) 149 | specularity = tf.reshape( 150 | tf.pow(reflection_direction_dot_camera_direction, 151 | shininess_coefficients), 152 | [batch_size, light_count, pixel_count, 1]) 153 | specular_output = tf.expand_dims( 154 | specular_colors, axis=1) * specularity * tf.expand_dims( 155 | light_intensities, axis=2) 156 | specular_output = tf.reduce_sum(specular_output, axis=1) 157 | output_colors = tf.add(output_colors, specular_output) 158 | rgb_images = tf.reshape(output_colors, 159 | [batch_size, image_height, image_width, 3]) 160 | alpha_images = tf.reshape(alphas, [batch_size, image_height, image_width, 1]) 161 | valid_rgb_values = tf.concat(3 * [alpha_images > 0.5], axis=3) 162 | rgb_images = tf.where(valid_rgb_values, rgb_images, 163 | tf.zeros_like(rgb_images, dtype=tf.float32)) 164 | return tf.reverse(tf.concat([rgb_images, alpha_images], axis=3), axis=[1]) 165 | 166 | 167 | def tone_mapper(image, gamma): 168 | """Applies gamma correction to the input image. 169 | 170 | Tone maps the input image batch in order to make scenes with a high dynamic 171 | range viewable. The gamma correction factor is computed separately per image, 172 | but is shared between all provided channels. The exact function computed is: 173 | 174 | image_out = A*image_in^gamma, where A is an image-wide constant computed so 175 | that the maximum image value is approximately 1. The correction is applied 176 | to all channels. 177 | 178 | Args: 179 | image: 4-D float32 tensor with shape [batch_size, image_height, 180 | image_width, channel_count]. The batch of images to tone map. 181 | gamma: 0-D float32 nonnegative tensor. Values of gamma below one compress 182 | relative contrast in the image, and values above one increase it. A 183 | value of 1 is equivalent to scaling the image to have a maximum value 184 | of 1. 
185 | Returns: 186 | 4-D float32 tensor with shape [batch_size, image_height, image_width, 187 | channel_count]. Contains the gamma-corrected images, clipped to the range 188 | [0, 1]. 189 | """ 190 | batch_size = image.shape[0].value 191 | corrected_image = tf.pow(image, gamma) 192 | image_max = tf.reduce_max( 193 | tf.reshape(corrected_image, [batch_size, -1]), axis=1) 194 | scaled_image = tf.divide(corrected_image, 195 | tf.reshape(image_max, [batch_size, 1, 1, 1])) 196 | return tf.clip_by_value(scaled_image, 0.0, 1.0) 197 | 198 | 199 | def mesh_renderer(vertices, 200 | triangles, 201 | normals, 202 | diffuse_colors, 203 | camera_position, 204 | camera_lookat, 205 | camera_up, 206 | light_positions, 207 | light_intensities, 208 | image_width, 209 | image_height, 210 | specular_colors=None, 211 | shininess_coefficients=None, 212 | ambient_color=None, 213 | fov_y=40.0, 214 | near_clip=0.01, 215 | far_clip=10.0): 216 | """Renders an input scene using phong shading, and returns an output image. 217 | 218 | Args: 219 | vertices: 3-D float32 tensor with shape [batch_size, vertex_count, 3]. Each 220 | triplet is an xyz position in world space. 221 | triangles: 2-D int32 tensor with shape [triangle_count, 3]. Each triplet 222 | should contain vertex indices describing a triangle such that the 223 | triangle's normal points toward the viewer if the forward order of the 224 | triplet defines a clockwise winding of the vertices. Gradients with 225 | respect to this tensor are not available. 226 | normals: 3-D float32 tensor with shape [batch_size, vertex_count, 3]. Each 227 | triplet is the xyz vertex normal for its corresponding vertex. Each 228 | vector is assumed to be already normalized. 229 | diffuse_colors: 3-D float32 tensor with shape [batch_size, 230 | vertex_count, 3]. The RGB diffuse reflection in the range [0,1] for 231 | each vertex. 
232 | camera_position: 2-D tensor with shape [batch_size, 3] or 1-D tensor with 233 | shape [3] specifying the XYZ world space camera position. 234 | camera_lookat: 2-D tensor with shape [batch_size, 3] or 1-D tensor with 235 | shape [3] containing an XYZ point along the center of the camera's gaze. 236 | camera_up: 2-D tensor with shape [batch_size, 3] or 1-D tensor with shape 237 | [3] containing the up direction for the camera. The camera will have no 238 | tilt with respect to this direction. 239 | light_positions: a 3-D tensor with shape [batch_size, light_count, 3]. The 240 | XYZ position of each light in the scene. In the same coordinate space as 241 | pixel_positions. 242 | light_intensities: a 3-D tensor with shape [batch_size, light_count, 3]. The 243 | RGB intensity values for each light. Intensities may be above one. 244 | image_width: int specifying desired output image width in pixels. 245 | image_height: int specifying desired output image height in pixels. 246 | specular_colors: 3-D float32 tensor with shape [batch_size, 247 | vertex_count, 3]. The RGB specular reflection in the range [0, 1] for 248 | each vertex. If supplied, specular reflections will be computed, and 249 | both specular_colors and shininess_coefficients are expected. 250 | shininess_coefficients: a 0D-2D float32 tensor with maximum shape 251 | [batch_size, vertex_count]. The phong shininess coefficient of each 252 | vertex. A 0D tensor or float gives a constant shininess coefficient 253 | across all batches and images. A 1D tensor must have shape [batch_size], 254 | and a single shininess coefficient per image is used. 255 | ambient_color: a 2D tensor with shape [batch_size, 3]. The RGB ambient 256 | color, which is added to each pixel in the scene. If None, it is 257 | assumed to be black. 258 | fov_y: float, 0D tensor, or 1D tensor with shape [batch_size] specifying 259 | desired output image y field of view in degrees. 
260 | near_clip: float, 0D tensor, or 1D tensor with shape [batch_size] specifying 261 | near clipping plane distance. 262 | far_clip: float, 0D tensor, or 1D tensor with shape [batch_size] specifying 263 | far clipping plane distance. 264 | 265 | Returns: 266 | A 4-D float32 tensor of shape [batch_size, image_height, image_width, 4] 267 | containing the lit RGBA color values for each image at each pixel. RGB 268 | colors are the intensity values before tonemapping and can be in the range 269 | [0, infinity]. Clipping to the range [0,1] with tf.clip_by_value is likely 270 | reasonable for both viewing and training most scenes. More complex scenes 271 | with multiple lights should tone map color values for display only. One 272 | simple tonemapping approach is to rescale color values as x/(1+x); gamma 273 | compression is another common technique. Alpha values are zero for 274 | background pixels and near one for mesh pixels. 275 | Raises: 276 | ValueError: An invalid argument to the method is detected.
277 | """ 278 | if len(vertices.shape) != 3: 279 | raise ValueError('Vertices must have shape [batch_size, vertex_count, 3].') 280 | batch_size = vertices.shape[0].value 281 | if len(normals.shape) != 3: 282 | raise ValueError('Normals must have shape [batch_size, vertex_count, 3].') 283 | if len(light_positions.shape) != 3: 284 | raise ValueError( 285 | 'Light_positions must have shape [batch_size, light_count, 3].') 286 | if len(light_intensities.shape) != 3: 287 | raise ValueError( 288 | 'Light_intensities must have shape [batch_size, light_count, 3].') 289 | if len(diffuse_colors.shape) != 3: 290 | raise ValueError( 291 | 'vertex_diffuse_colors must have shape [batch_size, vertex_count, 3].') 292 | if (ambient_color is not None and 293 | ambient_color.get_shape().as_list() != [batch_size, 3]): 294 | raise ValueError('Ambient_color must have shape [batch_size, 3].') 295 | if camera_position.get_shape().as_list() == [3]: 296 | camera_position = tf.tile( 297 | tf.expand_dims(camera_position, axis=0), [batch_size, 1]) 298 | elif camera_position.get_shape().as_list() != [batch_size, 3]: 299 | raise ValueError('Camera_position must have shape [batch_size, 3]') 300 | if camera_lookat.get_shape().as_list() == [3]: 301 | camera_lookat = tf.tile( 302 | tf.expand_dims(camera_lookat, axis=0), [batch_size, 1]) 303 | elif camera_lookat.get_shape().as_list() != [batch_size, 3]: 304 | raise ValueError('Camera_lookat must have shape [batch_size, 3]') 305 | if camera_up.get_shape().as_list() == [3]: 306 | camera_up = tf.tile(tf.expand_dims(camera_up, axis=0), [batch_size, 1]) 307 | elif camera_up.get_shape().as_list() != [batch_size, 3]: 308 | raise ValueError('Camera_up must have shape [batch_size, 3]') 309 | if isinstance(fov_y, float): 310 | fov_y = tf.constant(batch_size * [fov_y], dtype=tf.float32) 311 | elif not fov_y.get_shape().as_list(): 312 | fov_y = tf.tile(tf.expand_dims(fov_y, 0), [batch_size]) 313 | elif fov_y.get_shape().as_list() != [batch_size]: 314 | raise 
ValueError('Fov_y must be a float, a 0D tensor, or a 1D tensor with ' 315 | 'shape [batch_size]') 316 | if isinstance(near_clip, float): 317 | near_clip = tf.constant(batch_size * [near_clip], dtype=tf.float32) 318 | elif not near_clip.get_shape().as_list(): 319 | near_clip = tf.tile(tf.expand_dims(near_clip, 0), [batch_size]) 320 | elif near_clip.get_shape().as_list() != [batch_size]: 321 | raise ValueError('Near_clip must be a float, a 0D tensor, or a 1D tensor ' 322 | 'with shape [batch_size]') 323 | if isinstance(far_clip, float): 324 | far_clip = tf.constant(batch_size * [far_clip], dtype=tf.float32) 325 | elif not far_clip.get_shape().as_list(): 326 | far_clip = tf.tile(tf.expand_dims(far_clip, 0), [batch_size]) 327 | elif far_clip.get_shape().as_list() != [batch_size]: 328 | raise ValueError('Far_clip must be a float, a 0D tensor, or a 1D tensor ' 329 | 'with shape [batch_size]') 330 | if specular_colors is not None and shininess_coefficients is None: 331 | raise ValueError( 332 | 'Specular colors were supplied without shininess coefficients.') 333 | if shininess_coefficients is not None and specular_colors is None: 334 | raise ValueError( 335 | 'Shininess coefficients were supplied without specular colors.') 336 | if specular_colors is not None: 337 | # Since a 0-D float32 tensor is accepted, also accept a float.
338 | if isinstance(shininess_coefficients, float): 339 | shininess_coefficients = tf.constant( 340 | shininess_coefficients, dtype=tf.float32) 341 | if len(specular_colors.shape) != 3: 342 | raise ValueError('The specular colors must have shape [batch_size, ' 343 | 'vertex_count, 3].') 344 | if len(shininess_coefficients.shape) > 2: 345 | raise ValueError('The shininess coefficients must have shape at most ' 346 | '[batch_size, vertex_count].') 347 | # If we don't have per-vertex coefficients, we can just reshape the 348 | # input shininess to broadcast later, rather than interpolating an 349 | # additional vertex attribute: 350 | if len(shininess_coefficients.shape) < 2: 351 | vertex_attributes = tf.concat( 352 | [normals, vertices, diffuse_colors, specular_colors], axis=2) 353 | else: 354 | vertex_attributes = tf.concat( 355 | [ 356 | normals, vertices, diffuse_colors, specular_colors, 357 | tf.expand_dims(shininess_coefficients, axis=2) 358 | ], 359 | axis=2) 360 | else: 361 | vertex_attributes = tf.concat([normals, vertices, diffuse_colors], axis=2) 362 | 363 | camera_matrices = camera_utils.look_at(camera_position, camera_lookat, 364 | camera_up) 365 | 366 | perspective_transforms = camera_utils.perspective(image_width / image_height, 367 | fov_y, near_clip, far_clip) 368 | 369 | clip_space_transforms = tf.matmul(perspective_transforms, camera_matrices) 370 | 371 | pixel_attributes = rasterize_triangles.rasterize( 372 | vertices, vertex_attributes, triangles, clip_space_transforms, 373 | image_width, image_height, [-1] * vertex_attributes.shape[2].value) 374 | 375 | # Extract the interpolated vertex attributes from the pixel buffer and 376 | # supply them to the shader: 377 | pixel_normals = tf.nn.l2_normalize(pixel_attributes[:, :, :, 0:3], dim=3) 378 | pixel_positions = pixel_attributes[:, :, :, 3:6] 379 | diffuse_colors = pixel_attributes[:, :, :, 6:9] 380 | if specular_colors is not None: 381 | specular_colors = pixel_attributes[:, :, :, 9:12] 382 | #
Retrieve the interpolated shininess coefficients if necessary, or just 383 | # reshape our input for broadcasting: 384 | if len(shininess_coefficients.shape) == 2: 385 | shininess_coefficients = pixel_attributes[:, :, :, 12] 386 | else: 387 | shininess_coefficients = tf.reshape(shininess_coefficients, [-1, 1, 1]) 388 | 389 | pixel_mask = tf.cast(tf.reduce_any(diffuse_colors >= 0, axis=3), tf.float32) 390 | 391 | renders = phong_shader( 392 | normals=pixel_normals, 393 | alphas=pixel_mask, 394 | pixel_positions=pixel_positions, 395 | light_positions=light_positions, 396 | light_intensities=light_intensities, 397 | diffuse_colors=diffuse_colors, 398 | camera_position=camera_position if specular_colors is not None else None, 399 | specular_colors=specular_colors, 400 | shininess_coefficients=shininess_coefficients, 401 | ambient_color=ambient_color) 402 | return renders 403 | -------------------------------------------------------------------------------- /mesh_renderer/kernels/rasterize_triangles_grad.cc: -------------------------------------------------------------------------------- 1 | // Copyright 2017 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 
14 | 15 | #include 16 | #include 17 | 18 | #include "tensorflow/core/framework/op.h" 19 | #include "tensorflow/core/framework/op_kernel.h" 20 | 21 | namespace { 22 | 23 | // Threshold for a barycentric coordinate triplet's sum, below which the 24 | // coordinates at a pixel are deemed degenerate. Most such degenerate triplets 25 | // in an image will be exactly zero, as this is how pixels outside the mesh 26 | // are rendered. 27 | constexpr float kDegenerateBarycentricCoordinatesCutoff = 0.9f; 28 | 29 | // If the area of a triangle is very small in screen space, the corner vertices 30 | // are approaching colinearity, and we should drop the gradient to avoid 31 | // numerical instability (in particular, blowup, as the forward pass computation 32 | // already only has 8 bits of precision). 33 | constexpr float kMinimumTriangleArea = 1e-13; 34 | 35 | } // namespace 36 | 37 | namespace tf_mesh_renderer { 38 | 39 | using ::tensorflow::DEVICE_CPU; 40 | using ::tensorflow::OpKernel; 41 | using ::tensorflow::OpKernelConstruction; 42 | using ::tensorflow::OpKernelContext; 43 | using ::tensorflow::PartialTensorShape; 44 | using ::tensorflow::Status; 45 | using ::tensorflow::Tensor; 46 | using ::tensorflow::TensorShape; 47 | using ::tensorflow::errors::InvalidArgument; 48 | 49 | REGISTER_OP("RasterizeTrianglesGrad") 50 | .Input("vertices: float32") 51 | .Input("triangles: int32") 52 | .Input("barycentric_coordinates: float32") 53 | .Input("triangle_ids: int32") 54 | .Input("df_dbarycentric_coordinates: float32") 55 | .Attr("image_width: int") 56 | .Attr("image_height: int") 57 | .Output("df_dvertices: float32"); 58 | 59 | class RasterizeTrianglesGradOp : public OpKernel { 60 | public: 61 | explicit RasterizeTrianglesGradOp(OpKernelConstruction* context) 62 | : OpKernel(context) { 63 | OP_REQUIRES_OK(context, context->GetAttr("image_width", &image_width_)); 64 | OP_REQUIRES(context, image_width_ > 0, 65 | InvalidArgument("Image width must be > 0, got ", image_width_)); 66 | 
67 | OP_REQUIRES_OK(context, context->GetAttr("image_height", &image_height_)); 68 | OP_REQUIRES( 69 | context, image_height_ > 0, 70 | InvalidArgument("Image height must be > 0, got ", image_height_)); 71 | } 72 | 73 | ~RasterizeTrianglesGradOp() override {} 74 | 75 | void Compute(OpKernelContext* context) override { 76 | const Tensor& vertices_tensor = context->input(0); 77 | OP_REQUIRES( 78 | context, 79 | PartialTensorShape({-1, 4}).IsCompatibleWith(vertices_tensor.shape()), 80 | InvalidArgument( 81 | "RasterizeTrianglesGrad expects vertices to have shape (-1, 4).")); 82 | auto vertices_flat = vertices_tensor.flat<float>(); 83 | const unsigned int vertex_count = vertices_flat.size() / 4; 84 | const float* vertices = vertices_flat.data(); 85 | 86 | const Tensor& triangles_tensor = context->input(1); 87 | OP_REQUIRES( 88 | context, 89 | PartialTensorShape({-1, 3}).IsCompatibleWith(triangles_tensor.shape()), 90 | InvalidArgument( 91 | "RasterizeTrianglesGrad expects triangles to be a matrix.")); 92 | auto triangles_flat = triangles_tensor.flat<int>(); 93 | const int* triangles = triangles_flat.data(); 94 | 95 | const Tensor& barycentric_coordinates_tensor = context->input(2); 96 | OP_REQUIRES(context, 97 | TensorShape({image_height_, image_width_, 3}) == 98 | barycentric_coordinates_tensor.shape(), 99 | InvalidArgument( 100 | "RasterizeTrianglesGrad expects barycentric_coordinates to " 101 | "have shape {image_height, image_width, 3}")); 102 | auto barycentric_coordinates_flat = 103 | barycentric_coordinates_tensor.flat<float>(); 104 | const float* barycentric_coordinates = barycentric_coordinates_flat.data(); 105 | 106 | const Tensor& triangle_ids_tensor = context->input(3); 107 | OP_REQUIRES( 108 | context, 109 | TensorShape({image_height_, image_width_}) == 110 | triangle_ids_tensor.shape(), 111 | InvalidArgument( 112 | "RasterizeTrianglesGrad expects triangle_ids to have shape " 113 | "{image_height, image_width}")); 114 | auto triangle_ids_flat = triangle_ids_tensor.flat<int>();
115 | const int* triangle_ids = triangle_ids_flat.data(); 116 | 117 | // The naming convention we use for all derivatives is d<y>_d<x> -> 118 | // the partial of y with respect to x. 119 | const Tensor& df_dbarycentric_coordinates_tensor = context->input(4); 120 | OP_REQUIRES( 121 | context, 122 | TensorShape({image_height_, image_width_, 3}) == 123 | df_dbarycentric_coordinates_tensor.shape(), 124 | InvalidArgument( 125 | "RasterizeTrianglesGrad expects df_dbarycentric_coordinates " 126 | "to have shape {image_height, image_width, 3}")); 127 | auto df_dbarycentric_coordinates_flat = 128 | df_dbarycentric_coordinates_tensor.flat<float>(); 129 | const float* df_dbarycentric_coordinates = 130 | df_dbarycentric_coordinates_flat.data(); 131 | 132 | Tensor* df_dvertices_tensor = nullptr; 133 | OP_REQUIRES_OK(context, 134 | context->allocate_output(0, TensorShape({vertex_count, 4}), 135 | &df_dvertices_tensor)); 136 | auto df_dvertices_flat = df_dvertices_tensor->flat<float>(); 137 | float* df_dvertices = df_dvertices_flat.data(); 138 | std::fill(df_dvertices, df_dvertices + vertex_count * 4, 0.0f); 139 | 140 | // We first loop over each pixel in the output image, and compute 141 | // dbarycentric_coordinate[0,1,2]/dvertex[0x, 0y, 1x, 1y, 2x, 2y]. 142 | // Next we compute each value above's contribution to 143 | // df/dvertices, building up that matrix as the output of this iteration. 144 | for (unsigned int pixel_id = 0; pixel_id < image_height_ * image_width_; 145 | ++pixel_id) { 146 | // b0, b1, and b2 are the three barycentric coordinate values 147 | // rendered at pixel pixel_id.
148 | const float b0 = barycentric_coordinates[3 * pixel_id]; 149 | const float b1 = barycentric_coordinates[3 * pixel_id + 1]; 150 | const float b2 = barycentric_coordinates[3 * pixel_id + 2]; 151 | 152 | if (b0 + b1 + b2 < kDegenerateBarycentricCoordinatesCutoff) { 153 | continue; 154 | } 155 | 156 | const float df_db0 = df_dbarycentric_coordinates[3 * pixel_id]; 157 | const float df_db1 = df_dbarycentric_coordinates[3 * pixel_id + 1]; 158 | const float df_db2 = df_dbarycentric_coordinates[3 * pixel_id + 2]; 159 | 160 | const int triangle_at_current_pixel = triangle_ids[pixel_id]; 161 | const int* vertices_at_current_pixel = 162 | &triangles[3 * triangle_at_current_pixel]; 163 | 164 | // Extract vertex indices for the current triangle. 165 | const int v0_id = 4 * vertices_at_current_pixel[0]; 166 | const int v1_id = 4 * vertices_at_current_pixel[1]; 167 | const int v2_id = 4 * vertices_at_current_pixel[2]; 168 | 169 | // Extract x,y,w components of the vertices' clip space coordinates. 170 | const float x0 = vertices[v0_id]; 171 | const float y0 = vertices[v0_id + 1]; 172 | const float w0 = vertices[v0_id + 3]; 173 | const float x1 = vertices[v1_id]; 174 | const float y1 = vertices[v1_id + 1]; 175 | const float w1 = vertices[v1_id + 3]; 176 | const float x2 = vertices[v2_id]; 177 | const float y2 = vertices[v2_id + 1]; 178 | const float w2 = vertices[v2_id + 3]; 179 | 180 | // Compute the pixel's normalized device coordinates (NDC). 181 | const int ix = pixel_id % image_width_; 182 | const int iy = pixel_id / image_width_; 183 | const float px = 2 * (ix + 0.5f) / image_width_ - 1.0f; 184 | const float py = 2 * (iy + 0.5f) / image_height_ - 1.0f; 185 | 186 | // Barycentric gradients wrt each vertex coordinate share a common factor.
187 | const float db0_dx = py * (w1 - w2) - (y1 - y2); 188 | const float db1_dx = py * (w2 - w0) - (y2 - y0); 189 | const float db2_dx = -(db0_dx + db1_dx); 190 | const float db0_dy = (x1 - x2) - px * (w1 - w2); 191 | const float db1_dy = (x2 - x0) - px * (w2 - w0); 192 | const float db2_dy = -(db0_dy + db1_dy); 193 | const float db0_dw = px * (y1 - y2) - py * (x1 - x2); 194 | const float db1_dw = px * (y2 - y0) - py * (x2 - x0); 195 | const float db2_dw = -(db0_dw + db1_dw); 196 | 197 | // Combine them with chain rule. 198 | const float df_dx = df_db0 * db0_dx + df_db1 * db1_dx + df_db2 * db2_dx; 199 | const float df_dy = df_db0 * db0_dy + df_db1 * db1_dy + df_db2 * db2_dy; 200 | const float df_dw = df_db0 * db0_dw + df_db1 * db1_dw + df_db2 * db2_dw; 201 | 202 | // Values of edge equations and inverse w at the current pixel. 203 | const float edge0_over_w = x2 * db0_dx + y2 * db0_dy + w2 * db0_dw; 204 | const float edge1_over_w = x2 * db1_dx + y2 * db1_dy + w2 * db1_dw; 205 | const float edge2_over_w = x1 * db2_dx + y1 * db2_dy + w1 * db2_dw; 206 | const float w_inv = edge0_over_w + edge1_over_w + edge2_over_w; 207 | 208 | // All gradients share a common denominator. 209 | const float w_sqr = 1 / (w_inv * w_inv); 210 | 211 | // Gradients wrt each vertex share a common factor. 
212 | const float edge0 = w_sqr * edge0_over_w; 213 | const float edge1 = w_sqr * edge1_over_w; 214 | const float edge2 = w_sqr * edge2_over_w; 215 | 216 | df_dvertices[v0_id + 0] += edge0 * df_dx; 217 | df_dvertices[v0_id + 1] += edge0 * df_dy; 218 | df_dvertices[v0_id + 3] += edge0 * df_dw; 219 | df_dvertices[v1_id + 0] += edge1 * df_dx; 220 | df_dvertices[v1_id + 1] += edge1 * df_dy; 221 | df_dvertices[v1_id + 3] += edge1 * df_dw; 222 | df_dvertices[v2_id + 0] += edge2 * df_dx; 223 | df_dvertices[v2_id + 1] += edge2 * df_dy; 224 | df_dvertices[v2_id + 3] += edge2 * df_dw; 225 | } 226 | } 227 | 228 | private: 229 | int image_width_; 230 | int image_height_; 231 | }; 232 | 233 | REGISTER_KERNEL_BUILDER(Name("RasterizeTrianglesGrad").Device(DEVICE_CPU), 234 | RasterizeTrianglesGradOp); 235 | 236 | } // namespace tf_mesh_renderer 237 | -------------------------------------------------------------------------------- /mesh_renderer/kernels/rasterize_triangles_impl.cc: -------------------------------------------------------------------------------- 1 | // Copyright 2017 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 14 | 15 | #include 16 | #include 17 | 18 | #include "rasterize_triangles_impl.h" 19 | 20 | namespace tf_mesh_renderer { 21 | 22 | namespace { 23 | 24 | // Takes the minimum of a, b, and c, rounds down, and converts to an integer 25 | // in the range [low, high]. 
26 | inline int ClampedIntegerMin(float a, float b, float c, int low, int high) { 27 | return std::min( 28 | std::max(static_cast<int>(std::floor(std::min(std::min(a, b), c))), low), 29 | high); 30 | } 31 | 32 | // Takes the maximum of a, b, and c, rounds up, and converts to an integer 33 | // in the range [low, high]. 34 | inline int ClampedIntegerMax(float a, float b, float c, int low, int high) { 35 | return std::min( 36 | std::max(static_cast<int>(std::ceil(std::max(std::max(a, b), c))), low), 37 | high); 38 | } 39 | 40 | // Computes a 3x3 matrix inverse without dividing by the determinant. 41 | // Instead, makes an unnormalized matrix inverse with the correct sign 42 | // by flipping the sign of the matrix if the determinant is negative. 43 | // By leaving out determinant division, the rows of M^-1 only depend on two out 44 | // of three of the columns of M; i.e., the first row of M^-1 only depends on the 45 | // second and third columns of M, the second only depends on the first and 46 | // third, etc. This means we can compute edge functions for two neighboring 47 | // triangles independently and produce exactly the same numerical result up to 48 | // the sign. This in turn means we can avoid cracks in rasterization without 49 | // using fixed-point arithmetic.
50 | // See http://mathworld.wolfram.com/MatrixInverse.html 51 | void ComputeUnnormalizedMatrixInverse(const float a11, const float a12, 52 | const float a13, const float a21, 53 | const float a22, const float a23, 54 | const float a31, const float a32, 55 | const float a33, float m_inv[9]) { 56 | m_inv[0] = a22 * a33 - a32 * a23; 57 | m_inv[1] = a13 * a32 - a33 * a12; 58 | m_inv[2] = a12 * a23 - a22 * a13; 59 | m_inv[3] = a23 * a31 - a33 * a21; 60 | m_inv[4] = a11 * a33 - a31 * a13; 61 | m_inv[5] = a13 * a21 - a23 * a11; 62 | m_inv[6] = a21 * a32 - a31 * a22; 63 | m_inv[7] = a12 * a31 - a32 * a11; 64 | m_inv[8] = a11 * a22 - a21 * a12; 65 | 66 | // The first column of the unnormalized M^-1 contains intermediate values for 67 | // det(M). 68 | const float det = a11 * m_inv[0] + a12 * m_inv[3] + a13 * m_inv[6]; 69 | 70 | // Transfer the sign of the determinant. 71 | if (det < 0.0f) { 72 | for (int i = 0; i < 9; ++i) { 73 | m_inv[i] = -m_inv[i]; 74 | } 75 | } 76 | } 77 | 78 | // Computes the edge functions from M^-1 as described by Olano and Greer, 79 | // "Triangle Scan Conversion using 2D Homogeneous Coordinates." 80 | // 81 | // This function combines equations (3) and (4). It first computes 82 | // [a b c] = u_i * M^-1, where u_0 = [1 0 0], u_1 = [0 1 0], etc., 83 | // then computes edge_i = aX + bY + c 84 | void ComputeEdgeFunctions(const float px, const float py, const float m_inv[9], 85 | float values[3]) { 86 | for (int i = 0; i < 3; ++i) { 87 | const float a = m_inv[3 * i + 0]; 88 | const float b = m_inv[3 * i + 1]; 89 | const float c = m_inv[3 * i + 2]; 90 | 91 | values[i] = a * px + b * py + c; 92 | } 93 | } 94 | 95 | // Determines whether the point p lies inside a front-facing triangle. 96 | // Counts pixels exactly on an edge as inside the triangle, as long as the 97 | // triangle is not degenerate. Degenerate (zero-area) triangles always fail the 98 | // inside test. 
99 | bool PixelIsInsideTriangle(const float edge_values[3]) { 100 | // Check that the edge values are all non-negative and that at least one is 101 | // positive (triangle is non-degenerate). 102 | return (edge_values[0] >= 0 && edge_values[1] >= 0 && edge_values[2] >= 0) && 103 | (edge_values[0] > 0 || edge_values[1] > 0 || edge_values[2] > 0); 104 | } 105 | 106 | } // namespace 107 | 108 | void RasterizeTrianglesImpl(const float* vertices, const int32* triangles, 109 | int32 triangle_count, int32 image_width, 110 | int32 image_height, int32* triangle_ids, 111 | float* barycentric_coordinates, float* z_buffer) { 112 | const float half_image_width = 0.5 * image_width; 113 | const float half_image_height = 0.5 * image_height; 114 | float unnormalized_matrix_inverse[9]; 115 | float b_over_w[3]; 116 | 117 | for (int32 triangle_id = 0; triangle_id < triangle_count; ++triangle_id) { 118 | const int32 v0_x_id = 4 * triangles[3 * triangle_id]; 119 | const int32 v1_x_id = 4 * triangles[3 * triangle_id + 1]; 120 | const int32 v2_x_id = 4 * triangles[3 * triangle_id + 2]; 121 | 122 | const float v0w = vertices[v0_x_id + 3]; 123 | const float v1w = vertices[v1_x_id + 3]; 124 | const float v2w = vertices[v2_x_id + 3]; 125 | // Early exit: if all w < 0, triangle is entirely behind the eye. 126 | if (v0w < 0 && v1w < 0 && v2w < 0) { 127 | continue; 128 | } 129 | 130 | const float v0x = vertices[v0_x_id]; 131 | const float v0y = vertices[v0_x_id + 1]; 132 | const float v1x = vertices[v1_x_id]; 133 | const float v1y = vertices[v1_x_id + 1]; 134 | const float v2x = vertices[v2_x_id]; 135 | const float v2y = vertices[v2_x_id + 1]; 136 | 137 | ComputeUnnormalizedMatrixInverse(v0x, v1x, v2x, v0y, v1y, v2y, v0w, v1w, 138 | v2w, unnormalized_matrix_inverse); 139 | 140 | // Initialize the bounding box to the entire screen. 
141 | int left = 0, right = image_width, bottom = 0, top = image_height; 142 | // If the triangle is entirely inside the screen, project the vertices to 143 | // pixel coordinates and find the triangle bounding box enlarged to the 144 | // nearest integer and clamped to the image boundaries. 145 | if (v0w > 0 && v1w > 0 && v2w > 0) { 146 | const float p0x = (v0x / v0w + 1.0) * half_image_width; 147 | const float p1x = (v1x / v1w + 1.0) * half_image_width; 148 | const float p2x = (v2x / v2w + 1.0) * half_image_width; 149 | const float p0y = (v0y / v0w + 1.0) * half_image_height; 150 | const float p1y = (v1y / v1w + 1.0) * half_image_height; 151 | const float p2y = (v2y / v2w + 1.0) * half_image_height; 152 | left = ClampedIntegerMin(p0x, p1x, p2x, 0, image_width); 153 | right = ClampedIntegerMax(p0x, p1x, p2x, 0, image_width); 154 | bottom = ClampedIntegerMin(p0y, p1y, p2y, 0, image_height); 155 | top = ClampedIntegerMax(p0y, p1y, p2y, 0, image_height); 156 | } 157 | 158 | // Iterate over each pixel in the bounding box. 159 | for (int iy = bottom; iy < top; ++iy) { 160 | for (int ix = left; ix < right; ++ix) { 161 | const float px = ((ix + 0.5) / half_image_width) - 1.0; 162 | const float py = ((iy + 0.5) / half_image_height) - 1.0; 163 | const int pixel_idx = iy * image_width + ix; 164 | 165 | ComputeEdgeFunctions(px, py, unnormalized_matrix_inverse, b_over_w); 166 | if (!PixelIsInsideTriangle(b_over_w)) { 167 | continue; 168 | } 169 | 170 | const float one_over_w = b_over_w[0] + b_over_w[1] + b_over_w[2]; 171 | const float b0 = b_over_w[0] / one_over_w; 172 | const float b1 = b_over_w[1] / one_over_w; 173 | const float b2 = b_over_w[2] / one_over_w; 174 | 175 | const float v0z = vertices[v0_x_id + 2]; 176 | const float v1z = vertices[v1_x_id + 2]; 177 | const float v2z = vertices[v2_x_id + 2]; 178 | // Since we computed an unnormalized w above, we need to recompute 179 | // a properly scaled clip-space w value and then divide clip-space z 180 | // by that. 
181 | const float clip_z = b0 * v0z + b1 * v1z + b2 * v2z; 182 | const float clip_w = b0 * v0w + b1 * v1w + b2 * v2w; 183 | const float z = clip_z / clip_w; 184 | 185 | // Skip the pixel if it is farther than the current z-buffer pixel or 186 | // beyond the near or far clipping plane. 187 | if (z < -1.0 || z > 1.0 || z > z_buffer[pixel_idx]) { 188 | continue; 189 | } 190 | 191 | triangle_ids[pixel_idx] = triangle_id; 192 | z_buffer[pixel_idx] = z; 193 | barycentric_coordinates[3 * pixel_idx + 0] = b0; 194 | barycentric_coordinates[3 * pixel_idx + 1] = b1; 195 | barycentric_coordinates[3 * pixel_idx + 2] = b2; 196 | } 197 | } 198 | } 199 | } 200 | 201 | } // namespace tf_mesh_renderer 202 | -------------------------------------------------------------------------------- /mesh_renderer/kernels/rasterize_triangles_impl.h: -------------------------------------------------------------------------------- 1 | // Copyright 2017 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 14 | 15 | #ifndef MESH_RENDERER_KERNELS_RASTERIZE_TRIANGLES_IMPL_H_ 16 | #define MESH_RENDERER_KERNELS_RASTERIZE_TRIANGLES_IMPL_H_ 17 | 18 | namespace tf_mesh_renderer { 19 | 20 | // Copied from tensorflow/core/platform/default/integral_types.h 21 | // to avoid making this file depend on tensorflow. 
22 | typedef int int32; 23 | typedef long long int64; 24 | 25 | // Computes the triangle id, barycentric coordinates, and z-buffer at each pixel 26 | // in the image. 27 | // 28 | // vertices: A flattened 2D array with 4*vertex_count elements. 29 | // Each contiguous quadruplet is the XYZW location of the vertex with that 30 | // quadruplet's id. The coordinates are assumed to be OpenGL-style clip-space 31 | // (i.e., post-projection, pre-divide), where X points right, Y points up, 32 | // Z points away. 33 | // triangles: A flattened 2D array with 3*triangle_count elements. 34 | // Each contiguous triplet is the three vertex ids indexing into vertices 35 | // describing one triangle with clockwise winding. 36 | // triangle_count: The number of triangles stored in the array triangles. 37 | // triangle_ids: A flattened 2D array with image_height*image_width elements. 38 | // At return, each pixel contains a triangle id in the range 39 | // [0, triangle_count). The id value is also 0 if there is no triangle 40 | // at the pixel. The barycentric_coordinates must be checked to 41 | // distinguish the two cases. 42 | // barycentric_coordinates: A flattened 3D array with 43 | // image_height*image_width*3 elements. At return, contains the triplet of 44 | // barycentric coordinates at each pixel in the same vertex ordering as 45 | // triangles. If no triangle is present, all coordinates are 0. 46 | // z_buffer: A flattened 2D array with image_height*image_width elements. At 47 | // return, contains the normalized device Z coordinates of the rendered 48 | // triangles.
49 | void RasterizeTrianglesImpl(const float* vertices, const int32* triangles, 50 | int32 triangle_count, int32 image_width, 51 | int32 image_height, int32* triangle_ids, 52 | float* barycentric_coordinates, float* z_buffer); 53 | 54 | } // namespace tf_mesh_renderer 55 | 56 | #endif // MESH_RENDERER_KERNELS_RASTERIZE_TRIANGLES_IMPL_H_ 57 | -------------------------------------------------------------------------------- /mesh_renderer/kernels/rasterize_triangles_impl_test.cc: -------------------------------------------------------------------------------- 1 | // Copyright 2017 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License.
14 | 15 | #include <vector> 16 | 17 | #include "gtest/gtest.h" 18 | #include "rasterize_triangles_impl.h" 19 | 20 | #include "third_party/lodepng.h" 21 | 22 | namespace tf_mesh_renderer { 23 | namespace { 24 | 25 | typedef unsigned char uint8; 26 | 27 | const int kImageHeight = 480; 28 | const int kImageWidth = 640; 29 | 30 | std::string GetRunfilesRelativePath(const std::string& filename) { 31 | const std::string srcdir = std::getenv("TEST_SRCDIR"); 32 | const std::string test_data = "/tf_mesh_renderer/mesh_renderer/test_data/"; 33 | return srcdir + test_data + filename; 34 | } 35 | 36 | void LoadPng(const std::string& filename, std::vector<uint8>* output) { 37 | unsigned width, height; 38 | unsigned error = lodepng::decode(*output, width, height, filename.c_str()); 39 | ASSERT_TRUE(error == 0) << "Decoder error: " << lodepng_error_text(error); 40 | } 41 | 42 | void SavePng(const std::string& filename, const std::vector<uint8>& image) { 43 | unsigned error = 44 | lodepng::encode(filename.c_str(), image, kImageWidth, kImageHeight); 45 | ASSERT_TRUE(error == 0) << "Encoder error: " << lodepng_error_text(error); 46 | } 47 | 48 | void FloatRGBToUint8RGBA(const std::vector<float>& input, 49 | std::vector<uint8>* output) { 50 | output->resize(kImageHeight * kImageWidth * 4); 51 | for (int y = 0; y < kImageHeight; ++y) { 52 | for (int x = 0; x < kImageWidth; ++x) { 53 | for (int c = 0; c < 3; ++c) { 54 | (*output)[(y * kImageWidth + x) * 4 + c] = 55 | input[(y * kImageWidth + x) * 3 + c] * 255; 56 | } 57 | (*output)[(y * kImageWidth + x) * 4 + 3] = 255; 58 | } 59 | } 60 | } 61 | 62 | void ExpectImageFileAndImageAreEqual(const std::string& baseline_file, 63 | const std::vector<float>& result, 64 | const std::string& comparison_name, 65 | const std::string& failure_message) { 66 | std::vector<uint8> baseline_rgba, result_rgba; 67 | LoadPng(GetRunfilesRelativePath(baseline_file), &baseline_rgba); 68 | FloatRGBToUint8RGBA(result, &result_rgba); 69 | 70 | const bool images_match = baseline_rgba == result_rgba; 71 | 72 | if
(!images_match) { 73 | const std::string result_output_path = 74 | "/tmp/" + comparison_name + "_result.png"; 75 | SavePng(result_output_path, result_rgba); 76 | } 77 | 78 | EXPECT_TRUE(images_match) << failure_message; 79 | } 80 | 81 | class RasterizeTrianglesImplTest : public ::testing::Test { 82 | protected: 83 | void CallRasterizeTrianglesImpl(const float* vertices, const int32* triangles, 84 | int32 triangle_count) { 85 | const int num_pixels = image_height_ * image_width_; 86 | barycentrics_buffer_.resize(num_pixels * 3); 87 | triangle_ids_buffer_.resize(num_pixels); 88 | 89 | constexpr float kClearDepth = 1.0; 90 | z_buffer_.resize(num_pixels, kClearDepth); 91 | 92 | RasterizeTrianglesImpl(vertices, triangles, triangle_count, image_width_, 93 | image_height_, triangle_ids_buffer_.data(), 94 | barycentrics_buffer_.data(), z_buffer_.data()); 95 | } 96 | 97 | // Expects that the sum of barycentric weights at a pixel is close to a 98 | // given value. 99 | void ExpectBarycentricSumIsNear(int x, int y, float expected) const { 100 | constexpr float kEpsilon = 1e-6f; 101 | auto it = barycentrics_buffer_.begin() + y * image_width_ * 3 + x * 3; 102 | EXPECT_NEAR(*it + *(it + 1) + *(it + 2), expected, kEpsilon); 103 | } 104 | // Expects that a pixel is covered by verifying that its barycentric 105 | // coordinates sum to one. 106 | void ExpectIsCovered(int x, int y) const { 107 | ExpectBarycentricSumIsNear(x, y, 1.0); 108 | } 109 | // Expects that a pixel is not covered by verifying that its barycentric 110 | // coordinates sum to zero. 
111 | void ExpectIsNotCovered(int x, int y) const { 112 | ExpectBarycentricSumIsNear(x, y, 0.0); 113 | } 114 | 115 | int image_height_ = 480; 116 | int image_width_ = 640; 117 | std::vector<float> barycentrics_buffer_; 118 | std::vector<int32> triangle_ids_buffer_; 119 | std::vector<float> z_buffer_; 120 | }; 121 | 122 | TEST_F(RasterizeTrianglesImplTest, CanRasterizeTriangle) { 123 | const std::vector<float> vertices = {-0.5, -0.5, 0.8, 1.0, 0.0, 0.5, 124 | 0.3, 1.0, 0.5, -0.5, 0.3, 1.0}; 125 | const std::vector<int32> triangles = {0, 1, 2}; 126 | 127 | CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 1); 128 | ExpectImageFileAndImageAreEqual("Simple_Triangle.png", barycentrics_buffer_, 129 | "triangle", "simple triangle does not match"); 130 | } 131 | 132 | TEST_F(RasterizeTrianglesImplTest, CanRasterizeExternalTriangle) { 133 | const std::vector<float> vertices = {-0.5, -0.5, 0.0, 1.0, 0.0, -0.5, 134 | 0.0, -1.0, 0.5, -0.5, 0.0, 1.0}; 135 | const std::vector<int32> triangles = {0, 1, 2}; 136 | 137 | CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 1); 138 | 139 | ExpectImageFileAndImageAreEqual("External_Triangle.png", 140 | barycentrics_buffer_, "external triangle", 141 | "external triangle does not match"); 142 | } 143 | 144 | TEST_F(RasterizeTrianglesImplTest, CanRasterizeCameraInsideBox) { 145 | const std::vector<float> vertices = { 146 | -1.0, -1.0, 0.0, 2.0, 1.0, -1.0, 0.0, 2.0, 1.0, 1.0, 0.0, 147 | 2.0, -1.0, 1.0, 0.0, 2.0, -1.0, -1.0, 0.0, -2.0, 1.0, -1.0, 148 | 0.0, -2.0, 1.0, 1.0, 0.0, -2.0, -1.0, 1.0, 0.0, -2.0}; 149 | const std::vector<int32> triangles = {0, 1, 2, 0, 2, 3, 4, 5, 6, 4, 6, 7, 150 | 2, 3, 7, 2, 7, 6, 1, 0, 4, 1, 4, 5, 151 | 0, 3, 7, 0, 7, 4, 1, 2, 6, 1, 6, 5}; 152 | 153 | CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 12); 154 | 155 | ExpectImageFileAndImageAreEqual("Inside_Box.png", 156 | barycentrics_buffer_, "camera inside box", 157 | "camera inside box does not match"); 158 | } 159 | 160 | TEST_F(RasterizeTrianglesImplTest, CanRasterizeTetrahedron) { 161
| const std::vector<float> vertices = {-0.5, -0.5, 0.8, 1.0, 0.0, 0.5, 162 | 0.3, 1.0, 0.5, -0.5, 0.3, 1.0, 163 | 0.0, 0.0, 0.0, 1.0}; 164 | const std::vector<int32> triangles = {0, 2, 1, 0, 1, 3, 1, 2, 3, 2, 0, 3}; 165 | 166 | CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 4); 167 | 168 | ExpectImageFileAndImageAreEqual("Simple_Tetrahedron.png", 169 | barycentrics_buffer_, "tetrahedron", 170 | "simple tetrahedron does not match"); 171 | } 172 | 173 | TEST_F(RasterizeTrianglesImplTest, CanRasterizeCube) { 174 | // Vertex values were obtained by dumping the clip-space vertex values from 175 | // the renderSimpleCube test in ../rasterize_triangles_test.py. 176 | const std::vector<float> vertices = { 177 | -2.60648608, -3.22707772, 6.85085106, 6.85714293, 178 | -1.30324292, -0.992946863, 8.56856918, 8.5714283, 179 | -1.30324292, 3.97178817, 7.70971, 7.71428585, 180 | -2.60648608, 1.73765731, 5.991992, 6, 181 | 1.30324292, -3.97178817, 6.27827835, 6.28571415, 182 | 2.60648608, -1.73765731, 7.99599648, 8, 183 | 2.60648608, 3.22707772, 7.13713741, 7.14285707, 184 | 1.30324292, 0.992946863, 5.41941929, 5.4285717}; 185 | 186 | const std::vector<int32> triangles = {0, 1, 2, 2, 3, 0, 3, 2, 6, 6, 7, 3, 187 | 7, 6, 5, 5, 4, 7, 4, 5, 1, 1, 0, 4, 188 | 5, 6, 2, 2, 1, 5, 7, 4, 0, 0, 3, 7}; 189 | 190 | CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 12); 191 | 192 | ExpectImageFileAndImageAreEqual("Barycentrics_Cube.png", 193 | barycentrics_buffer_, "cube", "cube does not match"); 194 | } 195 | 196 | TEST_F(RasterizeTrianglesImplTest, WorksWhenPixelIsOnTriangleEdge) { 197 | // Verifies that a pixel that lies exactly on a triangle edge is considered 198 | // inside the triangle.
199 | image_width_ = 641; 200 | const int x_pixel = image_width_ / 2; 201 | const float x_ndc = 0.0; 202 | constexpr int yPixel = 5; 203 | 204 | const std::vector<float> vertices = {x_ndc, -1.0, 0.5, 1.0, x_ndc, 1.0, 205 | 0.5, 1.0, 0.5, -1.0, 0.5, 1.0}; 206 | { 207 | const std::vector<int32> triangles = {0, 1, 2}; 208 | 209 | CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 1); 210 | 211 | ExpectIsCovered(x_pixel, yPixel); 212 | } 213 | { 214 | // Test the triangle with the same vertices in reverse order. 215 | const std::vector<int32> triangles = {2, 1, 0}; 216 | 217 | CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 1); 218 | 219 | ExpectIsCovered(x_pixel, yPixel); 220 | } 221 | } 222 | 223 | TEST_F(RasterizeTrianglesImplTest, CoversEdgePixelsOfImage) { 224 | // Verifies that the pixels along image edges are correctly covered. 225 | 226 | const std::vector<float> vertices = {-1.0, -1.0, 0.0, 1.0, 1.0, -1.0, 227 | 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 228 | -1.0, 1.0, 0.0, 1.0}; 229 | const std::vector<int32> triangles = {0, 1, 2, 0, 2, 3}; 230 | 231 | CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 2); 232 | 233 | ExpectIsCovered(0, 0); 234 | ExpectIsCovered(image_width_ - 1, 0); 235 | ExpectIsCovered(image_width_ - 1, image_height_ - 1); 236 | ExpectIsCovered(0, image_height_ - 1); 237 | } 238 | 239 | TEST_F(RasterizeTrianglesImplTest, PixelOnDegenerateTriangleIsNotInside) { 240 | // Verifies that a pixel lying exactly on a triangle with zero area is 241 | // counted as lying outside the triangle.
242 | image_width_ = 1; 243 | image_height_ = 1; 244 | const std::vector<float> vertices = {-1.0, -1.0, 0.0, 1.0, 1.0, 1.0, 245 | 0.0, 1.0, 0.0, 0.0, 0.0, 1.0}; 246 | const std::vector<int32> triangles = {0, 1, 2}; 247 | 248 | CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 1); 249 | 250 | ExpectIsNotCovered(0, 0); 251 | } 252 | 253 | } // namespace 254 | } // namespace tf_mesh_renderer 255 | -------------------------------------------------------------------------------- /mesh_renderer/kernels/rasterize_triangles_op.cc: -------------------------------------------------------------------------------- 1 | // Copyright 2017 Google LLC 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // https://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License.
14 | 15 | #include 16 | #include 17 | 18 | #include "rasterize_triangles_impl.h" 19 | #include "tensorflow/core/framework/op.h" 20 | #include "tensorflow/core/framework/op_kernel.h" 21 | 22 | namespace tf_mesh_renderer { 23 | 24 | using ::tensorflow::DEVICE_CPU; 25 | using ::tensorflow::int32; 26 | using ::tensorflow::OpKernel; 27 | using ::tensorflow::OpKernelConstruction; 28 | using ::tensorflow::OpKernelContext; 29 | using ::tensorflow::PartialTensorShape; 30 | using ::tensorflow::Status; 31 | using ::tensorflow::Tensor; 32 | using ::tensorflow::TensorShape; 33 | using ::tensorflow::TensorShapeUtils; 34 | using ::tensorflow::errors::Internal; 35 | using ::tensorflow::errors::InvalidArgument; 36 | 37 | REGISTER_OP("RasterizeTriangles") 38 | .Input("vertices: float32") 39 | .Input("triangles: int32") 40 | .Attr("image_width: int") 41 | .Attr("image_height: int") 42 | .Output("barycentric_coordinates: float32") 43 | .Output("triangle_ids: int32") 44 | .Output("z_buffer: float32") 45 | .Doc(R"doc( 46 | Implements a rasterization kernel for rendering mesh geometry. 47 | 48 | vertices: 2-D tensor with shape [vertex_count, 4]. The 3-D positions of the mesh 49 | vertices in clip-space (XYZW). 50 | triangles: 2-D tensor with shape [triangle_count, 3]. Each row is a tuple of 51 | indices into vertices specifying a triangle to be drawn. The triangle has an 52 | outward facing normal when the given indices appear in a clockwise winding to 53 | the viewer. 54 | image_width: positive int attribute specifying the width of the output image. 55 | image_height: positive int attribute specifying the height of the output image. 56 | barycentric_coordinates: 3-D tensor with shape [image_height, image_width, 3] 57 | containing the rendered barycentric coordinate triplet per pixel, before 58 | perspective correction. The triplet is the zero vector if the pixel is outside 59 | the mesh boundary. 
For valid pixels, the ordering of the coordinates 60 | corresponds to the ordering in triangles. 61 | triangle_ids: 2-D tensor with shape [image_height, image_width]. Contains the 62 | triangle id value for each pixel in the output image. For pixels within the 63 | mesh, this is an integer in the range [0, triangle_count) indexing a row of 64 | triangles. For pixels outside the mesh this is 0; 0 can therefore indicate 65 | either triangle 0 or a pixel outside the mesh. This ensures all returned 66 | triangle ids validly index into the triangles tensor, enabling the use of 67 | tf.gather with indices from this tensor. The barycentric coordinates can be 68 | used to determine pixel validity instead. 69 | z_buffer: 2-D tensor with shape [image_height, image_width]. Contains the Z 70 | coordinate in Normalized Device Coordinates for each pixel occupied by a 71 | triangle. 72 | )doc"); 73 | 74 | class RasterizeTrianglesOp : public OpKernel { 75 | public: 76 | explicit RasterizeTrianglesOp(OpKernelConstruction* context) 77 | : OpKernel(context) { 78 | OP_REQUIRES_OK(context, context->GetAttr("image_width", &image_width_)); 79 | OP_REQUIRES(context, image_width_ > 0, 80 | InvalidArgument("Image width must be > 0, got ", image_width_)); 81 | 82 | OP_REQUIRES_OK(context, context->GetAttr("image_height", &image_height_)); 83 | OP_REQUIRES( 84 | context, image_height_ > 0, 85 | InvalidArgument("Image height must be > 0, got ", image_height_)); 86 | } 87 | 88 | ~RasterizeTrianglesOp() override {} 89 | 90 | void Compute(OpKernelContext* context) override { 91 | const Tensor& vertices_tensor = context->input(0); 92 | OP_REQUIRES( 93 | context, 94 | PartialTensorShape({-1, 4}).IsCompatibleWith(vertices_tensor.shape()), 95 | InvalidArgument( 96 | "RasterizeTriangles expects vertices to have shape (-1, 4).")); 97 | auto vertices_flat = vertices_tensor.flat<float>(); 98 | const float* vertices = vertices_flat.data(); 99 | 100 | const Tensor& triangles_tensor = context->input(1); 101 |
OP_REQUIRES( 102 | context, 103 | PartialTensorShape({-1, 3}).IsCompatibleWith(triangles_tensor.shape()), 104 | InvalidArgument( 105 | "RasterizeTriangles expects triangles to have shape (-1, 3).")); 106 | auto triangles_flat = triangles_tensor.flat<int32>(); 107 | const int32* triangles = triangles_flat.data(); 108 | const int triangle_count = triangles_flat.size() / 3; 109 | 110 | Tensor* barycentric_tensor = nullptr; 111 | OP_REQUIRES_OK(context, 112 | context->allocate_output( 113 | 0, TensorShape({image_height_, image_width_, 3}), 114 | &barycentric_tensor)); 115 | 116 | Tensor* triangle_ids_tensor = nullptr; 117 | OP_REQUIRES_OK(context, context->allocate_output( 118 | 1, TensorShape({image_height_, image_width_}), 119 | &triangle_ids_tensor)); 120 | 121 | Tensor* z_buffer_tensor = nullptr; 122 | OP_REQUIRES_OK(context, context->allocate_output( 123 | 2, TensorShape({image_height_, image_width_}), 124 | &z_buffer_tensor)); 125 | 126 | // Clear barycentric and triangle id buffers to 0. 127 | // Clear z-buffer to 1 (the farthest NDC z value).
128 | barycentric_tensor->flat<float>().setZero(); 129 | triangle_ids_tensor->flat<int32>().setZero(); 130 | z_buffer_tensor->flat<float>().setConstant(1); 131 | 132 | RasterizeTrianglesImpl(vertices, triangles, triangle_count, image_width_, 133 | image_height_, 134 | triangle_ids_tensor->flat<int32>().data(), 135 | barycentric_tensor->flat<float>().data(), 136 | z_buffer_tensor->flat<float>().data()); 137 | } 138 | 139 | private: 140 | TF_DISALLOW_COPY_AND_ASSIGN(RasterizeTrianglesOp); 141 | 142 | int image_width_; 143 | int image_height_; 144 | }; 145 | 146 | REGISTER_KERNEL_BUILDER(Name("RasterizeTriangles").Device(DEVICE_CPU), 147 | RasterizeTrianglesOp); 148 | 149 | } // namespace tf_mesh_renderer 150 | -------------------------------------------------------------------------------- /mesh_renderer/mesh_renderer_test.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License.
14 | 15 | from __future__ import absolute_import 16 | from __future__ import division 17 | from __future__ import print_function 18 | 19 | import math 20 | import os 21 | 22 | import numpy as np 23 | import tensorflow as tf 24 | 25 | import camera_utils 26 | import mesh_renderer 27 | import test_utils 28 | 29 | 30 | class RenderTest(tf.test.TestCase): 31 | 32 | def setUp(self): 33 | self.test_data_directory = ( 34 | 'mesh_renderer/test_data/') 35 | 36 | tf.reset_default_graph() 37 | # Set up a basic cube centered at the origin, with vertex normals pointing 38 | # outwards along the line from the origin to the cube vertices: 39 | self.cube_vertices = tf.constant( 40 | [[-1, -1, 1], [-1, -1, -1], [-1, 1, -1], [-1, 1, 1], [1, -1, 1], 41 | [1, -1, -1], [1, 1, -1], [1, 1, 1]], 42 | dtype=tf.float32) 43 | self.cube_normals = tf.nn.l2_normalize(self.cube_vertices, dim=1) 44 | self.cube_triangles = tf.constant( 45 | [[0, 1, 2], [2, 3, 0], [3, 2, 6], [6, 7, 3], [7, 6, 5], [5, 4, 7], 46 | [4, 5, 1], [1, 0, 4], [5, 6, 2], [2, 1, 5], [7, 4, 0], [0, 3, 7]], 47 | dtype=tf.int32) 48 | 49 | def testRendersSimpleCube(self): 50 | """Renders a simple cube to test the full forward pass. 51 | 52 | Verifies the functionality of both the custom kernel and the python wrapper. 
53 | """ 54 | 55 | model_transforms = camera_utils.euler_matrices( 56 | [[-20.0, 0.0, 60.0], [45.0, 60.0, 0.0]])[:, :3, :3] 57 | 58 | vertices_world_space = tf.matmul( 59 | tf.stack([self.cube_vertices, self.cube_vertices]), 60 | model_transforms, 61 | transpose_b=True) 62 | 63 | normals_world_space = tf.matmul( 64 | tf.stack([self.cube_normals, self.cube_normals]), 65 | model_transforms, 66 | transpose_b=True) 67 | 68 | # camera position: 69 | eye = tf.constant(2 * [[0.0, 0.0, 6.0]], dtype=tf.float32) 70 | center = tf.constant(2 * [[0.0, 0.0, 0.0]], dtype=tf.float32) 71 | world_up = tf.constant(2 * [[0.0, 1.0, 0.0]], dtype=tf.float32) 72 | image_width = 640 73 | image_height = 480 74 | light_positions = tf.constant([[[0.0, 0.0, 6.0]], [[0.0, 0.0, 6.0]]]) 75 | light_intensities = tf.ones([2, 1, 3], dtype=tf.float32) 76 | vertex_diffuse_colors = tf.ones_like(vertices_world_space, dtype=tf.float32) 77 | 78 | rendered = mesh_renderer.mesh_renderer( 79 | vertices_world_space, self.cube_triangles, normals_world_space, 80 | vertex_diffuse_colors, eye, center, world_up, light_positions, 81 | light_intensities, image_width, image_height) 82 | 83 | with self.test_session() as sess: 84 | images = sess.run(rendered, feed_dict={}) 85 | for image_id in range(images.shape[0]): 86 | target_image_name = 'Gray_Cube_%i.png' % image_id 87 | baseline_image_path = os.path.join(self.test_data_directory, 88 | target_image_name) 89 | test_utils.expect_image_file_and_render_are_near( 90 | self, sess, baseline_image_path, images[image_id, :, :, :]) 91 | 92 | def testComplexShading(self): 93 | """Tests specular highlights, colors, and multiple lights per image.""" 94 | # rotate the cube for the test: 95 | model_transforms = camera_utils.euler_matrices( 96 | [[-20.0, 0.0, 60.0], [45.0, 60.0, 0.0]])[:, :3, :3] 97 | 98 | vertices_world_space = tf.matmul( 99 | tf.stack([self.cube_vertices, self.cube_vertices]), 100 | model_transforms, 101 | transpose_b=True) 102 | 103 | normals_world_space = 
tf.matmul( 104 | tf.stack([self.cube_normals, self.cube_normals]), 105 | model_transforms, 106 | transpose_b=True) 107 | 108 | # camera position: 109 | eye = tf.constant([[0.0, 0.0, 6.0], [0., 0.2, 18.0]], dtype=tf.float32) 110 | center = tf.constant([[0.0, 0.0, 0.0], [0.1, -0.1, 0.1]], dtype=tf.float32) 111 | world_up = tf.constant( 112 | [[0.0, 1.0, 0.0], [0.1, 1.0, 0.15]], dtype=tf.float32) 113 | fov_y = tf.constant([40., 13.3], dtype=tf.float32) 114 | near_clip = tf.constant(0.1, dtype=tf.float32) 115 | far_clip = tf.constant(25.0, dtype=tf.float32) 116 | image_width = 640 117 | image_height = 480 118 | light_positions = tf.constant([[[0.0, 0.0, 6.0], [1.0, 2.0, 6.0]], 119 | [[0.0, -2.0, 4.0], [1.0, 3.0, 4.0]]]) 120 | light_intensities = tf.constant( 121 | [[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], [[2.0, 0.0, 1.0], [0.0, 2.0, 122 | 1.0]]], 123 | dtype=tf.float32) 124 | # pyformat: disable 125 | vertex_diffuse_colors = tf.constant(2*[[[1.0, 0.0, 0.0], 126 | [0.0, 1.0, 0.0], 127 | [0.0, 0.0, 1.0], 128 | [1.0, 1.0, 1.0], 129 | [1.0, 1.0, 0.0], 130 | [1.0, 0.0, 1.0], 131 | [0.0, 1.0, 1.0], 132 | [0.5, 0.5, 0.5]]], 133 | dtype=tf.float32) 134 | vertex_specular_colors = tf.constant(2*[[[0.0, 1.0, 0.0], 135 | [0.0, 0.0, 1.0], 136 | [1.0, 1.0, 1.0], 137 | [1.0, 1.0, 0.0], 138 | [1.0, 0.0, 1.0], 139 | [0.0, 1.0, 1.0], 140 | [0.5, 0.5, 0.5], 141 | [1.0, 0.0, 0.0]]], 142 | dtype=tf.float32) 143 | # pyformat: enable 144 | shininess_coefficients = 6.0 * tf.ones([2, 8], dtype=tf.float32) 145 | ambient_color = tf.constant( 146 | [[0., 0., 0.], [0.1, 0.1, 0.2]], dtype=tf.float32) 147 | renders = mesh_renderer.mesh_renderer( 148 | vertices_world_space, self.cube_triangles, normals_world_space, 149 | vertex_diffuse_colors, eye, center, world_up, light_positions, 150 | light_intensities, image_width, image_height, vertex_specular_colors, 151 | shininess_coefficients, ambient_color, fov_y, near_clip, far_clip) 152 | tonemapped_renders = tf.concat( 153 | [ 154 | 
mesh_renderer.tone_mapper(renders[:, :, :, 0:3], 0.7), 155 | renders[:, :, :, 3:4] 156 | ], 157 | axis=3) 158 | 159 | # Check that shininess coefficient broadcasting works by also rendering 160 | # with a scalar shininess coefficient, and ensuring the result is identical: 161 | broadcasted_renders = mesh_renderer.mesh_renderer( 162 | vertices_world_space, self.cube_triangles, normals_world_space, 163 | vertex_diffuse_colors, eye, center, world_up, light_positions, 164 | light_intensities, image_width, image_height, vertex_specular_colors, 165 | 6.0, ambient_color, fov_y, near_clip, far_clip) 166 | tonemapped_broadcasted_renders = tf.concat( 167 | [ 168 | mesh_renderer.tone_mapper(broadcasted_renders[:, :, :, 0:3], 0.7), 169 | broadcasted_renders[:, :, :, 3:4] 170 | ], 171 | axis=3) 172 | 173 | with self.test_session() as sess: 174 | images, broadcasted_images = sess.run( 175 | [tonemapped_renders, tonemapped_broadcasted_renders], feed_dict={}) 176 | 177 | for image_id in range(images.shape[0]): 178 | target_image_name = 'Colored_Cube_%i.png' % image_id 179 | baseline_image_path = os.path.join(self.test_data_directory, 180 | target_image_name) 181 | test_utils.expect_image_file_and_render_are_near( 182 | self, sess, baseline_image_path, images[image_id, :, :, :]) 183 | test_utils.expect_image_file_and_render_are_near( 184 | self, sess, baseline_image_path, 185 | broadcasted_images[image_id, :, :, :]) 186 | 187 | def testFullRenderGradientComputation(self): 188 | """Verifies the Jacobian matrix for the entire renderer. 189 | 190 | This ensures correct gradients are propagated backwards through the entire 191 | process, not just through the rasterization kernel. Uses the simple cube 192 | forward pass. 
193 | """ 194 | image_height = 21 195 | image_width = 28 196 | 197 | # rotate the cube for the test: 198 | model_transforms = camera_utils.euler_matrices( 199 | [[-20.0, 0.0, 60.0], [45.0, 60.0, 0.0]])[:, :3, :3] 200 | 201 | vertices_world_space = tf.matmul( 202 | tf.stack([self.cube_vertices, self.cube_vertices]), 203 | model_transforms, 204 | transpose_b=True) 205 | 206 | normals_world_space = tf.matmul( 207 | tf.stack([self.cube_normals, self.cube_normals]), 208 | model_transforms, 209 | transpose_b=True) 210 | 211 | # camera position: 212 | eye = tf.constant([0.0, 0.0, 6.0], dtype=tf.float32) 213 | center = tf.constant([0.0, 0.0, 0.0], dtype=tf.float32) 214 | world_up = tf.constant([0.0, 1.0, 0.0], dtype=tf.float32) 215 | 216 | # Scene has a single light from the viewer's eye. 217 | light_positions = tf.expand_dims(tf.stack([eye, eye], axis=0), axis=1) 218 | light_intensities = tf.ones([2, 1, 3], dtype=tf.float32) 219 | 220 | vertex_diffuse_colors = tf.ones_like(vertices_world_space, dtype=tf.float32) 221 | 222 | rendered = mesh_renderer.mesh_renderer( 223 | vertices_world_space, self.cube_triangles, normals_world_space, 224 | vertex_diffuse_colors, eye, center, world_up, light_positions, 225 | light_intensities, image_width, image_height) 226 | 227 | with self.test_session(): 228 | theoretical, numerical = tf.test.compute_gradient( 229 | self.cube_vertices, (8, 3), 230 | rendered, (2, image_height, image_width, 4), 231 | x_init_value=self.cube_vertices.eval(), 232 | delta=1e-3) 233 | jacobians_match, message = ( 234 | test_utils.check_jacobians_are_nearly_equal( 235 | theoretical, numerical, 0.01, 0.01)) 236 | self.assertTrue(jacobians_match, message) 237 | 238 | def testThatCubeRotates(self): 239 | """Optimize a simple cube's rotation using pixel loss. 240 | 241 | The rotation is represented as static-basis euler angles. This test checks 242 | that the computed gradients are useful. 
243 | """ 244 | image_height = 480 245 | image_width = 640 246 | initial_euler_angles = [[0.0, 0.0, 0.0]] 247 | 248 | euler_angles = tf.Variable(initial_euler_angles) 249 | model_rotation = camera_utils.euler_matrices(euler_angles)[0, :3, :3] 250 | 251 | vertices_world_space = tf.reshape( 252 | tf.matmul(self.cube_vertices, model_rotation, transpose_b=True), 253 | [1, 8, 3]) 254 | 255 | normals_world_space = tf.reshape( 256 | tf.matmul(self.cube_normals, model_rotation, transpose_b=True), 257 | [1, 8, 3]) 258 | 259 | # camera position: 260 | eye = tf.constant([[0.0, 0.0, 6.0]], dtype=tf.float32) 261 | center = tf.constant([[0.0, 0.0, 0.0]], dtype=tf.float32) 262 | world_up = tf.constant([[0.0, 1.0, 0.0]], dtype=tf.float32) 263 | 264 | vertex_diffuse_colors = tf.ones_like(vertices_world_space, dtype=tf.float32) 265 | light_positions = tf.reshape(eye, [1, 1, 3]) 266 | light_intensities = tf.ones([1, 1, 3], dtype=tf.float32) 267 | 268 | render = mesh_renderer.mesh_renderer( 269 | vertices_world_space, self.cube_triangles, normals_world_space, 270 | vertex_diffuse_colors, eye, center, world_up, light_positions, 271 | light_intensities, image_width, image_height) 272 | render = tf.reshape(render, [image_height, image_width, 4]) 273 | 274 | # Pick the desired cube rotation for the test: 275 | test_model_rotation = camera_utils.euler_matrices([[-20.0, 0.0, 276 | 60.0]])[0, :3, :3] 277 | 278 | desired_vertex_positions = tf.reshape( 279 | tf.matmul(self.cube_vertices, test_model_rotation, transpose_b=True), 280 | [1, 8, 3]) 281 | desired_normals = tf.reshape( 282 | tf.matmul(self.cube_normals, test_model_rotation, transpose_b=True), 283 | [1, 8, 3]) 284 | desired_render = mesh_renderer.mesh_renderer( 285 | desired_vertex_positions, self.cube_triangles, desired_normals, 286 | vertex_diffuse_colors, eye, center, world_up, light_positions, 287 | light_intensities, image_width, image_height) 288 | desired_render = tf.reshape(desired_render, [image_height, image_width, 4]) 289 | 
290 | loss = tf.reduce_mean(tf.abs(render - desired_render)) 291 | optimizer = tf.train.MomentumOptimizer(0.7, 0.1) 292 | grad = tf.gradients(loss, [euler_angles]) 293 | grad, _ = tf.clip_by_global_norm(grad, 1.0) 294 | opt_func = optimizer.apply_gradients([(grad[0], euler_angles)]) 295 | 296 | with tf.Session() as sess: 297 | sess.run(tf.global_variables_initializer()) 298 | for _ in range(35): 299 | sess.run([loss, opt_func]) 300 | final_image, desired_image = sess.run([render, desired_render]) 301 | 302 | target_image_name = 'Gray_Cube_0.png' 303 | baseline_image_path = os.path.join(self.test_data_directory, 304 | target_image_name) 305 | test_utils.expect_image_file_and_render_are_near( 306 | self, sess, baseline_image_path, desired_image) 307 | test_utils.expect_image_file_and_render_are_near( 308 | self, 309 | sess, 310 | baseline_image_path, 311 | final_image, 312 | max_outlier_fraction=0.01, 313 | pixel_error_threshold=0.04) 314 | 315 | 316 | if __name__ == '__main__': 317 | tf.test.main() 318 | -------------------------------------------------------------------------------- /mesh_renderer/rasterize_triangles_test.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | 15 | from __future__ import absolute_import 16 | from __future__ import division 17 | from __future__ import print_function 18 | 19 | import os 20 | 21 | import numpy as np 22 | import tensorflow as tf 23 | 24 | import test_utils 25 | import camera_utils 26 | import rasterize_triangles 27 | 28 | 29 | class RenderTest(tf.test.TestCase): 30 | 31 | def setUp(self): 32 | self.test_data_directory = 'mesh_renderer/test_data/' 33 | 34 | tf.reset_default_graph() 35 | self.cube_vertex_positions = tf.constant( 36 | [[-1, -1, 1], [-1, -1, -1], [-1, 1, -1], [-1, 1, 1], [1, -1, 1], 37 | [1, -1, -1], [1, 1, -1], [1, 1, 1]], 38 | dtype=tf.float32) 39 | self.cube_triangles = tf.constant( 40 | [[0, 1, 2], [2, 3, 0], [3, 2, 6], [6, 7, 3], [7, 6, 5], [5, 4, 7], 41 | [4, 5, 1], [1, 0, 4], [5, 6, 2], [2, 1, 5], [7, 4, 0], [0, 3, 7]], 42 | dtype=tf.int32) 43 | 44 | tf_float = lambda x: tf.constant(x, dtype=tf.float32) 45 | # camera position: 46 | eye = tf_float([[2.0, 3.0, 6.0]]) 47 | center = tf_float([[0.0, 0.0, 0.0]]) 48 | world_up = tf_float([[0.0, 1.0, 0.0]]) 49 | 50 | self.image_width = 640 51 | self.image_height = 480 52 | 53 | look_at = camera_utils.look_at(eye, center, world_up) 54 | perspective = camera_utils.perspective( 55 | self.image_width / self.image_height, 56 | tf_float([40.0]), tf_float([0.01]), 57 | tf_float([10.0])) 58 | self.projection = tf.matmul(perspective, look_at) 59 | 60 | def runTriangleTest(self, w_vector, target_image_name): 61 | """Directly renders a rasterized triangle's barycentric coordinates. 62 | 63 | Tests only the kernel (rasterize_triangles_module). 64 | 65 | Args: 66 | w_vector: 3 element vector of w components to scale triangle vertices. 67 | target_image_name: image file name to compare result against. 
68 | """ 69 | clip_init = np.array( 70 | [[-0.5, -0.5, 0.8, 1.0], [0.0, 0.5, 0.3, 1.0], [0.5, -0.5, 0.3, 1.0]], 71 | dtype=np.float32) 72 | clip_init = clip_init * np.reshape( 73 | np.array(w_vector, dtype=np.float32), [3, 1]) 74 | 75 | clip_coordinates = tf.constant(clip_init) 76 | triangles = tf.constant([[0, 1, 2]], dtype=tf.int32) 77 | 78 | rendered_coordinates, _, _ = ( 79 | rasterize_triangles.rasterize_triangles_module.rasterize_triangles( 80 | clip_coordinates, triangles, self.image_width, self.image_height)) 81 | rendered_coordinates = tf.concat( 82 | [rendered_coordinates, 83 | tf.ones([self.image_height, self.image_width, 1])], axis=2) 84 | with self.test_session() as sess: 85 | image = rendered_coordinates.eval() 86 | baseline_image_path = os.path.join(self.test_data_directory, 87 | target_image_name) 88 | test_utils.expect_image_file_and_render_are_near( 89 | self, sess, baseline_image_path, image) 90 | 91 | def testRendersSimpleTriangle(self): 92 | self.runTriangleTest((1.0, 1.0, 1.0), 'Simple_Triangle.png') 93 | 94 | def testRendersPerspectiveCorrectTriangle(self): 95 | self.runTriangleTest((0.2, 0.5, 2.0), 'Perspective_Corrected_Triangle.png') 96 | 97 | def testRendersSimpleCube(self): 98 | """Renders a simple cube to test the kernel and python wrapper.""" 99 | vertex_rgb = (self.cube_vertex_positions * 0.5 + 0.5) 100 | vertex_rgba = tf.concat([vertex_rgb, tf.ones([8, 1])], axis=1) 101 | background_value = [0.0, 0.0, 0.0, 0.0] 102 | 103 | rendered = rasterize_triangles.rasterize( 104 | tf.expand_dims(self.cube_vertex_positions, axis=0), 105 | tf.expand_dims(vertex_rgba, axis=0), self.cube_triangles, 106 | self.projection, self.image_width, self.image_height, background_value) 107 | 108 | with self.test_session() as sess: 109 | image = rendered.eval()[0,...] 
110 | target_image_name = 'Unlit_Cube_0.png' 111 | baseline_image_path = os.path.join(self.test_data_directory, 112 | target_image_name) 113 | test_utils.expect_image_file_and_render_are_near( 114 | self, sess, baseline_image_path, image) 115 | 116 | def testSimpleTriangleGradientComputation(self): 117 | """Verifies the Jacobian matrix for a single pixel. 118 | 119 | The pixel is in the center of a triangle facing the camera. This makes it 120 | easy to check which entries of the Jacobian might not make sense without 121 | worrying about corner cases. 122 | """ 123 | test_pixel_x = 325 124 | test_pixel_y = 245 125 | 126 | clip_coordinates = tf.placeholder(tf.float32, shape=[3, 4]) 127 | 128 | triangles = tf.constant([[0, 1, 2]], dtype=tf.int32) 129 | 130 | barycentric_coordinates, _, _ = ( 131 | rasterize_triangles.rasterize_triangles_module.rasterize_triangles( 132 | clip_coordinates, triangles, self.image_width, self.image_height)) 133 | 134 | pixels_to_compare = barycentric_coordinates[ 135 | test_pixel_y:test_pixel_y + 1, test_pixel_x:test_pixel_x + 1, :] 136 | 137 | with self.test_session(): 138 | ndc_init = np.array( 139 | [[-0.5, -0.5, 0.8, 1.0], [0.0, 0.5, 0.3, 1.0], [0.5, -0.5, 0.3, 1.0]], 140 | dtype=np.float32) 141 | theoretical, numerical = tf.test.compute_gradient( 142 | clip_coordinates, (3, 4), 143 | pixels_to_compare, (1, 1, 3), 144 | x_init_value=ndc_init, 145 | delta=4e-2) 146 | jacobians_match, message = ( 147 | test_utils.check_jacobians_are_nearly_equal( 148 | theoretical, numerical, 0.01, 0.0, True)) 149 | self.assertTrue(jacobians_match, message) 150 | 151 | def testInternalRenderGradientComputation(self): 152 | """Isolates and verifies the Jacobian matrix for the custom kernel.""" 153 | image_height = 21 154 | image_width = 28 155 | 156 | clip_coordinates = tf.placeholder(tf.float32, shape=[8, 4]) 157 | 158 | barycentric_coordinates, _, _ = ( 159 | rasterize_triangles.rasterize_triangles_module.rasterize_triangles( 160 | clip_coordinates, 
self.cube_triangles, image_width, image_height)) 161 | 162 | with self.test_session(): 163 | # Precomputed transformation of the simple cube to normalized device 164 | # coordinates, in order to isolate the rasterization gradient. 165 | # pyformat: disable 166 | ndc_init = np.array( 167 | [[-0.43889722, -0.53184521, 0.85293502, 1.0], 168 | [-0.37635487, 0.22206162, 0.90555805, 1.0], 169 | [-0.22849123, 0.76811147, 0.80993629, 1.0], 170 | [-0.2805393, -0.14092168, 0.71602166, 1.0], 171 | [0.18631913, -0.62634289, 0.88603103, 1.0], 172 | [0.16183566, 0.08129397, 0.93020856, 1.0], 173 | [0.44147962, 0.53497446, 0.85076219, 1.0], 174 | [0.53008741, -0.31276882, 0.77620775, 1.0]], 175 | dtype=np.float32) 176 | # pyformat: enable 177 | theoretical, numerical = tf.test.compute_gradient( 178 | clip_coordinates, (8, 4), 179 | barycentric_coordinates, (image_height, image_width, 3), 180 | x_init_value=ndc_init, 181 | delta=4e-2) 182 | jacobians_match, message = ( 183 | test_utils.check_jacobians_are_nearly_equal( 184 | theoretical, numerical, 0.01, 0.01)) 185 | self.assertTrue(jacobians_match, message) 186 | 187 | 188 | if __name__ == '__main__': 189 | tf.test.main() 190 | -------------------------------------------------------------------------------- /mesh_renderer/test_data/BUILD: -------------------------------------------------------------------------------- 1 | package(default_visibility = ["//visibility:public"]) 2 | 3 | filegroup( 4 | name = "images", 5 | srcs = glob(["*.png"]), 6 | ) 7 | -------------------------------------------------------------------------------- /mesh_renderer/test_data/Barycentrics_Cube.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Barycentrics_Cube.png -------------------------------------------------------------------------------- 
/mesh_renderer/test_data/Colored_Cube_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Colored_Cube_0.png -------------------------------------------------------------------------------- /mesh_renderer/test_data/Colored_Cube_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Colored_Cube_1.png -------------------------------------------------------------------------------- /mesh_renderer/test_data/External_Triangle.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/External_Triangle.png -------------------------------------------------------------------------------- /mesh_renderer/test_data/Gray_Cube_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Gray_Cube_0.png -------------------------------------------------------------------------------- /mesh_renderer/test_data/Gray_Cube_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Gray_Cube_1.png -------------------------------------------------------------------------------- /mesh_renderer/test_data/Inside_Box.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Inside_Box.png -------------------------------------------------------------------------------- /mesh_renderer/test_data/Perspective_Corrected_Triangle.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Perspective_Corrected_Triangle.png -------------------------------------------------------------------------------- /mesh_renderer/test_data/Simple_Tetrahedron.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Simple_Tetrahedron.png -------------------------------------------------------------------------------- /mesh_renderer/test_data/Simple_Triangle.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Simple_Triangle.png -------------------------------------------------------------------------------- /mesh_renderer/test_data/Unlit_Cube_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Unlit_Cube_0.png -------------------------------------------------------------------------------- /mesh_renderer/test_utils.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 
5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | """Common functions for the rasterizer and mesh renderer tests.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | import os 22 | import numpy as np 23 | import tensorflow as tf 24 | 25 | 26 | def check_jacobians_are_nearly_equal(theoretical, 27 | numerical, 28 | outlier_relative_error_threshold, 29 | max_outlier_fraction, 30 | include_jacobians_in_error_message=False): 31 | """Compares two Jacobian matrices, allowing for some fraction of outliers. 32 | 33 | Args: 34 | theoretical: 2D numpy array containing a Jacobian matrix with entries 35 | computed via gradient functions. The layout should be as in the output 36 | of gradient_checker. 37 | numerical: 2D numpy array of the same shape as theoretical containing a 38 | Jacobian matrix with entries computed via finite difference 39 | approximations. The layout should be as in the output 40 | of gradient_checker. 41 | outlier_relative_error_threshold: float prescribing the maximum relative 42 | error (from the finite difference approximation) tolerated before 43 | an entry is considered an outlier. 44 | max_outlier_fraction: float defining the maximum fraction of entries in 45 | theoretical that may be outliers before the check returns False. 46 | include_jacobians_in_error_message: bool defining whether the Jacobian 47 | matrices should be included in the returned message.
48 | 49 | Returns: 50 | A tuple where the first entry is a boolean that is True when the outlier 51 | fraction does not exceed max_outlier_fraction, and where the second entry 52 | is a string containing the message to report if the check fails. 53 | """ 54 | outlier_gradients = np.abs( 55 | numerical - theoretical) / numerical > outlier_relative_error_threshold 56 | outlier_fraction = np.count_nonzero(outlier_gradients) / np.prod( 57 | numerical.shape[:2]) 58 | jacobians_match = outlier_fraction <= max_outlier_fraction 59 | 60 | message = ( 61 | ' %f of theoretical gradients are relative outliers, but the maximum' 62 | ' allowable fraction is %f ' % (outlier_fraction, max_outlier_fraction)) 63 | if include_jacobians_in_error_message: 64 | # the gradient_checker convention is the typical Jacobian transposed: 65 | message += ('\nNumerical Jacobian:\n%s\nTheoretical Jacobian:\n%s' % 66 | (repr(numerical.T), repr(theoretical.T))) 67 | return jacobians_match, message 68 | 69 | 70 | def expect_image_file_and_render_are_near(test_instance, 71 | sess, 72 | baseline_path, 73 | result_image, 74 | max_outlier_fraction=0.001, 75 | pixel_error_threshold=0.01): 76 | """Compares the output of mesh_renderer with an image on disk. 77 | 78 | The comparison is soft: the images are considered identical if at most 79 | max_outlier_fraction of the pixels differ by more than a relative error of 80 | pixel_error_threshold of the full color value. Note that before comparison, 81 | mesh renderer values are clipped to the range [0,1]. 82 | 83 | The comparison is performed inline with NumPy. 84 | 85 | Args: 86 | test_instance: a python unit test instance. 87 | sess: a TensorFlow session for decoding the png. 88 | baseline_path: path to the reference image on disk. 89 | result_image: the result image, as a numpy array. 90 | max_outlier_fraction: the maximum fraction of outlier pixels allowed. 91 | pixel_error_threshold: pixel values are considered to differ if their 92 | difference exceeds this amount.
Range is 0.0 - 1.0. 93 | """ 94 | baseline_bytes = open(baseline_path, 'rb').read() 95 | baseline_image = sess.run(tf.image.decode_png(baseline_bytes)) 96 | 97 | test_instance.assertEqual(baseline_image.shape, result_image.shape, 98 | 'Image shapes %s and %s do not match.' % 99 | (baseline_image.shape, result_image.shape)) 100 | 101 | result_image = np.clip(result_image, 0., 1.).copy(order='C') 102 | baseline_image = baseline_image.astype(float) / 255.0 103 | 104 | outlier_channels = (np.abs(baseline_image - result_image) > 105 | pixel_error_threshold) 106 | outlier_pixels = np.any(outlier_channels, axis=2) 107 | outlier_count = np.count_nonzero(outlier_pixels) 108 | outlier_fraction = outlier_count / np.prod(baseline_image.shape[:2]) 109 | images_match = outlier_fraction <= max_outlier_fraction 110 | 111 | outputs_dir = "/tmp" #os.environ["TEST_TMPDIR"] 112 | base_prefix = os.path.splitext(os.path.basename(baseline_path))[0] 113 | result_output_path = os.path.join(outputs_dir, base_prefix + "_result.png") 114 | 115 | message = ('%s does not match. (%f of pixels are outliers, %f is allowed.). ' 116 | 'Result image written to %s' % 117 | (baseline_path, outlier_fraction, max_outlier_fraction, result_output_path)) 118 | 119 | if not images_match: 120 | result_bytes = sess.run(tf.image.encode_png(result_image*255.0)) 121 | with open(result_output_path, 'wb') as output_file: 122 | output_file.write(result_bytes) 123 | 124 | test_instance.assertTrue(images_match, msg=message) 125 | -------------------------------------------------------------------------------- /rasterize_triangles.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 Google LLC 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 
5 | # You may obtain a copy of the License at 6 | # 7 | # https://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | """Differentiable triangle rasterizer.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | import os.path as osp 22 | import tensorflow as tf 23 | 24 | import camera_utils 25 | 26 | 27 | def get_ext_filename(ext_name): 28 | from distutils.sysconfig import get_config_var 29 | ext_path = ext_name.split('.') 30 | ext_suffix = get_config_var('EXT_SUFFIX') 31 | return osp.join(*ext_path) + ext_suffix 32 | 33 | 34 | rasterize_triangles_module_path = osp.join(osp.dirname(osp.realpath(__file__)), get_ext_filename('mesh_renderer_lib')) 35 | rasterize_triangles_module = tf.load_op_library(rasterize_triangles_module_path) 36 | 37 | 38 | def rasterize(world_space_vertices, attributes, triangles, camera_matrices, 39 | image_width, image_height, background_value): 40 | """Rasterizes a mesh and computes interpolated vertex attributes. 41 | 42 | Applies projection matrices and then calls rasterize_clip_space(). 43 | 44 | Args: 45 | world_space_vertices: 3-D float32 tensor of xyz positions with shape 46 | [batch_size, vertex_count, 3]. 47 | attributes: 3-D float32 tensor with shape [batch_size, vertex_count, 48 | attribute_count]. Each vertex attribute is interpolated across the 49 | triangle using barycentric interpolation. 50 | triangles: 2-D int32 tensor with shape [triangle_count, 3]. 
Each triplet 51 | should contain vertex indices describing a triangle such that the 52 | triangle's normal points toward the viewer if the forward order of the 53 | triplet defines a clockwise winding of the vertices. Gradients with 54 | respect to this tensor are not available. 55 | camera_matrices: 3-D float tensor with shape [batch_size, 4, 4] containing 56 | model-view-perspective projection matrices. 57 | image_width: int specifying desired output image width in pixels. 58 | image_height: int specifying desired output image height in pixels. 59 | background_value: a 1-D float32 tensor with shape [attribute_count]. Pixels 60 | that lie outside all triangles take this value. 61 | 62 | Returns: 63 | A 4-D float32 tensor with shape [batch_size, image_height, image_width, 64 | attribute_count], containing the interpolated vertex attributes at 65 | each pixel. 66 | 67 | Raises: 68 | ValueError: An invalid argument to the method is detected. 69 | """ 70 | clip_space_vertices = camera_utils.transform_homogeneous( 71 | camera_matrices, world_space_vertices) 72 | return rasterize_clip_space(clip_space_vertices, attributes, triangles, 73 | image_width, image_height, background_value) 74 | 75 | 76 | def rasterize_clip_space(clip_space_vertices, attributes, triangles, 77 | image_width, image_height, background_value): 78 | """Rasterizes the input mesh expressed in clip-space (xyzw) coordinates. 79 | 80 | Interpolates vertex attributes using perspective-correct interpolation and 81 | clips triangles that lie outside the viewing frustum. 82 | 83 | Args: 84 | clip_space_vertices: 3-D float32 tensor of homogeneous vertices (xyzw) with 85 | shape [batch_size, vertex_count, 4]. 86 | attributes: 3-D float32 tensor with shape [batch_size, vertex_count, 87 | attribute_count]. Each vertex attribute is interpolated across the 88 | triangle using barycentric interpolation. 89 | triangles: 2-D int32 tensor with shape [triangle_count, 3].
Each triplet 90 | should contain vertex indices describing a triangle such that the 91 | triangle's normal points toward the viewer if the forward order of the 92 | triplet defines a clockwise winding of the vertices. Gradients with 93 | respect to this tensor are not available. 94 | image_width: int specifying desired output image width in pixels. 95 | image_height: int specifying desired output image height in pixels. 96 | background_value: a 1-D float32 tensor with shape [attribute_count]. Pixels 97 | that lie outside all triangles take this value. 98 | 99 | Returns: 100 | A 4-D float32 tensor with shape [batch_size, image_height, image_width, 101 | attribute_count], containing the interpolated vertex attributes at 102 | each pixel. 103 | 104 | Raises: 105 | ValueError: An invalid argument to the method is detected. 106 | """ 107 | if not image_width > 0: 108 | raise ValueError('Image width must be > 0.') 109 | if not image_height > 0: 110 | raise ValueError('Image height must be > 0.') 111 | if len(clip_space_vertices.shape) != 3: 112 | raise ValueError('The vertex buffer must be 3D.') 113 | batch_size = clip_space_vertices.shape[0].value 114 | vertex_count = clip_space_vertices.shape[1].value 115 | 116 | per_image_barycentric_coordinates = [] 117 | per_image_vertex_ids = [] 118 | for im in range(clip_space_vertices.shape[0]): 119 | barycentric_coords, triangle_ids, _ = ( 120 | rasterize_triangles_module.rasterize_triangles( 121 | clip_space_vertices[im, :, :], triangles, image_width, 122 | image_height)) 123 | per_image_barycentric_coordinates.append( 124 | tf.reshape(barycentric_coords, [-1, 3])) 125 | 126 | # Gathers the vertex indices now because the indices don't contain a batch 127 | # identifier, and reindexes the vertex ids to point to a (batch,vertex_id) 128 | vertex_ids = tf.gather(triangles, tf.reshape(triangle_ids, [-1])) 129 | reindexed_ids = tf.add(vertex_ids, im * clip_space_vertices.shape[1].value) 130 | 
per_image_vertex_ids.append(reindexed_ids) 131 | 132 | barycentric_coordinates = tf.concat(per_image_barycentric_coordinates, axis=0) 133 | vertex_ids = tf.concat(per_image_vertex_ids, axis=0) 134 | 135 | # Indexes with each pixel's clip-space triangle's extrema (the pixel's 136 | # 'corner points') ids to get the relevant properties for deferred shading. 137 | flattened_vertex_attributes = tf.reshape(attributes, 138 | [batch_size * vertex_count, -1]) 139 | corner_attributes = tf.gather(flattened_vertex_attributes, vertex_ids) 140 | 141 | # Computes the pixel attributes by interpolating the known attributes at the 142 | # corner points of the triangle interpolated with the barycentric coordinates. 143 | weighted_vertex_attributes = tf.multiply( 144 | corner_attributes, tf.expand_dims(barycentric_coordinates, axis=2)) 145 | summed_attributes = tf.reduce_sum(weighted_vertex_attributes, axis=1) 146 | attribute_images = tf.reshape(summed_attributes, 147 | [batch_size, image_height, image_width, -1]) 148 | 149 | # Barycentric coordinates should approximately sum to one where there is 150 | # rendered geometry, but be exactly zero where there is not. 151 | alphas = tf.clip_by_value( 152 | tf.reduce_sum(2.0 * barycentric_coordinates, axis=1), 0.0, 1.0) 153 | alphas = tf.reshape(alphas, [batch_size, image_height, image_width, 1]) 154 | 155 | attributes_with_background = ( 156 | alphas * attribute_images + (1.0 - alphas) * background_value) 157 | 158 | return attributes_with_background 159 | 160 | 161 | @tf.RegisterGradient('RasterizeTriangles') 162 | def _rasterize_triangles_grad(op, df_dbarys, df_dids, df_dz): 163 | # Gradients are only supported for barycentric coordinates. Gradients for the 164 | # z-buffer are not currently implemented. If you need gradients w.r.t. z, 165 | # include z as a vertex attribute when calling rasterize_triangles. 
166 | del df_dids, df_dz 167 | return rasterize_triangles_module.rasterize_triangles_grad( 168 | op.inputs[0], op.inputs[1], op.outputs[0], op.outputs[1], df_dbarys, 169 | op.get_attr('image_width'), op.get_attr('image_height')), None 170 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import os.path as osp 2 | import sys 3 | from distutils.core import Extension, setup 4 | from distutils.command.build_ext import build_ext 5 | import subprocess 6 | from distutils.version import StrictVersion 7 | 8 | SCRIPT_DIR = osp.dirname(osp.realpath(__file__)) 9 | 10 | extra_compile_args = ['-std=c++11'] 11 | extra_link_args = [] 12 | 13 | 14 | if sys.platform == 'darwin': 15 | extra_compile_args.append('-stdlib=libc++') 16 | extra_link_args.append('-stdlib=libc++') 17 | 18 | include_dirs = ['/usr/local/include'] 19 | library_dirs = ['/usr/local/lib'] 20 | 21 | mesh_renderer_lib = Extension( 22 | name='mesh_renderer_lib', 23 | sources=['mesh_renderer/kernels/rasterize_triangles_op.cc', 24 | 'mesh_renderer/kernels/rasterize_triangles_impl.cc', 25 | 'mesh_renderer/kernels/rasterize_triangles_grad.cc'], 26 | include_dirs=include_dirs, 27 | library_dirs=library_dirs, 28 | libraries=[], 29 | language='c++', 30 | extra_compile_args=extra_compile_args, 31 | extra_link_args=extra_link_args, 32 | ) 33 | 34 | 35 | def get_version(): 36 | with open('VERSION', 'r') as fp: 37 | version = fp.readline() 38 | return version.strip() 39 | 40 | 41 | REMOVE_FLAGS = {'-Wstrict-prototypes'} 42 | 43 | 44 | class mesh_renderer_build_ext(build_ext): 45 | def get_libraries(self, ext): 46 | libs = super(mesh_renderer_build_ext, self).get_libraries(ext) 47 | last_lib = [libs[-1], "python{}".format(sys.version_info.major)] 48 | actual_python_libname = None 49 | for ll in last_lib: 50 | if actual_python_libname is not None: 51 | break 52 | for path in self.library_dirs: 53 | if 
osp.exists(osp.join(path, "lib" + ll + ".so")): 54 | actual_python_libname = ll 55 | break 56 | if actual_python_libname is None: 57 | actual_python_libname = libs[-1] 58 | return libs[:-1] + [actual_python_libname] 59 | 60 | def build_extensions(self): 61 | extension = self.extensions[0] 62 | assert extension.name == 'mesh_renderer_lib' 63 | 64 | import tensorflow as tf 65 | # Read more at https://www.tensorflow.org/versions/r1.1/extend/adding_an_op 66 | 67 | # seems like extra chars like "-rc1" are at the end of string, right after "patch" number so this should work 68 | tf_version_major, tf_version_minor = map(int, tf.__version__.split(".")[:2]) 69 | assert tf_version_major == 1 70 | 71 | if "__cxx11_abi_flag__" in tf.__dict__: 72 | abi = tf.__cxx11_abi_flag__ 73 | else: 74 | tf_path = tf.__path__[0] 75 | if tf_version_minor >= 4: 76 | tf_so = '{}/libtensorflow_framework.so'.format(tf_path) 77 | else: 78 | tf_so = '{}/python/_pywrap_tensorflow_internal.so'.format(tf_path) 79 | 80 | gcc_from_tf = subprocess.check_output("strings " + tf_so + " | grep GCC | grep ubuntu | uniq || true", shell=True).decode("utf-8").strip() 81 | clang_from_tf = subprocess.check_output("strings " + tf_so + " | grep clang || true", shell=True).decode("utf-8").strip() 82 | 83 | if len(gcc_from_tf) > 5: 84 | assert len(gcc_from_tf) > 5, "Cannot extract tensorflow gcc version." 
85 | gcc_version = gcc_from_tf.split("-")[0].split(" ")[-1] 86 | print("Detected gcc_version: %s" % gcc_version) 87 | if StrictVersion(gcc_version) < StrictVersion("5.0"): 88 | abi = 0 89 | else: 90 | abi = 1 91 | elif len(clang_from_tf) > 0: 92 | abi = 1 93 | else: 94 | raise ValueError("not using clang or gcc, not sure how to set D_GLIBCXX_USE_CXX11_ABI.") 95 | extension.extra_compile_args.append('-D_GLIBCXX_USE_CXX11_ABI=%s' % abi) 96 | 97 | extension.include_dirs.append(tf.sysconfig.get_include()) 98 | if tf_version_minor >= 4: 99 | extension.extra_link_args.append('-ltensorflow_framework') 100 | extension.include_dirs.append(tf.sysconfig.get_include() + '/external/nsync/public') 101 | extension.library_dirs.append(tf.sysconfig.get_lib()) 102 | 103 | self.compiler.compiler_so = list(filter(lambda flag: flag not in REMOVE_FLAGS, self.compiler.compiler_so)) 104 | super(mesh_renderer_build_ext, self).build_extensions() 105 | 106 | 107 | setup( 108 | name='mesh_renderer', 109 | version=get_version(), 110 | cmdclass={'build_ext': mesh_renderer_build_ext}, 111 | py_modules=['mesh_renderer', 'camera_utils', 'rasterize_triangles'], 112 | ext_modules=[mesh_renderer_lib], 113 | description='TF rendering', 114 | author='Jonathan Raiman', 115 | author_email='raiman@openai.com', 116 | install_requires=[], 117 | ) 118 | -------------------------------------------------------------------------------- /third_party/lodepng.h: -------------------------------------------------------------------------------- 1 | /* 2 | LodePNG version 20170917 3 | 4 | Copyright (c) 2005-2017 Lode Vandevenne 5 | 6 | This software is provided 'as-is', without any express or implied 7 | warranty. In no event will the authors be held liable for any damages 8 | arising from the use of this software. 
9 | 10 | Permission is granted to anyone to use this software for any purpose, 11 | including commercial applications, and to alter it and redistribute it 12 | freely, subject to the following restrictions: 13 | 14 | 1. The origin of this software must not be misrepresented; you must not 15 | claim that you wrote the original software. If you use this software 16 | in a product, an acknowledgment in the product documentation would be 17 | appreciated but is not required. 18 | 19 | 2. Altered source versions must be plainly marked as such, and must not be 20 | misrepresented as being the original software. 21 | 22 | 3. This notice may not be removed or altered from any source 23 | distribution. 24 | */ 25 | 26 | #ifndef LODEPNG_H 27 | #define LODEPNG_H 28 | 29 | #include <string.h> /*for size_t*/ 30 | 31 | extern const char* LODEPNG_VERSION_STRING; 32 | 33 | /* 34 | The following #defines are used to create code sections. They can be disabled 35 | to disable code sections, which can give faster compile time and smaller binary. 36 | The "NO_COMPILE" defines are designed to be used to pass as defines to the 37 | compiler command to disable them without modifying this header, e.g. 38 | -DLODEPNG_NO_COMPILE_ZLIB for gcc. 39 | In addition to those below, you can also define LODEPNG_NO_COMPILE_CRC to 40 | allow implementing a custom lodepng_crc32. 41 | */ 42 | /*deflate & zlib.
If disabled, you must specify alternative zlib functions in 43 | the custom_zlib field of the compress and decompress settings*/ 44 | #ifndef LODEPNG_NO_COMPILE_ZLIB 45 | #define LODEPNG_COMPILE_ZLIB 46 | #endif 47 | /*png encoder and png decoder*/ 48 | #ifndef LODEPNG_NO_COMPILE_PNG 49 | #define LODEPNG_COMPILE_PNG 50 | #endif 51 | /*deflate&zlib decoder and png decoder*/ 52 | #ifndef LODEPNG_NO_COMPILE_DECODER 53 | #define LODEPNG_COMPILE_DECODER 54 | #endif 55 | /*deflate&zlib encoder and png encoder*/ 56 | #ifndef LODEPNG_NO_COMPILE_ENCODER 57 | #define LODEPNG_COMPILE_ENCODER 58 | #endif 59 | /*the optional built in harddisk file loading and saving functions*/ 60 | #ifndef LODEPNG_NO_COMPILE_DISK 61 | #define LODEPNG_COMPILE_DISK 62 | #endif 63 | /*support for chunks other than IHDR, IDAT, PLTE, tRNS, IEND: ancillary and unknown chunks*/ 64 | #ifndef LODEPNG_NO_COMPILE_ANCILLARY_CHUNKS 65 | #define LODEPNG_COMPILE_ANCILLARY_CHUNKS 66 | #endif 67 | /*ability to convert error numerical codes to English text string*/ 68 | #ifndef LODEPNG_NO_COMPILE_ERROR_TEXT 69 | #define LODEPNG_COMPILE_ERROR_TEXT 70 | #endif 71 | /*Compile the default allocators (C's free, malloc and realloc). 
If you disable this, 72 | you can define the functions lodepng_free, lodepng_malloc and lodepng_realloc in your 73 | source files with custom allocators.*/ 74 | #ifndef LODEPNG_NO_COMPILE_ALLOCATORS 75 | #define LODEPNG_COMPILE_ALLOCATORS 76 | #endif 77 | /*compile the C++ version (you can disable the C++ wrapper here even when compiling for C++)*/ 78 | #ifdef __cplusplus 79 | #ifndef LODEPNG_NO_COMPILE_CPP 80 | #define LODEPNG_COMPILE_CPP 81 | #endif 82 | #endif 83 | 84 | #ifdef LODEPNG_COMPILE_CPP 85 | #include <vector> 86 | #include <string> 87 | #endif /*LODEPNG_COMPILE_CPP*/ 88 | 89 | #ifdef LODEPNG_COMPILE_PNG 90 | /*The PNG color types (also used for raw).*/ 91 | typedef enum LodePNGColorType 92 | { 93 | LCT_GREY = 0, /*greyscale: 1,2,4,8,16 bit*/ 94 | LCT_RGB = 2, /*RGB: 8,16 bit*/ 95 | LCT_PALETTE = 3, /*palette: 1,2,4,8 bit*/ 96 | LCT_GREY_ALPHA = 4, /*greyscale with alpha: 8,16 bit*/ 97 | LCT_RGBA = 6 /*RGB with alpha: 8,16 bit*/ 98 | } LodePNGColorType; 99 | 100 | #ifdef LODEPNG_COMPILE_DECODER 101 | /* 102 | Converts PNG data in memory to raw pixel data. 103 | out: Output parameter. Pointer to buffer that will contain the raw pixel data. 104 | After decoding, its size is w * h * (bytes per pixel) bytes larger than 105 | initially. Bytes per pixel depends on colortype and bitdepth. 106 | Must be freed after usage with free(*out). 107 | Note: for 16-bit per channel colors, uses big endian format like PNG does. 108 | w: Output parameter. Pointer to width of pixel data. 109 | h: Output parameter. Pointer to height of pixel data. 110 | in: Memory buffer with the PNG file. 111 | insize: size of the in buffer. 112 | colortype: the desired color type for the raw output image. See explanation on PNG color types. 113 | bitdepth: the desired bit depth for the raw output image. See explanation on PNG color types. 114 | Return value: LodePNG error code (0 means no error).
115 | */ 116 | unsigned lodepng_decode_memory(unsigned char** out, unsigned* w, unsigned* h, 117 | const unsigned char* in, size_t insize, 118 | LodePNGColorType colortype, unsigned bitdepth); 119 | 120 | /*Same as lodepng_decode_memory, but always decodes to 32-bit RGBA raw image*/ 121 | unsigned lodepng_decode32(unsigned char** out, unsigned* w, unsigned* h, 122 | const unsigned char* in, size_t insize); 123 | 124 | /*Same as lodepng_decode_memory, but always decodes to 24-bit RGB raw image*/ 125 | unsigned lodepng_decode24(unsigned char** out, unsigned* w, unsigned* h, 126 | const unsigned char* in, size_t insize); 127 | 128 | #ifdef LODEPNG_COMPILE_DISK 129 | /* 130 | Load PNG from disk, from file with given name. 131 | Same as the other decode functions, but instead takes a filename as input. 132 | */ 133 | unsigned lodepng_decode_file(unsigned char** out, unsigned* w, unsigned* h, 134 | const char* filename, 135 | LodePNGColorType colortype, unsigned bitdepth); 136 | 137 | /*Same as lodepng_decode_file, but always decodes to 32-bit RGBA raw image.*/ 138 | unsigned lodepng_decode32_file(unsigned char** out, unsigned* w, unsigned* h, 139 | const char* filename); 140 | 141 | /*Same as lodepng_decode_file, but always decodes to 24-bit RGB raw image.*/ 142 | unsigned lodepng_decode24_file(unsigned char** out, unsigned* w, unsigned* h, 143 | const char* filename); 144 | #endif /*LODEPNG_COMPILE_DISK*/ 145 | #endif /*LODEPNG_COMPILE_DECODER*/ 146 | 147 | 148 | #ifdef LODEPNG_COMPILE_ENCODER 149 | /* 150 | Converts raw pixel data into a PNG image in memory. The colortype and bitdepth 151 | of the output PNG image cannot be chosen, they are automatically determined 152 | by the colortype, bitdepth and content of the input pixel data. 153 | Note: for 16-bit per channel colors, needs big endian format like PNG does. 154 | out: Output parameter. Pointer to buffer that will contain the PNG image data. 155 | Must be freed after usage with free(*out). 
156 | outsize: Output parameter. Pointer to the size in bytes of the out buffer. 157 | image: The raw pixel data to encode. The size of this buffer should be 158 | w * h * (bytes per pixel), bytes per pixel depends on colortype and bitdepth. 159 | w: width of the raw pixel data in pixels. 160 | h: height of the raw pixel data in pixels. 161 | colortype: the color type of the raw input image. See explanation on PNG color types. 162 | bitdepth: the bit depth of the raw input image. See explanation on PNG color types. 163 | Return value: LodePNG error code (0 means no error). 164 | */ 165 | unsigned lodepng_encode_memory(unsigned char** out, size_t* outsize, 166 | const unsigned char* image, unsigned w, unsigned h, 167 | LodePNGColorType colortype, unsigned bitdepth); 168 | 169 | /*Same as lodepng_encode_memory, but always encodes from 32-bit RGBA raw image.*/ 170 | unsigned lodepng_encode32(unsigned char** out, size_t* outsize, 171 | const unsigned char* image, unsigned w, unsigned h); 172 | 173 | /*Same as lodepng_encode_memory, but always encodes from 24-bit RGB raw image.*/ 174 | unsigned lodepng_encode24(unsigned char** out, size_t* outsize, 175 | const unsigned char* image, unsigned w, unsigned h); 176 | 177 | #ifdef LODEPNG_COMPILE_DISK 178 | /* 179 | Converts raw pixel data into a PNG file on disk. 180 | Same as the other encode functions, but instead takes a filename as output. 181 | NOTE: This overwrites existing files without warning! 
182 | */ 183 | unsigned lodepng_encode_file(const char* filename, 184 | const unsigned char* image, unsigned w, unsigned h, 185 | LodePNGColorType colortype, unsigned bitdepth); 186 | 187 | /*Same as lodepng_encode_file, but always encodes from 32-bit RGBA raw image.*/ 188 | unsigned lodepng_encode32_file(const char* filename, 189 | const unsigned char* image, unsigned w, unsigned h); 190 | 191 | /*Same as lodepng_encode_file, but always encodes from 24-bit RGB raw image.*/ 192 | unsigned lodepng_encode24_file(const char* filename, 193 | const unsigned char* image, unsigned w, unsigned h); 194 | #endif /*LODEPNG_COMPILE_DISK*/ 195 | #endif /*LODEPNG_COMPILE_ENCODER*/ 196 | 197 | 198 | #ifdef LODEPNG_COMPILE_CPP 199 | namespace lodepng 200 | { 201 | #ifdef LODEPNG_COMPILE_DECODER 202 | /*Same as lodepng_decode_memory, but decodes to an std::vector. The colortype 203 | is the format to output the pixels to. Default is RGBA 8-bit per channel.*/ 204 | unsigned decode(std::vector<unsigned char>& out, unsigned& w, unsigned& h, 205 | const unsigned char* in, size_t insize, 206 | LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8); 207 | unsigned decode(std::vector<unsigned char>& out, unsigned& w, unsigned& h, 208 | const std::vector<unsigned char>& in, 209 | LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8); 210 | #ifdef LODEPNG_COMPILE_DISK 211 | /* 212 | Converts PNG file from disk to raw pixel data in memory. 213 | Same as the other decode functions, but instead takes a filename as input. 214 | */ 215 | unsigned decode(std::vector<unsigned char>& out, unsigned& w, unsigned& h, 216 | const std::string& filename, 217 | LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8); 218 | #endif /* LODEPNG_COMPILE_DISK */ 219 | #endif /* LODEPNG_COMPILE_DECODER */ 220 | 221 | #ifdef LODEPNG_COMPILE_ENCODER 222 | /*Same as lodepng_encode_memory, but encodes to an std::vector. colortype 223 | is that of the raw input data.
The output PNG color type will be auto chosen.*/ 224 | unsigned encode(std::vector<unsigned char>& out, 225 | const unsigned char* in, unsigned w, unsigned h, 226 | LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8); 227 | unsigned encode(std::vector<unsigned char>& out, 228 | const std::vector<unsigned char>& in, unsigned w, unsigned h, 229 | LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8); 230 | #ifdef LODEPNG_COMPILE_DISK 231 | /* 232 | Converts 32-bit RGBA raw pixel data into a PNG file on disk. 233 | Same as the other encode functions, but instead takes a filename as output. 234 | NOTE: This overwrites existing files without warning! 235 | */ 236 | unsigned encode(const std::string& filename, 237 | const unsigned char* in, unsigned w, unsigned h, 238 | LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8); 239 | unsigned encode(const std::string& filename, 240 | const std::vector<unsigned char>& in, unsigned w, unsigned h, 241 | LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8); 242 | #endif /* LODEPNG_COMPILE_DISK */ 243 | #endif /* LODEPNG_COMPILE_ENCODER */ 244 | } /* namespace lodepng */ 245 | #endif /*LODEPNG_COMPILE_CPP*/ 246 | #endif /*LODEPNG_COMPILE_PNG*/ 247 | 248 | #ifdef LODEPNG_COMPILE_ERROR_TEXT 249 | /*Returns an English description of the numerical error code.*/ 250 | const char* lodepng_error_text(unsigned code); 251 | #endif /*LODEPNG_COMPILE_ERROR_TEXT*/ 252 | 253 | #ifdef LODEPNG_COMPILE_DECODER 254 | /*Settings for zlib decompression*/ 255 | typedef struct LodePNGDecompressSettings LodePNGDecompressSettings; 256 | struct LodePNGDecompressSettings 257 | { 258 | unsigned ignore_adler32; /*if 1, continue and don't give an error message if the Adler32 checksum is corrupted*/ 259 | 260 | /*use custom zlib decoder instead of built in one (default: null)*/ 261 | unsigned (*custom_zlib)(unsigned char**, size_t*, 262 | const unsigned char*, size_t, 263 | const LodePNGDecompressSettings*); 264 | /*use custom deflate decoder instead of built in one (default: null)
265 | if custom_zlib is used, custom_inflate is ignored since only the built in 266 | zlib function will call custom_inflate*/ 267 | unsigned (*custom_inflate)(unsigned char**, size_t*, 268 | const unsigned char*, size_t, 269 | const LodePNGDecompressSettings*); 270 | 271 | const void* custom_context; /*optional custom settings for custom functions*/ 272 | }; 273 | 274 | extern const LodePNGDecompressSettings lodepng_default_decompress_settings; 275 | void lodepng_decompress_settings_init(LodePNGDecompressSettings* settings); 276 | #endif /*LODEPNG_COMPILE_DECODER*/ 277 | 278 | #ifdef LODEPNG_COMPILE_ENCODER 279 | /* 280 | Settings for zlib compression. Tweaking these settings tweaks the balance 281 | between speed and compression ratio. 282 | */ 283 | typedef struct LodePNGCompressSettings LodePNGCompressSettings; 284 | struct LodePNGCompressSettings /*deflate = compress*/ 285 | { 286 | /*LZ77 related settings*/ 287 | unsigned btype; /*the block type for LZ (0, 1, 2 or 3, see zlib standard). Should be 2 for proper compression.*/ 288 | unsigned use_lz77; /*whether or not to use LZ77. Should be 1 for proper compression.*/ 289 | unsigned windowsize; /*must be a power of two <= 32768. higher compresses more but is slower. Default value: 2048.*/ 290 | unsigned minmatch; /*minimum lz77 length. 3 is normally best, 6 can be better for some PNGs. Default: 0*/ 291 | unsigned nicematch; /*stop searching if >= this length found. Set to 258 for best compression. Default: 128*/ 292 | unsigned lazymatching; /*use lazy matching: better compression but a bit slower.
Default: true*/ 293 | 294 | /*use custom zlib encoder instead of built in one (default: null)*/ 295 | unsigned (*custom_zlib)(unsigned char**, size_t*, 296 | const unsigned char*, size_t, 297 | const LodePNGCompressSettings*); 298 | /*use custom deflate encoder instead of built in one (default: null) 299 | if custom_zlib is used, custom_deflate is ignored since only the built in 300 | zlib function will call custom_deflate*/ 301 | unsigned (*custom_deflate)(unsigned char**, size_t*, 302 | const unsigned char*, size_t, 303 | const LodePNGCompressSettings*); 304 | 305 | const void* custom_context; /*optional custom settings for custom functions*/ 306 | }; 307 | 308 | extern const LodePNGCompressSettings lodepng_default_compress_settings; 309 | void lodepng_compress_settings_init(LodePNGCompressSettings* settings); 310 | #endif /*LODEPNG_COMPILE_ENCODER*/ 311 | 312 | #ifdef LODEPNG_COMPILE_PNG 313 | /* 314 | Color mode of an image. Contains all information required to decode the pixel 315 | bits to RGBA colors. This information is the same as used in the PNG file 316 | format, and is used both for PNG and raw image data in LodePNG. 317 | */ 318 | typedef struct LodePNGColorMode 319 | { 320 | /*header (IHDR)*/ 321 | LodePNGColorType colortype; /*color type, see PNG standard or documentation further in this header file*/ 322 | unsigned bitdepth; /*bits per sample, see PNG standard or documentation further in this header file*/ 323 | 324 | /* 325 | palette (PLTE and tRNS) 326 | 327 | Dynamically allocated with the colors of the palette, including alpha. 328 | When encoding a PNG, to store your colors in the palette of the LodePNGColorMode, first use 329 | lodepng_palette_clear, then for each color use lodepng_palette_add. 330 | If you encode an image without alpha with palette, don't forget to put value 255 in each A byte of the palette. 
331 | 332 | When decoding, by default you can ignore this palette, since LodePNG already 333 | fills the palette colors in the pixels of the raw RGBA output. 334 | 335 | The palette is only supported for color type 3. 336 | */ 337 | unsigned char* palette; /*palette in RGBARGBA... order. When allocated, must be either 0, or have size 1024*/ 338 | size_t palettesize; /*palette size in number of colors (amount of bytes is 4 * palettesize)*/ 339 | 340 | /* 341 | transparent color key (tRNS) 342 | 343 | This color uses the same bit depth as the bitdepth value in this struct, which can be 1-bit to 16-bit. 344 | For greyscale PNGs, r, g and b will all 3 be set to the same. 345 | 346 | When decoding, by default you can ignore this information, since LodePNG sets 347 | pixels with this key to transparent already in the raw RGBA output. 348 | 349 | The color key is only supported for color types 0 and 2. 350 | */ 351 | unsigned key_defined; /*is a transparent color key given? 0 = false, 1 = true*/ 352 | unsigned key_r; /*red/greyscale component of color key*/ 353 | unsigned key_g; /*green component of color key*/ 354 | unsigned key_b; /*blue component of color key*/ 355 | } LodePNGColorMode; 356 | 357 | /*init, cleanup and copy functions to use with this struct*/ 358 | void lodepng_color_mode_init(LodePNGColorMode* info); 359 | void lodepng_color_mode_cleanup(LodePNGColorMode* info); 360 | /*return value is error code (0 means no error)*/ 361 | unsigned lodepng_color_mode_copy(LodePNGColorMode* dest, const LodePNGColorMode* source); 362 | 363 | void lodepng_palette_clear(LodePNGColorMode* info); 364 | /*add 1 color to the palette*/ 365 | unsigned lodepng_palette_add(LodePNGColorMode* info, 366 | unsigned char r, unsigned char g, unsigned char b, unsigned char a); 367 | 368 | /*get the total amount of bits per pixel, based on colortype and bitdepth in the struct*/ 369 | unsigned lodepng_get_bpp(const LodePNGColorMode* info); 370 | /*get the amount of color channels used, 
based on colortype in the struct. 371 | If a palette is used, it counts as 1 channel.*/ 372 | unsigned lodepng_get_channels(const LodePNGColorMode* info); 373 | /*is it a greyscale type? (only colortype 0 or 4)*/ 374 | unsigned lodepng_is_greyscale_type(const LodePNGColorMode* info); 375 | /*has it got an alpha channel? (only colortype 4 or 6)*/ 376 | unsigned lodepng_is_alpha_type(const LodePNGColorMode* info); 377 | /*has it got a palette? (only colortype 3)*/ 378 | unsigned lodepng_is_palette_type(const LodePNGColorMode* info); 379 | /*only returns true if there is a palette and there is a value in the palette with alpha < 255. 380 | Loops through the palette to check this.*/ 381 | unsigned lodepng_has_palette_alpha(const LodePNGColorMode* info); 382 | /* 383 | Check if the given color info indicates the possibility of having non-opaque pixels in the PNG image. 384 | Returns true if the image can have translucent or invisible pixels (it can still be opaque if it doesn't use such pixels). 385 | Returns false if the image can only have opaque pixels. 386 | In detail, it returns true only if it's a color type with alpha, or has a palette with non-opaque values, 387 | or if "key_defined" is true.
388 | */ 389 | unsigned lodepng_can_have_alpha(const LodePNGColorMode* info); 390 | /*Returns the byte size of a raw image buffer with given width, height and color mode*/ 391 | size_t lodepng_get_raw_size(unsigned w, unsigned h, const LodePNGColorMode* color); 392 | 393 | #ifdef LODEPNG_COMPILE_ANCILLARY_CHUNKS 394 | /*The information of a Time chunk in PNG.*/ 395 | typedef struct LodePNGTime 396 | { 397 | unsigned year; /*2 bytes used (0-65535)*/ 398 | unsigned month; /*1-12*/ 399 | unsigned day; /*1-31*/ 400 | unsigned hour; /*0-23*/ 401 | unsigned minute; /*0-59*/ 402 | unsigned second; /*0-60 (to allow for leap seconds)*/ 403 | } LodePNGTime; 404 | #endif /*LODEPNG_COMPILE_ANCILLARY_CHUNKS*/ 405 | 406 | /*Information about the PNG image, except pixels, width and height.*/ 407 | typedef struct LodePNGInfo 408 | { 409 | /*header (IHDR), palette (PLTE) and transparency (tRNS) chunks*/ 410 | unsigned compression_method;/*compression method of the original file. Always 0.*/ 411 | unsigned filter_method; /*filter method of the original file*/ 412 | unsigned interlace_method; /*interlace method of the original file*/ 413 | LodePNGColorMode color; /*color type and bits, palette and transparency of the PNG file*/ 414 | 415 | #ifdef LODEPNG_COMPILE_ANCILLARY_CHUNKS 416 | /* 417 | suggested background color chunk (bKGD) 418 | This color uses the same color mode as the PNG (except alpha channel), which can be 1-bit to 16-bit. 419 | 420 | For greyscale PNGs, r, g and b will all 3 be set to the same. When encoding 421 | the encoder writes the red one. For palette PNGs: When decoding, the RGB value 422 | will be stored, not a palette index. But when encoding, specify the index of 423 | the palette in background_r, the other two are then ignored. 424 | 425 | The decoder does not use this background color to edit the color of pixels. 
426 | */ 427 | unsigned background_defined; /*is a suggested background color given?*/ 428 | unsigned background_r; /*red component of suggested background color*/ 429 | unsigned background_g; /*green component of suggested background color*/ 430 | unsigned background_b; /*blue component of suggested background color*/ 431 | 432 | /* 433 | non-international text chunks (tEXt and zTXt) 434 | 435 | The char** arrays each contain num strings. The actual messages are in 436 | text_strings, while text_keys are keywords that give a short description what 437 | the actual text represents, e.g. Title, Author, Description, or anything else. 438 | 439 | A keyword is minimum 1 character and maximum 79 characters long. It's 440 | discouraged to use a single line length longer than 79 characters for texts. 441 | 442 | Don't allocate these text buffers yourself. Use the init/cleanup functions 443 | correctly and use lodepng_add_text and lodepng_clear_text. 444 | */ 445 | size_t text_num; /*the amount of texts in these char** buffers (there may be more texts in itext)*/ 446 | char** text_keys; /*the keyword of a text chunk (e.g. "Comment")*/ 447 | char** text_strings; /*the actual text*/ 448 | 449 | /* 450 | international text chunks (iTXt) 451 | Similar to the non-international text chunks, but with additional strings 452 | "langtags" and "transkeys". 453 | */ 454 | size_t itext_num; /*the amount of international texts in this PNG*/ 455 | char** itext_keys; /*the English keyword of the text chunk (e.g. "Comment")*/ 456 | char** itext_langtags; /*language tag for this text's language, ISO/IEC 646 string, e.g. 
ISO 639 language tag*/ 457 | char** itext_transkeys; /*keyword translated to the international language - UTF-8 string*/ 458 | char** itext_strings; /*the actual international text - UTF-8 string*/ 459 | 460 | /*time chunk (tIME)*/ 461 | unsigned time_defined; /*set to 1 to make the encoder generate a tIME chunk*/ 462 | LodePNGTime time; 463 | 464 | /*phys chunk (pHYs)*/ 465 | unsigned phys_defined; /*if 0, there is no pHYs chunk and the values below are undefined, if 1 else there is one*/ 466 | unsigned phys_x; /*pixels per unit in x direction*/ 467 | unsigned phys_y; /*pixels per unit in y direction*/ 468 | unsigned phys_unit; /*may be 0 (unknown unit) or 1 (metre)*/ 469 | 470 | /* 471 | unknown chunks 472 | There are 3 buffers, one for each position in the PNG where unknown chunks can appear 473 | each buffer contains all unknown chunks for that position consecutively 474 | The 3 buffers are the unknown chunks between certain critical chunks: 475 | 0: IHDR-PLTE, 1: PLTE-IDAT, 2: IDAT-IEND 476 | Do not allocate or traverse this data yourself. Use the chunk traversing functions declared 477 | later, such as lodepng_chunk_next and lodepng_chunk_append, to read/write this struct. 
478 | */ 479 | unsigned char* unknown_chunks_data[3]; 480 | size_t unknown_chunks_size[3]; /*size in bytes of the unknown chunks, given for protection*/ 481 | #endif /*LODEPNG_COMPILE_ANCILLARY_CHUNKS*/ 482 | } LodePNGInfo; 483 | 484 | /*init, cleanup and copy functions to use with this struct*/ 485 | void lodepng_info_init(LodePNGInfo* info); 486 | void lodepng_info_cleanup(LodePNGInfo* info); 487 | /*return value is error code (0 means no error)*/ 488 | unsigned lodepng_info_copy(LodePNGInfo* dest, const LodePNGInfo* source); 489 | 490 | #ifdef LODEPNG_COMPILE_ANCILLARY_CHUNKS 491 | void lodepng_clear_text(LodePNGInfo* info); /*use this to clear the texts again after you filled them in*/ 492 | unsigned lodepng_add_text(LodePNGInfo* info, const char* key, const char* str); /*push back both texts at once*/ 493 | 494 | void lodepng_clear_itext(LodePNGInfo* info); /*use this to clear the itexts again after you filled them in*/ 495 | unsigned lodepng_add_itext(LodePNGInfo* info, const char* key, const char* langtag, 496 | const char* transkey, const char* str); /*push back the 4 texts of 1 chunk at once*/ 497 | #endif /*LODEPNG_COMPILE_ANCILLARY_CHUNKS*/ 498 | 499 | /* 500 | Converts raw buffer from one color type to another color type, based on 501 | LodePNGColorMode structs to describe the input and output color type. 502 | See the reference manual at the end of this header file to see which color conversions are supported. 503 | return value = LodePNG error code (0 if all went ok, an error if the conversion isn't supported) 504 | The out buffer must have size (w * h * bpp + 7) / 8, where bpp is the bits per pixel 505 | of the output color type (lodepng_get_bpp). 506 | For < 8 bpp images, there should not be padding bits at the end of scanlines. 507 | For 16-bit per channel colors, uses big endian format like PNG does. 
508 | Return value is LodePNG error code 509 | */ 510 | unsigned lodepng_convert(unsigned char* out, const unsigned char* in, 511 | const LodePNGColorMode* mode_out, const LodePNGColorMode* mode_in, 512 | unsigned w, unsigned h); 513 | 514 | #ifdef LODEPNG_COMPILE_DECODER 515 | /* 516 | Settings for the decoder. This contains settings for the PNG and the Zlib 517 | decoder, but not the Info settings from the Info structs. 518 | */ 519 | typedef struct LodePNGDecoderSettings 520 | { 521 | LodePNGDecompressSettings zlibsettings; /*in here is the setting to ignore Adler32 checksums*/ 522 | 523 | unsigned ignore_crc; /*ignore CRC checksums*/ 524 | 525 | unsigned color_convert; /*whether to convert the PNG to the color type you want. Default: yes*/ 526 | 527 | #ifdef LODEPNG_COMPILE_ANCILLARY_CHUNKS 528 | unsigned read_text_chunks; /*if false but remember_unknown_chunks is true, they're stored in the unknown chunks*/ 529 | /*store all bytes from unknown chunks in the LodePNGInfo (off by default, useful for a png editor)*/ 530 | unsigned remember_unknown_chunks; 531 | #endif /*LODEPNG_COMPILE_ANCILLARY_CHUNKS*/ 532 | } LodePNGDecoderSettings; 533 | 534 | void lodepng_decoder_settings_init(LodePNGDecoderSettings* settings); 535 | #endif /*LODEPNG_COMPILE_DECODER*/ 536 | 537 | #ifdef LODEPNG_COMPILE_ENCODER 538 | /*automatically use color type with less bits per pixel if losslessly possible. Default: AUTO*/ 539 | typedef enum LodePNGFilterStrategy 540 | { 541 | /*every filter at zero*/ 542 | LFS_ZERO, 543 | /*Use filter that gives minimum sum, as described in the official PNG filter heuristic.*/ 544 | LFS_MINSUM, 545 | /*Use the filter type that gives smallest Shannon entropy for this scanline. Depending 546 | on the image, this is better or worse than minsum.*/ 547 | LFS_ENTROPY, 548 | /* 549 | Brute-force-search PNG filters by compressing each filter for each scanline. 550 | Experimental, very slow, and only rarely gives better compression than MINSUM. 
551 | */ 552 | LFS_BRUTE_FORCE, 553 | /*use predefined_filters buffer: you specify the filter type for each scanline*/ 554 | LFS_PREDEFINED 555 | } LodePNGFilterStrategy; 556 | 557 | /*Gives characteristics about the colors of the image, which helps decide which color model to use for encoding. 558 | Used internally by default if "auto_convert" is enabled. Public because it's useful for custom algorithms.*/ 559 | typedef struct LodePNGColorProfile 560 | { 561 | unsigned colored; /*not greyscale*/ 562 | unsigned key; /*image is not opaque and color key is possible instead of full alpha*/ 563 | unsigned short key_r; /*key values, always as 16-bit, in 8-bit case the byte is duplicated, e.g. 65535 means 255*/ 564 | unsigned short key_g; 565 | unsigned short key_b; 566 | unsigned alpha; /*image is not opaque and alpha channel or alpha palette required*/ 567 | unsigned numcolors; /*amount of colors, up to 257. Not valid if bits == 16.*/ 568 | unsigned char palette[1024]; /*Remembers up to the first 256 RGBA colors, in no particular order*/ 569 | unsigned bits; /*bits per channel (not for palette). 1,2 or 4 for greyscale only. 16 if 16-bit per channel required.*/ 570 | } LodePNGColorProfile; 571 | 572 | void lodepng_color_profile_init(LodePNGColorProfile* profile); 573 | 574 | /*Get a LodePNGColorProfile of the image.*/ 575 | unsigned lodepng_get_color_profile(LodePNGColorProfile* profile, 576 | const unsigned char* image, unsigned w, unsigned h, 577 | const LodePNGColorMode* mode_in); 578 | /*The function LodePNG uses internally to decide the PNG color with auto_convert. 579 | Chooses an optimal color model, e.g. 
grey if only grey pixels, palette if < 256 colors, ...*/ 580 | unsigned lodepng_auto_choose_color(LodePNGColorMode* mode_out, 581 | const unsigned char* image, unsigned w, unsigned h, 582 | const LodePNGColorMode* mode_in); 583 | 584 | /*Settings for the encoder.*/ 585 | typedef struct LodePNGEncoderSettings 586 | { 587 | LodePNGCompressSettings zlibsettings; /*settings for the zlib encoder, such as window size, ...*/ 588 | 589 | unsigned auto_convert; /*automatically choose output PNG color type. Default: true*/ 590 | 591 | /*If true, follows the official PNG heuristic: if the PNG uses a palette or lower than 592 | 8 bit depth, set all filters to zero. Otherwise use the filter_strategy. Note that to 593 | completely follow the official PNG heuristic, filter_palette_zero must be true and 594 | filter_strategy must be LFS_MINSUM*/ 595 | unsigned filter_palette_zero; 596 | /*Which filter strategy to use when not using zeroes due to filter_palette_zero. 597 | Set filter_palette_zero to 0 to ensure always using your chosen strategy. Default: LFS_MINSUM*/ 598 | LodePNGFilterStrategy filter_strategy; 599 | /*used if filter_strategy is LFS_PREDEFINED. In that case, this must point to a buffer with 600 | the same length as the amount of scanlines in the image, and each value must <= 5. You 601 | have to cleanup this buffer, LodePNG will never free it. Don't forget that filter_palette_zero 602 | must be set to 0 to ensure this is also used on palette or low bitdepth images.*/ 603 | const unsigned char* predefined_filters; 604 | 605 | /*force creating a PLTE chunk if colortype is 2 or 6 (= a suggested palette). 
606 | If colortype is 3, PLTE is _always_ created.*/ 607 | unsigned force_palette; 608 | #ifdef LODEPNG_COMPILE_ANCILLARY_CHUNKS 609 | /*add LodePNG identifier and version as a text chunk, for debugging*/ 610 | unsigned add_id; 611 | /*encode text chunks as zTXt chunks instead of tEXt chunks, and use compression in iTXt chunks*/ 612 | unsigned text_compression; 613 | #endif /*LODEPNG_COMPILE_ANCILLARY_CHUNKS*/ 614 | } LodePNGEncoderSettings; 615 | 616 | void lodepng_encoder_settings_init(LodePNGEncoderSettings* settings); 617 | #endif /*LODEPNG_COMPILE_ENCODER*/ 618 | 619 | 620 | #if defined(LODEPNG_COMPILE_DECODER) || defined(LODEPNG_COMPILE_ENCODER) 621 | /*The settings, state and information for extended encoding and decoding.*/ 622 | typedef struct LodePNGState 623 | { 624 | #ifdef LODEPNG_COMPILE_DECODER 625 | LodePNGDecoderSettings decoder; /*the decoding settings*/ 626 | #endif /*LODEPNG_COMPILE_DECODER*/ 627 | #ifdef LODEPNG_COMPILE_ENCODER 628 | LodePNGEncoderSettings encoder; /*the encoding settings*/ 629 | #endif /*LODEPNG_COMPILE_ENCODER*/ 630 | LodePNGColorMode info_raw; /*specifies the format in which you would like to get the raw pixel buffer*/ 631 | LodePNGInfo info_png; /*info of the PNG image obtained after decoding*/ 632 | unsigned error; 633 | #ifdef LODEPNG_COMPILE_CPP 634 | /* For the lodepng::State subclass. */ 635 | virtual ~LodePNGState(){} 636 | #endif 637 | } LodePNGState; 638 | 639 | /*init, cleanup and copy functions to use with this struct*/ 640 | void lodepng_state_init(LodePNGState* state); 641 | void lodepng_state_cleanup(LodePNGState* state); 642 | void lodepng_state_copy(LodePNGState* dest, const LodePNGState* source); 643 | #endif /* defined(LODEPNG_COMPILE_DECODER) || defined(LODEPNG_COMPILE_ENCODER) */ 644 | 645 | #ifdef LODEPNG_COMPILE_DECODER 646 | /* 647 | Same as lodepng_decode_memory, but uses a LodePNGState to allow custom settings and 648 | getting much more information about the PNG image and color mode. 
649 | */ 650 | unsigned lodepng_decode(unsigned char** out, unsigned* w, unsigned* h, 651 | LodePNGState* state, 652 | const unsigned char* in, size_t insize); 653 | 654 | /* 655 | Read the PNG header, but not the actual data. This returns only the information 656 | that is in the header chunk of the PNG, such as width, height and color type. The 657 | information is placed in the info_png field of the LodePNGState. 658 | */ 659 | unsigned lodepng_inspect(unsigned* w, unsigned* h, 660 | LodePNGState* state, 661 | const unsigned char* in, size_t insize); 662 | #endif /*LODEPNG_COMPILE_DECODER*/ 663 | 664 | 665 | #ifdef LODEPNG_COMPILE_ENCODER 666 | /*This function allocates the out buffer with standard malloc and stores the size in *outsize.*/ 667 | unsigned lodepng_encode(unsigned char** out, size_t* outsize, 668 | const unsigned char* image, unsigned w, unsigned h, 669 | LodePNGState* state); 670 | #endif /*LODEPNG_COMPILE_ENCODER*/ 671 | 672 | /* 673 | The lodepng_chunk functions are normally not needed, except to traverse the 674 | unknown chunks stored in the LodePNGInfo struct, or add new ones to it. 675 | It also allows traversing the chunks of an encoded PNG file yourself. 676 | 677 | PNG standard chunk naming conventions: 678 | First byte: uppercase = critical, lowercase = ancillary 679 | Second byte: uppercase = public, lowercase = private 680 | Third byte: must be uppercase 681 | Fourth byte: uppercase = unsafe to copy, lowercase = safe to copy 682 | */ 683 | 684 | /* 685 | Gets the length of the data of the chunk. Total chunk length has 12 bytes more. 686 | There must be at least 4 bytes to read from. If the result value is too large, 687 | it may be corrupt data. 
688 | */ 689 | unsigned lodepng_chunk_length(const unsigned char* chunk); 690 | 691 | /*puts the 4-byte type in null terminated string*/ 692 | void lodepng_chunk_type(char type[5], const unsigned char* chunk); 693 | 694 | /*check if the type is the given type*/ 695 | unsigned char lodepng_chunk_type_equals(const unsigned char* chunk, const char* type); 696 | 697 | /*0: it's one of the critical chunk types, 1: it's an ancillary chunk (see PNG standard)*/ 698 | unsigned char lodepng_chunk_ancillary(const unsigned char* chunk); 699 | 700 | /*0: public, 1: private (see PNG standard)*/ 701 | unsigned char lodepng_chunk_private(const unsigned char* chunk); 702 | 703 | /*0: the chunk is unsafe to copy, 1: the chunk is safe to copy (see PNG standard)*/ 704 | unsigned char lodepng_chunk_safetocopy(const unsigned char* chunk); 705 | 706 | /*get pointer to the data of the chunk, where the input points to the header of the chunk*/ 707 | unsigned char* lodepng_chunk_data(unsigned char* chunk); 708 | const unsigned char* lodepng_chunk_data_const(const unsigned char* chunk); 709 | 710 | /*returns 0 if the crc is correct, 1 if it's incorrect (0 for OK as usual!)*/ 711 | unsigned lodepng_chunk_check_crc(const unsigned char* chunk); 712 | 713 | /*generates the correct CRC from the data and puts it in the last 4 bytes of the chunk*/ 714 | void lodepng_chunk_generate_crc(unsigned char* chunk); 715 | 716 | /*iterate to next chunks. don't use on IEND chunk, as there is no next chunk then*/ 717 | unsigned char* lodepng_chunk_next(unsigned char* chunk); 718 | const unsigned char* lodepng_chunk_next_const(const unsigned char* chunk); 719 | 720 | /* 721 | Appends chunk to the data in out. The given chunk should already have its chunk header. 722 | The out variable and outlength are updated to reflect the new reallocated buffer. 
723 | Returns error code (0 if it went ok) 724 | */ 725 | unsigned lodepng_chunk_append(unsigned char** out, size_t* outlength, const unsigned char* chunk); 726 | 727 | /* 728 | Appends new chunk to out. The chunk to append is given by giving its length, type 729 | and data separately. The type is a 4-letter string. 730 | The out variable and outlength are updated to reflect the new reallocated buffer. 731 | Returns error code (0 if it went ok) 732 | */ 733 | unsigned lodepng_chunk_create(unsigned char** out, size_t* outlength, unsigned length, 734 | const char* type, const unsigned char* data); 735 | 736 | 737 | /*Calculate CRC32 of buffer*/ 738 | unsigned lodepng_crc32(const unsigned char* buf, size_t len); 739 | #endif /*LODEPNG_COMPILE_PNG*/ 740 | 741 | 742 | #ifdef LODEPNG_COMPILE_ZLIB 743 | /* 744 | This zlib part can be used independently to zlib compress and decompress a 745 | buffer. It cannot be used to create gzip files however, and it only supports the 746 | part of zlib that is required for PNG, it does not support dictionaries. 747 | */ 748 | 749 | #ifdef LODEPNG_COMPILE_DECODER 750 | /*Inflate a buffer. Inflate is the decompression step of deflate. Out buffer must be freed after use.*/ 751 | unsigned lodepng_inflate(unsigned char** out, size_t* outsize, 752 | const unsigned char* in, size_t insize, 753 | const LodePNGDecompressSettings* settings); 754 | 755 | /* 756 | Decompresses Zlib data. Reallocates the out buffer and appends the data. The 757 | data must be according to the zlib specification. 758 | Either, *out must be NULL and *outsize must be 0, or, *out must be a valid 759 | buffer and *outsize its size in bytes. out must be freed by user after usage.
760 | */ 761 | unsigned lodepng_zlib_decompress(unsigned char** out, size_t* outsize, 762 | const unsigned char* in, size_t insize, 763 | const LodePNGDecompressSettings* settings); 764 | #endif /*LODEPNG_COMPILE_DECODER*/ 765 | 766 | #ifdef LODEPNG_COMPILE_ENCODER 767 | /* 768 | Compresses data with Zlib. Reallocates the out buffer and appends the data. 769 | Zlib adds a small header and trailer around the deflate data. 770 | The data is output in the format of the zlib specification. 771 | Either, *out must be NULL and *outsize must be 0, or, *out must be a valid 772 | buffer and *outsize its size in bytes. out must be freed by user after usage. 773 | */ 774 | unsigned lodepng_zlib_compress(unsigned char** out, size_t* outsize, 775 | const unsigned char* in, size_t insize, 776 | const LodePNGCompressSettings* settings); 777 | 778 | /* 779 | Find length-limited Huffman code for given frequencies. This function is in the 780 | public interface only for tests, it's used internally by lodepng_deflate. 781 | */ 782 | unsigned lodepng_huffman_code_lengths(unsigned* lengths, const unsigned* frequencies, 783 | size_t numcodes, unsigned maxbitlen); 784 | 785 | /*Compress a buffer with deflate. See RFC 1951. Out buffer must be freed after use.*/ 786 | unsigned lodepng_deflate(unsigned char** out, size_t* outsize, 787 | const unsigned char* in, size_t insize, 788 | const LodePNGCompressSettings* settings); 789 | 790 | #endif /*LODEPNG_COMPILE_ENCODER*/ 791 | #endif /*LODEPNG_COMPILE_ZLIB*/ 792 | 793 | #ifdef LODEPNG_COMPILE_DISK 794 | /* 795 | Load a file from disk into buffer. The function allocates the out buffer, and 796 | after usage you should free it. 797 | out: output parameter, contains pointer to loaded buffer. 
798 | outsize: output parameter, size of the allocated out buffer 799 | filename: the path to the file to load 800 | return value: error code (0 means ok) 801 | */ 802 | unsigned lodepng_load_file(unsigned char** out, size_t* outsize, const char* filename); 803 | 804 | /* 805 | Save a file from buffer to disk. Warning, if it exists, this function overwrites 806 | the file without warning! 807 | buffer: the buffer to write 808 | buffersize: size of the buffer to write 809 | filename: the path to the file to save to 810 | return value: error code (0 means ok) 811 | */ 812 | unsigned lodepng_save_file(const unsigned char* buffer, size_t buffersize, const char* filename); 813 | #endif /*LODEPNG_COMPILE_DISK*/ 814 | 815 | #ifdef LODEPNG_COMPILE_CPP 816 | /* The LodePNG C++ wrapper uses std::vectors instead of manually allocated memory buffers. */ 817 | namespace lodepng 818 | { 819 | #ifdef LODEPNG_COMPILE_PNG 820 | class State : public LodePNGState 821 | { 822 | public: 823 | State(); 824 | State(const State& other); 825 | virtual ~State(); 826 | State& operator=(const State& other); 827 | }; 828 | 829 | #ifdef LODEPNG_COMPILE_DECODER 830 | /* Same as other lodepng::decode, but using a State for more settings and information. */ 831 | unsigned decode(std::vector<unsigned char>& out, unsigned& w, unsigned& h, 832 | State& state, 833 | const unsigned char* in, size_t insize); 834 | unsigned decode(std::vector<unsigned char>& out, unsigned& w, unsigned& h, 835 | State& state, 836 | const std::vector<unsigned char>& in); 837 | #endif /*LODEPNG_COMPILE_DECODER*/ 838 | 839 | #ifdef LODEPNG_COMPILE_ENCODER 840 | /* Same as other lodepng::encode, but using a State for more settings and information.
*/ 841 | unsigned encode(std::vector<unsigned char>& out, 842 | const unsigned char* in, unsigned w, unsigned h, 843 | State& state); 844 | unsigned encode(std::vector<unsigned char>& out, 845 | const std::vector<unsigned char>& in, unsigned w, unsigned h, 846 | State& state); 847 | #endif /*LODEPNG_COMPILE_ENCODER*/ 848 | 849 | #ifdef LODEPNG_COMPILE_DISK 850 | /* 851 | Load a file from disk into an std::vector. 852 | return value: error code (0 means ok) 853 | */ 854 | unsigned load_file(std::vector<unsigned char>& buffer, const std::string& filename); 855 | 856 | /* 857 | Save the binary data in an std::vector to a file on disk. The file is overwritten 858 | without warning. 859 | */ 860 | unsigned save_file(const std::vector<unsigned char>& buffer, const std::string& filename); 861 | #endif /* LODEPNG_COMPILE_DISK */ 862 | #endif /* LODEPNG_COMPILE_PNG */ 863 | 864 | #ifdef LODEPNG_COMPILE_ZLIB 865 | #ifdef LODEPNG_COMPILE_DECODER 866 | /* Zlib-decompress an unsigned char buffer */ 867 | unsigned decompress(std::vector<unsigned char>& out, const unsigned char* in, size_t insize, 868 | const LodePNGDecompressSettings& settings = lodepng_default_decompress_settings); 869 | 870 | /* Zlib-decompress an std::vector */ 871 | unsigned decompress(std::vector<unsigned char>& out, const std::vector<unsigned char>& in, 872 | const LodePNGDecompressSettings& settings = lodepng_default_decompress_settings); 873 | #endif /* LODEPNG_COMPILE_DECODER */ 874 | 875 | #ifdef LODEPNG_COMPILE_ENCODER 876 | /* Zlib-compress an unsigned char buffer */ 877 | unsigned compress(std::vector<unsigned char>& out, const unsigned char* in, size_t insize, 878 | const LodePNGCompressSettings& settings = lodepng_default_compress_settings); 879 | 880 | /* Zlib-compress an std::vector */ 881 | unsigned compress(std::vector<unsigned char>& out, const std::vector<unsigned char>& in, 882 | const LodePNGCompressSettings& settings = lodepng_default_compress_settings); 883 | #endif /* LODEPNG_COMPILE_ENCODER */ 884 | #endif /* LODEPNG_COMPILE_ZLIB */ 885 | } /* namespace lodepng */ 886 | #endif /*LODEPNG_COMPILE_CPP*/ 887 | 888 | /* 889 | TODO: 890 | [.]
test if there are no memory leaks or security exploits - done a lot but needs to be checked often 891 | [.] check compatibility with various compilers - done but needs to be redone for every newer version 892 | [X] converting color to 16-bit per channel types 893 | [ ] read all public PNG chunk types (but never let the color profile and gamma ones touch RGB values) 894 | [ ] make sure encoder generates no chunks with size > (2^31)-1 895 | [ ] partial decoding (stream processing) 896 | [X] let the "isFullyOpaque" function check color keys and transparent palettes too 897 | [X] better name for the variables "codes", "codesD", "codelengthcodes", "clcl" and "lldl" 898 | [ ] don't stop decoding on errors like 69, 57, 58 (make warnings) 899 | [ ] make warnings like: oob palette, checksum fail, data after iend, wrong/unknown crit chunk, no null terminator in text, ... 900 | [ ] let the C++ wrapper catch exceptions coming from the standard library and return LodePNG error codes 901 | [ ] allow user to provide custom color conversion functions, e.g. for premultiplied alpha, padding bits or not, ... 902 | [ ] allow user to give data (void*) to custom allocator 903 | */ 904 | 905 | #endif /*LODEPNG_H inclusion guard*/ 906 | 907 | /* 908 | LodePNG Documentation 909 | --------------------- 910 | 911 | 0. table of contents 912 | -------------------- 913 | 914 | 1. about 915 | 1.1. supported features 916 | 1.2. features not supported 917 | 2. C and C++ version 918 | 3. security 919 | 4. decoding 920 | 5. encoding 921 | 6. color conversions 922 | 6.1. PNG color types 923 | 6.2. color conversions 924 | 6.3. padding bits 925 | 6.4. A note about 16-bits per channel and endianness 926 | 7. error values 927 | 8. chunks and PNG editing 928 | 9. compiler support 929 | 10. examples 930 | 10.1. decoder C++ example 931 | 10.2. decoder C example 932 | 11. state settings reference 933 | 12. changes 934 | 13. contact information 935 | 936 | 937 | 1. 
about 938 | -------- 939 | 940 | PNG is a file format to store raster images losslessly with good compression, 941 | supporting different color types and alpha channel. 942 | 943 | LodePNG is a PNG codec according to the Portable Network Graphics (PNG) 944 | Specification (Second Edition) - W3C Recommendation 10 November 2003. 945 | 946 | The specifications used are: 947 | 948 | *) Portable Network Graphics (PNG) Specification (Second Edition): 949 | http://www.w3.org/TR/2003/REC-PNG-20031110 950 | *) RFC 1950 ZLIB Compressed Data Format version 3.3: 951 | http://www.gzip.org/zlib/rfc-zlib.html 952 | *) RFC 1951 DEFLATE Compressed Data Format Specification ver 1.3: 953 | http://www.gzip.org/zlib/rfc-deflate.html 954 | 955 | The most recent version of LodePNG can currently be found at 956 | http://lodev.org/lodepng/ 957 | 958 | LodePNG works both in C (ISO C90) and C++, with a C++ wrapper that adds 959 | extra functionality. 960 | 961 | LodePNG consists of two files: 962 | -lodepng.h: the header file for both C and C++ 963 | -lodepng.c(pp): give it the name lodepng.c or lodepng.cpp (or .cc) depending on your usage 964 | 965 | If you want to start using LodePNG right away without reading this doc, get the 966 | examples from the LodePNG website to see how to use it in code, or check the 967 | smaller examples in chapter 10 here. 968 | 969 | LodePNG is simple but only supports the basic requirements. To achieve 970 | simplicity, the following design choices were made: There are no dependencies 971 | on any external library. There are functions to decode and encode a PNG with 972 | a single function call, and extended versions of these functions taking a 973 | LodePNGState struct that lets you specify or get more information. By default 974 | the colors of the raw image are always RGB or RGBA, no matter what color type 975 | the PNG file uses. To read and write files, there are simple functions to 976 | convert the files to/from buffers in memory.
977 | 978 | This all makes LodePNG suitable for loading textures in games, demos and small 979 | programs, ... It's less suitable for full fledged image editors, loading PNGs 980 | over network (it requires all the image data to be available before decoding can 981 | begin), life-critical systems, ... 982 | 983 | 1.1. supported features 984 | ----------------------- 985 | 986 | The following features are supported by the decoder: 987 | 988 | *) decoding of PNGs with any color type, bit depth and interlace mode, to a 24- or 32-bit color raw image, 989 | or the same color type as the PNG 990 | *) encoding of PNGs, from any raw image to 24- or 32-bit color, or the same color type as the raw image 991 | *) Adam7 interlace and deinterlace for any color type 992 | *) loading the image from harddisk or decoding it from a buffer from other sources than harddisk 993 | *) support for alpha channels, including RGBA color model, translucent palettes and color keying 994 | *) zlib decompression (inflate) 995 | *) zlib compression (deflate) 996 | *) CRC32 and ADLER32 checksums 997 | *) handling of unknown chunks, allowing making a PNG editor that stores custom and unknown chunks. 998 | *) the following chunks are supported (generated/interpreted) by both encoder and decoder: 999 | IHDR: header information 1000 | PLTE: color palette 1001 | IDAT: pixel data 1002 | IEND: the final chunk 1003 | tRNS: transparency for palettized images 1004 | tEXt: textual information 1005 | zTXt: compressed textual information 1006 | iTXt: international textual information 1007 | bKGD: suggested background color 1008 | pHYs: physical dimensions 1009 | tIME: modification time 1010 | 1011 | 1.2. features not supported 1012 | --------------------------- 1013 | 1014 | The following features are _not_ supported: 1015 | 1016 | *) some features needed to make a conformant PNG-Editor might be still missing. 1017 | *) partial loading/stream processing. All data must be available and is processed in one call. 
1018 | *) The following public chunks are not supported but treated as unknown chunks by LodePNG 1019 | cHRM, gAMA, iCCP, sRGB, sBIT, hIST, sPLT 1020 | Some of these are not supported on purpose: LodePNG wants to provide the RGB values 1021 | stored in the pixels, not values modified by system dependent gamma or color models. 1022 | 1023 | 1024 | 2. C and C++ version 1025 | -------------------- 1026 | 1027 | The C version uses buffers allocated with alloc that you need to free() 1028 | yourself. You need to use init and cleanup functions for each struct whenever 1029 | using a struct from the C version to avoid exploits and memory leaks. 1030 | 1031 | The C++ version has extra functions with std::vectors in the interface and the 1032 | lodepng::State class which is a LodePNGState with constructor and destructor. 1033 | 1034 | These files work without modification for both C and C++ compilers because all 1035 | the additional C++ code is in "#ifdef __cplusplus" blocks that make C-compilers 1036 | ignore it, and the C code is made to compile both with strict ISO C90 and C++. 1037 | 1038 | To use the C++ version, you need to rename the source file to lodepng.cpp 1039 | (instead of lodepng.c), and compile it with a C++ compiler. 1040 | 1041 | To use the C version, you need to rename the source file to lodepng.c (instead 1042 | of lodepng.cpp), and compile it with a C compiler. 1043 | 1044 | 1045 | 3. Security 1046 | ----------- 1047 | 1048 | Even if carefully designed, it's always possible that LodePNG contains possible 1049 | exploits. If you discover one, please let me know, and it will be fixed. 1050 | 1051 | When using LodePNG, care has to be taken with the C version of LodePNG, as well 1052 | as the C-style structs when working with C++. 
The following conventions are used 1053 | for all C-style structs: 1054 | 1055 | -if a struct has a corresponding init function, always call the init function when making a new one 1056 | -if a struct has a corresponding cleanup function, call it before the struct disappears to avoid memory leaks 1057 | -if a struct has a corresponding copy function, use the copy function instead of "=". 1058 | The destination must also be inited already. 1059 | 1060 | 1061 | 4. Decoding 1062 | ----------- 1063 | 1064 | Decoding converts a PNG compressed image to a raw pixel buffer. 1065 | 1066 | Most documentation on using the decoder is at its declarations in the header 1067 | above. For C, simple decoding can be done with functions such as 1068 | lodepng_decode32, and more advanced decoding can be done with the struct 1069 | LodePNGState and lodepng_decode. For C++, all decoding can be done with the 1070 | various lodepng::decode functions, and lodepng::State can be used for advanced 1071 | features. 1072 | 1073 | When using the LodePNGState, it uses the following fields for decoding: 1074 | *) LodePNGInfo info_png: it stores extra information about the PNG (the input) in here 1075 | *) LodePNGColorMode info_raw: here you can say what color mode of the raw image (the output) you want to get 1076 | *) LodePNGDecoderSettings decoder: you can specify a few extra settings for the decoder to use 1077 | 1078 | LodePNGInfo info_png 1079 | -------------------- 1080 | 1081 | After decoding, this contains extra information of the PNG image, except the actual 1082 | pixels, width and height because these are already gotten directly from the decoder 1083 | functions. 1084 | 1085 | It contains for example the original color type of the PNG image, text comments, 1086 | suggested background color, etc... More details about the LodePNGInfo struct are 1087 | at its declaration documentation. 
1088 | 1089 | LodePNGColorMode info_raw 1090 | ------------------------- 1091 | 1092 | When decoding, here you can specify which color type you want 1093 | the resulting raw image to be. If this is different from the colortype of the 1094 | PNG, then the decoder will automatically convert the result. This conversion 1095 | always works, except if you want it to convert a color PNG to greyscale or to 1096 | a palette with missing colors. 1097 | 1098 | By default, 32-bit color is used for the result. 1099 | 1100 | LodePNGDecoderSettings decoder 1101 | ------------------------------ 1102 | 1103 | The settings can be used to ignore the errors caused by invalid CRC and Adler32 1104 | checksums, and to disable the decoding of tEXt chunks. 1105 | 1106 | There's also a setting color_convert, true by default. If false, no conversion 1107 | is done, the resulting data will be as it was in the PNG (after decompression) 1108 | and you'll have to puzzle the colors of the pixels together yourself using the 1109 | color type information in the LodePNGInfo. 1110 | 1111 | 1112 | 5. Encoding 1113 | ----------- 1114 | 1115 | Encoding converts a raw pixel buffer to a PNG compressed image. 1116 | 1117 | Most documentation on using the encoder is at its declarations in the header 1118 | above. For C, simple encoding can be done with functions such as 1119 | lodepng_encode32, and more advanced encoding can be done with the struct 1120 | LodePNGState and lodepng_encode. For C++, all encoding can be done with the 1121 | various lodepng::encode functions, and lodepng::State can be used for advanced 1122 | features. 1123 | 1124 | Like the decoder, the encoder can also give errors. However it gives fewer errors 1125 | since the encoder input is trusted, while the decoder input (a PNG image that could 1126 | be forged by anyone) is not trusted.
1127 | 1128 | When using the LodePNGState, it uses the following fields for encoding: 1129 | *) LodePNGInfo info_png: here you specify how you want the PNG (the output) to be. 1130 | *) LodePNGColorMode info_raw: here you say what color type the raw image (the input) has 1131 | *) LodePNGEncoderSettings encoder: you can specify a few settings for the encoder to use 1132 | 1133 | LodePNGInfo info_png 1134 | -------------------- 1135 | 1136 | When encoding, you use this in the opposite way from decoding: for encoding, 1137 | you fill in the values you want the PNG to have before encoding. By default it's 1138 | not needed to specify a color type for the PNG since it's automatically chosen, 1139 | but it's possible to choose it yourself given the right settings. 1140 | 1141 | The encoder will not always exactly match the LodePNGInfo struct you give, 1142 | it tries to match it as closely as possible. Some things are ignored by the encoder. The 1143 | encoder uses, for example, the following settings from it when applicable: 1144 | colortype and bitdepth, text chunks, time chunk, the color key, the palette, the 1145 | background color, the interlace method, unknown chunks, ... 1146 | 1147 | When encoding to a PNG with colortype 3, the encoder will generate a PLTE chunk. 1148 | If the palette contains any colors for which the alpha channel is not 255 (so 1149 | there are translucent colors in the palette), it'll add a tRNS chunk. 1150 | 1151 | LodePNGColorMode info_raw 1152 | ------------------------- 1153 | 1154 | You specify the color type of the raw image that you give to the input here, 1155 | including a possible transparent color key and palette you happen to be using in 1156 | your raw image data. 1157 | 1158 | By default, 32-bit color is assumed, meaning your input has to be in RGBA 1159 | format with 4 bytes (unsigned chars) per pixel.
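As a rough illustration of the default 32-bit RGBA input described above, the sketch below fills such a buffer by hand. The function name make_rgba_gradient is hypothetical and not part of LodePNG, and the row-by-row pixel order is an assumption of this sketch. It only shows the 4-bytes-per-pixel indexing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LodePNG): builds a w * h raw image with
// 4 unsigned chars per pixel in R, G, B, A order, stored row by row (assumed).
std::vector<unsigned char> make_rgba_gradient(unsigned w, unsigned h)
{
  assert(w > 0 && h > 0);
  std::vector<unsigned char> image(static_cast<std::size_t>(w) * h * 4);
  for(unsigned y = 0; y < h; y++)
  for(unsigned x = 0; x < w; x++)
  {
    // byte offset of the first channel of pixel (x, y)
    std::size_t i = 4 * (static_cast<std::size_t>(y) * w + x);
    image[i + 0] = static_cast<unsigned char>(255 * x / w); // red grows with x
    image[i + 1] = static_cast<unsigned char>(255 * y / h); // green grows with y
    image[i + 2] = 128;                                     // constant blue
    image[i + 3] = 255;                                     // alpha: fully opaque
  }
  return image;
}
```

A buffer built this way has w * h * 4 bytes, which is the shape the std::vector overload of lodepng::encode declared above would presumably expect when info_raw is left at its default.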
1160 | 1161 | LodePNGEncoderSettings encoder 1162 | ------------------------------ 1163 | 1164 | The following settings are supported (some are in sub-structs): 1165 | *) auto_convert: when this option is enabled, the encoder will 1166 | automatically choose the smallest possible color mode (including color key) that 1167 | can encode the colors of all pixels without information loss. 1168 | *) btype: the block type for LZ77. 0 = uncompressed, 1 = fixed huffman tree, 1169 | 2 = dynamic huffman tree (best compression). Should be 2 for proper 1170 | compression. 1171 | *) use_lz77: whether or not to use LZ77 for compressed block types. Should be 1172 | true for proper compression. 1173 | *) windowsize: the window size used by the LZ77 encoder (1 - 32768). Has value 1174 | 2048 by default, but can be set to 32768 for better, but slow, compression. 1175 | *) force_palette: if colortype is 2 or 6, you can make the encoder write a PLTE 1176 | chunk if force_palette is true. This can be used as a suggested palette to convert 1177 | to by viewers that don't support more than 256 colors (if those still exist) 1178 | *) add_id: add text chunk "Encoder: LodePNG " to the image. 1179 | *) text_compression: default 1. If 1, it'll store texts as zTXt instead of tEXt chunks. 1180 | zTXt chunks use zlib compression on the text. This gives a smaller result on 1181 | large texts but a larger result on small texts (such as a single program name). 1182 | It's all tEXt or all zTXt though, there's no separate setting per text yet. 1183 | 1184 | 1185 | 6. color conversions 1186 | -------------------- 1187 | 1188 | An important thing to note about LodePNG, is that the color type of the PNG, and 1189 | the color type of the raw image, are completely independent. By default, when 1190 | you decode a PNG, you get the result as a raw image in the color type you want, 1191 | no matter whether the PNG was encoded with a palette, greyscale or RGBA color.
1192 | And if you encode an image, by default LodePNG will automatically choose the PNG 1193 | color type that gives good compression based on the values of colors and amount 1194 | of colors in the image. It can be configured to let you control it instead as 1195 | well, though. 1196 | 1197 | To be able to do this, LodePNG does conversions from one color mode to another. 1198 | It can convert from almost any color type to any other color type, except the 1199 | following conversions: RGB to greyscale is not supported, and converting to a 1200 | palette when the palette doesn't have a required color is not supported. This is 1201 | not supported on purpose: this is information loss which requires a color 1202 | reduction algorithm that is beyond the scope of a PNG encoder (yes, RGB to grey 1203 | is easy, but there are multiple ways if you want to give some channels more 1204 | weight). 1205 | 1206 | By default, when decoding, you get the raw image in 32-bit RGBA or 24-bit RGB 1207 | color, no matter what color type the PNG has. And by default when encoding, 1208 | LodePNG automatically picks the best color model for the output PNG, and expects 1209 | the input image to be 32-bit RGBA or 24-bit RGB. So, unless you want to control 1210 | the color format of the images yourself, you can skip this chapter. 1211 | 1212 | 6.1. PNG color types 1213 | -------------------- 1214 | 1215 | A PNG image can have many color types, ranging from 1-bit color to 64-bit color, 1216 | as well as palettized color modes. After the zlib decompression and unfiltering 1217 | in the PNG image is done, the raw pixel data will have that color type and thus 1218 | a certain amount of bits per pixel. If you want the output raw image after 1219 | decoding to have another color type, a conversion is done by LodePNG.
1220 | 1221 | The PNG specification gives the following color types: 1222 | 1223 | 0: greyscale, bit depths 1, 2, 4, 8, 16 1224 | 2: RGB, bit depths 8 and 16 1225 | 3: palette, bit depths 1, 2, 4 and 8 1226 | 4: greyscale with alpha, bit depths 8 and 16 1227 | 6: RGBA, bit depths 8 and 16 1228 | 1229 | Bit depth is the amount of bits per pixel per color channel. So the total amount 1230 | of bits per pixel is: amount of channels * bitdepth. 1231 | 1232 | 6.2. color conversions 1233 | ---------------------- 1234 | 1235 | As explained in the sections about the encoder and decoder, you can specify 1236 | color types and bit depths in info_png and info_raw to change the default 1237 | behaviour. 1238 | 1239 | If, when decoding, you want the raw image to be something else than the default, 1240 | you need to set the color type and bit depth you want in the LodePNGColorMode, 1241 | or the parameters colortype and bitdepth of the simple decoding function. 1242 | 1243 | If, when encoding, you use another color type than the default in the raw input 1244 | image, you need to specify its color type and bit depth in the LodePNGColorMode 1245 | of the raw image, or use the parameters colortype and bitdepth of the simple 1246 | encoding function. 1247 | 1248 | If, when encoding, you don't want LodePNG to choose the output PNG color type 1249 | but control it yourself, you need to set auto_convert in the encoder settings 1250 | to false, and specify the color type you want in the LodePNGInfo of the 1251 | encoder (including palette: it can generate a palette if auto_convert is true, 1252 | otherwise not). 1253 | 1254 | If the input and output color type differ (whether user chosen or auto chosen), 1255 | LodePNG will do a color conversion, which follows the rules below, and may 1256 | sometimes result in an error. 
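The rule from section 6.1 that the bits per pixel equal the amount of channels times the bit depth can be sketched as follows. Both function names are hypothetical and not part of the LodePNG interface. The channel counts follow from the color type list in 6.1.

```cpp
#include <cassert>

// Hypothetical helper: number of channels for a PNG color type number,
// or 0 if the number is not one of the five color types listed in 6.1.
unsigned channels_for_colortype(unsigned colortype)
{
  switch(colortype)
  {
    case 0: return 1; // greyscale
    case 2: return 3; // RGB
    case 3: return 1; // palette index
    case 4: return 2; // greyscale with alpha
    case 6: return 4; // RGBA
    default: return 0;
  }
}

// Hypothetical helper: total bits per pixel = amount of channels * bitdepth.
unsigned bits_per_pixel(unsigned colortype, unsigned bitdepth)
{
  unsigned channels = channels_for_colortype(colortype);
  assert(channels != 0); // reject color type numbers PNG does not define
  return channels * bitdepth;
}
```

With these numbers, 8-bit RGBA comes out at 32 bits and 16-bit RGBA at 64 bits, which matches the "1-bit color to 64-bit color" range mentioned in 6.1.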
1257 | 1258 | To avoid some confusion: 1259 | -the decoder converts from PNG to raw image 1260 | -the encoder converts from raw image to PNG 1261 | -the colortype and bitdepth in LodePNGColorMode info_raw, are those of the raw image 1262 | -the colortype and bitdepth in the color field of LodePNGInfo info_png, are those of the PNG 1263 | -when encoding, the color type in LodePNGInfo is ignored if auto_convert 1264 | is enabled, it is automatically generated instead 1265 | -when decoding, the color type in LodePNGInfo is set by the decoder to that of the original 1266 | PNG image, but it can be ignored since the raw image has the color type you requested instead 1267 | -if the color type of the LodePNGColorMode and PNG image aren't the same, a conversion 1268 | between the color types is done if the color types are supported. If it is not 1269 | supported, an error is returned. If the types are the same, no conversion is done. 1270 | -even though some conversions aren't supported, LodePNG supports loading PNGs from any 1271 | colortype and saving PNGs to any colortype, sometimes it just requires preparing 1272 | the raw image correctly before encoding. 1273 | -both encoder and decoder use the same color converter. 1274 | 1275 | Non supported color conversions: 1276 | -color to greyscale: no error is thrown, but the result will look ugly because 1277 | only the red channel is taken 1278 | -anything to palette when that palette does not have that color in it: in this 1279 | case an error is thrown 1280 | 1281 | Supported color conversions: 1282 | -anything to 8-bit RGB, 8-bit RGBA, 16-bit RGB, 16-bit RGBA 1283 | -any grey or grey+alpha, to grey or grey+alpha 1284 | -anything to a palette, as long as the palette has the requested colors in it 1285 | -removing alpha channel 1286 | -higher to smaller bitdepth, and vice versa 1287 | 1288 | If you want no color conversion to be done (e.g. 
for speed or control): 1289 | -In the encoder, you can make it save a PNG with any color type by giving the 1290 | raw color mode and LodePNGInfo the same color mode, and setting auto_convert to 1291 | false. 1292 | -In the decoder, you can make it store the pixel data in the same color type 1293 | as the PNG has, by setting the color_convert setting to false. Settings in 1294 | info_raw are then ignored. 1295 | 1296 | The function lodepng_convert does the color conversion. It is available in the 1297 | interface but normally isn't needed since the encoder and decoder already call 1298 | it. 1299 | 1300 | 6.3. padding bits 1301 | ----------------- 1302 | 1303 | In the PNG file format, if a less than 8-bit per pixel color type is used and the scanlines 1304 | have a bit amount that isn't a multiple of 8, then padding bits are used so that each 1305 | scanline starts at a fresh byte. But that is NOT true for the LodePNG raw input and output. 1306 | The raw input image you give to the encoder, and the raw output image you get from the decoder 1307 | will NOT have these padding bits, e.g. in the case of a 1-bit image with a width 1308 | of 7 pixels, the first pixel of the second scanline will be the 8th bit of the first byte, 1309 | not the first bit of a new byte. 1310 | 1311 | 6.4. A note about 16-bits per channel and endianness 1312 | ---------------------------------------------------- 1313 | 1314 | LodePNG uses unsigned char arrays for 16-bit per channel colors too, just like 1315 | for any other color format. The 16-bit values are stored in big endian (most 1316 | significant byte first) in these arrays. This is the opposite order of the 1317 | little endian used by x86 CPUs. 1318 | 1319 | LodePNG always uses big endian because the PNG file format does so internally.
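A minimal sketch of what big endian storage means for a 16-bit sample in the unsigned char array. The helper name read_sample16 is hypothetical and not part of LodePNG. It combines the two bytes explicitly, so the result does not depend on the endianness of the CPU.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of LodePNG): returns 16-bit sample number
// `index` from a raw buffer that stores every sample most significant byte first.
unsigned read_sample16(const unsigned char* raw, std::size_t index)
{
  assert(raw != 0);
  // high byte comes first in memory, low byte second
  return (static_cast<unsigned>(raw[2 * index]) << 8) | raw[2 * index + 1];
}
```

For a 16-bit RGBA image, one pixel would hold four such samples, so the red value of pixel p would be sample number 4 * p under this sketch.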
1320 | Conversions to formats other than the one PNG uses internally are not supported by 1321 | LodePNG on purpose: there are myriads of formats, including endianness of 16-bit 1322 | colors, the order in which you store R, G, B and A, and so on. Supporting and 1323 | converting to/from all that is outside the scope of LodePNG. 1324 | 1325 | This may mean that, depending on your use case, you may want to convert the big 1326 | endian output of LodePNG to little endian with a for loop. This is certainly not 1327 | always needed, many applications and libraries support big endian 16-bit colors 1328 | anyway, but it means you cannot simply cast the unsigned char* buffer to an 1329 | unsigned short* buffer on x86 CPUs. 1330 | 1331 | 1332 | 7. error values 1333 | --------------- 1334 | 1335 | All functions in LodePNG that return an error code, return 0 if everything went 1336 | OK, or a non-zero code if there was an error. 1337 | 1338 | The meaning of the LodePNG error values can be retrieved with the function 1339 | lodepng_error_text: given the numerical error code, it returns a description 1340 | of the error in English as a string. 1341 | 1342 | Check the implementation of lodepng_error_text to see the meaning of each code. 1343 | 1344 | 1345 | 8. chunks and PNG editing 1346 | ------------------------- 1347 | 1348 | If you want to add extra chunks to a PNG you encode, or use LodePNG for a PNG 1349 | editor that should follow the rules about handling of unknown chunks, or if your 1350 | program is able to read other types of chunks than the ones handled by LodePNG, 1351 | then that's possible with the chunk functions of LodePNG. 1352 | 1353 | A PNG chunk has the following layout: 1354 | 1355 | 4 bytes length 1356 | 4 bytes type name 1357 | length bytes data 1358 | 4 bytes CRC 1359 | 1360 | 8.1.
iterating through chunks 1361 | ----------------------------- 1362 | 1363 | If you have a buffer containing the PNG image data, then the first chunk (the 1364 | IHDR chunk) starts at byte number 8 of that buffer. The first 8 bytes are the 1365 | signature of the PNG and are not part of a chunk. But if you start at byte 8 1366 | then you have a chunk, and can check the following things about it. 1367 | 1368 | NOTE: none of these functions check for memory buffer boundaries. To avoid 1369 | exploits, always make sure the buffer contains all the data of the chunks. 1370 | When using lodepng_chunk_next, make sure the returned value is within the 1371 | allocated memory. 1372 | 1373 | unsigned lodepng_chunk_length(const unsigned char* chunk): 1374 | 1375 | Get the length of the chunk's data. The total chunk length is this length + 12. 1376 | 1377 | void lodepng_chunk_type(char type[5], const unsigned char* chunk): 1378 | unsigned char lodepng_chunk_type_equals(const unsigned char* chunk, const char* type): 1379 | 1380 | Get the type of the chunk or compare if it's a certain type 1381 | 1382 | unsigned char lodepng_chunk_ancillary(const unsigned char* chunk): 1383 | unsigned char lodepng_chunk_private(const unsigned char* chunk): 1384 | unsigned char lodepng_chunk_safetocopy(const unsigned char* chunk): 1385 | 1386 | Check if the chunk is ancillary rather than critical in the PNG standard (only IHDR, PLTE, IDAT and IEND are critical). 1387 | Check if the chunk is private (public chunks are part of the standard, private ones not). 1388 | Check if the chunk is safe to copy. If it's not, then, when modifying data in a critical 1389 | chunk, unsafe to copy chunks of the old image may NOT be saved in the new one if your 1390 | program doesn't handle that type of unknown chunk. 1391 | 1392 | unsigned char* lodepng_chunk_data(unsigned char* chunk): 1393 | const unsigned char* lodepng_chunk_data_const(const unsigned char* chunk): 1394 | 1395 | Get a pointer to the start of the data of the chunk.
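The chunk layout and the length, type and data accessors described so far can be illustrated without LodePNG itself. The sketch below is an assumption-laden stand-in, not the library's implementation: ChunkView and read_chunk are hypothetical names, and unlike the LodePNG chunk functions this version checks the buffer boundaries itself, as the NOTE above recommends.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical view of one chunk (not a LodePNG type).
struct ChunkView
{
  unsigned long length;      // number of data bytes, excluding the 12 bytes overhead
  char type[5];              // 4-letter type name, null terminated
  const unsigned char* data; // points at the first data byte
};

// Reads the chunk that starts at buffer[pos]. Returns false if the whole chunk
// (4 bytes length + 4 bytes type + data + 4 bytes CRC) does not fit in the buffer.
bool read_chunk(const unsigned char* buffer, std::size_t size, std::size_t pos, ChunkView& view)
{
  assert(buffer != 0);
  if(pos > size || size - pos < 12) return false; // no room for length, type and CRC
  const unsigned char* chunk = buffer + pos;
  // The PNG format stores the length most significant byte first.
  unsigned long length = (static_cast<unsigned long>(chunk[0]) << 24)
                       | (static_cast<unsigned long>(chunk[1]) << 16)
                       | (static_cast<unsigned long>(chunk[2]) << 8)
                       | static_cast<unsigned long>(chunk[3]);
  if(length > size - pos - 12) return false; // data or CRC would run past the buffer
  view.length = length;
  std::memcpy(view.type, chunk + 4, 4);
  view.type[4] = 0;
  view.data = chunk + 8;
  return true;
}
```

Starting at pos = 8 (just after the PNG signature) and advancing pos by view.length + 12 after each successful call would walk the chunks one by one, stopping when read_chunk returns false or the type is IEND.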
1396 | 1397 | unsigned lodepng_chunk_check_crc(const unsigned char* chunk): 1398 | void lodepng_chunk_generate_crc(unsigned char* chunk): 1399 | 1400 | Check if the crc is correct or generate a correct one. 1401 | 1402 | unsigned char* lodepng_chunk_next(unsigned char* chunk): 1403 | const unsigned char* lodepng_chunk_next_const(const unsigned char* chunk): 1404 | 1405 | Iterate to the next chunk. This works if you have a buffer with consecutive chunks. Note that these 1406 | functions do no boundary checking of the allocated data whatsoever, so make sure there is enough 1407 | data available in the buffer to be able to go to the next chunk. 1408 | 1409 | unsigned lodepng_chunk_append(unsigned char** out, size_t* outlength, const unsigned char* chunk): 1410 | unsigned lodepng_chunk_create(unsigned char** out, size_t* outlength, unsigned length, 1411 | const char* type, const unsigned char* data): 1412 | 1413 | These functions are used to create new chunks that are appended to the data in *out that has 1414 | length *outlength. The append function appends an existing chunk to the new data. The create 1415 | function creates a new chunk with the given parameters and appends it. Type is the 4-letter 1416 | name of the chunk. 1417 | 1418 | 8.2. chunks in info_png 1419 | ----------------------- 1420 | 1421 | The LodePNGInfo struct contains fields that hold the unknown chunks. It has 3 1422 | buffers (each with size) to contain 3 types of unknown chunks: 1423 | the ones that come before the PLTE chunk, the ones that come between the PLTE 1424 | and the IDAT chunks, and the ones that come after the IDAT chunks. 1425 | It's necessary to make the distinction between these 3 cases because the PNG 1426 | standard requires keeping the order of unknown chunks relative to the critical 1427 | chunks, but does not impose any other ordering rules.
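As a sketch of the boundary checking that the chunk functions leave to the caller, the walk below steps through consecutive chunks by hand, using only the chunk layout from chapter 8 (4 bytes length, 4 bytes type name, the data, 4 bytes CRC). It does not call LodePNG and does not verify the CRC. The names read_be32 and count_chunks are hypothetical helpers made up for this illustration, not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Read a 32-bit big-endian integer, the byte order PNG uses for chunk lengths.
static unsigned long read_be32(const unsigned char* p)
{
  return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16)
       | ((unsigned long)p[2] << 8) | (unsigned long)p[3];
}

// Count the chunks with the given 4-letter type in buffer[start..size).
// Returns (size_t)(-1) if a chunk would run past the end of the buffer.
static size_t count_chunks(const unsigned char* buffer, size_t size,
                           size_t start, const char* type)
{
  size_t pos = start, count = 0;
  assert(strlen(type) == 4);
  while(pos < size)
  {
    unsigned long length;
    // A chunk needs at least 12 bytes: length, type name and CRC.
    if(size - pos < 12) return (size_t)(-1);
    length = read_be32(buffer + pos);
    // The data must fit in what remains after those 12 bytes.
    if(length > size - pos - 12) return (size_t)(-1);
    if(memcmp(buffer + pos + 4, type, 4) == 0) ++count;
    // The total chunk size is the data length + 12.
    pos += 12 + (size_t)length;
  }
  return count;
}
```

For a complete PNG held in memory, start would be 8 to skip the signature. For one of the unknown chunk buffers described in this subchapter, start would be 0 and size the size stored with that buffer.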
1428 | 1429 | info_png.unknown_chunks_data[0] is the chunks before PLTE 1430 | info_png.unknown_chunks_data[1] is the chunks after PLTE, before IDAT 1431 | info_png.unknown_chunks_data[2] is the chunks after IDAT 1432 | 1433 | The chunks in these 3 buffers can be iterated through and read in the same 1434 | way as described in the previous subchapter. 1435 | 1436 | When using the decoder to decode a PNG, you can make it store all unknown chunks 1437 | if you set the option settings.remember_unknown_chunks to 1. By default, this 1438 | option is off (0). 1439 | 1440 | The encoder will always encode unknown chunks that are stored in the info_png. 1441 | If you need it to add a particular chunk that isn't known by LodePNG, you can 1442 | use lodepng_chunk_append or lodepng_chunk_create to append it to the chunk data in 1443 | info_png.unknown_chunks_data[x]. 1444 | 1445 | Chunks that are known by LodePNG should not be added in that way. E.g. to make 1446 | LodePNG add a bKGD chunk, set background_defined to true and add the correct 1447 | parameters there instead. 1448 | 1449 | 1450 | 9. compiler support 1451 | ------------------- 1452 | 1453 | No libraries other than the current standard C library are needed to compile 1454 | LodePNG. For the C++ version, only the standard C++ library is needed on top. 1455 | Add the files lodepng.c(pp) and lodepng.h to your project, include 1456 | lodepng.h where needed, and your program can read/write PNG files. 1457 | 1458 | It is compatible with C90 and up, and C++03 and up. 1459 | 1460 | If performance is important, use optimization when compiling! For both the 1461 | encoder and decoder, this makes a large difference. 1462 | 1463 | Make sure that LodePNG is compiled with the same compiler of the same version 1464 | and with the same settings as the rest of the program, or the interfaces with 1465 | std::vectors and std::strings in C++ can be incompatible.
1466 | 1467 | CHAR_BIT must be 8 or higher, because LodePNG uses unsigned chars for octets. 1468 | 1469 | *) gcc and g++ 1470 | 1471 | LodePNG is developed in gcc so this compiler is natively supported. It gives no 1472 | warnings with compiler options "-Wall -Wextra -pedantic -ansi", with gcc and g++ 1473 | version 4.7.1 on Linux, 32-bit and 64-bit. 1474 | 1475 | *) Clang 1476 | 1477 | Fully supported and warning-free. 1478 | 1479 | *) Mingw 1480 | 1481 | The Mingw compiler (a port of gcc for Windows) should be fully supported by 1482 | LodePNG. 1483 | 1484 | *) Visual Studio and Visual C++ Express Edition 1485 | 1486 | LodePNG should be warning-free with warning level W4. Two warnings were disabled 1487 | with pragmas though: warning 4244 about implicit conversions, and warning 4996 1488 | where it wants to use a non-standard function fopen_s instead of the standard C 1489 | fopen. 1490 | 1491 | Visual Studio may want "stdafx.h" files to be included in each source file and 1492 | give an error "unexpected end of file while looking for precompiled header". 1493 | This is not standard C++ and will not be added to the stock LodePNG. You can 1494 | disable it for lodepng.cpp only by right clicking it, Properties, C/C++, 1495 | Precompiled Headers, and set it to Not Using Precompiled Headers there. 1496 | 1497 | NOTE: Modern versions of VS should be fully supported, but old versions, e.g. 1498 | VS6, are not guaranteed to work. 1499 | 1500 | *) Compilers on Macintosh 1501 | 1502 | LodePNG has been reported to work both with gcc and LLVM for Macintosh, both for 1503 | C and C++. 1504 | 1505 | *) Other Compilers 1506 | 1507 | If you encounter problems on any compilers, feel free to let me know and I may 1508 | try to fix it if the compiler is modern and standards compliant. 1509 | 1510 | 1511 | 10. examples 1512 | ------------ 1513 | 1514 | This decoder example shows the most basic usage of LodePNG. More complex 1515 | examples can be found on the LodePNG website.
1516 | 1517 | 10.1. decoder C++ example 1518 | ------------------------- 1519 | 1520 | #include "lodepng.h" 1521 | #include <iostream> 1522 | 1523 | int main(int argc, char *argv[]) 1524 | { 1525 | const char* filename = argc > 1 ? argv[1] : "test.png"; 1526 | 1527 | //load and decode 1528 | std::vector<unsigned char> image; 1529 | unsigned width, height; 1530 | unsigned error = lodepng::decode(image, width, height, filename); 1531 | 1532 | //if there's an error, display it 1533 | if(error) std::cout << "decoder error " << error << ": " << lodepng_error_text(error) << std::endl; 1534 | 1535 | //the pixels are now in the vector "image", 4 bytes per pixel, ordered RGBARGBA..., use it as texture, draw it, ... 1536 | } 1537 | 1538 | 10.2. decoder C example 1539 | ----------------------- 1540 | 1541 | #include "lodepng.h" 1542 | 1543 | int main(int argc, char *argv[]) 1544 | { 1545 | unsigned error; 1546 | unsigned char* image; 1547 | unsigned width, height; 1548 | const char* filename = argc > 1 ? argv[1] : "test.png"; 1549 | 1550 | error = lodepng_decode32_file(&image, &width, &height, filename); 1551 | 1552 | if(error) printf("decoder error %u: %s\n", error, lodepng_error_text(error)); 1553 | 1554 | / * use image here * / 1555 | 1556 | free(image); 1557 | return 0; 1558 | } 1559 | 1560 | 11.
state settings reference 1561 | ---------------------------- 1562 | 1563 | A quick reference of some settings to set on the LodePNGState 1564 | 1565 | For decoding: 1566 | 1567 | state.decoder.zlibsettings.ignore_adler32: ignore ADLER32 checksums 1568 | state.decoder.zlibsettings.custom_...: use custom inflate function 1569 | state.decoder.ignore_crc: ignore CRC checksums 1570 | state.decoder.color_convert: convert internal PNG color to chosen one 1571 | state.decoder.read_text_chunks: whether to read in text metadata chunks 1572 | state.decoder.remember_unknown_chunks: whether to read in unknown chunks 1573 | state.info_raw.colortype: desired color type for decoded image 1574 | state.info_raw.bitdepth: desired bit depth for decoded image 1575 | state.info_raw....: more color settings, see struct LodePNGColorMode 1576 | state.info_png....: no settings for decoder but output, see struct LodePNGInfo 1577 | 1578 | For encoding: 1579 | 1580 | state.encoder.zlibsettings.btype: disable compression by setting it to 0 1581 | state.encoder.zlibsettings.use_lz77: use LZ77 in compression 1582 | state.encoder.zlibsettings.windowsize: tweak LZ77 windowsize 1583 | state.encoder.zlibsettings.minmatch: tweak min LZ77 length to match 1584 | state.encoder.zlibsettings.nicematch: tweak LZ77 match where to stop searching 1585 | state.encoder.zlibsettings.lazymatching: try one more LZ77 matching 1586 | state.encoder.zlibsettings.custom_...: use custom deflate function 1587 | state.encoder.auto_convert: choose optimal PNG color type, if 0 uses info_png 1588 | state.encoder.filter_palette_zero: PNG filter strategy for palette 1589 | state.encoder.filter_strategy: PNG filter strategy to encode with 1590 | state.encoder.force_palette: add palette even if not encoding to one 1591 | state.encoder.add_id: add LodePNG identifier and version as a text chunk 1592 | state.encoder.text_compression: use compressed text chunks for metadata 1593 | state.info_raw.colortype: color type of raw input
image you provide 1594 | state.info_raw.bitdepth: bit depth of raw input image you provide 1595 | state.info_raw: more color settings, see struct LodePNGColorMode 1596 | state.info_png.color.colortype: desired color type if auto_convert is false 1597 | state.info_png.color.bitdepth: desired bit depth if auto_convert is false 1598 | state.info_png.color....: more color settings, see struct LodePNGColorMode 1599 | state.info_png....: more PNG related settings, see struct LodePNGInfo 1600 | 1601 | 1602 | 12. changes 1603 | ----------- 1604 | 1605 | The version number of LodePNG is the date of the change given in the format 1606 | yyyymmdd. 1607 | 1608 | Some changes aren't backwards compatible. Those are indicated with a (!) 1609 | symbol. 1610 | 1611 | *) 17 sep 2017: fix memory leak for some encoder input error cases 1612 | *) 27 nov 2016: grey+alpha auto color model detection bugfix 1613 | *) 18 apr 2016: Changed qsort to custom stable sort (for platforms w/o qsort). 1614 | *) 09 apr 2016: Fixed colorkey usage detection, and better file loading (within 1615 | the limits of pure C90). 1616 | *) 08 dec 2015: Made load_file function return error if file can't be opened. 1617 | *) 24 okt 2015: Bugfix with decoding to palette output. 1618 | *) 18 apr 2015: Boundary PM instead of just package-merge for faster encoding. 1619 | *) 23 aug 2014: Reduced needless memory usage of decoder. 1620 | *) 28 jun 2014: Removed fix_png setting, always support palette OOB for 1621 | simplicity. Made ColorProfile public. 1622 | *) 09 jun 2014: Faster encoder by fixing hash bug and more zeros optimization. 1623 | *) 22 dec 2013: Power of two windowsize required for optimization. 1624 | *) 15 apr 2013: Fixed bug with LAC_ALPHA and color key. 1625 | *) 25 mar 2013: Added an optional feature to ignore some PNG errors (fix_png). 1626 | *) 11 mar 2013 (!): Bugfix with custom free. 
Changed from "my" to "lodepng_" 1627 | prefix for the custom allocators and made it possible with a new #define to 1628 | use custom ones in your project without needing to change lodepng's code. 1629 | *) 28 jan 2013: Bugfix with color key. 1630 | *) 27 okt 2012: Tweaks in text chunk keyword length error handling. 1631 | *) 8 okt 2012 (!): Added new filter strategy (entropy) and new auto color mode. 1632 | (no palette). Better deflate tree encoding. New compression tweak settings. 1633 | Faster color conversions while decoding. Some internal cleanups. 1634 | *) 23 sep 2012: Reduced warnings in Visual Studio a little bit. 1635 | *) 1 sep 2012 (!): Removed #define's for giving custom (de)compression functions 1636 | and made it work with function pointers instead. 1637 | *) 23 jun 2012: Added more filter strategies. Made it easier to use custom alloc 1638 | and free functions and toggle #defines from compiler flags. Small fixes. 1639 | *) 6 may 2012 (!): Made plugging in custom zlib/deflate functions more flexible. 1640 | *) 22 apr 2012 (!): Made interface more consistent, renaming a lot. Removed 1641 | redundant C++ codec classes. Reduced amount of structs. Everything changed, 1642 | but it is cleaner now imho and functionality remains the same. Also fixed 1643 | several bugs and shrunk the implementation code. Made new samples. 1644 | *) 6 nov 2011 (!): By default, the encoder now automatically chooses the best 1645 | PNG color model and bit depth, based on the amount and type of colors of the 1646 | raw image. For this, autoLeaveOutAlphaChannel replaced by auto_choose_color. 1647 | *) 9 okt 2011: simpler hash chain implementation for the encoder. 1648 | *) 8 sep 2011: lz77 encoder lazy matching instead of greedy matching. 1649 | *) 23 aug 2011: tweaked the zlib compression parameters after benchmarking. 1650 | A bug with the PNG filtertype heuristic was fixed, so that it chooses much 1651 | better ones (it's quite significant). 
A setting to do an experimental, slow, 1652 | brute force search for PNG filter types is added. 1653 | *) 17 aug 2011 (!): changed some C zlib related function names. 1654 | *) 16 aug 2011: made the code less wide (max 120 characters per line). 1655 | *) 17 apr 2011: code cleanup. Bugfixes. Convert low to 16-bit per sample colors. 1656 | *) 21 feb 2011: fixed compiling for C90. Fixed compiling with sections disabled. 1657 | *) 11 dec 2010: encoding is made faster, based on suggestion by Peter Eastman 1658 | to optimize long sequences of zeros. 1659 | *) 13 nov 2010: added LodePNG_InfoColor_hasPaletteAlpha and 1660 | LodePNG_InfoColor_canHaveAlpha functions for convenience. 1661 | *) 7 nov 2010: added LodePNG_error_text function to get error code description. 1662 | *) 30 okt 2010: made decoding slightly faster 1663 | *) 26 okt 2010: (!) changed some C function and struct names (more consistent). 1664 | Reorganized the documentation and the declaration order in the header. 1665 | *) 08 aug 2010: only changed some comments and external samples. 1666 | *) 05 jul 2010: fixed bug thanks to warnings in the new gcc version. 1667 | *) 14 mar 2010: fixed bug where too much memory was allocated for char buffers. 1668 | *) 02 sep 2008: fixed bug where it could create empty tree that linux apps could 1669 | read by ignoring the problem but windows apps couldn't. 1670 | *) 06 jun 2008: added more error checks for out of memory cases. 1671 | *) 26 apr 2008: added a few more checks here and there to ensure more safety. 1672 | *) 06 mar 2008: crash with encoding of strings fixed 1673 | *) 02 feb 2008: support for international text chunks added (iTXt) 1674 | *) 23 jan 2008: small cleanups, and #defines to divide code in sections 1675 | *) 20 jan 2008: support for unknown chunks allowing using LodePNG for an editor. 1676 | *) 18 jan 2008: support for tIME and pHYs chunks added to encoder and decoder. 
1677 | *) 17 jan 2008: ability to encode and decode compressed zTXt chunks added 1678 | Also various fixes, such as in the deflate and the padding bits code. 1679 | *) 13 jan 2008: Added ability to encode Adam7-interlaced images. Improved 1680 | filtering code of encoder. 1681 | *) 07 jan 2008: (!) changed LodePNG to use ISO C90 instead of C++. A 1682 | C++ wrapper around this provides an interface almost identical to before. 1683 | Having LodePNG be pure ISO C90 makes it more portable. The C and C++ code 1684 | are together in these files but it works both for C and C++ compilers. 1685 | *) 29 dec 2007: (!) changed most integer types to unsigned int + other tweaks 1686 | *) 30 aug 2007: bug fixed which makes this Borland C++ compatible 1687 | *) 09 aug 2007: some VS2005 warnings removed again 1688 | *) 21 jul 2007: deflate code placed in new namespace separate from zlib code 1689 | *) 08 jun 2007: fixed bug with 2- and 4-bit color, and small interlaced images 1690 | *) 04 jun 2007: improved support for Visual Studio 2005: crash with accessing 1691 | invalid std::vector element [0] fixed, and level 3 and 4 warnings removed 1692 | *) 02 jun 2007: made the encoder add a tag with version by default 1693 | *) 27 may 2007: zlib and png code separated (but still in the same file), 1694 | simple encoder/decoder functions added for more simple usage cases 1695 | *) 19 may 2007: minor fixes, some code cleaning, new error added (error 69), 1696 | moved some examples from here to lodepng_examples.cpp 1697 | *) 12 may 2007: palette decoding bug fixed 1698 | *) 24 apr 2007: changed the license from BSD to the zlib license 1699 | *) 11 mar 2007: very simple addition: ability to encode bKGD chunks. 1700 | *) 04 mar 2007: (!) tEXt chunk related fixes, and support for encoding 1701 | palettized PNG images. Plus little interface change with palette and texts. 1702 | *) 03 mar 2007: Made it encode dynamic Huffman shorter with repeat codes. 
1703 | Fixed a bug where the end code of a block had length 0 in the Huffman tree. 1704 | *) 26 feb 2007: Huffman compression with dynamic trees (BTYPE 2) now implemented 1705 | and supported by the encoder, resulting in smaller PNGs at the output. 1706 | *) 27 jan 2007: Made the Adler-32 test faster so that a timewaste is gone. 1707 | *) 24 jan 2007: gave encoder an error interface. Added color conversion from any 1708 | greyscale type to 8-bit greyscale with or without alpha. 1709 | *) 21 jan 2007: (!) Totally changed the interface. It allows more color types 1710 | to convert to and is more uniform. See the manual for how it works now. 1711 | *) 07 jan 2007: Some cleanup & fixes, and a few changes over the last days: 1712 | encode/decode custom tEXt chunks, separate classes for zlib & deflate, and 1713 | at last made the decoder give errors for incorrect Adler32 or Crc. 1714 | *) 01 jan 2007: Fixed bug with encoding PNGs with less than 8 bits per channel. 1715 | *) 29 dec 2006: Added support for encoding images without alpha channel, and 1716 | cleaned out code as well as making certain parts faster. 1717 | *) 28 dec 2006: Added "Settings" to the encoder. 1718 | *) 26 dec 2006: The encoder now does LZ77 encoding and produces much smaller files now. 1719 | Removed some code duplication in the decoder. Fixed little bug in an example. 1720 | *) 09 dec 2006: (!) Placed output parameters of public functions as first parameter. 1721 | Fixed a bug of the decoder with 16-bit per color. 1722 | *) 15 okt 2006: Changed documentation structure 1723 | *) 09 okt 2006: Encoder class added. It encodes a valid PNG image from the 1724 | given image buffer, however for now it's not compressed. 1725 | *) 08 sep 2006: (!) Changed to interface with a Decoder class 1726 | *) 30 jul 2006: (!) LodePNG_InfoPng , width and height are now retrieved in different 1727 | way. Renamed decodePNG to decodePNGGeneric. 1728 | *) 29 jul 2006: (!) 
Changed the interface: image info is now returned as a 1729 | struct of type LodePNG::LodePNG_Info, instead of a vector, which was a bit clumsy. 1730 | *) 28 jul 2006: Cleaned the code and added new error checks. 1731 | Corrected terminology "deflate" into "inflate". 1732 | *) 23 jun 2006: Added SDL example in the documentation in the header, this 1733 | example allows easy debugging by displaying the PNG and its transparency. 1734 | *) 22 jun 2006: (!) Changed way to obtain error value. Added 1735 | loadFile function for convenience. Made decodePNG32 faster. 1736 | *) 21 jun 2006: (!) Changed type of info vector to unsigned. 1737 | Changed position of palette in info vector. Fixed an important bug that 1738 | happened on PNGs with an uncompressed block. 1739 | *) 16 jun 2006: Internally changed unsigned into unsigned where 1740 | needed, and performed some optimizations. 1741 | *) 07 jun 2006: (!) Renamed functions to decodePNG and placed them 1742 | in LodePNG namespace. Changed the order of the parameters. Rewrote the 1743 | documentation in the header. Renamed files to lodepng.cpp and lodepng.h 1744 | *) 22 apr 2006: Optimized and improved some code 1745 | *) 07 sep 2005: (!) Changed to std::vector interface 1746 | *) 12 aug 2005: Initial release (C++, decoder only) 1747 | 1748 | 1749 | 13. contact information 1750 | ----------------------- 1751 | 1752 | Feel free to contact me with suggestions, problems, comments, ... concerning 1753 | LodePNG. If you encounter a PNG image that doesn't work properly with this 1754 | decoder, feel free to send it and I'll use it to find and fix the problem. 1755 | 1756 | My email address is (puzzle the account and domain together with an @ symbol): 1757 | Domain: gmail dot com. 1758 | Account: lode dot vandevenne. 1759 | 1760 | 1761 | Copyright (c) 2005-2017 Lode Vandevenne 1762 | */ 1763 | --------------------------------------------------------------------------------