├── .gitignore
├── LICENSE
├── MANIFEST
├── README.md
├── VERSION
├── camera_utils.py
├── data
    └── .gitignore
├── mesh_renderer.py
├── mesh_renderer
    ├── kernels
    │   ├── rasterize_triangles_grad.cc
    │   ├── rasterize_triangles_impl.cc
    │   ├── rasterize_triangles_impl.h
    │   ├── rasterize_triangles_impl_test.cc
    │   └── rasterize_triangles_op.cc
    ├── mesh_renderer_test.py
    ├── rasterize_triangles_test.py
    ├── test_data
    │   ├── BUILD
    │   ├── Barycentrics_Cube.png
    │   ├── Colored_Cube_0.png
    │   ├── Colored_Cube_1.png
    │   ├── External_Triangle.png
    │   ├── Gray_Cube_0.png
    │   ├── Gray_Cube_1.png
    │   ├── Inside_Box.png
    │   ├── Perspective_Corrected_Triangle.png
    │   ├── Simple_Tetrahedron.png
    │   ├── Simple_Triangle.png
    │   └── Unlit_Cube_0.png
    └── test_utils.py
├── notebooks
    └── FirstCrack.ipynb
├── rasterize_triangles.py
├── setup.py
└── third_party
    ├── lodepng.cpp
    └── lodepng.h


/.gitignore:
--------------------------------------------------------------------------------
1 | bazel*
2 | build
3 | *.egg-info
4 | .ipynb_checkpoints
5 | dist
6 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
  1 | 
  2 |                                  Apache License
  3 |                            Version 2.0, January 2004
  4 |                         http://www.apache.org/licenses/
  5 | 
  6 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
  7 | 
  8 |    1. Definitions.
  9 | 
 10 |       "License" shall mean the terms and conditions for use, reproduction,
 11 |       and distribution as defined by Sections 1 through 9 of this document.
 12 | 
 13 |       "Licensor" shall mean the copyright owner or entity authorized by
 14 |       the copyright owner that is granting the License.
 15 | 
 16 |       "Legal Entity" shall mean the union of the acting entity and all
 17 |       other entities that control, are controlled by, or are under common
 18 |       control with that entity. For the purposes of this definition,
 19 |       "control" means (i) the power, direct or indirect, to cause the
 20 |       direction or management of such entity, whether by contract or
 21 |       otherwise, or (ii) ownership of fifty percent (50%) or more of the
 22 |       outstanding shares, or (iii) beneficial ownership of such entity.
 23 | 
 24 |       "You" (or "Your") shall mean an individual or Legal Entity
 25 |       exercising permissions granted by this License.
 26 | 
 27 |       "Source" form shall mean the preferred form for making modifications,
 28 |       including but not limited to software source code, documentation
 29 |       source, and configuration files.
 30 | 
 31 |       "Object" form shall mean any form resulting from mechanical
 32 |       transformation or translation of a Source form, including but
 33 |       not limited to compiled object code, generated documentation,
 34 |       and conversions to other media types.
 35 | 
 36 |       "Work" shall mean the work of authorship, whether in Source or
 37 |       Object form, made available under the License, as indicated by a
 38 |       copyright notice that is included in or attached to the work
 39 |       (an example is provided in the Appendix below).
 40 | 
 41 |       "Derivative Works" shall mean any work, whether in Source or Object
 42 |       form, that is based on (or derived from) the Work and for which the
 43 |       editorial revisions, annotations, elaborations, or other modifications
 44 |       represent, as a whole, an original work of authorship. For the purposes
 45 |       of this License, Derivative Works shall not include works that remain
 46 |       separable from, or merely link (or bind by name) to the interfaces of,
 47 |       the Work and Derivative Works thereof.
 48 | 
 49 |       "Contribution" shall mean any work of authorship, including
 50 |       the original version of the Work and any modifications or additions
 51 |       to that Work or Derivative Works thereof, that is intentionally
 52 |       submitted to Licensor for inclusion in the Work by the copyright owner
 53 |       or by an individual or Legal Entity authorized to submit on behalf of
 54 |       the copyright owner. For the purposes of this definition, "submitted"
 55 |       means any form of electronic, verbal, or written communication sent
 56 |       to the Licensor or its representatives, including but not limited to
 57 |       communication on electronic mailing lists, source code control systems,
 58 |       and issue tracking systems that are managed by, or on behalf of, the
 59 |       Licensor for the purpose of discussing and improving the Work, but
 60 |       excluding communication that is conspicuously marked or otherwise
 61 |       designated in writing by the copyright owner as "Not a Contribution."
 62 | 
 63 |       "Contributor" shall mean Licensor and any individual or Legal Entity
 64 |       on behalf of whom a Contribution has been received by Licensor and
 65 |       subsequently incorporated within the Work.
 66 | 
 67 |    2. Grant of Copyright License. Subject to the terms and conditions of
 68 |       this License, each Contributor hereby grants to You a perpetual,
 69 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
 70 |       copyright license to reproduce, prepare Derivative Works of,
 71 |       publicly display, publicly perform, sublicense, and distribute the
 72 |       Work and such Derivative Works in Source or Object form.
 73 | 
 74 |    3. Grant of Patent License. Subject to the terms and conditions of
 75 |       this License, each Contributor hereby grants to You a perpetual,
 76 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
 77 |       (except as stated in this section) patent license to make, have made,
 78 |       use, offer to sell, sell, import, and otherwise transfer the Work,
 79 |       where such license applies only to those patent claims licensable
 80 |       by such Contributor that are necessarily infringed by their
 81 |       Contribution(s) alone or by combination of their Contribution(s)
 82 |       with the Work to which such Contribution(s) was submitted. If You
 83 |       institute patent litigation against any entity (including a
 84 |       cross-claim or counterclaim in a lawsuit) alleging that the Work
 85 |       or a Contribution incorporated within the Work constitutes direct
 86 |       or contributory patent infringement, then any patent licenses
 87 |       granted to You under this License for that Work shall terminate
 88 |       as of the date such litigation is filed.
 89 | 
 90 |    4. Redistribution. You may reproduce and distribute copies of the
 91 |       Work or Derivative Works thereof in any medium, with or without
 92 |       modifications, and in Source or Object form, provided that You
 93 |       meet the following conditions:
 94 | 
 95 |       (a) You must give any other recipients of the Work or
 96 |           Derivative Works a copy of this License; and
 97 | 
 98 |       (b) You must cause any modified files to carry prominent notices
 99 |           stating that You changed the files; and
100 | 
101 |       (c) You must retain, in the Source form of any Derivative Works
102 |           that You distribute, all copyright, patent, trademark, and
103 |           attribution notices from the Source form of the Work,
104 |           excluding those notices that do not pertain to any part of
105 |           the Derivative Works; and
106 | 
107 |       (d) If the Work includes a "NOTICE" text file as part of its
108 |           distribution, then any Derivative Works that You distribute must
109 |           include a readable copy of the attribution notices contained
110 |           within such NOTICE file, excluding those notices that do not
111 |           pertain to any part of the Derivative Works, in at least one
112 |           of the following places: within a NOTICE text file distributed
113 |           as part of the Derivative Works; within the Source form or
114 |           documentation, if provided along with the Derivative Works; or,
115 |           within a display generated by the Derivative Works, if and
116 |           wherever such third-party notices normally appear. The contents
117 |           of the NOTICE file are for informational purposes only and
118 |           do not modify the License. You may add Your own attribution
119 |           notices within Derivative Works that You distribute, alongside
120 |           or as an addendum to the NOTICE text from the Work, provided
121 |           that such additional attribution notices cannot be construed
122 |           as modifying the License.
123 | 
124 |       You may add Your own copyright statement to Your modifications and
125 |       may provide additional or different license terms and conditions
126 |       for use, reproduction, or distribution of Your modifications, or
127 |       for any such Derivative Works as a whole, provided Your use,
128 |       reproduction, and distribution of the Work otherwise complies with
129 |       the conditions stated in this License.
130 | 
131 |    5. Submission of Contributions. Unless You explicitly state otherwise,
132 |       any Contribution intentionally submitted for inclusion in the Work
133 |       by You to the Licensor shall be under the terms and conditions of
134 |       this License, without any additional terms or conditions.
135 |       Notwithstanding the above, nothing herein shall supersede or modify
136 |       the terms of any separate license agreement you may have executed
137 |       with Licensor regarding such Contributions.
138 | 
139 |    6. Trademarks. This License does not grant permission to use the trade
140 |       names, trademarks, service marks, or product names of the Licensor,
141 |       except as required for reasonable and customary use in describing the
142 |       origin of the Work and reproducing the content of the NOTICE file.
143 | 
144 |    7. Disclaimer of Warranty. Unless required by applicable law or
145 |       agreed to in writing, Licensor provides the Work (and each
146 |       Contributor provides its Contributions) on an "AS IS" BASIS,
147 |       WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
148 |       implied, including, without limitation, any warranties or conditions
149 |       of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
150 |       PARTICULAR PURPOSE. You are solely responsible for determining the
151 |       appropriateness of using or redistributing the Work and assume any
152 |       risks associated with Your exercise of permissions under this License.
153 | 
154 |    8. Limitation of Liability. In no event and under no legal theory,
155 |       whether in tort (including negligence), contract, or otherwise,
156 |       unless required by applicable law (such as deliberate and grossly
157 |       negligent acts) or agreed to in writing, shall any Contributor be
158 |       liable to You for damages, including any direct, indirect, special,
159 |       incidental, or consequential damages of any character arising as a
160 |       result of this License or out of the use or inability to use the
161 |       Work (including but not limited to damages for loss of goodwill,
162 |       work stoppage, computer failure or malfunction, or any and all
163 |       other commercial damages or losses), even if such Contributor
164 |       has been advised of the possibility of such damages.
165 | 
166 |    9. Accepting Warranty or Additional Liability. While redistributing
167 |       the Work or Derivative Works thereof, You may choose to offer,
168 |       and charge a fee for, acceptance of support, warranty, indemnity,
169 |       or other liability obligations and/or rights consistent with this
170 |       License. However, in accepting such obligations, You may act only
171 |       on Your own behalf and on Your sole responsibility, not on behalf
172 |       of any other Contributor, and only if You agree to indemnify,
173 |       defend, and hold each Contributor harmless for any liability
174 |       incurred by, or claims asserted against, such Contributor by reason
175 |       of your accepting any such warranty or additional liability.
176 | 
177 |    END OF TERMS AND CONDITIONS
178 | 
179 |    APPENDIX: How to apply the Apache License to your work.
180 | 
181 |       To apply the Apache License to your work, attach the following
182 |       boilerplate notice, with the fields enclosed by brackets "[]"
183 |       replaced with your own identifying information. (Don't include
184 |       the brackets!)  The text should be enclosed in the appropriate
185 |       comment syntax for the file format. We also recommend that a
186 |       file or class name and description of purpose be included on the
187 |       same "printed page" as the copyright notice for easier
188 |       identification within third-party archives.
189 | 
190 |    Copyright [yyyy] [name of copyright owner]
191 | 
192 |    Licensed under the Apache License, Version 2.0 (the "License");
193 |    you may not use this file except in compliance with the License.
194 |    You may obtain a copy of the License at
195 | 
196 |        http://www.apache.org/licenses/LICENSE-2.0
197 | 
198 |    Unless required by applicable law or agreed to in writing, software
199 |    distributed under the License is distributed on an "AS IS" BASIS,
200 |    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
201 |    See the License for the specific language governing permissions and
202 |    limitations under the License.
203 | 


--------------------------------------------------------------------------------
/MANIFEST:
--------------------------------------------------------------------------------
1 | # file GENERATED by distutils, do NOT edit
2 | camera_utils.py
3 | mesh_renderer.py
4 | rasterize_triangles.py
5 | setup.py
6 | mesh_renderer/kernels/rasterize_triangles_grad.cc
7 | mesh_renderer/kernels/rasterize_triangles_impl.cc
8 | mesh_renderer/kernels/rasterize_triangles_op.cc
9 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # TF Mesh Renderer
  2 | 
  3 | This is a differentiable, 3D mesh renderer using TensorFlow.
  4 | [Original repository](https://github.com/google/tf_mesh_renderer).
  5 | 
  6 | This fork publishes it to PyPI and removes Bazel as an installation dependency
  7 | (e.g. just use `python3 setup.py install`).
  8 | 
  9 | ### Installation
 10 | ```
 11 | pip install mesh_renderer
 12 | ```
 13 | 
 14 | ### Usage
 15 | 
 16 | ```
 17 | import tensorflow as tf
 18 | import camera_utils
 19 | import mesh_renderer
 20 | 
 21 | # load your geometry (this is a cube):
 22 | vertices = tf.constant([[-1, -1, 1], [-1, -1, -1], [-1, 1, -1], [-1, 1, 1],
 23 |                         [1, -1, 1], [1, -1, -1], [1, 1, -1], [1, 1, 1]], dtype=tf.float32)
 24 | triangles = tf.constant(
 25 |     [[0, 1, 2], [2, 3, 0], [3, 2, 6], [6, 7, 3], [7, 6, 5], [5, 4, 7],
 26 |      [4, 5, 1], [1, 0, 4], [5, 6, 2], [2, 1, 5], [7, 4, 0], [0, 3, 7]], dtype=tf.int32)
 27 | normals = tf.nn.l2_normalize(vertices, dim=1)
 28 | num_vertices = vertices.get_shape()[0].value
 29 | 
 30 | # rotate the geometry:
 31 | angles = [[-1.16, 0.00, 3.48]]
 32 | model_rotation = camera_utils.euler_matrices(angles)[0, :3, :3]
 33 | # camera, lighting, and image size (640x480 is an example value, not from the original):
 34 | image_width, image_height = 640, 480
 35 | eye = tf.constant([[0.0, 0.0, 6.0]], dtype=tf.float32)
 36 | lightbulb = tf.constant([[0.0, 0.0, 6.0]], dtype=tf.float32)
 37 | center = tf.constant([[0.0, 0.0, 0.0]], dtype=tf.float32)
 38 | world_up = tf.constant([[0.0, 1.0, 0.0]], dtype=tf.float32)
 39 | vertex_diffuse_colors = tf.ones([1, num_vertices, 3], dtype=tf.float32)
 40 | light_positions = tf.expand_dims(lightbulb, axis=0)
 41 | light_intensities = tf.ones([1, 1, 3], dtype=tf.float32)
 42 | ambient_color = tf.constant([[0.0, 0.0, 0.0]])
 43 | 
 44 | vertex_positions = tf.reshape(
 45 |     tf.matmul(vertices, model_rotation, transpose_b=True), [1, num_vertices, 3])
 46 | desired_normals = tf.reshape(
 47 |     tf.matmul(normals, model_rotation, transpose_b=True), [1, num_vertices, 3])
 48 | 
 49 | # render is a tf.Tensor of shape [1, image_height, image_width, 4] (r, g, b, a);
 50 | # you can backpropagate through it.
 51 | render = mesh_renderer.mesh_renderer(
 52 |     vertex_positions, triangles, desired_normals, vertex_diffuse_colors,
 53 |     eye, center, world_up, light_positions, light_intensities,
 54 |     image_width, image_height, ambient_color=ambient_color)
 55 | ```
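
For reference, the rotation used in the snippet above can be written out in plain Python. This is a minimal sketch that copies the upper-left 3x3 block of the matrix built in `camera_utils.euler_matrices`; the names `euler_rotation_3x3` and `rotate` are illustrative and are not part of the package.

```python
import math


def euler_rotation_3x3(x_angle, y_angle, z_angle):
    """3x3 rotation for X, Y, Z angles in radians (same terms as camera_utils)."""
    s0, s1, s2 = math.sin(x_angle), math.sin(y_angle), math.sin(z_angle)
    c0, c1, c2 = math.cos(x_angle), math.cos(y_angle), math.cos(z_angle)
    return [
        [c2 * c1, c2 * s1 * s0 - c0 * s2, s2 * s0 + c2 * c0 * s1],
        [c1 * s2, c2 * c0 + s2 * s1 * s0, c0 * s2 * s1 - c2 * s0],
        [-s1, c1 * s0, c1 * c0],
    ]


def rotate(matrix, point):
    """Applies the matrix to a column vector, i.e. computes matrix @ point."""
    return [sum(m * p for m, p in zip(row, point)) for row in matrix]


# A quarter turn about X carries the +Y axis onto the +Z axis
# (up to floating-point rounding).
rotated = rotate(euler_rotation_3x3(math.pi / 2, 0.0, 0.0), [0.0, 1.0, 0.0])
```

Multiplying the row-major vertex array by the transposed matrix, as the snippet does with `transpose_b=True`, applies this same `matrix @ point` product to every vertex.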
 56 | 
 57 | 
 58 | # Original Readme
 59 | 
 60 | This is a differentiable, 3D mesh renderer using TensorFlow.
 61 | 
 62 | This is not an official Google product.
 63 | 
 64 | The interface to the renderer is provided by mesh_renderer.py and
 65 | rasterize_triangles.py, which provide TensorFlow Ops that can be added to a
 66 | TensorFlow graph. The internals of the renderer are handled by a C++ kernel.
 67 | 
 68 | The input to the C++ rendering kernel is a list of 3D vertices and a list of
 69 | triangles, where a triangle consists of a list of three vertex ids. The
 70 | output of the renderer is a pair of images containing triangle ids and
 71 | barycentric weights. Pixel values in the barycentric weight image are the
 72 | weights of the pixel center point with respect to the triangle at that pixel
 73 | (identified by the triangle id). The renderer provides derivatives of the
 74 | barycentric weights of the pixel centers with respect to the vertex
 75 | positions.
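
As an illustration of what the barycentric weight image stores, here is a minimal screen-space sketch for a single pixel center. It ignores perspective correction and is not the C++ kernel's code; `edge`, `barycentric_weights`, and `interpolate` are illustrative names.

```python
def edge(a, b, c):
    """Twice the signed area of the 2D triangle (a, b, c)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def barycentric_weights(triangle, point):
    """Weights of `point` with respect to the three 2D vertices of `triangle`."""
    v0, v1, v2 = triangle
    area = edge(v0, v1, v2)
    return (edge(v1, v2, point) / area,
            edge(v2, v0, point) / area,
            edge(v0, v1, point) / area)


def interpolate(weights, attributes):
    """Blends three per-vertex attribute vectors using the weights."""
    size = len(attributes[0])
    return [sum(w * attr[i] for w, attr in zip(weights, attributes))
            for i in range(size)]


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
weights = barycentric_weights(triangle, (1.0, 1.0))
vertex_colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
pixel_color = interpolate(weights, vertex_colors)
```

Because the weights are smooth functions of the vertex positions, any attribute interpolated this way is differentiable with respect to those positions.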
 76 | 
 77 | Any approximation error stems from the assumption that the triangle id at a
 78 | pixel does not change as the vertices are moved. This is a reasonable
 79 | approximation for small changes in vertex position. Even when the triangle id
 80 | does change, the derivatives will be computed by extrapolating the barycentric
 81 | weights of a neighboring triangle, which will produce a good approximation if
 82 | the mesh is smooth. The main source of error occurs at occlusion boundaries, and
 83 | particularly at the edge of an open mesh, where the background appears opposite
 84 | the triangle's edge.
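
The fixed-triangle-id assumption can be illustrated with a small finite-difference sketch: the triangle assigned to a pixel is held constant while one vertex moves, and the weights are simply extrapolated if the pixel ends up outside the triangle. This is a 2D illustration with hypothetical helper names, not the gradient kernel itself.

```python
def edge(a, b, c):
    """Twice the signed area of the 2D triangle (a, b, c)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def barycentric_weights(triangle, point):
    v0, v1, v2 = triangle
    area = edge(v0, v1, v2)
    return (edge(v1, v2, point) / area,
            edge(v2, v0, point) / area,
            edge(v0, v1, point) / area)


def d_weights_d_vertex_x(triangle, point, vertex_index, eps=1e-6):
    """Central difference of the weights w.r.t. one vertex's x coordinate.

    The triangle covering the pixel is assumed not to change.
    """
    def shifted(delta):
        moved = [list(v) for v in triangle]
        moved[vertex_index][0] += delta
        return barycentric_weights(moved, point)

    plus, minus = shifted(eps), shifted(-eps)
    return tuple((p - m) / (2.0 * eps) for p, m in zip(plus, minus))


pixel = (1.0, 1.0)
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
gradient = d_weights_d_vertex_x(triangle, pixel, vertex_index=1)

# After a large move the pixel lies outside the triangle; the weights are
# extrapolated, so at least one of them becomes negative.
extrapolated = barycentric_weights([(0.0, 0.0), (0.5, 0.0), (0.0, 4.0)], pixel)
```

For this triangle the weight of vertex 1 at the pixel is `1 / x1`, so its derivative at `x1 = 4` is `-1/16`, which the finite difference reproduces.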
 85 | 
 86 | The algorithm implemented is described by Olano and Greer, "Triangle Scan
 87 | Conversion using 2D Homogeneous Coordinates," HWWS 1997.
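
The idea behind rasterizing in 2D homogeneous coordinates can be sketched as follows: a pixel `(px, py, 1)` is expressed as a combination of the three clip-space vertices `(x, y, w)`, and normalizing the coefficients gives perspective-correct barycentric weights. This is an illustration of the technique under that reading of the paper, not a transcription of `rasterize_triangles_impl.cc`; all function names are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return dot(a, cross(b, c))


def perspective_correct_barycentrics(clip_vertices, pixel):
    """Returns normalized weights, or None if the pixel is not covered.

    clip_vertices: three (x, y, w) tuples before the perspective divide.
    pixel: (x, y) in the same 2D space after the divide.
    """
    v0, v1, v2 = clip_vertices
    p = (pixel[0], pixel[1], 1.0)
    det = det3(v0, v1, v2)
    if det == 0.0:
        return None  # Degenerate or edge-on triangle.
    # Cramer's rule for l0 * v0 + l1 * v1 + l2 * v2 = p.
    coeffs = (det3(p, v1, v2) / det,
              det3(v0, p, v2) / det,
              det3(v0, v1, p) / det)
    if min(coeffs) < 0.0:
        return None  # Pixel lies outside the triangle.
    total = sum(coeffs)
    return tuple(c / total for c in coeffs)


# Vertex 1 is twice as far away (w = 2) and projects to (1, 0).
far_edge = [(0.0, 0.0, 1.0), (2.0, 0.0, 2.0), (0.0, 1.0, 1.0)]
midpoint_weights = perspective_correct_barycentrics(far_edge, (0.5, 0.0))
```

At the screen-space midpoint of the edge between vertices 0 and 1, the nearer vertex receives weight 2/3 rather than 1/2, which is the perspective correction at work.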
 88 | 
 89 | How to Build
 90 | ------------
 91 | 
 92 | Follow the instructions to [install TensorFlow using virtualenv](https://www.tensorflow.org/install/install_linux#installing_with_virtualenv).
 93 | 
 94 | Build and run tests using Bazel from inside the (tensorflow) virtualenv:
 95 | 
 96 | `(tensorflow)$ ./runtests.sh`
 97 | 
 98 | The script calls the Bazel rules using the Python interpreter at
 99 | `$VIRTUAL_ENV/bin/python`. If you aren't using virtualenv, `bazel test ...` may
100 | be sufficient.
101 | 
102 | Citation
103 | --------
104 | 
105 | If you use this renderer in your research, please cite [this paper](http://openaccess.thecvf.com/content_cvpr_2018/html/Genova_Unsupervised_Training_for_CVPR_2018_paper.html "CVF Version"):
106 | 
107 | *Unsupervised Training for 3D Morphable Model Regression*. Kyle Genova, Forrester Cole, Aaron Maschinot, Aaron Sarna, Daniel Vlasic, and William T. Freeman. CVPR 2018, pp. 8377-8386.
108 | 
109 | ```
110 | @InProceedings{Genova_2018_CVPR,
111 |   author = {Genova, Kyle and Cole, Forrester and Maschinot, Aaron and Sarna, Aaron and Vlasic, Daniel and Freeman, William T.},
112 |   title = {Unsupervised Training for 3D Morphable Model Regression},
113 |   booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
114 |   month = {June},
115 |   year = {2018}
116 | }
117 | ```
118 | 
119 | 
120 | Bust of Sappho: https://cdn.thingiverse.com/zipfiles/ac/39/53/07/80/Bust_of_Sappho_.zip
121 | 
122 | 


--------------------------------------------------------------------------------
/VERSION:
--------------------------------------------------------------------------------
1 | 1.0
2 | 


--------------------------------------------------------------------------------
/camera_utils.py:
--------------------------------------------------------------------------------
  1 | # Copyright 2017 Google LLC
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     https://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | 
 15 | """Collection of TF functions for managing 3D camera matrices."""
 16 | 
 17 | from __future__ import absolute_import
 18 | from __future__ import division
 19 | from __future__ import print_function
 20 | 
 21 | import math
 22 | import tensorflow as tf
 23 | 
 24 | 
 25 | def perspective(aspect_ratio, fov_y, near_clip, far_clip):
 26 |   """Computes perspective transformation matrices.
 27 | 
 28 |   Functionality mimics gluPerspective (third_party/GL/glu/include/GLU/glu.h).
 29 | 
 30 |   Args:
 31 |     aspect_ratio: float value specifying the image aspect ratio (width/height).
 32 |     fov_y: 1-D float32 Tensor with shape [batch_size] specifying output vertical
 33 |         fields of view in degrees.
 34 |     near_clip: 1-D float32 Tensor with shape [batch_size] specifying near
 35 |         clipping plane distance.
 36 |     far_clip: 1-D float32 Tensor with shape [batch_size] specifying far clipping
 37 |         plane distance.
 38 | 
 39 |   Returns:
 40 |     A [batch_size, 4, 4] float tensor that maps from right-handed points in eye
 41 |     space to left-handed points in clip space.
 42 |   """
 43 |   # The multiplication of fov_y by pi/360.0 simultaneously converts to radians
 44 |   # and adds the half-angle factor of .5.
 45 |   focal_lengths_y = 1.0 / tf.tan(fov_y * (math.pi / 360.0))
 46 |   depth_range = far_clip - near_clip
 47 |   p_22 = -(far_clip + near_clip) / depth_range
 48 |   p_23 = -2.0 * (far_clip * near_clip / depth_range)
 49 | 
 50 |   zeros = tf.zeros_like(p_23, dtype=tf.float32)
 51 |   # pyformat: disable
 52 |   perspective_transform = tf.concat(
 53 |       [
 54 |           focal_lengths_y / aspect_ratio, zeros, zeros, zeros,
 55 |           zeros, focal_lengths_y, zeros, zeros,
 56 |           zeros, zeros, p_22, p_23,
 57 |           zeros, zeros, -tf.ones_like(p_23, dtype=tf.float32), zeros
 58 |       ], axis=0)
 59 |   # pyformat: enable
 60 |   perspective_transform = tf.reshape(perspective_transform, [4, 4, -1])
 61 |   return tf.transpose(perspective_transform, [2, 0, 1])
 62 | 
 63 | 
 64 | def look_at(eye, center, world_up):
 65 |   """Computes camera viewing matrices.
 66 | 
 67 |   Functionality mimics gluLookAt (third_party/GL/glu/include/GLU/glu.h).
 68 | 
 69 |   Args:
 70 |     eye: 2-D float32 tensor with shape [batch_size, 3] containing the XYZ world
 71 |         space position of the camera.
 72 |     center: 2-D float32 tensor with shape [batch_size, 3] containing a position
 73 |         along the center of the camera's gaze.
 74 |     world_up: 2-D float32 tensor with shape [batch_size, 3] specifying the
 75 |         world's up direction; the output camera will have no tilt with respect
 76 |         to this direction.
 77 | 
 78 |   Returns:
 79 |     A [batch_size, 4, 4] float tensor containing a right-handed camera
 80 |     extrinsics matrix that maps points from world space to points in eye space.
 81 |   """
 82 |   batch_size = center.shape[0].value
 83 |   vector_degeneracy_cutoff = 1e-6
 84 |   forward = center - eye
 85 |   forward_norm = tf.norm(forward, ord='euclidean', axis=1, keepdims=True)
 86 |   tf.assert_greater(
 87 |       forward_norm,
 88 |       vector_degeneracy_cutoff,
 89 |       message='Camera matrix is degenerate because eye and center are close.')
 90 |   forward = tf.divide(forward, forward_norm)
 91 | 
 92 |   to_side = tf.cross(forward, world_up)
 93 |   to_side_norm = tf.norm(to_side, ord='euclidean', axis=1, keepdims=True)
 94 |   tf.assert_greater(
 95 |       to_side_norm,
 96 |       vector_degeneracy_cutoff,
 97 |       message='Camera matrix is degenerate because up and gaze are close or'
 98 |       ' because up is degenerate.')
 99 |   to_side = tf.divide(to_side, to_side_norm)
100 |   cam_up = tf.cross(to_side, forward)
101 | 
102 |   w_column = tf.constant(
103 |       batch_size * [[0., 0., 0., 1.]], dtype=tf.float32)  # [batch_size, 4]
104 |   w_column = tf.reshape(w_column, [batch_size, 4, 1])
105 |   view_rotation = tf.stack(
106 |       [to_side, cam_up, -forward,
107 |        tf.zeros_like(to_side, dtype=tf.float32)],
108 |       axis=1)  # [batch_size, 4, 3] matrix
109 |   view_rotation = tf.concat(
110 |       [view_rotation, w_column], axis=2)  # [batch_size, 4, 4]
111 | 
112 |   identity_batch = tf.tile(tf.expand_dims(tf.eye(3), 0), [batch_size, 1, 1])
113 |   view_translation = tf.concat([identity_batch, tf.expand_dims(-eye, 2)], 2)
114 |   view_translation = tf.concat(
115 |       [view_translation,
116 |        tf.reshape(w_column, [batch_size, 1, 4])], 1)
117 |   camera_matrices = tf.matmul(view_rotation, view_translation)
118 |   return camera_matrices
119 | 
120 | 
121 | def euler_matrices(angles):
122 |   """Computes an XYZ Tait-Bryan (improper Euler angle) rotation.
123 | 
124 |   Returns 4x4 matrices for convenient multiplication with other transformations.
125 | 
126 |   Args:
127 |     angles: a [batch_size, 3] tensor containing X, Y, and Z angles in radians.
128 | 
129 |   Returns:
130 |     a [batch_size, 4, 4] tensor of matrices.
131 |   """
132 |   s = tf.sin(angles)
133 |   c = tf.cos(angles)
134 |   # Rename variables for readability in the matrix definition below.
135 |   c0, c1, c2 = (c[:, 0], c[:, 1], c[:, 2])
136 |   s0, s1, s2 = (s[:, 0], s[:, 1], s[:, 2])
137 | 
138 |   zeros = tf.zeros_like(s[:, 0])
139 |   ones = tf.ones_like(s[:, 0])
140 | 
141 |   # pyformat: disable
142 |   flattened = tf.concat(
143 |       [
144 |           c2 * c1, c2 * s1 * s0 - c0 * s2, s2 * s0 + c2 * c0 * s1, zeros,
145 |           c1 * s2, c2 * c0 + s2 * s1 * s0, c0 * s2 * s1 - c2 * s0, zeros,
146 |           -s1, c1 * s0, c1 * c0, zeros,
147 |           zeros, zeros, zeros, ones
148 |       ],
149 |       axis=0)
150 |   # pyformat: enable
151 |   reshaped = tf.reshape(flattened, [4, 4, -1])
152 |   return tf.transpose(reshaped, [2, 0, 1])
153 | 
154 | 
155 | def transform_homogeneous(matrices, vertices):
156 |   """Applies batched 4x4 homogeneous matrix transformations to 3-D vertices.
157 | 
158 |   The vertices are input and output as row-major, but are interpreted as
159 |   column vectors multiplied on the right-hand side of the matrices. More
160 |   explicitly, this function computes (MV^T)^T.
161 |   Vertices are assumed to be xyz, and are extended to xyzw with w=1.
162 | 
163 |   Args:
164 |     matrices: a [batch_size, 4, 4] tensor of matrices.
165 |     vertices: a [batch_size, N, 3] tensor of xyz vertices.
166 | 
167 |   Returns:
168 |     a [batch_size, N, 4] tensor of xyzw vertices.
169 | 
170 |   Raises:
171 |     ValueError: if matrices or vertices have the wrong number of dimensions.
172 |   """
173 |   if len(matrices.shape) != 3:
174 |     raise ValueError(
175 |         'matrices must have 3 dimensions (missing batch dimension?)')
176 |   if len(vertices.shape) != 3:
177 |     raise ValueError(
178 |         'vertices must have 3 dimensions (missing batch dimension?)')
179 |   homogeneous_coord = tf.ones(
180 |       [tf.shape(vertices)[0], tf.shape(vertices)[1], 1], dtype=tf.float32)
181 |   vertices_homogeneous = tf.concat([vertices, homogeneous_coord], 2)
182 | 
183 |   return tf.matmul(vertices_homogeneous, matrices, transpose_b=True)
184 | 


--------------------------------------------------------------------------------
/data/.gitignore:
--------------------------------------------------------------------------------
1 | *.stl
2 | 


--------------------------------------------------------------------------------
/mesh_renderer.py:
--------------------------------------------------------------------------------
  1 | # Copyright 2017 Google LLC
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     https://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | 
 15 | """Differentiable 3-D rendering of a triangle mesh."""
 16 | 
 17 | from __future__ import absolute_import
 18 | from __future__ import division
 19 | from __future__ import print_function
 20 | 
 21 | import tensorflow as tf
 22 | 
 23 | import camera_utils
 24 | import rasterize_triangles
 25 | 
 26 | 
 27 | def phong_shader(normals,
 28 |                  alphas,
 29 |                  pixel_positions,
 30 |                  light_positions,
 31 |                  light_intensities,
 32 |                  diffuse_colors=None,
 33 |                  camera_position=None,
 34 |                  specular_colors=None,
 35 |                  shininess_coefficients=None,
 36 |                  ambient_color=None):
 37 |   """Computes pixelwise lighting from rasterized buffers with the Phong model.
 38 | 
 39 |   Args:
 40 |     normals: a 4D float32 tensor with shape [batch_size, image_height,
 41 |         image_width, 3]. The inner dimension is the world space XYZ normal for
 42 |         the corresponding pixel. Should be already normalized.
 43 |     alphas: a 3D float32 tensor with shape [batch_size, image_height,
 44 |         image_width]. The inner dimension is the alpha value (transparency)
 45 |         for the corresponding pixel.
 46 |     pixel_positions: a 4D float32 tensor with shape [batch_size, image_height,
 47 |         image_width, 3]. The inner dimension is the world space XYZ position for
 48 |         the corresponding pixel.
 49 |     light_positions: a 3D tensor with shape [batch_size, light_count, 3]. The
 50 |         XYZ position of each light in the scene. In the same coordinate space as
 51 |         pixel_positions.
 52 |     light_intensities: a 3D tensor with shape [batch_size, light_count, 3]. The
 53 |         RGB intensity values for each light. Intensities may be above one.
 54 |     diffuse_colors: a 4D float32 tensor with shape [batch_size, image_height,
 55 |         image_width, 3]. The inner dimension is the diffuse RGB coefficients at
 56 |         a pixel in the range [0, 1].
 57 |     camera_position: a 2D tensor with shape [batch_size, 3]. The XYZ camera
 58 |         position in the scene. If supplied, specular reflections will be
 59 |         computed. If not supplied, specular_colors and shininess_coefficients
 60 |         are expected to be None. In the same coordinate space as
 61 |         pixel_positions.
 62 |     specular_colors: a 4D float32 tensor with shape [batch_size, image_height,
 63 |         image_width, 3]. The inner dimension is the specular RGB coefficients at
 64 |         a pixel in the range [0, 1]. If None, assumed to be tf.zeros().
 65 |     shininess_coefficients: A 3D float32 tensor that is broadcasted to shape
 66 |         [batch_size, image_height, image_width]. The inner dimension is the
 67 |         shininess coefficient for the object at a pixel. Dimensions that are
 68 |         constant can be given length 1, so [batch_size, 1, 1] and [1, 1, 1] are
 69 |         also valid input shapes.
 70 |     ambient_color: a 2D tensor with shape [batch_size, 3]. The RGB ambient
 71 |         color, which is added to each pixel before tone mapping. If None, it is
 72 |         assumed to be tf.zeros().
 73 |   Returns:
 74 |     A 4D float32 tensor of shape [batch_size, image_height, image_width, 4]
 75 |     containing the lit RGBA color values for each image at each pixel. RGB
 76 |     values are not clipped here and may exceed one for bright lights.
 77 | 
 78 |   Raises:
 79 |     ValueError: An invalid argument to the method is detected.
 80 |   """
 81 |   batch_size, image_height, image_width = [s.value for s in normals.shape[:-1]]
 82 |   light_count = light_positions.shape[1].value
 83 |   pixel_count = image_height * image_width
 84 |   # Reshape all values to easily do pixelwise computations:
 85 |   normals = tf.reshape(normals, [batch_size, -1, 3])
 86 |   alphas = tf.reshape(alphas, [batch_size, -1, 1])
 87 |   diffuse_colors = tf.reshape(diffuse_colors, [batch_size, -1, 3])
 88 |   if camera_position is not None:
 89 |     specular_colors = tf.reshape(specular_colors, [batch_size, -1, 3])
 90 | 
 91 |   # Ambient component
 92 |   output_colors = tf.zeros([batch_size, image_height * image_width, 3])
 93 |   if ambient_color is not None:
 94 |     ambient_reshaped = tf.expand_dims(ambient_color, axis=1)
 95 |     output_colors = tf.add(output_colors, ambient_reshaped * diffuse_colors)
 96 | 
 97 |   # Diffuse component
 98 |   pixel_positions = tf.reshape(pixel_positions, [batch_size, -1, 3])
 99 |   per_light_pixel_positions = tf.stack(
100 |       [pixel_positions] * light_count,
101 |       axis=1)  # [batch_size, light_count, pixel_count, 3]
102 |   directions_to_lights = tf.nn.l2_normalize(
103 |       tf.expand_dims(light_positions, axis=2) - per_light_pixel_positions,
104 |       dim=3)  # [batch_size, light_count, pixel_count, 3]
105 |   # The diffuse component should only contribute when the light and normal
106 |   # face one another (i.e. the dot product is nonnegative):
107 |   normals_dot_lights = tf.clip_by_value(
108 |       tf.reduce_sum(
109 |           tf.expand_dims(normals, axis=1) * directions_to_lights, axis=3), 0.0,
110 |       1.0)  # [batch_size, light_count, pixel_count]
111 |   diffuse_output = tf.expand_dims(
112 |       diffuse_colors, axis=1) * tf.expand_dims(
113 |           normals_dot_lights, axis=3) * tf.expand_dims(
114 |               light_intensities, axis=2)
115 |   diffuse_output = tf.reduce_sum(
116 |       diffuse_output, axis=1)  # [batch_size, pixel_count, 3]
117 |   output_colors = tf.add(output_colors, diffuse_output)
118 | 
119 |   # Specular component
120 |   if camera_position is not None:
121 |     camera_position = tf.reshape(camera_position, [batch_size, 1, 3])
122 |     mirror_reflection_direction = tf.nn.l2_normalize(
123 |         2.0 * tf.expand_dims(normals_dot_lights, axis=3) * tf.expand_dims(
124 |             normals, axis=1) - directions_to_lights,
125 |         dim=3)
126 |     direction_to_camera = tf.nn.l2_normalize(
127 |         camera_position - pixel_positions, dim=2)
128 |     reflection_direction_dot_camera_direction = tf.reduce_sum(
129 |         tf.expand_dims(direction_to_camera, axis=1) *
130 |         mirror_reflection_direction,
131 |         axis=3)
132 |     # The specular component should only contribute when the reflection is
133 |     # external:
134 |     reflection_direction_dot_camera_direction = tf.clip_by_value(
135 |         tf.nn.l2_normalize(reflection_direction_dot_camera_direction, dim=2),
136 |         0.0, 1.0)
137 |     # The specular component should also only contribute when the diffuse
138 |     # component contributes:
139 |     reflection_direction_dot_camera_direction = tf.where(
140 |         normals_dot_lights != 0.0, reflection_direction_dot_camera_direction,
141 |         tf.zeros_like(
142 |             reflection_direction_dot_camera_direction, dtype=tf.float32))
143 |     # Reshape to support broadcasting the shininess coefficient, which rarely
144 |     # varies per-vertex:
145 |     reflection_direction_dot_camera_direction = tf.reshape(
146 |         reflection_direction_dot_camera_direction,
147 |         [batch_size, light_count, image_height, image_width])
148 |     shininess_coefficients = tf.expand_dims(shininess_coefficients, axis=1)
149 |     specularity = tf.reshape(
150 |         tf.pow(reflection_direction_dot_camera_direction,
151 |                shininess_coefficients),
152 |         [batch_size, light_count, pixel_count, 1])
153 |     specular_output = tf.expand_dims(
154 |         specular_colors, axis=1) * specularity * tf.expand_dims(
155 |             light_intensities, axis=2)
156 |     specular_output = tf.reduce_sum(specular_output, axis=1)
157 |     output_colors = tf.add(output_colors, specular_output)
158 |   rgb_images = tf.reshape(output_colors,
159 |                           [batch_size, image_height, image_width, 3])
160 |   alpha_images = tf.reshape(alphas, [batch_size, image_height, image_width, 1])
161 |   valid_rgb_values = tf.concat(3 * [alpha_images > 0.5], axis=3)
162 |   rgb_images = tf.where(valid_rgb_values, rgb_images,
163 |                         tf.zeros_like(rgb_images, dtype=tf.float32))
164 |   return tf.reverse(tf.concat([rgb_images, alpha_images], axis=3), axis=[1])
165 | 
166 | 
167 | def tone_mapper(image, gamma):
168 |   """Applies gamma correction to the input image.
169 | 
170 |   Tone maps the input image batch in order to make scenes with a high dynamic
171 |   range viewable. The gamma correction factor is computed separately per image,
172 |   but is shared between all provided channels. The exact function computed is:
173 | 
174 |   image_out = A*image_in^gamma, where A is an image-wide constant computed so
175 |   that the maximum image value is approximately 1. The correction is applied
176 |   to all channels.
177 | 
178 |   Args:
179 |     image: 4-D float32 tensor with shape [batch_size, image_height,
180 |         image_width, channel_count]. The batch of images to tone map.
181 |     gamma: 0-D float32 nonnegative tensor. Values of gamma below one compress
182 |         relative contrast in the image, and values above one increase it. A
183 |         value of 1 is equivalent to scaling the image to have a maximum value
184 |         of 1.
185 |   Returns:
186 |     4-D float32 tensor with shape [batch_size, image_height, image_width,
187 |     channel_count]. Contains the gamma-corrected images, clipped to the range
188 |     [0, 1].
189 |   """
190 |   batch_size = image.shape[0].value
191 |   corrected_image = tf.pow(image, gamma)
192 |   image_max = tf.reduce_max(
193 |       tf.reshape(corrected_image, [batch_size, -1]), axis=1)
194 |   scaled_image = tf.divide(corrected_image,
195 |                            tf.reshape(image_max, [batch_size, 1, 1, 1]))
196 |   return tf.clip_by_value(scaled_image, 0.0, 1.0)
197 | 
198 | 
199 | def mesh_renderer(vertices,
200 |                   triangles,
201 |                   normals,
202 |                   diffuse_colors,
203 |                   camera_position,
204 |                   camera_lookat,
205 |                   camera_up,
206 |                   light_positions,
207 |                   light_intensities,
208 |                   image_width,
209 |                   image_height,
210 |                   specular_colors=None,
211 |                   shininess_coefficients=None,
212 |                   ambient_color=None,
213 |                   fov_y=40.0,
214 |                   near_clip=0.01,
215 |                   far_clip=10.0):
216 |   """Renders an input scene using Phong shading, and returns an output image.
217 | 
218 |   Args:
219 |     vertices: 3-D float32 tensor with shape [batch_size, vertex_count, 3]. Each
220 |         triplet is an xyz position in world space.
221 |     triangles: 2-D int32 tensor with shape [triangle_count, 3]. Each triplet
222 |         should contain vertex indices describing a triangle such that the
223 |         triangle's normal points toward the viewer if the forward order of the
224 |         triplet defines a clockwise winding of the vertices. Gradients with
225 |         respect to this tensor are not available.
226 |     normals: 3-D float32 tensor with shape [batch_size, vertex_count, 3]. Each
227 |         triplet is the xyz vertex normal for its corresponding vertex. Each
228 |         vector is assumed to be already normalized.
229 |     diffuse_colors: 3-D float32 tensor with shape [batch_size,
230 |         vertex_count, 3]. The RGB diffuse reflection in the range [0,1] for
231 |         each vertex.
232 |     camera_position: 2-D tensor with shape [batch_size, 3] or 1-D tensor with
233 |         shape [3] specifying the XYZ world space camera position.
234 |     camera_lookat: 2-D tensor with shape [batch_size, 3] or 1-D tensor with
235 |         shape [3] containing an XYZ point along the center of the camera's gaze.
236 |     camera_up: 2-D tensor with shape [batch_size, 3] or 1-D tensor with shape
237 |         [3] containing the up direction for the camera. The camera will have no
238 |         tilt with respect to this direction.
239 |     light_positions: a 3-D tensor with shape [batch_size, light_count, 3]. The
240 |         XYZ position of each light in the scene. In the same coordinate space as
241 |         vertices.
242 |     light_intensities: a 3-D tensor with shape [batch_size, light_count, 3]. The
243 |         RGB intensity values for each light. Intensities may be above one.
244 |     image_width: int specifying desired output image width in pixels.
245 |     image_height: int specifying desired output image height in pixels.
246 |     specular_colors: 3-D float32 tensor with shape [batch_size,
247 |         vertex_count, 3]. The RGB specular reflection in the range [0, 1] for
248 |         each vertex. If supplied, specular reflections will be computed, and
249 |         both specular_colors and shininess_coefficients are expected.
250 |     shininess_coefficients: a 0D-2D float32 tensor with maximum shape
251 |        [batch_size, vertex_count]. The Phong shininess coefficient of each
252 |        vertex. A 0D tensor or float gives a constant shininess coefficient
253 |        across all batches and images. A 1D tensor must have shape [batch_size],
254 |        and a single shininess coefficient per image is used.
255 |     ambient_color: a 2D tensor with shape [batch_size, 3]. The RGB ambient
256 |         color, which is added to each pixel in the scene. If None, it is
257 |         assumed to be black.
258 |     fov_y: float, 0D tensor, or 1D tensor with shape [batch_size] specifying
259 |         desired output image y field of view in degrees.
260 |     near_clip: float, 0D tensor, or 1D tensor with shape [batch_size] specifying
261 |         near clipping plane distance.
262 |     far_clip: float, 0D tensor, or 1D tensor with shape [batch_size] specifying
263 |         far clipping plane distance.
264 | 
265 |   Returns:
266 |     A 4-D float32 tensor of shape [batch_size, image_height, image_width, 4]
267 |     containing the lit RGBA color values for each image at each pixel. RGB
268 |     colors are the intensity values before tonemapping and can be in the range
269 |     [0, infinity]. Clipping to the range [0,1] with tf.clip_by_value is likely
270 |     reasonable for both viewing and training most scenes. More complex scenes
271 |     with multiple lights should tone map color values for display only. One
272 |     simple tonemapping approach is to rescale color values as x/(1+x); gamma
273 |     compression is another common technique. Alpha values are zero for
274 |     background pixels and near one for mesh pixels.
275 |   Raises:
276 |     ValueError: An invalid argument to the method is detected.
277 |   """
278 |   if len(vertices.shape) != 3:
279 |     raise ValueError('Vertices must have shape [batch_size, vertex_count, 3].')
280 |   batch_size = vertices.shape[0].value
281 |   if len(normals.shape) != 3:
282 |     raise ValueError('Normals must have shape [batch_size, vertex_count, 3].')
283 |   if len(light_positions.shape) != 3:
284 |     raise ValueError(
285 |         'Light_positions must have shape [batch_size, light_count, 3].')
286 |   if len(light_intensities.shape) != 3:
287 |     raise ValueError(
288 |         'Light_intensities must have shape [batch_size, light_count, 3].')
289 |   if len(diffuse_colors.shape) != 3:
290 |     raise ValueError(
291 |         'vertex_diffuse_colors must have shape [batch_size, vertex_count, 3].')
292 |   if (ambient_color is not None and
293 |       ambient_color.get_shape().as_list() != [batch_size, 3]):
294 |     raise ValueError('Ambient_color must have shape [batch_size, 3].')
295 |   if camera_position.get_shape().as_list() == [3]:
296 |     camera_position = tf.tile(
297 |         tf.expand_dims(camera_position, axis=0), [batch_size, 1])
298 |   elif camera_position.get_shape().as_list() != [batch_size, 3]:
299 |     raise ValueError('Camera_position must have shape [batch_size, 3]')
300 |   if camera_lookat.get_shape().as_list() == [3]:
301 |     camera_lookat = tf.tile(
302 |         tf.expand_dims(camera_lookat, axis=0), [batch_size, 1])
303 |   elif camera_lookat.get_shape().as_list() != [batch_size, 3]:
304 |     raise ValueError('Camera_lookat must have shape [batch_size, 3]')
305 |   if camera_up.get_shape().as_list() == [3]:
306 |     camera_up = tf.tile(tf.expand_dims(camera_up, axis=0), [batch_size, 1])
307 |   elif camera_up.get_shape().as_list() != [batch_size, 3]:
308 |     raise ValueError('Camera_up must have shape [batch_size, 3]')
309 |   if isinstance(fov_y, float):
310 |     fov_y = tf.constant(batch_size * [fov_y], dtype=tf.float32)
311 |   elif not fov_y.get_shape().as_list():
312 |     fov_y = tf.tile(tf.expand_dims(fov_y, 0), [batch_size])
313 |   elif fov_y.get_shape().as_list() != [batch_size]:
314 |     raise ValueError('Fov_y must be a float, a 0D tensor, or a 1D tensor with '
315 |                      'shape [batch_size]')
316 |   if isinstance(near_clip, float):
317 |     near_clip = tf.constant(batch_size * [near_clip], dtype=tf.float32)
318 |   elif not near_clip.get_shape().as_list():
319 |     near_clip = tf.tile(tf.expand_dims(near_clip, 0), [batch_size])
320 |   elif near_clip.get_shape().as_list() != [batch_size]:
321 |     raise ValueError('Near_clip must be a float, a 0D tensor, or a 1D tensor '
322 |                      'with shape [batch_size]')
323 |   if isinstance(far_clip, float):
324 |     far_clip = tf.constant(batch_size * [far_clip], dtype=tf.float32)
325 |   elif not far_clip.get_shape().as_list():
326 |     far_clip = tf.tile(tf.expand_dims(far_clip, 0), [batch_size])
327 |   elif far_clip.get_shape().as_list() != [batch_size]:
328 |     raise ValueError('Far_clip must be a float, a 0D tensor, or a 1D tensor '
329 |                      'with shape [batch_size]')
330 |   if specular_colors is not None and shininess_coefficients is None:
331 |     raise ValueError(
332 |         'Specular colors were supplied without shininess coefficients.')
333 |   if shininess_coefficients is not None and specular_colors is None:
334 |     raise ValueError(
335 |         'Shininess coefficients were supplied without specular colors.')
336 |   if specular_colors is not None:
337 |     # Since a 0-D float32 tensor is accepted, also accept a float.
338 |     if isinstance(shininess_coefficients, float):
339 |       shininess_coefficients = tf.constant(
340 |           shininess_coefficients, dtype=tf.float32)
341 |     if len(specular_colors.shape) != 3:
342 |       raise ValueError('The specular colors must have shape [batch_size, '
343 |                        'vertex_count, 3].')
344 |     if len(shininess_coefficients.shape) > 2:
345 |       raise ValueError('The shininess coefficients must have shape at most '
346 |                        '[batch_size, vertex_count].')
347 |     # If we don't have per-vertex coefficients, we can just reshape the
348 |     # input shininess to broadcast later, rather than interpolating an
349 |     # additional vertex attribute:
350 |     if len(shininess_coefficients.shape) < 2:
351 |       vertex_attributes = tf.concat(
352 |           [normals, vertices, diffuse_colors, specular_colors], axis=2)
353 |     else:
354 |       vertex_attributes = tf.concat(
355 |           [
356 |               normals, vertices, diffuse_colors, specular_colors,
357 |               tf.expand_dims(shininess_coefficients, axis=2)
358 |           ],
359 |           axis=2)
360 |   else:
361 |     vertex_attributes = tf.concat([normals, vertices, diffuse_colors], axis=2)
362 | 
363 |   camera_matrices = camera_utils.look_at(camera_position, camera_lookat,
364 |                                          camera_up)
365 | 
366 |   perspective_transforms = camera_utils.perspective(image_width / image_height,
367 |                                                     fov_y, near_clip, far_clip)
368 | 
369 |   clip_space_transforms = tf.matmul(perspective_transforms, camera_matrices)
370 | 
371 |   pixel_attributes = rasterize_triangles.rasterize(
372 |       vertices, vertex_attributes, triangles, clip_space_transforms,
373 |       image_width, image_height, [-1] * vertex_attributes.shape[2].value)
374 | 
375 |   # Extract the interpolated vertex attributes from the pixel buffer and
376 |   # supply them to the shader:
377 |   pixel_normals = tf.nn.l2_normalize(pixel_attributes[:, :, :, 0:3], dim=3)
378 |   pixel_positions = pixel_attributes[:, :, :, 3:6]
379 |   diffuse_colors = pixel_attributes[:, :, :, 6:9]
380 |   if specular_colors is not None:
381 |     specular_colors = pixel_attributes[:, :, :, 9:12]
382 |     # Retrieve the interpolated shininess coefficients if necessary, or just
383 |     # reshape our input for broadcasting:
384 |     if len(shininess_coefficients.shape) == 2:
385 |       shininess_coefficients = pixel_attributes[:, :, :, 12]
386 |     else:
387 |       shininess_coefficients = tf.reshape(shininess_coefficients, [-1, 1, 1])
388 | 
389 |   pixel_mask = tf.cast(tf.reduce_any(diffuse_colors >= 0, axis=3), tf.float32)
390 | 
391 |   renders = phong_shader(
392 |       normals=pixel_normals,
393 |       alphas=pixel_mask,
394 |       pixel_positions=pixel_positions,
395 |       light_positions=light_positions,
396 |       light_intensities=light_intensities,
397 |       diffuse_colors=diffuse_colors,
398 |       camera_position=camera_position if specular_colors is not None else None,
399 |       specular_colors=specular_colors,
400 |       shininess_coefficients=shininess_coefficients,
401 |       ambient_color=ambient_color)
402 |   return renders
403 | 
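
The shading performed by `phong_shader` above can be illustrated for a single pixel and a single light. The following is a minimal sketch in plain Python, not part of the repository: the names `phong_pixel` and `reinhard_tone_map` are hypothetical, vectors are plain lists, and the extra `l2_normalize` that the TensorFlow code applies to the reflection dot product across pixels is deliberately left out. It assumes the normal is already unit length, as the docstring requires.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    length = math.sqrt(_dot(v, v))
    return [x / length for x in v]


def phong_pixel(normal, position, light_position, light_intensity,
                diffuse_color, camera_position=None, specular_color=None,
                shininess=None, ambient_color=None):
    """Returns the unclipped RGB color of one pixel lit by one light."""
    to_light = _normalize(_sub(light_position, position))
    # Clamp so that surfaces facing away from the light receive nothing.
    n_dot_l = min(max(_dot(normal, to_light), 0.0), 1.0)

    rgb = [0.0, 0.0, 0.0]
    if ambient_color is not None:
        rgb = [a * d for a, d in zip(ambient_color, diffuse_color)]
    rgb = [c + d * n_dot_l * i
           for c, d, i in zip(rgb, diffuse_color, light_intensity)]

    # Specular term: only when a camera is given and the diffuse term is lit.
    if camera_position is not None and n_dot_l > 0.0:
        reflection = _normalize(
            _sub([2.0 * n_dot_l * n for n in normal], to_light))
        to_camera = _normalize(_sub(camera_position, position))
        r_dot_v = min(max(_dot(reflection, to_camera), 0.0), 1.0)
        highlight = r_dot_v ** shininess
        rgb = [c + s * highlight * i
               for c, s, i in zip(rgb, specular_color, light_intensity)]
    return rgb


def reinhard_tone_map(rgb):
    """Rescales each channel as x / (1 + x), for display only."""
    return [x / (1.0 + x) for x in rgb]
```

Because the result is not clipped, a pixel lit head-on with both diffuse and specular terms can exceed one per channel; passing it through `reinhard_tone_map` (the x/(1+x) rescaling mentioned in the `mesh_renderer` docstring) brings it back into [0, 1).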


--------------------------------------------------------------------------------
/mesh_renderer/kernels/rasterize_triangles_grad.cc:
--------------------------------------------------------------------------------
  1 | // Copyright 2017 Google LLC
  2 | //
  3 | // Licensed under the Apache License, Version 2.0 (the "License");
  4 | // you may not use this file except in compliance with the License.
  5 | // You may obtain a copy of the License at
  6 | //
  7 | //     https://www.apache.org/licenses/LICENSE-2.0
  8 | //
  9 | // Unless required by applicable law or agreed to in writing, software
 10 | // distributed under the License is distributed on an "AS IS" BASIS,
 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | // See the License for the specific language governing permissions and
 13 | // limitations under the License.
 14 | 
 15 | #include <algorithm>
 16 | #include <vector>
 17 | 
 18 | #include "tensorflow/core/framework/op.h"
 19 | #include "tensorflow/core/framework/op_kernel.h"
 20 | 
 21 | namespace {
 22 | 
 23 | // Threshold for a barycentric coordinate triplet's sum, below which the
 24 | // coordinates at a pixel are deemed degenerate. Most such degenerate triplets
 25 | // in an image will be exactly zero, as this is how pixels outside the mesh
 26 | // are rendered.
 27 | constexpr float kDegenerateBarycentricCoordinatesCutoff = 0.9f;
 28 | 
 29 | // If the area of a triangle is very small in screen space, the corner vertices
 30 | // are approaching colinearity, and we should drop the gradient to avoid
 31 | // numerical instability (in particular, blowup, as the forward pass computation
 32 | // already only has 8 bits of precision).
 33 | constexpr float kMinimumTriangleArea = 1e-13;
 34 | 
 35 | }  // namespace
 36 | 
 37 | namespace tf_mesh_renderer {
 38 | 
 39 |   using ::tensorflow::DEVICE_CPU;
 40 |   using ::tensorflow::OpKernel;
 41 |   using ::tensorflow::OpKernelConstruction;
 42 |   using ::tensorflow::OpKernelContext;
 43 |   using ::tensorflow::PartialTensorShape;
 44 |   using ::tensorflow::Status;
 45 |   using ::tensorflow::Tensor;
 46 |   using ::tensorflow::TensorShape;
 47 |   using ::tensorflow::errors::InvalidArgument;
 48 | 
 49 |   REGISTER_OP("RasterizeTrianglesGrad")
 50 |       .Input("vertices: float32")
 51 |       .Input("triangles: int32")
 52 |       .Input("barycentric_coordinates: float32")
 53 |       .Input("triangle_ids: int32")
 54 |       .Input("df_dbarycentric_coordinates: float32")
 55 |       .Attr("image_width: int")
 56 |       .Attr("image_height: int")
 57 |       .Output("df_dvertices: float32");
 58 | 
 59 |   class RasterizeTrianglesGradOp : public OpKernel {
 60 |    public:
 61 |     explicit RasterizeTrianglesGradOp(OpKernelConstruction* context)
 62 |         : OpKernel(context) {
 63 |       OP_REQUIRES_OK(context, context->GetAttr("image_width", &image_width_));
 64 |       OP_REQUIRES(context, image_width_ > 0,
 65 |                   InvalidArgument("Image width must be > 0, got ", image_width_));
 66 | 
 67 |       OP_REQUIRES_OK(context, context->GetAttr("image_height", &image_height_));
 68 |       OP_REQUIRES(
 69 |           context, image_height_ > 0,
 70 |           InvalidArgument("Image height must be > 0, got ", image_height_));
 71 |     }
 72 | 
 73 |     ~RasterizeTrianglesGradOp() override {}
 74 | 
 75 |     void Compute(OpKernelContext* context) override {
 76 |       const Tensor& vertices_tensor = context->input(0);
 77 |       OP_REQUIRES(
 78 |           context,
 79 |           PartialTensorShape({-1, 4}).IsCompatibleWith(vertices_tensor.shape()),
 80 |           InvalidArgument(
 81 |               "RasterizeTrianglesGrad expects vertices to have shape (-1, 4)."));
 82 |       auto vertices_flat = vertices_tensor.flat<float>();
 83 |       const unsigned int vertex_count = vertices_flat.size() / 4;
 84 |       const float* vertices = vertices_flat.data();
 85 | 
 86 |       const Tensor& triangles_tensor = context->input(1);
 87 |       OP_REQUIRES(
 88 |           context,
 89 |           PartialTensorShape({-1, 3}).IsCompatibleWith(triangles_tensor.shape()),
 90 |           InvalidArgument(
 91 |               "RasterizeTrianglesGrad expects triangles to be a matrix."));
 92 |       auto triangles_flat = triangles_tensor.flat<int>();
 93 |       const int* triangles = triangles_flat.data();
 94 | 
 95 |       const Tensor& barycentric_coordinates_tensor = context->input(2);
 96 |       OP_REQUIRES(context,
 97 |                   TensorShape({image_height_, image_width_, 3}) ==
 98 |                       barycentric_coordinates_tensor.shape(),
 99 |                   InvalidArgument(
100 |                       "RasterizeTrianglesGrad expects barycentric_coordinates to "
101 |                       "have shape {image_height, image_width, 3}"));
102 |       auto barycentric_coordinates_flat =
103 |           barycentric_coordinates_tensor.flat<float>();
104 |       const float* barycentric_coordinates = barycentric_coordinates_flat.data();
105 | 
106 |       const Tensor& triangle_ids_tensor = context->input(3);
107 |       OP_REQUIRES(
108 |           context,
109 |           TensorShape({image_height_, image_width_}) ==
110 |               triangle_ids_tensor.shape(),
111 |           InvalidArgument(
112 |               "RasterizeTrianglesGrad expects triangle_ids to have shape "
113 |               "{image_height, image_width}"));
114 |       auto triangle_ids_flat = triangle_ids_tensor.flat<int>();
115 |       const int* triangle_ids = triangle_ids_flat.data();
116 | 
117 |       // The naming convention we use for all derivatives is d<y>_d<x> ->
118 |       // the partial of y with respect to x.
119 |       const Tensor& df_dbarycentric_coordinates_tensor = context->input(4);
120 |       OP_REQUIRES(
121 |           context,
122 |           TensorShape({image_height_, image_width_, 3}) ==
123 |               df_dbarycentric_coordinates_tensor.shape(),
124 |           InvalidArgument(
125 |               "RasterizeTrianglesGrad expects df_dbarycentric_coordinates "
126 |               "to have shape {image_height, image_width, 3}"));
127 |       auto df_dbarycentric_coordinates_flat =
128 |           df_dbarycentric_coordinates_tensor.flat<float>();
129 |       const float* df_dbarycentric_coordinates =
130 |           df_dbarycentric_coordinates_flat.data();
131 | 
132 |       Tensor* df_dvertices_tensor = nullptr;
133 |       OP_REQUIRES_OK(context,
134 |                      context->allocate_output(0, TensorShape({vertex_count, 4}),
135 |                                               &df_dvertices_tensor));
136 |       auto df_dvertices_flat = df_dvertices_tensor->flat<float>();
137 |       float* df_dvertices = df_dvertices_flat.data();
138 |       std::fill(df_dvertices, df_dvertices + vertex_count * 4, 0.0f);
139 | 
140 |       // We first loop over each pixel in the output image, and compute
141 |       // dbarycentric_coordinate[0,1,2]/dvertex[0x, 0y, 1x, 1y, 2x, 2y].
142 |       // Next we compute each value above's contribution to
143 |       // df/dvertices, building up that matrix as the output of this iteration.
144 |       for (unsigned int pixel_id = 0; pixel_id < image_height_ * image_width_;
145 |            ++pixel_id) {
146 |         // b0, b1, and b2 are the three barycentric coordinate values
147 |         // rendered at pixel pixel_id.
148 |         const float b0 = barycentric_coordinates[3 * pixel_id];
149 |         const float b1 = barycentric_coordinates[3 * pixel_id + 1];
150 |         const float b2 = barycentric_coordinates[3 * pixel_id + 2];
151 | 
152 |         if (b0 + b1 + b2 < kDegenerateBarycentricCoordinatesCutoff) {
153 |           continue;
154 |         }
155 | 
156 |         const float df_db0 = df_dbarycentric_coordinates[3 * pixel_id];
157 |         const float df_db1 = df_dbarycentric_coordinates[3 * pixel_id + 1];
158 |         const float df_db2 = df_dbarycentric_coordinates[3 * pixel_id + 2];
159 | 
160 |         const int triangle_at_current_pixel = triangle_ids[pixel_id];
161 |         const int* vertices_at_current_pixel =
162 |             &triangles[3 * triangle_at_current_pixel];
163 | 
164 |         // Extract vertex indices for the current triangle.
165 |         const int v0_id = 4 * vertices_at_current_pixel[0];
166 |         const int v1_id = 4 * vertices_at_current_pixel[1];
167 |         const int v2_id = 4 * vertices_at_current_pixel[2];
168 | 
169 |         // Extract x,y,w components of the vertices' clip space coordinates.
170 |         const float x0 = vertices[v0_id];
171 |         const float y0 = vertices[v0_id + 1];
172 |         const float w0 = vertices[v0_id + 3];
173 |         const float x1 = vertices[v1_id];
174 |         const float y1 = vertices[v1_id + 1];
175 |         const float w1 = vertices[v1_id + 3];
176 |         const float x2 = vertices[v2_id];
177 |         const float y2 = vertices[v2_id + 1];
178 |         const float w2 = vertices[v2_id + 3];
179 | 
180 |         // Compute the pixel's normalized device coordinates (NDC).
181 |         const int ix = pixel_id % image_width_;
182 |         const int iy = pixel_id / image_width_;
183 |         const float px = 2 * (ix + 0.5f) / image_width_ - 1.0f;
184 |         const float py = 2 * (iy + 0.5f) / image_height_ - 1.0f;
185 | 
186 |         // Barycentric gradients wrt each vertex coordinate share a common factor.
187 |         const float db0_dx = py * (w1 - w2) - (y1 - y2);
188 |         const float db1_dx = py * (w2 - w0) - (y2 - y0);
189 |         const float db2_dx = -(db0_dx + db1_dx);
190 |         const float db0_dy = (x1 - x2) - px * (w1 - w2);
191 |         const float db1_dy = (x2 - x0) - px * (w2 - w0);
192 |         const float db2_dy = -(db0_dy + db1_dy);
193 |         const float db0_dw = px * (y1 - y2) - py * (x1 - x2);
194 |         const float db1_dw = px * (y2 - y0) - py * (x2 - x0);
195 |         const float db2_dw = -(db0_dw + db1_dw);
196 | 
197 |         // Combine them with chain rule.
198 |         const float df_dx = df_db0 * db0_dx + df_db1 * db1_dx + df_db2 * db2_dx;
199 |         const float df_dy = df_db0 * db0_dy + df_db1 * db1_dy + df_db2 * db2_dy;
200 |         const float df_dw = df_db0 * db0_dw + df_db1 * db1_dw + df_db2 * db2_dw;
201 | 
202 |         // Values of edge equations and inverse w at the current pixel.
203 |         const float edge0_over_w = x2 * db0_dx + y2 * db0_dy + w2 * db0_dw;
204 |         const float edge1_over_w = x2 * db1_dx + y2 * db1_dy + w2 * db1_dw;
205 |         const float edge2_over_w = x1 * db2_dx + y1 * db2_dy + w1 * db2_dw;
206 |         const float w_inv = edge0_over_w + edge1_over_w + edge2_over_w;
207 | 
208 |         // All gradients share a common denominator.
209 |         const float w_sqr = 1 / (w_inv * w_inv);
210 | 
211 |         // Gradients wrt each vertex share a common factor.
212 |         const float edge0 = w_sqr * edge0_over_w;
213 |         const float edge1 = w_sqr * edge1_over_w;
214 |         const float edge2 = w_sqr * edge2_over_w;
215 | 
216 |         df_dvertices[v0_id + 0] += edge0 * df_dx;
217 |         df_dvertices[v0_id + 1] += edge0 * df_dy;
218 |         df_dvertices[v0_id + 3] += edge0 * df_dw;
219 |         df_dvertices[v1_id + 0] += edge1 * df_dx;
220 |         df_dvertices[v1_id + 1] += edge1 * df_dy;
221 |         df_dvertices[v1_id + 3] += edge1 * df_dw;
222 |         df_dvertices[v2_id + 0] += edge2 * df_dx;
223 |         df_dvertices[v2_id + 1] += edge2 * df_dy;
224 |         df_dvertices[v2_id + 3] += edge2 * df_dw;
225 |       }
226 |     }
227 | 
228 |    private:
229 |     int image_width_;
230 |     int image_height_;
231 |   };
232 | 
233 |   REGISTER_KERNEL_BUILDER(Name("RasterizeTrianglesGrad").Device(DEVICE_CPU),
234 |                           RasterizeTrianglesGradOp);
235 | 
236 | }  // namespace tf_mesh_renderer
237 | 
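
The gradient kernel above works with per-pixel edge-function values computed from the clip-space `x`, `y`, `w` components of each vertex. As a rough illustration of the forward quantity being differentiated, the sketch below computes perspective-correct barycentric coordinates at a pixel center in plain Python. It is an assumption-laden sketch rather than the repository's implementation: the function names are hypothetical, each vertex is an `(x, y, w)` tuple, and each edge value is written as a 3x3 determinant, which expands to the same expression as `edge0_over_w` for the first coordinate.

```python
def pixel_center_ndc(ix, iy, image_width, image_height):
    """Maps integer pixel indices to the NDC position of the pixel center."""
    px = 2.0 * (ix + 0.5) / image_width - 1.0
    py = 2.0 * (iy + 0.5) / image_height - 1.0
    return px, py


def _det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b, and c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def barycentrics_at(px, py, v0, v1, v2):
    """Perspective-correct barycentrics of NDC point (px, py).

    v0, v1, v2 are clip-space (x, y, w) tuples. Returns None when the three
    edge values sum to zero (a degenerate triangle at this pixel).
    """
    p = (px, py, 1.0)
    e0 = _det3(p, v1, v2)  # unnormalized weight of v0
    e1 = _det3(p, v2, v0)  # unnormalized weight of v1
    e2 = _det3(p, v0, v1)  # unnormalized weight of v2
    total = e0 + e1 + e2
    if total == 0.0:
        return None
    return e0 / total, e1 / total, e2 / total
```

For example, with vertices `(0, 0, 1)`, `(2, 0, 2)`, `(0, 2, 2)`, the clip-space point with weights (0.5, 0.25, 0.25) is `(0.5, 0.5, 1.5)`, which projects to NDC `(1/3, 1/3)`; evaluating `barycentrics_at` there recovers those weights, which is what makes the interpolation perspective-correct.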


--------------------------------------------------------------------------------
/mesh_renderer/kernels/rasterize_triangles_impl.cc:
--------------------------------------------------------------------------------
  1 | // Copyright 2017 Google LLC
  2 | //
  3 | // Licensed under the Apache License, Version 2.0 (the "License");
  4 | // you may not use this file except in compliance with the License.
  5 | // You may obtain a copy of the License at
  6 | //
  7 | //     https://www.apache.org/licenses/LICENSE-2.0
  8 | //
  9 | // Unless required by applicable law or agreed to in writing, software
 10 | // distributed under the License is distributed on an "AS IS" BASIS,
 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | // See the License for the specific language governing permissions and
 13 | // limitations under the License.
 14 | 
 15 | #include <algorithm>
 16 | #include <cmath>
 17 | 
 18 | #include "rasterize_triangles_impl.h"
 19 | 
 20 | namespace tf_mesh_renderer {
 21 | 
 22 | namespace {
 23 | 
 24 | // Takes the minimum of a, b, and c, rounds down, and converts to an integer
 25 | // in the range [low, high].
 26 | inline int ClampedIntegerMin(float a, float b, float c, int low, int high) {
 27 |   return std::min(
 28 |       std::max(static_cast<int>(std::floor(std::min(std::min(a, b), c))), low),
 29 |       high);
 30 | }
 31 | 
 32 | // Takes the maximum of a, b, and c, rounds up, and converts to an integer
 33 | // in the range [low, high].
 34 | inline int ClampedIntegerMax(float a, float b, float c, int low, int high) {
 35 |   return std::min(
 36 |       std::max(static_cast<int>(std::ceil(std::max(std::max(a, b), c))), low),
 37 |       high);
 38 | }
 39 | 
 40 | // Computes a 3x3 matrix inverse without dividing by the determinant.
 41 | // Instead, makes an unnormalized matrix inverse with the correct sign
 42 | // by flipping the sign of the matrix if the determinant is negative.
 43 | // By leaving out determinant division, the rows of M^-1 only depend on two out
 44 | // of three of the columns of M; i.e., the first row of M^-1 only depends on the
 45 | // second and third columns of M, the second only depends on the first and
 46 | // third, etc. This means we can compute edge functions for two neighboring
 47 | // triangles independently and produce exactly the same numerical result up to
 48 | // the sign. This in turn means we can avoid cracks in rasterization without
 49 | // using fixed-point arithmetic.
 50 | // See http://mathworld.wolfram.com/MatrixInverse.html
 51 | void ComputeUnnormalizedMatrixInverse(const float a11, const float a12,
 52 |                                       const float a13, const float a21,
 53 |                                       const float a22, const float a23,
 54 |                                       const float a31, const float a32,
 55 |                                       const float a33, float m_inv[9]) {
 56 |   m_inv[0] = a22 * a33 - a32 * a23;
 57 |   m_inv[1] = a13 * a32 - a33 * a12;
 58 |   m_inv[2] = a12 * a23 - a22 * a13;
 59 |   m_inv[3] = a23 * a31 - a33 * a21;
 60 |   m_inv[4] = a11 * a33 - a31 * a13;
 61 |   m_inv[5] = a13 * a21 - a23 * a11;
 62 |   m_inv[6] = a21 * a32 - a31 * a22;
 63 |   m_inv[7] = a12 * a31 - a32 * a11;
 64 |   m_inv[8] = a11 * a22 - a21 * a12;
 65 | 
 66 |   // The first column of the unnormalized M^-1 contains intermediate values for
 67 |   // det(M).
 68 |   const float det = a11 * m_inv[0] + a12 * m_inv[3] + a13 * m_inv[6];
 69 | 
 70 |   // Transfer the sign of the determinant.
 71 |   if (det < 0.0f) {
 72 |     for (int i = 0; i < 9; ++i) {
 73 |       m_inv[i] = -m_inv[i];
 74 |     }
 75 |   }
 76 | }
 77 | 
 78 | // Computes the edge functions from M^-1 as described by Olano and Greer,
 79 | // "Triangle Scan Conversion using 2D Homogeneous Coordinates."
 80 | //
 81 | // This function combines equations (3) and (4). It first computes
 82 | // [a b c] = u_i * M^-1, where u_0 = [1 0 0], u_1 = [0 1 0], etc.,
 83 | // then computes edge_i = aX + bY + c
 84 | void ComputeEdgeFunctions(const float px, const float py, const float m_inv[9],
 85 |                           float values[3]) {
 86 |   for (int i = 0; i < 3; ++i) {
 87 |     const float a = m_inv[3 * i + 0];
 88 |     const float b = m_inv[3 * i + 1];
 89 |     const float c = m_inv[3 * i + 2];
 90 | 
 91 |     values[i] = a * px + b * py + c;
 92 |   }
 93 | }
 94 | 
 95 | // Determines whether the point p lies inside a front-facing triangle.
 96 | // Counts pixels exactly on an edge as inside the triangle, as long as the
 97 | // triangle is not degenerate. Degenerate (zero-area) triangles always fail the
 98 | // inside test.
 99 | bool PixelIsInsideTriangle(const float edge_values[3]) {
100 |   // Check that the edge values are all non-negative and that at least one is
101 |   // positive (triangle is non-degenerate).
102 |   return (edge_values[0] >= 0 && edge_values[1] >= 0 && edge_values[2] >= 0) &&
103 |          (edge_values[0] > 0 || edge_values[1] > 0 || edge_values[2] > 0);
104 | }
105 | 
106 | }  // namespace
107 | 
108 | void RasterizeTrianglesImpl(const float* vertices, const int32* triangles,
109 |                             int32 triangle_count, int32 image_width,
110 |                             int32 image_height, int32* triangle_ids,
111 |                             float* barycentric_coordinates, float* z_buffer) {
112 |   const float half_image_width = 0.5 * image_width;
113 |   const float half_image_height = 0.5 * image_height;
114 |   float unnormalized_matrix_inverse[9];
115 |   float b_over_w[3];
116 | 
117 |   for (int32 triangle_id = 0; triangle_id < triangle_count; ++triangle_id) {
118 |     const int32 v0_x_id = 4 * triangles[3 * triangle_id];
119 |     const int32 v1_x_id = 4 * triangles[3 * triangle_id + 1];
120 |     const int32 v2_x_id = 4 * triangles[3 * triangle_id + 2];
121 | 
122 |     const float v0w = vertices[v0_x_id + 3];
123 |     const float v1w = vertices[v1_x_id + 3];
124 |     const float v2w = vertices[v2_x_id + 3];
125 |     // Early exit: if all w < 0, triangle is entirely behind the eye.
126 |     if (v0w < 0 && v1w < 0 && v2w < 0) {
127 |       continue;
128 |     }
129 | 
130 |     const float v0x = vertices[v0_x_id];
131 |     const float v0y = vertices[v0_x_id + 1];
132 |     const float v1x = vertices[v1_x_id];
133 |     const float v1y = vertices[v1_x_id + 1];
134 |     const float v2x = vertices[v2_x_id];
135 |     const float v2y = vertices[v2_x_id + 1];
136 | 
137 |     ComputeUnnormalizedMatrixInverse(v0x, v1x, v2x, v0y, v1y, v2y, v0w, v1w,
138 |                                      v2w, unnormalized_matrix_inverse);
139 | 
140 |     // Initialize the bounding box to the entire screen.
141 |     int left = 0, right = image_width, bottom = 0, top = image_height;
142 |     // If the triangle is entirely in front of the eye (all w > 0), project the
143 |     // vertices to pixel coordinates and find the triangle bounding box enlarged
144 |     // to the nearest integer and clamped to the image boundaries.
145 |     if (v0w > 0 && v1w > 0 && v2w > 0) {
146 |       const float p0x = (v0x / v0w + 1.0) * half_image_width;
147 |       const float p1x = (v1x / v1w + 1.0) * half_image_width;
148 |       const float p2x = (v2x / v2w + 1.0) * half_image_width;
149 |       const float p0y = (v0y / v0w + 1.0) * half_image_height;
150 |       const float p1y = (v1y / v1w + 1.0) * half_image_height;
151 |       const float p2y = (v2y / v2w + 1.0) * half_image_height;
152 |       left = ClampedIntegerMin(p0x, p1x, p2x, 0, image_width);
153 |       right = ClampedIntegerMax(p0x, p1x, p2x, 0, image_width);
154 |       bottom = ClampedIntegerMin(p0y, p1y, p2y, 0, image_height);
155 |       top = ClampedIntegerMax(p0y, p1y, p2y, 0, image_height);
156 |     }
157 | 
158 |     // Iterate over each pixel in the bounding box.
159 |     for (int iy = bottom; iy < top; ++iy) {
160 |       for (int ix = left; ix < right; ++ix) {
161 |         const float px = ((ix + 0.5) / half_image_width) - 1.0;
162 |         const float py = ((iy + 0.5) / half_image_height) - 1.0;
163 |         const int pixel_idx = iy * image_width + ix;
164 | 
165 |         ComputeEdgeFunctions(px, py, unnormalized_matrix_inverse, b_over_w);
166 |         if (!PixelIsInsideTriangle(b_over_w)) {
167 |           continue;
168 |         }
169 | 
170 |         const float one_over_w = b_over_w[0] + b_over_w[1] + b_over_w[2];
171 |         const float b0 = b_over_w[0] / one_over_w;
172 |         const float b1 = b_over_w[1] / one_over_w;
173 |         const float b2 = b_over_w[2] / one_over_w;
174 | 
175 |         const float v0z = vertices[v0_x_id + 2];
176 |         const float v1z = vertices[v1_x_id + 2];
177 |         const float v2z = vertices[v2_x_id + 2];
178 |         // Since we computed an unnormalized w above, we need to recompute
179 |         // a properly scaled clip-space w value and then divide clip-space z
180 |         // by that.
181 |         const float clip_z = b0 * v0z + b1 * v1z + b2 * v2z;
182 |         const float clip_w = b0 * v0w + b1 * v1w + b2 * v2w;
183 |         const float z = clip_z / clip_w;
184 | 
185 |         // Skip the pixel if it is farther than the current z-buffer pixel or
186 |         // beyond the near or far clipping plane.
187 |         if (z < -1.0 || z > 1.0 || z > z_buffer[pixel_idx]) {
188 |           continue;
189 |         }
190 | 
191 |         triangle_ids[pixel_idx] = triangle_id;
192 |         z_buffer[pixel_idx] = z;
193 |         barycentric_coordinates[3 * pixel_idx + 0] = b0;
194 |         barycentric_coordinates[3 * pixel_idx + 1] = b1;
195 |         barycentric_coordinates[3 * pixel_idx + 2] = b2;
196 |       }
197 |     }
198 |   }
199 | }
200 | 
201 | }  // namespace tf_mesh_renderer
202 | 
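
The coverage test in rasterize_triangles_impl.cc can be tried in isolation. The block below is a minimal standalone sketch, not the repository's API: `ClipVertex`, `UnnormalizedInverse`, `EdgeValues`, and `IsInside` are hypothetical names, and the element formulas are copied from `ComputeUnnormalizedMatrixInverse`, `ComputeEdgeFunctions`, and `PixelIsInsideTriangle` above.

```cpp
#include <array>

// Hypothetical type for illustration; not part of the repository.
struct ClipVertex {
  float x, y, w;  // Clip-space X, Y, and W (Z is not needed for coverage).
};

// Adjugate of M = [x0 x1 x2; y0 y1 y2; w0 w1 w2], negated when det(M) < 0,
// using the same element formulas as ComputeUnnormalizedMatrixInverse.
std::array<float, 9> UnnormalizedInverse(const ClipVertex& v0,
                                         const ClipVertex& v1,
                                         const ClipVertex& v2) {
  const float a11 = v0.x, a12 = v1.x, a13 = v2.x;
  const float a21 = v0.y, a22 = v1.y, a23 = v2.y;
  const float a31 = v0.w, a32 = v1.w, a33 = v2.w;
  std::array<float, 9> m = {
      a22 * a33 - a32 * a23, a13 * a32 - a33 * a12, a12 * a23 - a22 * a13,
      a23 * a31 - a33 * a21, a11 * a33 - a31 * a13, a13 * a21 - a23 * a11,
      a21 * a32 - a31 * a22, a12 * a31 - a32 * a11, a11 * a22 - a21 * a12};
  // The first column of the adjugate holds the cofactors needed for det(M).
  const float det = a11 * m[0] + a12 * m[3] + a13 * m[6];
  if (det < 0.0f) {
    for (float& value : m) value = -value;
  }
  return m;
}

// Evaluates the three edge functions a*px + b*py + c at an NDC point.
std::array<float, 3> EdgeValues(float px, float py,
                                const std::array<float, 9>& m) {
  std::array<float, 3> e;
  for (int i = 0; i < 3; ++i) {
    e[i] = m[3 * i] * px + m[3 * i + 1] * py + m[3 * i + 2];
  }
  return e;
}

// All edge values non-negative, and at least one strictly positive.
bool IsInside(const std::array<float, 3>& e) {
  return (e[0] >= 0 && e[1] >= 0 && e[2] >= 0) &&
         (e[0] > 0 || e[1] > 0 || e[2] > 0);
}
```

For the triangle used in CanRasterizeTriangle, with XY positions (-0.5, -0.5), (0, 0.5), (0.5, -0.5) and w = 1, the point (0, 0) yields edge values (0.25, 0.5, 0.25). They sum to 1 here because every w is 1, so they are already the barycentric coordinates. Reversing the vertex order flips the sign of the determinant, and the sign transfer then makes the same point pass the inside test again.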


--------------------------------------------------------------------------------
/mesh_renderer/kernels/rasterize_triangles_impl.h:
--------------------------------------------------------------------------------
 1 | // Copyright 2017 Google LLC
 2 | //
 3 | // Licensed under the Apache License, Version 2.0 (the "License");
 4 | // you may not use this file except in compliance with the License.
 5 | // You may obtain a copy of the License at
 6 | //
 7 | //     https://www.apache.org/licenses/LICENSE-2.0
 8 | //
 9 | // Unless required by applicable law or agreed to in writing, software
10 | // distributed under the License is distributed on an "AS IS" BASIS,
11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | // See the License for the specific language governing permissions and
13 | // limitations under the License.
14 | 
15 | #ifndef MESH_RENDERER_KERNELS_RASTERIZE_TRIANGLES_IMPL_H_
16 | #define MESH_RENDERER_KERNELS_RASTERIZE_TRIANGLES_IMPL_H_
17 | 
18 | namespace tf_mesh_renderer {
19 | 
20 | // Copied from tensorflow/core/platform/default/integral_types.h
21 | // to avoid making this file depend on tensorflow.
22 | typedef int int32;
23 | typedef long long int64;
24 | 
25 | // Computes the triangle id, barycentric coordinates, and z-buffer at each pixel
26 | // in the image.
27 | //
28 | // vertices: A flattened 2D array with 4*vertex_count elements.
 29 | //     Each contiguous quadruplet is the XYZW location of the vertex with that
 30 | //     quadruplet's id. The coordinates are assumed to be OpenGL-style
 31 | //     clip-space (i.e., post-projection, pre-divide), where X points right,
 32 | //     Y points up, Z points away.
33 | // triangles: A flattened 2D array with 3*triangle_count elements.
34 | //     Each contiguous triplet is the three vertex ids indexing into vertices
35 | //     describing one triangle with clockwise winding.
36 | // triangle_count: The number of triangles stored in the array triangles.
37 | // triangle_ids: A flattened 2D array with image_height*image_width elements.
38 | //     At return, each pixel contains a triangle id in the range
39 | //     [0, triangle_count). The id value is also 0 if there is no triangle
40 | //     at the pixel. The barycentric_coordinates must be checked to
41 | //     distinguish the two cases.
42 | // barycentric_coordinates: A flattened 3D array with
43 | //     image_height*image_width*3 elements. At return, contains the triplet of
44 | //     barycentric coordinates at each pixel in the same vertex ordering as
45 | //     triangles. If no triangle is present, all coordinates are 0.
46 | // z_buffer: A flattened 2D array with image_height*image_width elements. At
47 | //     return, contains the normalized device Z coordinates of the rendered
48 | //     triangles.
49 | void RasterizeTrianglesImpl(const float* vertices, const int32* triangles,
50 |                             int32 triangle_count, int32 image_width,
51 |                             int32 image_height, int32* triangle_ids,
52 |                             float* barycentric_coordinates, float* z_buffer);
53 | 
54 | }  // namespace tf_mesh_renderer
55 | 
56 | #endif  // MESH_RENDERER_KERNELS_RASTERIZE_TRIANGLES_IMPL_H_
57 | 
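
The header notes that a triangle id of 0 is ambiguous and that the barycentric coordinates must be checked to tell "triangle 0" apart from "no triangle". A minimal sketch of such a check follows. `PixelIsCovered` is a hypothetical helper, not part of the repository, and the 0.5 threshold is an assumption that relies on covered pixels having barycentrics that sum to roughly 1 and cleared pixels having all zeros.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if pixel (x, y) holds a rasterized
// triangle. `barycentrics` is the flattened [height, width, 3] buffer
// filled by RasterizeTrianglesImpl after being cleared to zero.
bool PixelIsCovered(const std::vector<float>& barycentrics, int image_width,
                    int x, int y) {
  const std::size_t base =
      3 * (static_cast<std::size_t>(y) * image_width + x);
  const float sum =
      barycentrics[base] + barycentrics[base + 1] + barycentrics[base + 2];
  // Assumed threshold: covered pixels sum to ~1, background pixels to 0.
  return sum > 0.5f;
}
```

A caller would typically use this mask before trusting `triangle_ids` at a pixel, since a background pixel and a pixel of triangle 0 both store the id 0.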


--------------------------------------------------------------------------------
/mesh_renderer/kernels/rasterize_triangles_impl_test.cc:
--------------------------------------------------------------------------------
  1 | // Copyright 2017 Google LLC
  2 | //
  3 | // Licensed under the Apache License, Version 2.0 (the "License");
  4 | // you may not use this file except in compliance with the License.
  5 | // You may obtain a copy of the License at
  6 | //
  7 | //     https://www.apache.org/licenses/LICENSE-2.0
  8 | //
  9 | // Unless required by applicable law or agreed to in writing, software
 10 | // distributed under the License is distributed on an "AS IS" BASIS,
 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | // See the License for the specific language governing permissions and
 13 | // limitations under the License.
 14 | 
 15 | #include <fstream>
 16 | 
 17 | #include "gtest/gtest.h"
 18 | #include "rasterize_triangles_impl.h"
 19 | 
 20 | #include "third_party/lodepng.h"
 21 | 
 22 | namespace tf_mesh_renderer {
 23 | namespace {
 24 | 
 25 | typedef unsigned char uint8;
 26 | 
 27 | const int kImageHeight = 480;
 28 | const int kImageWidth = 640;
 29 | 
 30 | std::string GetRunfilesRelativePath(const std::string& filename) {
 31 |   const std::string srcdir = std::getenv("TEST_SRCDIR");
 32 |   const std::string test_data = "/tf_mesh_renderer/mesh_renderer/test_data/";
 33 |   return srcdir + test_data + filename;
 34 | }
 35 | 
 36 | void LoadPng(const std::string& filename, std::vector<uint8>* output) {
 37 |   unsigned width, height;
 38 |   unsigned error = lodepng::decode(*output, width, height, filename.c_str());
 39 |   ASSERT_TRUE(error == 0) << "Decoder error: " << lodepng_error_text(error);
 40 | }
 41 | 
 42 | void SavePng(const std::string& filename, const std::vector<uint8>& image) {
 43 |   unsigned error =
 44 |       lodepng::encode(filename.c_str(), image, kImageWidth, kImageHeight);
 45 |   ASSERT_TRUE(error == 0) << "Encoder error: " << lodepng_error_text(error);
 46 | }
 47 | 
 48 | void FloatRGBToUint8RGBA(const std::vector<float>& input,
 49 |                          std::vector<uint8>* output) {
 50 |   output->resize(kImageHeight * kImageWidth * 4);
 51 |   for (int y = 0; y < kImageHeight; ++y) {
 52 |     for (int x = 0; x < kImageWidth; ++x) {
 53 |       for (int c = 0; c < 3; ++c) {
 54 |         (*output)[(y * kImageWidth + x) * 4 + c] =
 55 |             input[(y * kImageWidth + x) * 3 + c] * 255;
 56 |       }
 57 |       (*output)[(y * kImageWidth + x) * 4 + 3] = 255;
 58 |     }
 59 |   }
 60 | }
 61 | 
 62 | void ExpectImageFileAndImageAreEqual(const std::string& baseline_file,
 63 |                                      const std::vector<float>& result,
 64 |                                      const std::string& comparison_name,
 65 |                                      const std::string& failure_message) {
 66 |   std::vector<uint8> baseline_rgba, result_rgba;
 67 |   LoadPng(GetRunfilesRelativePath(baseline_file), &baseline_rgba);
 68 |   FloatRGBToUint8RGBA(result, &result_rgba);
 69 | 
 70 |   const bool images_match = baseline_rgba == result_rgba;
 71 | 
 72 |   if (!images_match) {
 73 |     const std::string result_output_path =
 74 |         "/tmp/" + comparison_name + "_result.png";
 75 |     SavePng(result_output_path, result_rgba);
 76 |   }
 77 | 
 78 |   EXPECT_TRUE(images_match) << failure_message;
 79 | }
 80 | 
 81 | class RasterizeTrianglesImplTest : public ::testing::Test {
 82 |  protected:
 83 |   void CallRasterizeTrianglesImpl(const float* vertices, const int32* triangles,
 84 |                                   int32 triangle_count) {
 85 |     const int num_pixels = image_height_ * image_width_;
 86 |     barycentrics_buffer_.resize(num_pixels * 3);
 87 |     triangle_ids_buffer_.resize(num_pixels);
 88 | 
 89 |     constexpr float kClearDepth = 1.0;
 90 |     z_buffer_.resize(num_pixels, kClearDepth);
 91 | 
 92 |     RasterizeTrianglesImpl(vertices, triangles, triangle_count, image_width_,
 93 |                            image_height_, triangle_ids_buffer_.data(),
 94 |                            barycentrics_buffer_.data(), z_buffer_.data());
 95 |   }
 96 | 
 97 |   // Expects that the sum of barycentric weights at a pixel is close to a
 98 |   // given value.
 99 |   void ExpectBarycentricSumIsNear(int x, int y, float expected) const {
100 |     constexpr float kEpsilon = 1e-6f;
101 |     auto it = barycentrics_buffer_.begin() + y * image_width_ * 3 + x * 3;
102 |     EXPECT_NEAR(*it + *(it + 1) + *(it + 2), expected, kEpsilon);
103 |   }
104 |   // Expects that a pixel is covered by verifying that its barycentric
105 |   // coordinates sum to one.
106 |   void ExpectIsCovered(int x, int y) const {
107 |     ExpectBarycentricSumIsNear(x, y, 1.0);
108 |   }
109 |   // Expects that a pixel is not covered by verifying that its barycentric
110 |   // coordinates sum to zero.
111 |   void ExpectIsNotCovered(int x, int y) const {
112 |     ExpectBarycentricSumIsNear(x, y, 0.0);
113 |   }
114 | 
115 |   int image_height_ = 480;
116 |   int image_width_ = 640;
117 |   std::vector<float> barycentrics_buffer_;
118 |   std::vector<int32> triangle_ids_buffer_;
119 |   std::vector<float> z_buffer_;
120 | };
121 | 
122 | TEST_F(RasterizeTrianglesImplTest, CanRasterizeTriangle) {
123 |   const std::vector<float> vertices = {-0.5, -0.5, 0.8, 1.0,  0.0, 0.5,
124 |                                        0.3,  1.0,  0.5, -0.5, 0.3, 1.0};
125 |   const std::vector<int32> triangles = {0, 1, 2};
126 | 
127 |   CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 1);
128 |   ExpectImageFileAndImageAreEqual("Simple_Triangle.png", barycentrics_buffer_,
129 |                                   "triangle", "simple triangle does not match");
130 | }
131 | 
132 | TEST_F(RasterizeTrianglesImplTest, CanRasterizeExternalTriangle) {
133 |   const std::vector<float> vertices = {-0.5, -0.5, 0.0, 1.0,  0.0, -0.5,
134 |                                        0.0,  -1.0, 0.5, -0.5, 0.0, 1.0};
135 |   const std::vector<int32> triangles = {0, 1, 2};
136 | 
137 |   CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 1);
138 | 
139 |   ExpectImageFileAndImageAreEqual("External_Triangle.png",
140 |                                   barycentrics_buffer_, "external triangle",
141 |                                   "external triangle does not match");
142 | }
143 | 
144 | TEST_F(RasterizeTrianglesImplTest, CanRasterizeCameraInsideBox) {
145 |   const std::vector<float> vertices = {
146 |       -1.0, -1.0, 0.0, 2.0, 1.0, -1.0, 0.0,  2.0, 1.0,  1.0, 0.0,
147 |       2.0,  -1.0, 1.0, 0.0, 2.0, -1.0, -1.0, 0.0, -2.0, 1.0, -1.0,
148 |       0.0,  -2.0, 1.0, 1.0, 0.0, -2.0, -1.0, 1.0, 0.0,  -2.0};
149 |   const std::vector<int32> triangles = {0, 1, 2, 0, 2, 3, 4, 5, 6, 4, 6, 7,
150 |                                         2, 3, 7, 2, 7, 6, 1, 0, 4, 1, 4, 5,
151 |                                         0, 3, 7, 0, 7, 4, 1, 2, 6, 1, 6, 5};
152 | 
153 |   CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 12);
154 | 
155 |   ExpectImageFileAndImageAreEqual("Inside_Box.png",
156 |                                   barycentrics_buffer_, "camera inside box",
157 |                                   "camera inside box does not match");
158 | }
159 | 
160 | TEST_F(RasterizeTrianglesImplTest, CanRasterizeTetrahedron) {
161 |   const std::vector<float> vertices = {-0.5, -0.5, 0.8, 1.0,  0.0, 0.5,
162 |                                        0.3,  1.0,  0.5, -0.5, 0.3, 1.0,
163 |                                        0.0,  0.0,  0.0, 1.0};
164 |   const std::vector<int32> triangles = {0, 2, 1, 0, 1, 3, 1, 2, 3, 2, 0, 3};
165 | 
166 |   CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 4);
167 | 
168 |   ExpectImageFileAndImageAreEqual("Simple_Tetrahedron.png",
169 |                                   barycentrics_buffer_, "tetrahedron",
170 |                                   "simple tetrahedron does not match");
171 | }
172 | 
173 | TEST_F(RasterizeTrianglesImplTest, CanRasterizeCube) {
174 |   // Vertex values were obtained by dumping the clip-space vertex values from
175 |   // the renderSimpleCube test in ../rasterize_triangles_test.py.
176 |   const std::vector<float> vertices = {
177 |       -2.60648608, -3.22707772,  6.85085106, 6.85714293,
178 |       -1.30324292, -0.992946863, 8.56856918, 8.5714283,
179 |       -1.30324292, 3.97178817,   7.70971,    7.71428585,
180 |       -2.60648608, 1.73765731,   5.991992,   6,
181 |       1.30324292,  -3.97178817,  6.27827835, 6.28571415,
182 |       2.60648608,  -1.73765731,  7.99599648, 8,
183 |       2.60648608,  3.22707772,   7.13713741, 7.14285707,
184 |       1.30324292,  0.992946863,  5.41941929, 5.4285717};
185 | 
186 |   const std::vector<int32> triangles = {0, 1, 2, 2, 3, 0, 3, 2, 6, 6, 7, 3,
187 |                                         7, 6, 5, 5, 4, 7, 4, 5, 1, 1, 0, 4,
188 |                                         5, 6, 2, 2, 1, 5, 7, 4, 0, 0, 3, 7};
189 | 
190 |   CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 12);
191 | 
192 |   ExpectImageFileAndImageAreEqual("Barycentrics_Cube.png",
193 |       barycentrics_buffer_, "cube", "cube does not match");
194 | }
195 | 
196 | TEST_F(RasterizeTrianglesImplTest, WorksWhenPixelIsOnTriangleEdge) {
197 |   // Verifies that a pixel that lies exactly on a triangle edge is considered
198 |   // inside the triangle.
199 |   image_width_ = 641;
200 |   const int x_pixel = image_width_ / 2;
201 |   const float x_ndc = 0.0;
202 |   constexpr int yPixel = 5;
203 | 
204 |   const std::vector<float> vertices = {x_ndc, -1.0, 0.5, 1.0,  x_ndc, 1.0,
205 |                                        0.5,   1.0,  0.5, -1.0, 0.5,   1.0};
206 |   {
207 |     const std::vector<int32> triangles = {0, 1, 2};
208 | 
209 |     CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 1);
210 | 
211 |     ExpectIsCovered(x_pixel, yPixel);
212 |   }
213 |   {
214 |     // Test the triangle with the same vertices in reverse order.
215 |     const std::vector<int32> triangles = {2, 1, 0};
216 | 
217 |     CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 1);
218 | 
219 |     ExpectIsCovered(x_pixel, yPixel);
220 |   }
221 | }
222 | 
223 | TEST_F(RasterizeTrianglesImplTest, CoversEdgePixelsOfImage) {
224 |   // Verifies that the pixels along image edges are correctly covered.
225 | 
226 |   const std::vector<float> vertices = {-1.0, -1.0, 0.0, 1.0, 1.0, -1.0,
227 |                                        0.0,  1.0,  1.0, 1.0, 0.0, 1.0,
228 |                                        -1.0, 1.0,  0.0, 1.0};
229 |   const std::vector<int32> triangles = {0, 1, 2, 0, 2, 3};
230 | 
231 |   CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 2);
232 | 
233 |   ExpectIsCovered(0, 0);
234 |   ExpectIsCovered(image_width_ - 1, 0);
235 |   ExpectIsCovered(image_width_ - 1, image_height_ - 1);
236 |   ExpectIsCovered(0, image_height_ - 1);
237 | }
238 | 
239 | TEST_F(RasterizeTrianglesImplTest, PixelOnDegenerateTriangleIsNotInside) {
240 |   // Verifies that a pixel lying exactly on a triangle with zero area is
241 |   // counted as lying outside the triangle.
242 |   image_width_ = 1;
243 |   image_height_ = 1;
244 |   const std::vector<float> vertices = {-1.0, -1.0, 0.0, 1.0, 1.0, 1.0,
245 |                                        0.0,  1.0,  0.0, 0.0, 0.0, 1.0};
246 |   const std::vector<int32> triangles = {0, 1, 2};
247 | 
248 |   CallRasterizeTrianglesImpl(vertices.data(), triangles.data(), 1);
249 | 
250 |   ExpectIsNotCovered(0, 0);
251 | }
252 | 
253 | }  // namespace
254 | }  // namespace tf_mesh_renderer
255 | 
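
WorksWhenPixelIsOnTriangleEdge sets the image width to 641 so that a pixel center falls exactly on x = 0 in normalized device coordinates. The sketch below reproduces the pixel-center mapping from RasterizeTrianglesImpl to show why. `PixelCenterToNdc` is a hypothetical name used only for this illustration.

```cpp
// Hypothetical helper mirroring the mapping in RasterizeTrianglesImpl:
// px = ((ix + 0.5) / half_image_width) - 1.0
float PixelCenterToNdc(int index, int size) {
  const float half_size = 0.5f * size;
  return (index + 0.5f) / half_size - 1.0f;
}
```

With a width of 641, the center of pixel 320 maps to (320.5 / 320.5) - 1 = 0, which lies exactly on the vertical triangle edge at x_ndc = 0. With the default even width of 640, no pixel center lands on 0, so the edge case would not be exercised.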


--------------------------------------------------------------------------------
/mesh_renderer/kernels/rasterize_triangles_op.cc:
--------------------------------------------------------------------------------
  1 | // Copyright 2017 Google LLC
  2 | //
  3 | // Licensed under the Apache License, Version 2.0 (the "License");
  4 | // you may not use this file except in compliance with the License.
  5 | // You may obtain a copy of the License at
  6 | //
  7 | //     https://www.apache.org/licenses/LICENSE-2.0
  8 | //
  9 | // Unless required by applicable law or agreed to in writing, software
 10 | // distributed under the License is distributed on an "AS IS" BASIS,
 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | // See the License for the specific language governing permissions and
 13 | // limitations under the License.
 14 | 
 15 | #include <algorithm>
 16 | #include <vector>
 17 | 
 18 | #include "rasterize_triangles_impl.h"
 19 | #include "tensorflow/core/framework/op.h"
 20 | #include "tensorflow/core/framework/op_kernel.h"
 21 | 
 22 | namespace tf_mesh_renderer {
 23 | 
 24 | using ::tensorflow::DEVICE_CPU;
 25 | using ::tensorflow::int32;
 26 | using ::tensorflow::OpKernel;
 27 | using ::tensorflow::OpKernelConstruction;
 28 | using ::tensorflow::OpKernelContext;
 29 | using ::tensorflow::PartialTensorShape;
 30 | using ::tensorflow::Status;
 31 | using ::tensorflow::Tensor;
 32 | using ::tensorflow::TensorShape;
 33 | using ::tensorflow::TensorShapeUtils;
 34 | using ::tensorflow::errors::Internal;
 35 | using ::tensorflow::errors::InvalidArgument;
 36 | 
 37 | REGISTER_OP("RasterizeTriangles")
 38 |     .Input("vertices: float32")
 39 |     .Input("triangles: int32")
 40 |     .Attr("image_width: int")
 41 |     .Attr("image_height: int")
 42 |     .Output("barycentric_coordinates: float32")
 43 |     .Output("triangle_ids: int32")
 44 |     .Output("z_buffer: float32")
 45 |     .Doc(R"doc(
 46 | Implements a rasterization kernel for rendering mesh geometry.
 47 | 
 48 | vertices: 2-D tensor with shape [vertex_count, 4]. The 3-D positions of the mesh
 49 |   vertices in clip-space (XYZW).
 50 | triangles: 2-D tensor with shape [triangle_count, 3]. Each row is a tuple of
 51 |   indices into vertices specifying a triangle to be drawn. The triangle has an
 52 |   outward facing normal when the given indices appear in a clockwise winding to
 53 |   the viewer.
 54 | image_width: positive int attribute specifying the width of the output image.
 55 | image_height: positive int attribute specifying the height of the output image.
 56 | barycentric_coordinates: 3-D tensor with shape [image_height, image_width, 3]
 57 |   containing the rendered barycentric coordinate triplet per pixel, before
 58 |   perspective correction. The triplet is the zero vector if the pixel is outside
 59 |   the mesh boundary. For valid pixels, the ordering of the coordinates
 60 |   corresponds to the ordering in triangles.
 61 | triangle_ids: 2-D tensor with shape [image_height, image_width]. Contains the
 62 |   triangle id value for each pixel in the output image. For pixels within the
 63 |   mesh, this is the integer value in the range [0, triangle_count) indexing
 64 |   into triangles. For pixels outside the mesh this is 0; 0 can either indicate
 65 |   belonging to triangle 0, or being outside the mesh. This ensures all returned
 66 |   triangle ids will validly index into the triangles array, enabling the use of
 67 |   tf.gather with indices from this tensor. The barycentric coordinates can be
 68 |   used to determine pixel validity instead.
 69 | z_buffer: 2-D tensor with shape [image_height, image_width]. Contains the Z
 70 |   coordinate in Normalized Device Coordinates for each pixel occupied by a
 71 |   triangle.
 72 | )doc");
 73 | 
 74 | class RasterizeTrianglesOp : public OpKernel {
 75 |  public:
 76 |   explicit RasterizeTrianglesOp(OpKernelConstruction* context)
 77 |       : OpKernel(context) {
 78 |     OP_REQUIRES_OK(context, context->GetAttr("image_width", &image_width_));
 79 |     OP_REQUIRES(context, image_width_ > 0,
 80 |                 InvalidArgument("Image width must be > 0, got ", image_width_));
 81 | 
 82 |     OP_REQUIRES_OK(context, context->GetAttr("image_height", &image_height_));
 83 |     OP_REQUIRES(
 84 |         context, image_height_ > 0,
 85 |         InvalidArgument("Image height must be > 0, got ", image_height_));
 86 |   }
 87 | 
 88 |   ~RasterizeTrianglesOp() override {}
 89 | 
 90 |   void Compute(OpKernelContext* context) override {
 91 |     const Tensor& vertices_tensor = context->input(0);
 92 |     OP_REQUIRES(
 93 |         context,
 94 |         PartialTensorShape({-1, 4}).IsCompatibleWith(vertices_tensor.shape()),
 95 |         InvalidArgument(
 96 |             "RasterizeTriangles expects vertices to have shape (-1, 4)."));
 97 |     auto vertices_flat = vertices_tensor.flat<float>();
 98 |     const float* vertices = vertices_flat.data();
 99 | 
100 |     const Tensor& triangles_tensor = context->input(1);
101 |     OP_REQUIRES(
102 |         context,
103 |         PartialTensorShape({-1, 3}).IsCompatibleWith(triangles_tensor.shape()),
104 |         InvalidArgument(
105 |             "RasterizeTriangles expects triangles to have shape (-1, 3)."));
106 |     auto triangles_flat = triangles_tensor.flat<int32>();
107 |     const int32* triangles = triangles_flat.data();
108 |     const int triangle_count = triangles_flat.size() / 3;
109 | 
110 |     Tensor* barycentric_tensor = nullptr;
111 |     OP_REQUIRES_OK(context,
112 |                    context->allocate_output(
113 |                        0, TensorShape({image_height_, image_width_, 3}),
114 |                        &barycentric_tensor));
115 | 
116 |     Tensor* triangle_ids_tensor = nullptr;
117 |     OP_REQUIRES_OK(context, context->allocate_output(
118 |                                 1, TensorShape({image_height_, image_width_}),
119 |                                 &triangle_ids_tensor));
120 | 
121 |     Tensor* z_buffer_tensor = nullptr;
122 |     OP_REQUIRES_OK(context, context->allocate_output(
123 |                                 2, TensorShape({image_height_, image_width_}),
124 |                                 &z_buffer_tensor));
125 | 
126 |     // Clear barycentric and triangle id buffers to 0.
127 |     // Clear z-buffer to 1 (the farthest NDC z value).
128 |     barycentric_tensor->flat<float>().setZero();
129 |     triangle_ids_tensor->flat<int32>().setZero();
130 |     z_buffer_tensor->flat<float>().setConstant(1);
131 | 
132 |     RasterizeTrianglesImpl(vertices, triangles, triangle_count, image_width_,
133 |                            image_height_,
134 |                            triangle_ids_tensor->flat<int32>().data(),
135 |                            barycentric_tensor->flat<float>().data(),
136 |                            z_buffer_tensor->flat<float>().data());
137 |   }
138 | 
139 |  private:
140 |   TF_DISALLOW_COPY_AND_ASSIGN(RasterizeTrianglesOp);
141 | 
142 |   int image_width_;
143 |   int image_height_;
144 | };
145 | 
146 | REGISTER_KERNEL_BUILDER(Name("RasterizeTriangles").Device(DEVICE_CPU),
147 |                         RasterizeTrianglesOp);
148 | 
149 | }  // namespace tf_mesh_renderer
150 | 
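
The op clears the z-buffer to 1, the farthest NDC z value, before calling RasterizeTrianglesImpl, which then keeps the nearest fragment per pixel. A minimal sketch of that per-pixel depth test is shown below. `DepthTestAndWrite` is a hypothetical helper, written to match the comparison used in the rasterizer loop rather than taken from the repository.

```cpp
// Hypothetical helper: applies the same test as the rasterizer loop.
// Rejects fragments outside the [-1, 1] NDC depth range or farther than
// the stored depth; otherwise stores the new depth and returns true.
bool DepthTestAndWrite(float z, float* z_buffer_pixel) {
  if (z < -1.0f || z > 1.0f || z > *z_buffer_pixel) {
    return false;
  }
  *z_buffer_pixel = z;
  return true;
}
```

Because the comparison is `z > stored`, a fragment at exactly the stored depth passes, so among triangles at equal depth the one rasterized last owns the pixel.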


--------------------------------------------------------------------------------
/mesh_renderer/mesh_renderer_test.py:
--------------------------------------------------------------------------------
  1 | # Copyright 2017 Google LLC
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     https://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | 
 15 | from __future__ import absolute_import
 16 | from __future__ import division
 17 | from __future__ import print_function
 18 | 
 19 | import math
 20 | import os
 21 | 
 22 | import numpy as np
 23 | import tensorflow as tf
 24 | 
 25 | import camera_utils
 26 | import mesh_renderer
 27 | import test_utils
 28 | 
 29 | 
 30 | class RenderTest(tf.test.TestCase):
 31 | 
 32 |   def setUp(self):
 33 |     self.test_data_directory = (
 34 |         'mesh_renderer/test_data/')
 35 | 
 36 |     tf.reset_default_graph()
 37 |     # Set up a basic cube centered at the origin, with vertex normals pointing
 38 |     # outwards along the line from the origin to the cube vertices:
 39 |     self.cube_vertices = tf.constant(
 40 |         [[-1, -1, 1], [-1, -1, -1], [-1, 1, -1], [-1, 1, 1], [1, -1, 1],
 41 |          [1, -1, -1], [1, 1, -1], [1, 1, 1]],
 42 |         dtype=tf.float32)
 43 |     self.cube_normals = tf.nn.l2_normalize(self.cube_vertices, dim=1)
 44 |     self.cube_triangles = tf.constant(
 45 |         [[0, 1, 2], [2, 3, 0], [3, 2, 6], [6, 7, 3], [7, 6, 5], [5, 4, 7],
 46 |          [4, 5, 1], [1, 0, 4], [5, 6, 2], [2, 1, 5], [7, 4, 0], [0, 3, 7]],
 47 |         dtype=tf.int32)
 48 | 
 49 |   def testRendersSimpleCube(self):
 50 |     """Renders a simple cube to test the full forward pass.
 51 | 
 52 |     Verifies the functionality of both the custom kernel and the python wrapper.
 53 |     """
 54 | 
 55 |     model_transforms = camera_utils.euler_matrices(
 56 |         [[-20.0, 0.0, 60.0], [45.0, 60.0, 0.0]])[:, :3, :3]
 57 | 
 58 |     vertices_world_space = tf.matmul(
 59 |         tf.stack([self.cube_vertices, self.cube_vertices]),
 60 |         model_transforms,
 61 |         transpose_b=True)
 62 | 
 63 |     normals_world_space = tf.matmul(
 64 |         tf.stack([self.cube_normals, self.cube_normals]),
 65 |         model_transforms,
 66 |         transpose_b=True)
 67 | 
 68 |     # camera position:
 69 |     eye = tf.constant(2 * [[0.0, 0.0, 6.0]], dtype=tf.float32)
 70 |     center = tf.constant(2 * [[0.0, 0.0, 0.0]], dtype=tf.float32)
 71 |     world_up = tf.constant(2 * [[0.0, 1.0, 0.0]], dtype=tf.float32)
 72 |     image_width = 640
 73 |     image_height = 480
 74 |     light_positions = tf.constant([[[0.0, 0.0, 6.0]], [[0.0, 0.0, 6.0]]])
 75 |     light_intensities = tf.ones([2, 1, 3], dtype=tf.float32)
 76 |     vertex_diffuse_colors = tf.ones_like(vertices_world_space, dtype=tf.float32)
 77 | 
 78 |     rendered = mesh_renderer.mesh_renderer(
 79 |         vertices_world_space, self.cube_triangles, normals_world_space,
 80 |         vertex_diffuse_colors, eye, center, world_up, light_positions,
 81 |         light_intensities, image_width, image_height)
 82 | 
 83 |     with self.test_session() as sess:
 84 |       images = sess.run(rendered, feed_dict={})
 85 |       for image_id in range(images.shape[0]):
 86 |         target_image_name = 'Gray_Cube_%i.png' % image_id
 87 |         baseline_image_path = os.path.join(self.test_data_directory,
 88 |                                            target_image_name)
 89 |         test_utils.expect_image_file_and_render_are_near(
 90 |             self, sess, baseline_image_path, images[image_id, :, :, :])
 91 | 
 92 |   def testComplexShading(self):
 93 |     """Tests specular highlights, colors, and multiple lights per image."""
 94 |     # rotate the cube for the test:
 95 |     model_transforms = camera_utils.euler_matrices(
 96 |         [[-20.0, 0.0, 60.0], [45.0, 60.0, 0.0]])[:, :3, :3]
 97 | 
 98 |     vertices_world_space = tf.matmul(
 99 |         tf.stack([self.cube_vertices, self.cube_vertices]),
100 |         model_transforms,
101 |         transpose_b=True)
102 | 
103 |     normals_world_space = tf.matmul(
104 |         tf.stack([self.cube_normals, self.cube_normals]),
105 |         model_transforms,
106 |         transpose_b=True)
107 | 
108 |     # camera position:
109 |     eye = tf.constant([[0.0, 0.0, 6.0], [0., 0.2, 18.0]], dtype=tf.float32)
110 |     center = tf.constant([[0.0, 0.0, 0.0], [0.1, -0.1, 0.1]], dtype=tf.float32)
111 |     world_up = tf.constant(
112 |         [[0.0, 1.0, 0.0], [0.1, 1.0, 0.15]], dtype=tf.float32)
113 |     fov_y = tf.constant([40., 13.3], dtype=tf.float32)
114 |     near_clip = tf.constant(0.1, dtype=tf.float32)
115 |     far_clip = tf.constant(25.0, dtype=tf.float32)
116 |     image_width = 640
117 |     image_height = 480
118 |     light_positions = tf.constant([[[0.0, 0.0, 6.0], [1.0, 2.0, 6.0]],
119 |                                    [[0.0, -2.0, 4.0], [1.0, 3.0, 4.0]]])
120 |     light_intensities = tf.constant(
121 |         [[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], [[2.0, 0.0, 1.0], [0.0, 2.0,
122 |                                                                 1.0]]],
123 |         dtype=tf.float32)
124 |     # pyformat: disable
125 |     vertex_diffuse_colors = tf.constant(2*[[[1.0, 0.0, 0.0],
126 |                                             [0.0, 1.0, 0.0],
127 |                                             [0.0, 0.0, 1.0],
128 |                                             [1.0, 1.0, 1.0],
129 |                                             [1.0, 1.0, 0.0],
130 |                                             [1.0, 0.0, 1.0],
131 |                                             [0.0, 1.0, 1.0],
132 |                                             [0.5, 0.5, 0.5]]],
133 |                                         dtype=tf.float32)
134 |     vertex_specular_colors = tf.constant(2*[[[0.0, 1.0, 0.0],
135 |                                              [0.0, 0.0, 1.0],
136 |                                              [1.0, 1.0, 1.0],
137 |                                              [1.0, 1.0, 0.0],
138 |                                              [1.0, 0.0, 1.0],
139 |                                              [0.0, 1.0, 1.0],
140 |                                              [0.5, 0.5, 0.5],
141 |                                              [1.0, 0.0, 0.0]]],
142 |                                          dtype=tf.float32)
143 |     # pyformat: enable
144 |     shininess_coefficients = 6.0 * tf.ones([2, 8], dtype=tf.float32)
145 |     ambient_color = tf.constant(
146 |         [[0., 0., 0.], [0.1, 0.1, 0.2]], dtype=tf.float32)
147 |     renders = mesh_renderer.mesh_renderer(
148 |         vertices_world_space, self.cube_triangles, normals_world_space,
149 |         vertex_diffuse_colors, eye, center, world_up, light_positions,
150 |         light_intensities, image_width, image_height, vertex_specular_colors,
151 |         shininess_coefficients, ambient_color, fov_y, near_clip, far_clip)
152 |     tonemapped_renders = tf.concat(
153 |         [
154 |             mesh_renderer.tone_mapper(renders[:, :, :, 0:3], 0.7),
155 |             renders[:, :, :, 3:4]
156 |         ],
157 |         axis=3)
158 | 
159 |     # Check that shininess coefficient broadcasting works by also rendering
160 |     # with a scalar shininess coefficient, and ensuring the result is identical:
161 |     broadcasted_renders = mesh_renderer.mesh_renderer(
162 |         vertices_world_space, self.cube_triangles, normals_world_space,
163 |         vertex_diffuse_colors, eye, center, world_up, light_positions,
164 |         light_intensities, image_width, image_height, vertex_specular_colors,
165 |         6.0, ambient_color, fov_y, near_clip, far_clip)
166 |     tonemapped_broadcasted_renders = tf.concat(
167 |         [
168 |             mesh_renderer.tone_mapper(broadcasted_renders[:, :, :, 0:3], 0.7),
169 |             broadcasted_renders[:, :, :, 3:4]
170 |         ],
171 |         axis=3)
172 | 
173 |     with self.test_session() as sess:
174 |       images, broadcasted_images = sess.run(
175 |           [tonemapped_renders, tonemapped_broadcasted_renders], feed_dict={})
176 | 
177 |       for image_id in range(images.shape[0]):
178 |         target_image_name = 'Colored_Cube_%i.png' % image_id
179 |         baseline_image_path = os.path.join(self.test_data_directory,
180 |                                            target_image_name)
181 |         test_utils.expect_image_file_and_render_are_near(
182 |             self, sess, baseline_image_path, images[image_id, :, :, :])
183 |         test_utils.expect_image_file_and_render_are_near(
184 |             self, sess, baseline_image_path,
185 |             broadcasted_images[image_id, :, :, :])
186 | 
187 |   def testFullRenderGradientComputation(self):
188 |     """Verifies the Jacobian matrix for the entire renderer.
189 | 
190 |     This ensures correct gradients are propagated backwards through the entire
191 |     process, not just through the rasterization kernel. Uses the simple cube
192 |     forward pass.
193 |     """
194 |     image_height = 21
195 |     image_width = 28
196 | 
197 |     # rotate the cube for the test:
198 |     model_transforms = camera_utils.euler_matrices(
199 |         [[-20.0, 0.0, 60.0], [45.0, 60.0, 0.0]])[:, :3, :3]
200 | 
201 |     vertices_world_space = tf.matmul(
202 |         tf.stack([self.cube_vertices, self.cube_vertices]),
203 |         model_transforms,
204 |         transpose_b=True)
205 | 
206 |     normals_world_space = tf.matmul(
207 |         tf.stack([self.cube_normals, self.cube_normals]),
208 |         model_transforms,
209 |         transpose_b=True)
210 | 
211 |     # camera position:
212 |     eye = tf.constant([0.0, 0.0, 6.0], dtype=tf.float32)
213 |     center = tf.constant([0.0, 0.0, 0.0], dtype=tf.float32)
214 |     world_up = tf.constant([0.0, 1.0, 0.0], dtype=tf.float32)
215 | 
216 |     # Scene has a single light from the viewer's eye.
217 |     light_positions = tf.expand_dims(tf.stack([eye, eye], axis=0), axis=1)
218 |     light_intensities = tf.ones([2, 1, 3], dtype=tf.float32)
219 | 
220 |     vertex_diffuse_colors = tf.ones_like(vertices_world_space, dtype=tf.float32)
221 | 
222 |     rendered = mesh_renderer.mesh_renderer(
223 |         vertices_world_space, self.cube_triangles, normals_world_space,
224 |         vertex_diffuse_colors, eye, center, world_up, light_positions,
225 |         light_intensities, image_width, image_height)
226 | 
227 |     with self.test_session():
228 |       theoretical, numerical = tf.test.compute_gradient(
229 |           self.cube_vertices, (8, 3),
230 |           rendered, (2, image_height, image_width, 4),
231 |           x_init_value=self.cube_vertices.eval(),
232 |           delta=1e-3)
233 |       jacobians_match, message = (
234 |           test_utils.check_jacobians_are_nearly_equal(
235 |               theoretical, numerical, 0.01, 0.01))
236 |       self.assertTrue(jacobians_match, message)
237 | 
238 |   def testThatCubeRotates(self):
239 |     """Optimize a simple cube's rotation using pixel loss.
240 | 
241 |     The rotation is represented as static-basis euler angles. This test checks
242 |     that the computed gradients are useful.
243 |     """
244 |     image_height = 480
245 |     image_width = 640
246 |     initial_euler_angles = [[0.0, 0.0, 0.0]]
247 | 
248 |     euler_angles = tf.Variable(initial_euler_angles)
249 |     model_rotation = camera_utils.euler_matrices(euler_angles)[0, :3, :3]
250 | 
251 |     vertices_world_space = tf.reshape(
252 |         tf.matmul(self.cube_vertices, model_rotation, transpose_b=True),
253 |         [1, 8, 3])
254 | 
255 |     normals_world_space = tf.reshape(
256 |         tf.matmul(self.cube_normals, model_rotation, transpose_b=True),
257 |         [1, 8, 3])
258 | 
259 |     # camera position:
260 |     eye = tf.constant([[0.0, 0.0, 6.0]], dtype=tf.float32)
261 |     center = tf.constant([[0.0, 0.0, 0.0]], dtype=tf.float32)
262 |     world_up = tf.constant([[0.0, 1.0, 0.0]], dtype=tf.float32)
263 | 
264 |     vertex_diffuse_colors = tf.ones_like(vertices_world_space, dtype=tf.float32)
265 |     light_positions = tf.reshape(eye, [1, 1, 3])
266 |     light_intensities = tf.ones([1, 1, 3], dtype=tf.float32)
267 | 
268 |     render = mesh_renderer.mesh_renderer(
269 |         vertices_world_space, self.cube_triangles, normals_world_space,
270 |         vertex_diffuse_colors, eye, center, world_up, light_positions,
271 |         light_intensities, image_width, image_height)
272 |     render = tf.reshape(render, [image_height, image_width, 4])
273 | 
274 |     # Pick the desired cube rotation for the test:
275 |     test_model_rotation = camera_utils.euler_matrices([[-20.0, 0.0,
276 |                                                         60.0]])[0, :3, :3]
277 | 
278 |     desired_vertex_positions = tf.reshape(
279 |         tf.matmul(self.cube_vertices, test_model_rotation, transpose_b=True),
280 |         [1, 8, 3])
281 |     desired_normals = tf.reshape(
282 |         tf.matmul(self.cube_normals, test_model_rotation, transpose_b=True),
283 |         [1, 8, 3])
284 |     desired_render = mesh_renderer.mesh_renderer(
285 |         desired_vertex_positions, self.cube_triangles, desired_normals,
286 |         vertex_diffuse_colors, eye, center, world_up, light_positions,
287 |         light_intensities, image_width, image_height)
288 |     desired_render = tf.reshape(desired_render, [image_height, image_width, 4])
289 | 
290 |     loss = tf.reduce_mean(tf.abs(render - desired_render))
291 |     optimizer = tf.train.MomentumOptimizer(0.7, 0.1)
292 |     grad = tf.gradients(loss, [euler_angles])
293 |     grad, _ = tf.clip_by_global_norm(grad, 1.0)
294 |     opt_func = optimizer.apply_gradients([(grad[0], euler_angles)])
295 | 
296 |     with tf.Session() as sess:
297 |       sess.run(tf.global_variables_initializer())
298 |       for _ in range(35):
299 |         sess.run([loss, opt_func])
300 |       final_image, desired_image = sess.run([render, desired_render])
301 | 
302 |       target_image_name = 'Gray_Cube_0.png'
303 |       baseline_image_path = os.path.join(self.test_data_directory,
304 |                                          target_image_name)
305 |       test_utils.expect_image_file_and_render_are_near(
306 |           self, sess, baseline_image_path, desired_image)
307 |       test_utils.expect_image_file_and_render_are_near(
308 |           self,
309 |           sess,
310 |           baseline_image_path,
311 |           final_image,
312 |           max_outlier_fraction=0.01,
313 |           pixel_error_threshold=0.04)
314 | 
315 | 
316 | if __name__ == '__main__':
317 |   tf.test.main()
318 | 
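
Editorial sketch (not part of the repository): the gradient tests in this file compare an analytic Jacobian with a finite-difference estimate and tolerate a small fraction of outliers. The pure-Python example below illustrates that idea on a toy function. The names `numerical_jacobian`, `outlier_fraction`, and `toy_function` are hypothetical, and the central-difference scheme is an assumption of this sketch, not a statement about the internals of `tf.test.compute_gradient`.

```python
def numerical_jacobian(f, x, delta=1e-3):
    """Central-difference Jacobian of f at x.

    Rows index input elements and columns index output elements. This layout
    is an assumption chosen to mirror the "transposed" convention mentioned
    in test_utils.py.
    """
    jacobian = []
    for i in range(len(x)):
        x_hi = list(x)
        x_lo = list(x)
        x_hi[i] += delta
        x_lo[i] -= delta
        f_hi = f(x_hi)
        f_lo = f(x_lo)
        jacobian.append(
            [(hi - lo) / (2.0 * delta) for hi, lo in zip(f_hi, f_lo)])
    return jacobian


def outlier_fraction(theoretical, numerical, relative_error_threshold):
    """Fraction of entries whose relative error exceeds the threshold."""
    total = 0
    outliers = 0
    for t_row, n_row in zip(theoretical, numerical):
        for t, n in zip(t_row, n_row):
            total += 1
            if n == 0.0:
                # Sketch-only choice: with a zero reference value, count the
                # entry as an outlier only if the analytic value is non-zero.
                is_outlier = t != 0.0
            else:
                is_outlier = abs(n - t) / abs(n) > relative_error_threshold
            if is_outlier:
                outliers += 1
    return outliers / float(total)


def toy_function(x):
    """Hypothetical stand-in for the renderer: two outputs, two inputs."""
    return [x[0] * x[1], x[0] ** 2 + x[1]]


# Analytic Jacobian of toy_function at (2, 3), rows = inputs, cols = outputs.
analytic = [[3.0, 4.0], [2.0, 1.0]]
estimate = numerical_jacobian(toy_function, [2.0, 3.0])
jacobians_match = outlier_fraction(analytic, estimate, 0.01) <= 0.01
```

In the tests, the same pattern is applied with the renderer output in place of `toy_function`; an outlier allowance is useful there because rasterization is discontinuous at triangle edges, where finite differences are unreliable.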


--------------------------------------------------------------------------------
/mesh_renderer/rasterize_triangles_test.py:
--------------------------------------------------------------------------------
  1 | # Copyright 2017 Google LLC
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     https://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | 
 15 | from __future__ import absolute_import
 16 | from __future__ import division
 17 | from __future__ import print_function
 18 | 
 19 | import os
 20 | 
 21 | import numpy as np
 22 | import tensorflow as tf
 23 | 
 24 | import test_utils
 25 | import camera_utils
 26 | import rasterize_triangles
 27 | 
 28 | 
 29 | class RenderTest(tf.test.TestCase):
 30 | 
 31 |   def setUp(self):
 32 |     self.test_data_directory = 'mesh_renderer/test_data/'
 33 | 
 34 |     tf.reset_default_graph()
 35 |     self.cube_vertex_positions = tf.constant(
 36 |         [[-1, -1, 1], [-1, -1, -1], [-1, 1, -1], [-1, 1, 1], [1, -1, 1],
 37 |          [1, -1, -1], [1, 1, -1], [1, 1, 1]],
 38 |         dtype=tf.float32)
 39 |     self.cube_triangles = tf.constant(
 40 |         [[0, 1, 2], [2, 3, 0], [3, 2, 6], [6, 7, 3], [7, 6, 5], [5, 4, 7],
 41 |          [4, 5, 1], [1, 0, 4], [5, 6, 2], [2, 1, 5], [7, 4, 0], [0, 3, 7]],
 42 |         dtype=tf.int32)
 43 | 
 44 |     tf_float = lambda x: tf.constant(x, dtype=tf.float32)
 45 |     # camera position:
 46 |     eye = tf_float([[2.0, 3.0, 6.0]])
 47 |     center = tf_float([[0.0, 0.0, 0.0]])
 48 |     world_up = tf_float([[0.0, 1.0, 0.0]])
 49 | 
 50 |     self.image_width = 640
 51 |     self.image_height = 480
 52 | 
 53 |     look_at = camera_utils.look_at(eye, center, world_up)
 54 |     perspective = camera_utils.perspective(
 55 |         self.image_width / self.image_height,
 56 |         tf_float([40.0]), tf_float([0.01]),
 57 |         tf_float([10.0]))
 58 |     self.projection = tf.matmul(perspective, look_at)
 59 | 
 60 |   def runTriangleTest(self, w_vector, target_image_name):
 61 |     """Directly renders a rasterized triangle's barycentric coordinates.
 62 | 
 63 |     Tests only the kernel (rasterize_triangles_module).
 64 | 
 65 |     Args:
 66 |       w_vector: 3 element vector of w components to scale triangle vertices.
 67 |       target_image_name: image file name to compare result against.
 68 |     """
 69 |     clip_init = np.array(
 70 |         [[-0.5, -0.5, 0.8, 1.0], [0.0, 0.5, 0.3, 1.0], [0.5, -0.5, 0.3, 1.0]],
 71 |         dtype=np.float32)
 72 |     clip_init = clip_init * np.reshape(
 73 |         np.array(w_vector, dtype=np.float32), [3, 1])
 74 | 
 75 |     clip_coordinates = tf.constant(clip_init)
 76 |     triangles = tf.constant([[0, 1, 2]], dtype=tf.int32)
 77 | 
 78 |     rendered_coordinates, _, _ = (
 79 |         rasterize_triangles.rasterize_triangles_module.rasterize_triangles(
 80 |             clip_coordinates, triangles, self.image_width, self.image_height))
 81 |     rendered_coordinates = tf.concat(
 82 |         [rendered_coordinates,
 83 |          tf.ones([self.image_height, self.image_width, 1])], axis=2)
 84 |     with self.test_session() as sess:
 85 |       image = rendered_coordinates.eval()
 86 |       baseline_image_path = os.path.join(self.test_data_directory,
 87 |                                          target_image_name)
 88 |       test_utils.expect_image_file_and_render_are_near(
 89 |           self, sess, baseline_image_path, image)
 90 | 
 91 |   def testRendersSimpleTriangle(self):
 92 |     self.runTriangleTest((1.0, 1.0, 1.0), 'Simple_Triangle.png')
 93 | 
 94 |   def testRendersPerspectiveCorrectTriangle(self):
 95 |     self.runTriangleTest((0.2, 0.5, 2.0), 'Perspective_Corrected_Triangle.png')
 96 | 
 97 |   def testRendersSimpleCube(self):
 98 |     """Renders a simple cube to test the kernel and python wrapper."""
 99 |     vertex_rgb = (self.cube_vertex_positions * 0.5 + 0.5)
100 |     vertex_rgba = tf.concat([vertex_rgb, tf.ones([8, 1])], axis=1)
101 |     background_value = [0.0, 0.0, 0.0, 0.0]
102 | 
103 |     rendered = rasterize_triangles.rasterize(
104 |         tf.expand_dims(self.cube_vertex_positions, axis=0),
105 |         tf.expand_dims(vertex_rgba, axis=0), self.cube_triangles,
106 |         self.projection, self.image_width, self.image_height, background_value)
107 | 
108 |     with self.test_session() as sess:
109 |       image = rendered.eval()[0, ...]
110 |       target_image_name = 'Unlit_Cube_0.png'
111 |       baseline_image_path = os.path.join(self.test_data_directory,
112 |                                          target_image_name)
113 |       test_utils.expect_image_file_and_render_are_near(
114 |           self, sess, baseline_image_path, image)
115 | 
116 |   def testSimpleTriangleGradientComputation(self):
117 |     """Verifies the Jacobian matrix for a single pixel.
118 | 
119 |     The pixel is in the center of a triangle facing the camera. This makes it
120 |     easy to check which entries of the Jacobian might not make sense without
121 |     worrying about corner cases.
122 |     """
123 |     test_pixel_x = 325
124 |     test_pixel_y = 245
125 | 
126 |     clip_coordinates = tf.placeholder(tf.float32, shape=[3, 4])
127 | 
128 |     triangles = tf.constant([[0, 1, 2]], dtype=tf.int32)
129 | 
130 |     barycentric_coordinates, _, _ = (
131 |         rasterize_triangles.rasterize_triangles_module.rasterize_triangles(
132 |             clip_coordinates, triangles, self.image_width, self.image_height))
133 | 
134 |     pixels_to_compare = barycentric_coordinates[
135 |         test_pixel_y:test_pixel_y + 1, test_pixel_x:test_pixel_x + 1, :]
136 | 
137 |     with self.test_session():
138 |       ndc_init = np.array(
139 |           [[-0.5, -0.5, 0.8, 1.0], [0.0, 0.5, 0.3, 1.0], [0.5, -0.5, 0.3, 1.0]],
140 |           dtype=np.float32)
141 |       theoretical, numerical = tf.test.compute_gradient(
142 |           clip_coordinates, (3, 4),
143 |           pixels_to_compare, (1, 1, 3),
144 |           x_init_value=ndc_init,
145 |           delta=4e-2)
146 |       jacobians_match, message = (
147 |           test_utils.check_jacobians_are_nearly_equal(
148 |               theoretical, numerical, 0.01, 0.0, True))
149 |       self.assertTrue(jacobians_match, message)
150 | 
151 |   def testInternalRenderGradientComputation(self):
152 |     """Isolates and verifies the Jacobian matrix for the custom kernel."""
153 |     image_height = 21
154 |     image_width = 28
155 | 
156 |     clip_coordinates = tf.placeholder(tf.float32, shape=[8, 4])
157 | 
158 |     barycentric_coordinates, _, _ = (
159 |         rasterize_triangles.rasterize_triangles_module.rasterize_triangles(
160 |             clip_coordinates, self.cube_triangles, image_width, image_height))
161 | 
162 |     with self.test_session():
163 |       # Precomputed transformation of the simple cube to normalized device
164 |       # coordinates, in order to isolate the rasterization gradient.
165 |       # pyformat: disable
166 |       ndc_init = np.array(
167 |           [[-0.43889722, -0.53184521, 0.85293502, 1.0],
168 |            [-0.37635487, 0.22206162, 0.90555805, 1.0],
169 |            [-0.22849123, 0.76811147, 0.80993629, 1.0],
170 |            [-0.2805393, -0.14092168, 0.71602166, 1.0],
171 |            [0.18631913, -0.62634289, 0.88603103, 1.0],
172 |            [0.16183566, 0.08129397, 0.93020856, 1.0],
173 |            [0.44147962, 0.53497446, 0.85076219, 1.0],
174 |            [0.53008741, -0.31276882, 0.77620775, 1.0]],
175 |           dtype=np.float32)
176 |       # pyformat: enable
177 |       theoretical, numerical = tf.test.compute_gradient(
178 |           clip_coordinates, (8, 4),
179 |           barycentric_coordinates, (image_height, image_width, 3),
180 |           x_init_value=ndc_init,
181 |           delta=4e-2)
182 |       jacobians_match, message = (
183 |           test_utils.check_jacobians_are_nearly_equal(
184 |               theoretical, numerical, 0.01, 0.01))
185 |       self.assertTrue(jacobians_match, message)
186 | 
187 | 
188 | if __name__ == '__main__':
189 |   tf.test.main()
190 | 
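
Editorial sketch (not part of the repository): `runTriangleTest` scales each clip-space vertex by a `w` value, which leaves the triangle's screen position unchanged but alters the perspective-correct barycentric coordinates. The pure-Python example below shows the standard textbook formula for that correction. The function names are hypothetical, and the custom kernel may organize the computation differently.

```python
def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p) in the xy-plane."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def scale_by_w(ndc_vertices, w_vector):
    """Builds clip-space rows (x*w, y*w, z*w, w), as runTriangleTest does."""
    return [(x * w, y * w, z * w, w)
            for (x, y, z), w in zip(ndc_vertices, w_vector)]


def perspective_correct_barycentrics(clip_vertices, ndc_point):
    """Barycentrics of a 2D NDC point, corrected for perspective.

    clip_vertices: three (x, y, z, w) tuples with non-zero w.
    ndc_point: (x, y) position inside the projected triangle.
    """
    # Perspective divide: clip-space xy -> NDC xy.
    ndc = [(v[0] / v[3], v[1] / v[3]) for v in clip_vertices]
    area = edge(ndc[0], ndc[1], ndc[2])

    # Screen-space (affine) barycentrics from edge functions.
    screen = [
        edge(ndc[1], ndc[2], ndc_point) / area,
        edge(ndc[2], ndc[0], ndc_point) / area,
        edge(ndc[0], ndc[1], ndc_point) / area,
    ]

    # Divide each weight by its vertex w, then renormalize to sum to one.
    weighted = [b / v[3] for b, v in zip(screen, clip_vertices)]
    total = sum(weighted)
    return [b / total for b in weighted]


# Same NDC triangle as in runTriangleTest, sampled at its screen centroid.
triangle_ndc = [(-0.5, -0.5, 0.8), (0.0, 0.5, 0.3), (0.5, -0.5, 0.3)]
centroid = (0.0, -1.0 / 6.0)
flat = perspective_correct_barycentrics(
    scale_by_w(triangle_ndc, (1.0, 1.0, 1.0)), centroid)
corrected = perspective_correct_barycentrics(
    scale_by_w(triangle_ndc, (0.2, 0.5, 2.0)), centroid)
```

With all `w` equal to one the weights at the centroid are equal; with the `w` vector used by `testRendersPerspectiveCorrectTriangle`, the weight shifts toward the vertex with the smallest `w`, which is the visible difference between the two baseline images.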


--------------------------------------------------------------------------------
/mesh_renderer/test_data/BUILD:
--------------------------------------------------------------------------------
1 | package(default_visibility = ["//visibility:public"])
2 | 
3 | filegroup(
4 |     name = "images",
5 |     srcs = glob(["*.png"]),
6 | )
7 | 


--------------------------------------------------------------------------------
/mesh_renderer/test_data/Barycentrics_Cube.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Barycentrics_Cube.png


--------------------------------------------------------------------------------
/mesh_renderer/test_data/Colored_Cube_0.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Colored_Cube_0.png


--------------------------------------------------------------------------------
/mesh_renderer/test_data/Colored_Cube_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Colored_Cube_1.png


--------------------------------------------------------------------------------
/mesh_renderer/test_data/External_Triangle.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/External_Triangle.png


--------------------------------------------------------------------------------
/mesh_renderer/test_data/Gray_Cube_0.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Gray_Cube_0.png


--------------------------------------------------------------------------------
/mesh_renderer/test_data/Gray_Cube_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Gray_Cube_1.png


--------------------------------------------------------------------------------
/mesh_renderer/test_data/Inside_Box.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Inside_Box.png


--------------------------------------------------------------------------------
/mesh_renderer/test_data/Perspective_Corrected_Triangle.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Perspective_Corrected_Triangle.png


--------------------------------------------------------------------------------
/mesh_renderer/test_data/Simple_Tetrahedron.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Simple_Tetrahedron.png


--------------------------------------------------------------------------------
/mesh_renderer/test_data/Simple_Triangle.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Simple_Triangle.png


--------------------------------------------------------------------------------
/mesh_renderer/test_data/Unlit_Cube_0.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/JonathanRaiman/tf_mesh_renderer/5c84139ef6ec1939e5701f9788579b8416028bdd/mesh_renderer/test_data/Unlit_Cube_0.png


--------------------------------------------------------------------------------
/mesh_renderer/test_utils.py:
--------------------------------------------------------------------------------
  1 | # Copyright 2017 Google LLC
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     https://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | 
 15 | """Common functions for the rasterizer and mesh renderer tests."""
 16 | 
 17 | from __future__ import absolute_import
 18 | from __future__ import division
 19 | from __future__ import print_function
 20 | 
 21 | import os
 22 | import numpy as np
 23 | import tensorflow as tf
 24 | 
 25 | 
 26 | def check_jacobians_are_nearly_equal(theoretical,
 27 |                                      numerical,
 28 |                                      outlier_relative_error_threshold,
 29 |                                      max_outlier_fraction,
 30 |                                      include_jacobians_in_error_message=False):
 31 |   """Compares two Jacobian matrices, allowing for some fraction of outliers.
 32 | 
 33 |   Args:
 34 |     theoretical: 2D numpy array containing a Jacobian matrix with entries
 35 |         computed via gradient functions. The layout should be as in the output
 36 |         of gradient_checker.
 37 |     numerical: 2D numpy array of the same shape as theoretical containing a
 38 |         Jacobian matrix with entries computed via finite difference
 39 |         approximations. The layout should be as in the output
 40 |         of gradient_checker.
 41 |     outlier_relative_error_threshold: float prescribing the maximum relative
 42 |         error (from the finite difference approximation) tolerated before
 43 |         an entry is considered an outlier.
 44 |     max_outlier_fraction: float defining the maximum fraction of entries in
 45 |         theoretical that may be outliers before the check returns False.
 46 |     include_jacobians_in_error_message: bool defining whether the jacobian
 47 |         matrices should be included in the return message should the test fail.
 48 | 
 49 |   Returns:
 50 |     A tuple where the first entry is a boolean that is True when the fraction
 51 |     of outliers does not exceed max_outlier_fraction, and where the second
 52 |     entry is a string describing the outlier fraction (for failure messages).
 53 |   """
 54 |   outlier_gradients = (np.abs(numerical - theoretical) / np.abs(numerical) >
 55 |                        outlier_relative_error_threshold)
 56 |   outlier_fraction = np.count_nonzero(outlier_gradients) / np.prod(
 57 |       numerical.shape[:2])
 58 |   jacobians_match = outlier_fraction <= max_outlier_fraction
 59 | 
 60 |   message = (
 61 |       ' %f of theoretical gradients are relative outliers, but the maximum'
 62 |       ' allowable fraction is %f ' % (outlier_fraction, max_outlier_fraction))
 63 |   if include_jacobians_in_error_message:
 64 |     # the gradient_checker convention is the typical Jacobian transposed:
 65 |     message += ('\nNumerical Jacobian:\n%s\nTheoretical Jacobian:\n%s' %
 66 |                 (repr(numerical.T), repr(theoretical.T)))
 67 |   return jacobians_match, message
 68 | 
 69 | 
 70 | def expect_image_file_and_render_are_near(test_instance,
 71 |                                           sess,
 72 |                                           baseline_path,
 73 |                                           result_image,
 74 |                                           max_outlier_fraction=0.001,
 75 |                                           pixel_error_threshold=0.01):
 76 |   """Compares the output of mesh_renderer with an image on disk.
 77 | 
 78 |   The comparison is soft: the images are considered identical if at most
 79 |   max_outlier_fraction of the pixels have any channel that differs by more
 80 |   than pixel_error_threshold, measured as a fraction of the full color range.
 81 |   Note that before comparison, mesh renderer values are clipped to [0, 1].
 82 | 
 83 |   Uses _images_are_near for the actual comparison.
 84 | 
 85 |   Args:
 86 |     test_instance: a python unit test instance.
 87 |     sess: a TensorFlow session for decoding the png.
 88 |     baseline_path: path to the reference image on disk.
 89 |     result_image: the result image, as a numpy array.
 90 |     max_outlier_fraction: the maximum fraction of outlier pixels allowed.
 91 |     pixel_error_threshold: pixel values are considered to differ if their
 92 |       difference exceeds this amount. Range is 0.0 - 1.0.
 93 |   """
 94 |   with open(baseline_path, 'rb') as baseline_file:
 95 |     baseline_image = sess.run(tf.image.decode_png(baseline_file.read()))
 96 | 
 97 |   test_instance.assertEqual(baseline_image.shape, result_image.shape,
 98 |                             'Image shapes %s and %s do not match.' %
 99 |                             (baseline_image.shape, result_image.shape))
100 | 
101 |   result_image = np.clip(result_image, 0., 1.).copy(order='C')
102 |   baseline_image = baseline_image.astype(float) / 255.0
103 | 
104 |   outlier_channels = (np.abs(baseline_image - result_image) >
105 |                       pixel_error_threshold)
106 |   outlier_pixels = np.any(outlier_channels, axis=2)
107 |   outlier_count = np.count_nonzero(outlier_pixels)
108 |   outlier_fraction = outlier_count / np.prod(baseline_image.shape[:2])
109 |   images_match = outlier_fraction <= max_outlier_fraction
110 | 
111 |   outputs_dir = os.environ.get("TEST_TMPDIR", "/tmp")
112 |   base_prefix = os.path.splitext(os.path.basename(baseline_path))[0]
113 |   result_output_path = os.path.join(outputs_dir, base_prefix + "_result.png")
114 | 
115 |   message = ('%s does not match. (%f of pixels are outliers, %f is allowed). '
116 |              'Result image written to %s' %
117 |              (baseline_path, outlier_fraction, max_outlier_fraction, result_output_path))
118 | 
119 |   if not images_match:
120 |     result_bytes = sess.run(tf.image.encode_png(np.uint8(result_image * 255.0)))
121 |     with open(result_output_path, 'wb') as output_file:
122 |       output_file.write(result_bytes)
123 | 
124 |   test_instance.assertTrue(images_match, msg=message)
125 | 
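
The soft image comparison in expect_image_file_and_render_are_near can be tried without a TensorFlow session. The following is a minimal NumPy-only sketch of the same outlier-fraction test; the helper name `outlier_fraction` and the 4x4 sample images are assumptions made for illustration and are not part of test_utils.py.

```python
import numpy as np


def outlier_fraction(baseline, result, pixel_error_threshold=0.01):
    """Returns the fraction of pixels with any channel beyond the threshold.

    Both images are expected as float arrays in [0, 1] with shape
    (height, width, channels).
    """
    # A channel is an outlier if its absolute difference exceeds the threshold.
    outlier_channels = np.abs(baseline - result) > pixel_error_threshold
    # A pixel is an outlier if any of its channels is an outlier.
    outlier_pixels = np.any(outlier_channels, axis=2)
    return np.count_nonzero(outlier_pixels) / float(outlier_pixels.size)


# Hypothetical 4x4 RGB images that differ in one channel of one pixel.
baseline = np.zeros((4, 4, 3))
result = baseline.copy()
result[0, 0, 1] = 0.5

fraction = outlier_fraction(baseline, result)
images_match = fraction <= 0.001  # same default as max_outlier_fraction
```

With one differing pixel out of sixteen, the fraction is well above the default max_outlier_fraction of 0.001, so these two sample images would not be treated as matching.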


--------------------------------------------------------------------------------
/rasterize_triangles.py:
--------------------------------------------------------------------------------
  1 | # Copyright 2017 Google LLC
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     https://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | 
 15 | """Differentiable triangle rasterizer."""
 16 | 
 17 | from __future__ import absolute_import
 18 | from __future__ import division
 19 | from __future__ import print_function
 20 | 
 21 | import os.path as osp
 22 | import tensorflow as tf
 23 | 
 24 | import camera_utils
 25 | 
 26 | 
 27 | def get_ext_filename(ext_name):
 28 |     from distutils.sysconfig import get_config_var
 29 |     ext_path = ext_name.split('.')
 30 |     ext_suffix = get_config_var('EXT_SUFFIX')
 31 |     return osp.join(*ext_path) + ext_suffix
 32 | 
 33 | 
 34 | rasterize_triangles_module_path = osp.join(osp.dirname(osp.realpath(__file__)), get_ext_filename('mesh_renderer_lib'))
 35 | rasterize_triangles_module = tf.load_op_library(rasterize_triangles_module_path)
 36 | 
 37 | 
 38 | def rasterize(world_space_vertices, attributes, triangles, camera_matrices,
 39 |               image_width, image_height, background_value):
 40 |   """Rasterizes a mesh and computes interpolated vertex attributes.
 41 | 
 42 |   Applies projection matrices and then calls rasterize_clip_space().
 43 | 
 44 |   Args:
 45 |     world_space_vertices: 3-D float32 tensor of xyz positions with shape
 46 |       [batch_size, vertex_count, 3].
 47 |     attributes: 3-D float32 tensor with shape [batch_size, vertex_count,
 48 |       attribute_count]. Each vertex attribute is interpolated across the
 49 |       triangle using barycentric interpolation.
 50 |     triangles: 2-D int32 tensor with shape [triangle_count, 3]. Each triplet
 51 |       should contain vertex indices describing a triangle such that the
 52 |       triangle's normal points toward the viewer if the forward order of the
 53 |       triplet defines a clockwise winding of the vertices. Gradients with
 54 |       respect to this tensor are not available.
 55 |     camera_matrices: 3-D float tensor with shape [batch_size, 4, 4] containing
 56 |       model-view-perspective projection matrices.
 57 |     image_width: int specifying desired output image width in pixels.
 58 |     image_height: int specifying desired output image height in pixels.
 59 |     background_value: a 1-D float32 tensor with shape [attribute_count]. Pixels
 60 |       that lie outside all triangles take this value.
 61 | 
 62 |   Returns:
 63 |     A 4-D float32 tensor with shape [batch_size, image_height, image_width,
 64 |     attribute_count], containing the interpolated vertex attributes at
 65 |     each pixel.
 66 | 
 67 |   Raises:
 68 |     ValueError: An invalid argument to the method is detected.
 69 |   """
 70 |   clip_space_vertices = camera_utils.transform_homogeneous(
 71 |       camera_matrices, world_space_vertices)
 72 |   return rasterize_clip_space(clip_space_vertices, attributes, triangles,
 73 |                               image_width, image_height, background_value)
 74 | 
 75 | 
 76 | def rasterize_clip_space(clip_space_vertices, attributes, triangles,
 77 |                          image_width, image_height, background_value):
 78 |   """Rasterizes the input mesh expressed in clip-space (xyzw) coordinates.
 79 | 
 80 |   Interpolates vertex attributes using perspective-correct interpolation and
 81 |   clips triangles that lie outside the viewing frustum.
 82 | 
 83 |   Args:
 84 |     clip_space_vertices: 3-D float32 tensor of homogeneous vertices (xyzw) with
 85 |       shape [batch_size, vertex_count, 4].
 86 |     attributes: 3-D float32 tensor with shape [batch_size, vertex_count,
 87 |       attribute_count]. Each vertex attribute is interpolated across the
 88 |       triangle using barycentric interpolation.
 89 |     triangles: 2-D int32 tensor with shape [triangle_count, 3]. Each triplet
 90 |       should contain vertex indices describing a triangle such that the
 91 |       triangle's normal points toward the viewer if the forward order of the
 92 |       triplet defines a clockwise winding of the vertices. Gradients with
 93 |       respect to this tensor are not available.
 94 |     image_width: int specifying desired output image width in pixels.
 95 |     image_height: int specifying desired output image height in pixels.
 96 |     background_value: a 1-D float32 tensor with shape [attribute_count]. Pixels
 97 |       that lie outside all triangles take this value.
 98 | 
 99 |   Returns:
100 |     A 4-D float32 tensor with shape [batch_size, image_height, image_width,
101 |     attribute_count], containing the interpolated vertex attributes at
102 |     each pixel.
103 | 
104 |   Raises:
105 |     ValueError: An invalid argument to the method is detected.
106 |   """
107 |   if not image_width > 0:
108 |     raise ValueError('Image width must be > 0.')
109 |   if not image_height > 0:
110 |     raise ValueError('Image height must be > 0.')
111 |   if len(clip_space_vertices.shape) != 3:
112 |     raise ValueError('The vertex buffer must be 3D.')
113 |   batch_size = clip_space_vertices.shape[0].value
114 |   vertex_count = clip_space_vertices.shape[1].value
115 | 
116 |   per_image_barycentric_coordinates = []
117 |   per_image_vertex_ids = []
118 |   for im in range(batch_size):
119 |     barycentric_coords, triangle_ids, _ = (
120 |         rasterize_triangles_module.rasterize_triangles(
121 |             clip_space_vertices[im, :, :], triangles, image_width,
122 |             image_height))
123 |     per_image_barycentric_coordinates.append(
124 |         tf.reshape(barycentric_coords, [-1, 3]))
125 | 
126 |     # Gathers the vertex indices now because the indices don't contain a batch
127 |     # identifier, and reindexes the vertex ids to point to a (batch,vertex_id)
128 |     vertex_ids = tf.gather(triangles, tf.reshape(triangle_ids, [-1]))
129 |     reindexed_ids = tf.add(vertex_ids, im * vertex_count)
130 |     per_image_vertex_ids.append(reindexed_ids)
131 | 
132 |   barycentric_coordinates = tf.concat(per_image_barycentric_coordinates, axis=0)
133 |   vertex_ids = tf.concat(per_image_vertex_ids, axis=0)
134 | 
135 |   # Indexes with each pixel's clip-space triangle's extrema (the pixel's
136 |   # 'corner points') ids to get the relevant properties for deferred shading.
137 |   flattened_vertex_attributes = tf.reshape(attributes,
138 |                                            [batch_size * vertex_count, -1])
139 |   corner_attributes = tf.gather(flattened_vertex_attributes, vertex_ids)
140 | 
141 |   # Computes the pixel attributes by interpolating the known attributes at the
142 |   # corner points of the triangle interpolated with the barycentric coordinates.
143 |   weighted_vertex_attributes = tf.multiply(
144 |       corner_attributes, tf.expand_dims(barycentric_coordinates, axis=2))
145 |   summed_attributes = tf.reduce_sum(weighted_vertex_attributes, axis=1)
146 |   attribute_images = tf.reshape(summed_attributes,
147 |                                 [batch_size, image_height, image_width, -1])
148 | 
149 |   # Barycentric coordinates should approximately sum to one where there is
150 |   # rendered geometry, but be exactly zero where there is not.
151 |   alphas = tf.clip_by_value(
152 |       tf.reduce_sum(2.0 * barycentric_coordinates, axis=1), 0.0, 1.0)
153 |   alphas = tf.reshape(alphas, [batch_size, image_height, image_width, 1])
154 | 
155 |   attributes_with_background = (
156 |       alphas * attribute_images + (1.0 - alphas) * background_value)
157 | 
158 |   return attributes_with_background
159 | 
160 | 
161 | @tf.RegisterGradient('RasterizeTriangles')
162 | def _rasterize_triangles_grad(op, df_dbarys, df_dids, df_dz):
163 |   # Gradients are only supported for barycentric coordinates. Gradients for the
164 |   # z-buffer are not currently implemented. If you need gradients w.r.t. z,
165 |   # include z as a vertex attribute when calling rasterize().
166 |   del df_dids, df_dz
167 |   return rasterize_triangles_module.rasterize_triangles_grad(
168 |       op.inputs[0], op.inputs[1], op.outputs[0], op.outputs[1], df_dbarys,
169 |       op.get_attr('image_width'), op.get_attr('image_height')), None
170 | 
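
The attribute interpolation and background blending performed at the end of rasterize_clip_space can be followed on a tiny example. The following is a minimal NumPy sketch of those steps for a single image, assuming the barycentric coordinates and per-pixel vertex ids have already been produced by the rasterizer; the function name `interpolate_attributes` and the sample values are hypothetical and not part of rasterize_triangles.py.

```python
import numpy as np


def interpolate_attributes(barycentrics, vertex_ids, attributes, background):
    """Interpolates per-vertex attributes and blends in a background value.

    barycentrics: (pixel_count, 3) float array, all zeros where nothing is drawn.
    vertex_ids:   (pixel_count, 3) int array of corner vertex indices.
    attributes:   (vertex_count, attribute_count) float array.
    background:   (attribute_count,) float array.
    """
    # Gather the attributes at the three corners of each pixel's triangle.
    corner_attributes = attributes[vertex_ids]  # (pixel_count, 3, attribute_count)
    # Weight each corner by its barycentric coordinate and sum over corners.
    weighted = corner_attributes * barycentrics[:, :, np.newaxis]
    interpolated = weighted.sum(axis=1)
    # Covered pixels have coordinates summing to about 1; doubling and
    # clipping makes alpha 1 there and leaves it at 0 for empty pixels.
    alphas = np.clip(2.0 * barycentrics.sum(axis=1), 0.0, 1.0)[:, np.newaxis]
    return alphas * interpolated + (1.0 - alphas) * background


# Hypothetical data: one triangle with red, green, and blue corners,
# one covered pixel and one empty pixel.
attributes = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
barycentrics = np.array([[0.2, 0.3, 0.5], [0.0, 0.0, 0.0]])
vertex_ids = np.array([[0, 1, 2], [0, 1, 2]])
background = np.array([0.5, 0.5, 0.5])

pixels = interpolate_attributes(barycentrics, vertex_ids, attributes, background)
```

The covered pixel takes the barycentric mix of the corner colors, while the empty pixel falls back to the background value because its alpha is zero.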


--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
  1 | import os.path as osp
  2 | import sys
  3 | from distutils.core import Extension, setup
  4 | from distutils.command.build_ext import build_ext
  5 | import subprocess
  6 | from distutils.version import StrictVersion
  7 | 
  8 | SCRIPT_DIR = osp.dirname(osp.realpath(__file__))
  9 | 
 10 | extra_compile_args = ['-std=c++11']
 11 | extra_link_args = []
 12 | 
 13 | 
 14 | if sys.platform == 'darwin':
 15 |     extra_compile_args.append('-stdlib=libc++')
 16 |     extra_link_args.append('-stdlib=libc++')
 17 | 
 18 | include_dirs = ['/usr/local/include']
 19 | library_dirs = ['/usr/local/lib']
 20 | 
 21 | mesh_renderer_lib = Extension(
 22 |     name='mesh_renderer_lib',
 23 |     sources=['mesh_renderer/kernels/rasterize_triangles_op.cc',
 24 |              'mesh_renderer/kernels/rasterize_triangles_impl.cc',
 25 |              'mesh_renderer/kernels/rasterize_triangles_grad.cc'],
 26 |     include_dirs=include_dirs,
 27 |     library_dirs=library_dirs,
 28 |     libraries=[],
 29 |     language='c++',
 30 |     extra_compile_args=extra_compile_args,
 31 |     extra_link_args=extra_link_args,
 32 | )
 33 | 
 34 | 
 35 | def get_version():
 36 |     with open('VERSION', 'r') as fp:
 37 |         version = fp.readline()
 38 |         return version.strip()
 39 | 
 40 | 
 41 | REMOVE_FLAGS = {'-Wstrict-prototypes'}
 42 | 
 43 | 
 44 | class mesh_renderer_build_ext(build_ext):
 45 |     def get_libraries(self, ext):
 46 |         libs = super(mesh_renderer_build_ext, self).get_libraries(ext)
 47 |         last_lib = [libs[-1], "python{}".format(sys.version_info.major)]
 48 |         actual_python_libname = None
 49 |         for ll in last_lib:
 50 |             if actual_python_libname is not None:
 51 |                 break
 52 |             for path in self.library_dirs:
 53 |                 if osp.exists(osp.join(path, "lib" + ll + ".so")):
 54 |                     actual_python_libname = ll
 55 |                     break
 56 |         if actual_python_libname is None:
 57 |             actual_python_libname = libs[-1]
 58 |         return libs[:-1] + [actual_python_libname]
 59 | 
 60 |     def build_extensions(self):
 61 |         extension = self.extensions[0]
 62 |         assert extension.name == 'mesh_renderer_lib'
 63 | 
 64 |         import tensorflow as tf
 65 |         # Read more at https://www.tensorflow.org/versions/r1.1/extend/adding_an_op
 66 | 
 67 |         # Suffixes such as "-rc1" follow the patch number, so the major and minor parts parse as ints.
 68 |         tf_version_major, tf_version_minor = map(int, tf.__version__.split(".")[:2])
 69 |         assert tf_version_major == 1
 70 | 
 71 |         if "__cxx11_abi_flag__" in tf.__dict__:
 72 |             abi = tf.__cxx11_abi_flag__
 73 |         else:
 74 |             tf_path = tf.__path__[0]
 75 |             if tf_version_minor >= 4:
 76 |                 tf_so = '{}/libtensorflow_framework.so'.format(tf_path)
 77 |             else:
 78 |                 tf_so = '{}/python/_pywrap_tensorflow_internal.so'.format(tf_path)
 79 | 
 80 |             gcc_from_tf = subprocess.check_output("strings " + tf_so + " | grep GCC | grep ubuntu | uniq || true", shell=True).decode("utf-8").strip()
 81 |             clang_from_tf = subprocess.check_output("strings " + tf_so + " | grep clang || true", shell=True).decode("utf-8").strip()
 82 | 
 83 |             if len(gcc_from_tf) > 5:
 84 |                 assert len(gcc_from_tf) > 5, "Cannot extract tensorflow gcc version."
 85 |                 gcc_version = gcc_from_tf.split("-")[0].split(" ")[-1]
 86 |                 print("Detected gcc_version: %s" % gcc_version)
 87 |                 if StrictVersion(gcc_version) < StrictVersion("5.0"):
 88 |                     abi = 0
 89 |                 else:
 90 |                     abi = 1
 91 |             elif len(clang_from_tf) > 0:
 92 |                 abi = 1
 93 |             else:
 94 |                 raise ValueError("Could not detect gcc or clang in the TensorFlow library; unable to choose _GLIBCXX_USE_CXX11_ABI.")
 95 |         extension.extra_compile_args.append('-D_GLIBCXX_USE_CXX11_ABI=%s' % abi)
 96 | 
 97 |         extension.include_dirs.append(tf.sysconfig.get_include())
 98 |         if tf_version_minor >= 4:
 99 |             extension.extra_link_args.append('-ltensorflow_framework')
100 |             extension.include_dirs.append(tf.sysconfig.get_include() + '/external/nsync/public')
101 |             extension.library_dirs.append(tf.sysconfig.get_lib())
102 | 
103 |         self.compiler.compiler_so = list(filter(lambda flag: flag not in REMOVE_FLAGS, self.compiler.compiler_so))
104 |         super(mesh_renderer_build_ext, self).build_extensions()
105 | 
106 | 
107 | setup(
108 |     name='mesh_renderer',
109 |     version=get_version(),
110 |     cmdclass={'build_ext': mesh_renderer_build_ext},
111 |     py_modules=['mesh_renderer', 'camera_utils', 'rasterize_triangles'],
112 |     ext_modules=[mesh_renderer_lib],
113 |     description='TF rendering',
114 |     author='Jonathan Raiman',
115 |     author_email='raiman@openai.com',
116 |     install_requires=[],
117 | )
118 | 
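
The version handling in build_extensions relies on two small rules: only the first two dotted components of a version string are parsed, and a gcc older than 5.0 maps to the old C++ ABI. The following is a minimal sketch of those two rules in isolation; the helper names `major_minor` and `cxx11_abi_for_gcc` are hypothetical, and the sketch assumes version strings with at least two dotted components.

```python
def major_minor(version):
    """Returns (major, minor) as ints from a dotted version string.

    Only the first two components are parsed, so a suffix on the patch
    component (for example "-rc1") is never converted to an int.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return major, minor


def cxx11_abi_for_gcc(gcc_version):
    """Returns 0 for gcc older than 5.0 and 1 otherwise."""
    return 0 if major_minor(gcc_version) < (5, 0) else 1


# Sample version strings chosen for illustration only.
tf_major, tf_minor = major_minor("1.4.0-rc1")
old_abi = cxx11_abi_for_gcc("4.8.4")
new_abi = cxx11_abi_for_gcc("5.4.0")
```

Comparing (major, minor) tuples avoids a dependency on distutils.version, at the cost of ignoring the patch component.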


--------------------------------------------------------------------------------
/third_party/lodepng.h:
--------------------------------------------------------------------------------
   1 | /*
   2 | LodePNG version 20170917
   3 | 
   4 | Copyright (c) 2005-2017 Lode Vandevenne
   5 | 
   6 | This software is provided 'as-is', without any express or implied
   7 | warranty. In no event will the authors be held liable for any damages
   8 | arising from the use of this software.
   9 | 
  10 | Permission is granted to anyone to use this software for any purpose,
  11 | including commercial applications, and to alter it and redistribute it
  12 | freely, subject to the following restrictions:
  13 | 
  14 |     1. The origin of this software must not be misrepresented; you must not
  15 |     claim that you wrote the original software. If you use this software
  16 |     in a product, an acknowledgment in the product documentation would be
  17 |     appreciated but is not required.
  18 | 
  19 |     2. Altered source versions must be plainly marked as such, and must not be
  20 |     misrepresented as being the original software.
  21 | 
  22 |     3. This notice may not be removed or altered from any source
  23 |     distribution.
  24 | */
  25 | 
  26 | #ifndef LODEPNG_H
  27 | #define LODEPNG_H
  28 | 
  29 | #include <string.h> /*for size_t*/
  30 | 
  31 | extern const char* LODEPNG_VERSION_STRING;
  32 | 
  33 | /*
  34 | The following #defines are used to create code sections. They can be disabled
  35 | to disable code sections, which can give faster compile time and smaller binary.
  36 | The "NO_COMPILE" defines are designed to be used to pass as defines to the
  37 | compiler command to disable them without modifying this header, e.g.
  38 | -DLODEPNG_NO_COMPILE_ZLIB for gcc.
  39 | In addition to those below, you can also define LODEPNG_NO_COMPILE_CRC to
  40 | allow implementing a custom lodepng_crc32.
  41 | */
  42 | /*deflate & zlib. If disabled, you must specify alternative zlib functions in
  43 | the custom_zlib field of the compress and decompress settings*/
  44 | #ifndef LODEPNG_NO_COMPILE_ZLIB
  45 | #define LODEPNG_COMPILE_ZLIB
  46 | #endif
  47 | /*png encoder and png decoder*/
  48 | #ifndef LODEPNG_NO_COMPILE_PNG
  49 | #define LODEPNG_COMPILE_PNG
  50 | #endif
  51 | /*deflate&zlib decoder and png decoder*/
  52 | #ifndef LODEPNG_NO_COMPILE_DECODER
  53 | #define LODEPNG_COMPILE_DECODER
  54 | #endif
  55 | /*deflate&zlib encoder and png encoder*/
  56 | #ifndef LODEPNG_NO_COMPILE_ENCODER
  57 | #define LODEPNG_COMPILE_ENCODER
  58 | #endif
  59 | /*the optional built in harddisk file loading and saving functions*/
  60 | #ifndef LODEPNG_NO_COMPILE_DISK
  61 | #define LODEPNG_COMPILE_DISK
  62 | #endif
  63 | /*support for chunks other than IHDR, IDAT, PLTE, tRNS, IEND: ancillary and unknown chunks*/
  64 | #ifndef LODEPNG_NO_COMPILE_ANCILLARY_CHUNKS
  65 | #define LODEPNG_COMPILE_ANCILLARY_CHUNKS
  66 | #endif
  67 | /*ability to convert error numerical codes to English text string*/
  68 | #ifndef LODEPNG_NO_COMPILE_ERROR_TEXT
  69 | #define LODEPNG_COMPILE_ERROR_TEXT
  70 | #endif
  71 | /*Compile the default allocators (C's free, malloc and realloc). If you disable this,
  72 | you can define the functions lodepng_free, lodepng_malloc and lodepng_realloc in your
  73 | source files with custom allocators.*/
  74 | #ifndef LODEPNG_NO_COMPILE_ALLOCATORS
  75 | #define LODEPNG_COMPILE_ALLOCATORS
  76 | #endif
  77 | /*compile the C++ version (you can disable the C++ wrapper here even when compiling for C++)*/
  78 | #ifdef __cplusplus
  79 | #ifndef LODEPNG_NO_COMPILE_CPP
  80 | #define LODEPNG_COMPILE_CPP
  81 | #endif
  82 | #endif
  83 | 
  84 | #ifdef LODEPNG_COMPILE_CPP
  85 | #include <vector>
  86 | #include <string>
  87 | #endif /*LODEPNG_COMPILE_CPP*/
  88 | 
  89 | #ifdef LODEPNG_COMPILE_PNG
  90 | /*The PNG color types (also used for raw).*/
  91 | typedef enum LodePNGColorType
  92 | {
  93 |   LCT_GREY = 0, /*greyscale: 1,2,4,8,16 bit*/
  94 |   LCT_RGB = 2, /*RGB: 8,16 bit*/
  95 |   LCT_PALETTE = 3, /*palette: 1,2,4,8 bit*/
  96 |   LCT_GREY_ALPHA = 4, /*greyscale with alpha: 8,16 bit*/
  97 |   LCT_RGBA = 6 /*RGB with alpha: 8,16 bit*/
  98 | } LodePNGColorType;
  99 | 
 100 | #ifdef LODEPNG_COMPILE_DECODER
 101 | /*
 102 | Converts PNG data in memory to raw pixel data.
 103 | out: Output parameter. Pointer to buffer that will contain the raw pixel data.
 104 |      After decoding, its size is w * h * (bytes per pixel) bytes larger than
 105 |      initially. Bytes per pixel depends on colortype and bitdepth.
 106 |      Must be freed after usage with free(*out).
 107 |      Note: for 16-bit per channel colors, uses big endian format like PNG does.
 108 | w: Output parameter. Pointer to width of pixel data.
 109 | h: Output parameter. Pointer to height of pixel data.
 110 | in: Memory buffer with the PNG file.
 111 | insize: size of the in buffer.
 112 | colortype: the desired color type for the raw output image. See explanation on PNG color types.
 113 | bitdepth: the desired bit depth for the raw output image. See explanation on PNG color types.
 114 | Return value: LodePNG error code (0 means no error).
 115 | */
 116 | unsigned lodepng_decode_memory(unsigned char** out, unsigned* w, unsigned* h,
 117 |                                const unsigned char* in, size_t insize,
 118 |                                LodePNGColorType colortype, unsigned bitdepth);
 119 | 
 120 | /*Same as lodepng_decode_memory, but always decodes to 32-bit RGBA raw image*/
 121 | unsigned lodepng_decode32(unsigned char** out, unsigned* w, unsigned* h,
 122 |                           const unsigned char* in, size_t insize);
 123 | 
 124 | /*Same as lodepng_decode_memory, but always decodes to 24-bit RGB raw image*/
 125 | unsigned lodepng_decode24(unsigned char** out, unsigned* w, unsigned* h,
 126 |                           const unsigned char* in, size_t insize);
 127 | 
 128 | #ifdef LODEPNG_COMPILE_DISK
 129 | /*
 130 | Load PNG from disk, from file with given name.
 131 | Same as the other decode functions, but instead takes a filename as input.
 132 | */
 133 | unsigned lodepng_decode_file(unsigned char** out, unsigned* w, unsigned* h,
 134 |                              const char* filename,
 135 |                              LodePNGColorType colortype, unsigned bitdepth);
 136 | 
 137 | /*Same as lodepng_decode_file, but always decodes to 32-bit RGBA raw image.*/
 138 | unsigned lodepng_decode32_file(unsigned char** out, unsigned* w, unsigned* h,
 139 |                                const char* filename);
 140 | 
 141 | /*Same as lodepng_decode_file, but always decodes to 24-bit RGB raw image.*/
 142 | unsigned lodepng_decode24_file(unsigned char** out, unsigned* w, unsigned* h,
 143 |                                const char* filename);
 144 | #endif /*LODEPNG_COMPILE_DISK*/
 145 | #endif /*LODEPNG_COMPILE_DECODER*/
 146 | 
 147 | 
 148 | #ifdef LODEPNG_COMPILE_ENCODER
 149 | /*
 150 | Converts raw pixel data into a PNG image in memory. The colortype and bitdepth
 151 |   of the output PNG image cannot be chosen, they are automatically determined
 152 |   by the colortype, bitdepth and content of the input pixel data.
 153 |   Note: for 16-bit per channel colors, needs big endian format like PNG does.
 154 | out: Output parameter. Pointer to buffer that will contain the PNG image data.
 155 |      Must be freed after usage with free(*out).
 156 | outsize: Output parameter. Pointer to the size in bytes of the out buffer.
 157 | image: The raw pixel data to encode. The size of this buffer should be
 158 |        w * h * (bytes per pixel), bytes per pixel depends on colortype and bitdepth.
 159 | w: width of the raw pixel data in pixels.
 160 | h: height of the raw pixel data in pixels.
 161 | colortype: the color type of the raw input image. See explanation on PNG color types.
 162 | bitdepth: the bit depth of the raw input image. See explanation on PNG color types.
 163 | Return value: LodePNG error code (0 means no error).
 164 | */
 165 | unsigned lodepng_encode_memory(unsigned char** out, size_t* outsize,
 166 |                                const unsigned char* image, unsigned w, unsigned h,
 167 |                                LodePNGColorType colortype, unsigned bitdepth);
 168 | 
 169 | /*Same as lodepng_encode_memory, but always encodes from 32-bit RGBA raw image.*/
 170 | unsigned lodepng_encode32(unsigned char** out, size_t* outsize,
 171 |                           const unsigned char* image, unsigned w, unsigned h);
 172 | 
 173 | /*Same as lodepng_encode_memory, but always encodes from 24-bit RGB raw image.*/
 174 | unsigned lodepng_encode24(unsigned char** out, size_t* outsize,
 175 |                           const unsigned char* image, unsigned w, unsigned h);
 176 | 
 177 | #ifdef LODEPNG_COMPILE_DISK
 178 | /*
 179 | Converts raw pixel data into a PNG file on disk.
 180 | Same as the other encode functions, but instead takes a filename as output.
 181 | NOTE: This overwrites existing files without warning!
 182 | */
 183 | unsigned lodepng_encode_file(const char* filename,
 184 |                              const unsigned char* image, unsigned w, unsigned h,
 185 |                              LodePNGColorType colortype, unsigned bitdepth);
 186 | 
 187 | /*Same as lodepng_encode_file, but always encodes from 32-bit RGBA raw image.*/
 188 | unsigned lodepng_encode32_file(const char* filename,
 189 |                                const unsigned char* image, unsigned w, unsigned h);
 190 | 
 191 | /*Same as lodepng_encode_file, but always encodes from 24-bit RGB raw image.*/
 192 | unsigned lodepng_encode24_file(const char* filename,
 193 |                                const unsigned char* image, unsigned w, unsigned h);
 194 | #endif /*LODEPNG_COMPILE_DISK*/
 195 | #endif /*LODEPNG_COMPILE_ENCODER*/
 196 | 
 197 | 
 198 | #ifdef LODEPNG_COMPILE_CPP
 199 | namespace lodepng
 200 | {
 201 | #ifdef LODEPNG_COMPILE_DECODER
 202 | /*Same as lodepng_decode_memory, but decodes to an std::vector. The colortype
 203 | is the format to output the pixels to. Default is RGBA 8-bit per channel.*/
 204 | unsigned decode(std::vector<unsigned char>& out, unsigned& w, unsigned& h,
 205 |                 const unsigned char* in, size_t insize,
 206 |                 LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8);
 207 | unsigned decode(std::vector<unsigned char>& out, unsigned& w, unsigned& h,
 208 |                 const std::vector<unsigned char>& in,
 209 |                 LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8);
 210 | #ifdef LODEPNG_COMPILE_DISK
 211 | /*
 212 | Converts PNG file from disk to raw pixel data in memory.
 213 | Same as the other decode functions, but instead takes a filename as input.
 214 | */
 215 | unsigned decode(std::vector<unsigned char>& out, unsigned& w, unsigned& h,
 216 |                 const std::string& filename,
 217 |                 LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8);
 218 | #endif /* LODEPNG_COMPILE_DISK */
 219 | #endif /* LODEPNG_COMPILE_DECODER */
 220 | 
 221 | #ifdef LODEPNG_COMPILE_ENCODER
 222 | /*Same as lodepng_encode_memory, but encodes to an std::vector. colortype
 223 | is that of the raw input data. The output PNG color type will be auto chosen.*/
 224 | unsigned encode(std::vector<unsigned char>& out,
 225 |                 const unsigned char* in, unsigned w, unsigned h,
 226 |                 LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8);
 227 | unsigned encode(std::vector<unsigned char>& out,
 228 |                 const std::vector<unsigned char>& in, unsigned w, unsigned h,
 229 |                 LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8);
 230 | #ifdef LODEPNG_COMPILE_DISK
 231 | /*
 232 | Converts 32-bit RGBA raw pixel data into a PNG file on disk.
 233 | Same as the other encode functions, but instead takes a filename as output.
 234 | NOTE: This overwrites existing files without warning!
 235 | */
 236 | unsigned encode(const std::string& filename,
 237 |                 const unsigned char* in, unsigned w, unsigned h,
 238 |                 LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8);
 239 | unsigned encode(const std::string& filename,
 240 |                 const std::vector<unsigned char>& in, unsigned w, unsigned h,
 241 |                 LodePNGColorType colortype = LCT_RGBA, unsigned bitdepth = 8);
 242 | #endif /* LODEPNG_COMPILE_DISK */
 243 | #endif /* LODEPNG_COMPILE_ENCODER */
 244 | } /* namespace lodepng */
 245 | #endif /*LODEPNG_COMPILE_CPP*/
 246 | #endif /*LODEPNG_COMPILE_PNG*/
 247 | 
 248 | #ifdef LODEPNG_COMPILE_ERROR_TEXT
 249 | /*Returns an English description of the numerical error code.*/
 250 | const char* lodepng_error_text(unsigned code);
 251 | #endif /*LODEPNG_COMPILE_ERROR_TEXT*/
 252 | 
 253 | #ifdef LODEPNG_COMPILE_DECODER
 254 | /*Settings for zlib decompression*/
 255 | typedef struct LodePNGDecompressSettings LodePNGDecompressSettings;
 256 | struct LodePNGDecompressSettings
 257 | {
 258 |   unsigned ignore_adler32; /*if 1, continue and don't give an error message if the Adler32 checksum is corrupted*/
 259 | 
 260 |   /*use custom zlib decoder instead of built in one (default: null)*/
 261 |   unsigned (*custom_zlib)(unsigned char**, size_t*,
 262 |                           const unsigned char*, size_t,
 263 |                           const LodePNGDecompressSettings*);
 264 |   /*use custom deflate decoder instead of built in one (default: null)
 265 |   if custom_zlib is used, custom_inflate is ignored since only the built in
 266 |   zlib function will call custom_inflate*/
 267 |   unsigned (*custom_inflate)(unsigned char**, size_t*,
 268 |                              const unsigned char*, size_t,
 269 |                              const LodePNGDecompressSettings*);
 270 | 
 271 |   const void* custom_context; /*optional custom settings for custom functions*/
 272 | };
 273 | 
 274 | extern const LodePNGDecompressSettings lodepng_default_decompress_settings;
 275 | void lodepng_decompress_settings_init(LodePNGDecompressSettings* settings);
 276 | #endif /*LODEPNG_COMPILE_DECODER*/
 277 | 
 278 | #ifdef LODEPNG_COMPILE_ENCODER
 279 | /*
 280 | Settings for zlib compression. Tweaking these settings tweaks the balance
 281 | between speed and compression ratio.
 282 | */
 283 | typedef struct LodePNGCompressSettings LodePNGCompressSettings;
 284 | struct LodePNGCompressSettings /*deflate = compress*/
 285 | {
 286 |   /*LZ77 related settings*/
 287 |   unsigned btype; /*the block type for LZ (0, 1, 2 or 3, see zlib standard). Should be 2 for proper compression.*/
 288 |   unsigned use_lz77; /*whether or not to use LZ77. Should be 1 for proper compression.*/
 289 |   unsigned windowsize; /*must be a power of two <= 32768. higher compresses more but is slower. Default value: 2048.*/
 290 |   unsigned minmatch; /*minimum lz77 length. 3 is normally best, 6 can be better for some PNGs. Default: 0*/
 291 |   unsigned nicematch; /*stop searching if >= this length found. Set to 258 for best compression. Default: 128*/
 292 |   unsigned lazymatching; /*use lazy matching: better compression but a bit slower. Default: true*/
 293 | 
 294 |   /*use custom zlib encoder instead of built in one (default: null)*/
 295 |   unsigned (*custom_zlib)(unsigned char**, size_t*,
 296 |                           const unsigned char*, size_t,
 297 |                           const LodePNGCompressSettings*);
 298 |   /*use custom deflate encoder instead of built in one (default: null)
 299 |   if custom_zlib is used, custom_deflate is ignored since only the built in
 300 |   zlib function will call custom_deflate*/
 301 |   unsigned (*custom_deflate)(unsigned char**, size_t*,
 302 |                              const unsigned char*, size_t,
 303 |                              const LodePNGCompressSettings*);
 304 | 
 305 |   const void* custom_context; /*optional custom settings for custom functions*/
 306 | };
 307 | 
 308 | extern const LodePNGCompressSettings lodepng_default_compress_settings;
 309 | void lodepng_compress_settings_init(LodePNGCompressSettings* settings);
 310 | #endif /*LODEPNG_COMPILE_ENCODER*/
 311 | 
 312 | #ifdef LODEPNG_COMPILE_PNG
 313 | /*
 314 | Color mode of an image. Contains all information required to decode the pixel
 315 | bits to RGBA colors. This information is the same as used in the PNG file
 316 | format, and is used both for PNG and raw image data in LodePNG.
 317 | */
 318 | typedef struct LodePNGColorMode
 319 | {
 320 |   /*header (IHDR)*/
 321 |   LodePNGColorType colortype; /*color type, see PNG standard or documentation further in this header file*/
 322 |   unsigned bitdepth;  /*bits per sample, see PNG standard or documentation further in this header file*/
 323 | 
 324 |   /*
 325 |   palette (PLTE and tRNS)
 326 | 
 327 |   Dynamically allocated with the colors of the palette, including alpha.
 328 |   When encoding a PNG, to store your colors in the palette of the LodePNGColorMode, first use
 329 |   lodepng_palette_clear, then for each color use lodepng_palette_add.
 330 |   If you encode an image without alpha with palette, don't forget to put value 255 in each A byte of the palette.
 331 | 
 332 |   When decoding, by default you can ignore this palette, since LodePNG already
 333 |   fills the palette colors in the pixels of the raw RGBA output.
 334 | 
 335 |   The palette is only supported for color type 3.
 336 |   */
 337 |   unsigned char* palette; /*palette in RGBARGBA... order. When allocated, must be either 0, or have size 1024*/
 338 |   size_t palettesize; /*palette size in number of colors (amount of bytes is 4 * palettesize)*/
 339 | 
 340 |   /*
 341 |   transparent color key (tRNS)
 342 | 
 343 |   This color uses the same bit depth as the bitdepth value in this struct, which can be 1-bit to 16-bit.
 344 |   For greyscale PNGs, r, g and b will all 3 be set to the same.
 345 | 
 346 |   When decoding, by default you can ignore this information, since LodePNG sets
 347 |   pixels with this key to transparent already in the raw RGBA output.
 348 | 
 349 |   The color key is only supported for color types 0 and 2.
 350 |   */
 351 |   unsigned key_defined; /*is a transparent color key given? 0 = false, 1 = true*/
 352 |   unsigned key_r;       /*red/greyscale component of color key*/
 353 |   unsigned key_g;       /*green component of color key*/
 354 |   unsigned key_b;       /*blue component of color key*/
 355 | } LodePNGColorMode;
 356 | 
 357 | /*init, cleanup and copy functions to use with this struct*/
 358 | void lodepng_color_mode_init(LodePNGColorMode* info);
 359 | void lodepng_color_mode_cleanup(LodePNGColorMode* info);
 360 | /*return value is error code (0 means no error)*/
 361 | unsigned lodepng_color_mode_copy(LodePNGColorMode* dest, const LodePNGColorMode* source);
 362 | 
 363 | void lodepng_palette_clear(LodePNGColorMode* info);
 364 | /*add 1 color to the palette*/
 365 | unsigned lodepng_palette_add(LodePNGColorMode* info,
 366 |                              unsigned char r, unsigned char g, unsigned char b, unsigned char a);
 367 | 
 368 | /*get the total amount of bits per pixel, based on colortype and bitdepth in the struct*/
 369 | unsigned lodepng_get_bpp(const LodePNGColorMode* info);
 370 | /*get the amount of color channels used, based on colortype in the struct.
 371 | If a palette is used, it counts as 1 channel.*/
 372 | unsigned lodepng_get_channels(const LodePNGColorMode* info);
 373 | /*is it a greyscale type? (only colortype 0 or 4)*/
 374 | unsigned lodepng_is_greyscale_type(const LodePNGColorMode* info);
 375 | /*has it got an alpha channel? (only colortype 4 or 6)*/
 376 | unsigned lodepng_is_alpha_type(const LodePNGColorMode* info);
 377 | /*has it got a palette? (only colortype 3)*/
 378 | unsigned lodepng_is_palette_type(const LodePNGColorMode* info);
 379 | /*only returns true if there is a palette and there is a value in the palette with alpha < 255.
 380 | Loops through the palette to check this.*/
 381 | unsigned lodepng_has_palette_alpha(const LodePNGColorMode* info);
 382 | /*
 383 | Check if the given color info indicates the possibility of having non-opaque pixels in the PNG image.
 384 | Returns true if the image can have translucent or invisible pixels (it can still be opaque if it doesn't use such pixels).
 385 | Returns false if the image can only have opaque pixels.
 386 | In detail, it returns true only if it's a color type with alpha, or has a palette with non-opaque values,
 387 | or if "key_defined" is true.
 388 | */
 389 | unsigned lodepng_can_have_alpha(const LodePNGColorMode* info);
 390 | /*Returns the byte size of a raw image buffer with given width, height and color mode*/
 391 | size_t lodepng_get_raw_size(unsigned w, unsigned h, const LodePNGColorMode* color);
 392 | 
 393 | #ifdef LODEPNG_COMPILE_ANCILLARY_CHUNKS
 394 | /*The information of a Time chunk in PNG.*/
 395 | typedef struct LodePNGTime
 396 | {
 397 |   unsigned year;    /*2 bytes used (0-65535)*/
 398 |   unsigned month;   /*1-12*/
 399 |   unsigned day;     /*1-31*/
 400 |   unsigned hour;    /*0-23*/
 401 |   unsigned minute;  /*0-59*/
 402 |   unsigned second;  /*0-60 (to allow for leap seconds)*/
 403 | } LodePNGTime;
 404 | #endif /*LODEPNG_COMPILE_ANCILLARY_CHUNKS*/
 405 | 
 406 | /*Information about the PNG image, except pixels, width and height.*/
 407 | typedef struct LodePNGInfo
 408 | {
 409 |   /*header (IHDR), palette (PLTE) and transparency (tRNS) chunks*/
 410 |   unsigned compression_method;/*compression method of the original file. Always 0.*/
 411 |   unsigned filter_method;     /*filter method of the original file*/
 412 |   unsigned interlace_method;  /*interlace method of the original file*/
 413 |   LodePNGColorMode color;     /*color type and bits, palette and transparency of the PNG file*/
 414 | 
 415 | #ifdef LODEPNG_COMPILE_ANCILLARY_CHUNKS
 416 |   /*
 417 |   suggested background color chunk (bKGD)
 418 |   This color uses the same color mode as the PNG (except alpha channel), which can be 1-bit to 16-bit.
 419 | 
 420 |   For greyscale PNGs, r, g and b will all 3 be set to the same. When encoding
 421 |   the encoder writes the red one. For palette PNGs: When decoding, the RGB value
 422 |   will be stored, not a palette index. But when encoding, specify the index of
 423 |   the palette in background_r, the other two are then ignored.
 424 | 
 425 |   The decoder does not use this background color to edit the color of pixels.
 426 |   */
 427 |   unsigned background_defined; /*is a suggested background color given?*/
 428 |   unsigned background_r;       /*red component of suggested background color*/
 429 |   unsigned background_g;       /*green component of suggested background color*/
 430 |   unsigned background_b;       /*blue component of suggested background color*/
 431 | 
 432 |   /*
 433 |   non-international text chunks (tEXt and zTXt)
 434 | 
 435 |   The char** arrays each contain num strings. The actual messages are in
 436 |   text_strings, while text_keys are keywords that give a short description of what
 437 |   the actual text represents, e.g. Title, Author, Description, or anything else.
 438 | 
 439 |   A keyword is minimum 1 character and maximum 79 characters long. It's
 440 |   discouraged to use a single line length longer than 79 characters for texts.
 441 | 
 442 |   Don't allocate these text buffers yourself. Use the init/cleanup functions
 443 |   correctly and use lodepng_add_text and lodepng_clear_text.
 444 |   */
 445 |   size_t text_num; /*the amount of texts in these char** buffers (there may be more texts in itext)*/
 446 |   char** text_keys; /*the keyword of a text chunk (e.g. "Comment")*/
 447 |   char** text_strings; /*the actual text*/
 448 | 
 449 |   /*
 450 |   international text chunks (iTXt)
 451 |   Similar to the non-international text chunks, but with additional strings
 452 |   "langtags" and "transkeys".
 453 |   */
 454 |   size_t itext_num; /*the amount of international texts in this PNG*/
 455 |   char** itext_keys; /*the English keyword of the text chunk (e.g. "Comment")*/
 456 |   char** itext_langtags; /*language tag for this text's language, ISO/IEC 646 string, e.g. ISO 639 language tag*/
 457 |   char** itext_transkeys; /*keyword translated to the international language - UTF-8 string*/
 458 |   char** itext_strings; /*the actual international text - UTF-8 string*/
 459 | 
 460 |   /*time chunk (tIME)*/
 461 |   unsigned time_defined; /*set to 1 to make the encoder generate a tIME chunk*/
 462 |   LodePNGTime time;
 463 | 
 464 |   /*phys chunk (pHYs)*/
 465 |   unsigned phys_defined; /*if 0, there is no pHYs chunk and the values below are undefined; if 1, there is one*/
 466 |   unsigned phys_x; /*pixels per unit in x direction*/
 467 |   unsigned phys_y; /*pixels per unit in y direction*/
 468 |   unsigned phys_unit; /*may be 0 (unknown unit) or 1 (metre)*/
 469 | 
 470 |   /*
 471 |   unknown chunks
 472 |   There are 3 buffers, one for each position in the PNG where unknown chunks can appear
 473 |   each buffer contains all unknown chunks for that position consecutively
 474 |   The 3 buffers are the unknown chunks between certain critical chunks:
 475 |   0: IHDR-PLTE, 1: PLTE-IDAT, 2: IDAT-IEND
 476 |   Do not allocate or traverse this data yourself. Use the chunk traversing functions declared
 477 |   later, such as lodepng_chunk_next and lodepng_chunk_append, to read/write this struct.
 478 |   */
 479 |   unsigned char* unknown_chunks_data[3];
 480 |   size_t unknown_chunks_size[3]; /*size in bytes of the unknown chunks, given for protection*/
 481 | #endif /*LODEPNG_COMPILE_ANCILLARY_CHUNKS*/
 482 | } LodePNGInfo;
 483 | 
 484 | /*init, cleanup and copy functions to use with this struct*/
 485 | void lodepng_info_init(LodePNGInfo* info);
 486 | void lodepng_info_cleanup(LodePNGInfo* info);
 487 | /*return value is error code (0 means no error)*/
 488 | unsigned lodepng_info_copy(LodePNGInfo* dest, const LodePNGInfo* source);
 489 | 
 490 | #ifdef LODEPNG_COMPILE_ANCILLARY_CHUNKS
 491 | void lodepng_clear_text(LodePNGInfo* info); /*use this to clear the texts again after you filled them in*/
 492 | unsigned lodepng_add_text(LodePNGInfo* info, const char* key, const char* str); /*push back both texts at once*/
 493 | 
 494 | void lodepng_clear_itext(LodePNGInfo* info); /*use this to clear the itexts again after you filled them in*/
 495 | unsigned lodepng_add_itext(LodePNGInfo* info, const char* key, const char* langtag,
 496 |                            const char* transkey, const char* str); /*push back the 4 texts of 1 chunk at once*/
 497 | #endif /*LODEPNG_COMPILE_ANCILLARY_CHUNKS*/
 498 | 
 499 | /*
 500 | Converts raw buffer from one color type to another color type, based on
 501 | LodePNGColorMode structs to describe the input and output color type.
 502 | See the reference manual at the end of this header file to see which color conversions are supported.
 503 | return value = LodePNG error code (0 if all went ok, an error if the conversion isn't supported)
 504 | The out buffer must have size (w * h * bpp + 7) / 8, where bpp is the bits per pixel
 505 | of the output color type (lodepng_get_bpp).
 506 | For < 8 bpp images, there should not be padding bits at the end of scanlines.
 507 | For 16-bit per channel colors, uses big endian format like PNG does.
 508 | Return value is LodePNG error code
 509 | */
 510 | unsigned lodepng_convert(unsigned char* out, const unsigned char* in,
 511 |                          const LodePNGColorMode* mode_out, const LodePNGColorMode* mode_in,
 512 |                          unsigned w, unsigned h);
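/*
The out buffer size rule above, (w * h * bpp + 7) / 8, can be sketched without
LodePNG itself. This is only an illustration: channels_for_colortype and
raw_buffer_size are hypothetical names, not part of this header, and the
channel counts are those of the PNG color types (0, 2, 3, 4, 6). In real code,
lodepng_get_bpp and lodepng_get_raw_size declared earlier in this header serve
this purpose.

```c
#include <assert.h>
#include <stddef.h>

// Channels per pixel for a PNG color type; returns 0 for an invalid type.
static unsigned channels_for_colortype(unsigned colortype)
{
  switch(colortype)
  {
    case 0: return 1; // greyscale
    case 2: return 3; // RGB
    case 3: return 1; // palette index
    case 4: return 2; // greyscale + alpha
    case 6: return 4; // RGBA
    default: return 0;
  }
}

// Bytes needed for a tightly packed raw image (no padding bits at the end
// of scanlines), following the rule (w * h * bpp + 7) / 8.
static size_t raw_buffer_size(unsigned w, unsigned h,
                              unsigned colortype, unsigned bitdepth)
{
  size_t bpp = (size_t)channels_for_colortype(colortype) * bitdepth;
  assert(bpp != 0); // invalid color type or bit depth
  return ((size_t)w * h * bpp + 7) / 8;
}
```

For example, a 3x3 image with 1-bit greyscale holds 9 bits and so needs 2
bytes. The sketch does not guard against overflow for very large dimensions.
*/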
 513 | 
 514 | #ifdef LODEPNG_COMPILE_DECODER
 515 | /*
 516 | Settings for the decoder. This contains settings for the PNG and the Zlib
 517 | decoder, but not the Info settings from the Info structs.
 518 | */
 519 | typedef struct LodePNGDecoderSettings
 520 | {
 521 |   LodePNGDecompressSettings zlibsettings; /*in here is the setting to ignore Adler32 checksums*/
 522 | 
 523 |   unsigned ignore_crc; /*ignore CRC checksums*/
 524 | 
 525 |   unsigned color_convert; /*whether to convert the PNG to the color type you want. Default: yes*/
 526 | 
 527 | #ifdef LODEPNG_COMPILE_ANCILLARY_CHUNKS
 528 |   unsigned read_text_chunks; /*if false but remember_unknown_chunks is true, they're stored in the unknown chunks*/
 529 |   /*store all bytes from unknown chunks in the LodePNGInfo (off by default, useful for a png editor)*/
 530 |   unsigned remember_unknown_chunks;
 531 | #endif /*LODEPNG_COMPILE_ANCILLARY_CHUNKS*/
 532 | } LodePNGDecoderSettings;
 533 | 
 534 | void lodepng_decoder_settings_init(LodePNGDecoderSettings* settings);
 535 | #endif /*LODEPNG_COMPILE_DECODER*/
 536 | 
 537 | #ifdef LODEPNG_COMPILE_ENCODER
 538 | /*Strategy used by the encoder to choose the PNG filter type for each scanline.*/
 539 | typedef enum LodePNGFilterStrategy
 540 | {
 541 |   /*every filter at zero*/
 542 |   LFS_ZERO,
 543 |   /*Use filter that gives minimum sum, as described in the official PNG filter heuristic.*/
 544 |   LFS_MINSUM,
 545 |   /*Use the filter type that gives smallest Shannon entropy for this scanline. Depending
 546 |   on the image, this is better or worse than minsum.*/
 547 |   LFS_ENTROPY,
 548 |   /*
 549 |   Brute-force-search PNG filters by compressing each filter for each scanline.
 550 |   Experimental, very slow, and only rarely gives better compression than MINSUM.
 551 |   */
 552 |   LFS_BRUTE_FORCE,
 553 |   /*use predefined_filters buffer: you specify the filter type for each scanline*/
 554 |   LFS_PREDEFINED
 555 | } LodePNGFilterStrategy;
 556 | 
 557 | /*Gives characteristics about the colors of the image, which helps decide which color model to use for encoding.
 558 | Used internally by default if "auto_convert" is enabled. Public because it's useful for custom algorithms.*/
 559 | typedef struct LodePNGColorProfile
 560 | {
 561 |   unsigned colored; /*not greyscale*/
 562 |   unsigned key; /*image is not opaque and color key is possible instead of full alpha*/
 563 |   unsigned short key_r; /*key values, always as 16-bit, in 8-bit case the byte is duplicated, e.g. 65535 means 255*/
 564 |   unsigned short key_g;
 565 |   unsigned short key_b;
 566 |   unsigned alpha; /*image is not opaque and alpha channel or alpha palette required*/
 567 |   unsigned numcolors; /*amount of colors, up to 257. Not valid if bits == 16.*/
 568 |   unsigned char palette[1024]; /*Remembers up to the first 256 RGBA colors, in no particular order*/
 569 |   unsigned bits; /*bits per channel (not for palette). 1,2 or 4 for greyscale only. 16 if 16-bit per channel required.*/
 570 | } LodePNGColorProfile;
 571 | 
 572 | void lodepng_color_profile_init(LodePNGColorProfile* profile);
 573 | 
 574 | /*Get a LodePNGColorProfile of the image.*/
 575 | unsigned lodepng_get_color_profile(LodePNGColorProfile* profile,
 576 |                                    const unsigned char* image, unsigned w, unsigned h,
 577 |                                    const LodePNGColorMode* mode_in);
 578 | /*The function LodePNG uses internally to decide the PNG color with auto_convert.
 579 | Chooses an optimal color model, e.g. grey if only grey pixels, palette if < 256 colors, ...*/
 580 | unsigned lodepng_auto_choose_color(LodePNGColorMode* mode_out,
 581 |                                    const unsigned char* image, unsigned w, unsigned h,
 582 |                                    const LodePNGColorMode* mode_in);
 583 | 
 584 | /*Settings for the encoder.*/
 585 | typedef struct LodePNGEncoderSettings
 586 | {
 587 |   LodePNGCompressSettings zlibsettings; /*settings for the zlib encoder, such as window size, ...*/
 588 | 
 589 |   unsigned auto_convert; /*automatically choose output PNG color type. Default: true*/
 590 | 
 591 |   /*If true, follows the official PNG heuristic: if the PNG uses a palette or lower than
 592 |   8 bit depth, set all filters to zero. Otherwise use the filter_strategy. Note that to
 593 |   completely follow the official PNG heuristic, filter_palette_zero must be true and
 594 |   filter_strategy must be LFS_MINSUM*/
 595 |   unsigned filter_palette_zero;
 596 |   /*Which filter strategy to use when not using zeroes due to filter_palette_zero.
 597 |   Set filter_palette_zero to 0 to ensure always using your chosen strategy. Default: LFS_MINSUM*/
 598 |   LodePNGFilterStrategy filter_strategy;
 599 |   /*used if filter_strategy is LFS_PREDEFINED. In that case, this must point to a buffer with
 600 |   the same length as the amount of scanlines in the image, and each value must be a PNG filter
 601 |   type from 0 to 4. You have to clean up this buffer, LodePNG will never free it. Don't forget
 602 |   that filter_palette_zero must be set to 0 to ensure this is also used on palette or low bitdepth images.*/
 603 |   const unsigned char* predefined_filters;
 604 | 
 605 |   /*force creating a PLTE chunk if colortype is 2 or 6 (= a suggested palette).
 606 |   If colortype is 3, PLTE is _always_ created.*/
 607 |   unsigned force_palette;
 608 | #ifdef LODEPNG_COMPILE_ANCILLARY_CHUNKS
 609 |   /*add LodePNG identifier and version as a text chunk, for debugging*/
 610 |   unsigned add_id;
 611 |   /*encode text chunks as zTXt chunks instead of tEXt chunks, and use compression in iTXt chunks*/
 612 |   unsigned text_compression;
 613 | #endif /*LODEPNG_COMPILE_ANCILLARY_CHUNKS*/
 614 | } LodePNGEncoderSettings;
 615 | 
 616 | void lodepng_encoder_settings_init(LodePNGEncoderSettings* settings);
 617 | #endif /*LODEPNG_COMPILE_ENCODER*/
 618 | 
 619 | 
 620 | #if defined(LODEPNG_COMPILE_DECODER) || defined(LODEPNG_COMPILE_ENCODER)
 621 | /*The settings, state and information for extended encoding and decoding.*/
 622 | typedef struct LodePNGState
 623 | {
 624 | #ifdef LODEPNG_COMPILE_DECODER
 625 |   LodePNGDecoderSettings decoder; /*the decoding settings*/
 626 | #endif /*LODEPNG_COMPILE_DECODER*/
 627 | #ifdef LODEPNG_COMPILE_ENCODER
 628 |   LodePNGEncoderSettings encoder; /*the encoding settings*/
 629 | #endif /*LODEPNG_COMPILE_ENCODER*/
 630 |   LodePNGColorMode info_raw; /*specifies the format in which you would like to get the raw pixel buffer*/
 631 |   LodePNGInfo info_png; /*info of the PNG image obtained after decoding*/
 632 |   unsigned error;
 633 | #ifdef LODEPNG_COMPILE_CPP
 634 |   /* For the lodepng::State subclass. */
 635 |   virtual ~LodePNGState(){}
 636 | #endif
 637 | } LodePNGState;
 638 | 
 639 | /*init, cleanup and copy functions to use with this struct*/
 640 | void lodepng_state_init(LodePNGState* state);
 641 | void lodepng_state_cleanup(LodePNGState* state);
 642 | void lodepng_state_copy(LodePNGState* dest, const LodePNGState* source);
 643 | #endif /* defined(LODEPNG_COMPILE_DECODER) || defined(LODEPNG_COMPILE_ENCODER) */
 644 | 
 645 | #ifdef LODEPNG_COMPILE_DECODER
 646 | /*
 647 | Same as lodepng_decode_memory, but uses a LodePNGState to allow custom settings and
 648 | getting much more information about the PNG image and color mode.
 649 | */
 650 | unsigned lodepng_decode(unsigned char** out, unsigned* w, unsigned* h,
 651 |                         LodePNGState* state,
 652 |                         const unsigned char* in, size_t insize);
 653 | 
 654 | /*
 655 | Read the PNG header, but not the actual data. This returns only the information
 656 | that is in the header chunk of the PNG, such as width, height and color type. The
 657 | information is placed in the info_png field of the LodePNGState.
 658 | */
 659 | unsigned lodepng_inspect(unsigned* w, unsigned* h,
 660 |                          LodePNGState* state,
 661 |                          const unsigned char* in, size_t insize);
 662 | #endif /*LODEPNG_COMPILE_DECODER*/
 663 | 
 664 | 
 665 | #ifdef LODEPNG_COMPILE_ENCODER
 666 | /*This function allocates the out buffer with standard malloc and stores the size in *outsize.*/
 667 | unsigned lodepng_encode(unsigned char** out, size_t* outsize,
 668 |                         const unsigned char* image, unsigned w, unsigned h,
 669 |                         LodePNGState* state);
 670 | #endif /*LODEPNG_COMPILE_ENCODER*/
 671 | 
 672 | /*
 673 | The lodepng_chunk functions are normally not needed, except to traverse the
 674 | unknown chunks stored in the LodePNGInfo struct, or add new ones to it.
 675 | It also allows traversing the chunks of an encoded PNG file yourself.
 676 | 
 677 | PNG standard chunk naming conventions:
 678 | First byte: uppercase = critical, lowercase = ancillary
 679 | Second byte: uppercase = public, lowercase = private
 680 | Third byte: must be uppercase
 681 | Fourth byte: uppercase = unsafe to copy, lowercase = safe to copy
 682 | */
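/*
The naming convention above comes down to testing bit 5 (value 32) of each of
the four type bytes, since that bit distinguishes lowercase from uppercase
ASCII letters. A minimal sketch, assuming a 4-letter ASCII type; ChunkTypeFlags
and classify_chunk_type are hypothetical names and not part of LodePNG (the
lodepng_chunk_ancillary, lodepng_chunk_private and lodepng_chunk_safetocopy
functions declared below answer the same questions for a real chunk):

```c
#include <assert.h>
#include <stddef.h>

typedef struct ChunkTypeFlags
{
  int ancillary;    // first byte is lowercase
  int private_use;  // second byte is lowercase
  int reserved_ok;  // third byte is uppercase, as required
  int safe_to_copy; // fourth byte is lowercase
} ChunkTypeFlags;

// type must point to at least 4 bytes, e.g. "IHDR" or "tEXt"
static ChunkTypeFlags classify_chunk_type(const char* type)
{
  ChunkTypeFlags f;
  assert(type != NULL);
  f.ancillary    = (type[0] & 32) != 0;
  f.private_use  = (type[1] & 32) != 0;
  f.reserved_ok  = (type[2] & 32) == 0;
  f.safe_to_copy = (type[3] & 32) != 0;
  return f;
}
```

With this, "IHDR" is critical, public and unsafe to copy, while "tEXt" is
ancillary, public and safe to copy.
*/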
 683 | 
 684 | /*
 685 | Gets the length of the data of the chunk. Total chunk length has 12 bytes more.
 686 | There must be at least 4 bytes to read from. If the result value is too large,
 687 | it may be corrupt data.
 688 | */
 689 | unsigned lodepng_chunk_length(const unsigned char* chunk);
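/*
The "12 bytes more" above is the fixed overhead of a PNG chunk: a 4-byte
big-endian length, a 4-byte type and a 4-byte CRC around the data. A minimal
sketch of reading the length field by hand, with hypothetical names
(chunk_data_length, chunk_total_length) that are not part of LodePNG and may
differ from how lodepng_chunk_length is implemented:

```c
#include <assert.h>
#include <stddef.h>

// Reads the big-endian 32-bit data length from the first 4 bytes of a chunk.
static unsigned long chunk_data_length(const unsigned char* chunk)
{
  assert(chunk != NULL);
  return ((unsigned long)chunk[0] << 24) | ((unsigned long)chunk[1] << 16)
       | ((unsigned long)chunk[2] << 8) | (unsigned long)chunk[3];
}

// Total size of the chunk in the file: length field + type + data + CRC.
static unsigned long chunk_total_length(const unsigned char* chunk)
{
  return chunk_data_length(chunk) + 12;
}
```

An IHDR chunk, whose data is 13 bytes long, therefore occupies 25 bytes in
the file.
*/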
 690 | 
 691 | /*puts the 4-byte type in null terminated string*/
 692 | void lodepng_chunk_type(char type[5], const unsigned char* chunk);
 693 | 
 694 | /*check if the type is the given type*/
 695 | unsigned char lodepng_chunk_type_equals(const unsigned char* chunk, const char* type);
 696 | 
 697 | /*0: it's one of the critical chunk types, 1: it's an ancillary chunk (see PNG standard)*/
 698 | unsigned char lodepng_chunk_ancillary(const unsigned char* chunk);
 699 | 
 700 | /*0: public, 1: private (see PNG standard)*/
 701 | unsigned char lodepng_chunk_private(const unsigned char* chunk);
 702 | 
 703 | /*0: the chunk is unsafe to copy, 1: the chunk is safe to copy (see PNG standard)*/
 704 | unsigned char lodepng_chunk_safetocopy(const unsigned char* chunk);
 705 | 
 706 | /*get pointer to the data of the chunk, where the input points to the header of the chunk*/
 707 | unsigned char* lodepng_chunk_data(unsigned char* chunk);
 708 | const unsigned char* lodepng_chunk_data_const(const unsigned char* chunk);
 709 | 
 710 | /*returns 0 if the crc is correct, 1 if it's incorrect (0 for OK as usual!)*/
 711 | unsigned lodepng_chunk_check_crc(const unsigned char* chunk);
 712 | 
 713 | /*generates the correct CRC from the data and puts it in the last 4 bytes of the chunk*/
 714 | void lodepng_chunk_generate_crc(unsigned char* chunk);
 715 | 
 716 | /*iterate to next chunks. don't use on IEND chunk, as there is no next chunk then*/
 717 | unsigned char* lodepng_chunk_next(unsigned char* chunk);
 718 | const unsigned char* lodepng_chunk_next_const(const unsigned char* chunk);
 719 | 
 720 | /*
 721 | Appends chunk to the data in out. The given chunk should already have its chunk header.
 722 | The out variable and outlength are updated to reflect the new reallocated buffer.
 723 | Returns error code (0 if it went ok)
 724 | */
 725 | unsigned lodepng_chunk_append(unsigned char** out, size_t* outlength, const unsigned char* chunk);
 726 | 
 727 | /*
 728 | Appends new chunk to out. The chunk to append is given by giving its length, type
 729 | and data separately. The type is a 4-letter string.
 730 | The out variable and outlength are updated to reflect the new reallocated buffer.
 731 | Returns error code (0 if it went ok)
 732 | */
 733 | unsigned lodepng_chunk_create(unsigned char** out, size_t* outlength, unsigned length,
 734 |                               const char* type, const unsigned char* data);
 735 | 
 736 | 
 737 | /*Calculate CRC32 of buffer*/
 738 | unsigned lodepng_crc32(const unsigned char* buf, size_t len);
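/*
For reference, the CRC used by PNG chunks can be sketched as a plain bitwise
CRC-32 with the reflected polynomial 0xEDB88320. In a PNG file it is computed
over the chunk type and data bytes, not over the length field. This is an
illustration only: crc32_sketch is a hypothetical name, and the real
lodepng_crc32 may be implemented differently (for example with a lookup table).

```c
#include <assert.h>
#include <stddef.h>

// Bitwise CRC-32: initial value and final XOR are both 0xFFFFFFFF.
static unsigned long crc32_sketch(const unsigned char* buf, size_t len)
{
  unsigned long crc = 0xFFFFFFFFul;
  size_t i;
  int bit;
  assert(buf != NULL || len == 0);
  for(i = 0; i != len; ++i)
  {
    crc ^= buf[i];
    for(bit = 0; bit != 8; ++bit)
    {
      // Shift out one bit; apply the polynomial if that bit was set.
      crc = (crc & 1) ? ((crc >> 1) ^ 0xEDB88320ul) : (crc >> 1);
    }
  }
  return (crc ^ 0xFFFFFFFFul) & 0xFFFFFFFFul;
}
```

Applied to the 4 type bytes of an empty IEND chunk, this yields 0xAE426082,
which matches the last 4 bytes of every PNG file.
*/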
 739 | #endif /*LODEPNG_COMPILE_PNG*/
 740 | 
 741 | 
 742 | #ifdef LODEPNG_COMPILE_ZLIB
 743 | /*
 744 | This zlib part can be used independently to zlib compress and decompress a
 745 | buffer. It cannot be used to create gzip files however, and it only supports the
 746 | part of zlib that is required for PNG, it does not support dictionaries.
 747 | */
 748 | 
 749 | #ifdef LODEPNG_COMPILE_DECODER
 750 | /*Inflate a buffer. Inflate is the decompression step of deflate. Out buffer must be freed after use.*/
 751 | unsigned lodepng_inflate(unsigned char** out, size_t* outsize,
 752 |                          const unsigned char* in, size_t insize,
 753 |                          const LodePNGDecompressSettings* settings);
 754 | 
 755 | /*
 756 | Decompresses Zlib data. Reallocates the out buffer and appends the data. The
 757 | data must be according to the zlib specification.
 758 | Either, *out must be NULL and *outsize must be 0, or, *out must be a valid
 759 | buffer and *outsize its size in bytes. out must be freed by user after usage.
 760 | */
 761 | unsigned lodepng_zlib_decompress(unsigned char** out, size_t* outsize,
 762 |                                  const unsigned char* in, size_t insize,
 763 |                                  const LodePNGDecompressSettings* settings);
 764 | #endif /*LODEPNG_COMPILE_DECODER*/
 765 | 
 766 | #ifdef LODEPNG_COMPILE_ENCODER
 767 | /*
 768 | Compresses data with Zlib. Reallocates the out buffer and appends the data.
 769 | Zlib adds a small header and trailer around the deflate data.
 770 | The data is output in the format of the zlib specification.
 771 | Either, *out must be NULL and *outsize must be 0, or, *out must be a valid
 772 | buffer and *outsize its size in bytes. out must be freed by user after usage.
 773 | */
 774 | unsigned lodepng_zlib_compress(unsigned char** out, size_t* outsize,
 775 |                                const unsigned char* in, size_t insize,
 776 |                                const LodePNGCompressSettings* settings);
 777 | 
 778 | /*
 779 | Find length-limited Huffman code for given frequencies. This function is in the
 780 | public interface only for tests, it's used internally by lodepng_deflate.
 781 | */
 782 | unsigned lodepng_huffman_code_lengths(unsigned* lengths, const unsigned* frequencies,
 783 |                                       size_t numcodes, unsigned maxbitlen);
 784 | 
 785 | /*Compress a buffer with deflate. See RFC 1951. Out buffer must be freed after use.*/
 786 | unsigned lodepng_deflate(unsigned char** out, size_t* outsize,
 787 |                          const unsigned char* in, size_t insize,
 788 |                          const LodePNGCompressSettings* settings);
 789 | 
 790 | #endif /*LODEPNG_COMPILE_ENCODER*/
 791 | #endif /*LODEPNG_COMPILE_ZLIB*/
 792 | 
 793 | #ifdef LODEPNG_COMPILE_DISK
 794 | /*
 795 | Load a file from disk into buffer. The function allocates the out buffer, and
 796 | after usage you should free it.
 797 | out: output parameter, contains pointer to loaded buffer.
 798 | outsize: output parameter, size of the allocated out buffer
 799 | filename: the path to the file to load
 800 | return value: error code (0 means ok)
 801 | */
 802 | unsigned lodepng_load_file(unsigned char** out, size_t* outsize, const char* filename);
 803 | 
 804 | /*
 805 | Save a file from buffer to disk. Warning, if it exists, this function overwrites
 806 | the file without warning!
 807 | buffer: the buffer to write
 808 | buffersize: size of the buffer to write
 809 | filename: the path to the file to save to
 810 | return value: error code (0 means ok)
 811 | */
 812 | unsigned lodepng_save_file(const unsigned char* buffer, size_t buffersize, const char* filename);
 813 | #endif /*LODEPNG_COMPILE_DISK*/
 814 | 
 815 | #ifdef LODEPNG_COMPILE_CPP
 816 | /* The LodePNG C++ wrapper uses std::vectors instead of manually allocated memory buffers. */
 817 | namespace lodepng
 818 | {
 819 | #ifdef LODEPNG_COMPILE_PNG
 820 | class State : public LodePNGState
 821 | {
 822 |   public:
 823 |     State();
 824 |     State(const State& other);
 825 |     virtual ~State();
 826 |     State& operator=(const State& other);
 827 | };
 828 | 
 829 | #ifdef LODEPNG_COMPILE_DECODER
 830 | /* Same as other lodepng::decode, but using a State for more settings and information. */
 831 | unsigned decode(std::vector<unsigned char>& out, unsigned& w, unsigned& h,
 832 |                 State& state,
 833 |                 const unsigned char* in, size_t insize);
 834 | unsigned decode(std::vector<unsigned char>& out, unsigned& w, unsigned& h,
 835 |                 State& state,
 836 |                 const std::vector<unsigned char>& in);
 837 | #endif /*LODEPNG_COMPILE_DECODER*/
 838 | 
 839 | #ifdef LODEPNG_COMPILE_ENCODER
 840 | /* Same as other lodepng::encode, but using a State for more settings and information. */
 841 | unsigned encode(std::vector<unsigned char>& out,
 842 |                 const unsigned char* in, unsigned w, unsigned h,
 843 |                 State& state);
 844 | unsigned encode(std::vector<unsigned char>& out,
 845 |                 const std::vector<unsigned char>& in, unsigned w, unsigned h,
 846 |                 State& state);
 847 | #endif /*LODEPNG_COMPILE_ENCODER*/
 848 | 
 849 | #ifdef LODEPNG_COMPILE_DISK
 850 | /*
 851 | Load a file from disk into an std::vector.
 852 | return value: error code (0 means ok)
 853 | */
 854 | unsigned load_file(std::vector<unsigned char>& buffer, const std::string& filename);
 855 | 
 856 | /*
 857 | Save the binary data in an std::vector to a file on disk. The file is overwritten
 858 | without warning.
 859 | */
 860 | unsigned save_file(const std::vector<unsigned char>& buffer, const std::string& filename);
 861 | #endif /* LODEPNG_COMPILE_DISK */
 862 | #endif /* LODEPNG_COMPILE_PNG */
 863 | 
 864 | #ifdef LODEPNG_COMPILE_ZLIB
 865 | #ifdef LODEPNG_COMPILE_DECODER
 866 | /* Zlib-decompress an unsigned char buffer */
 867 | unsigned decompress(std::vector<unsigned char>& out, const unsigned char* in, size_t insize,
 868 |                     const LodePNGDecompressSettings& settings = lodepng_default_decompress_settings);
 869 | 
 870 | /* Zlib-decompress an std::vector */
 871 | unsigned decompress(std::vector<unsigned char>& out, const std::vector<unsigned char>& in,
 872 |                     const LodePNGDecompressSettings& settings = lodepng_default_decompress_settings);
 873 | #endif /* LODEPNG_COMPILE_DECODER */
 874 | 
 875 | #ifdef LODEPNG_COMPILE_ENCODER
 876 | /* Zlib-compress an unsigned char buffer */
 877 | unsigned compress(std::vector<unsigned char>& out, const unsigned char* in, size_t insize,
 878 |                   const LodePNGCompressSettings& settings = lodepng_default_compress_settings);
 879 | 
 880 | /* Zlib-compress an std::vector */
 881 | unsigned compress(std::vector<unsigned char>& out, const std::vector<unsigned char>& in,
 882 |                   const LodePNGCompressSettings& settings = lodepng_default_compress_settings);
 883 | #endif /* LODEPNG_COMPILE_ENCODER */
 884 | #endif /* LODEPNG_COMPILE_ZLIB */
 885 | } /* namespace lodepng */
 886 | #endif /*LODEPNG_COMPILE_CPP*/
 887 | 
 888 | /*
 889 | TODO:
 890 | [.] test if there are no memory leaks or security exploits - done a lot but needs to be checked often
 891 | [.] check compatibility with various compilers  - done but needs to be redone for every newer version
 892 | [X] converting color to 16-bit per channel types
 893 | [ ] read all public PNG chunk types (but never let the color profile and gamma ones touch RGB values)
 894 | [ ] make sure encoder generates no chunks with size > (2^31)-1
 895 | [ ] partial decoding (stream processing)
 896 | [X] let the "isFullyOpaque" function check color keys and transparent palettes too
 897 | [X] better name for the variables "codes", "codesD", "codelengthcodes", "clcl" and "lldl"
 898 | [ ] don't stop decoding on errors like 69, 57, 58 (make warnings)
 899 | [ ] make warnings like: oob palette, checksum fail, data after iend, wrong/unknown crit chunk, no null terminator in text, ...
 900 | [ ] let the C++ wrapper catch exceptions coming from the standard library and return LodePNG error codes
 901 | [ ] allow user to provide custom color conversion functions, e.g. for premultiplied alpha, padding bits or not, ...
 902 | [ ] allow user to give data (void*) to custom allocator
 903 | */
 904 | 
 905 | #endif /*LODEPNG_H inclusion guard*/
 906 | 
 907 | /*
 908 | LodePNG Documentation
 909 | ---------------------
 910 | 
 911 | 0. table of contents
 912 | --------------------
 913 | 
 914 |   1. about
 915 |    1.1. supported features
 916 |    1.2. features not supported
 917 |   2. C and C++ version
 918 |   3. security
 919 |   4. decoding
 920 |   5. encoding
 921 |   6. color conversions
 922 |     6.1. PNG color types
 923 |     6.2. color conversions
 924 |     6.3. padding bits
 925 |     6.4. A note about 16-bits per channel and endianness
 926 |   7. error values
 927 |   8. chunks and PNG editing
 928 |   9. compiler support
 929 |   10. examples
 930 |    10.1. decoder C++ example
 931 |    10.2. decoder C example
 932 |   11. state settings reference
 933 |   12. changes
 934 |   13. contact information
 935 | 
 936 | 
 937 | 1. about
 938 | --------
 939 | 
 940 | PNG is a file format to store raster images losslessly with good compression,
 941 | supporting different color types and alpha channel.
 942 | 
 943 | LodePNG is a PNG codec according to the Portable Network Graphics (PNG)
 944 | Specification (Second Edition) - W3C Recommendation 10 November 2003.
 945 | 
 946 | The specifications used are:
 947 | 
 948 | *) Portable Network Graphics (PNG) Specification (Second Edition):
 949 |      http://www.w3.org/TR/2003/REC-PNG-20031110
 950 | *) RFC 1950 ZLIB Compressed Data Format version 3.3:
 951 |      http://www.gzip.org/zlib/rfc-zlib.html
 952 | *) RFC 1951 DEFLATE Compressed Data Format Specification ver 1.3:
 953 |      http://www.gzip.org/zlib/rfc-deflate.html
 954 | 
 955 | The most recent version of LodePNG can currently be found at
 956 | http://lodev.org/lodepng/
 957 | 
 958 | LodePNG works both in C (ISO C90) and C++, with a C++ wrapper that adds
 959 | extra functionality.
 960 | 
 961 | LodePNG consists of two files:
 962 | -lodepng.h: the header file for both C and C++
 963 | -lodepng.c(pp): give it the name lodepng.c or lodepng.cpp (or .cc) depending on your usage
 964 | 
 965 | If you want to start using LodePNG right away without reading this doc, get the
 966 | examples from the LodePNG website to see how to use it in code, or check the
 967 | smaller examples in chapter 10 here.
 968 | 
 969 | LodePNG is simple but only supports the basic requirements. To achieve
 970 | simplicity, the following design choices were made: There are no dependencies
 971 | on any external library. There are functions to decode and encode a PNG with
 972 | a single function call, and extended versions of these functions taking a
 973 | LodePNGState struct allowing to specify or get more information. By default
 974 | the colors of the raw image are always RGB or RGBA, no matter what color type
 975 | the PNG file uses. To read and write files, there are simple functions to
 976 | convert the files to/from buffers in memory.
 977 | 
 978 | This all makes LodePNG suitable for loading textures in games, demos and small
 979 | programs, ... It's less suitable for full fledged image editors, loading PNGs
 980 | over network (it requires all the image data to be available before decoding can
 981 | begin), life-critical systems, ...
 982 | 
 983 | 1.1. supported features
 984 | -----------------------
 985 | 
 986 | The following features are supported by the encoder and decoder:
 987 | 
 988 | *) decoding of PNGs with any color type, bit depth and interlace mode, to a 24- or 32-bit color raw image,
 989 |    or the same color type as the PNG
 990 | *) encoding of PNGs, from any raw image to 24- or 32-bit color, or the same color type as the raw image
 991 | *) Adam7 interlace and deinterlace for any color type
 992 | *) loading the image from harddisk or decoding it from a buffer from other sources than harddisk
 993 | *) support for alpha channels, including RGBA color model, translucent palettes and color keying
 994 | *) zlib decompression (inflate)
 995 | *) zlib compression (deflate)
 996 | *) CRC32 and ADLER32 checksums
 997 | *) handling of unknown chunks, allowing making a PNG editor that stores custom and unknown chunks.
 998 | *) the following chunks are supported (generated/interpreted) by both encoder and decoder:
 999 |     IHDR: header information
1000 |     PLTE: color palette
1001 |     IDAT: pixel data
1002 |     IEND: the final chunk
1003 |     tRNS: transparency for palettized images
1004 |     tEXt: textual information
1005 |     zTXt: compressed textual information
1006 |     iTXt: international textual information
1007 |     bKGD: suggested background color
1008 |     pHYs: physical dimensions
1009 |     tIME: modification time
1010 | 
1011 | 1.2. features not supported
1012 | ---------------------------
1013 | 
1014 | The following features are _not_ supported:
1015 | 
1016 | *) some features needed to make a conformant PNG-Editor might be still missing.
1017 | *) partial loading/stream processing. All data must be available and is processed in one call.
1018 | *) The following public chunks are not supported but treated as unknown chunks by LodePNG
1019 |     cHRM, gAMA, iCCP, sRGB, sBIT, hIST, sPLT
1020 |    Some of these are not supported on purpose: LodePNG wants to provide the RGB values
1021 |    stored in the pixels, not values modified by system dependent gamma or color models.
1022 | 
1023 | 
1024 | 2. C and C++ version
1025 | --------------------
1026 | 
1027 | The C version uses buffers allocated with malloc that you need to free()
1028 | yourself. You need to use init and cleanup functions for each struct whenever
1029 | using a struct from the C version to avoid exploits and memory leaks.
1030 | 
1031 | The C++ version has extra functions with std::vectors in the interface and the
1032 | lodepng::State class which is a LodePNGState with constructor and destructor.
1033 | 
1034 | These files work without modification for both C and C++ compilers because all
1035 | the additional C++ code is in "#ifdef __cplusplus" blocks that make C-compilers
1036 | ignore it, and the C code is made to compile both with strict ISO C90 and C++.
1037 | 
1038 | To use the C++ version, you need to rename the source file to lodepng.cpp
1039 | (instead of lodepng.c), and compile it with a C++ compiler.
1040 | 
1041 | To use the C version, you need to rename the source file to lodepng.c (instead
1042 | of lodepng.cpp), and compile it with a C compiler.
1043 | 
1044 | 
1045 | 3. Security
1046 | -----------
1047 | 
1048 | Even if carefully designed, it's always possible that LodePNG contains
1049 | exploitable bugs. If you discover one, please let me know, and it will be fixed.
1050 | 
1051 | When using LodePNG, care has to be taken with the C version of LodePNG, as well
1052 | as the C-style structs when working with C++. The following conventions are used
1053 | for all C-style structs:
1054 | 
1055 | -if a struct has a corresponding init function, always call the init function when making a new one
1056 | -if a struct has a corresponding cleanup function, call it before the struct disappears to avoid memory leaks
1057 | -if a struct has a corresponding copy function, use the copy function instead of "=".
1058 |  The destination must also be inited already.
1059 | 
1060 | 
1061 | 4. Decoding
1062 | -----------
1063 | 
1064 | Decoding converts a PNG compressed image to a raw pixel buffer.
1065 | 
1066 | Most documentation on using the decoder is at its declarations in the header
1067 | above. For C, simple decoding can be done with functions such as
1068 | lodepng_decode32, and more advanced decoding can be done with the struct
1069 | LodePNGState and lodepng_decode. For C++, all decoding can be done with the
1070 | various lodepng::decode functions, and lodepng::State can be used for advanced
1071 | features.
1072 | 
1073 | When using the LodePNGState, it uses the following fields for decoding:
1074 | *) LodePNGInfo info_png: it stores extra information about the PNG (the input) in here
1075 | *) LodePNGColorMode info_raw: here you can say what color mode of the raw image (the output) you want to get
1076 | *) LodePNGDecoderSettings decoder: you can specify a few extra settings for the decoder to use
1077 | 
1078 | LodePNGInfo info_png
1079 | --------------------
1080 | 
1081 | After decoding, this contains extra information of the PNG image, except the actual
1082 | pixels, width and height because these are already gotten directly from the decoder
1083 | functions.
1084 | 
1085 | It contains for example the original color type of the PNG image, text comments,
1086 | suggested background color, etc... More details about the LodePNGInfo struct are
1087 | at its declaration documentation.
1088 | 
1089 | LodePNGColorMode info_raw
1090 | -------------------------
1091 | 
1092 | When decoding, here you can specify which color type you want
1093 | the resulting raw image to be. If this is different from the colortype of the
1094 | PNG, then the decoder will automatically convert the result. This conversion
1095 | always works, except if you want it to convert a color PNG to greyscale or to
1096 | a palette with missing colors.
1097 | 
1098 | By default, 32-bit color is used for the result.
1099 | 
1100 | LodePNGDecoderSettings decoder
1101 | ------------------------------
1102 | 
1103 | The settings can be used to ignore the errors created by invalid CRC and Adler32
1104 | checksums, and to disable the decoding of tEXt chunks.
1105 | 
1106 | There's also a setting color_convert, true by default. If false, no conversion
1107 | is done, the resulting data will be as it was in the PNG (after decompression)
1108 | and you'll have to puzzle the colors of the pixels together yourself using the
1109 | color type information in the LodePNGInfo.
1110 | 
1111 | 
1112 | 5. Encoding
1113 | -----------
1114 | 
1115 | Encoding converts a raw pixel buffer to a PNG compressed image.
1116 | 
1117 | Most documentation on using the encoder is at its declarations in the header
1118 | above. For C, simple encoding can be done with functions such as
1119 | lodepng_encode32, and more advanced encoding can be done with the struct
1120 | LodePNGState and lodepng_encode. For C++, all encoding can be done with the
1121 | various lodepng::encode functions, and lodepng::State can be used for advanced
1122 | features.
1123 | 
1124 | Like the decoder, the encoder can also give errors. However, it gives fewer errors
1125 | since the encoder input is trusted, while the decoder input (a PNG image that could
1126 | be forged by anyone) is not trusted.
1127 | 
1128 | When using the LodePNGState, it uses the following fields for encoding:
1129 | *) LodePNGInfo info_png: here you specify how you want the PNG (the output) to be.
1130 | *) LodePNGColorMode info_raw: here you say what color type the raw image (the input) has
1131 | *) LodePNGEncoderSettings encoder: you can specify a few settings for the encoder to use
1132 | 
1133 | LodePNGInfo info_png
1134 | --------------------
1135 | 
1136 | When encoding, you use this the opposite way as when decoding: for encoding,
1137 | you fill in the values you want the PNG to have before encoding. By default it's
1138 | not needed to specify a color type for the PNG since it's automatically chosen,
1139 | but it's possible to choose it yourself given the right settings.
1140 | 
1141 | The encoder will not always exactly match the LodePNGInfo struct you give,
1142 | it tries to match it as closely as possible. Some things are ignored by the encoder. The
1143 | encoder uses, for example, the following settings from it when applicable:
1144 | colortype and bitdepth, text chunks, time chunk, the color key, the palette, the
1145 | background color, the interlace method, unknown chunks, ...
1146 | 
1147 | When encoding to a PNG with colortype 3, the encoder will generate a PLTE chunk.
1148 | If the palette contains any colors for which the alpha channel is not 255 (so
1149 | there are translucent colors in the palette), it'll add a tRNS chunk.
1150 | 
1151 | LodePNGColorMode info_raw
1152 | -------------------------
1153 | 
1154 | You specify the color type of the raw image that you give to the input here,
1155 | including a possible transparent color key and palette you happen to be using in
1156 | your raw image data.
1157 | 
1158 | By default, 32-bit color is assumed, meaning your input has to be in RGBA
1159 | format with 4 bytes (unsigned chars) per pixel.
1160 | 
1161 | LodePNGEncoderSettings encoder
1162 | ------------------------------
1163 | 
1164 | The following settings are supported (some are in sub-structs):
1165 | *) auto_convert: when this option is enabled, the encoder will
1166 | automatically choose the smallest possible color mode (including color key) that
1167 | can encode the colors of all pixels without information loss.
1168 | *) btype: the block type for LZ77. 0 = uncompressed, 1 = fixed huffman tree,
1169 |    2 = dynamic huffman tree (best compression). Should be 2 for proper
1170 |    compression.
1171 | *) use_lz77: whether or not to use LZ77 for compressed block types. Should be
1172 |    true for proper compression.
1173 | *) windowsize: the window size used by the LZ77 encoder (1 - 32768). Has value
1174 |    2048 by default, but can be set to 32768 for better, but slow, compression.
1175 | *) force_palette: if colortype is 2 or 6, you can make the encoder write a PLTE
1176 |    chunk if force_palette is true. This can be used as a suggested palette to convert
1177 |    to by viewers that don't support more than 256 colors (if those still exist)
1178 | *) add_id: add text chunk "Encoder: LodePNG <version>" to the image.
1179 | *) text_compression: default 1. If 1, it'll store texts as zTXt instead of tEXt chunks.
1180 |   zTXt chunks use zlib compression on the text. This gives a smaller result on
1181 |   large texts but a larger result on small texts (such as a single program name).
1182 |   It's all tEXt or all zTXt though, there's no separate setting per text yet.
1183 | 
1184 | 
1185 | 6. color conversions
1186 | --------------------
1187 | 
1188 | An important thing to note about LodePNG, is that the color type of the PNG, and
1189 | the color type of the raw image, are completely independent. By default, when
1190 | you decode a PNG, you get the result as a raw image in the color type you want,
1191 | no matter whether the PNG was encoded with a palette, greyscale or RGBA color.
1192 | And if you encode an image, by default LodePNG will automatically choose the PNG
1193 | color type that gives good compression based on the values of colors and amount
1194 | of colors in the image. It can be configured to let you control it instead as
1195 | well, though.
1196 | 
1197 | To be able to do this, LodePNG does conversions from one color mode to another.
1198 | It can convert from almost any color type to any other color type, except the
1199 | following conversions: RGB to greyscale is not supported, and converting to a
1200 | palette when the palette doesn't have a required color is not supported. This is
1201 | not supported on purpose: this is information loss which requires a color
1202 | reduction algorithm that is beyond the scope of a PNG encoder (yes, RGB to grey
1203 | is easy, but there are multiple ways if you want to give some channels more
1204 | weight).
1205 | 
1206 | By default, when decoding, you get the raw image in 32-bit RGBA or 24-bit RGB
1207 | color, no matter what color type the PNG has. And by default when encoding,
1208 | LodePNG automatically picks the best color model for the output PNG, and expects
1209 | the input image to be 32-bit RGBA or 24-bit RGB. So, unless you want to control
1210 | the color format of the images yourself, you can skip this chapter.
1211 | 
1212 | 6.1. PNG color types
1213 | --------------------
1214 | 
1215 | A PNG image can have many color types, ranging from 1-bit color to 64-bit color,
1216 | as well as palettized color modes. After the zlib decompression and unfiltering
1217 | in the PNG image is done, the raw pixel data will have that color type and thus
1218 | a certain amount of bits per pixel. If you want the output raw image after
1219 | decoding to have another color type, a conversion is done by LodePNG.
1220 | 
1221 | The PNG specification gives the following color types:
1222 | 
1223 | 0: greyscale, bit depths 1, 2, 4, 8, 16
1224 | 2: RGB, bit depths 8 and 16
1225 | 3: palette, bit depths 1, 2, 4 and 8
1226 | 4: greyscale with alpha, bit depths 8 and 16
1227 | 6: RGBA, bit depths 8 and 16
1228 | 
1229 | Bit depth is the amount of bits per pixel per color channel. So the total amount
1230 | of bits per pixel is: amount of channels * bitdepth.
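
As a minimal sketch of the formula above (not LodePNG's own code), the channel
count of each PNG color type can be looked up and multiplied by the bit depth.
The names channels_for_colortype and bits_per_pixel are hypothetical, chosen
only for this illustration.

```c
// Number of channels for a PNG color type (0, 2, 3, 4 or 6); 0 if invalid.
// Hypothetical helper for illustration, not part of the LodePNG API.
unsigned channels_for_colortype(unsigned colortype)
{
  switch(colortype)
  {
    case 0: return 1; // greyscale
    case 2: return 3; // RGB
    case 3: return 1; // palette index
    case 4: return 2; // greyscale with alpha
    case 6: return 4; // RGBA
    default: return 0;
  }
}

// Total bits per pixel: amount of channels times the bit depth.
unsigned bits_per_pixel(unsigned colortype, unsigned bitdepth)
{
  return channels_for_colortype(colortype) * bitdepth;
}
```

For example, 16-bit RGBA (color type 6) gives 4 channels of 16 bits, which is
the 64-bit color mentioned above.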
1231 | 
1232 | 6.2. color conversions
1233 | ----------------------
1234 | 
1235 | As explained in the sections about the encoder and decoder, you can specify
1236 | color types and bit depths in info_png and info_raw to change the default
1237 | behaviour.
1238 | 
1239 | If, when decoding, you want the raw image to be something else than the default,
1240 | you need to set the color type and bit depth you want in the LodePNGColorMode,
1241 | or the parameters colortype and bitdepth of the simple decoding function.
1242 | 
1243 | If, when encoding, you use another color type than the default in the raw input
1244 | image, you need to specify its color type and bit depth in the LodePNGColorMode
1245 | of the raw image, or use the parameters colortype and bitdepth of the simple
1246 | encoding function.
1247 | 
1248 | If, when encoding, you don't want LodePNG to choose the output PNG color type
1249 | but control it yourself, you need to set auto_convert in the encoder settings
1250 | to false, and specify the color type you want in the LodePNGInfo of the
1251 | encoder (including palette: it can generate a palette if auto_convert is true,
1252 | otherwise not).
1253 | 
1254 | If the input and output color type differ (whether user chosen or auto chosen),
1255 | LodePNG will do a color conversion, which follows the rules below, and may
1256 | sometimes result in an error.
1257 | 
1258 | To avoid some confusion:
1259 | -the decoder converts from PNG to raw image
1260 | -the encoder converts from raw image to PNG
1261 | -the colortype and bitdepth in LodePNGColorMode info_raw, are those of the raw image
1262 | -the colortype and bitdepth in the color field of LodePNGInfo info_png, are those of the PNG
1263 | -when encoding, the color type in LodePNGInfo is ignored if auto_convert
1264 |  is enabled, it is automatically generated instead
1265 | -when decoding, the color type in LodePNGInfo is set by the decoder to that of the original
1266 |  PNG image, but it can be ignored since the raw image has the color type you requested instead
1267 | -if the color type of the LodePNGColorMode and PNG image aren't the same, a conversion
1268 |  between the color types is done if the color types are supported. If it is not
1269 |  supported, an error is returned. If the types are the same, no conversion is done.
1270 | -even though some conversions aren't supported, LodePNG supports loading PNGs from any
1271 |  colortype and saving PNGs to any colortype, sometimes it just requires preparing
1272 |  the raw image correctly before encoding.
1273 | -both encoder and decoder use the same color converter.
1274 | 
1275 | Non supported color conversions:
1276 | -color to greyscale: no error is thrown, but the result will look ugly because
1277 | only the red channel is taken
1278 | -anything to palette when that palette does not have that color in it: in this
1279 | case an error is thrown
1280 | 
1281 | Supported color conversions:
1282 | -anything to 8-bit RGB, 8-bit RGBA, 16-bit RGB, 16-bit RGBA
1283 | -any grey or grey+alpha, to grey or grey+alpha
1284 | -anything to a palette, as long as the palette has the requested colors in it
1285 | -removing alpha channel
1286 | -higher to smaller bitdepth, and vice versa
1287 | 
1288 | If you want no color conversion to be done (e.g. for speed or control):
1289 | -In the encoder, you can make it save a PNG with any color type by giving the
1290 | raw color mode and LodePNGInfo the same color mode, and setting auto_convert to
1291 | false.
1292 | -In the decoder, you can make it store the pixel data in the same color type
1293 | as the PNG has, by setting the color_convert setting to false. Settings in
1294 | info_raw are then ignored.
1295 | 
1296 | The function lodepng_convert does the color conversion. It is available in the
1297 | interface but normally isn't needed since the encoder and decoder already call
1298 | it.
1299 | 
1300 | 6.3. padding bits
1301 | -----------------
1302 | 
1303 | In the PNG file format, if a less than 8-bit per pixel color type is used and the scanlines
1304 | have a bit amount that isn't a multiple of 8, then padding bits are used so that each
1305 | scanline starts at a fresh byte. But that is NOT true for the LodePNG raw input and output.
1306 | The raw input image you give to the encoder, and the raw output image you get from the decoder
1307 | will NOT have these padding bits, e.g. in the case of a 1-bit image with a width
1308 | of 7 pixels, the first pixel of the second scanline will be the 8th bit of the first byte,
1309 | not the first bit of a new byte.
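
To illustrate, here is a small sketch of how a pixel could be looked up in such
an unpadded 1-bit raw buffer. The function get_pixel_1bit is hypothetical, and
it assumes the first pixel sits in the most significant bit of each byte, as in
the PNG format.

```c
#include <stddef.h>

// Reads pixel (x, y) from a 1-bit raw image of the given width that has no
// padding bits between scanlines. Assumes the first pixel of a byte is its
// most significant bit. Hypothetical helper, not part of the LodePNG API.
unsigned get_pixel_1bit(const unsigned char* raw, unsigned width, unsigned x, unsigned y)
{
  size_t bit = (size_t)y * width + x; // bit index over the whole image
  return (raw[bit >> 3] >> (7 - (bit & 7))) & 1u;
}
```

With a width of 7, the pixel at x = 0, y = 1 has bit index 7, so it is read
from the last bit of the first byte rather than from a new byte.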
1310 | 
1311 | 6.4. A note about 16-bits per channel and endianness
1312 | ----------------------------------------------------
1313 | 
1314 | LodePNG uses unsigned char arrays for 16-bit per channel colors too, just like
1315 | for any other color format. The 16-bit values are stored in big endian (most
1316 | significant byte first) in these arrays. This is the opposite order of the
1317 | little endian used by x86 CPUs.
1318 | 
1319 | LodePNG always uses big endian because the PNG file format does so internally.
1320 | Conversions to formats other than the ones PNG uses internally are not supported by
1321 | LodePNG on purpose, there are myriads of formats, including endianness of 16-bit
1322 | colors, the order in which you store R, G, B and A, and so on. Supporting and
1323 | converting to/from all that is outside the scope of LodePNG.
1324 | 
1325 | This may mean that, depending on your use case, you may want to convert the big
1326 | endian output of LodePNG to little endian with a for loop. This is certainly not
1327 | always needed, many applications and libraries support big endian 16-bit colors
1328 | anyway, but it means you cannot simply cast the unsigned char* buffer to an
1329 | unsigned short* buffer on x86 CPUs.
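
The for loop mentioned above could look roughly like the following sketch. The
function be16_to_host is a hypothetical name, not something LodePNG provides.
It combines the two bytes arithmetically, so the same code should give correct
values on both little and big endian hosts.

```c
#include <stddef.h>

// Converts count big endian 16-bit samples from in to host-order unsigned
// shorts in out. The in buffer must hold 2 * count bytes.
// Hypothetical helper, not part of the LodePNG API.
void be16_to_host(unsigned short* out, const unsigned char* in, size_t count)
{
  size_t i;
  for(i = 0; i < count; i++)
  {
    out[i] = (unsigned short)((in[2 * i] << 8) | in[2 * i + 1]);
  }
}
```

For a 16-bit RGBA image, count would be width * height * 4, one entry per
channel sample.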
1330 | 
1331 | 
1332 | 7. error values
1333 | ---------------
1334 | 
1335 | All functions in LodePNG that return an error code, return 0 if everything went
1336 | OK, or a non-zero code if there was an error.
1337 | 
1338 | The meaning of the LodePNG error values can be retrieved with the function
1339 | lodepng_error_text: given the numerical error code, it returns a description
1340 | of the error in English as a string.
1341 | 
1342 | Check the implementation of lodepng_error_text to see the meaning of each code.
1343 | 
1344 | 
1345 | 8. chunks and PNG editing
1346 | -------------------------
1347 | 
1348 | If you want to add extra chunks to a PNG you encode, or use LodePNG for a PNG
1349 | editor that should follow the rules about handling of unknown chunks, or if your
1350 | program is able to read other types of chunks than the ones handled by LodePNG,
1351 | then that's possible with the chunk functions of LodePNG.
1352 | 
1353 | A PNG chunk has the following layout:
1354 | 
1355 | 4 bytes length
1356 | 4 bytes type name
1357 | length bytes data
1358 | 4 bytes CRC
1359 | 
1360 | 8.1. iterating through chunks
1361 | -----------------------------
1362 | 
1363 | If you have a buffer containing the PNG image data, then the first chunk (the
1364 | IHDR chunk) starts at byte number 8 of that buffer. The first 8 bytes are the
1365 | signature of the PNG and are not part of a chunk. But if you start at byte 8
1366 | then you have a chunk, and can check the following things of it.
1367 | 
1368 | NOTE: none of these functions check for memory buffer boundaries. To avoid
1369 | exploits, always make sure the buffer contains all the data of the chunks.
1370 | When using lodepng_chunk_next, make sure the returned value is within the
1371 | allocated memory.
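
As an illustration of the kind of boundary check meant here, the sketch below
walks the chunk layout by hand and refuses to step outside the buffer. It is
self-contained and does not call LodePNG. The names read_be32 and
count_chunks_checked are hypothetical.

```c
#include <stddef.h>

// Reads a 4-byte big endian value, as used for the PNG chunk length.
unsigned long read_be32(const unsigned char* p)
{
  return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16)
       | ((unsigned long)p[2] << 8) | (unsigned long)p[3];
}

// Counts the chunks in a PNG buffer of the given size, starting after the
// 8-byte signature. Returns 0 if there are no chunks or if a chunk would
// run past the end of the buffer.
size_t count_chunks_checked(const unsigned char* png, size_t size)
{
  size_t pos = 8, count = 0;
  while(pos < size)
  {
    unsigned long length;
    if(size - pos < 12) return 0; // no room for length, type and CRC
    length = read_be32(png + pos);
    if(length > size - pos - 12) return 0; // data would exceed the buffer
    pos += 12 + (size_t)length; // total chunk length is data length + 12
    count++;
  }
  return count;
}
```

The same two checks can be applied before calling lodepng_chunk_next on a
chunk pointer: first that 12 bytes are available, then that the declared data
length fits in what remains.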
1372 | 
1373 | unsigned lodepng_chunk_length(const unsigned char* chunk):
1374 | 
1375 | Get the length of the chunk's data. The total chunk length is this length + 12.
1376 | 
1377 | void lodepng_chunk_type(char type[5], const unsigned char* chunk):
1378 | unsigned char lodepng_chunk_type_equals(const unsigned char* chunk, const char* type):
1379 | 
1380 | Get the type of the chunk or compare if it's a certain type
1381 | 
1382 | unsigned char lodepng_chunk_critical(const unsigned char* chunk):
1383 | unsigned char lodepng_chunk_private(const unsigned char* chunk):
1384 | unsigned char lodepng_chunk_safetocopy(const unsigned char* chunk):
1385 | 
1386 | Check if the chunk is critical in the PNG standard (only IHDR, PLTE, IDAT and IEND are).
1387 | Check if the chunk is private (public chunks are part of the standard, private ones not).
1388 | Check if the chunk is safe to copy. If it's not, then, when modifying data in a critical
1389 | chunk, unsafe to copy chunks of the old image may NOT be saved in the new one if your
1390 | program doesn't handle that type of unknown chunk.
1391 | 
1392 | unsigned char* lodepng_chunk_data(unsigned char* chunk):
1393 | const unsigned char* lodepng_chunk_data_const(const unsigned char* chunk):
1394 | 
1395 | Get a pointer to the start of the data of the chunk.
1396 | 
1397 | unsigned lodepng_chunk_check_crc(const unsigned char* chunk):
1398 | void lodepng_chunk_generate_crc(unsigned char* chunk):
1399 | 
1400 | Check if the crc is correct or generate a correct one.
1401 | 
1402 | unsigned char* lodepng_chunk_next(unsigned char* chunk):
1403 | const unsigned char* lodepng_chunk_next_const(const unsigned char* chunk):
1404 | 
1405 | Iterate to the next chunk. This works if you have a buffer with consecutive chunks. Note that these
1406 | functions do no boundary checking of the allocated data whatsoever, so make sure there is enough
1407 | data available in the buffer to be able to go to the next chunk.
1408 | 
1409 | unsigned lodepng_chunk_append(unsigned char** out, size_t* outlength, const unsigned char* chunk):
1410 | unsigned lodepng_chunk_create(unsigned char** out, size_t* outlength, unsigned length,
1411 |                               const char* type, const unsigned char* data):
1412 | 
1413 | These functions are used to create new chunks that are appended to the data in *out that has
1414 | length *outlength. The append function appends an existing chunk to the new data. The create
1415 | function creates a new chunk with the given parameters and appends it. Type is the 4-letter
1416 | name of the chunk.
1417 | 
1418 | 8.2. chunks in info_png
1419 | -----------------------
1420 | 
1421 | The LodePNGInfo struct contains fields that hold the unknown chunks. It has 3
1422 | buffers (each with size) to contain 3 types of unknown chunks:
1423 | the ones that come before the PLTE chunk, the ones that come between the PLTE
1424 | and the IDAT chunks, and the ones that come after the IDAT chunks.
1425 | It's necessary to make the distinction between these 3 cases because the PNG
1426 | standard requires keeping the ordering of unknown chunks relative to the critical
1427 | chunks, but does not impose any other ordering rules.
1428 | 
1429 | info_png.unknown_chunks_data[0] is the chunks before PLTE
1430 | info_png.unknown_chunks_data[1] is the chunks after PLTE, before IDAT
1431 | info_png.unknown_chunks_data[2] is the chunks after IDAT
1432 | 
1433 | The chunks in these 3 buffers can be iterated through and read in the same
1434 | way as described in the previous subchapter.
1435 | 
1436 | When using the decoder to decode a PNG, you can make it store all unknown chunks
1437 | if you set the option settings.remember_unknown_chunks to 1. By default, this
1438 | option is off (0).
1439 | 
1440 | The encoder will always encode unknown chunks that are stored in the info_png.
1441 | If you need it to add a particular chunk that isn't known by LodePNG, you can
1442 | use lodepng_chunk_append or lodepng_chunk_create to add it to the chunk data in
1443 | info_png.unknown_chunks_data[x].
1444 | 
1445 | Chunks that are known by LodePNG should not be added in that way. E.g. to make
1446 | LodePNG add a bKGD chunk, set background_defined to true and add the correct
1447 | parameters there instead.
1448 | 
1449 | 
1450 | 9. compiler support
1451 | -------------------
1452 | 
1453 | No libraries other than the current standard C library are needed to compile
1454 | LodePNG. For the C++ version, only the standard C++ library is needed on top.
1455 | Add the files lodepng.c(pp) and lodepng.h to your project, include
1456 | lodepng.h where needed, and your program can read/write PNG files.
1457 | 
1458 | It is compatible with C90 and up, and C++03 and up.
1459 | 
1460 | If performance is important, use optimization when compiling! For both the
1461 | encoder and decoder, this makes a large difference.
1462 | 
1463 | Make sure that LodePNG is compiled with the same compiler of the same version
1464 | and with the same settings as the rest of the program, or the interfaces with
1465 | std::vectors and std::strings in C++ can be incompatible.
1466 | 
1467 | CHAR_BIT must be 8 or higher, because LodePNG uses unsigned chars for octets.
1468 | 
1469 | *) gcc and g++
1470 | 
1471 | LodePNG is developed in gcc so this compiler is natively supported. It gives no
1472 | warnings with compiler options "-Wall -Wextra -pedantic -ansi", with gcc and g++
1473 | version 4.7.1 on Linux, 32-bit and 64-bit.
1474 | 
1475 | *) Clang
1476 | 
1477 | Fully supported and warning-free.
1478 | 
1479 | *) Mingw
1480 | 
1481 | The Mingw compiler (a port of gcc for Windows) should be fully supported by
1482 | LodePNG.
1483 | 
1484 | *) Visual Studio and Visual C++ Express Edition
1485 | 
1486 | LodePNG should be warning-free with warning level W4. Two warnings were disabled
1487 | with pragmas though: warning 4244 about implicit conversions, and warning 4996
1488 | where it wants to use a non-standard function fopen_s instead of the standard C
1489 | fopen.
1490 | 
1491 | Visual Studio may want "stdafx.h" files to be included in each source file and
1492 | give an error "unexpected end of file while looking for precompiled header".
1493 | This is not standard C++ and will not be added to the stock LodePNG. You can
1494 | disable it for lodepng.cpp only by right clicking it, Properties, C/C++,
1495 | Precompiled Headers, and set it to Not Using Precompiled Headers there.
1496 | 
1497 | NOTE: Modern versions of VS should be fully supported, but old versions, e.g.
1498 | VS6, are not guaranteed to work.
1499 | 
1500 | *) Compilers on Macintosh
1501 | 
1502 | LodePNG has been reported to work both with gcc and LLVM for Macintosh, both for
1503 | C and C++.
1504 | 
1505 | *) Other Compilers
1506 | 
1507 | If you encounter problems on any compilers, feel free to let me know and I may
1508 | try to fix it if the compiler is modern and standards compliant.
1509 | 
1510 | 
1511 | 10. examples
1512 | ------------
1513 | 
1514 | This decoder example shows the most basic usage of LodePNG. More complex
1515 | examples can be found on the LodePNG website.
1516 | 
1517 | 10.1. decoder C++ example
1518 | -------------------------
1519 | 
1520 | #include "lodepng.h"
1521 | #include <iostream>
1522 | 
1523 | int main(int argc, char *argv[])
1524 | {
1525 |   const char* filename = argc > 1 ? argv[1] : "test.png";
1526 | 
1527 |   //load and decode
1528 |   std::vector<unsigned char> image;
1529 |   unsigned width, height;
1530 |   unsigned error = lodepng::decode(image, width, height, filename);
1531 | 
1532 |   //if there's an error, display it
1533 |   if(error) std::cout << "decoder error " << error << ": " << lodepng_error_text(error) << std::endl;
1534 | 
1535 |   //the pixels are now in the vector "image", 4 bytes per pixel, ordered RGBARGBA..., use it as texture, draw it, ...
1536 | }
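
The comment at the end of the example above says the decoded pixels are stored
4 bytes per pixel in RGBA order. A minimal sketch of reading one pixel from such
a buffer, assuming rows are tightly packed (no padding) and stored top to
bottom. The names Rgba and pixel_at are hypothetical helpers for illustration
and are not part of LodePNG.

```cpp
#include <cstddef>
#include <vector>

// One decoded pixel: four 8-bit channels in the order they appear in memory.
struct Rgba
{
  unsigned char r, g, b, a;
};

// Hypothetical helper (not part of LodePNG): returns pixel (x, y) from a
// tightly packed RGBA buffer of the given width. No bounds checking is done.
Rgba pixel_at(const std::vector<unsigned char>& image, unsigned width, unsigned x, unsigned y)
{
  // Each row holds width pixels and each pixel occupies 4 bytes.
  std::size_t i = 4 * (static_cast<std::size_t>(y) * width + x);
  return Rgba{image[i], image[i + 1], image[i + 2], image[i + 3]};
}
```

With the vector filled by lodepng::decode, pixel_at(image, width, 0, 0) would
give the top-left pixel under these assumptions.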
1537 | 
1538 | 10.2. decoder C example
1539 | -----------------------
1540 | 
1541 | #include "lodepng.h"
1542 | #include <stdio.h>
1543 | #include <stdlib.h>
1544 | 
1545 | int main(int argc, char *argv[])
1546 | {
1547 |   unsigned error, width, height;
1548 |   unsigned char* image;
1549 |   const char* filename = argc > 1 ? argv[1] : "test.png";
1550 |   error = lodepng_decode32_file(&image, &width, &height, filename);
1551 | 
1552 |   if(error) printf("decoder error %u: %s\n", error, lodepng_error_text(error));
1553 | 
1554 |   / * use image here * /
1555 | 
1556 |   free(image);
1557 |   return 0;
1558 | }
1559 | 
1560 | 11. state settings reference
1561 | ----------------------------
1562 | 
1563 | A quick reference of some settings to set on the LodePNGState.
1564 | 
1565 | For decoding:
1566 | 
1567 | state.decoder.zlibsettings.ignore_adler32: ignore ADLER32 checksums
1568 | state.decoder.zlibsettings.custom_...: use custom inflate function
1569 | state.decoder.ignore_crc: ignore CRC checksums
1570 | state.decoder.color_convert: convert internal PNG color to chosen one
1571 | state.decoder.read_text_chunks: whether to read in text metadata chunks
1572 | state.decoder.remember_unknown_chunks: whether to read in unknown chunks
1573 | state.info_raw.colortype: desired color type for decoded image
1574 | state.info_raw.bitdepth: desired bit depth for decoded image
1575 | state.info_raw....: more color settings, see struct LodePNGColorMode
1576 | state.info_png....: no settings for decoder but output, see struct LodePNGInfo
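
For context on the ignore_adler32 setting listed above: a zlib stream ends with
an Adler-32 checksum of the uncompressed data, and that is the check this
setting skips. The following is a minimal sketch of Adler-32 as defined in the
zlib format specification (RFC 1950). The function name adler32_sketch is made
up for illustration; this is not LodePNG's internal implementation, which may
be organized differently.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative Adler-32 (RFC 1950). Hypothetical helper, not LodePNG code.
std::uint32_t adler32_sketch(const unsigned char* data, std::size_t size)
{
  std::uint32_t a = 1; // running sum of all bytes, starting at 1
  std::uint32_t b = 0; // running sum of the successive values of a
  for(std::size_t i = 0; i < size; ++i)
  {
    // 65521 is the largest prime below 2^16.
    a = (a + data[i]) % 65521u;
    b = (b + a) % 65521u;
  }
  // The checksum is b in the high 16 bits and a in the low 16 bits.
  return (b << 16) | a;
}
```

A decoder that verifies the checksum would compare this value, computed over
the inflated data, against the 4 bytes stored at the end of the zlib stream.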
1577 | 
1578 | For encoding:
1579 | 
1580 | state.encoder.zlibsettings.btype: disable compression by setting it to 0
1581 | state.encoder.zlibsettings.use_lz77: use LZ77 in compression
1582 | state.encoder.zlibsettings.windowsize: tweak LZ77 windowsize
1583 | state.encoder.zlibsettings.minmatch: tweak min LZ77 length to match
1584 | state.encoder.zlibsettings.nicematch: tweak LZ77 match where to stop searching
1585 | state.encoder.zlibsettings.lazymatching: try one more LZ77 matching
1586 | state.encoder.zlibsettings.custom_...: use custom deflate function
1587 | state.encoder.auto_convert: choose optimal PNG color type, if 0 uses info_png
1588 | state.encoder.filter_palette_zero: PNG filter strategy for palette
1589 | state.encoder.filter_strategy: PNG filter strategy to encode with
1590 | state.encoder.force_palette: add palette even if not encoding to one
1591 | state.encoder.add_id: add LodePNG identifier and version as a text chunk
1592 | state.encoder.text_compression: use compressed text chunks for metadata
1593 | state.info_raw.colortype: color type of raw input image you provide
1594 | state.info_raw.bitdepth: bit depth of raw input image you provide
1595 | state.info_raw: more color settings, see struct LodePNGColorMode
1596 | state.info_png.color.colortype: desired color type if auto_convert is false
1597 | state.info_png.color.bitdepth: desired bit depth if auto_convert is false
1598 | state.info_png.color....: more color settings, see struct LodePNGColorMode
1599 | state.info_png....: more PNG related settings, see struct LodePNGInfo
1600 | 
1601 | 
1602 | 12. changes
1603 | -----------
1604 | 
1605 | The version number of LodePNG is the date of the change given in the format
1606 | yyyymmdd.
1607 | 
1608 | Some changes aren't backwards compatible. Those are indicated with a (!)
1609 | symbol.
1610 | 
1611 | *) 17 sep 2017: fix memory leak for some encoder input error cases
1612 | *) 27 nov 2016: grey+alpha auto color model detection bugfix
1613 | *) 18 apr 2016: Changed qsort to custom stable sort (for platforms w/o qsort).
1614 | *) 09 apr 2016: Fixed colorkey usage detection, and better file loading (within
1615 |    the limits of pure C90).
1616 | *) 08 dec 2015: Made load_file function return error if file can't be opened.
1617 | *) 24 oct 2015: Bugfix with decoding to palette output.
1618 | *) 18 apr 2015: Boundary PM instead of just package-merge for faster encoding.
1619 | *) 23 aug 2014: Reduced needless memory usage of decoder.
1620 | *) 28 jun 2014: Removed fix_png setting, always support palette OOB for
1621 |     simplicity. Made ColorProfile public.
1622 | *) 09 jun 2014: Faster encoder by fixing hash bug and more zeros optimization.
1623 | *) 22 dec 2013: Power of two windowsize required for optimization.
1624 | *) 15 apr 2013: Fixed bug with LAC_ALPHA and color key.
1625 | *) 25 mar 2013: Added an optional feature to ignore some PNG errors (fix_png).
1626 | *) 11 mar 2013 (!): Bugfix with custom free. Changed from "my" to "lodepng_"
1627 |     prefix for the custom allocators and made it possible with a new #define to
1628 |     use custom ones in your project without needing to change lodepng's code.
1629 | *) 28 jan 2013: Bugfix with color key.
1630 | *) 27 oct 2012: Tweaks in text chunk keyword length error handling.
1631 | *) 8 oct 2012 (!): Added new filter strategy (entropy) and new auto color mode
1632 |     (no palette). Better deflate tree encoding. New compression tweak settings.
1633 |     Faster color conversions while decoding. Some internal cleanups.
1634 | *) 23 sep 2012: Reduced warnings in Visual Studio a little bit.
1635 | *) 1 sep 2012 (!): Removed #define's for giving custom (de)compression functions
1636 |     and made it work with function pointers instead.
1637 | *) 23 jun 2012: Added more filter strategies. Made it easier to use custom alloc
1638 |     and free functions and toggle #defines from compiler flags. Small fixes.
1639 | *) 6 may 2012 (!): Made plugging in custom zlib/deflate functions more flexible.
1640 | *) 22 apr 2012 (!): Made interface more consistent, renaming a lot. Removed
1641 |     redundant C++ codec classes. Reduced amount of structs. Everything changed,
1642 |     but it is cleaner now imho and functionality remains the same. Also fixed
1643 |     several bugs and shrunk the implementation code. Made new samples.
1644 | *) 6 nov 2011 (!): By default, the encoder now automatically chooses the best
1645 |     PNG color model and bit depth, based on the amount and type of colors of the
1646 |     raw image. For this, autoLeaveOutAlphaChannel replaced by auto_choose_color.
1647 | *) 9 oct 2011: simpler hash chain implementation for the encoder.
1648 | *) 8 sep 2011: lz77 encoder lazy matching instead of greedy matching.
1649 | *) 23 aug 2011: tweaked the zlib compression parameters after benchmarking.
1650 |     A bug with the PNG filtertype heuristic was fixed, so that it chooses much
1651 |     better ones (it's quite significant). A setting to do an experimental, slow,
1652 |     brute force search for PNG filter types is added.
1653 | *) 17 aug 2011 (!): changed some C zlib related function names.
1654 | *) 16 aug 2011: made the code less wide (max 120 characters per line).
1655 | *) 17 apr 2011: code cleanup. Bugfixes. Convert low to 16-bit per sample colors.
1656 | *) 21 feb 2011: fixed compiling for C90. Fixed compiling with sections disabled.
1657 | *) 11 dec 2010: encoding is made faster, based on suggestion by Peter Eastman
1658 |     to optimize long sequences of zeros.
1659 | *) 13 nov 2010: added LodePNG_InfoColor_hasPaletteAlpha and
1660 |     LodePNG_InfoColor_canHaveAlpha functions for convenience.
1661 | *) 7 nov 2010: added LodePNG_error_text function to get error code description.
1662 | *) 30 oct 2010: made decoding slightly faster
1663 | *) 26 oct 2010: (!) changed some C function and struct names (more consistent).
1664 |      Reorganized the documentation and the declaration order in the header.
1665 | *) 08 aug 2010: only changed some comments and external samples.
1666 | *) 05 jul 2010: fixed bug thanks to warnings in the new gcc version.
1667 | *) 14 mar 2010: fixed bug where too much memory was allocated for char buffers.
1668 | *) 02 sep 2008: fixed bug where it could create empty tree that linux apps could
1669 |     read by ignoring the problem but windows apps couldn't.
1670 | *) 06 jun 2008: added more error checks for out of memory cases.
1671 | *) 26 apr 2008: added a few more checks here and there to ensure more safety.
1672 | *) 06 mar 2008: crash with encoding of strings fixed
1673 | *) 02 feb 2008: support for international text chunks added (iTXt)
1674 | *) 23 jan 2008: small cleanups, and #defines to divide code in sections
1675 | *) 20 jan 2008: support for unknown chunks allowing using LodePNG for an editor.
1676 | *) 18 jan 2008: support for tIME and pHYs chunks added to encoder and decoder.
1677 | *) 17 jan 2008: ability to encode and decode compressed zTXt chunks added
1678 |     Also various fixes, such as in the deflate and the padding bits code.
1679 | *) 13 jan 2008: Added ability to encode Adam7-interlaced images. Improved
1680 |     filtering code of encoder.
1681 | *) 07 jan 2008: (!) changed LodePNG to use ISO C90 instead of C++. A
1682 |     C++ wrapper around this provides an interface almost identical to before.
1683 |     Having LodePNG be pure ISO C90 makes it more portable. The C and C++ code
1684 |     are together in these files but it works both for C and C++ compilers.
1685 | *) 29 dec 2007: (!) changed most integer types to unsigned int + other tweaks
1686 | *) 30 aug 2007: bug fixed which makes this Borland C++ compatible
1687 | *) 09 aug 2007: some VS2005 warnings removed again
1688 | *) 21 jul 2007: deflate code placed in new namespace separate from zlib code
1689 | *) 08 jun 2007: fixed bug with 2- and 4-bit color, and small interlaced images
1690 | *) 04 jun 2007: improved support for Visual Studio 2005: crash with accessing
1691 |     invalid std::vector element [0] fixed, and level 3 and 4 warnings removed
1692 | *) 02 jun 2007: made the encoder add a tag with version by default
1693 | *) 27 may 2007: zlib and png code separated (but still in the same file),
1694 |     simple encoder/decoder functions added for more simple usage cases
1695 | *) 19 may 2007: minor fixes, some code cleaning, new error added (error 69),
1696 |     moved some examples from here to lodepng_examples.cpp
1697 | *) 12 may 2007: palette decoding bug fixed
1698 | *) 24 apr 2007: changed the license from BSD to the zlib license
1699 | *) 11 mar 2007: very simple addition: ability to encode bKGD chunks.
1700 | *) 04 mar 2007: (!) tEXt chunk related fixes, and support for encoding
1701 |     palettized PNG images. Plus little interface change with palette and texts.
1702 | *) 03 mar 2007: Made it encode dynamic Huffman shorter with repeat codes.
1703 |     Fixed a bug where the end code of a block had length 0 in the Huffman tree.
1704 | *) 26 feb 2007: Huffman compression with dynamic trees (BTYPE 2) now implemented
1705 |     and supported by the encoder, resulting in smaller PNGs at the output.
1706 | *) 27 jan 2007: Made the Adler-32 test faster so that a timewaste is gone.
1707 | *) 24 jan 2007: gave encoder an error interface. Added color conversion from any
1708 |     greyscale type to 8-bit greyscale with or without alpha.
1709 | *) 21 jan 2007: (!) Totally changed the interface. It allows more color types
1710 |     to convert to and is more uniform. See the manual for how it works now.
1711 | *) 07 jan 2007: Some cleanup & fixes, and a few changes over the last days:
1712 |     encode/decode custom tEXt chunks, separate classes for zlib & deflate, and
1713 |     at last made the decoder give errors for incorrect Adler32 or Crc.
1714 | *) 01 jan 2007: Fixed bug with encoding PNGs with less than 8 bits per channel.
1715 | *) 29 dec 2006: Added support for encoding images without alpha channel, and
1716 |     cleaned out code as well as making certain parts faster.
1717 | *) 28 dec 2006: Added "Settings" to the encoder.
1718 | *) 26 dec 2006: The encoder now does LZ77 encoding and produces much smaller files.
1719 |     Removed some code duplication in the decoder. Fixed little bug in an example.
1720 | *) 09 dec 2006: (!) Placed output parameters of public functions as first parameter.
1721 |     Fixed a bug of the decoder with 16-bit per color.
1722 | *) 15 oct 2006: Changed documentation structure
1723 | *) 09 oct 2006: Encoder class added. It encodes a valid PNG image from the
1724 |     given image buffer, however for now it's not compressed.
1725 | *) 08 sep 2006: (!) Changed to interface with a Decoder class
1726 | *) 30 jul 2006: (!) LodePNG_InfoPng, width and height are now retrieved in a
1727 |     different way. Renamed decodePNG to decodePNGGeneric.
1728 | *) 29 jul 2006: (!) Changed the interface: image info is now returned as a
1729 |     struct of type LodePNG::LodePNG_Info, instead of a vector, which was a bit clumsy.
1730 | *) 28 jul 2006: Cleaned the code and added new error checks.
1731 |     Corrected terminology "deflate" into "inflate".
1732 | *) 23 jun 2006: Added SDL example in the documentation in the header, this
1733 |     example allows easy debugging by displaying the PNG and its transparency.
1734 | *) 22 jun 2006: (!) Changed way to obtain error value. Added
1735 |     loadFile function for convenience. Made decodePNG32 faster.
1736 | *) 21 jun 2006: (!) Changed type of info vector to unsigned.
1737 |     Changed position of palette in info vector. Fixed an important bug that
1738 |     happened on PNGs with an uncompressed block.
1739 | *) 16 jun 2006: Internally changed unsigned into unsigned where
1740 |     needed, and performed some optimizations.
1741 | *) 07 jun 2006: (!) Renamed functions to decodePNG and placed them
1742 |     in LodePNG namespace. Changed the order of the parameters. Rewrote the
1743 |     documentation in the header. Renamed files to lodepng.cpp and lodepng.h
1744 | *) 22 apr 2006: Optimized and improved some code
1745 | *) 07 sep 2005: (!) Changed to std::vector interface
1746 | *) 12 aug 2005: Initial release (C++, decoder only)
1747 | 
1748 | 
1749 | 13. contact information
1750 | -----------------------
1751 | 
1752 | Feel free to contact me with suggestions, problems, comments, ... concerning
1753 | LodePNG. If you encounter a PNG image that doesn't work properly with this
1754 | decoder, feel free to send it and I'll use it to find and fix the problem.
1755 | 
1756 | My email address is (puzzle the account and domain together with an @ symbol):
1757 | Domain: gmail dot com.
1758 | Account: lode dot vandevenne.
1759 | 
1760 | 
1761 | Copyright (c) 2005-2017 Lode Vandevenne
1762 | */
1763 | 


--------------------------------------------------------------------------------