├── .gitignore
├── doc
│   ├── lod_generate.jpg
│   ├── arcsin_angular_error.svg
│   ├── spatial_selection.svg
│   ├── hierarchy_selection.svg
│   └── graph_cut.svg
├── .gitmodules
├── src
│   ├── array_view.natvis
│   ├── nvclusterlod_context.hpp
│   ├── nvclusterlod_context.cpp
│   ├── nvclusterlod_cpp.hpp
│   ├── array_view.hpp
│   └── nvclusterlod_hierarchy.cpp
├── CHANGELOG.md
├── CONTRIBUTING.txt
├── .clang-format
├── test
│   ├── src
│   │   └── test_lod_capi.c
│   └── CMakeLists.txt
├── include
│   └── nvclusterlod
│       ├── nvclusterlod_hierarchy_storage.hpp
│       ├── nvclusterlod_common.h
│       ├── nvclusterlod_hierarchy.h
│       ├── nvclusterlod_mesh_storage.hpp
│       ├── nvclusterlod_mesh.h
│       └── nvclusterlod_cache.hpp
├── CMakeLists.txt
├── LICENSE
└── README.md

/.gitignore:
--------------------------------------------------------------------------------
1 | build
2 | .vscode
3 | 
--------------------------------------------------------------------------------
/doc/lod_generate.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nvpro-samples/nv_cluster_lod_builder/HEAD/doc/lod_generate.jpg
--------------------------------------------------------------------------------
/.gitmodules:
--------------------------------------------------------------------------------
1 | [submodule "nv_cluster_builder"]
2 | 	path = nv_cluster_builder
3 | 	url = ../nv_cluster_builder.git
--------------------------------------------------------------------------------
/src/array_view.natvis:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" encoding="utf-8"?>
2 | <AutoVisualizer xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">
3 |   <Type Name="nvclusterlod::ArrayView&lt;*&gt;"> <!-- XML tags were lost in extraction; structure restored, type name assumed -->
4 |     <DisplayString>{{ size={m_size} }}</DisplayString>
5 |     <Expand>
6 |       <Item Name="m_ptr">m_ptr</Item>
7 |       <Item Name="m_size">m_size</Item>
8 |       <Item Name="m_stride">m_stride</Item>
9 |       <IndexListItems>
10 |         <Size>m_size</Size>
11 |         <ValueNode>*(value_type*)((uint8_t*)m_ptr + $i * m_stride)</ValueNode>
12 |       </IndexListItems>
13 |     </Expand>
14 |   </Type>
15 | </AutoVisualizer>
16 | 
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
1 | # Changelog
2 | 
3 | ## [4]
4 | 
5 | ### Added
6 | 
7 | - Optional decimation callback
8 | 
9 | ## [3]
10 | 
11 | ### Added
12 | 
13 | - Support `nvclusterlod_ContextCreateInfo::parallelize`.
14 | 
15 | ### Changed
16 | 
17 | - Modified error enums.
18 | - Use `std::span` internally.
19 | 
20 | ## [2]
21 | 
22 | ### Added
23 | 
24 | - Shared library support in CMake (`NVCLUSTERLOD_BUILDER_SHARED`).
25 | - Fallback for missing libc++ parallel execution.
26 | 
27 | ### Changed
28 | 
29 | - Real C API, removing namespace, adding prefixes, and symbol export.
30 | - Triangles now `vec3u` rather than indices.
31 | - Spheres use `vec3f` center.
32 | 
33 | ### Removed
34 | 
35 | - `vertexOffset` input parameter.
--------------------------------------------------------------------------------
/src/nvclusterlod_context.hpp:
--------------------------------------------------------------------------------
1 | /*
2 |  * Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.
3 |  *
4 |  * Licensed under the Apache License, Version 2.0 (the "License");
5 |  * you may not use this file except in compliance with the License.
6 |  * You may obtain a copy of the License at
7 |  *
8 |  *     http://www.apache.org/licenses/LICENSE-2.0
9 |  *
10 |  * Unless required by applicable law or agreed to in writing, software
11 |  * distributed under the License is distributed on an "AS IS" BASIS,
12 |  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 |  * See the License for the specific language governing permissions and
14 |  * limitations under the License.
15 |  *
16 |  * SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION
17 |  * SPDX-License-Identifier: Apache-2.0
18 |  */
19 | 
20 | 
21 | #pragma once
22 | 
23 | #include <nvcluster/nvcluster.h>
24 | #include <nvclusterlod/nvclusterlod_common.h>
25 | 
26 | struct nvclusterlod_Context_t
27 | {
28 |   nvcluster_Context clusterContext;
29 |   bool              parallelize = false;
30 | };
--------------------------------------------------------------------------------
/CONTRIBUTING.txt:
--------------------------------------------------------------------------------
1 | Developer Certificate of Origin
2 | Version 1.1
3 | 
4 | Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
5 | 
6 | Everyone is permitted to copy and distribute verbatim copies of this
7 | license document, but changing it is not allowed.
8 | 
9 | 
10 | Developer's Certificate of Origin 1.1
11 | 
12 | By making a contribution to this project, I certify that:
13 | 
14 | (a) The contribution was created in whole or in part by me and I
15 |     have the right to submit it under the open source license
16 |     indicated in the file; or
17 | 
18 | (b) The contribution is based upon previous work that, to the best
19 |     of my knowledge, is covered under an appropriate open source
20 |     license and I have the right under that license to submit that
21 |     work with modifications, whether created in whole or in part
22 |     by me, under the same open source license (unless I am
23 |     permitted to submit under a different license), as indicated
24 |     in the file; or
25 | 
26 | (c) The contribution was provided directly to me by some other
27 |     person who certified (a), (b) or (c) and I have not modified
28 |     it.
29 | 
30 | (d) I understand and agree that this project and the contribution
31 |     are public and that a record of the contribution (including all
32 |     personal information I submit with it, including my sign-off) is
33 |     maintained indefinitely and may be redistributed consistent with
34 |     this project or the open source license(s) involved.
35 | -------------------------------------------------------------------------------- /.clang-format: -------------------------------------------------------------------------------- 1 | BasedOnStyle: LLVM 2 | AccessModifierOffset: '-2' 3 | AlignAfterOpenBracket: Align 4 | AlignConsecutiveAssignments: 'true' 5 | AlignConsecutiveDeclarations: 'true' 6 | AlignOperands: 'true' 7 | AlignTrailingComments: 'true' 8 | AllowAllParametersOfDeclarationOnNextLine: 'false' 9 | AllowShortBlocksOnASingleLine: 'false' 10 | AllowShortCaseLabelsOnASingleLine: 'false' 11 | AllowShortFunctionsOnASingleLine: Inline 12 | AllowShortIfStatementsOnASingleLine: 'false' 13 | AllowShortLoopsOnASingleLine: 'false' 14 | AlwaysBreakAfterReturnType: None 15 | AlwaysBreakBeforeMultilineStrings: 'true' 16 | AlwaysBreakTemplateDeclarations: 'true' 17 | BinPackArguments: 'true' 18 | BinPackParameters: 'false' 19 | ExperimentalAutoDetectBinPacking: 'false' 20 | BreakBeforeBinaryOperators: NonAssignment 21 | BreakBeforeBraces: Custom 22 | BreakBeforeTernaryOperators: 'false' 23 | BreakConstructorInitializersBeforeComma: 'true' 24 | ColumnLimit: '120' 25 | ConstructorInitializerAllOnOneLineOrOnePerLine: 'false' 26 | Cpp11BracedListStyle: 'true' 27 | IndentCaseLabels: 'true' 28 | IndentWidth: '2' 29 | KeepEmptyLinesAtTheStartOfBlocks: 'true' 30 | Language: Cpp 31 | MaxEmptyLinesToKeep: '2' 32 | NamespaceIndentation: None 33 | ObjCSpaceBeforeProtocolList: 'true' 34 | PointerAlignment: Left 35 | SpaceAfterCStyleCast: 'false' 36 | SpaceBeforeAssignmentOperators: 'true' 37 | SpaceBeforeParens: Never 38 | SpaceInEmptyParentheses: 'false' 39 | SpacesBeforeTrailingComments: '2' 40 | SpacesInAngles: 'false' 41 | SpacesInCStyleCastParentheses: 'false' 42 | SpacesInParentheses: 'false' 43 | SpacesInSquareBrackets: 'false' 44 | Standard: Cpp11 45 | TabWidth: '2' 46 | UseTab: Never 47 | SortIncludes: 'true' 48 | ReflowComments: 'false' 49 | BraceWrapping: { 50 | AfterClass: 'true', 51 | AfterControlStatement: 'true', 52 | AfterEnum: 'true', 53 | AfterFunction: 'true', 54 | AfterNamespace: 'false', 55 | AfterStruct: 'true', 56 | AfterUnion: 'true', 57 | BeforeCatch: 'true', 58 | BeforeElse: 'true', 59 | IndentBraces: 'false' 60 | } 61 | PenaltyExcessCharacter: 1 62 | PenaltyBreakBeforeFirstCallParameter: 40 63 | PenaltyBreakFirstLessLess: 1 64 | PenaltyBreakComment: 30 65 | PenaltyBreakString: 30 66 | PenaltyReturnTypeOnItsOwnLine: 9999 67 | -------------------------------------------------------------------------------- /test/src/test_lod_capi.c: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 
15 |  *
16 |  * SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION
17 |  * SPDX-License-Identifier: Apache-2.0
18 |  */
19 | 
20 | #ifdef __cplusplus
21 | #error This file verifies the API is C compatible
22 | #endif
23 | 
24 | #include <nvcluster/nvcluster.h>
25 | #include <nvclusterlod/nvclusterlod_common.h>
26 | #include <stdio.h>
27 | 
28 | int runCTest(void)
29 | {
30 |   nvcluster_ContextCreateInfo clusterCreateInfo   = nvcluster_defaultContextCreateInfo();
31 |   nvcluster_Context           clusterContext      = 0;
32 |   nvcluster_Result            clusterCreateResult = nvclusterCreateContext(&clusterCreateInfo, &clusterContext);
33 |   if(clusterCreateResult != NVCLUSTER_SUCCESS)
34 |   {
35 |     printf("Create Context Result: %s\n", nvclusterResultString(clusterCreateResult));
36 |     return 0;
37 |   }
38 | 
39 |   nvclusterlod_ContextCreateInfo lodCreateInfo = {
40 |       .version        = NVCLUSTERLOD_VERSION,
41 |       .parallelize    = NVCLUSTER_TRUE,
42 |       .clusterContext = clusterContext,
43 |   };
44 |   nvclusterlod_Context lodContext      = 0;
45 |   nvclusterlod_Result  lodCreateResult = nvclusterlodCreateContext(&lodCreateInfo, &lodContext);
46 |   if(lodCreateResult != NVCLUSTERLOD_SUCCESS)
47 |   {
48 |     printf("Create Context Result: %s\n", nvclusterlodResultString(lodCreateResult));
49 |     return 0;
50 |   }
51 | 
52 |   nvclusterlod_Result lodDestroyResult = nvclusterlodDestroyContext(lodContext);
53 |   if(lodDestroyResult != NVCLUSTERLOD_SUCCESS)
54 |   {
55 |     printf("Destroy Context Result: %s\n", nvclusterlodResultString(lodDestroyResult));
56 |     return 0;
57 |   }
58 | 
59 |   nvcluster_Result clusterDestroyResult = nvclusterDestroyContext(clusterContext);
60 |   if(clusterDestroyResult != NVCLUSTER_SUCCESS)
61 |   {
62 |     printf("Destroy Context Result: %s\n", nvclusterResultString(clusterDestroyResult));
63 |     return 0;
64 |   }
65 |   return 1;
66 | }
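// Illustrative stand-alone driver (hypothetical; in the repository this file
// is linked into the GoogleTest executable defined in test/CMakeLists.txt):
//
//   int main(void) { return runCTest() ? 0 : 1; }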
--------------------------------------------------------------------------------
/test/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | #     http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | #
15 | # SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION
16 | # SPDX-License-Identifier: Apache-2.0
17 | 
18 | find_package(GTest QUIET)
19 | if(NOT GTest_FOUND)
20 |   include(FetchContent)
21 |   FetchContent_Declare(
22 |     googletest
23 |     GIT_REPOSITORY https://github.com/google/googletest.git
24 |     GIT_TAG v1.14.0
25 |     GIT_SHALLOW TRUE
26 |   )
27 |   FetchContent_MakeAvailable(googletest)
28 | endif()
29 | 
30 | add_executable(nv_cluster_lod_builder_tests
31 |   src/test_lod.cpp
32 |   src/test_lod_capi.c
33 | )
34 | target_link_libraries(nv_cluster_lod_builder_tests nv_cluster_lod_builder gtest_main gmock_main)
35 | 
36 | # Opportunistically link meshoptimizer if a target for it exists
37 | if(TARGET meshoptimizer)
38 |   target_link_libraries(nv_cluster_lod_builder_tests meshoptimizer)
39 |   target_compile_definitions(nv_cluster_lod_builder_tests PRIVATE TESTS_HAVE_MESHOPTIMIZER=1)
40 | else()
41 |   target_compile_definitions(nv_cluster_lod_builder_tests PRIVATE TESTS_HAVE_MESHOPTIMIZER=0)
42 | endif()
43 | 
44 | if(MSVC)
45 |   target_compile_options(nv_cluster_lod_builder_tests PRIVATE
46 |     $<$<COMPILE_LANGUAGE:C,CXX>:/W4>
47 |     $<$<COMPILE_LANGUAGE:C,CXX>:/WX>
48 |     $<$<COMPILE_LANGUAGE:C,CXX>:/wd4201> # nonstandard extension used: nameless struct/union
49 |   )
50 |   target_compile_definitions(nv_cluster_lod_builder_tests PRIVATE WIN32_LEAN_AND_MEAN=1 NOMINMAX)
51 | else()
52 |   target_compile_options(nv_cluster_lod_builder_tests PRIVATE
53 |     $<$<COMPILE_LANGUAGE:C,CXX>:-Wall>
54 |     $<$<COMPILE_LANGUAGE:C,CXX>:-Wextra>
55 |     $<$<COMPILE_LANGUAGE:C,CXX>:-Wpedantic>
56 |     $<$<COMPILE_LANGUAGE:C,CXX>:-Wshadow>
57 |     $<$<COMPILE_LANGUAGE:C,CXX>:-Wconversion>
58 |     $<$<COMPILE_LANGUAGE:C,CXX>:-Werror>
59 |   )
60 |   if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
61 |     target_compile_definitions(nv_cluster_lod_builder_tests PRIVATE
62 |       $<$<COMPILE_LANGUAGE:CXX>:_GLIBCXX_ASSERTIONS>
63 |       # Do not use ABI breaking _GLIBCXX_DEBUG or _GLIBCXX_DEBUG_BACKTRACE
64 |     )
65 |   endif()
66 | endif()
67 | 
68 | include(GoogleTest)
69 | gtest_discover_tests(nv_cluster_lod_builder_tests)
--------------------------------------------------------------------------------
/include/nvclusterlod/nvclusterlod_hierarchy_storage.hpp:
--------------------------------------------------------------------------------
1 | /*
2 |  * Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.
3 |  *
4 |  * Licensed under the Apache License, Version 2.0 (the "License");
5 |  * you may not use this file except in compliance with the License.
6 |  * You may obtain a copy of the License at
7 |  *
8 |  *     http://www.apache.org/licenses/LICENSE-2.0
9 |  *
10 |  * Unless required by applicable law or agreed to in writing, software
11 |  * distributed under the License is distributed on an "AS IS" BASIS,
12 |  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 |  * See the License for the specific language governing permissions and
14 |  * limitations under the License.
15 |  *
16 |  * SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION
17 |  * SPDX-License-Identifier: Apache-2.0
18 |  */
19 | 
20 | #pragma once
21 | 
22 | #include <nvclusterlod/nvclusterlod_hierarchy.h>
23 | 
24 | #include <stdint.h>
25 | #include <vector>
26 | 
27 | namespace nvclusterlod {
28 | 
29 | // Shortcut and storage for hierarchy output
30 | struct LodHierarchy
31 | {
32 |   std::vector<nvclusterlod_HierarchyNode> nodes;
33 |   std::vector<nvclusterlod_Sphere>        groupCumulativeBoundingSpheres;
34 |   std::vector<float>                      groupCumulativeQuadricError;
35 | 
36 |   void shrink_to_fit()
37 |   {
38 |     // nodes is conservatively sized for hierarchy output. If this object is
39 |     // kept around, memory can be saved by reallocating.
40 |     nodes.shrink_to_fit();
41 |   }
42 | };
43 | 
44 | inline nvclusterlod_Result generateLodHierarchy(nvclusterlod_Context context, const nvclusterlod_HierarchyInput& input, LodHierarchy& hierarchy)
45 | {
46 |   // Get conservative output sizes
47 |   nvclusterlod_HierarchyCounts sizes;
48 |   if(nvclusterlod_Result result = nvclusterlodGetHierarchyRequirements(context, &input, &sizes);
49 |      result != nvclusterlod_Result::NVCLUSTERLOD_SUCCESS)
50 |   {
51 |     return result;
52 |   }
53 | 
54 |   // Allocate storage
55 |   hierarchy.nodes.resize(sizes.nodeCount);
56 |   hierarchy.groupCumulativeBoundingSpheres.resize(input.groupCount);
57 |   hierarchy.groupCumulativeQuadricError.resize(input.groupCount);
58 | 
59 |   // Pack output pointers
60 |   nvclusterlod_HierarchyOutput output;
61 |   output.groupCumulativeBoundingSpheres = hierarchy.groupCumulativeBoundingSpheres.data();
62 |   output.groupCumulativeQuadricError    = hierarchy.groupCumulativeQuadricError.data();
63 |   output.nodeCount                      = sizes.nodeCount;
64 |   output.nodes                          = hierarchy.nodes.data();
65 | 
66 |   // Make LODs
67 |   if(nvclusterlod_Result result = nvclusterlodBuildHierarchy(context, &input, &output); result != nvclusterlod_Result::NVCLUSTERLOD_SUCCESS)
68 |   {
69 |     hierarchy = {};
70 |     return result;
71 |   }
72 |   // Truncate to output size written
73 |   hierarchy.nodes.resize(output.nodeCount);
74 |   return nvclusterlod_Result::NVCLUSTERLOD_SUCCESS;
75 | }
76 | 
77 | inline nvclusterlod_HierarchyInput makeHierarchyInput(const nvclusterlod_MeshOutput& meshOutput)
78 | {
79 |   return {
80 |       .clusterGeneratingGroups = meshOutput.clusterGeneratingGroups,
81 |       .clusterBoundingSpheres  = meshOutput.clusterBoundingSpheres,
82 |       .groupQuadricErrors      = meshOutput.groupQuadricErrors,
83 |       .groupClusterRanges      = meshOutput.groupClusterRanges,
84 |       .lodLevelGroupRanges     = meshOutput.lodLevelGroupRanges,
85 |       .clusterCount            = meshOutput.clusterCount,
86 |       .groupCount              = meshOutput.groupCount,
87 |       .lodLevelCount           = meshOutput.lodLevelCount,
88 |   };
89 | }
90 | 
91 | inline nvclusterlod_Result generateLodHierarchy(nvclusterlod_Context context, const nvclusterlod_MeshOutput& meshOutput, LodHierarchy& hierarchy)
92 | {
93 |   return generateLodHierarchy(context, makeHierarchyInput(meshOutput), hierarchy);
94 | }
95 | 
96 | } // namespace nvclusterlod
97 | 
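// Illustrative call (sketch): `context` is an existing nvclusterlod_Context
// and `meshOutput` a populated nvclusterlod_MeshOutput, e.g. written by
// nvclusterlodBuildMesh() or the generateLodMesh() helper in
// nvclusterlod_mesh_storage.hpp.
//
//   nvclusterlod::LodHierarchy hierarchy;
//   nvclusterlod_Result result = nvclusterlod::generateLodHierarchy(context, meshOutput, hierarchy);
//   if(result == nvclusterlod_Result::NVCLUSTERLOD_SUCCESS)
//     hierarchy.shrink_to_fit();  // optional; trims the conservative allocation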
--------------------------------------------------------------------------------
/src/nvclusterlod_context.cpp:
--------------------------------------------------------------------------------
1 | /*
2 |  * Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.
3 |  *
4 |  * Licensed under the Apache License, Version 2.0 (the "License");
5 |  * you may not use this file except in compliance with the License.
6 |  * You may obtain a copy of the License at
7 |  *
8 |  *     http://www.apache.org/licenses/LICENSE-2.0
9 |  *
10 |  * Unless required by applicable law or agreed to in writing, software
11 |  * distributed under the License is distributed on an "AS IS" BASIS,
12 |  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 |  * See the License for the specific language governing permissions and
14 |  * limitations under the License.
15 |  *
16 |  * SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION
17 |  * SPDX-License-Identifier: Apache-2.0
18 |  */
19 | 
20 | 
21 | #include "nvclusterlod_context.hpp"
22 | 
23 | uint32_t nvclusterlodVersion(void)
24 | {
25 |   return NVCLUSTERLOD_VERSION;
26 | }
27 | 
28 | nvclusterlod_Result nvclusterlodCreateContext(const nvclusterlod_ContextCreateInfo* createInfo, nvclusterlod_Context* context)
29 | {
30 |   if(createInfo == nullptr || context == nullptr)
31 |   {
32 |     return nvclusterlod_Result::NVCLUSTERLOD_ERROR_NULL_INPUT;
33 |   }
34 |   if(createInfo->version != NVCLUSTERLOD_VERSION)
35 |   {
36 |     return nvclusterlod_Result::NVCLUSTERLOD_ERROR_CONTEXT_VERSION_MISMATCH;
37 |   }
38 | 
39 |   *context = new nvclusterlod_Context_t{createInfo->clusterContext, createInfo->parallelize == NVCLUSTER_TRUE};
40 |   return nvclusterlod_Result::NVCLUSTERLOD_SUCCESS;
41 | }
42 | 
43 | nvclusterlod_Result nvclusterlodDestroyContext(nvclusterlod_Context context)
44 | {
45 |   if(context == nullptr)
46 |   {
47 |     return nvclusterlod_Result::NVCLUSTERLOD_ERROR_NULL_INPUT;
48 |   }
49 |   delete context;
50 |   return nvclusterlod_Result::NVCLUSTERLOD_SUCCESS;
51 | }
52 | 
53 | const char* nvclusterlodResultString(nvclusterlod_Result result)
54 | {
55 |   // clang-format off
56 |   switch(result)
57 |   {
58 |     case nvclusterlod_Result::NVCLUSTERLOD_SUCCESS: return "NVCLUSTERLOD_SUCCESS";
59 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_EMPTY_CLUSTER_GENERATING_GROUPS: return "NVCLUSTERLOD_ERROR_EMPTY_CLUSTER_GENERATING_GROUPS";
60 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_CLUSTERING_TRIANGLES_FAILED: return "NVCLUSTERLOD_ERROR_CLUSTERING_TRIANGLES_FAILED";
61 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_CLUSTERING_CLUSTERS_FAILED: return "NVCLUSTERLOD_ERROR_CLUSTERING_CLUSTERS_FAILED";
62 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_CLUSTERING_NODES_FAILED: return "NVCLUSTERLOD_ERROR_CLUSTERING_NODES_FAILED";
63 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_NODES_OVERFLOW: return "NVCLUSTERLOD_ERROR_NODES_OVERFLOW";
64 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_EMPTY_LOD_LEVELS: return "NVCLUSTERLOD_ERROR_EMPTY_LOD_LEVELS";
65 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_LOD_LEVELS_OVERFLOW: return "NVCLUSTERLOD_ERROR_LOD_LEVELS_OVERFLOW";
66 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_CLUSTER_COUNT_NOT_DECREASING: return "NVCLUSTERLOD_ERROR_CLUSTER_COUNT_NOT_DECREASING";
67 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_OUTPUT_MESH_OVERFLOW: return "NVCLUSTERLOD_ERROR_OUTPUT_MESH_OVERFLOW";
68 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_OUTPUT_INCONSISTENT_COUNTS: return "NVCLUSTERLOD_ERROR_OUTPUT_INCONSISTENT_COUNTS";
69 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_EMPTY_ROOT_CLUSTER: return "NVCLUSTERLOD_ERROR_EMPTY_ROOT_CLUSTER";
70 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_PRODUCED_NAN_BOUNDING_SPHERES: return "NVCLUSTERLOD_ERROR_PRODUCED_NAN_BOUNDING_SPHERES";
71 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_GROUP_CLUSTER_COUNT_OVERFLOW: return "NVCLUSTERLOD_ERROR_GROUP_CLUSTER_COUNT_OVERFLOW";
72 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_NODE_CHILD_COUNT_OVERFLOW: return "NVCLUSTERLOD_ERROR_NODE_CHILD_COUNT_OVERFLOW";
73 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_NULL_INPUT: return "NVCLUSTERLOD_ERROR_NULL_INPUT";
74 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_CONTEXT_VERSION_MISMATCH: return "NVCLUSTERLOD_ERROR_CONTEXT_VERSION_MISMATCH";
75 |     case nvclusterlod_Result::NVCLUSTERLOD_ERROR_CLUSTER_ITEM_VERTEX_COUNT_NOT_THREE: return
"NVCLUSTERLOD_ERROR_CLUSTER_ITEM_VERTEX_COUNT_NOT_THREE"; 76 | case nvclusterlod_Result::NVCLUSTERLOD_ERROR_MAKE_BOUNDING_SPHERES_FROM_EMPTY_SET: return "NVCLUSTERLOD_ERROR_MAKE_BOUNDING_SPHERES_FROM_EMPTY_SET"; 77 | case nvclusterlod_Result::NVCLUSTERLOD_ERROR_EMPTY_DECIMATION_RESULT: return "NVCLUSTERLOD_ERROR_EMPTY_DECIMATION_RESULT"; 78 | case nvclusterlod_Result::NVCLUSTERLOD_ERROR_USER_DECIMATION_FAILED: return "NVCLUSTERLOD_ERROR_USER_DECIMATION_FAILED"; 79 | case nvclusterlod_Result::NVCLUSTERLOD_ERROR_NO_DECIMATION_CALLBACK: return "NVCLUSTERLOD_ERROR_NO_DECIMATION_CALLBACK"; 80 | default: return ""; 81 | } 82 | // clang-format on 83 | } 84 | -------------------------------------------------------------------------------- /CMakeLists.txt: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # 15 | # SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION 16 | # SPDX-License-Identifier: Apache-2.0 17 | 18 | cmake_minimum_required(VERSION 3.20) 19 | project(nv_cluster_lod_builder VERSION 2.0) 20 | 21 | option(NVCLUSTERLOD_MULTITHREADED "Use parallel algorithms to generate clusters." 
ON) 22 | set(NVCLUSTER_MULTITHREADED ${NVCLUSTERLOD_MULTITHREADED} CACHE BOOL "Build with multithreaded cluster generation support" FORCE) 23 | 24 | option(NVCLUSTERLOD_FETCH_MESHOPTIMIZER "Fetch meshoptimizer if not found" ON) 25 | if(NOT TARGET meshoptimizer AND NVCLUSTERLOD_FETCH_MESHOPTIMIZER) 26 | include(FetchContent) 27 | FetchContent_Declare( 28 | meshoptimizer 29 | GIT_REPOSITORY https://github.com/zeux/meshoptimizer.git 30 | GIT_TAG v0.25 31 | GIT_SHALLOW TRUE 32 | ) 33 | FetchContent_MakeAvailable(meshoptimizer) 34 | endif() 35 | if(NOT TARGET nv_cluster_builder) 36 | add_subdirectory(nv_cluster_builder) 37 | endif() 38 | 39 | set(SOURCES 40 | src/array_view.hpp 41 | src/array_view.natvis 42 | src/nvclusterlod_context.cpp 43 | src/nvclusterlod_context.hpp 44 | src/nvclusterlod_hierarchy.cpp 45 | src/nvclusterlod_mesh.cpp 46 | src/nvclusterlod_cpp.hpp 47 | ) 48 | set(HEADERS 49 | include/nvclusterlod/nvclusterlod_cache.hpp 50 | include/nvclusterlod/nvclusterlod_common.h 51 | include/nvclusterlod/nvclusterlod_hierarchy.h 52 | include/nvclusterlod/nvclusterlod_hierarchy_storage.hpp 53 | include/nvclusterlod/nvclusterlod_mesh.h 54 | include/nvclusterlod/nvclusterlod_mesh_storage.hpp 55 | ) 56 | 57 | source_group("public_include" FILES ${HEADERS}) 58 | source_group("source" FILES ${SOURCES}) 59 | 60 | # Optionally build as a shared library 61 | include(CMakeDependentOption) 62 | cmake_dependent_option( 63 | NVCLUSTERLOD_BUILDER_SHARED # option variable 64 | "Build shared library" # description 65 | OFF # default value if exposed; user can override 66 | "NOT BUILD_SHARED_LIBS" # condition to expose option 67 | ON # value if not exposed; user can't override 68 | ) 69 | 70 | if (NVCLUSTERLOD_BUILDER_SHARED) 71 | set(CMAKE_C_VISIBILITY_PRESET hidden) 72 | set(CMAKE_CXX_VISIBILITY_PRESET hidden) 73 | set(CMAKE_VISIBILITY_INLINES_HIDDEN 1) 74 | add_library(nv_cluster_lod_builder SHARED ${SOURCES} ${HEADERS}) 75 | target_compile_definitions(nv_cluster_lod_builder PUBLIC NVCLUSTERLOD_BUILDER_SHARED) 76 | else() 77 | add_library(nv_cluster_lod_builder STATIC ${SOURCES} ${HEADERS}) 78 | endif () 79 | target_compile_features(nv_cluster_lod_builder PUBLIC cxx_std_20) 80 | target_include_directories(nv_cluster_lod_builder PUBLIC include) 81 | target_include_directories(nv_cluster_lod_builder PRIVATE src) 82 | target_compile_definitions(nv_cluster_lod_builder PRIVATE NVCLUSTERLOD_BUILDER_COMPILING) 83 | 84 | # All the warnings. 
Branch on COMPILE_LANGUAGE to avoid passing unknowns to nvcc
85 | if(MSVC)
86 |   target_compile_options(nv_cluster_lod_builder PRIVATE
87 |     $<$<COMPILE_LANGUAGE:CXX>:/W4>
88 |     $<$<COMPILE_LANGUAGE:CXX>:/WX>
89 |   )
90 |   target_compile_definitions(nv_cluster_lod_builder PRIVATE WIN32_LEAN_AND_MEAN=1 NOMINMAX)
91 | else()
92 |   target_compile_options(nv_cluster_lod_builder PRIVATE
93 |     $<$<COMPILE_LANGUAGE:CXX>:-Wall>
94 |     $<$<COMPILE_LANGUAGE:CXX>:-Wextra>
95 |     $<$<COMPILE_LANGUAGE:CXX>:-Wpedantic>
96 |     $<$<COMPILE_LANGUAGE:CXX>:-Wshadow>
97 |     $<$<COMPILE_LANGUAGE:CXX>:-Wconversion>
98 |     $<$<COMPILE_LANGUAGE:CXX>:-Werror>
99 |   )
100 |   if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
101 |     target_compile_definitions(nv_cluster_lod_builder PRIVATE
102 |       $<$<COMPILE_LANGUAGE:CXX>:_GLIBCXX_ASSERTIONS>
103 |       # Do not use ABI breaking _GLIBCXX_DEBUG or _GLIBCXX_DEBUG_BACKTRACE
104 |     )
105 |   endif()
106 | endif()
107 | 
108 | # Headers from nv_cluster_builder are used in the API
109 | target_link_libraries(nv_cluster_lod_builder PUBLIC nv_cluster_builder)
110 | if(TARGET meshoptimizer)
111 |   target_link_libraries(nv_cluster_lod_builder PRIVATE meshoptimizer)
112 |   target_compile_definitions(nv_cluster_lod_builder PRIVATE NVCLUSTERLOD_HAS_MESHOPTIMIZER=1)
113 | else()
114 |   target_compile_definitions(nv_cluster_lod_builder PRIVATE NVCLUSTERLOD_HAS_MESHOPTIMIZER=0)
115 | endif()
116 | target_compile_definitions(nv_cluster_lod_builder PUBLIC NVCLUSTERLOD_MULTITHREADED=$<BOOL:${NVCLUSTERLOD_MULTITHREADED}>)
117 | 
118 | if(NOT MSVC)
119 |   # Optional TBB for std::execution on linux
120 |   find_library(TBB_LIBRARIES NAMES tbb HINTS ${TBB_DIR})
121 |   if(TBB_LIBRARIES)
122 |     message(STATUS "TBB: ${TBB_LIBRARIES}")
123 |     target_link_libraries(nv_cluster_lod_builder PRIVATE ${TBB_LIBRARIES})
124 |   else()
125 |     message(STATUS "TBB not found for std::execution")
126 |   endif()
127 | endif()
128 | 
129 | if(BUILD_TESTING)
130 |   option(BUILD_NV_CLUSTER_LOD_BUILDER_TESTING "Build nv_cluster_lod_builder tests" ON)
131 |   if(BUILD_NV_CLUSTER_LOD_BUILDER_TESTING)
132 |     enable_testing()
133 |     add_subdirectory(test)
134 |   endif()
135 | endif()
136 | 
137 | install(TARGETS nv_cluster_lod_builder)
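# Illustrative consumption from a parent project (the target name my_app is
# hypothetical); the public include directory and nv_cluster_builder headers
# propagate through the link:
#   add_subdirectory(nv_cluster_lod_builder)
#   target_link_libraries(my_app PRIVATE nv_cluster_lod_builder)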
--------------------------------------------------------------------------------
/include/nvclusterlod/nvclusterlod_common.h:
--------------------------------------------------------------------------------
1 | /*
2 |  * Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.
3 |  *
4 |  * Licensed under the Apache License, Version 2.0 (the "License");
5 |  * you may not use this file except in compliance with the License.
6 |  * You may obtain a copy of the License at
7 |  *
8 |  *     http://www.apache.org/licenses/LICENSE-2.0
9 |  *
10 |  * Unless required by applicable law or agreed to in writing, software
11 |  * distributed under the License is distributed on an "AS IS" BASIS,
12 |  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 |  * See the License for the specific language governing permissions and
14 |  * limitations under the License.
15 |  *
16 |  * SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION
17 |  * SPDX-License-Identifier: Apache-2.0
18 |  */
19 | #ifndef NVCLUSTERLOD_COMMON_H
20 | #define NVCLUSTERLOD_COMMON_H
21 | 
22 | #define NVCLUSTERLOD_VERSION 4
23 | 
24 | #include <math.h>
25 | #include <nvcluster/nvcluster.h>
26 | #include <stdint.h>
27 | 
28 | #ifdef __cplusplus
29 | extern "C" {
30 | #endif
31 | 
32 | #if defined(NVCLUSTERLOD_BUILDER_SHARED)
33 | #if defined(_MSC_VER)
34 | // msvc
35 | #if defined(NVCLUSTERLOD_BUILDER_COMPILING)
36 | #define NVCLUSTERLOD_API __declspec(dllexport)
37 | #else
38 | #define NVCLUSTERLOD_API __declspec(dllimport)
39 | #endif
40 | #elif defined(__GNUC__)
41 | // gcc/clang
42 | #define NVCLUSTERLOD_API __attribute__((visibility("default")))
43 | #else
44 | // Unsupported. If hit, use cmake GenerateExportHeader
45 | #pragma warning Unsupported compiler
46 | #define NVCLUSTERLOD_API
47 | #endif
48 | #else  // defined(NVCLUSTERLOD_BUILDER_SHARED)
49 | // static lib, no export needed
50 | #define NVCLUSTERLOD_API
51 | #endif
52 | 
53 | #ifdef __cplusplus
54 | #define NVCLUSTERLOD_DEFAULT(x) = x
55 | #define NVCLUSTERLOD_STATIC_ASSERT(cond) static_assert(cond);
56 | #else
57 | #define NVCLUSTERLOD_DEFAULT(x)
58 | #define NVCLUSTERLOD_STATIC_ASSERT(cond)
59 | #endif
60 | 
61 | typedef struct nvclusterlod_Vec3u
62 | {
63 |   uint32_t x NVCLUSTERLOD_DEFAULT(0u);
64 |   uint32_t y NVCLUSTERLOD_DEFAULT(0u);
65 |   uint32_t z NVCLUSTERLOD_DEFAULT(0u);
66 | } nvclusterlod_Vec3u;
67 | 
68 | #define nvclusterlod_defaultVec3u() {0u, 0u, 0u}
69 | 
70 | typedef struct nvclusterlod_Sphere
71 | {
72 |   nvcluster_Vec3f center NVCLUSTERLOD_DEFAULT(nvcluster_defaultVec3f());
73 |   float           radius NVCLUSTERLOD_DEFAULT(0.0f);
74 | } nvclusterlod_Sphere;
75 | 
76 | NVCLUSTERLOD_STATIC_ASSERT(sizeof(nvclusterlod_Sphere) == 16)
77 | 
78 | // Returns an approximate error over distance ratio (i.e. sin(angularError))
79 | // for the center of a perspective projection, given a target pixel error,
80 | // field of view and resolution. This utility can be used to pre-compute a
81 | // target LOD threshold.
82 | inline float nvclusterlodErrorOverDistance(float errorSizeInPixels, float fov, float resolution)
83 | {
84 |   return sinf(atanf(tanf(fov * 0.5f) * errorSizeInPixels / resolution));
85 | }
86 | 
87 | // Inverse of nvclusterlodErrorOverDistance.
88 | inline float nvclusterlodPixelError(float quadricErrorOverDistance, float fov, float resolution)
89 | {
90 |   return tanf(asinf(quadricErrorOverDistance)) * resolution / tanf(fov * 0.5f);
91 | }
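// Illustrative use of the utilities above (the 1-pixel budget, 90 degree fov
// and 1080 pixel resolution are example assumptions): precompute a traversal
// threshold to compare against quadric error divided by view distance.
//
//   float threshold = nvclusterlodErrorOverDistance(1.0f, 1.5707964f /* pi/2 */, 1080.0f);
//   // draw geometry whose error / distance is at most `threshold`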
92 | 
93 | typedef struct nvclusterlod_ContextCreateInfo
94 | {
95 |   // Version expected. nvclusterlodCreateContext() returns
96 |   // nvclusterlod_Result::NVCLUSTERLOD_ERROR_CONTEXT_VERSION_MISMATCH if a
97 |   // different version is found at runtime.
98 |   uint32_t version NVCLUSTERLOD_DEFAULT(NVCLUSTERLOD_VERSION);
99 | 
100 |   // Set to NVCLUSTER_TRUE or NVCLUSTER_FALSE to enable or disable internal
101 |   // parallelisation using std execution policies at runtime
102 |   nvcluster_Bool parallelize NVCLUSTER_DEFAULT(NVCLUSTER_TRUE);
103 | 
104 |   // Cluster builder context for the LOD builder to use
105 |   nvcluster_Context clusterContext NVCLUSTERLOD_DEFAULT(nullptr);
106 | } nvclusterlod_ContextCreateInfo;
107 | 
108 | struct nvclusterlod_Context_t;
109 | typedef struct nvclusterlod_Context_t* nvclusterlod_Context;
110 | 
111 | typedef enum nvclusterlod_Result
112 | {
113 |   NVCLUSTERLOD_SUCCESS = 0,
114 |   NVCLUSTERLOD_ERROR_EMPTY_CLUSTER_GENERATING_GROUPS,
115 |   NVCLUSTERLOD_ERROR_CLUSTERING_TRIANGLES_FAILED,  // dropped error from nv_cluster_builder
116 |   NVCLUSTERLOD_ERROR_CLUSTERING_CLUSTERS_FAILED,   // dropped error from nv_cluster_builder
117 |   NVCLUSTERLOD_ERROR_CLUSTERING_NODES_FAILED,      // dropped error from nv_cluster_builder
118 |   NVCLUSTERLOD_ERROR_NODES_OVERFLOW,
119 |   NVCLUSTERLOD_ERROR_EMPTY_LOD_LEVELS,
120 |   NVCLUSTERLOD_ERROR_LOD_LEVELS_OVERFLOW,
121 |   NVCLUSTERLOD_ERROR_CLUSTER_COUNT_NOT_DECREASING,  // infinite loop detection in iterative decimation
122 |   NVCLUSTERLOD_ERROR_OUTPUT_MESH_OVERFLOW,
123 |   NVCLUSTERLOD_ERROR_OUTPUT_INCONSISTENT_COUNTS,  // internal consistency
124 |   NVCLUSTERLOD_ERROR_EMPTY_ROOT_CLUSTER,
125 |   NVCLUSTERLOD_ERROR_PRODUCED_NAN_BOUNDING_SPHERES,
126 |   NVCLUSTERLOD_ERROR_GROUP_CLUSTER_COUNT_OVERFLOW,
127 |   NVCLUSTERLOD_ERROR_NODE_CHILD_COUNT_OVERFLOW,
128 |   NVCLUSTERLOD_ERROR_NULL_INPUT,
129 |   NVCLUSTERLOD_ERROR_CONTEXT_VERSION_MISMATCH,
130 |   NVCLUSTERLOD_ERROR_CLUSTER_ITEM_VERTEX_COUNT_NOT_THREE,
131 |   NVCLUSTERLOD_ERROR_MAKE_BOUNDING_SPHERES_FROM_EMPTY_SET,
132 |   NVCLUSTERLOD_ERROR_EMPTY_DECIMATION_RESULT,
133 |   NVCLUSTERLOD_ERROR_USER_DECIMATION_FAILED,
134 |   NVCLUSTERLOD_ERROR_NO_DECIMATION_CALLBACK,  // callback required if NVCLUSTERLOD_FETCH_MESHOPTIMIZER is OFF
135 | } nvclusterlod_Result;
136 | 
137 | NVCLUSTERLOD_API uint32_t nvclusterlodVersion(void);
138 | NVCLUSTERLOD_API nvclusterlod_Result nvclusterlodCreateContext(const nvclusterlod_ContextCreateInfo* createInfo,
139 |                                                                nvclusterlod_Context* context);
140 | NVCLUSTERLOD_API nvclusterlod_Result nvclusterlodDestroyContext(nvclusterlod_Context context);
141 | NVCLUSTERLOD_API const char* nvclusterlodResultString(nvclusterlod_Result result);
142 | 
143 | #ifdef __cplusplus
144 | }  // extern "C"
145 | #endif
146 | 
147 | #endif // NVCLUSTERLOD_COMMON_H
--------------------------------------------------------------------------------
/doc/arcsin_angular_error.svg:
--------------------------------------------------------------------------------
[SVG figure; the markup did not survive extraction. Recoverable labels: distance "d", quadric error "e", "Bounding sphere", "Quadric error".]
--------------------------------------------------------------------------------
/include/nvclusterlod/nvclusterlod_hierarchy.h:
--------------------------------------------------------------------------------
1 | /*
2 |  * Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.
3 |  *
4 |  * Licensed under the Apache License, Version 2.0 (the "License");
5 |  * you may not use this file except in compliance with the License.
6 |  * You may obtain a copy of the License at
7 |  *
8 |  *     http://www.apache.org/licenses/LICENSE-2.0
9 |  *
10 |  * Unless required by applicable law or agreed to in writing, software
11 |  * distributed under the License is distributed on an "AS IS" BASIS,
12 |  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 |  * See the License for the specific language governing permissions and
14 |  * limitations under the License.
15 |  *
16 |  * SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION
17 |  * SPDX-License-Identifier: Apache-2.0
18 |  */
19 | #ifndef NVCLUSTERLOD_HIERARCHY_H
20 | #define NVCLUSTERLOD_HIERARCHY_H
21 | 
22 | #include <nvcluster/nvcluster.h>
23 | #include <nvclusterlod/nvclusterlod_common.h>
24 | #include <stddef.h>
25 | #include <stdint.h>
26 | 
27 | #ifdef __cplusplus
28 | extern "C" {
29 | #endif
30 | 
31 | // The hierarchy input is mostly nvclusterlod_MeshOutput with const pointers,
32 | // with the exception of the triangle data, which is not needed.
33 | typedef struct nvclusterlod_HierarchyInput
34 | {
35 |   // The group of clusters that was decimated to produce the geometry in each
36 |   // cluster, or NVCLUSTERLOD_ORIGINAL_MESH_GROUP if the cluster is original
37 |   // mesh geometry. This relationship forms a DAG. Levels of detail are
38 |   // generated by iteratively decimating groups of clusters and re-clustering
39 |   // the result. The clusters in a group will have mixed generating groups. See
40 |   // the readme for a visualization.
41 |   const uint32_t* clusterGeneratingGroups NVCLUSTERLOD_DEFAULT(nullptr);
42 | 
43 |   // Bounding spheres of the clusters
44 |   const nvclusterlod_Sphere* clusterBoundingSpheres NVCLUSTERLOD_DEFAULT(nullptr);
45 | 
46 |   // Error metric after decimating geometry in each group. Counter-intuitively,
47 |   // not the error of the geometry in the group - that value does not exist
48 |   // per-group. For the current level, use
49 |   // groupQuadricErrors[clusterGeneratingGroups[cluster]]. This saves
50 |   // duplicating data per cluster. The final LOD (just one
51 |   // group) is not decimated and has an error of zero.
52 |   // TODO: shouldn't this be infinite error so it's always drawn?
53 |   const float* groupQuadricErrors NVCLUSTERLOD_DEFAULT(nullptr);
54 | 
55 |   // Ranges of clusters for each group of clusters. I.e. cluster values for a
56 |   // group are stored at cluster*[range.offset + i] for i in {0 .. range.count -
57 |   // 1}.
58 |   const nvcluster_Range* groupClusterRanges NVCLUSTERLOD_DEFAULT(nullptr);
59 | 
60 |   // Ranges of groups for each LOD level. I.e. group values for a LOD are stored
61 |   // at group*[range.offset + i] for i in {0 .. range.count - 1}. The finest LOD
62 |   // is at index 0 (comprised of clusters of the original mesh), followed by the
63 |   // coarser LODs from finer to coarser.
64 |   const nvcluster_Range* lodLevelGroupRanges NVCLUSTERLOD_DEFAULT(nullptr);
65 | 
66 |   // Number of clusters for all LODs
67 |   uint32_t clusterCount NVCLUSTERLOD_DEFAULT(0u);
68 | 
69 |   // Number of cluster groups for all LODs
70 |   uint32_t groupCount NVCLUSTERLOD_DEFAULT(0u);
71 | 
72 |   // Number of LOD levels
73 |   uint32_t lodLevelCount NVCLUSTERLOD_DEFAULT(0u);
74 | } nvclusterlod_HierarchyInput;
75 | 
76 | // Limit imposed by the bits in nvclusterlod_InternalNodeChildren::childCountMinusOne
77 | #define NVCLUSTERLOD_NODE_MAX_CHILDREN (32u)
78 | 
79 | // Packed children subrange used in interior nodes in the spatial hierarchy of
80 | // bounding spheres
81 | typedef struct nvclusterlod_InternalNodeChildren
82 | {
83 |   // A nvclusterlod_HierarchyNode is one of the following, both overlaying this
84 |   // single-bit type enum to conserve space:
85 |   // - 0: nvclusterlod_InternalNodeChildren
86 |   // - 1: nvclusterlod_LeafNodeClusterGroup
87 |   uint32_t isClusterGroup : 1;
88 | 
89 |   // Offset of the first child in the nodes array
90 |   uint32_t childOffset : 26;
91 | 
92 |   // Number of children, minus one to reclaim the value 0, which would be
93 |   // invalid
94 |   uint32_t childCountMinusOne : 5;
95 | } nvclusterlod_InternalNodeChildren;
96 | 
97 | NVCLUSTERLOD_STATIC_ASSERT(sizeof(nvclusterlod_InternalNodeChildren) == sizeof(uint32_t))
98 | 
99 | // Limit imposed by the bits in nvclusterlod_LeafNodeClusterGroup::clusterCountMinusOne
100 | #define NVCLUSTERLOD_GROUP_MAX_CLUSTERS (256u)
101 | 
102 | // Cluster group referenced by leaf nodes in the spatial hierarchy of bounding
103 | // spheres
104 | typedef struct nvclusterlod_LeafNodeClusterGroup
105 | {
106 |   // A nvclusterlod_HierarchyNode is one of the following, both overlaying this
107 |   // single-bit type enum to conserve space:
108 |   // - 0: nvclusterlod_InternalNodeChildren
109 |   // - 1: nvclusterlod_LeafNodeClusterGroup
110 |   uint32_t isClusterGroup : 1;
111 | 
112 |   // Index of the cluster group for the node
113 |   uint32_t group : 23;
114 | 
115 |   // Number of clusters in the group, minus one to reclaim 0 as a group always
116 |   // contains at least one cluster
117 |   uint32_t clusterCountMinusOne : 8;
118 | } nvclusterlod_LeafNodeClusterGroup;
119 | 
120 | NVCLUSTERLOD_STATIC_ASSERT(sizeof(nvclusterlod_LeafNodeClusterGroup) == sizeof(uint32_t))
121 | 
122 | // Spatial hierarchy node
123 | typedef struct nvclusterlod_HierarchyNode
124 | {
125 |   // May either point to more children or a cluster group. The single-bit type
126 |   // enum isClusterGroup is aliased in both.
127 |   union
128 |   {
129 |     nvclusterlod_InternalNodeChildren children;
130 |     nvclusterlod_LeafNodeClusterGroup clusterGroup;
131 |   };
132 | 
133 |   // Bounding sphere for the node. Note the bounding sphere is conservative and
134 |   // encompasses all generating group bounding spheres cumulatively (i.e. not
135 |   // just bounding the children geometry).
136 |   nvclusterlod_Sphere boundingSphere;
137 | 
138 |   // Cumulative maximum error in all children geometry due to mesh decimation
139 |   float maxClusterQuadricError NVCLUSTERLOD_DEFAULT(0.0f);
140 | } nvclusterlod_HierarchyNode;
141 | 
142 | NVCLUSTERLOD_STATIC_ASSERT(sizeof(nvclusterlod_HierarchyNode) == 4 + 16 + 4)
143 | 
144 | // Output data for a spatial hierarchy to accelerate LOD selection
145 | typedef struct nvclusterlod_HierarchyOutput
146 | {
147 |   // Spatial hierarchy of bounding spheres. Nodes reference other nodes in this
148 |   // array. There is actually a hierarchy per LOD level. Since LOD selection is
149 |   // the typical use case, the roots of each are merged into a single tree for
150 |   // convenience.
151 |   nvclusterlod_HierarchyNode* nodes NVCLUSTERLOD_DEFAULT(nullptr);
152 | 
153 |   // Bounding sphere for each cluster group, encompassing all generating groups
154 |   // within each group recursively
155 |   nvclusterlod_Sphere* groupCumulativeBoundingSpheres NVCLUSTERLOD_DEFAULT(nullptr);
156 | 
157 |   // Quadric error of the group and the maximum quadric error of all generating
158 |   // groups recursively
159 |   float* groupCumulativeQuadricError NVCLUSTERLOD_DEFAULT(nullptr);
160 | 
161 |   // Number of nodes in the tree
162 |   uint32_t nodeCount NVCLUSTERLOD_DEFAULT(0u);
163 | } nvclusterlod_HierarchyOutput;
164 | 
165 | // Memory requirements to build a spatial hierarchy of LODs
166 | typedef struct nvclusterlod_HierarchyCounts
167 | {
168 |   uint32_t nodeCount NVCLUSTERLOD_DEFAULT(0u);
169 | } nvclusterlod_HierarchyCounts;
170 | 
171 | // Get the memory requirements for the spatial hierarchy output
172 | nvclusterlod_Result nvclusterlodGetHierarchyRequirements(nvclusterlod_Context context,
173 |                                                          const nvclusterlod_HierarchyInput* input,
174 |                                                          nvclusterlod_HierarchyCounts* counts);
175 | 
176 | // Generate a spatial hierarchy of all LODs, merged into a single tree
177 | nvclusterlod_Result nvclusterlodBuildHierarchy(nvclusterlod_Context context,
178 |                                                const nvclusterlod_HierarchyInput* input,
179 |                                                nvclusterlod_HierarchyOutput* output);
180 | 
181 | #ifdef __cplusplus
182 | }  // extern "C"
183 | #endif
184 | 
185 | #endif // NVCLUSTERLOD_HIERARCHY_H
15 |  *
16 |  * SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION
17 |  * SPDX-License-Identifier: Apache-2.0
18 |  */
19 | 
20 | #pragma once
21 | 
22 | #include <algorithm>
23 | #include <array_view.hpp>
24 | #include <nvcluster/nvcluster.h>
25 | #include <nvclusterlod/nvclusterlod_hierarchy.h>
26 | #include <nvclusterlod/nvclusterlod_mesh.h>
27 | #include <span>
28 | 
29 | namespace nvclusterlod {
30 | 
31 | // Output span plus a write index
32 | template <class T>
33 | class OutputSpan
34 | {
35 | public:
36 |   OutputSpan(std::span<T> output_)
37 |       : output(output_)
38 |   {
39 |   }
40 |   uint32_t     capacity() const { return static_cast<uint32_t>(output.size()); }
41 |   T&           allocate() { return output[writeIndex++]; }
42 |   std::span<T> allocate(uint32_t count)
43 |   {
44 |     auto result = std::span(output).subspan(writeIndex, count);
45 |     writeIndex += count;
46 |     return result;
47 |   }
48 |   uint32_t allocatedCount() const { return writeIndex; }
49 |   bool     full() const { return writeIndex == output.size(); }
50 |   void     append(const T& value) { output[writeIndex++] = value; }
51 |   void     append(const std::span<const T>& values)
52 |   {
53 |     std::copy(values.begin(), values.end(), output.begin() + writeIndex);
54 |     writeIndex += static_cast<uint32_t>(values.size());
55 |   }
56 |   std::span<T> allocated() const { return std::span(output).subspan(0, writeIndex); }
57 | 
58 | private:
59 |   std::span<T> output;
60 |   uint32_t     writeIndex = 0u;
61 | };
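// Illustrative OutputSpan usage (example only; the vector and values below are
// hypothetical): writes land in caller-owned storage and allocatedCount()
// reports how much was written.
//
//   std::vector<uint32_t> storage(16);                 // conservative capacity
//   OutputSpan<uint32_t>  out(std::span<uint32_t>(storage));
//   out.append(7u);                                    // single value
//   std::span<uint32_t> three = out.allocate(3u);      // reserve three slots
//   three[0] = 1u; three[1] = 2u; three[2] = 3u;
//   storage.resize(out.allocatedCount());              // truncate to written size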
62 | 
63 | // Sphere with a vec3f center that's binary compatible with nvclusterlod_Sphere
64 | struct Sphere
65 | {
66 |   nvcluster::vec3f center;
67 |   float            radius;
68 |   operator nvclusterlod_Sphere() const { return {center, radius}; }
69 | };
70 | static_assert(sizeof(Sphere) == sizeof(nvclusterlod_Sphere));
71 | 
72 | // Returns whether `inner` is inside or equal to `outer`.
73 | inline bool isInside(const Sphere& inner, const Sphere& outer)
74 | {
75 |   const float radiusDifference = outer.radius - inner.radius;
76 |   return (radiusDifference >= 0.0f)  // if this is negative then `inner` cannot be inside `outer`
77 |          && length_squared(inner.center - outer.center) <= radiusDifference * radiusDifference;
78 | }
79 | 
80 | struct Mesh
81 | {
82 |   static Mesh fromCAPI(const nvclusterlod_MeshInput& input)
83 |   {
84 |     return {.triangleVertices = {reinterpret_cast<const nvclusterlod_Vec3u*>(input.triangleVertices), input.triangleCount},
85 |             .vertexPositions  = {reinterpret_cast<const nvcluster::vec3f*>(input.vertexPositions), input.vertexCount,
86 |                                  input.vertexStride}};
87 |   }
88 |   std::span<const nvclusterlod_Vec3u> triangleVertices;
89 |   ArrayView<const nvcluster::vec3f>   vertexPositions;
90 | };
91 | 
92 | // Safer nvclusterlod_MeshInput wrapper with bounds checkable spans around C API
93 | struct MeshInput
94 | {
95 |   MeshInput(const nvclusterlod_MeshInput& input)
96 |       : mesh(Mesh::fromCAPI(input))
97 |       , capi(input)
98 |   {
99 |   }
100 |   Mesh                          mesh;
101 |   const nvclusterlod_MeshInput& capi;
102 | };
103 | 
104 | // Safer nvclusterlod_MeshOutput wrapper with bounds checkable spans around C API
105 | struct MeshOutput
106 | {
107 |   MeshOutput(const nvclusterlod_MeshOutput& output)
108 |       : clusterTriangleRanges({reinterpret_cast<nvcluster_Range*>(output.clusterTriangleRanges), output.clusterCount})
109 |       , triangleVertices({reinterpret_cast<nvclusterlod_Vec3u*>(output.triangleVertices), output.triangleCount})
110 |       , clusterGeneratingGroups({output.clusterGeneratingGroups, output.clusterCount})
111 |       , clusterBoundingSpheres({reinterpret_cast<Sphere*>(output.clusterBoundingSpheres),
112 |                                 output.clusterBoundingSpheres ? output.clusterCount : 0u})
113 |       , groupQuadricErrors({output.groupQuadricErrors, output.groupCount})
114 |       , groupClusterRanges({reinterpret_cast<nvcluster_Range*>(output.groupClusterRanges), output.groupCount})
115 |       , lodLevelGroupRanges({reinterpret_cast<nvcluster_Range*>(output.lodLevelGroupRanges), output.lodLevelCount})
116 |   {
117 |   }
118 | 
119 |   nvclusterlod_Result writeCounts(nvclusterlod_MeshOutput& output)
120 |   {
121 |     // Triangle count
122 |     output.triangleCount = triangleVertices.allocatedCount();
123 | 
124 |     // Cluster count
125 |     output.clusterCount = clusterTriangleRanges.allocatedCount();
126 |     if(output.clusterCount != clusterGeneratingGroups.allocatedCount())
127 |       return nvclusterlod_Result::NVCLUSTERLOD_ERROR_OUTPUT_INCONSISTENT_COUNTS;
128 |     if(clusterBoundingSpheres.capacity() && output.clusterCount != clusterBoundingSpheres.allocatedCount())
129 |       return nvclusterlod_Result::NVCLUSTERLOD_ERROR_OUTPUT_INCONSISTENT_COUNTS;
130 | 
131 |     // Group count
132 |     output.groupCount = groupQuadricErrors.allocatedCount();
133 |     if(output.groupCount != groupClusterRanges.allocatedCount())
134 |       return nvclusterlod_Result::NVCLUSTERLOD_ERROR_OUTPUT_INCONSISTENT_COUNTS;
135 | 
136 |     // LOD level count
137 |     output.lodLevelCount = lodLevelGroupRanges.allocatedCount();
138 |     return nvclusterlod_Result::NVCLUSTERLOD_SUCCESS;
139 |   }
140 | 
141 |   OutputSpan<nvcluster_Range>    clusterTriangleRanges;
142 |   OutputSpan<nvclusterlod_Vec3u> triangleVertices;
143 |   OutputSpan<uint32_t>           clusterGeneratingGroups;
144 |   OutputSpan<Sphere>             clusterBoundingSpheres;
145 |   OutputSpan<float>              groupQuadricErrors;
146 |   OutputSpan<nvcluster_Range>    groupClusterRanges;
147 |   OutputSpan<nvcluster_Range>    lodLevelGroupRanges;
148 | };
149 | 
150 | // Safer nvclusterlod_HierarchyInput wrapper with bounds-checkable spans
151 | struct HierarchyInput
152 | {
153 |   static HierarchyInput fromCAPI(const nvclusterlod_HierarchyInput& input)
154 |   {
155 |     return {
156 |         .clusterGeneratingGroups = {input.clusterGeneratingGroups, input.clusterCount},
157 |         .clusterBoundingSpheres  = {reinterpret_cast<const Sphere*>(input.clusterBoundingSpheres), input.clusterCount},
158 |         .groupQuadricErrors      = {input.groupQuadricErrors, input.groupCount},
159 |         .groupClusterRanges      = {reinterpret_cast<const nvcluster_Range*>(input.groupClusterRanges), input.groupCount},
160 |         .lodLevelGroupRanges     = {reinterpret_cast<const nvcluster_Range*>(input.lodLevelGroupRanges), input.lodLevelCount},
161 |     };
162 |   }
163 | 
164 |   std::span<const uint32_t>        clusterGeneratingGroups;
165 |   std::span<const Sphere>          clusterBoundingSpheres;  // may be empty if not provided
166 |   std::span<const float>           groupQuadricErrors;
167 |   std::span<const nvcluster_Range> groupClusterRanges;
168 |   std::span<const nvcluster_Range> lodLevelGroupRanges;
169 | };
170 | 
171 | // Safer nvclusterlod_HierarchyOutput wrapper with bounds-checkable spans
172 | struct HierarchyOutput
173 | {
174 |   // groupCount is required to size the per-group outputs, as the C API does not
175 |   // carry these capacities in the output struct; it is implied from the input.
176 |   HierarchyOutput(const nvclusterlod_HierarchyOutput& output, uint32_t groupCount)
177 |       : nodes({reinterpret_cast<nvclusterlod_HierarchyNode*>(output.nodes), output.nodeCount})
178 |       , groupCumulativeBoundingSpheres{reinterpret_cast<Sphere*>(output.groupCumulativeBoundingSpheres), groupCount}
179 |       , groupCumulativeQuadricError{output.groupCumulativeQuadricError, groupCount}
180 |   {
181 |   }
182 | 
183 |   void writeCounts(nvclusterlod_HierarchyOutput& output) { output.nodeCount = nodes.allocatedCount(); }
184 | 
185 |   OutputSpan<nvclusterlod_HierarchyNode> nodes;
186 |   std::span<Sphere>                      groupCumulativeBoundingSpheres;
187 |   std::span<float>                       groupCumulativeQuadricError;
188 | };
189 | 
190 | }  // namespace nvclusterlod
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | 
2 |                                  Apache License
3 |                            Version 2.0, January 2004
4 |                         http://www.apache.org/licenses/
5 | 
6 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
7 | 
8 |    1. Definitions.
9 | 
10 |       "License" shall mean the terms and conditions for use, reproduction,
11 |       and distribution as defined by Sections 1 through 9 of this document.
12 | 
13 |       "Licensor" shall mean the copyright owner or entity authorized by
14 |       the copyright owner that is granting the License.
15 | 
16 |       "Legal Entity" shall mean the union of the acting entity and all
17 |       other entities that control, are controlled by, or are under common
18 |       control with that entity. For the purposes of this definition,
19 |       "control" means (i) the power, direct or indirect, to cause the
20 |       direction or management of such entity, whether by contract or
21 |       otherwise, or (ii) ownership of fifty percent (50%) or more of the
22 |       outstanding shares, or (iii) beneficial ownership of such entity.
23 | 
24 |       "You" (or "Your") shall mean an individual or Legal Entity
25 |       exercising permissions granted by this License.
26 | 
27 |       "Source" form shall mean the preferred form for making modifications,
28 |       including but not limited to software source code, documentation
29 |       source, and configuration files.
30 | 
31 |       "Object" form shall mean any form resulting from mechanical
32 |       transformation or translation of a Source form, including but
33 |       not limited to compiled object code, generated documentation,
34 |       and conversions to other media types.
35 | 
36 |       "Work" shall mean the work of authorship, whether in Source or
37 |       Object form, made available under the License, as indicated by a
38 |       copyright notice that is included in or attached to the work
39 |       (an example is provided in the Appendix below).
40 | 
41 |       "Derivative Works" shall mean any work, whether in Source or Object
42 |       form, that is based on (or derived from) the Work and for which the
43 |       editorial revisions, annotations, elaborations, or other modifications
44 |       represent, as a whole, an original work of authorship. For the purposes
45 |       of this License, Derivative Works shall not include works that remain
46 |       separable from, or merely link (or bind by name) to the interfaces of,
47 |       the Work and Derivative Works thereof.
48 | 
49 |       "Contribution" shall mean any work of authorship, including
50 |       the original version of the Work and any modifications or additions
51 |       to that Work or Derivative Works thereof, that is intentionally
52 |       submitted to Licensor for inclusion in the Work by the copyright owner
53 |       or by an individual or Legal Entity authorized to submit on behalf of
54 |       the copyright owner.
For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS -------------------------------------------------------------------------------- /include/nvclusterlod/nvclusterlod_mesh_storage.hpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 
15 | *
16 | * SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION
17 | * SPDX-License-Identifier: Apache-2.0
18 | */
19 |
20 | #pragma once
21 |
22 | #include
23 | #include
24 | #include
25 | #include
26 | #include
27 | #include
28 |
29 | namespace nvclusterlod {
30 |
31 | // Shortcut and storage for LOD output
32 | struct LodMesh
33 | {
34 | template <class Counts>
35 | void resize(const Counts& counts)
36 | {
37 | triangleVertices.resize(counts.triangleCount);
38 | clusterTriangleRanges.resize(counts.clusterCount);
39 | clusterGeneratingGroups.resize(counts.clusterCount);
40 | clusterBoundingSpheres.resize(counts.clusterCount);
41 | groupQuadricErrors.resize(counts.groupCount);
42 | groupClusterRanges.resize(counts.groupCount);
43 | lodLevelGroupRanges.resize(counts.lodLevelCount);
44 | }
45 |
46 | void shrink_to_fit()
47 | {
48 | // If keeping the object around, reallocating the conservatively sized
49 | // output memory is worthwhile.
50 | triangleVertices.shrink_to_fit();
51 | clusterTriangleRanges.shrink_to_fit();
52 | clusterGeneratingGroups.shrink_to_fit();
53 | clusterBoundingSpheres.shrink_to_fit();
54 | groupQuadricErrors.shrink_to_fit();
55 | groupClusterRanges.shrink_to_fit();
56 | lodLevelGroupRanges.shrink_to_fit();
57 | }
58 |
59 | std::vector<nvclusterlod_Vec3u> triangleVertices;
60 | std::vector<nvcluster_Range> clusterTriangleRanges;
61 | std::vector<uint32_t> clusterGeneratingGroups;
62 | std::vector<nvclusterlod_Sphere> clusterBoundingSpheres;
63 | std::vector<float> groupQuadricErrors;
64 | std::vector<nvcluster_Range> groupClusterRanges;
65 | std::vector<nvcluster_Range> lodLevelGroupRanges;
66 | };
67 |
68 | // LodMesh delayed init constructor
69 | inline nvclusterlod_Result generateLodMesh(nvclusterlod_Context context, const nvclusterlod_MeshInput& input, LodMesh& lodMesh)
70 | {
71 | // Get conservative output sizes
72 | nvclusterlod_MeshCounts counts;
73 | if(nvclusterlod_Result result = nvclusterlodGetMeshRequirements(context, &input, &counts); result != nvclusterlod_Result::NVCLUSTERLOD_SUCCESS)
74 | {
75 | return result;
76 | }
77 |
78 | // Allocate storage
79 | lodMesh.resize(counts);
80 |
81 | // Make LODs
82 | nvclusterlod_MeshOutput lodOutput{};
83 | lodOutput.clusterTriangleRanges = lodMesh.clusterTriangleRanges.data();
84 | lodOutput.triangleVertices = lodMesh.triangleVertices.data();
85 | lodOutput.clusterGeneratingGroups = lodMesh.clusterGeneratingGroups.data();
86 | lodOutput.clusterBoundingSpheres = lodMesh.clusterBoundingSpheres.data();
87 | lodOutput.groupQuadricErrors = lodMesh.groupQuadricErrors.data();
88 | lodOutput.groupClusterRanges = lodMesh.groupClusterRanges.data();
89 | lodOutput.lodLevelGroupRanges = lodMesh.lodLevelGroupRanges.data();
90 | lodOutput.clusterCount = uint32_t(lodMesh.clusterTriangleRanges.size());
91 | lodOutput.groupCount = uint32_t(lodMesh.groupQuadricErrors.size());
92 | lodOutput.lodLevelCount = uint32_t(lodMesh.lodLevelGroupRanges.size());
93 | lodOutput.triangleCount = uint32_t(lodMesh.triangleVertices.size());
94 |
95 | if(nvclusterlod_Result result = nvclusterlodBuildMesh(context, &input, &lodOutput); result != nvclusterlod_Result::NVCLUSTERLOD_SUCCESS)
96 | {
97 | return result;
98 | }
99 |
100 | // Truncate the output to the size written
101 | lodMesh.resize(lodOutput);
102 | return nvclusterlod_Result::NVCLUSTERLOD_SUCCESS;
103 | }
104 |
105 | struct LocalizedLodMesh
106 | {
107 | LodMesh lodMesh; // contains cluster-local triangle indices
108 | std::vector<nvcluster_Range> clusterVertexRanges;
109 | std::vector<uint32_t> vertexGlobalIndices;
110 |
111 | // Per-cluster maximums
112 | uint32_t maxClusterTriangles = 0;
113 | uint32_t maxClusterVertices = 0;
114 | };
115 |
116 | // Computes unique triangle vertices per cluster, returning vertex ranges and
117 | // their indices into the original global vertices. Localized triangle vertices
118 | // are written to clusterTriangleVerticesLocal, allowing in-place conversion.
119 | // TODO: parallelize?
120 | // TODO: LocalizedClusterVertices::vertexGlobalIndices indirection is transient
121 | // and might better be handled as a callback
122 | inline nvclusterlod_Result generateLocalizedLodMesh(LodMesh&& input, LocalizedLodMesh& localizedMesh)
123 | {
124 | if(&localizedMesh.lodMesh != &input)
125 | {
126 | localizedMesh.lodMesh = std::move(input);
127 | }
128 |
129 | for(size_t clusterTriangleRangeIndex = 0;
130 | clusterTriangleRangeIndex < localizedMesh.lodMesh.clusterTriangleRanges.size(); clusterTriangleRangeIndex++)
131 | {
132 | const nvcluster_Range& clusterTriangleRange = localizedMesh.lodMesh.clusterTriangleRanges[clusterTriangleRangeIndex];
133 | std::span<const nvclusterlod_Vec3u> globalTriangles(
134 | localizedMesh.lodMesh.triangleVertices.data() + clusterTriangleRange.offset, clusterTriangleRange.count);
135 | std::span<nvclusterlod_Vec3u> localTriangles(localizedMesh.lodMesh.triangleVertices.data() + clusterTriangleRange.offset,
136 | clusterTriangleRange.count);
137 |
138 | uint32_t currentLocalTriangleIndex = 0;
139 |
140 | nvcluster_Range vertexRange{.offset = uint32_t(localizedMesh.vertexGlobalIndices.size()), .count = 0};
141 |
142 | {
143 | std::unordered_map<uint32_t, uint32_t> vertexCache;
144 | for(size_t globalTriangleIndex = 0; globalTriangleIndex < globalTriangles.size(); globalTriangleIndex++)
145 | {
146 | const auto& inputTriangle = reinterpret_cast<const uint32_t(&)[3]>(globalTriangles[globalTriangleIndex]);
147 | auto& outputTriangle = reinterpret_cast<uint32_t(&)[3]>(localTriangles[currentLocalTriangleIndex]);
148 | currentLocalTriangleIndex++;
149 | for(int j = 0; j < 3; ++j)
150 | {
151 | auto [vertIndexIt, isNew] = vertexCache.try_emplace(inputTriangle[j], uint32_t(vertexCache.size()));
152 |
153 | if(isNew)
154 | {
155 | localizedMesh.vertexGlobalIndices.push_back(inputTriangle[j]);
156 | }
157 | outputTriangle[j] = vertIndexIt->second;
158 | }
159 | }
160 | vertexRange.count = uint32_t(vertexCache.size());
161 | }
162 | localizedMesh.clusterVertexRanges.push_back(vertexRange);
163 | localizedMesh.maxClusterTriangles = std::max(localizedMesh.maxClusterTriangles, clusterTriangleRange.count);
164 | localizedMesh.maxClusterVertices = std::max(localizedMesh.maxClusterVertices, vertexRange.count);
165 | }
166 | return nvclusterlod_Result::NVCLUSTERLOD_SUCCESS;
167 | }
168 |
169 | inline nvclusterlod_Result generateLocalizedLodMesh(nvclusterlod_Context context, const nvclusterlod_MeshInput& input, LocalizedLodMesh& localizedMesh)
170 | {
171 | LodMesh lodMesh;
172 | if(nvclusterlod_Result result = generateLodMesh(context, input, lodMesh); result != nvclusterlod_Result::NVCLUSTERLOD_SUCCESS)
173 | {
174 | return result;
175 | }
176 | return generateLocalizedLodMesh(std::move(lodMesh), localizedMesh);
177 | }
178 |
179 | // Utility call to build lists of generating groups (that contributed
180 | // decimated clusters) for each group. This collapses duplicate values in
181 | // clusterGeneratingGroups for each groupClusterRanges.
182 | struct GroupGeneratingGroups
183 | {
184 | std::vector<nvcluster_Range> ranges; // ranges of groups
185 | std::vector<uint32_t> groups; // indices of generating groups
186 |
187 | // Accessors to view this struct as an array of arrays. This avoids having the
188 | // many heap allocations that a std::vector of vectors has.
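// An illustrative usage sketch (added commentary, not part of the original
// header): once generateGroupGeneratingGroups() below fills `gg`, the
// per-group lists can be consumed without any per-group allocations.
// `visit` is a hypothetical consumer.
//
//   nvclusterlod::GroupGeneratingGroups gg;
//   generateGroupGeneratingGroups(groupClusterRanges, clusterGeneratingGroups, gg);
//   for(size_t group = 0; group < gg.size(); ++group)
//     for(uint32_t generatingGroup : gg[group])  // std::span<const uint32_t>
//       visit(group, generatingGroup);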
189 | std::span<const uint32_t> operator[](size_t i) const
190 | {
191 | return std::span<const uint32_t>(groups.data() + ranges[i].offset, ranges[i].count);
192 | }
193 | size_t size() const { return ranges.size(); }
194 | };
195 |
196 | inline nvclusterlod_Result generateGroupGeneratingGroups(std::span<const nvcluster_Range> groupClusterRanges,
197 | std::span<const uint32_t> clusterGeneratingGroups,
198 | nvclusterlod::GroupGeneratingGroups& groupGeneratingGroups)
199 | {
200 | groupGeneratingGroups.ranges.reserve(groupClusterRanges.size());
201 |
202 | // Iterate over all groups, find the unique set of generating groups from
203 | // their clusters and append them linearly
204 | for(size_t groupIndex = 0; groupIndex < groupClusterRanges.size(); groupIndex++)
205 | {
206 | const nvcluster_Range& clusterRange = groupClusterRanges[groupIndex];
207 | if(clusterRange.count == 0)
208 | {
209 | return nvclusterlod_Result::NVCLUSTERLOD_ERROR_EMPTY_CLUSTER_GENERATING_GROUPS;
210 | }
211 |
212 | std::span<const uint32_t> generatingGroups =
213 | std::span(clusterGeneratingGroups).subspan(clusterRange.offset, clusterRange.count);
214 |
215 | if(generatingGroups[0] == NVCLUSTERLOD_ORIGINAL_MESH_GROUP)
216 | {
217 | groupGeneratingGroups.ranges.push_back({uint32_t(groupGeneratingGroups.groups.size()), 0}); // LOD0 groups have no generating group
218 | }
219 | else
220 | {
221 | std::unordered_set<uint32_t> uniqueGeneratingGroups(generatingGroups.begin(), generatingGroups.end());
222 | nvcluster_Range newGroupRange = {uint32_t(groupGeneratingGroups.groups.size()), uint32_t(uniqueGeneratingGroups.size())};
223 | groupGeneratingGroups.ranges.push_back(newGroupRange);
224 | groupGeneratingGroups.groups.insert(groupGeneratingGroups.groups.end(), uniqueGeneratingGroups.begin(),
225 | uniqueGeneratingGroups.end());
226 | }
227 | }
228 | assert(groupGeneratingGroups.ranges[0].offset == 0);
229 | assert(groupGeneratingGroups.ranges.size() == groupClusterRanges.size());
230 | return nvclusterlod_Result::NVCLUSTERLOD_SUCCESS;
231 | }
232 | } // namespace nvclusterlod
233 |
--------------------------------------------------------------------------------
/include/nvclusterlod/nvclusterlod_mesh.h:
--------------------------------------------------------------------------------
1 | /*
2 | * Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | *
16 | * SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION
17 | * SPDX-License-Identifier: Apache-2.0
18 | */
19 | #ifndef NVCLUSTERLOD_MESH_H
20 | #define NVCLUSTERLOD_MESH_H
21 |
22 | #include
23 | #include
24 | #include
25 |
26 | #ifdef __cplusplus
27 | extern "C" {
28 | #endif
29 |
30 | // LODs are formed from a directed acyclic graph of groups of clusters of
31 | // triangles. Each group has a generating group index. This value is a special
32 | // index indicating this group has no generating group, i.e. it is a cluster
33 | // group formed from original mesh triangles.
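// (Added illustrative sketch, not from the original header: a consumer can
// branch on this sentinel to look up per-cluster decimation error, matching
// the lookup described at nvclusterlod_MeshOutput::groupQuadricErrors below.)
//
//   uint32_t gen = output.clusterGeneratingGroups[cluster];
//   float error = (gen == NVCLUSTERLOD_ORIGINAL_MESH_GROUP)
//                     ? 0.0f                           /* original, undecimated geometry */
//                     : output.groupQuadricErrors[gen] /* error of the decimation that produced it */;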
34 | #define NVCLUSTERLOD_ORIGINAL_MESH_GROUP (~0u)
35 |
36 | #define nvclusterlod_defaultClusterConfig() \
37 | { \
38 | .minClusterSize = 96, .maxClusterSize = 128, .costUnderfill = 0.9f, .costOverlap = 0.5f, .preSplitThreshold = 1u << 17, \
39 | }
40 |
41 | #define nvclusterlod_defaultGroupConfig() \
42 | { \
43 | .minClusterSize = 24, .maxClusterSize = 32, .costUnderfill = 0.5f, .costOverlap = 0.0f, .preSplitThreshold = 0, \
44 | }
45 |
46 | typedef struct nvclusterlod_DecimateTrianglesCallbackParams
47 | {
48 | // Mesh data to decimate
49 | const nvclusterlod_Vec3u* triangleVertices NVCLUSTERLOD_DEFAULT(nullptr);
50 | const nvcluster_Vec3f* vertexPositions NVCLUSTERLOD_DEFAULT(nullptr);
51 |
52 | // One byte per vertex. 1 if the vertex is locked, 0 otherwise.
53 | const uint8_t* vertexLockFlags NVCLUSTERLOD_DEFAULT(nullptr);
54 |
55 | // Output location to write decimated triangles. Has space for triangleCount,
56 | // but the callback should produce targetTriangleCount.
57 | nvclusterlod_Vec3u* decimatedTriangleVertices NVCLUSTERLOD_DEFAULT(nullptr);
58 |
59 | uint32_t triangleCount NVCLUSTERLOD_DEFAULT(0u);
60 | uint32_t vertexStride NVCLUSTERLOD_DEFAULT(0u);
61 | uint32_t vertexCount NVCLUSTERLOD_DEFAULT(0u);
62 | uint32_t targetTriangleCount NVCLUSTERLOD_DEFAULT(0u);
63 | } nvclusterlod_DecimateTrianglesCallbackParams;
64 |
65 | typedef struct nvclusterlod_DecimateTrianglesCallbackResult
66 | {
67 | uint32_t decimatedTriangleCount NVCLUSTERLOD_DEFAULT(0u);
68 | uint32_t additionalVertexCount NVCLUSTERLOD_DEFAULT(0u);
69 | float quadricError NVCLUSTERLOD_DEFAULT(0.0f);
70 | } nvclusterlod_DecimateTrianglesCallbackResult;
71 |
72 | // Callback type for custom decimation, e.g. to allow preserving UV seams or
73 | // weighting edge collapses based on vertex attributes. Must be thread-safe. The
74 | // user writes targetTriangleCount triangles into decimatedTriangleVertices and
75 | // provides the number written and the geometric quadric error due to the lower
76 | // detail. See the internal decimateTrianglesDefault() for an example
77 | // implementation. If any new vertices are generated, the user is responsible
78 | // for over-allocating the input mesh and any synchronization needed. Returning
79 | // NVCLUSTER_FALSE or a zero decimatedTriangleCount will fail the overall
80 | // operation. Note that the triangles may reference only a small subset of the
81 | // given vertices, which are all vertices of the whole mesh.
82 | typedef nvcluster_Bool (*nvclusterlod_DecimateTrianglesCallback)(void* userData,
83 | const nvclusterlod_DecimateTrianglesCallbackParams* params,
84 | nvclusterlod_DecimateTrianglesCallbackResult* result);
85 |
86 | // Input mesh and clustering parameters used to generate decimated LODs.
87 | typedef struct nvclusterlod_MeshInput
88 | {
89 | // Array of triangles. Each indexes 3 vertices.
90 | const nvclusterlod_Vec3u* triangleVertices NVCLUSTERLOD_DEFAULT(nullptr);
91 |
92 | // Number of triangles in triangleVertices
93 | uint32_t triangleCount NVCLUSTERLOD_DEFAULT(0u);
94 |
95 | // Pointer to the first vertex position
96 | const nvcluster_Vec3f* vertexPositions NVCLUSTERLOD_DEFAULT(nullptr);
97 |
98 | // Maximum vertex index (plus one) referenced by triangleVertices
99 | uint32_t vertexCount NVCLUSTERLOD_DEFAULT(0u);
100 |
101 | // Stride in bytes between successive vertices (e.g. 12 bytes for tightly
102 | // packed positions)
103 | uint32_t vertexStride NVCLUSTERLOD_DEFAULT(sizeof(nvcluster_Vec3f));
104 |
105 | // Configuration for the generation of triangle clusters
106 | nvcluster_Config clusterConfig NVCLUSTERLOD_DEFAULT(nvclusterlod_defaultClusterConfig());
107 |
108 | // Configuration for the generation of cluster groups.
109 | // Each LOD is comprised of a number of cluster groups.
110 | nvcluster_Config groupConfig NVCLUSTERLOD_DEFAULT(nvclusterlod_defaultGroupConfig());
111 |
112 | // Decimation factor applied between successive LODs
113 | float decimationFactor NVCLUSTERLOD_DEFAULT(0.0f);
114 |
115 | // Optional user data passed to callbacks
116 | void* userData NVCLUSTERLOD_DEFAULT(nullptr);
117 |
118 | // Optional callback to override the default triangle decimation. Required if
119 | // compiling with the CMake option NVCLUSTERLOD_FETCH_MESHOPTIMIZER OFF.
120 | nvclusterlod_DecimateTrianglesCallback decimateTrianglesCallback NVCLUSTERLOD_DEFAULT(nullptr);
121 | } nvclusterlod_MeshInput;
122 |
123 | // Memory requirements for the output storage of the mesh LODs
124 | typedef struct nvclusterlod_MeshCounts
125 | {
126 | // Maximum total number of triangles across LODs
127 | uint32_t triangleCount NVCLUSTERLOD_DEFAULT(0u);
128 |
129 | // Maximum total number of clusters across LODs
130 | uint32_t clusterCount NVCLUSTERLOD_DEFAULT(0u);
131 |
132 | // Maximum total number of cluster groups across LODs
133 | uint32_t groupCount NVCLUSTERLOD_DEFAULT(0u);
134 |
135 | // Maximum number of LODs in the mesh
136 | uint32_t lodLevelCount NVCLUSTERLOD_DEFAULT(0u);
137 | } nvclusterlod_MeshCounts;
138 |
139 | // Pointers to output clusters referencing original vertices and counts written.
140 | // Continuous LOD can be created by stitching together clusters from various
141 | // LODs. Bounding spheres and errors must first be updated to propagate their
142 | // values between LODs. This is currently performed by the spatial hierarchy
143 | // API, but could be done separately. These values may be passed directly to
144 | // nvclusterlod_HierarchyInput.
145 | typedef struct nvclusterlod_MeshOutput
146 | {
147 | // Clusters of triangles. This is the granularity at which geometry can be
148 | // swapped in and out for detail selection. Each range selects a subset of
149 | // the triangleVertices array.
150 | nvcluster_Range* clusterTriangleRanges NVCLUSTERLOD_DEFAULT(nullptr);
151 |
152 | // New triangles for all LODs. These are produced by iterative decimation of
153 | // the original input mesh and reference vertices in the original.
154 | nvclusterlod_Vec3u* triangleVertices NVCLUSTERLOD_DEFAULT(nullptr);
155 |
156 | // The group of clusters that was decimated to produce the geometry in each
157 | // cluster, or NVCLUSTERLOD_ORIGINAL_MESH_GROUP if the cluster is original
158 | // mesh geometry. This relationship forms a DAG. Levels of detail are
159 | // generated by iteratively decimating groups of clusters and re-clustering
160 | // the result. The clusters in a group will have mixed generating groups. See
161 | // the readme for a visualization.
162 | uint32_t* clusterGeneratingGroups NVCLUSTERLOD_DEFAULT(nullptr);
163 |
164 | // Bounding spheres of the clusters, may be nullptr
165 | nvclusterlod_Sphere* clusterBoundingSpheres NVCLUSTERLOD_DEFAULT(nullptr);
166 |
167 | // Error metric after decimating geometry in each group. Counter-intuitively,
168 | // not the error of the geometry in the group - that value does not exist
169 | // per-group. For the current level, use
170 | // groupQuadricErrors[clusterGeneratingGroups[cluster]]. This saves
171 | // duplicating data per cluster. The final LOD (just one
172 | // group) is not decimated and has an error of zero.
173 | // TODO: shouldn't this be infinite error so it's always drawn?
174 | float* groupQuadricErrors NVCLUSTERLOD_DEFAULT(nullptr);
175 |
176 | // Ranges of clusters for each group of clusters. I.e. cluster values for a
177 | // group are stored at cluster*[range.offset + i] for i in {0 .. range.count -
178 | // 1}.
179 | nvcluster_Range* groupClusterRanges NVCLUSTERLOD_DEFAULT(nullptr);
180 |
181 | // Ranges of groups for each LOD level. I.e. group values for a LOD are stored
182 | // at group*[range.offset + i] for i in {0 .. range.count - 1}. The finest LOD
183 | // is at index 0 (comprised of clusters of the original mesh), followed by the
184 | // coarser LODs from finer to coarser.
185 | nvcluster_Range* lodLevelGroupRanges NVCLUSTERLOD_DEFAULT(nullptr);
186 |
187 | // Number of triangles for all LODs
188 | uint32_t triangleCount NVCLUSTERLOD_DEFAULT(0u);
189 |
190 | // Number of clusters for all LODs
191 | uint32_t clusterCount NVCLUSTERLOD_DEFAULT(0u);
192 |
193 | // Number of cluster groups for all LODs
194 | uint32_t groupCount NVCLUSTERLOD_DEFAULT(0u);
195 |
196 | // Number of LOD levels
197 | uint32_t lodLevelCount NVCLUSTERLOD_DEFAULT(0u);
198 | } nvclusterlod_MeshOutput;
199 |
200 | // Usage:
201 | // 1. call nvclusterlodGetMeshRequirements(...) to get conservative sizes
202 | // 2. allocate nvclusterlod_MeshOutput data
203 | // 3. call nvclusterlodBuildMesh(...)
204 | // 4. resize down to what was written
205 | // Alternatively use nvclusterlod::LodMesh, which encapsulates the above. Note
206 | // that returned vertices are global indices. The utility
207 | // nvclusterlod::LocalizedLodMesh can be used to create vertex indices local to
208 | // each cluster.
209 |
210 | // Request the memory requirements to build the LODs for the input mesh
211 | nvclusterlod_Result nvclusterlodGetMeshRequirements(nvclusterlod_Context context,
212 | const nvclusterlod_MeshInput* input,
213 | nvclusterlod_MeshCounts* outputRequiredCounts);
214 |
215 | // Build the LODs for the input mesh
216 | nvclusterlod_Result nvclusterlodBuildMesh(nvclusterlod_Context context, const nvclusterlod_MeshInput* input, nvclusterlod_MeshOutput* output);
217 |
218 | #ifdef __cplusplus
219 | } // extern "C"
220 | #endif
221 |
222 | #endif // NVCLUSTERLOD_MESH_H
223 |
--------------------------------------------------------------------------------
/src/array_view.hpp:
--------------------------------------------------------------------------------
1 | //
2 | // Copyright (c) 2022-2023, NVIDIA CORPORATION. All rights reserved.
3 | //
4 | // NVIDIA CORPORATION and its licensors retain all intellectual property
5 | // and proprietary rights in and to this software, related documentation
6 | // and any modifications thereto. Any use, reproduction, disclosure or
7 | // distribution of this software and related documentation without an express
8 | // license agreement from NVIDIA CORPORATION is strictly prohibited.
9 | //
10 |
11 | /**
12 | * @file array_view.hpp
13 | * @brief Defines ArrayView, an extended pointer that holds a size and byte stride.
14 | *
15 | * The primary motivation is to provide an API where the data may come directly from many sources, but with stronger
16 | * type and array bounds safety.
For example: 17 | * 18 | * @code 19 | * std::vector src1{1, 2, 3}; 20 | * struct Obj { int i; double d; }; 21 | * Obj src2[]{{3, 0.0}, {4, 0.0}, {5, 0.0}}; 22 | * apiFunc(ArrayView(src1)); 23 | * apiFunc(ArrayView(&src2[0].i, 3, sizeof(Obj))); 24 | * @endcode 25 | * 26 | * Type conversions and slicing also helps keep track of indices. 27 | * 28 | * @code 29 | * struct Obj { int x, y; }; 30 | * Obj src[]{{1, 2}, {2, 4}, {3, 9}}; 31 | * ArrayView view(src); 32 | * ArrayView cast(view); 33 | * apiFunc(cast.slice(2, 4)); // passes an int array of {2, 4, 3, 9} 34 | * @endcode 35 | * 36 | * The DynamicArrayView object introduces a resize callback so an API can write varying amounts of data. 37 | * 38 | * @code 39 | * std::vector src{1, 2, 3}; 40 | * DynamicArrayView view(src); 41 | * view.resize(5, 42); // src now has {1, 2, 3, 42, 42} 42 | * @endcode 43 | */ 44 | 45 | #pragma once 46 | 47 | #include 48 | #include 49 | #include 50 | #include 51 | #include 52 | #include 53 | 54 | #if defined(_NDEBUG) 55 | #define ARRAY_VIEW_ITERATOR_OVERFLOW_DETECTION 0 56 | #else 57 | #define ARRAY_VIEW_ITERATOR_OVERFLOW_DETECTION 1 58 | #endif 59 | 60 | #if ARRAY_VIEW_ITERATOR_OVERFLOW_DETECTION 61 | #define ARRAY_VIEW_BOUNDS_CHECK(expr) assert(expr) 62 | #else 63 | #define ARRAY_VIEW_BOUNDS_CHECK(expr) static_cast(0) 64 | #endif 65 | 66 | /** 67 | * @brief Basic pointer iterator for ArrayView, but with a byte stride 68 | */ 69 | template 70 | class StrideIterator 71 | { 72 | public: 73 | using iterator_category = std::random_access_iterator_tag; 74 | using value_type = ValueType; 75 | using pointer = value_type*; 76 | using reference = value_type&; 77 | using difference_type = ptrdiff_t; 78 | using stride_type = size_t; 79 | using byte_pointer = std::conditional_t, const uint8_t*, uint8_t*>; 80 | 81 | #if ARRAY_VIEW_ITERATOR_OVERFLOW_DETECTION 82 | using size_type = uint64_t; 83 | StrideIterator(pointer ptr, stride_type stride, size_type size) 84 | : m_ptr(ptr) 85 | , m_stride(stride) 86 | , m_begin(ptr) 87 | , m_end(reinterpret_cast(reinterpret_cast(ptr) + size * stride)) 88 | { 89 | } 90 | #else 91 | StrideIterator(pointer ptr, stride_type stride) 92 | : m_ptr(ptr) 93 | , m_stride(stride) 94 | { 95 | } 96 | #endif 97 | StrideIterator() = default; 98 | StrideIterator(const StrideIterator& other) = default; 99 | 100 | bool operator==(const StrideIterator& other) const { return (m_ptr == other.m_ptr); } 101 | bool operator!=(const StrideIterator& other) const { return (m_ptr != other.m_ptr); } 102 | bool operator<(const StrideIterator& other) const { return (m_ptr < other.m_ptr); } 103 | bool operator<=(const StrideIterator& other) const { return (m_ptr <= other.m_ptr); } 104 | bool operator>(const StrideIterator& other) const { return (m_ptr > other.m_ptr); } 105 | bool operator>=(const StrideIterator& other) const { return (m_ptr >= other.m_ptr); } 106 | 107 | StrideIterator& operator+=(const difference_type& i) 108 | { 109 | m_ptr = reinterpret_cast(reinterpret_cast(m_ptr) + static_cast(i) * m_stride); 110 | return (*this); 111 | } 112 | StrideIterator& operator-=(const difference_type& i) 113 | { 114 | m_ptr = reinterpret_cast(reinterpret_cast(m_ptr) - i * m_stride); 115 | return (*this); 116 | } 117 | StrideIterator& operator++() 118 | { 119 | m_ptr = reinterpret_cast(reinterpret_cast(m_ptr) + m_stride); 120 | return (*this); 121 | } 122 | StrideIterator& operator--() 123 | { 124 | m_ptr = reinterpret_cast(reinterpret_cast(m_ptr) - m_stride); 125 | return (*this); 126 | } 127 | StrideIterator 
operator++(int) 128 | { 129 | auto result(*this); 130 | ++(*this); 131 | return result; 132 | } 133 | StrideIterator operator--(int) 134 | { 135 | auto result(*this); 136 | --(*this); 137 | return result; 138 | } 139 | StrideIterator operator+(const difference_type& i) const 140 | { 141 | StrideIterator result(*this); 142 | return result += i; 143 | } 144 | StrideIterator operator-(const difference_type& i) const 145 | { 146 | StrideIterator result(*this); 147 | return result -= i; 148 | } 149 | 150 | difference_type operator-(const StrideIterator& other) const 151 | { 152 | const auto& lhs = reinterpret_cast(m_ptr); 153 | const auto& rhs = reinterpret_cast(other.m_ptr); 154 | return static_cast((lhs - rhs) / m_stride); 155 | } 156 | 157 | value_type& operator*() 158 | { 159 | ARRAY_VIEW_BOUNDS_CHECK(m_ptr >= m_begin && m_ptr < m_end); 160 | return *m_ptr; 161 | } 162 | const value_type& operator*() const 163 | { 164 | ARRAY_VIEW_BOUNDS_CHECK(m_ptr >= m_begin && m_ptr < m_end); 165 | return *m_ptr; 166 | } 167 | value_type* operator->() 168 | { 169 | ARRAY_VIEW_BOUNDS_CHECK(m_ptr >= m_begin && m_ptr < m_end); 170 | return m_ptr; 171 | } 172 | 173 | // Relative offset accessor. Used by e.g. std::reduce(). 174 | value_type& operator[](difference_type idx) const { return *(*this + idx); } 175 | 176 | private: 177 | pointer m_ptr{nullptr}; 178 | stride_type m_stride{0}; 179 | 180 | #if ARRAY_VIEW_ITERATOR_OVERFLOW_DETECTION 181 | pointer m_begin{nullptr}; 182 | pointer m_end{nullptr}; 183 | #endif 184 | }; 185 | 186 | /** 187 | * @brief Random access container view - just (pointer, size, stride). 188 | * 189 | * - Constructable from a std::vector 190 | * - Supports implicit casting between types, e.g. uint[3] -> uvec[1] 191 | * - More type and size safety when making these conversions 192 | * 193 | * Similar to a C++20 std::span, but with a stride. 194 | */ 195 | template 196 | class ArrayView 197 | { 198 | public: 199 | using value_type = ValueType; 200 | using size_type = size_t; 201 | using stride_type = size_t; 202 | using iterator = StrideIterator; 203 | 204 | // Constructs an empty view 205 | ArrayView() 206 | : m_ptr(nullptr) 207 | , m_size(0) 208 | , m_stride(sizeof(value_type)) // default stride for iterating an empty view 209 | { 210 | } 211 | 212 | // Construct from std::vector. Use remove_cv_t as a vector's type cannot be 213 | // const but the vector can be 214 | ArrayView(std::vector>& vector) 215 | : m_ptr(vector.data()) 216 | , m_size(vector.size()) 217 | , m_stride(sizeof(value_type)) // tightly packed 218 | { 219 | } 220 | 221 | // Const std::vector version 222 | ArrayView(const std::vector>& vector) 223 | : m_ptr(vector.data()) // if passing a const vector, make sure you have ArrayView 224 | , m_size(vector.size()) 225 | , m_stride(sizeof(value_type)) // tightly packed 226 | { 227 | } 228 | 229 | // Disallow r-value const reference std::vector construction 230 | ArrayView(const std::vector>&& vector) = delete; 231 | 232 | // Simple pointer + size wrapper, but keeping type safety 233 | ArrayView(value_type* ptr, size_type size, stride_type stride = sizeof(value_type)) 234 | : m_ptr(ptr) 235 | , m_size(size) 236 | , m_stride(stride) 237 | { 238 | ARRAY_VIEW_BOUNDS_CHECK(stride > 0); 239 | } 240 | 241 | ArrayView(const ArrayView& other) = default; 242 | 243 | // The assignment operator copies pointers, not data! 244 | ArrayView& operator=(const ArrayView& other) = default; 245 | 246 | // Implicit conversion to ConstArrayView, i.e. ArrayView. 
Only exists for constructing a const value type 247 | // from a non const value type. The template parameter Dummy is used to avoid defining a copy constructor and breaking 248 | // the rule of 5. Based on https://quuxplusone.github.io/blog/2018/12/01/const-iterator-antipatterns/ 249 | template >> 250 | ArrayView(const ArrayView>& other) 251 | : m_ptr(other.data()) 252 | , m_size(other.size()) 253 | , m_stride(other.stride()) 254 | { 255 | } 256 | 257 | /** 258 | * @brief Constructor to convert from a different type of ArrayView object 259 | * 260 | * Marked explicit because this is somewhat dangerous, e.g. it can hide vec4 to vec3 conversion and not even assert if 261 | * the sizes make it an even multiple 262 | */ 263 | template , std::decay_t> && 267 | // And not a const conversion (not sure why gcc was allowing reinterpret_cast without this) 268 | !(!std::is_const_v && std::is_const_v)>> 269 | explicit ArrayView(const ArrayView& other) 270 | : m_ptr(reinterpret_cast(other.data())) // const to non-const is a common error here. make sure you have ArrayView 271 | , m_size((other.size() * static_cast(sizeof(T))) / static_cast(sizeof(value_type))) 272 | , m_stride(sizeof(T) == sizeof(value_type) ? other.stride() : static_cast(sizeof(value_type))) 273 | { 274 | // Sanity check that both views now refer to the same amount of data 275 | ARRAY_VIEW_BOUNDS_CHECK(size() * static_cast(sizeof(value_type)) 276 | == other.size() * static_cast(sizeof(T))); 277 | ARRAY_VIEW_BOUNDS_CHECK(size() * stride() == other.size() * other.stride()); 278 | 279 | // Either the array was tightly packed or the element size must be the same, to keep the same stride 280 | ARRAY_VIEW_BOUNDS_CHECK(sizeof(value_type) == sizeof(T) || other.stride() == static_cast(sizeof(T))); 281 | } 282 | 283 | value_type& operator[](size_type idx) const 284 | { 285 | return *(begin() + static_cast(idx)); 286 | } 287 | 288 | bool empty() const { return m_size == 0; } 289 | value_type* data() const { return m_ptr; } 290 | size_type size() const { return m_size; } 291 | stride_type stride() const { return m_stride; } 292 | 293 | #if ARRAY_VIEW_ITERATOR_OVERFLOW_DETECTION 294 | iterator begin() const { return iterator(m_ptr, m_stride, m_size); } 295 | iterator end() const { return iterator(m_ptr, m_stride, m_size) + m_size; } 296 | #else 297 | iterator begin() const { return iterator(m_ptr, m_stride); } 298 | iterator end() const { return iterator(m_ptr, m_stride) + m_size; } 299 | #endif 300 | 301 | ArrayView slice(size_type position, size_type length) const 302 | { 303 | ARRAY_VIEW_BOUNDS_CHECK(position < m_size); 304 | ARRAY_VIEW_BOUNDS_CHECK(length <= m_size - position); 305 | return ArrayView(&*(begin() + position), length, m_stride); 306 | } 307 | 308 | // Returns a slice if the view is not empty, otherwise, returns an empty view 309 | ArrayView slice_nonempty(size_type position, size_type length) const 310 | { 311 | return empty() ? ArrayView{} : slice(position, length); 312 | } 313 | 314 | protected: 315 | value_type* m_ptr; 316 | size_type m_size; 317 | stride_type m_stride; 318 | }; 319 | 320 | // Deduction guides for constructing a VectorView with a std::vector, necessary for const vectors. 
321 | template <class ValueType>
322 | ArrayView(std::vector<ValueType>& vector) -> ArrayView<ValueType>;
323 | template <class ValueType>
324 | ArrayView(const std::vector<ValueType>& vector) -> ArrayView<const ValueType>;
325 |
326 | // Utility type to force the value type to be const or non-const
327 | template <class T>
328 | using ConstArrayView = ArrayView<const T>;
329 | template <class T>
330 | using MutableArrayView = ArrayView<std::remove_const_t<T>>;
331 |
332 | // Const to non const cast function, ideally never to be used.
333 | template <class T>
334 | const ArrayView<T>& ArrayViewConstCast(const ConstArrayView<T>& constArrayView)
335 | {
336 | return (const ArrayView<T>&)(constArrayView);
337 | }
338 |
339 | static_assert(std::is_constructible_v<ConstArrayView<int>, ArrayView<int>>);
340 | static_assert(!std::is_constructible_v<ArrayView<int>, ConstArrayView<int>>);
341 | static_assert(std::is_copy_constructible_v<ArrayView<int>>);
342 | static_assert(std::is_trivially_copy_constructible_v<ArrayView<int>>);
343 | static_assert(std::is_copy_constructible_v<ConstArrayView<int>>);
344 | static_assert(std::is_trivially_copy_constructible_v<ConstArrayView<int>>);
345 | static_assert(std::is_constructible_v<ConstArrayView<int>, const std::vector<int>&>);
346 | static_assert(!std::is_constructible_v<ConstArrayView<int>, const std::vector<int>&&>,
347 | "Creating an ArrayView from an r-value would make a dangling pointer");
348 |
349 | // TODO get rid of this
350 |
351 | /**
352 | * @brief Adds a resize callback to ArrayView
353 | */
354 | template <class ValueType>
355 | class DynamicArrayView : public ArrayView<ValueType>
356 | {
357 | public:
358 | using typename ArrayView<ValueType>::value_type;
359 | using typename ArrayView<ValueType>::size_type;
360 | using typename ArrayView<ValueType>::stride_type;
361 | using resize_func_type = ValueType*(size_type, const ValueType&);
362 |
363 | DynamicArrayView() = default;
364 | DynamicArrayView(const DynamicArrayView& other) = default;
365 | DynamicArrayView(std::function<resize_func_type> resizeCallback, value_type* ptr, size_type size, stride_type stride = sizeof(value_type))
366 | : ArrayView<ValueType>(ptr, size, stride)
367 | , m_resizeCallback(resizeCallback)
368 | {
369 | }
370 |
371 | // Implementation for std::vector, keeping a reference to the original container in the lambda's capture
372 | DynamicArrayView(std::vector<std::remove_cv_t<ValueType>>& vector)
373 | : ArrayView<ValueType>(vector)
374 | , m_resizeCallback([&vector](size_type size, const ValueType& value) {
375 | vector.resize(size, value);
376 | return vector.data();
377 | })
378 | {
379 | }
380 |
381 | // Type conversion constructor. Currently implemented by chaining lambdas, each encoding size and offset manipulation
382 | // of the original resize function. Alternatives would be to use void pointer return type and maintain size ratio and
383 | // offset variables.
384 | template >> 385 | DynamicArrayView(const DynamicArrayView& other) 386 | : ArrayView(other) 387 | , m_resizeCallback([cb = other.m_resizeCallback](size_type size, const ValueType& value) { 388 | size_t otherTypeSize = static_cast(sizeof(T)); 389 | size_t thisTypeSize = static_cast(sizeof(value_type)); 390 | return reinterpret_cast(cb((size * thisTypeSize) / otherTypeSize, reinterpret_cast(value))); 391 | }) 392 | { 393 | } 394 | 395 | void resize(size_type size, const ValueType& value = ValueType()) 396 | { 397 | this->m_ptr = m_resizeCallback(size, value); 398 | this->m_size = size; 399 | assert(!size || this->m_ptr); 400 | } 401 | 402 | // Returns true if this object has been initialized with a resize callback 403 | bool resizable() const { return static_cast(m_resizeCallback); } 404 | 405 | private: 406 | std::function m_resizeCallback; 407 | 408 | // Provide access to m_resizeCallback from other DynamicArrayView types 409 | template 410 | friend class DynamicArrayView; 411 | }; 412 | -------------------------------------------------------------------------------- /include/nvclusterlod/nvclusterlod_cache.hpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | * 16 | * SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION. 17 | * SPDX-License-Identifier: Apache-2.0 18 | */ 19 | 20 | #pragma once 21 | 22 | #include 23 | #include 24 | 25 | #include "nvclusterlod_common.h" 26 | #include "nvclusterlod_hierarchy_storage.hpp" 27 | #include "nvclusterlod_mesh_storage.hpp" 28 | 29 | // This file provides basic helpers to de-/serialize the 30 | // key containers from the storage classes into a flat 31 | // uncompressed binary cache. 
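// An added usage sketch (not part of the original header). It assumes
// `lodMesh` is a populated nvclusterlod::LodMesh; note that the store/load
// helpers below assert 16-byte (detail::ALIGNMENT) alignment of the buffer
// start, and that a loaded view aliases the buffer rather than owning it:
//
//   nvclusterlod::LodMeshView view;
//   nvclusterlod::toView(lodMesh, view);
//   uint64_t size = nvclusterlod::getCachedSize(view);
//   void*    blob = ::operator new(size_t(size), std::align_val_t{16});
//   bool ok = nvclusterlod::storeCached(view, size, blob);
//   nvclusterlod::LodMeshView loaded;
//   ok = ok && nvclusterlod::loadCached(loaded, size, blob);
//   nvclusterlod::LodMesh roundTripped;
//   nvclusterlod::toStorage(loaded, roundTripped); // deep copy out of the blob
//   ::operator delete(blob, std::align_val_t{16});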
32 | 33 | namespace nvclusterlod { 34 | 35 | namespace detail { 36 | static constexpr uint64_t ALIGNMENT = 16ULL; 37 | static constexpr uint64_t ALIGN_MASK = ALIGNMENT - 1; 38 | static_assert(ALIGNMENT >= sizeof(uint64_t)); 39 | template 40 | inline uint64_t getCachedSize(const std::span& view) 41 | { 42 | // use one extra ALIGNMENT to store count 43 | return ((view.size_bytes() + ALIGN_MASK) & ~ALIGN_MASK) + ALIGNMENT; 44 | } 45 | 46 | template 47 | inline void storeAndAdvance(bool& isValid, uint64_t& dataAddress, uint64_t dataEnd, const std::span& view) 48 | { 49 | assert(static_cast(dataAddress) % ALIGNMENT == 0); 50 | 51 | if(isValid && dataAddress + getCachedSize(view) <= dataEnd) 52 | { 53 | union 54 | { 55 | uint64_t count; 56 | uint8_t countData[ALIGNMENT]; 57 | }; 58 | memset(countData, 0, sizeof(countData)); 59 | 60 | count = view.size(); 61 | 62 | // store count first 63 | memcpy(reinterpret_cast(dataAddress), countData, ALIGNMENT); 64 | dataAddress += ALIGNMENT; 65 | 66 | if(view.size()) 67 | { 68 | // then data 69 | memcpy(reinterpret_cast(dataAddress), view.data(), view.size_bytes()); 70 | dataAddress += (view.size_bytes() + ALIGN_MASK) & ~ALIGN_MASK; 71 | } 72 | } 73 | else 74 | { 75 | isValid = false; 76 | } 77 | } 78 | 79 | template 80 | inline void loadAndAdvance(bool& isValid, uint64_t& dataAddress, uint64_t dataEnd, std::span& view) 81 | { 82 | union 83 | { 84 | const T* basePointer; 85 | uint64_t baseRaw; 86 | }; 87 | baseRaw = dataAddress; 88 | 89 | assert(dataAddress % ALIGNMENT == 0); 90 | 91 | uint64_t count = *reinterpret_cast(basePointer); 92 | baseRaw += ALIGNMENT; 93 | 94 | if(isValid && count && (baseRaw + (sizeof(T) * count) <= dataEnd)) 95 | { 96 | // each array is 16 byte aligned 97 | view = std::span(basePointer, count); 98 | } 99 | else 100 | { 101 | view = {}; 102 | // count of zero is valid, otherwise bail 103 | isValid = isValid && count == 0; 104 | } 105 | 106 | baseRaw += sizeof(T) * count; 107 | baseRaw = (baseRaw + ALIGN_MASK) & ~(ALIGN_MASK); 108 | 109 | dataAddress = baseRaw; 110 | } 111 | } // namespace detail 112 | 113 | struct LodMeshView 114 | { 115 | std::span triangleVertices; 116 | std::span clusterTriangleRanges; 117 | std::span clusterGeneratingGroups; 118 | std::span clusterBoundingSpheres; 119 | std::span groupQuadricErrors; 120 | std::span groupClusterRanges; 121 | std::span lodLevelGroupRanges; 122 | }; 123 | 124 | inline void toView(const LodMesh& storage, LodMeshView& view) 125 | { 126 | view.triangleVertices = storage.triangleVertices; 127 | view.clusterTriangleRanges = storage.clusterTriangleRanges; 128 | view.clusterGeneratingGroups = storage.clusterGeneratingGroups; 129 | view.clusterBoundingSpheres = storage.clusterBoundingSpheres; 130 | view.groupQuadricErrors = storage.groupQuadricErrors; 131 | view.groupClusterRanges = storage.groupClusterRanges; 132 | view.lodLevelGroupRanges = storage.lodLevelGroupRanges; 133 | } 134 | 135 | inline void toStorage(const LodMeshView& view, LodMesh& storage) 136 | { 137 | storage.triangleVertices.resize(view.triangleVertices.size()); 138 | storage.clusterTriangleRanges.resize(view.clusterTriangleRanges.size()); 139 | storage.clusterGeneratingGroups.resize(view.clusterGeneratingGroups.size()); 140 | storage.clusterBoundingSpheres.resize(view.clusterBoundingSpheres.size()); 141 | storage.groupQuadricErrors.resize(view.groupQuadricErrors.size()); 142 | storage.groupClusterRanges.resize(view.groupClusterRanges.size()); 143 | storage.lodLevelGroupRanges.resize(view.lodLevelGroupRanges.size()); 
144 | 145 | memcpy(storage.triangleVertices.data(), view.triangleVertices.data(), view.triangleVertices.size_bytes()); 146 | memcpy(storage.clusterTriangleRanges.data(), view.clusterTriangleRanges.data(), view.clusterTriangleRanges.size_bytes()); 147 | memcpy(storage.clusterGeneratingGroups.data(), view.clusterGeneratingGroups.data(), view.clusterGeneratingGroups.size_bytes()); 148 | memcpy(storage.clusterBoundingSpheres.data(), view.clusterBoundingSpheres.data(), view.clusterBoundingSpheres.size_bytes()); 149 | memcpy(storage.groupQuadricErrors.data(), view.groupQuadricErrors.data(), view.groupQuadricErrors.size_bytes()); 150 | memcpy(storage.groupClusterRanges.data(), view.groupClusterRanges.data(), view.groupClusterRanges.size_bytes()); 151 | memcpy(storage.lodLevelGroupRanges.data(), view.lodLevelGroupRanges.data(), view.lodLevelGroupRanges.size_bytes()); 152 | } 153 | 154 | inline uint64_t getCachedSize(const LodMeshView& meshView) 155 | { 156 | uint64_t cachedSize = 0; 157 | 158 | cachedSize += detail::getCachedSize(meshView.triangleVertices); 159 | cachedSize += detail::getCachedSize(meshView.clusterTriangleRanges); 160 | cachedSize += detail::getCachedSize(meshView.clusterGeneratingGroups); 161 | cachedSize += detail::getCachedSize(meshView.clusterBoundingSpheres); 162 | cachedSize += detail::getCachedSize(meshView.groupQuadricErrors); 163 | cachedSize += detail::getCachedSize(meshView.groupClusterRanges); 164 | cachedSize += detail::getCachedSize(meshView.lodLevelGroupRanges); 165 | 166 | return cachedSize; 167 | } 168 | 169 | inline bool storeCached(const LodMeshView& view, uint64_t dataSize, void* data) 170 | { 171 | uint64_t dataAddress = reinterpret_cast(data); 172 | uint64_t dataEnd = dataAddress + dataSize; 173 | 174 | bool isValid = true; 175 | 176 | detail::storeAndAdvance(isValid, dataAddress, dataEnd, view.triangleVertices); 177 | detail::storeAndAdvance(isValid, dataAddress, dataEnd, view.clusterTriangleRanges); 178 | detail::storeAndAdvance(isValid, dataAddress, dataEnd, view.clusterGeneratingGroups); 179 | detail::storeAndAdvance(isValid, dataAddress, dataEnd, view.clusterBoundingSpheres); 180 | detail::storeAndAdvance(isValid, dataAddress, dataEnd, view.groupQuadricErrors); 181 | detail::storeAndAdvance(isValid, dataAddress, dataEnd, view.groupClusterRanges); 182 | detail::storeAndAdvance(isValid, dataAddress, dataEnd, view.lodLevelGroupRanges); 183 | 184 | return isValid; 185 | } 186 | 187 | inline bool loadCached(LodMeshView& view, uint64_t dataSize, const void* data) 188 | { 189 | uint64_t dataAddress = reinterpret_cast(data); 190 | uint64_t dataEnd = dataAddress + dataSize; 191 | 192 | bool isValid = true; 193 | 194 | detail::loadAndAdvance(isValid, dataAddress, dataEnd, view.triangleVertices); 195 | detail::loadAndAdvance(isValid, dataAddress, dataEnd, view.clusterTriangleRanges); 196 | detail::loadAndAdvance(isValid, dataAddress, dataEnd, view.clusterGeneratingGroups); 197 | detail::loadAndAdvance(isValid, dataAddress, dataEnd, view.clusterBoundingSpheres); 198 | detail::loadAndAdvance(isValid, dataAddress, dataEnd, view.groupQuadricErrors); 199 | detail::loadAndAdvance(isValid, dataAddress, dataEnd, view.groupClusterRanges); 200 | detail::loadAndAdvance(isValid, dataAddress, dataEnd, view.lodLevelGroupRanges); 201 | 202 | return isValid; 203 | } 204 | 205 | struct LodHierarchyView 206 | { 207 | std::span nodes; 208 | std::span groupCumulativeBoundingSpheres; 209 | std::span groupCumulativeQuadricError; 210 | }; 211 | 212 | inline void toView(const 
LodHierarchy& storage, LodHierarchyView& view) 213 | { 214 | view.nodes = storage.nodes; 215 | view.groupCumulativeBoundingSpheres = storage.groupCumulativeBoundingSpheres; 216 | view.groupCumulativeQuadricError = storage.groupCumulativeQuadricError; 217 | } 218 | 219 | inline void toStorage(const LodHierarchyView& view, LodHierarchy& storage) 220 | { 221 | storage.nodes.resize(view.nodes.size()); 222 | storage.groupCumulativeBoundingSpheres.resize(view.groupCumulativeBoundingSpheres.size()); 223 | storage.groupCumulativeQuadricError.resize(view.groupCumulativeQuadricError.size()); 224 | 225 | memcpy(storage.nodes.data(), view.nodes.data(), view.nodes.size_bytes()); 226 | memcpy(storage.groupCumulativeBoundingSpheres.data(), view.groupCumulativeBoundingSpheres.data(), 227 | view.groupCumulativeBoundingSpheres.size_bytes()); 228 | memcpy(storage.groupCumulativeQuadricError.data(), view.groupCumulativeQuadricError.data(), 229 | view.groupCumulativeQuadricError.size_bytes()); 230 | } 231 | 232 | inline LodHierarchyView getView(const LodHierarchy& hierarchy) 233 | { 234 | LodHierarchyView view; 235 | view.nodes = hierarchy.nodes; 236 | view.groupCumulativeBoundingSpheres = hierarchy.groupCumulativeBoundingSpheres; 237 | view.groupCumulativeQuadricError = hierarchy.groupCumulativeQuadricError; 238 | return view; 239 | } 240 | 241 | inline uint64_t getCachedSize(const LodHierarchyView& hierarchyView) 242 | { 243 | uint64_t cachedSize = 0; 244 | 245 | cachedSize += detail::getCachedSize(hierarchyView.nodes); 246 | cachedSize += detail::getCachedSize(hierarchyView.groupCumulativeBoundingSpheres); 247 | cachedSize += detail::getCachedSize(hierarchyView.groupCumulativeQuadricError); 248 | 249 | return cachedSize; 250 | } 251 | 252 | inline bool storeCached(const LodHierarchyView& view, uint64_t dataSize, void* data) 253 | { 254 | uint64_t dataAddress = reinterpret_cast(data); 255 | uint64_t dataEnd = dataAddress + dataSize; 256 | 257 | bool isValid = true; 258 | 259 | detail::storeAndAdvance(isValid, dataAddress, dataEnd, view.nodes); 260 | detail::storeAndAdvance(isValid, dataAddress, dataEnd, view.groupCumulativeBoundingSpheres); 261 | detail::storeAndAdvance(isValid, dataAddress, dataEnd, view.groupCumulativeQuadricError); 262 | 263 | return isValid; 264 | } 265 | 266 | inline bool loadCached(LodHierarchyView& view, uint64_t dataSize, const void* data) 267 | { 268 | uint64_t dataAddress = reinterpret_cast(data); 269 | uint64_t dataEnd = dataAddress + dataSize; 270 | 271 | bool isValid = true; 272 | 273 | detail::loadAndAdvance(isValid, dataAddress, dataEnd, view.nodes); 274 | detail::loadAndAdvance(isValid, dataAddress, dataEnd, view.groupCumulativeBoundingSpheres); 275 | detail::loadAndAdvance(isValid, dataAddress, dataEnd, view.groupCumulativeQuadricError); 276 | 277 | return isValid; 278 | } 279 | 280 | struct LodGeometryInfo 281 | { 282 | // details of the original mesh are embedded for compatibility 283 | uint64_t inputTriangleCount = 0; 284 | uint64_t inputVertexCount = 0; 285 | uint64_t inputTriangleIndicesHash = 0; 286 | uint64_t inputVerticesHash = 0; 287 | nvcluster_Config clusterConfig; 288 | nvcluster_Config groupConfig; 289 | float decimationFactor = 0; 290 | }; 291 | 292 | struct LodGeometryView 293 | { 294 | LodGeometryInfo info; 295 | 296 | // this is also the storage order 297 | LodMeshView lodMesh; 298 | LodHierarchyView lodHierarchy; 299 | }; 300 | 301 | inline uint64_t getCachedSize(const LodGeometryView& view) 302 | { 303 | uint64_t cachedSize = 0; 304 | 305 | cachedSize += 
(sizeof(LodGeometryInfo) + detail::ALIGN_MASK) & ~detail::ALIGN_MASK; 306 | cachedSize += getCachedSize(view.lodMesh); 307 | cachedSize += getCachedSize(view.lodHierarchy); 308 | 309 | return cachedSize; 310 | } 311 | 312 | inline bool storeCached(const LodGeometryView& view, uint64_t dataSize, void* data) 313 | { 314 | uint64_t dataAddress = reinterpret_cast(data); 315 | uint64_t dataEnd = dataAddress + dataSize; 316 | 317 | bool isValid = dataAddress % detail::ALIGNMENT == 0 && dataAddress + sizeof(LodGeometryInfo) <= dataEnd; 318 | 319 | if(isValid) 320 | { 321 | memcpy(reinterpret_cast(dataAddress), &view.info, sizeof(LodGeometryInfo)); 322 | dataAddress += (sizeof(LodGeometryInfo) + detail::ALIGN_MASK) & ~detail::ALIGN_MASK; 323 | } 324 | 325 | isValid = isValid && storeCached(view.lodMesh, dataEnd - dataAddress, reinterpret_cast(dataAddress)); 326 | dataAddress += getCachedSize(view.lodMesh); 327 | isValid = isValid && storeCached(view.lodHierarchy, dataEnd - dataAddress, reinterpret_cast(dataAddress)); 328 | dataAddress += getCachedSize(view.lodHierarchy); 329 | 330 | return isValid; 331 | } 332 | 333 | inline bool loadCached(LodGeometryView& view, uint64_t dataSize, const void* data) 334 | { 335 | uint64_t dataAddress = reinterpret_cast(data); 336 | uint64_t dataEnd = dataAddress + dataSize; 337 | 338 | bool isValid = true; 339 | 340 | if(dataAddress % detail::ALIGNMENT == 0 && dataAddress + sizeof(LodGeometryInfo) <= dataEnd) 341 | { 342 | memcpy(&view.info, data, sizeof(LodGeometryInfo)); 343 | dataAddress += (sizeof(LodGeometryInfo) + detail::ALIGN_MASK) & ~detail::ALIGN_MASK; 344 | } 345 | else 346 | { 347 | view = {}; 348 | return false; 349 | } 350 | 351 | isValid = isValid && loadCached(view.lodMesh, dataEnd - dataAddress, reinterpret_cast(dataAddress)); 352 | dataAddress += getCachedSize(view.lodMesh); 353 | isValid = isValid && loadCached(view.lodHierarchy, dataEnd - dataAddress, reinterpret_cast(dataAddress)); 354 | dataAddress += getCachedSize(view.lodHierarchy); 355 | return isValid; 356 | } 357 | 358 | class CacheHeader 359 | { 360 | public: 361 | CacheHeader() 362 | { 363 | std::fill(std::begin(data), std::end(data), 0); 364 | header = {}; 365 | } 366 | 367 | private: 368 | struct Header 369 | { 370 | uint64_t magic = 0x00646f6c6c63766eULL; // nvcllod 371 | uint32_t lodVersion = NVCLUSTERLOD_VERSION; 372 | uint32_t clusterVersion = NVCLUSTER_VERSION; 373 | }; 374 | 375 | union 376 | { 377 | Header header; 378 | uint8_t data[(sizeof(Header) + detail::ALIGNMENT - 1) & ~(detail::ALIGNMENT - 1)]; 379 | }; 380 | }; 381 | 382 | class CacheView 383 | { 384 | // Optionally if you want to have a simple cache file for this 385 | // data, we provide a canonical layout, and this simple class 386 | // to open it. 387 | // 388 | // The cache data must be stored in three sections: 389 | // 390 | #if 0 391 | struct CacheFile 392 | { 393 | // first: library version specific header 394 | CacheHeader header; 395 | // second: for each geometry serialized data of the `LodGeometryView` 396 | uint8_t geometryViewData[]; 397 | // third: offset table 398 | // offsets where each `LodGeometry` data is stored. 
399 | // ordered with ascending offsets
400 | // `geometryDataSize = geometryOffsets[geometryIndex + 1] - geometryOffsets[geometryIndex];`
401 | uint64_t geometryOffsets[geometryCount + 1];
402 | uint64_t geometryCount;
403 | };
404 | #endif
405 |
406 | public:
407 | bool isValid() const
408 | {
409 | return m_dataSize != 0;
410 | }
411 |
412 | bool init(uint64_t dataSize, const void* data)
413 | {
414 | m_dataSize = dataSize;
415 | m_dataBytes = reinterpret_cast<const uint8_t*>(data);
416 |
417 | if(dataSize <= sizeof(CacheHeader) + sizeof(uint64_t))
418 | {
419 | m_dataSize = 0;
420 | return false;
421 | }
422 |
423 | CacheHeader defaultHeader;
424 |
425 | if(memcmp(data, &defaultHeader, sizeof(CacheHeader)) != 0)
426 | {
427 | m_dataSize = 0;
428 | return false;
429 | }
430 |
431 | m_geometryCount = *getPointer<uint64_t>(m_dataSize - sizeof(uint64_t));
432 |
433 | if(dataSize <= (sizeof(CacheHeader) + sizeof(uint64_t) * (m_geometryCount + 2)))
434 | {
435 | m_dataSize = 0;
436 | return false;
437 | }
438 |
439 | m_tableStart = m_dataSize - sizeof(uint64_t) * (m_geometryCount + 2);
440 |
441 | return true;
442 | }
443 |
444 | void deinit()
445 | {
446 | *(this) = {};
447 | }
448 |
449 | uint64_t getGeometryCount() const
450 | {
451 | return m_geometryCount;
452 | }
453 |
454 | bool getLodGeometryView(LodGeometryView& view, uint64_t geometryIndex) const
455 | {
456 | if(geometryIndex >= m_geometryCount)
457 | {
458 | assert(0);
459 | return false;
460 | }
461 |
462 | const uint64_t* geometryOffsets = getPointer<uint64_t>(m_tableStart, m_geometryCount + 1);
463 | uint64_t base = geometryOffsets[geometryIndex];
464 |
465 | if(base + sizeof(LodGeometryInfo) > m_tableStart)
466 | {
467 | // this must not happen on a valid file
468 | assert(0);
469 | return false;
470 | }
471 |
472 | uint64_t geometryTotalSize = geometryOffsets[geometryIndex + 1] - base;
473 |
474 | const uint8_t* geoData = getPointer<uint8_t>(base, geometryTotalSize);
475 |
476 | return loadCached(view, geometryTotalSize, geoData);
477 | }
478 |
479 | private:
480 | template <class T>
481 | const T* getPointer(uint64_t offset, [[maybe_unused]] uint64_t count = 1) const
482 | {
483 | assert(offset + sizeof(T) * count <= m_dataSize);
484 | return reinterpret_cast<const T*>(m_dataBytes + offset);
485 | }
486 |
487 | uint64_t m_dataSize = 0;
488 | uint64_t m_tableStart = 0;
489 | const uint8_t* m_dataBytes = nullptr;
490 | uint64_t m_geometryCount = 0;
491 | };
492 | } // namespace nvclusterlod
493 |
--------------------------------------------------------------------------------
/doc/hierarchy_selection.svg:
--------------------------------------------------------------------------------
[SVG figure; the vector markup was lost in extraction. A hierarchy-selection diagram contrasting node selection across levels of detail; only the text labels "LOD 1" and "LOD 0" survive.]
--------------------------------------------------------------------------------
/src/nvclusterlod_hierarchy.cpp:
--------------------------------------------------------------------------------
1 | /*
2 | * Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.
3 | *
4 | * Licensed under the Apache License, Version 2.0 (the "License");
5 | * you may not use this file except in compliance with the License.
6 | * You may obtain a copy of the License at
7 | *
8 | * http://www.apache.org/licenses/LICENSE-2.0
9 | *
10 | * Unless required by applicable law or agreed to in writing, software
11 | * distributed under the License is distributed on an "AS IS" BASIS,
12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13 | * See the License for the specific language governing permissions and
14 | * limitations under the License.
15 | *
16 | * SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION
17 | * SPDX-License-Identifier: Apache-2.0
18 | */
19 |
20 | #include
21 | #include
22 | #include
23 | #include
24 | #include
25 | #include
26 | #include
27 | #include
28 | #include
29 | #include
30 | #include
31 |
32 | // Create a 32-bit mask with the lowest bitCount bits set to 1.
33 | // bitCount must be less than 32.
34 | #define U32_MASK(bitCount) ((1u << (bitCount)) - 1u)
35 |
36 | namespace nvclusterlod {
37 |
38 | // From the set of input nodes, cluster them according to their spatial location so each cluster contains at most maxClusterItems
39 | template
40 | static nvcluster_Result clusterNodesSpatially(nvcluster_Context context,
41 | std::span<const nvclusterlod_HierarchyNode> nodes,
42 | uint32_t maxClusterItems,
43 | nvcluster::ClusterStorage& clusters)
44 | {
45 | // For each node, compute its axis-aligned bounding box and centroid location
46 | std::vector<nvcluster::AABB> triangleClusterAabbs(nodes.size());
47 | std::vector<nvcluster::vec3f> triangleClusterCentroids(nodes.size());
48 |
49 | parallel_batches(nodes.size(), [&](uint64_t nodeIndex) {
50 | const nvclusterlod_HierarchyNode& node = nodes[nodeIndex];
51 | auto boundingSphere = std::bit_cast<nvclusterlod::Sphere>(node.boundingSphere);
52 | triangleClusterAabbs[nodeIndex] = {boundingSphere.center - boundingSphere.radius,
53 | boundingSphere.center + boundingSphere.radius};
54 | triangleClusterCentroids[nodeIndex] = boundingSphere.center;
55 | });
56 |
57 | // Call the clusterizer to group the nodes
58 | nvcluster_Input clusterBounds{
59 | .itemBoundingBoxes = reinterpret_cast<const nvcluster_AABB*>(triangleClusterAabbs.data()),
60 | .itemCentroids = reinterpret_cast<const nvcluster_Vec3f*>(triangleClusterCentroids.data()),
61 | .itemCount = uint32_t(triangleClusterAabbs.size()),
62 | };
63 |
64 | nvcluster_Config config = {
65 | .minClusterSize = maxClusterItems,
66 | .maxClusterSize = maxClusterItems,
67 | };
68 |
69 | nvcluster_Result result = nvcluster::generateClusters(context, config, clusterBounds, clusters);
70 | return result;
71 | }
72 |
73 | // Find the sphere within spheres that lies farthest from the target sphere,
74 | // accounting for the radii of the spheres
75 | inline const nvclusterlod::Sphere farthestSphere(std::span<const nvclusterlod::Sphere> spheres, const nvclusterlod::Sphere& target)
76 | {
77 | const nvclusterlod::Sphere* result = &target;
78 | float maxDist = 0.0f;
79 | for(size_t sphereIndex = 0; sphereIndex < spheres.size(); sphereIndex++)
80 | {
81 | const nvclusterlod::Sphere& candidate = spheres[sphereIndex];
82 | float dist = nvcluster::length(candidate.center - target.center) + candidate.radius + target.radius;
83 | if(std::isinf(dist) || dist >
84 |     {
85 |       maxDist = dist;
86 |       result = &candidate;
87 |     }
88 |   }
89 |   return *result;
90 | }
91 | 
92 | // Create a sphere that bounds all the input spheres
93 | static inline nvclusterlod_Result makeBoundingSphere(std::span<const nvclusterlod::Sphere> spheres, nvclusterlod::Sphere& sphere)
94 | {
95 |   if(spheres.empty())
96 |   {
97 |     return nvclusterlod_Result::NVCLUSTERLOD_ERROR_MAKE_BOUNDING_SPHERES_FROM_EMPTY_SET;
98 |   }
99 | 
100 |   // Loosely based on Ritter's bounding sphere algorithm, extended to include
101 |   // sphere radii. Not verified, but I can imagine it works.
102 |   const nvclusterlod::Sphere& x = spheres[0];
103 |   const nvclusterlod::Sphere y = farthestSphere(spheres, x);
104 |   const nvclusterlod::Sphere z = farthestSphere(spheres, y);
105 | 
106 |   // Make a sphere containing y and z
107 |   auto yz = z.center - y.center;
108 |   float dist = nvcluster::length(yz);
109 |   sphere = {y.center, (dist + y.radius + z.radius) * 0.5f};
110 |   // TODO: I bet normalize could cancel down somehow to avoid the
111 |   // singularity check
112 |   if(dist > 1e-10f)
113 |     sphere.center += yz * (sphere.radius - y.radius) / dist;
114 | 
115 |   // Grow the sphere to include the farthest sphere
116 |   const nvclusterlod::Sphere f = farthestSphere(spheres, sphere);
117 |   sphere.radius = nvcluster::length(f.center - sphere.center) + f.radius;
118 |   sphere.radius *= (1.0f + 5.0f * std::numeric_limits<float>::epsilon());  // try to ensure children are bounded despite rounding errors
119 |   if(std::isnan(sphere.center[0]) || std::isnan(sphere.center[1]) || std::isnan(sphere.center[2]) || std::isnan(sphere.radius))
120 |   {
121 |     return nvclusterlod_Result::NVCLUSTERLOD_ERROR_PRODUCED_NAN_BOUNDING_SPHERES;
122 |   }
123 | 
124 | #ifndef NDEBUG
125 |   for(size_t childIndex = 0; childIndex < spheres.size(); childIndex++)
126 |   {
127 |     assert(spheres[childIndex].radius <= sphere.radius);
128 |     assert(isInside(spheres[childIndex], sphere));
129 |   }
130 | #endif
131 |   return nvclusterlod_Result::NVCLUSTERLOD_SUCCESS;
132 | }
133 | 
134 | template <bool parallelize>
135 | nvclusterlod_Result buildHierarchy(nvclusterlod_Context context, const HierarchyInput& input, HierarchyOutput& output)
136 | {
137 |   // Build sets of generating groups that contributed clusters for decimation
138 |   // into each group.
139 |   nvclusterlod::GroupGeneratingGroups groupGeneratingGroups;
140 | 
141 |   std::span<const nvcluster_Range> groupClusterRangesCAPI(reinterpret_cast<const nvcluster_Range*>(input.groupClusterRanges.data()),
142 |                                                           input.groupClusterRanges.size());
143 |   nvclusterlod_Result result =
144 |       nvclusterlod::generateGroupGeneratingGroups(groupClusterRangesCAPI, input.clusterGeneratingGroups, groupGeneratingGroups);
145 |   if(result != nvclusterlod_Result::NVCLUSTERLOD_SUCCESS)
146 |   {
147 |     return result;
148 |   }
149 | 
150 |   // Compute cumulative bounding spheres and quadric errors. Cumulative bounding
151 |   // spheres avoid rendering overlapping geometry with a constant angular error
152 |   // threshold at the cost of producing significantly oversized bounding
153 |   // spheres.
154 |   for(size_t lodLevel = 0; lodLevel < input.lodLevelGroupRanges.size(); ++lodLevel)
155 |   {
156 |     const nvcluster::Range& lodGroupRange = input.lodLevelGroupRanges[lodLevel];
157 |     for(uint32_t group = lodGroupRange.offset; group < lodGroupRange.end(); group++)
158 |     {
159 |       if(lodLevel == 0)
160 |       {
161 |         // Find the bounding sphere for each group
162 |         result = makeBoundingSphere(input.clusterBoundingSpheres.subspan(input.groupClusterRanges[group].offset,
163 |                                                                          input.groupClusterRanges[group].count),
164 |                                     output.groupCumulativeBoundingSpheres[group]);
165 |         if(result != nvclusterlod_Result::NVCLUSTERLOD_SUCCESS)
166 |         {
167 |           return result;
168 |         }
169 |       }
170 |       else
171 |       {
172 |         // Higher LOD bounding spheres just include the generating group
173 |         // spheres. The current group will seemingly always be a subset.
174 |         // However, since the bounding sphere algorithm isn't perfectly tight
175 |         // it's possible that a cluster bounding sphere may be outside the one
176 |         // computed here. This isn't important for LOD but can be surprising if
177 |         // validated.
178 |         // TODO: only compute LOD0 clusterBoundingSpheres from triangles?
179 |         std::vector<nvclusterlod::Sphere> generatingSpheres;
180 |         const nvcluster_Range& generatingGroupRange = groupGeneratingGroups.ranges[group];
181 |         generatingSpheres.reserve(generatingGroupRange.count);
182 |         for(uint32_t indexInGeneratingGroups = generatingGroupRange.offset;
183 |             indexInGeneratingGroups < generatingGroupRange.offset + generatingGroupRange.count; indexInGeneratingGroups++)
184 |         {
185 |           uint32_t generatingGroup = groupGeneratingGroups.groups[indexInGeneratingGroups];
186 |           generatingSpheres.push_back(output.groupCumulativeBoundingSpheres[generatingGroup]);
187 |         }
188 |         result = makeBoundingSphere(generatingSpheres, output.groupCumulativeBoundingSpheres[group]);
189 |         if(result != nvclusterlod_Result::NVCLUSTERLOD_SUCCESS)
190 |         {
191 |           return result;
192 |         }
193 |       }
194 | 
195 |       // Compute cumulative quadric error
196 |       float maxGeneratingGroupQuadricError = 0.0f;
197 |       const nvcluster_Range& generatingGroupRange = groupGeneratingGroups.ranges[group];
198 |       for(uint32_t indexInGeneratingGroups = generatingGroupRange.offset;
199 |           indexInGeneratingGroups < generatingGroupRange.offset + generatingGroupRange.count; indexInGeneratingGroups++)
200 |       {
201 |         uint32_t generatingGroup = groupGeneratingGroups.groups[indexInGeneratingGroups];
202 |         maxGeneratingGroupQuadricError =
203 |             std::max(maxGeneratingGroupQuadricError, output.groupCumulativeQuadricError[generatingGroup]);
204 |       }
205 |       output.groupCumulativeQuadricError[group] = maxGeneratingGroupQuadricError + input.groupQuadricErrors[group];
206 |     }
207 |   }
208 | 
209 |   // Allocate the initial root node, just so it is first
210 |   size_t lodCount = input.lodLevelGroupRanges.size();
211 |   if(lodCount == 0)
212 |   {
213 |     return nvclusterlod_Result::NVCLUSTERLOD_ERROR_EMPTY_LOD_LEVELS;
214 |   }
215 |   if(lodCount >= NVCLUSTERLOD_NODE_MAX_CHILDREN)  // all LODs must fit into one root node
216 |   {
217 |     return nvclusterlod_Result::NVCLUSTERLOD_ERROR_LOD_LEVELS_OVERFLOW;
218 |   }
219 | 
220 |   // The very first node is the root node.
221 |   nvclusterlod_HierarchyNode& rootNode = output.nodes.allocate();
222 | 
223 |   // The root node children are next. Root children are a per-LOD spatial
224 |   // hierarchy. They are combined for convenience. Note that lodNodes are
225 |   // allocated here, but written after we have written their descendants.
226 |   uint32_t lodNodesGlobalOffset = output.nodes.allocatedCount();
227 |   OutputSpan lodNodes = output.nodes.allocate(uint32_t(lodCount));
228 | 
229 |   // Write the spatial hierarchy for each LOD level
230 |   for(size_t lodIndex = 0; lodIndex < lodCount; ++lodIndex)
231 |   {
232 |     // Create leaf nodes for each group of clusters.
233 |     std::vector<nvclusterlod_HierarchyNode> nodes;
234 |     nodes.reserve(input.lodLevelGroupRanges[lodIndex].count);
235 |     const nvcluster::Range& lodGroupRange = input.lodLevelGroupRanges[lodIndex];
236 |     for(uint32_t groupIndex = lodGroupRange.offset; groupIndex < lodGroupRange.end(); groupIndex++)
237 |     {
238 |       if(input.groupClusterRanges[groupIndex].count > NVCLUSTERLOD_GROUP_MAX_CLUSTERS)
239 |       {
240 |         return nvclusterlod_Result::NVCLUSTERLOD_ERROR_GROUP_CLUSTER_COUNT_OVERFLOW;
241 |       }
242 |       nvclusterlod_LeafNodeClusterGroup clusterGroup{
243 |           .isClusterGroup = 1,
244 |           .group = groupIndex & U32_MASK(23),
245 |           .clusterCountMinusOne = (input.groupClusterRanges[groupIndex].count - 1u) & U32_MASK(8),
246 |       };
247 |       assert(uint32_t(clusterGroup.clusterCountMinusOne) + 1 == input.groupClusterRanges[groupIndex].count);
248 |       nodes.push_back(nvclusterlod_HierarchyNode{
249 |           .clusterGroup = clusterGroup,
250 |           .boundingSphere = output.groupCumulativeBoundingSpheres[groupIndex],
251 |           .maxClusterQuadricError = output.groupCumulativeQuadricError[groupIndex],
252 |       });
253 |     }
254 | 
255 |     // Build traversal hierarchy per-LOD
256 |     // NOTE: could explore mixing nodes from different LODs near the top of the
257 |     // tree to improve parallelism. Ideally the result could be N root nodes
258 |     // rather than just one too.
259 |     while(nodes.size() > 1)
260 |     {
261 |       nvcluster::ClusterStorage nodeClusters;
262 |       nvcluster_Result clusterResult =
263 |           clusterNodesSpatially<parallelize>(context->clusterContext, nodes, NVCLUSTERLOD_NODE_MAX_CHILDREN, nodeClusters);
264 |       if(clusterResult != nvcluster_Result::NVCLUSTER_SUCCESS)
265 |       {
266 |         return nvclusterlod_Result::NVCLUSTERLOD_ERROR_CLUSTERING_NODES_FAILED;
267 |       }
268 |       std::vector<nvclusterlod_HierarchyNode> newNodes;
269 |       newNodes.reserve(nodeClusters.clusterItemRanges.size());
270 | 
271 |       for(size_t rangeIndex = 0; rangeIndex < nodeClusters.clusterItemRanges.size(); rangeIndex++)
272 |       {
273 |         const nvcluster_Range& range = nodeClusters.clusterItemRanges[rangeIndex];
274 |         std::span<const uint32_t> group = std::span(nodeClusters.items).subspan(range.offset, range.count);
275 |         if(group.empty() || group.size() > NVCLUSTERLOD_NODE_MAX_CHILDREN)
276 |         {
277 |           return nvclusterlod_Result::NVCLUSTERLOD_ERROR_NODE_CHILD_COUNT_OVERFLOW;
278 |         }
279 |         float maxClusterQuadricError = 0.0f;
280 |         std::vector<nvclusterlod::Sphere> boundingSpheres;
281 |         boundingSpheres.reserve(group.size());
282 |         for(uint32_t nodeIndex : group)
283 |         {
284 |           boundingSpheres.push_back(std::bit_cast<nvclusterlod::Sphere>(nodes[nodeIndex].boundingSphere));
285 |           maxClusterQuadricError = std::max(maxClusterQuadricError, nodes[nodeIndex].maxClusterQuadricError);
286 |         }
287 |         nvclusterlod_InternalNodeChildren nodeRange{
288 |             .isClusterGroup = 0,
289 |             .childOffset = output.nodes.allocatedCount() & U32_MASK(26),
290 |             .childCountMinusOne = uint32_t(group.size() - 1) & U32_MASK(5),
291 |         };
292 |         nvclusterlod::Sphere boundingSphere;
293 |         result = makeBoundingSphere(boundingSpheres, boundingSphere);
294 |         if(result != nvclusterlod_Result::NVCLUSTERLOD_SUCCESS)
295 |         {
296 |           return result;
297 |         }
298 |         newNodes.push_back(nvclusterlod_HierarchyNode{
299 |             .children = nodeRange,
300 |             .boundingSphere = boundingSphere,
301 |             .maxClusterQuadricError = maxClusterQuadricError,
302 |         });
303 | 
304 |         for(const uint32_t& nodeIndex : group)
305 |         {
306 |           output.nodes.append(nodes[nodeIndex]);
307 |         }
308 |       }
309 |       std::swap(nodes, newNodes);
310 |     }
311 |     assert(nodes.size() == 1);
312 | 
313 |     // Always traverse the lowest detail LOD by making the sphere radius huge.
314 |     // The application may want this information, but can read it from the last
315 |     // groupCumulativeBoundingSpheres instead.
316 |     if(lodIndex == lodCount - 1)
317 |     {
318 |       nodes[0].boundingSphere = {{0.0f, 0.0f, 0.0f}, std::numeric_limits<float>::max()};
319 |     }
320 |     lodNodes.append(nodes);
321 |   }
322 |   assert(lodNodes.allocatedCount() == lodNodes.capacity());
323 | 
324 |   // Link the per-LOD trees into a single root node
325 |   // TODO: would need to combine recursively to support more than
326 |   // NVCLUSTERLOD_NODE_MAX_CHILDREN LOD levels
327 |   {
328 |     float maxClusterQuadricError = 0.0f;
329 |     for(const nvclusterlod_HierarchyNode& node : lodNodes.allocated())
330 |       maxClusterQuadricError = std::max(maxClusterQuadricError, node.maxClusterQuadricError);
331 |     nvclusterlod_InternalNodeChildren nodeRange{
332 |         .isClusterGroup = 0,
333 |         .childOffset = lodNodesGlobalOffset & U32_MASK(26),
334 |         .childCountMinusOne = (lodNodes.allocatedCount() - 1) & U32_MASK(5),
335 |     };
336 |     if(uint32_t(nodeRange.childCountMinusOne + 1) != lodNodes.allocatedCount())
337 |     {
338 |       return nvclusterlod_Result::NVCLUSTERLOD_ERROR_NODES_OVERFLOW;
339 |     }
340 |     rootNode = nvclusterlod_HierarchyNode{
341 |         .children = nodeRange,
342 |         .boundingSphere = {{0.0f, 0.0f, 0.0f}, std::numeric_limits<float>::max()},  // always include everything
343 |         .maxClusterQuadricError = maxClusterQuadricError,
344 |     };
345 |   }
346 |   return nvclusterlod_Result::NVCLUSTERLOD_SUCCESS;
347 | }
348 | 
349 | }  // namespace nvclusterlod
350 | 
351 | // Compute the number of nodes required to store the LOD hierarchy for the given input
352 | nvclusterlod_Result nvclusterlodGetHierarchyRequirements(nvclusterlod_Context,
353 |                                                          const nvclusterlod_HierarchyInput* input,
354 |                                                          nvclusterlod_HierarchyCounts* counts)
355 | {
356 |   *counts = nvclusterlod_HierarchyCounts{
357 |       .nodeCount = uint32_t(input->clusterCount + 1),
358 |   };
359 |   return nvclusterlod_Result::NVCLUSTERLOD_SUCCESS;
360 | }
361 | 
362 | // C API entrypoint
363 | nvclusterlod_Result nvclusterlodBuildHierarchy(nvclusterlod_Context context,
364 |                                                const nvclusterlod_HierarchyInput* input,
365 |                                                nvclusterlod_HierarchyOutput* output)
366 | {
367 |   nvclusterlod::HierarchyOutput outputAllocator(*output, input->groupCount);
368 | #if !defined(NVCLUSTERLOD_MULTITHREADED) || NVCLUSTERLOD_MULTITHREADED
369 |   auto buildHierarchy = context->parallelize ? nvclusterlod::buildHierarchy<true> : nvclusterlod::buildHierarchy<false>;
370 | #else
371 |   auto buildHierarchy = nvclusterlod::buildHierarchy<false>;
372 | #endif
373 |   if(nvclusterlod_Result r = buildHierarchy(context, nvclusterlod::HierarchyInput::fromCAPI(*input), outputAllocator);
374 |      r != nvclusterlod_Result::NVCLUSTERLOD_SUCCESS)
375 |   {
376 |     return r;
377 |   }
378 |   outputAllocator.writeCounts(*output);
379 |   return nvclusterlod_Result::NVCLUSTERLOD_SUCCESS;
380 | }
381 | 
--------------------------------------------------------------------------------
/doc/graph_cut.svg:
--------------------------------------------------------------------------------
(SVG figure: selecting clusters as a cut of the LOD DAG)
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # nv_cluster_lod_builder
2 | 
3 | > [!IMPORTANT]
4 | > This repository has been archived and is no longer maintained by NVIDIA. It
5 | > was an R&D project to provide a simple and quick start for ray tracing
6 | > continuous LOD with
7 | > [nv_cluster_builder](https://github.com/nvpro-samples/nv_cluster_builder). For
8 | > similar actively maintained code, please refer to
9 | > [`meshoptimizer/demo/clusterlod.h`](https://github.com/zeux/meshoptimizer/blob/master/demo/clusterlod.h).
10 | 
11 | ![](doc/lod_stitch.svg)
12 | 
13 | **nv_cluster_lod_builder** is a _continuous_ level of detail (LOD) mesh library.
14 | Continuous LOD allows for fine-grained control over geometric detail within a
15 | mesh, compared to traditional discrete LOD. Clusters of triangles are carefully
16 | precomputed by decimating the original mesh in such a way that they can be
17 | seamlessly combined across different LOD levels. At rendering time, a subset of
18 | these clusters is selected to adaptively provide the required amount of detail
19 | as the camera navigates the scene.
20 | 
21 | Key features of continuous LOD systems include:
22 | 
23 | - **Fast rendering with more detail:** Triangles are allocated where they are
24 |   most needed.
25 | - **Reduced memory usage with geometry streaming:** Particularly beneficial for
26 |   ray tracing applications.
27 | 
28 | This library serves as a quick placeholder or learning tool, demonstrating the
29 | basics of creating continuous LOD data. For a reference implementation of the
30 | rendering system, see
31 | https://github.com/nvpro-samples/vk_lod_clusters.
32 | 
33 | **Input:** a triangle mesh with millions of triangles
34 | 
35 | **Output:**
36 | 
37 | 1. [nvclusterlod/nvclusterlod_mesh.h](include/nvclusterlod/nvclusterlod_mesh.h) - decimated clusters of the
38 |    original mesh, with groupings and relations to other groups
39 | 2. [nvclusterlod/nvclusterlod_hierarchy.h](include/nvclusterlod/nvclusterlod_hierarchy.h) - a spatial
40 |    hierarchy of *cluster groups* to improve performance of runtime cluster
41 |    selection for rendering
42 | 
43 | To render, select *cluster groups* where:
44 | 
45 | - Detail or decimation error of the group is small enough, relative to the camera
46 | - Detail or decimation error of the group's decimated geometry is not small enough
47 | 
48 | This is the gist, but the library also does some massaging of the values that
49 | feed into these checks to make sure multiple LODs do not render over the top of
50 | each other. See below.
51 | 
52 | Geometry can be streamed in when needed to save memory.
53 | 
54 | **Table of Contents**
55 | 
56 | - [How it works](#how-it-works)
57 |   - [Building LODs](#building-lods)
58 |   - [Selecting Clusters](#selecting-clusters)
59 |   - [Spatial Hierarchy](#spatial-hierarchy)
60 |   - [Streaming](#streaming)
61 |   - [References](#references)
62 | - [Usage Example](#usage-example)
63 | - [Build Integration](#build-integration)
64 |   - [Dependencies](#dependencies)
65 | - [License](#license)
66 | - [Limitations](#limitations)
67 | 
68 | ## How it works
69 | 
70 | The key to continuous LOD is a decimation strategy that allows regular
71 | watertight LOD transitions across a mesh. Such transitions require borders that
72 | match on both sides, which are obtained by keeping the triangle edges on the
73 | border fixed during decimation. Since these edges do not change, successive
74 | iterations of decimation must choose different borders and fix new edges to let
75 | the old ones decimate.
76 | 
77 | To explain why, consider forming groups of triangles and decimating the
78 | triangles within, then grouping the decimated groups and decimating again,
79 | *recursively*, until there is just one root group. In this case, some of the
80 | vertices would remain fixed across the entire hierarchy, and would be decimated
81 | only when the last two groups are grouped and decimated to form the coarsest
82 | LOD. To avoid this, new groups must instead be allowed to cross any border, and
83 | in fact encouraged to.
84 | 
85 | This library makes groups of geometry and decimates within *groups*. Decimated
86 | geometry is then re-grouped, encouraging crossing the old groups' borders when
87 | forming new groups. Groups are made from clusters of triangles rather than just
88 | triangles for performance reasons. A group is a cluster of clusters of
89 | triangles. Whole triangle clusters are swapped in and out at runtime for detail
90 | transitions.
91 | 
92 | ### Building LODs
93 | 
94 | ![](doc/lod_generate.jpg)
95 | 
96 | The image above shows the process that is repeated to create LODs until there is
97 | just a single cluster representing the whole mesh:
98 | 
99 | 1. Make clusters [, within old borders]
100 | 
101 |    This library uses
102 |    [nv_cluster_builder](https://github.com/nvpro-samples/nv_cluster_builder)'s
103 |    segmented API to make clusters of a fixed size from triangles within groups
104 |    of the previous iteration, or globally for the first iteration.
105 | 
106 | 2. Group clusters [, crossing old borders]
107 | 
108 |    This is just making clusters of clusters, but with a catch: border edges
109 |    cannot decimate, so it is important to encourage grouping clusters in a way
110 |    that keeps old borders internal to the group. Then the previously locked
111 |    edges are free to decimate. This is done by adding a connection and weight
112 |    between clusters sharing many vertices (locked ones in particular) and
113 |    optimizing for a *minimum cut* when making cluster groups with
114 |    [nv_cluster_builder](https://github.com/nvpro-samples/nv_cluster_builder).
115 | 
116 |    If there is only one cluster in one group, the operation is complete.
117 | 
118 | 3. Decimate within groups, keep border
119 | 
120 |    Vertices shared between groups are computed and locked before using
121 |    [meshoptimizer](https://github.com/zeux/meshoptimizer)'s `simplify` to
122 |    decimate each cluster group. The aim is to halve the number of triangles.
123 |    These become the input to the next iteration.
124 | 
125 | The code is documented and intended to be read as well. These steps can be found
126 | in `nvclusterlodMeshCreate()` at the bottom of
127 | [`nvclusterlod_mesh.cpp`](src/nvclusterlod_mesh.cpp).
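
In rough pseudocode, the whole build boils down to repeating those steps in a
loop. The types and helper functions below are illustrative stand-ins, not this
library's API:

```cpp
#include <cstdint>
#include <vector>

struct Cluster { std::vector<uint32_t> triangles; };  // hypothetical: a fixed-size triangle cluster
struct Group   { std::vector<Cluster>  clusters; };   // hypothetical: a cluster of clusters

// Hypothetical helpers implementing the three steps above
std::vector<Group>   groupClusters(const std::vector<Cluster>& clusters);        // 2. min-cut grouping
Group                decimateKeepingBorder(const Group& group);                  // 3. borders locked, ~halve triangles
std::vector<Cluster> makeClustersWithinGroups(const std::vector<Group>& groups); // 1. recluster per group

std::vector<Cluster> buildLods(std::vector<Cluster> clusters /* finest detail, LOD 0 */)
{
  std::vector<Cluster> allLods = clusters;
  while(clusters.size() > 1)
  {
    // 2. Group clusters, encouraging groups that cross the previous
    //    iteration's borders so the old locked edges become free to decimate
    std::vector<Group> groups = groupClusters(clusters);

    // 3. Decimate within each group; only vertices shared between groups stay locked
    for(Group& group : groups)
      group = decimateKeepingBorder(group);

    // 1. Make fixed-size clusters within the old groups for the next, coarser iteration
    clusters = makeClustersWithinGroups(groups);
    allLods.insert(allLods.end(), clusters.begin(), clusters.end());
  }
  return allLods;  // clusters of every LOD level, finest first
}
```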
128 | 
129 | When decimating, the generating group is tracked. This is the geometry each
130 | cluster was decimated from. A cluster's group is one of many groups generated by
131 | decimating its generating group. Similarly, a group has many generating groups.
132 | Clusters are selected in the intersections of groups and generating groups -
133 | perhaps something that could be exploited to optimize the decision making. The
134 | term *parent* is avoided due to possible confusion between the originating
135 | geometry and the direction to the root node.
136 | 
137 | ![](doc/lod_dag.svg)
138 | 
139 | The image above shows an example 2D illustration with colored groups, their
140 | clusterings and relationships. Notably, two groups of clusters may produce
141 | decimated clusters that are both part of a new group. This allows group borders
142 | to be decimated after each iteration. The relationships form a directed acyclic
143 | graph (DAG), i.e. not a tree, with the constraint that relationships don't skip
144 | levels - but maybe that could help with uneven detail? LOD transitions may only
145 | happen across group borders, which places a limit on the rate of LOD change.
146 | 
147 | The output data are:
148 | 
149 | - Clusters of triangles, referencing vertices in the original mesh
150 | - Groupings of clusters and their relationships:
151 |   - Generating geometry, input to decimation
152 |   - Generated geometry, decimation output
153 | - Group bounding spheres
154 | - Group decimation quadric error
155 | 
156 | ### Selecting Clusters
157 | 
158 | The first step is to pick the goal. A couple of examples are:
159 | 
160 | 1. Pixel-sized triangles?
161 | 2. Sub-pixel-sized geometric error?
162 | 
163 | The latter may be more efficient if, for example, large triangles give the same
164 | visual result. It may also be more challenging to quantify, particularly if
165 | decimation introduces error not captured by the metric. For the moment this
166 | library uses [*quadric
167 | error*](https://www.cs.cmu.edu/~./garland/Papers/quadrics.pdf), an approximate
168 | measure of the object-space distance between the decimated mesh and the original
169 | high-resolution mesh. Inaccuracies from decimating vertex attributes such as
170 | normals and UVs are currently ignored.
171 | 
172 | A conservative maximum vertex position error is maintained for all cluster
173 | groups. This is the farthest any vertex may be from representing the original
174 | surface. When rendering we ask, "what is the largest possible angular error from
175 | the camera?" for a particular group. We then want to render geometry when its
176 | error is just less than a threshold, but not any overlapping geometry.
177 | 
178 | ![](doc/arcsin_angular_error.svg)
179 | 
180 | The farthest a decimated vertex may be from the surface it represents is the
181 | quadric error. This error appears larger in screen space nearer the camera, so
182 | the nearest point on the group's bounding sphere is chosen. The largest possible
183 | angular error from the camera is then the angular size of a sphere with quadric
184 | error radius at that point. Convenient and simple: the arcsine of the error
185 | divided by the distance to the closest point on the bounding sphere. A target
186 | threshold can be chosen based on a single pixel's FOV at the center of the
187 | projection - to keep any geometric error less than the size of a pixel. This
188 | avoids varying the threshold across the image, which would further complicate a
189 | problem yet to solve.
190 | 
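As a minimal sketch, assuming GLM for the math and a `Sphere` stand-in for the
library's `nvclusterlod_Sphere`, the check could look like this (a real
implementation would also need to account for any scale in the transform, which
affects the radius and error):

```cpp
#include <algorithm>
#include <glm/glm.hpp>

struct Sphere { glm::vec3 center; float radius; };  // stand-in for nvclusterlod_Sphere

// The worst-case angular error is asin(error / distance) at the closest point
// on the bounding sphere. The ratio itself is monotonic in the angle, so it
// can be compared directly against a precomputed sin(thresholdAngle).
float errorOverDistance(const glm::mat4& objectToEye, const Sphere& sphere, float quadricError)
{
  // Sphere center in eye space
  glm::vec3 centerInEye = glm::vec3(objectToEye * glm::vec4(sphere.center, 1.0f));

  // Distance to the closest point on the sphere, clamped for when the eye is inside it
  float distance = std::max(glm::length(centerInEye) - sphere.radius, 1e-6f);

  return quadricError / distance;
}
```
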
191 | ![](doc/graph_cut.svg)
192 | 
193 | We have a target goal and a way to compute it, but how can we guarantee a single
194 | unique continuous surface, i.e. no holes and no overlaps? An ideal solution
195 | would be to pick clusters that satisfy the angular error threshold but constrain
196 | the rest to making only a single LOD transition per group. That would require
197 | traversing the graph with its adjacency information, visualized above. The term
198 | for this is making a graph cut, and it would be challenging to do quickly and in
199 | parallel on a GPU.
200 | 
201 | Ideally, we want to test whether to render each cluster independently. We could
202 | render geometry where its error is the first below the threshold, i.e. its
203 | decimated error is greater. That alone would actually guarantee no holes, but
204 | there would still be overlaps, e.g. two clusters that represent the same surface
205 | being drawn at once. This can happen when the bounding sphere of a decimated
206 | group is so far from the camera that its conservative angular error is smaller
207 | than the angular error of a group it was decimated from.
208 | 
209 | The solution implemented by this library is to artificially increase the size of
210 | the bounding spheres such that the nearest point to the camera is always nearer
211 | than that on a bounding sphere of its generating geometry. In short, make
212 | bounding spheres bound the generating geometry too. Once done, a single
213 | watertight mesh can be stitched together from independent parallel decisions. In
214 | general, the angular error, or whatever metric is compared to a threshold, must
215 | never decrease with each level of decimation. The failure above was a decrease
216 | due to the size distortion of a perspective projection.
217 | 
218 | One derivation glossed over so far is why bounding spheres and errors are stored
219 | per group. The simple answer is that for LOD transitions to work, the entire
220 | group must change LOD at the same time, so all clusters in a group must share
221 | the same values.
222 | 
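A sketch of this construction, mirroring what `buildHierarchy()` does in
[src/nvclusterlod_hierarchy.cpp](src/nvclusterlod_hierarchy.cpp): apart from
LOD 0, each group's cumulative sphere bounds the cumulative spheres of the
groups it was decimated from. `boundingSphereOf()` stands in for its
Ritter-style `makeBoundingSphere()`, and `Sphere` is the stand-in type from the
sketch above:

```cpp
#include <cstdint>
#include <vector>

Sphere boundingSphereOf(const std::vector<Sphere>& spheres);  // hypothetical, e.g. Ritter-style

// Groups must be ordered finest LOD first so that generating groups are
// already cumulative by the time a group references them.
void computeCumulativeSpheres(const std::vector<std::vector<uint32_t>>& generatingGroups,
                              const std::vector<Sphere>&                ownGroupSpheres,
                              std::vector<Sphere>&                      cumulative)
{
  for(size_t group = 0; group < generatingGroups.size(); ++group)
  {
    if(generatingGroups[group].empty())
      cumulative[group] = ownGroupSpheres[group];  // LOD 0: just bound the group's clusters
    else
    {
      std::vector<Sphere> spheres;
      for(uint32_t generatingGroup : generatingGroups[group])
        spheres.push_back(cumulative[generatingGroup]);
      cumulative[group] = boundingSphereOf(spheres);  // bound the generating geometry too
    }
  }
}
```
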
223 | ### Spatial Hierarchy
224 | 
225 | This library provides a spatial hierarchy of bounding spheres to search for
226 | cluster groups of the right LOD in the right spatial region relative to the
227 | camera.
228 | 
229 | One way to think of this: there are many high-detail clusters and few low-detail
230 | ones. If an object is far away, only the low-detail clusters should be checked.
231 | That is, the search can exit early if it is known that all remaining
232 | clusters are too detailed.
233 | 
234 | ![](doc/spatial_selection.svg)
235 | 
236 | Another way of thinking about this is that at a certain distance range from the
237 | camera, as shown in the image above, only clusters within certain bounding
238 | sphere radius and quadric error ranges should be rendered. Thus, the search
239 | space can be reduced by conservatively searching only that region. An r-tree
240 | could work well here too.
241 | 
242 | The hierarchy is actually a set of hierarchies - one for each LOD level. For
243 | convenience, the per-level roots are merged, since the application would need to
244 | search all levels anyway, or at least their roots.
245 | 
246 | Leaf nodes point to cluster groups and are initialized with the group's
247 | decimated cluster maximum quadric error (i.e. from the next level\*) and the
248 | group's bounding sphere. The hierarchy is built by recursively spatially
249 | clustering nodes - not fast, but it works and it isn't a bottleneck yet.
250 | Internal nodes are given the maximum quadric error and bounding sphere of their
251 | children.
252 | 
253 | \*The group quadric error is the error of the generated group's clusters, i.e.
254 | after decimation, not the error in the group's own clusters. This avoids
255 | unnecessarily storing a per-cluster error.
256 | 
257 | ![](doc/hierarchy_selection.svg)
258 | 
259 | The tree can be traversed using the same angular error check as for cluster
260 | groups, exiting when the node's error is less than the threshold. The trees for
261 | LODs with too fine detail will exit early. The blue crosses in the above image
262 | show an example - those nodes are already below the threshold. Since traversed
263 | leaf nodes have already been checked to be above the threshold, and they are
264 | initialized with their clusters' generated group's error, their clusters only
265 | need to check that they are below the threshold in order to be selected for
266 | rendering. Note that the entire group may not necessarily be drawn. For example,
267 | two of the yellow clusters were not below the threshold (red cross). This same
268 | check is made by the blue group's leaf node and blue clusters are drawn instead.
269 | 
270 | While it is possible to exit early from a tree with too coarse detail, it may
271 | interfere with streaming, depending on how dependencies are implemented.
272 | 
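A sketch of this traversal, using the node layout written by
[src/nvclusterlod_hierarchy.cpp](src/nvclusterlod_hierarchy.cpp) and assuming
an `errorOverDistance()` overload for the library's sphere type (see the
earlier sketch):

```cpp
#include <span>
#include <vector>

std::vector<uint32_t> selectGroups(std::span<const nvclusterlod_HierarchyNode> nodes,
                                   const glm::mat4& objectToEye, float threshold)
{
  std::vector<uint32_t> groups;     // cluster groups whose clusters should then be tested individually
  std::vector<uint32_t> stack{0u};  // the first node is the root
  while(!stack.empty())
  {
    const nvclusterlod_HierarchyNode& node = nodes[stack.back()];
    stack.pop_back();

    // Too detailed from here on: the whole subtree is already below the threshold
    if(errorOverDistance(objectToEye, node.boundingSphere, node.maxClusterQuadricError) < threshold)
      continue;

    if(node.clusterGroup.isClusterGroup)
      groups.push_back(node.clusterGroup.group);  // leaf: a cluster group
    else
      for(uint32_t child = 0; child <= node.children.childCountMinusOne; ++child)
        stack.push_back(node.children.childOffset + child);
  }
  return groups;
}
```
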
273 | ### Streaming
274 | 
275 | This is not a definitive how-to, but outlines some ideas for getting started
276 | with streaming continuous LOD.
277 | 
278 | The first feature needed for streaming is indirection - e.g. pointers to cluster
279 | groups that are initially null and can be populated over time (outside of
280 | cluster selection and rendering). Then, cluster groups in leaf nodes encountered
281 | during hierarchy traversal must be marked and streamed in. Finally, minor
282 | changes are needed for selecting clusters:
283 | 
284 | - Obviously, don't traverse leaf nodes whose groups have not been loaded yet
285 | - Consider clusters to be below the threshold if their generating group has not been loaded
286 | 
287 | Choosing to keep lower detail geometry loaded greatly simplifies things. That
288 | is, making sure decimated geometry is loaded first. This happens naturally due
289 | to traversal order, but tracking dependencies host-side may be needed if
290 | streaming less than everything-at-once from traversal.
291 | 
292 | Initially, streaming at the granularity of cluster groups and using the
293 | generated group indices directly as dependencies is straightforward. Cluster
294 | groups could also be combined for coarser streaming granularity, with a new set
295 | of dependencies.
296 | 
297 | **Simple Streaming**
298 | 
299 | Compute per-group needed flags during traversal. Emit load/unload events on
300 | rising/falling edges. Fulfil those events in full and set or unset the pointers
301 | to the new data between traversal and rendering. Some filtering, such as
302 | per-group frame age, may be useful to avoid frequently unloading and reloading groups.
303 | 
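A sketch of the edge detection, with illustrative names; `neededNow` would be
recomputed by hierarchy traversal each frame:

```cpp
#include <cstdint>
#include <vector>

void emitStreamingEvents(const std::vector<bool>& neededNow,
                         std::vector<bool>&       neededBefore,  // persists across frames
                         std::vector<uint32_t>&   loadEvents,
                         std::vector<uint32_t>&   unloadEvents)
{
  for(uint32_t group = 0; group < uint32_t(neededNow.size()); ++group)
  {
    if(neededNow[group] && !neededBefore[group])
      loadEvents.push_back(group);    // rising edge: traversal reached this group
    else if(!neededNow[group] && neededBefore[group])
      unloadEvents.push_back(group);  // falling edge: filter by frame age to avoid thrashing
    neededBefore[group] = neededNow[group];
  }
}
```
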
304 | **Continuous Streaming**
305 | 
306 | The simple streaming above has little control over the amount streamed per
307 | batch. This can be improved by adding batch size limits and queues. Note that
308 | streaming partially will require manually resolving dependency order. Some ideas
309 | are:
310 | 
311 | 1. Limit the number of load/unload events emitted per frame
312 | 2. Add a global event queue
313 |    - Delay unloads and ignore pulses by comparing events at the front of the
314 |      queue with the most recent events inserted into the back of the queue.
315 |    - Prioritise events by a detail metric so geometry loads evenly on screen
316 | 3. Set a fixed memory limit (even a memory pool) and/or a fixed cluster/group count
317 |    - Prioritise loading until memory is exhausted
318 |    - Then only unload until memory is reclaimed
319 | 4. To maintain dependency loading in topological order, expand events after the global queue
320 |    - Recursively load generated groups first
321 |    - Unload a group only if it is not a dependency of another group
322 |    - The order of dependency resolution must not be changed after this step in the pipeline, but batches can still be formed
323 | 5. Form batches during dependency loading
324 |    - The memory limit may be hit during dependency expansion
325 |    - Must not include a load and unload for the same item in a batch, assuming
326 |      batches are executed in parallel
327 | 
328 | ### References
329 | 
330 | - [(1989) A pyramidal data structure for triangle-based surface description](https://ieeexplore.ieee.org/document/19053)
331 | - [(1995) On Levels of Detail in Terrains](https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=3fa8c74a44f02aaaa18fe2d3cfdedfc9b8dbc50a)
332 | - [(1998) Efficient Implementation of Multi-Triangulations](https://dl.acm.org/doi/10.5555/288216.288222)
333 | - [(2001) Visualization of Large Terrains Made Easy](https://ieeexplore.ieee.org/document/964533)
334 | - [(2005) Batched Multi Triangulation](https://ieeexplore.ieee.org/document/1532797)
335 | - [(2021) A Deep Dive into Unreal Engine 5's Nanite](https://advances.realtimerendering.com/s2021/Karis_Nanite_SIGGRAPH_Advances_2021_final.pdf) ([video](https://www.youtube.com/watch?v=eviSykqSUUw))
336 | - [(2023) Real-Time Ray Tracing of Micro-Poly Geometry with Hierarchical Level of Detail](https://www.intel.com/content/www/us/en/developer/articles/technical/real-time-ray-tracing-of-micro-poly-geometry.html) ([video](https://www.youtube.com/watch?v=Tx32yi_0ETY))
337 | 
338 | ## Usage Example
339 | 
340 | For a complete usage example, see https://github.com/nvpro-samples/vk_lod_clusters.
341 | 
342 | To create LOD data with this library:
343 | 
344 | ```cpp
345 | #include <glm/glm.hpp>
346 | #include <nvclusterlod/nvclusterlod_hierarchy_storage.hpp>
347 | #include <nvclusterlod/nvclusterlod_mesh_storage.hpp>
348 | #include <vector>
349 | 
350 | ...
351 | 
352 | // Create contexts. One day they may store persistent resources between
353 | // executions. For now, they are empty and cheap to create per call.
354 | nvcluster_Context context;
355 | nvcluster_ContextCreateInfo contextCreateInfo{};
356 | nvclusterCreateContext(&contextCreateInfo, &context); // Add error checking
357 | 
358 | nvclusterlod_Context lodContext;
359 | nvclusterlod_ContextCreateInfo lodContextCreateInfo{.clusterContext = context};
360 | nvclusterlodCreateContext(&lodContextCreateInfo, &lodContext); // Add error checking, don't leak context etc.
361 | 
362 | // Input mesh
363 | std::vector<uint32_t> indices = ...;
364 | std::vector<glm::vec3> positions = ...;
365 | 
366 | // Create decimated clusters
367 | const nvclusterlod_MeshInput meshInput{
368 |     // Mesh data
369 |     .indices      = reinterpret_cast<const nvcluster_Vec3u*>(indices.data()),
370 |     .indexCount   = static_cast<uint32_t>(indices.size()),
371 |     .vertices     = reinterpret_cast<const nvcluster_Vec3f*>(positions.data()),
372 |     .vertexCount  = static_cast<uint32_t>(positions.size()),
373 |     .vertexStride = sizeof(nvcluster_Vec3f),
374 |     // Use default configurations and decimation factor:
375 |     .clusterConfig      = {},
376 |     .clusterGroupConfig = {},
377 |     .decimationFactor   = 0.5f,
378 | };
379 | 
380 | nvclusterlod::LocalizedLodMesh mesh;
381 | nvclusterlod::generateLocalizedLodMesh(lodContext, meshInput, mesh); // Add error checking, don't leak lodContext, context etc.
382 | 
383 | // Build a spatial hierarchy for faster selection
384 | const nvclusterlod_HierarchyInput hierarchyInput{
385 |     .clusterGeneratingGroups = mesh.lodMesh.clusterGeneratingGroups.data(),
386 |     .groupQuadricErrors      = mesh.lodMesh.groupQuadricErrors.data(),
387 |     .groupClusterRanges      = mesh.lodMesh.groupClusterRanges.data(),
388 |     .groupCount              = static_cast<uint32_t>(mesh.lodMesh.groupClusterRanges.size()),
389 |     .clusterBoundingSpheres  = mesh.lodMesh.clusterBoundingSpheres.data(),
390 |     .clusterCount            = static_cast<uint32_t>(mesh.lodMesh.clusterBoundingSpheres.size()),
391 |     .lodLevelGroupRanges     = mesh.lodMesh.lodLevelGroupRanges.data(),
392 |     .lodLevelCount           = static_cast<uint32_t>(mesh.lodMesh.lodLevelGroupRanges.size())
393 | };
394 | 
395 | nvclusterlod::LodHierarchy hierarchy;
396 | nvclusterlod::generateLodHierarchy(lodContext, hierarchyInput, hierarchy); // Add error checking, don't leak lodContext, context etc.
397 | 
398 | // Upload mesh and hierarchy to the GPU. These are both simple structures of arrays.
399 | ...
400 | 
401 | // If not wrapping the C API in RAII types, destroy the contexts manually
402 | nvclusterlodDestroyContext(lodContext);
403 | nvclusterDestroyContext(context);
404 | ```
405 | 
406 | **Rendering whole levels of detail**
407 | 
408 | ```cpp
409 | // For each LOD level (highest detail first)
410 | for(size_t lod = 0; lod < mesh.lodMesh.lodLevelGroupRanges.size(); lod++)
411 | {
412 |   const nvcluster_Range& lodLevelGroupRange = mesh.lodMesh.lodLevelGroupRanges[lod];
413 |   glBegin(GL_TRIANGLES); // Naive OpenGL immediate mode just for illustration
414 | 
415 |   // For each group
416 |   for(uint32_t groupIndex = lodLevelGroupRange.offset; groupIndex < lodLevelGroupRange.offset + lodLevelGroupRange.count; groupIndex++)
417 |   {
418 |     const nvcluster_Range& groupClusterRange = mesh.lodMesh.groupClusterRanges[groupIndex];
419 | 
420 |     // For each cluster
421 |     for(uint32_t clusterIndex = groupClusterRange.offset; clusterIndex < groupClusterRange.offset + groupClusterRange.count; clusterIndex++)
422 |     {
423 |       const nvcluster_Range& clusterTriangleRange = mesh.lodMesh.clusterTriangleRanges[clusterIndex];
424 |       const nvcluster_Range& clusterVertexRange   = mesh.clusterVertexRanges[clusterIndex];
425 | 
426 |       // Can use this to pre-compute a per-cluster vertex array
427 |       const uint32_t* clusterVertexGlobalIndices = &mesh.vertexGlobalIndices[clusterVertexRange.offset];
428 | 
429 |       // For each triangle
430 |       for(uint32_t triangleIndex = clusterTriangleRange.offset; triangleIndex < clusterTriangleRange.offset + clusterTriangleRange.count; triangleIndex++)
431 |       {
432 |         // For each triangle vertex
433 |         for(uint32_t vertex = 0; vertex < 3; ++vertex)
434 |         {
435 |           uint32_t localVertexIndex  = mesh.lodMesh.triangleVertices[3 * triangleIndex + vertex];
436 |           uint32_t globalVertexIndex = clusterVertexGlobalIndices[localVertexIndex];
437 |           glVertex3fv(glm::value_ptr(positions[globalVertexIndex]));
438 |         }
439 |       }
440 |     }
441 |   }
442 |   glEnd();
443 | }
444 | ```
445 | 
446 | **Selecting clusters**
447 | 
448 | It is intended that clusters are rendered based on their quadric error, a
449 | measure of geometric accuracy. A threshold in error over distance [to the
450 | camera] is chosen, the arcsine of which is the angular error. This could be
451 | converted to a screen space pixel size, but for ray tracing, where there are
452 | shadows and reflections behind the camera, a pure distance metric is a good
453 | start.
454 | 
455 | To form a single unique surface with clusters of the right LOD, render clusters
456 | where:
457 | 
458 | ```cpp
459 | errorOverDistance(
460 |     objectToEyeTransform,
461 |     hierarchy.groupCumulativeBoundingSpheres[clusterGroup],
462 |     hierarchy.groupCumulativeQuadricError[clusterGroup]
463 | ) >= threshold
464 | &&
465 | errorOverDistance(
466 |     objectToEyeTransform,
467 |     hierarchy.groupCumulativeBoundingSpheres[clusterGeneratingGroup],
468 |     hierarchy.groupCumulativeQuadricError[clusterGeneratingGroup]
469 | ) < threshold
470 | ```
471 | 
472 | The `clusterGeneratingGroup` is the group from which a cluster was generated by
473 | decimation. E.g. decimating the "generating" group of clusters *generates* a
474 | new, smaller set of clusters.
475 | 
476 | The `groupCumulativeQuadricError` is actually the error after the group's
477 | geometry is decimated, not the error of the group's own clusters. This value
478 | doesn't exist at the group level, which is the reason for the surprise. The
479 | above conditions give a band in which clusters are chosen. Their group's
480 | decimated geometry (first check) is closest to but not exceeding the threshold.
481 | Their geometry (second check) does exceed the threshold, so they are first past
482 | the threshold. This holds true given some massaging of the bounding spheres to
483 | guarantee the decimated geometry will always pass the threshold before the
484 | geometry itself.
485 | 
486 | The `groupCumulativeBoundingSpheres` conservatively include their generating
487 | groups' bounding spheres. This guarantees that clusters from multiple levels
488 | cannot be rendered at once.
489 | 
490 | **Spatial hierarchy**
491 | 
492 | Performing a test per cluster would be expensive - even only testing every
493 | unique group/generating-group pair. This library creates a spatial hierarchy of
494 | bounding spheres to reduce the search space.
495 | 
496 | `hierarchy.nodes` contains a tree of all clusters. The first node is the root
497 | node. Simply descend while the following condition holds and check all cluster
498 | range nodes when found. It's actually a combination of hierarchies for each LOD
499 | level; the way it works is described in [Spatial Hierarchy](#spatial-hierarchy) above.
500 | 
501 | ```cpp
502 | errorOverDistance(
503 |     objectToEyeTransform,
504 |     node.boundingSphere,
505 |     node.maxClusterQuadricError
506 | ) >= threshold
507 | ```
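
For example, a threshold corresponding to roughly one pixel of angular error
could be chosen as follows (a sketch; `fovY` and `imageHeight` are assumed
application values):

```cpp
// errorOverDistance() compares error/distance, whose arcsine is the angular
// error, so the matching threshold is the sine of the acceptable angle.
float onePixelAngle = fovY / float(imageHeight);  // approximate vertical FOV of one pixel
float threshold     = std::sin(onePixelAngle);
```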
508 | 
509 | ## Build Integration
510 | 
511 | This library uses CMake and requires C++20. It is currently a static library,
512 | designed with C compatibility in mind: data is passed as structures of arrays
513 | and output memory is allocated by the user. Integration has been verified by
514 | directly including it with `add_subdirectory`:
515 | 
516 | ```cmake
517 | add_subdirectory(nv_cluster_lod_builder)
518 | ...
519 | target_link_libraries(my_target PUBLIC nv_cluster_lod_builder)
520 | ```
521 | 
522 | If there is interest, please reach out for CMake config files (for
523 | `find_package()`) or any other features. GitHub issues are welcome.
524 | 
525 | ### Dependencies
526 | 
527 | nv_cluster_lod_builder depends upon
528 | [nv_cluster_builder](https://github.com/nvpro-samples/nv_cluster_builder) and
529 | [meshoptimizer](https://github.com/zeux/meshoptimizer), which are submodules. To
530 | download them, run
531 | 
532 | ```
533 | git submodule update --init --recursive
534 | ```
535 | 
536 | Parallel execution on Linux uses `tbb` if available. For Ubuntu, `sudo apt install libtbb-dev`.
537 | 
538 | ## License
539 | 
540 | This library and
541 | [nv_cluster_builder](https://github.com/nvpro-samples/nv_cluster_builder) are
542 | licensed under the [Apache License
543 | 2.0](http://www.apache.org/licenses/LICENSE-2.0).
544 | 
545 | This library uses third-party dependencies, which have their own licenses:
546 | 
547 | - [meshoptimizer](https://github.com/zeux/meshoptimizer), licensed under the
548 |   [MIT License](https://github.com/zeux/meshoptimizer/blob/47aafa533b439a78b53cd2854c177db61be7e666/LICENSE.md)
549 | 
550 | ## Limitations
551 | 
552 | This library is intended to enable a quick start to continuous LOD. It
553 | demonstrates the basics for use as a learning tool or a placeholder.
554 | 
555 | Cluster and cluster group quality includes the limitations outlined in
556 | [nv_cluster_builder](https://github.com/nvpro-samples/nv_cluster_builder).
557 | 
558 | The number of triangles per cluster is configurable, but the vertex count is
559 | unconstrained. There are plans to address this, but for now it is possible that
560 | the 256 vertex limit of `VK_NV_cluster_acceleration_structure` may be exceeded.
560 | 561 | The decimation step uses [meshoptimizer](https://github.com/zeux/meshoptimizer) 562 | for its lightweight convenience. This step is internal and not configurable. 563 | Texture seams are not preserved and in general vertex attributes are yet to be 564 | plumbed through. 565 | 566 | Performance is limited by the clustering and decimation algorithms that run on 567 | the CPU, although there is some parallelization. 568 | --------------------------------------------------------------------------------