├── README.md
├── filter_points.cs
├── pointcloud_clod.vs
└── resources
    └── clod.png

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# ieeevr_2019_clod

Source for paper: "Real-Time Continuous Level of Detail Rendering of Point Clouds",
Markus Schütz, Katharina Krösl, Michael Wimmer,
IEEE VR 2019, March, Osaka

The full source code is part of the [Fenek](https://github.com/m-schuetz/Fenek) live coding framework.

Preprints are available [here](https://www.cg.tuwien.ac.at/research/publications/2019/schuetz-2019-CLOD/).

The compute shader `filter_points.cs` is executed for each point of the
full point cloud (inputBuffer) and stores a selected subset
with continuous LOD properties in a new vertex buffer (targetBuffer).

`pointcloud_clod.vs` then renders the downsampled vertex buffer and also computes point sizes based on the sampling density / target spacing.

- This is an in-core method. However, it should theoretically be possible to apply it to most LOD methods that follow the [layered point clouds scheme](https://dl.acm.org/citation.cfm?id=1652364) (e.g. [Potree](https://www.cg.tuwien.ac.at/research/publications/2016/SCHUETZ-2016-POT/), [Entwine](https://entwine.io/), ...) by dropping excess points of a node in the vertex shader. We have not done this yet because state-of-the-art hierarchical methods are notoriously bad at handling highly complex geometries such as indoor environments. Our currently in-core continuous LOD method does not rely on spatial acceleration structures such as octrees, kd-trees, etc.; instead, it iterates over all points to compute the ideal subset for the current viewpoint. This is feasible for data sets of up to 100M points because the downsampling step is essentially little more than a highly performant, cache-friendly copy operation from one GPU buffer to another.
- It downsamples ~86M points to 5M points in ~5.45ms on a GTX 1080 => 15.9M points / ms.
- Initial tests on an RTX 2080 TI have shown performance of roughly ~86M points down to 3M points in ~2ms => 43M points / ms. For reference, a frame in VR has to be computed in around 11ms.
- The data structure is a series of subsamples of the original point cloud, essentially a flattened version of a layered point cloud scheme (in our case Potree), with the level as an additional attribute in the color channel of each point. The points are stored in one large, unordered vertex buffer, and the level attribute provides all the necessary hierarchical information. Our current structure does not group points into nodes/tiles/...
- In VR, this method is distributed over multiple frames,
  e.g. process 18M points of the input buffer per frame,
  which takes roughly 1.1ms per frame.
  After 5 frames, the new downsampled vertex buffer is finished
  and it will be used to render the point cloud for the next 5 frames (see the host-side sketch below).
- Points are culled against an "extended frustum" so that enough points are available
  during motion, even though the rendered model was computed for the frustum from 5 frames earlier.
- Distributing the downsampling step over multiple frames is no longer necessary on the 2080 TI.
  The same models with the same LOD can be downsampled and rendered at 90FPS in a single frame on a 2080 TI,
  compared to a 1080 that required distributing the downsampling step over ~5 frames.
  Also, MSAA was set to 8x on the 2080 TI instead of 4x on the 1080, while still rendering at 90fps.
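
For illustration, here is a rough host-side sketch of how the two shaders could be wired together with plain OpenGL 4.5. All handle names and the surrounding setup are hypothetical placeholders, not Fenek's actual API; double buffering of the target buffer (rendering the previous result while the next one is built) and clamping of the final partial batch are omitted for brevity.

```cpp
#include <glad/glad.h> // any OpenGL 4.5 loader works here

// Hypothetical handles, created elsewhere; none of these names exist in Fenek.
extern GLuint filterProgram;  // filter_points.cs, compiled as a compute shader
extern GLuint drawProgram;    // pointcloud_clod.vs plus some fragment shader
extern GLuint inputSSBO;      // full point cloud, one Vertex per point
extern GLuint targetSSBO;     // downsampled subset, doubles as vertex buffer
extern GLuint drawParamsSSBO; // {count, primCount, first, baseInstance}
extern GLuint pointVAO;       // vertex layout sourcing attributes from targetSSBO

const GLuint batchSize = 18'000'000; // points filtered per frame (see above)
GLuint batchOffset = 0;              // advances across frames

void filterBatch(GLuint numPoints) {
	if (batchOffset == 0) {
		// new downsampling pass: reset the append counter, keep
		// {primCount, first, baseInstance} = {1, 0, 0}
		const GLuint init[4] = {0, 1, 0, 0};
		glNamedBufferSubData(drawParamsSSBO, 0, sizeof(init), init);
	}

	glUseProgram(filterProgram);
	glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, inputSSBO);
	glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, targetSSBO);
	glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 3, drawParamsSSBO);
	glUniform1i(21, GLint(batchOffset)); // uBatchOffset
	glUniform1i(22, GLint(batchSize));   // uBatchSize

	glDispatchCompute((batchSize + 127) / 128, 1, 1); // local_size_x = 128
	batchOffset = (batchOffset + batchSize) % numPoints;
}

void drawFilteredPoints() {
	// make the compute writes visible to vertex fetch and the indirect command
	glMemoryBarrier(GL_VERTEX_ATTRIB_ARRAY_BARRIER_BIT | GL_COMMAND_BARRIER_BIT);

	glUseProgram(drawProgram);
	glBindVertexArray(pointVAO);
	glBindBuffer(GL_DRAW_INDIRECT_BUFFER, drawParamsSSBO);
	glDrawArraysIndirect(GL_POINTS, nullptr); // count written by the compute pass
}
```

The essential trick is that `drawParameters` (binding 3) has the exact layout of an OpenGL indirect draw command, so the point count written by `atomicAdd` in the compute shader is consumed directly by `glDrawArraysIndirect` and never has to be read back to the CPU (the draw additionally requires `glEnable(GL_PROGRAM_POINT_SIZE)` so that the vertex shader's `gl_PointSize` takes effect).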

An out-of-core continuous LOD method for arbitrarily large point clouds is currently in the works.

--------------------------------------------------------------------------------
/filter_points.cs:
--------------------------------------------------------------------------------
#version 450

// author: Markus Schütz
// license: MIT license (https://opensource.org/licenses/MIT)

// Source for paper: "Real-Time Continuous Level of Detail Rendering of Point Clouds"
// Markus Schütz, Katharina Krösl, Michael Wimmer
// IEEE VR 2019, March, Osaka
//
// This compute shader is executed for each point of the
// full point cloud (inputBuffer) and stores a selected subset
// with continuous LOD properties in a new vertex buffer (targetBuffer).
//
// - This is an in-core method.
// - It downsamples ~86M points to 5M points in ~5.45ms on a GTX 1080 => 15.9M points / ms.
// - Initial tests on an RTX 2080 TI have shown performance of up to 44M points / ms.
// - Each input point needs a level attribute in the alpha channel of its color.
// - In VR, this method is distributed over multiple frames,
//   e.g. process 18M points of the input buffer per frame,
//   which takes roughly 1.1ms per frame.
//   After 5 frames, the new downsampled vertex buffer is finished
//   and it will be used to render the point cloud for the next 5 frames.
// - Points are culled against an "extended frustum" so that enough points are available
//   during motion, even though the rendered model was computed for the frustum from 5 frames earlier.
// - Distributing the downsampling step over multiple frames is no longer necessary on the 2080 TI.
//   The same models with the same LOD can be downsampled and rendered at 90FPS in a single frame on a 2080 TI,
//   compared to a 1080 that required distributing the downsampling step over ~5 frames.
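//
// Worked example of the keep/discard test in main() below (illustrative
// numbers, not from the paper; the per-point level randomization aRandom
// is ignored): a point at level 3 with scale = 1 and spacing = 1m has
//     pointSpacing = 1 * 1 / 2^3 = 0.125m.
// At distance d = 20m from the camera, at the screen center (dc = 0)
// and with CLOD = 10, the required spacing is
//     targetSpacing = (20 * 10) / 1000 = 0.2m.
// Since pointSpacing < targetSpacing, this point is denser than needed
// at that distance and is discarded; points of the coarser levels 0-2 survive.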
//

layout(local_size_x = 128, local_size_y = 1) in;

struct Vertex{
	float x;
	float y;
	float z;
	uint colors;
};

layout(std430, binding = 0) buffer ssInputBuffer{
	Vertex inputBuffer[];
};

layout(std430, binding = 1) buffer ssTargetBuffer{
	Vertex targetBuffer[];
};

// matches the layout of an OpenGL indirect draw command
layout(std430, binding = 3) buffer ssDrawParameters{
	uint count;
	uint primCount;
	uint first;
	uint baseInstance;
} drawParameters;

layout(location = 21) uniform int uBatchOffset;
layout(location = 22) uniform int uBatchSize;

layout(std140, binding = 4) uniform shader_data{
	mat4 transform;
	mat4 world;
	mat4 view;
	mat4 proj;

	vec2 screenSize;
	vec4 pivot;

	float CLOD;
	float scale;
	float spacing;
	float time;
} ssArgs;


float rand(float n){
	return fract(cos(n) * 123456.789);
}

void main(){

	uint inputIndex = gl_GlobalInvocationID.x;

	// skip invocations beyond the batch (>=, since indices are zero-based)
	if(inputIndex >= uBatchSize){
		return;
	}

	inputIndex = inputIndex + uBatchOffset;

	Vertex v = inputBuffer[inputIndex];

	vec3 aPosition = vec3(v.x, v.y, v.z);
	float level = float((v.colors & 0xFF000000) >> 24);
	float aRandom = rand(v.x + v.y + v.z);

	vec4 projected = (ssArgs.transform * vec4(aPosition, 1));
	projected.xyz = projected.xyz / projected.w;

	// extended-frustum culling
	float extent = 2;
	if(abs(projected.x) > extent || abs(projected.y) > extent){
		return;
	}

	// near-clipping
	if(projected.w < 0){
		return;
	}

	vec3 worldPos = (ssArgs.world * vec4(aPosition, 1)).xyz;

	// without level randomization
	//float pointSpacing = uScale * uSpacing / pow(2, level);

	// with level randomization
	float pointSpacing = ssArgs.scale * ssArgs.spacing / pow(2, level + aRandom);

	float d = distance(worldPos, ssArgs.pivot.xyz);
	float dc = length(projected.xy);

	// targetSpacing dependent on camera distance
	//float targetSpacing = (ssArgs.CLOD / 1000) * d;

	// dependent on cam distance and distance to center of screen
	float targetSpacing = (d * ssArgs.CLOD) / (1000 * max(1 - 0.7 * dc , 0.3));

	// reduce density away from center with a Gaussian function;
	// no significant improvement over 1 / (d - dc), so we settled on the simpler one
	//float sigma = 0.4;
	//float gbc = (1 / (sigma * sqrt(2 * 3.1415))) * exp(-0.5 * pow( dc / sigma, 2.0 ));
	//targetSpacing = (1. * d * ssArgs.CLOD) / (1000 * gbc);
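
	// A point survives only if its own spacing is at least the
	// view-dependent target spacing. Survivors are appended to
	// targetBuffer through an atomic counter; that counter is the
	// `count` field of the indirect draw command at binding 3, so the
	// result can be rendered right away with an indirect draw call,
	// without any CPU read-back.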
	if(pointSpacing < targetSpacing){
		return;
	}

	int targetIndex = int(atomicAdd(drawParameters.count, 1));
	targetBuffer[targetIndex] = v;
}

--------------------------------------------------------------------------------
/pointcloud_clod.vs:
--------------------------------------------------------------------------------
#version 450

// author: Markus Schütz
// license: MIT license (https://opensource.org/licenses/MIT)

// Source for paper: "Real-Time Continuous Level of Detail Rendering of Point Clouds"
// Markus Schütz, Katharina Krösl, Michael Wimmer
// IEEE VR 2019, March, Osaka

layout(location = 0) in vec3 aPosition;
layout(location = 1) in vec4 aColor;

layout(location = 0) uniform int uNodeIndex;

layout(binding = 0) uniform sampler2D uGradient;

layout(std140, binding = 4) uniform shader_data{
	mat4 transform;
	mat4 world;
	mat4 view;
	mat4 proj;
	mat4 centralView;
	mat4 centralProj;
	mat4 centralTransform;

	vec2 screenSize;
	vec4 pivot;

	float CLOD;
	float scale;
	float spacing;
	float time;

	float minMilimeters;
	float pointSize;
	float colorMultiplier;
} ssArgs;

out vec3 vColor;
out float vPointSize;
out float vRadius;
out float vLinearDepth;


float rand(float n){
	return fract(cos(n) * 123456.789);
}


void main() {

	vec4 pos = ssArgs.transform * vec4(aPosition, 1.0);

	gl_Position = pos;

	vec4 projected = gl_Position / gl_Position.w;

	vec4 centralProjected = ssArgs.centralTransform * vec4(aPosition, 1.0);
	centralProjected.xyz = centralProjected.xyz / centralProjected.w;

	vLinearDepth = gl_Position.w;

	vColor = aColor.rgb * ssArgs.colorMultiplier;

	vec3 worldPos = (ssArgs.world * vec4(aPosition, 1)).xyz;

	float d = distance(worldPos, ssArgs.pivot.xyz);
	float dc = length(centralProjected.xy);

	// LOD level is stored in the alpha byte; mod 128 strips the highest bit
	float level = mod(aColor.a * 255, 128);
	float aRandom = rand(aPosition.x + aPosition.y + aPosition.z);

	float pointSpacing = ssArgs.scale * ssArgs.spacing / pow(2, level + aRandom);

	// targetSpacing dependent on camera distance
	//float targetSpacing = (d * ssArgs.CLOD / 1000);

	// dependent on cam distance and distance to center of screen
	float targetSpacing = (d * ssArgs.CLOD) / (1000 * max(1 - 0.7 * dc , 0.3));

	// reduce density away from center with a Gaussian function;
	// no significant improvement over 1 / (d - dc), so we settled on the simpler one
	//float sigma = 0.4;
	//float gbc = (1 / (sigma * sqrt(2 * 3.1415))) * exp(-0.5 * pow( dc / sigma, 2.0 ));
	//targetSpacing = (1. * d * ssArgs.CLOD) / (1000 * gbc);
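
	// The point size is derived by projecting a world-space extent into
	// screen space: a second view-space position, offset by the desired
	// world-space diameter l, is projected alongside the point, and the
	// screen-space distance between the two, scaled by the vertical
	// resolution, yields the point size in pixels.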

	float minPixels = 1;
	float maxPixels = 80;
	float sizeMultiplier = 1 * ssArgs.pointSize;

	float minMilimeters = ssArgs.scale * ssArgs.minMilimeters / sizeMultiplier;

	{ // point size based on target spacing
		float ws = max(targetSpacing, minMilimeters / 1000.0);

		float l = sizeMultiplier * 2 * ws;
		vec4 v1 = ssArgs.view * ssArgs.world * vec4(aPosition, 1.0);
		vec4 v2 = vec4(v1.x + l, v1.y + l, v1.z, 1.0);

		vec4 vp1 = ssArgs.proj * v1;
		vec4 vp2 = ssArgs.proj * v2;

		vec2 vs1 = vp1.xy / vp1.w;
		vec2 vs2 = vp2.xy / vp2.w;

		float ds = distance(vs1, vs2);
		float dp = ds * ssArgs.screenSize.y;

		gl_PointSize = dp;

		gl_PointSize = clamp(gl_PointSize, minPixels, maxPixels);

		vRadius = ws;
	}

	{ // adjust point size within the blend-in range: points whose own
	  // spacing is close to the target spacing fade in gradually instead
	  // of popping (full size at 0.8 * pointSpacing, zero size once
	  // targetSpacing reaches pointSpacing)
		float zeroAt = pointSpacing;
		float fullAt = 0.8 * pointSpacing;

		float u = (targetSpacing - fullAt) / (zeroAt - fullAt);
		u = 1 - clamp(u, 0, 1);

		gl_PointSize = gl_PointSize * u;
	}

	vPointSize = gl_PointSize;

	gl_PointSize *= 0.8;
}

--------------------------------------------------------------------------------
/resources/clod.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/m-schuetz/ieeevr_2019_clod/0be1750da26b653cafd4cb9782d129f03b81566e/resources/clod.png
--------------------------------------------------------------------------------