├── .gitignore
├── .nojekyll
├── LICENSE
├── README.md
├── scene
│   └── renderedAnt.png
└── src
    ├── kdtree.h
    ├── kernel.cu
    ├── main.cpp
    ├── nvcc_compile.sh
    ├── scene.cpp
    └── scene.hpp
/.gitignore:
--------------------------------------------------------------------------------
1 | *.i
2 | *.ii
3 | *.gpu
4 | *.ptx
5 | *.cubin
6 | *.fatbin
7 |
--------------------------------------------------------------------------------
/.nojekyll:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jasonge27/StacklessRayTracer/6558c40e3be5d4aee360e8be56f6b3dd50c8117f/.nojekyll
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2016 Jason Ge
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Ray Tracer on GPU
2 | A stackless KD-tree implemented on the GPU with CUDA to accelerate ray-tracing rendering, with hardware-level optimization for register spills and local memory overhead.
3 |
4 | ## Parallel Ray Tracer on GPU
5 |
6 | Ray tracing follows every beam of light in a scene. Since we know the physical laws of reflection, refraction and scattering, once enough light beams are constructed we can render the scene with realistic effects. Ray tracing is a naturally parallel algorithm because physics tells us that light beams do not interfere with each other.
7 |
8 | The main calculation of ray tracing is finding the intersection point between each light beam and the objects in the scene. We implemented the stackless KD-tree algorithm from the paper [1] and observed a speedup of about 100x on the GPU over the CPU.
9 |
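The rope idea behind the stackless traversal can be illustrated with a deliberately simplified 1D analogue. This is a minimal sketch, not the kernel in `src/kernel.cu`: `RopeNode`, `findLeaf`, `traceRopes` and `makeDemoTree` are hypothetical names, the scene is one-dimensional, and a small nudge `eps` stands in for the face bookkeeping done by the real code. It assumes, as `KdTree_rp` does, that inner nodes carry a negative `trinum` and that a rope index of `-1` means the ray leaves the scene.

```cpp
#include <cassert>
#include <vector>

// Hypothetical 1D analogue of KdTree_rp: indices instead of pointers,
// trinum < 0 marks an inner node, rope[0]/rope[1] link the min/max face
// of a leaf to the neighbouring node (-1 means "outside the scene").
struct RopeNode {
    int left, right;           // child indices (inner nodes only)
    float splitpos;            // split position (inner nodes only)
    int trinum;                // < 0 for inner nodes, else primitive count
    std::vector<float> prims;  // primitive positions stored in a leaf
    int rope[2];               // neighbour across the min / max face
    float lo, hi;              // extent of the node
};

// Descend from node idx to the leaf that contains pt.
inline int findLeaf(const std::vector<RopeNode>& tree, int idx, float pt) {
    while (tree[idx].trinum < 0)
        idx = (pt < tree[idx].splitpos) ? tree[idx].left : tree[idx].right;
    return idx;
}

// Walk from leaf to leaf along dir (+1 or -1), starting at p, without a stack.
// Returns the position of the first primitive hit, or fallback if there is none.
inline float traceRopes(const std::vector<RopeNode>& tree, float p, int dir, float fallback) {
    assert(dir == 1 || dir == -1);
    const float eps = 1e-4f;
    int idx = 0;
    float pt = p;
    while (idx >= 0) {
        idx = findLeaf(tree, idx, pt);
        const RopeNode& leaf = tree[idx];
        bool hit = false;
        float best = 0.0f;
        for (float x : leaf.prims) {
            float t = (x - p) * dir;  // signed distance along the ray
            if (t > eps && (!hit || t < best)) { hit = true; best = t; }
        }
        if (hit) return p + best * dir;
        // No hit in this leaf: leave through the exit face and follow its rope.
        pt = (dir > 0 ? leaf.hi : leaf.lo) + eps * dir;
        idx = leaf.rope[dir > 0 ? 1 : 0];
    }
    return fallback;
}

// Three-node demo tree: a root split at 0, an empty left leaf on [-1, 0]
// and a right leaf on [0, 1] holding one primitive at 0.75.
inline std::vector<RopeNode> makeDemoTree() {
    return {
        RopeNode{1, 2, 0.0f, -1, {}, {-1, -1}, -1.0f, 1.0f},
        RopeNode{-1, -1, 0.0f, 0, {}, {-1, 2}, -1.0f, 0.0f},
        RopeNode{-1, -1, 0.0f, 1, {0.75f}, {1, -1}, 0.0f, 1.0f},
    };
}
```

Calling `traceRopes(makeDemoTree(), -0.9f, +1, -99.0f)` starts in the empty left leaf, follows its max-face rope into the right leaf and reports the primitive there. No traversal stack is needed because each leaf already knows its neighbours.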
10 | | | Direct Intersection Calculation | Intersection Calculation by Vanilla Kdtree | Intersection Calculation by Stackless Kdtree |
11 | | :--------------------------------------: | :-----------------------------: | :--------------------------------------: | :--------------------------------------: |
12 | | CPU rendering time | 255s | 22.193s | 13.371s |
13 | | GPU rendering time (before hardware optimization) | 3.282s | 0.198s | 0.098s |
14 | | Speedup | 77 | 112 | 137 |
15 |
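The speedup row is simply CPU time divided by GPU time. The hypothetical helper below shows the arithmetic using the rounded timings from the table. With those rounded values the third column comes out near 136, so the 137 in the table presumably reflects unrounded measurements.

```cpp
#include <cassert>

// Speedup = CPU time / GPU time, both in seconds.
inline double speedup(double cpuSeconds, double gpuSeconds) {
    assert(gpuSeconds > 0.0);
    return cpuSeconds / gpuSeconds;
}
```

For example, `speedup(255.0, 3.282)` is a little under 78 and `speedup(22.193, 0.198)` is a little over 112.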
16 | Experiment Details.
17 |
18 | CPU: Intel Core i7-2630QM, GPU: NVIDIA GeForce GT 610 with 6GB RAM.
19 |
20 | OS: Linux 12.04, Compiler: nvcc in CUDA 5.0
21 |
22 | Scene: a 3996-face ant in a cube, 800*600 light beams, 3 times reflection for shallow calculation.
23 |
24 | Effect:
25 |
26 |
27 |
28 |
29 |
30 | ## Hardware Level Optimization for GPU
31 |
32 | ### Bottleneck analysis
33 |
34 | #### Register Spills vs Warp Occupancy Trade-off
35 |
36 | On the one hand, using more registers per thread reduces the local memory overhead caused by register spills. On the other hand, using fewer registers per thread lets more warps run in parallel. We need to find the right balance in this trade-off.
37 |
38 | - Register spills. Adding the option `-Xptxas -v` when compiling with `nvcc` reports the register usage: **register usage: 63, stored spill 324B, load spill 452B**. Explanation: the GT 610 has compute capability 2.1, which allows at most 63 registers per thread. The total size of registers available is 64KB. If a thread needs more registers than the limit, the excess is spilled to local memory.
39 |
40 | Visual profiler says **local memory overhead: 62.3%**
41 |
42 | Explanation. When register spills happen, the thread reads and writes local memory through the L1 cache, which is slower than accessing registers directly. Moreover, on an L1 cache miss, instructions have to be re-issued, which puts more pressure on the memory bandwidth. This is why we have such a large **local memory overhead**.
43 |
44 | - Warp occupancy. Visual profiler says **warp occupancy 33.1%**
45 |
46 | Explanation. We are using blocks of dimension 16*8 in the computation. The GT 610 allows at most 48 warps to run simultaneously in hardware given the block dimension we've chosen. The warp occupancy says we're only using 16 warps on average, which is a consequence of register spill. Each block uses up to 4KB registers; with the limit of 64KB registers in total, it makes sense that we can only run 16 warps at the same time.
47 |
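The trade-off can be made concrete with a back-of-the-envelope calculation. The sketch below is a simplification: the helper names are hypothetical, all hardware figures are passed in by the caller, and it ignores register allocation granularity and per-block limits. The value 32768 used in the usage note is an assumption, taken from NVIDIA's published figure for 32-bit registers per multiprocessor on compute capability 2.x rather than from this README.

```cpp
#include <cassert>

// Number of warps that fit on one multiprocessor when the register file
// is the only limit. regsPerSM counts 32-bit registers.
inline int warpsLimitedByRegisters(int regsPerSM, int regsPerThread, int warpSize = 32) {
    assert(regsPerThread > 0 && warpSize > 0);
    return regsPerSM / (regsPerThread * warpSize);
}

// Occupancy = resident warps / maximum warps the hardware can hold.
inline double occupancy(int residentWarps, int maxWarpsPerSM) {
    assert(maxWarpsPerSM > 0);
    return static_cast<double>(residentWarps) / maxWarpsPerSM;
}
```

Under that assumption, `warpsLimitedByRegisters(32768, 63)` gives 16 warps and `occupancy(16, 48)` is about 33%, in line with the profiler figure quoted above. Lowering the per-thread register count to 48 raises the register-limited bound to 21 warps.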
48 |
49 |
50 | #### Branch divergence
51 |
52 | Visual profiler gives us **warp execution efficiency 45.4%**
53 |
54 | Since the threads in a single warp share the same instruction stream, when the threads diverge at an `if-then A else B` statement into thread group 1 and thread group 2, group 1 executes A while group 2 waits, and then group 2 executes B while group 1 waits. The warp execution efficiency says that our running time roughly doubled because of this phenomenon.
55 |
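A toy cost model makes the "time doubled" reading explicit. The function below is a hypothetical sketch that assumes one `if/else`, a fixed cycle cost per branch, and a warp that issues each branch once if at least one lane takes it. Efficiency is the number of useful lane-cycles divided by the lane-cycles the warp occupies.

```cpp
#include <cassert>
#include <vector>

// Warp execution efficiency for a single if/else.
// takesA[i] is true when lane i takes branch A, false when it takes B.
inline double warpEfficiency(const std::vector<bool>& takesA, int costA, int costB) {
    assert(!takesA.empty() && costA > 0 && costB > 0);
    const int lanes = static_cast<int>(takesA.size());
    int nA = 0;
    for (bool a : takesA) if (a) ++nA;
    const int nB = lanes - nA;
    // Each branch is issued once for the whole warp if any lane needs it.
    const int issued = (nA > 0 ? costA : 0) + (nB > 0 ? costB : 0);
    const long active = static_cast<long>(nA) * costA + static_cast<long>(nB) * costB;
    return static_cast<double>(active) / (static_cast<double>(lanes) * issued);
}
```

With 32 lanes split evenly between two branches of equal cost, the model returns 0.5: the warp spends twice as many cycles as a non-divergent warp would. That is close to the 45.4% reported by the profiler.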
56 | ### Optimizations
57 |
58 | - Optimization 1. Using the compiler flags `-prec-div=false` and `-prec-sqrt=false` decreases the precision of division and sqrt computation. The compiler can then optimize out some intermediate variables to reduce register usage.
59 |
60 | - Optimization 2. Trade-off between register spills and warp occupancy. We use the compiler flag `-maxrregcount=48` to limit the maximal number of registers each thread can use.
61 |
62 | | | Time | Warp Occupancy | Local memory overhead | Spilled Store | Spilled Load |
63 | | :-----------------: | :--: | :------------: | :-------------------: | :-----------: | :----------: |
64 | | Before Optimization | 98ms | 31.1% | 62.3% | 324B | 452B |
65 | | Optimization 1 | 60ms | 33.4% | 48.8% | 122B | 122B |
66 | | Optimization 1+2 | 49ms | 49.6% | 53.7% | 240B | 240B |
67 |
68 |
69 |
70 | References
71 |
72 | [1] S. Popov, J. Günther, H.-P. Seidel, P. Slusallek, Stackless KD-Tree Traversal for High Performance GPU Ray Tracing, Computer Graphics Forum (Eurographics), 2007
73 |
74 |
--------------------------------------------------------------------------------
/scene/renderedAnt.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jasonge27/StacklessRayTracer/6558c40e3be5d4aee360e8be56f6b3dd50c8117f/scene/renderedAnt.png
--------------------------------------------------------------------------------
/src/kdtree.h:
--------------------------------------------------------------------------------
1 | #ifndef KD_TREE_H
2 | #define KD_TREE_H
3 |
4 | #define LEAFSIZE 15
5 | //#define TREEDEPTH 15
6 |
7 | typedef struct node
8 | {
9 | struct node* pleft;
10 | struct node* pright;
11 |
12 | int axis;
13 | float splitpos;
14 |
15 | int trinum;
16 | int tri[LEAFSIZE];
17 |
18 | // struct node* rope[6];
19 |
20 | float xmin;
21 | float xmax;
22 | float ymin;
23 | float ymax;
24 | float zmin;
25 | float zmax;
26 | }KdTree;
27 |
28 | typedef struct node_rp
29 | {
30 | int left;
31 | int right;
32 |
33 | int axis;
34 | float splitpos;
35 |
36 | int trinum;
37 | int tri[LEAFSIZE];
38 |
39 | int rope[6];
40 |
41 | float xmin;
42 | float xmax;
43 | float ymin;
44 | float ymax;
45 | float zmin;
46 | float zmax;
47 | }KdTree_rp;
48 |
49 | #endif
50 |
--------------------------------------------------------------------------------
/src/kernel.cu:
--------------------------------------------------------------------------------
1 | //kernelPBO.cu (Rob Farber)
2 |
3 | #include <stdio.h>
4 | #include "scene.hpp"
5 | #include "kdtree.h"
6 |
7 | #define blocksize_x 8
8 | #define blocksize_y 6
9 |
10 | #define EPSILON (1e-6)
11 |
12 | __constant__ float3 dcamPos;
13 | __constant__ float3 dwincenterPos;
14 |
15 | __constant__ float3 dwinup;
16 | __constant__ float3 dwindown;
17 | __constant__ float3 dboxup;
18 | __constant__ float3 dboxdown;
19 |
20 | __constant__ unsigned int dwinwidth;
21 | __constant__ unsigned int dwinheight;
22 |
23 | __constant__ float hx;
24 | __constant__ float hy;
25 |
26 | __constant__ float3 objectColor;
27 |
28 | __constant__ unsigned int nTri;
29 | __constant__ unsigned int nVtx;
30 | __constant__ unsigned int nLS;
31 |
32 | /*__constant__ float* dVtxBuf;
33 | __constant__ int* dTriVtxBuf;
34 | __constant__ float* dNormal;
35 | __constant__ float* dLS;
36 | __constant__ float* dsandboxColor;
37 | __constant__ unsigned int* dsandboxIsReflective;
38 | */
39 |
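40 | texture<float, 1, cudaReadModeElementType> texref_VtxBuf;
41 | texture<int, 1, cudaReadModeElementType> texref_TriVtx;
42 | texture<float, 1, cudaReadModeElementType> texref_Normal;
43 | texture<float, 1, cudaReadModeElementType> texref_LS;
44 | texture<float, 1, cudaReadModeElementType> texref_sandboxColor;
45 | texture<unsigned int, 1, cudaReadModeElementType> texref_sandboxIsReflective;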
46 |
47 | // memory address in cudamemory
48 | /*
49 | extern "C" float* dVtxBuf = NULL;
50 | extern "C" int* dTriVtxBuf = NULL;
51 | extern "C" float* dNormal = NULL;
52 | extern "C" float* dLS = NULL;
53 | extern "C" float* dsandboxColor = NULL;
54 | extern "C" unsigned int* dsandboxIsReflective = NULL;
55 | */
56 |
57 | void checkCUDAError(const char *msg)
58 | {
59 | cudaError_t err = cudaGetLastError();
60 | if( cudaSuccess != err) {
61 | fprintf(stderr, "Cuda error: %s: %s.\n", msg, cudaGetErrorString( err) );
62 | exit(EXIT_FAILURE);
63 | }
64 | }
65 |
66 | __device__ inline float3 CrossProduct( float3 p1, float3 p2)
67 | {
68 | float3 tmp;
69 | tmp.x = p1.y*p2.z - p1.z*p2.y;
70 | tmp.y = p1.z*p2.x - p1.x*p2.z;
71 | // tmp.y = p1.x*p2.z - p1.z*p2.x;
72 | tmp.z = p1.x*p2.y - p1.y*p2.x;
73 | return tmp;
74 | }
75 |
76 | __device__ inline float InnerProduct( float3 p1, float3 p2)
77 | {
78 | return p1.x*p2.x + p1.y*p2.y + p1.z*p2.z;
79 | }
80 |
81 | // return the value of t, when p+t*dir intersect triangle p0-p1-p2
82 | __device__ inline float TestSingleIntersection( float3 p, float3 dir, float3 p0, float3 p1, float3 p2)
83 | {
84 | float3 E1;
85 | E1.x = p1.x - p0.x;
86 | E1.y = p1.y - p0.y;
87 | E1.z = p1.z - p0.z;
88 |
89 | float3 E2;
90 | E2.x = p2.x - p0.x;
91 | E2.y = p2.y - p0.y;
92 | E2.z = p2.z - p0.z;
93 |
94 | float3 T;
95 | T.x = p.x - p0.x;
96 | T.y = p.y - p0.y;
97 | T.z = p.z - p0.z;
98 |
99 | float3 P = CrossProduct( dir, E2);
100 | float3 Q = CrossProduct( T, E1 );
101 |
102 | float s = InnerProduct(P, E1);
103 | // s = 0.5; //FIXME
104 | if ( (s < EPSILON ) && (s>-EPSILON) ) return -1;
105 |
106 | float t = InnerProduct(Q, E2) / s;
107 | if ( t < 1e-5) return -2;
108 |
109 | float u = InnerProduct(P, T) / s;
110 | if ( (u < 0) ) return -3;
111 |
112 | float v = InnerProduct(Q, dir) / s;
113 | if ( (v < 0 ) ) return -4;
114 |
115 | if ( u+v > 1 ) return -5;
116 |
117 | // t = 0.5; //FIXME
118 | return t;
119 | }
120 |
121 | __device__ inline bool FromLeftToRight( KdTree_rp* dtree, int idx, float3 dir)
122 | {
123 | switch (dtree[idx].axis)
124 | {
125 | case 0:
126 | if ( dir.x > EPSILON) return true;
127 | else return false;
128 | case 1:
129 | if ( dir.y > EPSILON) return true;
130 | else return false;
131 | case 2:
132 | if ( dir.z > EPSILON) return true;
133 | else return false;
134 | default:
135 | printf("no such case!\n"); return false; // every path of a non-void function must return
136 | }
137 | }
138 |
139 | __device__ inline bool TestNodeIntersection( KdTree_rp* dtree, int idx, float3 p, float3 dir)
140 | {
141 |
142 | float tmp1, tmp2;
143 |
144 | // x-y
145 | if ( (dir.x-EPSILON) )
146 | {
147 | if ( (p.x>dtree[idx].xmax) || ( p.xdtree[idx].ymax) && (tmp2>dtree[idx].ymax))
160 | || ( (tmp1-EPSILON) )
165 | {
166 | if ( (p.y>dtree[idx].ymax) || (p.ydtree[idx].zmax) && (tmp2>dtree[idx].zmax))
179 | || ( (tmp1-EPSILON) )
184 | {
185 | if ( (p.z>dtree[idx].zmax) || (p.zdtree[idx].xmax) && (tmp2>dtree[idx].xmax))
198 | || ( (tmp1 1e-5) || ( dir.x< -1e-5) )
217 | {
218 | if ( prevface != 0)
219 | {
220 | tmp = ( dtree[idx].xmin - p.x) /dir.x;
221 | a = p.y + tmp* dir.y;
222 | b = p.z + tmp* dir.z;
223 |
224 | if ( (tmp > -ep2) && (tmp < t) )
225 | if ( (a>=dtree[idx].ymin-ep) && (a<=dtree[idx].ymax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) )
226 | {
227 | t =tmp;
228 | face = 0;
229 | }
230 | }
231 |
232 | if ( prevface != 1)
233 | {
234 | tmp = ( dtree[idx].xmax - p.x) /dir.x;
235 | a = p.y + tmp* dir.y;
236 | b = p.z + tmp* dir.z;
237 | if ( (tmp > -ep2) && (tmp < t) )
238 | if ( (a>=dtree[idx].ymin-ep) && (a<=dtree[idx].ymax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) )
239 | {
240 | t =tmp;
241 | face = 1;
242 | }
243 | }
244 | }
245 |
246 | if ( (dir.y > 1e-5) || ( dir.y < -1e-5) )
247 | {
248 | if ( prevface != 2)
249 | {
250 | tmp = ( dtree[idx].ymin - p.y) / dir.y;
251 | a = p.x + tmp * dir.x;
252 | b = p.z + tmp * dir.z;
253 | if ( (tmp > -ep2) && ( tmp < t) )
254 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) )
255 | {
256 | t =tmp;
257 | face = 2;
258 | }
259 | }
260 |
261 | if ( prevface != 3 )
262 | {
263 | tmp = ( dtree[idx].ymax - p.y) / dir.y;
264 | a = p.x + tmp * dir.x;
265 | b = p.z + tmp * dir.z;
266 | if ( (tmp > -ep2) && ( tmp < t) )
267 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) )
268 | {
269 | t =tmp;
270 | face = 3;
271 | }
272 | }
273 | }
274 |
275 | if ( (dir.z > 1e-5) || ( dir.z < -1e-5) )
276 | {
277 | if ( prevface != 4 )
278 | {
279 | tmp = ( dtree[idx].zmin - p.z) / dir.z;
280 | a = p.x + tmp * dir.x;
281 | b = p.y + tmp * dir.y;
282 | if ( (tmp > -ep2) && ( tmp < t) )
283 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].ymin-ep) && (b<=dtree[idx].ymax+ep) )
284 | {
285 | t =tmp;
286 | face = 4;
287 | }
288 | }
289 |
290 | if ( prevface != 5)
291 | {
292 | tmp = ( dtree[idx].zmax - p.z) / dir.z;
293 | a = p.x + tmp * dir.x;
294 | b = p.y + tmp * dir.y;
295 | if ( (tmp > -ep2) && ( tmp < t) )
296 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].ymin-ep) && (b<=dtree[idx].ymax+ep) )
297 | {
298 | t =tmp;
299 | face = 5;
300 | }
301 | }
302 | }
303 |
304 | if ( face >=0 )
305 | {
306 | pt.x = p.x + t * dir.x;
307 | pt.y = p.y + t*dir.y;
308 | pt.z = p.z + t*dir.z;
309 | }
310 |
311 | }
312 |
313 | __device__ inline void IntersectBox2( float3 p, float3 dir, KdTree_rp* dtree, int idx, float3& pt, int& face)
314 | {
315 | float t = 1e6;
316 | float tmp = 0;
317 |
318 | float ep = 0;
319 | float ep2 = 0;
320 |
321 | face = -1;
322 |
323 | float a,b;
324 |
325 | if ( (dir.x > 1e-5) || ( dir.x< -1e-5) )
326 | {
327 | tmp = ( dtree[idx].xmin - p.x) /dir.x;
328 | a = p.y + tmp* dir.y;
329 | b = p.z + tmp* dir.z;
330 |
331 | if ( (tmp > ep2) && (tmp < t) )
332 | if ( (a>=dtree[idx].ymin-ep) && (a<=dtree[idx].ymax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) )
333 | {
334 | t =tmp;
335 | face = 0;
336 | }
337 |
338 | tmp = ( dtree[idx].xmax - p.x) /dir.x;
339 | a = p.y + tmp* dir.y;
340 | b = p.z + tmp* dir.z;
341 | if ( (tmp > ep2) && (tmp < t) )
342 | if ( (a>=dtree[idx].ymin-ep) && (a<=dtree[idx].ymax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) )
343 | {
344 | t =tmp;
345 | face = 1;
346 | }
347 | }
348 |
349 | if ( (dir.y > 1e-5) || ( dir.y < -1e-5) )
350 | {
351 | tmp = ( dtree[idx].ymin - p.y) / dir.y;
352 | a = p.x + tmp * dir.x;
353 | b = p.z + tmp * dir.z;
354 | if ( (tmp > ep2) && ( tmp < t) )
355 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) )
356 | {
357 | t =tmp;
358 | face = 2;
359 | }
360 |
361 | tmp = ( dtree[idx].ymax - p.y) / dir.y;
362 | a = p.x + tmp * dir.x;
363 | b = p.z + tmp * dir.z;
364 | if ( (tmp > ep2) && ( tmp < t) )
365 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) )
366 | {
367 | t =tmp;
368 | face = 3;
369 | }
370 | }
371 |
372 | if ( (dir.z > 1e-5) || ( dir.z < -1e-5) )
373 | {
374 | tmp = ( dtree[idx].zmin - p.z) / dir.z;
375 | a = p.x + tmp * dir.x;
376 | b = p.y + tmp * dir.y;
377 | if ( (tmp > ep2) && ( tmp < t) )
378 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].ymin-ep) && (b<=dtree[idx].ymax+ep) )
379 | {
380 | t =tmp;
381 | face = 4;
382 | }
383 |
384 | tmp = ( dtree[idx].zmax - p.z) / dir.z;
385 | a = p.x + tmp * dir.x;
386 | b = p.y + tmp * dir.y;
387 | if ( (tmp > ep2) && ( tmp < t) )
388 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].ymin-ep) && (b<=dtree[idx].ymax+ep) )
389 | {
390 | t =tmp;
391 | face = 5;
392 | }
393 | }
394 |
395 | if ( face >=0 )
396 | {
397 | pt.x = p.x + t * dir.x;
398 | pt.y = p.y + t*dir.y;
399 | pt.z = p.z + t*dir.z;
400 | }
401 |
402 | }
403 |
404 | __device__ inline int FindLeaf( KdTree_rp* dtree, int idx, float3 pt)
405 | {
406 | while ( dtree[idx].trinum <0 )
407 | {
408 | switch (dtree[idx].axis)
409 | {
410 | case 0:
411 | if ( pt.x < dtree[idx].splitpos ) idx = dtree[idx].left;
412 | else idx = dtree[idx].right;
413 | break;
414 | case 1:
415 | if ( pt.y < dtree[idx].splitpos ) idx = dtree[idx].left;
416 | else idx = dtree[idx].right;
417 | break;
418 | case 2:
419 | if ( pt.z < dtree[idx].splitpos ) idx = dtree[idx].left;
420 | else idx = dtree[idx].right;
421 | break;
422 | }
423 | }
424 |
425 | return idx;
426 |
427 | }
428 |
429 | __device__ inline void FindIntersectedTriangle( float3 p, float3 dir, int& intersectedTri, float3& intersectedPt, float* dVtxBuf, int* dTriVtxBuf, KdTree_rp* dtree)
430 | {
431 | float tmin = 1e6;
432 | float t = tmin;
433 | intersectedTri = -1;
434 | float3 p0;
435 | float3 p1;
436 | float3 p2;
437 | float3 pt;
438 | int idx0;
439 |
440 | // int stack[2*TREEDEPTH];
441 |
442 | int idx = 0;
443 | int face = -1;
444 | if ( (p.x>dtree[0].xmax) || ( p.y > dtree[0].ymax) || (p.z > dtree[0].zmax)
445 | || ( p.x 1e-5) && ( t < tmin ) )
489 | {
490 | intersectedTri = idx;
491 | intersectedPt.x = p.x + t*dir.x;
492 | intersectedPt.y = p.y + t*dir.y;
493 | intersectedPt.z = p.z + t*dir.z;
494 | tmin = t;
495 | }
496 |
497 | }
498 |
499 | if ( intersectedTri >= 0 ) return;
500 |
501 | IntersectBox2(p, dir, dtree, idx, pt, face );
502 | idx = dtree[idx].rope[face];
503 | if ( (face==1) || (face==3) || (face==5) ) face--;
504 | else face++;
505 | */
506 |
507 | face = -2;
508 | pt = p;
509 | }
510 |
511 | if ( (face >= 0) || (face==-2) )
512 | while ( (intersectedTri<0) && (idx>=0) )
513 | {
514 | idx = FindLeaf( dtree, idx, pt );
515 |
516 | // printf("%d\n",idx);
517 | if (dtree[idx].trinum>0) // leaf node
518 | {
519 | // printf("leaf node:%d\n", idx);
520 | for ( int i = 0; i < dtree[idx].trinum; i++ )
521 | {
522 | // printf("%d ", dtree[idx].tri[i]);
523 |
524 | idx0 = dTriVtxBuf[3*dtree[idx].tri[i]];
525 |
526 | p0.x = dVtxBuf[3*idx0];
527 | p0.y = dVtxBuf[3*idx0+1];
528 | p0.z = dVtxBuf[3*idx0+2];
529 |
530 | idx0 = dTriVtxBuf[3*dtree[idx].tri[i]+1];
531 | p1.x = dVtxBuf[3*idx0];
532 | p1.y = dVtxBuf[3*idx0+1];
533 | p1.z = dVtxBuf[3*idx0+2];
534 |
535 | idx0 = dTriVtxBuf[3*dtree[idx].tri[i]+2];
536 | p2.x = dVtxBuf[3*idx0];
537 | p2.y = dVtxBuf[3*idx0+1];
538 | p2.z = dVtxBuf[3*idx0+2];
539 |
540 | t = TestSingleIntersection( p, dir ,p0, p1, p2);
541 |
542 | if ( (t > 1e-5) && ( t < tmin ) )
543 | {
544 | intersectedTri = dtree[idx].tri[i];
545 | intersectedPt.x = p.x + t*dir.x;
546 | intersectedPt.y = p.y + t*dir.y;
547 | intersectedPt.z = p.z + t*dir.z;
548 | tmin = t;
549 | }
550 |
551 | }
552 | // printf("\n");
553 | }
554 |
555 | if ( intersectedTri >= 0 ) break;
556 |
557 | // second intersected point of the current leaf node
558 | // face: the intersected face of the current leaf ndoe
559 | IntersectBox(pt, dir, dtree, idx, pt, face, face);
560 |
561 | // next node adjacent to the leaf
562 | idx = dtree[idx].rope[face];
563 | if ( (face==1) || (face==3) || (face==5) ) face --;
564 | else face++;
565 | }
566 | }
567 |
568 | __device__ inline uchar4 computeColor( int depth, float3 p, float3 dir, float* dVtxBuf, int* dTriVtxBuf, float* dNormal, float* dLS, KdTree_rp* dtree )
569 | {
570 | int tri = -1;
571 | float3 intersectPt;
572 | int triOnTheWay = -1;
573 | float3 intersectPtOnTheWay;
574 |
575 | float3 tempDir;
576 |
577 |
578 | FindIntersectedTriangle(p,dir, tri, intersectPt, dVtxBuf, dTriVtxBuf, dtree);
579 |
580 | uchar4 color;
581 | color.w= 0;
582 | color.x=0; color.y = 0; color.z= 0;
583 |
584 | int i =0;
585 |
586 | float l;
587 | l = sqrt( dir.x * dir.x + dir.y*dir.y + dir.z*dir.z);
588 | dir.x = dir.x /l;
589 | dir.y = dir.y / l;
590 | dir.z = dir.z /l;
591 |
592 | if ( tri < 0) // no intersection, return sandbox color
593 | {
594 | int face = -1;
595 |
596 | float3 pt;
597 | if ( fabs(dir.y)>0)
598 | {
599 | l = (-1.0 -p.y )/ dir.y;
600 | pt.x = p.x + l*dir.x;
601 | pt.y = -1.0;
602 | pt.z = p.z + l * dir.z;
603 | if ( (pt.x >=-1.0) && (pt.x<=1.0) && (pt.z>=-1.0) && (pt.z<=1.0) )
604 | {
605 | face = 2; // ymin
606 | if ( depth == 0)
607 | {
608 | tempDir.x = dir.x;
609 | tempDir.y = -dir.y;
610 | tempDir.z = dir.z;
611 | color = computeColor(depth+1,pt, tempDir, dVtxBuf, dTriVtxBuf, dNormal, dLS, dtree);
612 | color.x = color.x * 0.8;
613 | color.y = color.y * 0.8;
614 | color.z = color.z * 0.8;
615 | }
616 | }
617 | }
618 |
619 | if ( fabs(dir.y)>0)
620 | {
621 | l = (1.0 -p.y )/ dir.y;
622 | pt.x = p.x + l*dir.x;
623 | pt.y = 1.0;
624 | pt.z = p.z + l * dir.z;
625 | if ( (pt.x >=-1.0) && (pt.x<=1.0) && (pt.z>=-1.0) && (pt.z<=1.0) )
626 | face = 3; // ymax
627 | }
628 |
629 | if ( fabs(dir.x)>0)
630 | {
631 | l = (-1.0 - p.x ) /dir.x;
632 | pt.x = -1.0;
633 | pt.y = p.y + l * dir.y;
634 | pt.z = p.z + l * dir.z;
635 | if ( (pt.y>=-1.0) && (pt.y<=1.0) && (pt.z>=-1.0) && (pt.z<=1.0))
636 | face = 0; // xmin
637 | }
638 |
639 | if ( fabs(dir.x)>0)
640 | {
641 | l = (1.0 - p.x ) /dir.x;
642 | pt.x = 1.0;
643 | pt.y = p.y + l * dir.y;
644 | pt.z = p.z + l * dir.z;
645 | if ( (pt.y>=-1.0) && (pt.y<=1.0) && (pt.z>=-1.0) && (pt.z<=1.0))
646 | face = 1; // xmax
647 | }
648 |
649 | if ( fabs(dir.z)>0)
650 | {
651 | l = (1.0 - p.z ) /dir.z;
652 | pt.x = p.x + l*dir.x;
653 | pt.y = p.y + l*dir.y;
654 | pt.z = 1.0;
655 | if ( (pt.x>=-1.0) && (pt.x<=1.0) && (pt.y>=-1.0) && (pt.y<=1.0) )
656 | face = 5; // zmax
657 | }
658 |
659 | switch (face)
660 | {
661 | case 2:
662 | break;
663 | case 3:
664 | color.x = 0;
665 | color.y = 0;
666 | color.z = 0;
667 | break;
668 | case 0:
669 | color.x = 10;
670 | color.y = 100;
671 | color.z = 10;
672 | break;
673 | case 5:
674 | color.x = 100;
675 | color.y = 10;
676 | color.z = 10;
677 | break;
678 | case 1:
679 | color.x = 10;
680 | color.y = 10;
681 | color.z =100;
682 | break;
683 | default:
684 | break;
685 | }
686 |
687 | }
688 | else // intersected with a triangle
689 | {
690 | // check shadow or illumination
691 | triOnTheWay = -1;
692 | for ( i = 0; i < nLS; i++) // for every light source
693 | {
694 |
695 | tempDir.x = dLS[3*i] - intersectPt.x;
696 | tempDir.y = dLS[3*i+1] - intersectPt.y;
697 | tempDir.z = dLS[3*i+2] - intersectPt.z;
698 |
699 | l = sqrt(tempDir.x*tempDir.x + tempDir.y * tempDir.y + tempDir.z * tempDir.z);
700 | tempDir.x = tempDir.x / l;
701 | tempDir.y = tempDir.y / l;
702 | tempDir.z = tempDir.z / l;
703 |
704 |
705 | l = tempDir.x*dNormal[3*tri] + tempDir.y*dNormal[3*tri+1] + tempDir.z*dNormal[3*tri+2];
706 |
707 | if ( l < -1e-5) continue;
708 |
709 | FindIntersectedTriangle( intersectPt, tempDir, triOnTheWay, intersectPtOnTheWay, dVtxBuf, dTriVtxBuf, dtree);
710 |
711 | if ( triOnTheWay >= 0 ) // in a shadow of triangle "triOnTheWay"
712 | {
713 | /*
714 | color.w=0;
715 | color.x = 100;
716 | color.y = 0;
717 | color.z = 0;
718 | */
719 | }
720 | else // directly illuminated by the current light source
721 | {
722 | color.w = 0;
723 |
724 | if ( i == 0) l = 0.3;
725 | else if ( i == 1) l = 0.1;
726 |
727 | color.x = color.x+ l* objectColor.x ;
728 | color.y = color.y+ l* objectColor.y ;
729 | color.z = color.z+ l *objectColor.z ;
730 | }
731 | }
732 |
733 | l = 0.5*fabs( dir.x*dNormal[3*tri] + dir.y *dNormal[3*tri+1] + dir.z*dNormal[3*tri+2] );
734 | color.x = color.x+ l * objectColor.x;
735 | color.y = color.y + l* objectColor.y;
736 | color.z = color.z + l*objectColor.z;
737 | /*
738 | color.x = objectColor.x;
739 | color.y = objectColor.y;
740 | color.z = objectColor.z;
741 | */
742 | }
743 |
744 |
745 | return color;
746 | }
747 |
748 | __global__ void TracingKernel( uchar4* pos, float* dVtxBuf, int* dTriVtxBuf, float* dNormal, float* dLS, KdTree_rp* dtree )
749 | {
750 | int pixelx = blockIdx.x*blocksize_x + threadIdx.x;
751 | int pixely = blockIdx.y*blocksize_y + threadIdx.y;
752 |
753 | float3 dir;
754 | // dir.x = dwindown.x + hx*pixelx - dcamPos.x;
755 | // dir.y = dwindown.y + hy*pixely - dcamPos.y;
756 | dir.x = dwinup.x - hx*pixelx - dcamPos.x;
757 | dir.y = dwinup.y- hy*pixely - dcamPos.y;
758 | dir.z = dwindown.z - dcamPos.z;
759 |
760 | pos[ pixely*dwinwidth + pixelx ] = computeColor(0, dcamPos, dir, dVtxBuf, dTriVtxBuf, dNormal, dLS, dtree);
761 |
762 | __syncthreads();
763 | }
764 |
765 | __global__ void TracingKernel_test( uchar4* pos, float* dVtxBuf, int* dTriVtxBuf, float* dNormal, float* dLS, KdTree_rp* dtree )
766 | {
767 | int pixelx = blockIdx.x*blocksize_x + threadIdx.x;
768 | int pixely = blockIdx.y*blocksize_y + threadIdx.y;
769 |
770 | float3 dir;
771 | // dir.x = dwindown.x + hx*pixelx - dcamPos.x;
772 | // dir.y = dwindown.y + hy*pixely - dcamPos.y;
773 | dir.x = -0.1;
774 | dir.y = -1;
775 | dir.z = 1;
776 |
777 | pos[ pixely*dwinwidth + pixelx ] = computeColor(0, dcamPos, dir, dVtxBuf, dTriVtxBuf, dNormal, dLS, dtree);
778 |
779 | __syncthreads();
780 | }
781 |
782 | // Be sure to launch after setting const. and tex. memory
783 | extern "C" void launch_kernel( uchar4* pos, CScene& scene )
784 | {
785 | dim3 dimBlock(blocksize_x, blocksize_y); // must match the block size TracingKernel uses for pixel indexing
786 |
787 | dim3 dimGrid;
788 | dimGrid.x = scene.m_winwidth / dimBlock.x;
789 | dimGrid.y = scene.m_winheight / dimBlock.y;
790 |
791 | TracingKernel<<<dimGrid, dimBlock>>>(pos, scene.dVtxBuf, scene.dTriVtxBuf, scene.dNormal, scene.dLS, scene.dtree);
792 |
793 | cudaError_t error = cudaGetLastError();
794 | if (error != cudaSuccess )
795 | {
796 | printf("Cuda Error:%s\n", cudaGetErrorString(error));
797 | exit(-1);
798 | }
799 | }
800 |
801 | extern "C" void launch_kernel_test( uchar4* pos, CScene& scene )
802 | {
803 | dim3 dimBlock(blocksize_x,blocksize_y);
804 | // dim3 dimBlock(1);
805 |
806 |
807 | dim3 dimGrid;
808 | dimGrid.x = scene.m_winwidth / dimBlock.x;
809 | dimGrid.y = scene.m_winheight / dimBlock.y;
810 |
811 | uchar4* dpos;
812 |
813 | cudaMalloc( (void**) &dpos, sizeof(uchar4)* scene.m_winwidth*scene.m_winheight );
814 | cudaError_t error = cudaGetLastError();
815 | if (error != cudaSuccess )
816 | {
817 | printf("Cuda Error:%s\n", cudaGetErrorString(error));
818 | exit(-1);
819 | }
820 |
821 | cudaMemcpy( dpos, pos, sizeof(uchar4)*scene.m_winwidth*scene.m_winheight, cudaMemcpyHostToDevice);
822 | error = cudaGetLastError();
823 | if (error != cudaSuccess )
824 | {
825 | printf("Cuda Error:%s\n", cudaGetErrorString(error));
826 | exit(-1);
827 | }
828 |
829 | printf("Tracing...\n");
830 | float GPU_time = 0;
831 | cudaEvent_t start, stop;
832 | cudaEventCreate( &start );
833 | cudaEventCreate( &stop );
834 | cudaEventRecord( start, 0 );
835 |
836 | TracingKernel<<<dimGrid, dimBlock>>>(dpos, scene.dVtxBuf, scene.dTriVtxBuf, scene.dNormal, scene.dLS, scene.dtree);
837 | // TracingKernel_test<<<dimGrid, dimBlock>>>(dpos, scene.dVtxBuf, scene.dTriVtxBuf, scene.dNormal, scene.dLS, scene.dtree);
838 |
839 | cudaEventRecord( stop, 0);
840 | cudaEventSynchronize( stop );
841 | cudaEventElapsedTime( &GPU_time, start, stop);
842 | printf("GPU time:%f\n", GPU_time);
843 |
844 | printf("%d %d\n", scene.m_winwidth, scene.m_winheight);
845 | printf("DeviceToHost..\n");
846 | cudaMemcpy( pos, dpos, sizeof(uchar4)*scene.m_winwidth*scene.m_winheight, cudaMemcpyDeviceToHost);
847 | printf("DeviceToHostDone.\n");
848 |
849 | printf("Free...\n");
850 | cudaFree(dpos);
851 | printf("Free done.\n");
852 |
853 | error = cudaGetLastError();
854 | if (error != cudaSuccess )
855 | {
856 | printf("Cuda Error:%s\n", cudaGetErrorString(error));
857 | exit(-1);
858 | }
859 | }
860 |
861 | extern "C" void cudaSetConstantMem( CScene& scene )
862 | {
863 | int a = scene.TriVtxBuf[0][0];
864 | float3 camPos;
865 | camPos.x = scene.cameraPos.x;
866 | camPos.y = scene.cameraPos.y;
867 | camPos.z = scene.cameraPos.z;
868 | cudaMemcpyToSymbol(dcamPos, &camPos, sizeof(float3));
869 |
870 | float3 wincenterPos;
871 | wincenterPos.x = scene.windowCenter.x;
872 | wincenterPos.y = scene.windowCenter.y;
873 | wincenterPos.z = scene.windowCenter.z;
874 | cudaMemcpyToSymbol(dwincenterPos, &wincenterPos, sizeof(float3));
875 |
876 | float3 winup;
877 | winup.x = scene.window_diagup.x;
878 | winup.y = scene.window_diagup.y;
879 | winup.z = scene.window_diagup.z;
880 | cudaMemcpyToSymbol(dwinup, &winup, sizeof(float3));
881 |
882 | float3 windown;
883 | windown.x = scene.window_diagdown.x;
884 | windown.y = scene.window_diagdown.y;
885 | windown.z = scene.window_diagdown.z;
886 | cudaMemcpyToSymbol(dwindown, &windown, sizeof(float3));
887 |
888 | float3 boxup;
889 | boxup.x = scene.m_sandbox.m_diagup.x;
890 | boxup.y = scene.m_sandbox.m_diagup.y;
891 | boxup.z = scene.m_sandbox.m_diagup.z;
892 | cudaMemcpyToSymbol(dboxup, &boxup, sizeof(float3));
893 |
894 | float3 boxdown;
895 | boxdown.x = scene.m_sandbox.m_diagdown.x;
896 | boxdown.y = scene.m_sandbox.m_diagdown.y;
897 | boxdown.z = scene.m_sandbox.m_diagdown.z;
898 | cudaMemcpyToSymbol(dboxdown, &boxdown, sizeof(float3));
899 |
900 | float3 objColor;
901 | objColor.x = scene.objectColor.r;
902 | objColor.y = scene.objectColor.g;
903 | objColor.z = scene.objectColor.b;
904 | cudaMemcpyToSymbol(objectColor, &objColor, sizeof(float3));
905 |
906 | float hosthx = (winup.x-windown.x) / scene.m_winwidth;
907 | float hosthy = (winup.y - windown.y) / scene.m_winheight;
908 | cudaMemcpyToSymbol(hx, &hosthx, sizeof(hosthx));
909 | cudaMemcpyToSymbol(hy, &hosthy, sizeof(hosthy));
910 |
911 | cudaMemcpyToSymbol(dwinwidth, &scene.m_winwidth, sizeof(scene.m_winwidth));
912 | cudaMemcpyToSymbol(dwinheight, &scene.m_winheight, sizeof(scene.m_winheight));
913 |
914 | cudaMemcpyToSymbol(nTri, &scene.nTri, sizeof(unsigned int) );
915 | cudaMemcpyToSymbol(nVtx, &scene.nVtx, sizeof(unsigned int) ); // copy the value's size, not the size of a pointer to it
916 | cudaMemcpyToSymbol(nLS, &scene.nLightSource, sizeof(unsigned int) );
917 |
918 | /*
919 | cudaMemcpyToSymbol( dVtxBuf, &scene.dVtxBuf, sizeof( scene.dVtxBuf ) );
920 | cudaMemcpyToSymbol( dTriVtxBuf, &scene.dTriVtxBuf, sizeof( scene.dTriVtxBuf ) );
921 | cudaMemcpyToSymbol( dNormal, &scene.dNormal, sizeof(scene.dNormal) );
922 | cudaMemcpyToSymbol( dLS, &scene.dLS, sizeof(scene.dLS) );
923 | cudaMemcpyToSymbol( dsandboxColor, &scene.dsandboxColor, sizeof(scene.dsandboxColor) );
924 | cudaMemcpyToSymbol( dsandboxIsReflective, &scene.dsandboxIsReflective, sizeof(scene.dsandboxIsReflective) );
925 | */
926 | }
927 |
928 | extern "C" void cudaSceneMalloc( CScene& scene )
929 | {
930 | cudaMalloc( (void**) & scene.dVtxBuf, sizeof(float)*scene.nVtx*3);
931 | cudaMalloc( (void**) & scene.dTriVtxBuf, sizeof(int)*scene.nTri*3);
932 | cudaMalloc((void**) &scene.dNormal, sizeof(float)*scene.nTri*3);
933 | cudaMalloc((void**) &scene.dLS, sizeof(float)*3*scene.nLightSource);
934 | cudaMalloc((void**) &scene.dsandboxColor, sizeof(float)*3*5);
935 | cudaMalloc((void**) &scene.dsandboxIsReflective, sizeof(unsigned int)*5);
936 | cudaMalloc((void**) &scene.dtree, sizeof(KdTree_rp)*scene.treesize);
937 | }
938 |
939 | extern "C" void cudaBindToTexture( unsigned int nVtx, unsigned int nTri, unsigned int nLS, CScene& scene )
940 | {
941 | cudaBindTexture(0, texref_VtxBuf, scene.dVtxBuf, sizeof(float)*nVtx*3);
942 | cudaBindTexture(0, texref_TriVtx, scene.dTriVtxBuf, sizeof(int)*nTri*3);
943 | cudaBindTexture(0, texref_Normal, scene.dNormal, sizeof(float)*3*nTri);
944 | cudaBindTexture(0, texref_LS, scene.dLS, sizeof(float)*3*nLS);
945 | cudaBindTexture(0, texref_sandboxColor, scene.dsandboxColor, sizeof(float)*3*5);
946 | cudaBindTexture(0, texref_sandboxIsReflective, scene.dsandboxIsReflective, sizeof(unsigned int)*5);
947 | }
948 |
949 | extern "C" void cudaPassSceneToGlobalMem( CScene& scene, float* pVtxBuf, int* pTriVtxBuf, float* pNormal, float* pLS, float* psandboxColor, unsigned int* psandboxIsReflective, KdTree_rp* ptree)
950 | {
951 | if (scene.dVtxBuf) cudaMemcpy( scene.dVtxBuf, pVtxBuf, sizeof(float)*scene.nVtx*3, cudaMemcpyHostToDevice);
952 |
953 | if (scene.dTriVtxBuf) cudaMemcpy( scene.dTriVtxBuf, pTriVtxBuf, sizeof(int)*scene.nTri*3, cudaMemcpyHostToDevice);
954 |
955 | if (scene.dNormal) cudaMemcpy( scene.dNormal, pNormal, sizeof(float)*scene.nTri*3, cudaMemcpyHostToDevice);
956 |
957 | if (scene.dLS) cudaMemcpy( scene.dLS, pLS, sizeof(float)*3* scene.nLightSource, cudaMemcpyHostToDevice);
958 |
959 | if (scene.dsandboxColor) cudaMemcpy( scene.dsandboxColor, psandboxColor, sizeof(float3)*5, cudaMemcpyHostToDevice );
960 |
961 | if (scene.dsandboxIsReflective) cudaMemcpy( scene.dsandboxIsReflective, psandboxIsReflective, sizeof(unsigned int)*5, cudaMemcpyHostToDevice);
962 |
963 | if (scene.dtree) cudaMemcpy( scene.dtree, ptree, sizeof(KdTree_rp)*scene.treesize, cudaMemcpyHostToDevice );
964 | }
965 |
966 | // called when entire application ends
967 | extern "C" void cudaFreeTextureResources()
968 | {
969 | cudaUnbindTexture(texref_VtxBuf);
970 | cudaUnbindTexture(texref_Normal);
971 | cudaUnbindTexture(texref_TriVtx);
972 | cudaUnbindTexture(texref_LS);
973 | }
974 |
975 | extern "C" void cudaFreeGlobalMemory( CScene& scene)
976 | {
977 | if ( scene.dVtxBuf ) cudaFree(scene.dVtxBuf);
978 | if ( scene.dTriVtxBuf ) cudaFree(scene.dTriVtxBuf);
979 | if ( scene.dNormal ) cudaFree(scene.dNormal);
980 | if ( scene.dLS ) cudaFree(scene.dLS);
981 | if ( scene.dsandboxColor ) cudaFree(scene.dsandboxColor);
982 | if ( scene.dsandboxIsReflective ) cudaFree(scene.dsandboxIsReflective);
983 | if ( scene.dtree ) cudaFree(scene.dtree);
984 | }
985 |
986 |
987 |
988 |
989 |
990 |
991 |
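The allocation sizes in cudaSceneMalloc (sizeof(float)*nVtx*3 and sizeof(int)*nTri*3) imply that the host passes flat arrays to cudaPassSceneToGlobalMem. The following is a minimal sketch of that packing step, assuming an interleaved x,y,z layout. `Point3`, `flattenPoints` and `flattenTriangles` are hypothetical names used for illustration and are not part of this repository.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CPoint3D: three floats per point.
struct Point3 { float x, y, z; };

// Pack points as x0,y0,z0,x1,y1,z1,... so the result holds
// 3 * points.size() floats, matching sizeof(float)*nVtx*3 on the device.
std::vector<float> flattenPoints(const std::vector<Point3>& points)
{
    std::vector<float> flat;
    flat.reserve(points.size() * 3);
    for (std::size_t i = 0; i < points.size(); ++i) {
        flat.push_back(points[i].x);
        flat.push_back(points[i].y);
        flat.push_back(points[i].z);
    }
    assert(flat.size() == points.size() * 3);
    return flat;
}

// Pack triangle vertex indices the same way: 3 ints per triangle.
std::vector<int> flattenTriangles(const std::vector<std::vector<int> >& tris)
{
    std::vector<int> flat;
    flat.reserve(tris.size() * 3);
    for (std::size_t i = 0; i < tris.size(); ++i) {
        assert(tris[i].size() == 3);
        for (int k = 0; k < 3; ++k) flat.push_back(tris[i][k]);
    }
    return flat;
}
```

The pointer returned by `data()` (or `&flat[0]`) on either vector could then be handed to a host-to-device copy of the matching byte size.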
--------------------------------------------------------------------------------
/src/main.cpp:
--------------------------------------------------------------------------------
1 | #include <cstdio>
2 | #include <cstdlib>
3 | #include "scene.hpp"
4 |
5 | #include <opencv2/core/core.hpp>
6 | #include <opencv2/highgui/highgui.hpp>
7 |
8 | using namespace cv;
9 |
10 | CScene scene;
11 |
12 | extern "C" void launch_kernel_test( char*, CScene& scene );
13 | extern "C" void cudaFreeTextureResources();
14 |
15 | int main( int argc,char** argv )
16 | {
17 | scene.LoadObjects(argc, argv);
18 | scene.WriteObjects();
19 |
20 | printf("malloc...\n");
21 | scene.SceneMalloc();
22 | printf("malloc done.\n");
23 |
24 | printf("copy to global mem...\n");
25 | scene.PassSceneToGlobalMem();
26 | printf("copy done.\n");
27 |
28 | printf("constant mem...\n");
29 | scene.SetConstantMem();
30 | printf("constant mem copy done.\n");
31 |
32 | char* pos = (char*)malloc( sizeof(char)*4* scene.m_winwidth * scene.m_winheight );
33 | for ( int i = 0; i < scene.m_winwidth; i++ )
34 | for ( int j = 0; j < scene.m_winheight; j++ )
35 | for ( int k =0; k < 4; k++)
36 | pos[ (i*scene.m_winheight + j)*4 + k ] = 0;
37 |
38 | launch_kernel_test( pos, scene );
39 |
40 | //free the global memory associated with scene
41 | scene.FreeGlobalMemory( );
42 |
43 | cudaFreeTextureResources();
44 |
45 | Mat image( scene.m_winheight, scene.m_winwidth, CV_8UC3, Scalar(0,0,0));
46 |
47 | for ( int i = 0; i < scene.m_winheight; i++ )
48 | for ( int j = 0; j < scene.m_winwidth; j++ )
49 | {
50 | image.at<Vec3b>(i,j)[0] = pos[ (i*scene.m_winwidth+ j)*4 + 2 ];
51 | image.at<Vec3b>(i,j)[1] = pos[ (i*scene.m_winwidth+ j)*4 + 1 ];
52 | image.at<Vec3b>(i,j)[2] = pos[ (i*scene.m_winwidth + j)*4 + 0 ];
53 | }
54 |
55 | imwrite("im_testK.png", image);
56 | free(pos);
57 |
58 | return 0;
59 | }
60 |
61 |
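The copy loop in main.cpp reads a buffer with 4 bytes per pixel and writes channels in reverse order into a 3-channel image. The sketch below shows the same index arithmetic without OpenCV, assuming the kernel stores each pixel as R, G, B plus one unused byte in row-major order. `rgbaToBgr` is a hypothetical helper, not a function from this repository.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Convert a row-major RGBA buffer (4 bytes per pixel) into a row-major
// BGR buffer (3 bytes per pixel), dropping the fourth channel.
std::vector<unsigned char> rgbaToBgr(const std::vector<unsigned char>& rgba,
                                     std::size_t width, std::size_t height)
{
    assert(rgba.size() == width * height * 4);
    std::vector<unsigned char> bgr(width * height * 3);
    for (std::size_t row = 0; row < height; ++row) {
        for (std::size_t col = 0; col < width; ++col) {
            const std::size_t pixel = row * width + col;
            bgr[pixel * 3 + 0] = rgba[pixel * 4 + 2]; // blue
            bgr[pixel * 3 + 1] = rgba[pixel * 4 + 1]; // green
            bgr[pixel * 3 + 2] = rgba[pixel * 4 + 0]; // red
        }
    }
    return bgr;
}
```

Using unsigned char for the pixel bytes avoids sign surprises for channel values above 127, which a plain char buffer can introduce on platforms where char is signed.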
--------------------------------------------------------------------------------
/src/nvcc_compile.sh:
--------------------------------------------------------------------------------
1 | nvcc -arch=sm_21 -O3 -I /usr/include/opencv -I /usr/include/opencv2 main.cpp scene.cpp kernel.cu -lopencv_core -lopencv_highgui -o raytracer
2 |
--------------------------------------------------------------------------------
/src/scene.cpp:
--------------------------------------------------------------------------------
1 | #include <cstdio>
2 | #include "scene.hpp"
3 | #include <cstdlib>
4 | #include <cmath>
5 | #include <iostream>
6 | #include <vector>
7 | #include <algorithm>
8 |
9 |
10 | #include "kdtree.h"
11 |
12 | using namespace std;
13 |
14 |
15 |
16 |
17 | extern "C" void cudaSceneMalloc( CScene& );
18 |
19 | extern "C" void cudaBindToTexture( unsigned int nVtx, unsigned int nTri, unsigned int nLS, CScene& scene );
20 |
21 | extern "C" void cudaPassSceneToGlobalMem( CScene&, float* pVtxBuf, int* pTriVtxBuf, float* pNormal, float* pLS, float* psandboxColor, unsigned int* psandboxIsReflective, KdTree_rp* ptree);
22 |
23 | extern "C" void cudaSetConstantMem( CScene & );
24 |
25 | extern "C" void cudaFreeTextureResources( );
26 |
27 | extern "C" void cudaFreeGlobalMemory( CScene& );
28 |
29 | CPoint3D CPoint3D::operator- (CPoint3D param)
30 | {
31 | CPoint3D tmp;
32 | tmp.x = x - param.x;
33 | tmp.y = y - param.y;
34 | tmp.z = z - param.z;
35 | return tmp;
36 | }
37 |
38 | CPoint3D CPoint3D::operator+ (CPoint3D param)
39 | {
40 | CPoint3D tmp;
41 | tmp.x = x + param.x;
42 | tmp.y = y + param.y;
43 | tmp.z = z + param.z;
44 | return tmp;
45 | }
46 |
47 | float CPoint3D::operator* (CPoint3D param)
48 | {
49 | return x*param.x + y*param.y + z*param.z;
50 | }
51 |
52 | CPoint3D CPoint3D::operator^ (CPoint3D param)
53 | {
54 | CPoint3D tmp;
55 | tmp.x = y*param.z - z*param.y;
56 | tmp.y = z*param.x - x*param.z;
57 | tmp.z = x*param.y - y*param.x;
58 | return tmp;
59 | }
60 |
61 | void CPoint3D::normalize()
62 | {
63 | float l = sqrt(x*x+y*y+z*z);
64 | x = x/l;
65 | y = y/l;
66 | z = z/l;
67 | }
68 |
69 | CScene::CScene()
70 | {
71 | dVtxBuf = NULL;
72 | dTriVtxBuf = NULL;
73 | dLS = NULL;
74 | dNormal = NULL;
75 | dsandboxColor = NULL;
76 | dsandboxIsReflective = NULL;
77 | dtree = NULL;
78 | int scale = 1;
79 | // grey color for the object
80 | objectColor.r = 100;
81 | objectColor.g = 100;
82 | objectColor.b = 100;
83 |
84 | cameraPos.x = 0;
85 | cameraPos.y = 0;
86 | cameraPos.z = -2*scale;
87 |
88 | windowCenter.x = 0;
89 | windowCenter.y = 0;
90 | windowCenter.z = -1*scale;
91 |
92 | window_diagup.x = 1*scale;
93 | window_diagup.y = 1*scale;
94 | window_diagup.z = -1*scale;
95 |
96 | window_diagdown.x = -1*scale;
97 | window_diagdown.y= -1*scale;
98 | window_diagdown.z = -1*scale;
99 |
100 | m_sandbox.m_diagdown.x = -1*scale;
101 | m_sandbox.m_diagdown.y = -1*scale;
102 | m_sandbox.m_diagdown.z = -1*scale;
103 |
104 | m_sandbox.m_diagup.x = 1*scale;
105 | m_sandbox.m_diagup.y = 1*scale;
106 | m_sandbox.m_diagup.z = 1*scale;
107 |
108 | m_winwidth = 800;
109 | m_winheight = 600;
110 |
111 | CPoint3D LS;
112 | LS.x = 1*scale;
113 | LS.y = -1*scale;
114 | LS.z = -1*scale;
115 | lightSource.push_back(LS);
116 |
117 | LS.x = -1*scale;
118 | LS.y = -1*scale;
119 | LS.z = -1*scale;
120 | lightSource.push_back(LS);
121 |
122 | nLightSource = lightSource.size();
123 | }
124 |
125 | void CScene::LoadSandBox( int argc, char** argv )
126 | {
127 | }
128 |
129 | void CScene::CalculateNormal()
130 | {
131 | NormalBuf.resize(nTri);
132 | for ( int i = 0; i < nTri; i++ )
133 | {
134 | CPoint3D a = VtxBuf[TriVtxBuf[i][1]] - VtxBuf[TriVtxBuf[i][0]];
135 | a.x = VtxBuf[TriVtxBuf[i][1]].x - VtxBuf[TriVtxBuf[i][0]].x;
136 | a.y = VtxBuf[TriVtxBuf[i][1]].y- VtxBuf[TriVtxBuf[i][0]].y;
137 | a.z = VtxBuf[TriVtxBuf[i][1]].z - VtxBuf[TriVtxBuf[i][0]].z;
138 |
139 | CPoint3D b = VtxBuf[TriVtxBuf[i][2]] - VtxBuf[TriVtxBuf[i][0]];
140 | b.x = VtxBuf[TriVtxBuf[i][2]].x - VtxBuf[TriVtxBuf[i][0]].x;
141 | b.y = VtxBuf[TriVtxBuf[i][2]].y - VtxBuf[TriVtxBuf[i][0]].y;
142 | b.z = VtxBuf[TriVtxBuf[i][2]].z - VtxBuf[TriVtxBuf[i][0]].z;
143 | // if 0,1,2 are counter-clockwise labeled looking from outside
144 | // then the normal vector is outward pointing
145 | NormalBuf[i].x = a.y*b.z - a.z*b.y;
146 | NormalBuf[i].y = a.z*b.x - a.x *b.z;
147 | NormalBuf[i].z = a.x*b.y - a.y*b.x;
148 |
149 | NormalBuf[i].normalize();
150 | }
151 | }
152 |
153 | void CScene::CalcBoundingBox()
154 | {
155 | xmin = 1e6;
156 | ymin = 1e6;
157 | zmin = 1e6;
158 | xmax = -1e6;
159 | ymax= -1e6;
160 | zmax = -1e6;
161 | for ( int i = 0; i < nVtx; i++)
162 | {
163 | if (VtxBuf[i].x < xmin) xmin = VtxBuf[i].x;
164 | if (VtxBuf[i].x > xmax) xmax = VtxBuf[i].x;
165 | if (VtxBuf[i].y < ymin) ymin = VtxBuf[i].y;
166 | if (VtxBuf[i].y > ymax) ymax = VtxBuf[i].y;
167 | if (VtxBuf[i].z < zmin) zmin = VtxBuf[i].z;
168 | if (VtxBuf[i].z > zmax) zmax = VtxBuf[i].z;
169 | }
170 | printf("x:%.2f~%.2f\n", xmin, xmax);
171 | printf("y:%.2f~%.2f\n", ymin, ymax);
172 | printf("z:%.2f~%.2f\n", zmin, zmax);
173 | }
174 |
175 | int CScene::CalcTreeSize(KdTree* node)
176 | {
177 | if ( !node ) return 0;
178 | else return 1+CalcTreeSize(node->pleft)+CalcTreeSize(node->pright);
179 | }
180 |
181 | /*
182 | bool CScene::cmpx( int i, int j )
183 | {
184 | CPoint3D p1 = VtxBuf[i];
185 | CPoint3D p2 = Vtxbuf[j];
186 | if (p1.x < p2.x ) return true;
187 | else return false;
188 | }
189 |
190 | bool CScene::cmpy( int i, int j )
191 | {
192 | CPoint3D p1 = VtxBuf[i];
193 | CPoint3D p2 = Vtxbuf[j];
194 | if ( p1.y < p2.y ) return true;
195 | else return false;
196 | }
197 |
198 | bool CScene::cmpz( int i, int j )
199 | {
200 | CPoint3D p1 = VtxBuf[i];
201 | CPoint3D p2 = Vtxbuf[j];
202 | if ( p1.z < p2.z ) return true;
203 | else return false;
204 | }
205 |
206 | vector ChooseTriSTLVersion( vector plist, float mid, int axis)
207 | {
208 | switch axis:
209 | case 0:
210 | sort(plist.begin(), plist.end(), cmpx );
211 | vector::iterator it = upper_bound( plist.begin(), plist.end(), cmpx(axis) );
212 | case 1:
213 | sort(plist.begin(), plist.end(), cmpy );
214 | vector::iterator it = upper_bound( plist.begin(), plist.end(), cmpy(axis) );
215 | case 2:
216 | sort(plist.begin(), plist.end(), cmpz );
217 | vector::iterator it = upper_bound( plist.begin(), plist.end(), cmpz(axis) );
218 |
219 | vector chosen;
220 | chosen.resize( it - plist.begin()+1);
221 | copy( plist.begin(), it, chosen.begin() );
222 |
223 | return chosen;
224 | }
225 | */
226 |
227 | bool IsInBox( CPoint3D p, float xmin, float xmax, float ymin, float ymax, float zmin, float zmax )
228 | {
229 | if ( (p.x>=xmin) && (p.x<=xmax) && (p.y>=ymin) && (p.y<=ymax) && (p.z>=zmin) && (p.z<=zmax) )
230 | return true;
231 | else return false;
232 | }
233 |
234 | vector<int> CScene::ChooseTri( vector<int> trilist, float xmin, float xmax, float ymin, float ymax, float zmin, float zmax)
235 | {
236 | vector<int> chosen;
237 | chosen.clear();
238 |
239 | for ( int i = 0; i < trilist.size(); i++)
240 | {
241 | int idx0 = TriVtxBuf[ trilist[i] ][0];
242 | int idx1 = TriVtxBuf[ trilist[i] ][1];
243 | int idx2 = TriVtxBuf[ trilist[i] ][2];
244 |
245 | CPoint3D p0 = VtxBuf[ idx0 ];
246 | CPoint3D p1 = VtxBuf[ idx1 ];
247 | CPoint3D p2 = VtxBuf[ idx2 ];
248 |
249 | if ( IsInBox(p0, xmin, xmax, ymin, ymax, zmin, zmax) ||
250 | IsInBox(p1, xmin, xmax, ymin, ymax, zmin, zmax) ||
251 | IsInBox(p2, xmin, xmax, ymin, ymax, zmin, zmax) )
252 | chosen.push_back( trilist[i] );
253 | }
254 |
255 | return chosen;
256 | }
257 |
258 | KdTree* CScene::ConstructKdTree(int axis, int depth, vector<int> trilist, float xmin, float xmax, float ymin, float ymax, float zmin, float zmax)
259 | {
260 | // if (trilist.size() == 0 ) return NULL;
261 |
262 | KdTree* node= new KdTree;
263 | node->xmin = xmin;
264 | node->xmax = xmax;
265 | node->ymin = ymin;
266 | node->ymax = ymax;
267 | node->zmin = zmin;
268 | node->zmax = zmax;
269 |
270 | // for ( int i =0; i< 6; i++) node->rope[i] = NULL;
271 |
272 | if ( (trilist.size() <= LEAFSIZE) )
273 | {
274 | node->trinum = trilist.size();
275 | // node->tri = (int*)malloc( sizeof(int)*trilist.size() );
276 | for ( int i = 0; i < trilist.size(); i++)
277 | node->tri[i] = trilist[i];
278 | node->axis = axis;
279 | node->splitpos = -1;
280 | node->pleft = NULL;
281 | node->pright = NULL;
282 | }
283 | else
284 | {
285 | node->trinum = -1;
286 | // node->tri = NULL;
287 | float mid = 0;
288 | vector<int> leftptlist;
289 | leftptlist.clear();
290 | vector<int> rightptlist;
291 | rightptlist.clear();
292 | switch (axis)
293 | {
294 | case 0:
295 | mid = (xmin+xmax)/2;
296 | leftptlist = ChooseTri( trilist, xmin, mid, ymin, ymax, zmin, zmax);
297 | rightptlist = ChooseTri( trilist, mid, xmax, ymin, ymax, zmin, zmax);
298 |
299 | node->pleft = ConstructKdTree( (axis+1)%3, depth+1, leftptlist, xmin,mid, ymin, ymax, zmin, zmax);
300 | node->pright = ConstructKdTree( (axis+1)%3, depth+1, rightptlist, mid, xmax, ymin, ymax, zmin, zmax);
301 | break;
302 | case 1:
303 | mid = (ymin+ymax)/2;
304 | leftptlist = ChooseTri( trilist, xmin, xmax, ymin, mid, zmin, zmax);
305 | rightptlist = ChooseTri( trilist, xmin, xmax, mid, ymax, zmin, zmax);
306 |
307 | node->pleft = ConstructKdTree( (axis+1)%3, depth+1, leftptlist, xmin,xmax, ymin, mid, zmin, zmax);
308 | node->pright = ConstructKdTree( (axis+1)%3, depth+1, rightptlist, xmin, xmax, mid, ymax, zmin, zmax);
309 | break;
310 | case 2:
311 | mid = (zmin+zmax)/2;
312 | leftptlist = ChooseTri( trilist, xmin, xmax, ymin, ymax, zmin, mid);
313 | rightptlist = ChooseTri( trilist, xmin, xmax, ymin, ymax, mid, zmax);
314 |
315 | node->pleft = ConstructKdTree( (axis+1)%3, depth+1, leftptlist, xmin, xmax, ymin, ymax, zmin, mid);
316 | node->pright = ConstructKdTree( (axis+1)%3, depth+1, rightptlist, xmin, xmax, ymin, ymax, mid, zmax);
317 | break;
318 | default:
319 | break;
320 | }
321 | node->axis = axis;
322 | node->splitpos = mid;
323 | }
324 |
325 | return node;
326 | }
327 |
328 | bool CScene::SplitPlaneAboveBox( int axis, float splitpos, int idx)
329 | {
330 | switch (axis)
331 | {
332 | case 0:
333 | if ( splitpos >= tree_rp[idx].xmax) return true;
334 | else return false;
335 | case 1:
336 | if ( splitpos >= tree_rp[idx].ymax) return true;
337 | else return false;
338 | case 2:
339 | if ( splitpos >= tree_rp[idx].zmax) return true;
340 | else return false;
341 | default:
342 | return true;
343 | }
344 | }
345 |
346 | bool CScene::SplitPlaneBelowBox( int axis, float splitpos, int idx)
347 | {
348 | switch (axis)
349 | {
350 | case 0:
351 | if ( splitpos <= tree_rp[idx].xmin) return true;
352 | else return false;
353 | case 1:
354 | if ( splitpos <= tree_rp[idx].ymin) return true;
355 | else return false;
356 | case 2:
357 | if ( splitpos <= tree_rp[idx].zmin) return true;
358 | else return false;
359 | default:
360 | return true;
361 | }
362 | }
363 |
364 | void CScene::OptimizeRope( int idx, int &rp, int face)
365 | {
366 | /*
367 | while ( tree_rp[rp].trinum < 0) // not leaf
368 | {
369 | if ( face/2 == tree_rp[rp].axis ) break;
370 |
371 | if ( SplitPlaneAboveBox( tree_rp[rp].axis, tree_rp[rp].splitpos, idx) )
372 | rp = tree_rp[rp].left;
373 | else if ( SplitPlaneBelowBox( tree_rp[rp].axis, tree_rp[rp].splitpos, idx) )
374 | rp = tree_rp[rp].right;
375 | else break;
376 |
377 | if ( rp<0) break;
378 |
379 | }
380 | */
381 |
382 | }
383 |
384 | void CScene::BuildRope( int idx, int rope[6] )
385 | {
386 | for ( int i =0; i < 6; i++ )
387 | if ( abs(rope[i]) > 4000 )
388 | { printf("rope out of range\n"); exit(-1); return;}
389 |
390 | if ( tree_rp[idx].trinum >= 0) // is leaf
391 | {
392 | for ( int i = 0; i < 6; i++ )
393 | tree_rp[idx].rope[i] = rope[i];
394 | }
395 | else
396 | {
397 | for ( int i =0 ; i< 6; i++ )
398 | {
399 | if ( rope[i]>=0 ) OptimizeRope( idx, rope[i], i );
400 | }
401 |
402 | if ( tree_rp[idx].axis < 0) printf("BuildRope: node %d has a negative split axis\n", idx);
403 |
404 | int sl = 2* tree_rp[idx].axis;
405 | int sr = 2* tree_rp[idx].axis + 1;
406 |
407 | int ropeL[6];
408 | int ropeR[6];
409 | for ( int i = 0; i< 6; i++ )
410 | {
411 | ropeL[i] = rope[i];
412 | ropeR[i] = rope[i];
413 | }
414 | ropeL[sr] = tree_rp[idx].right;
415 | ropeR[sl] = tree_rp[idx].left;
416 |
417 | if ( tree_rp[idx].right >0) BuildRope( tree_rp[idx].right, ropeR);
418 | if ( tree_rp[idx].left > 0) BuildRope( tree_rp[idx].left, ropeL );
419 | }
420 | }
421 |
422 | void CScene::RpKdTreePrint(int argc)
423 | {
424 | for ( int i =0; i < treesize; i++)
425 | {
426 | printf("node%d: left%d, right%d, axis%d, splitpos%.2f\n", i, tree_rp[i].left, tree_rp[i].right, tree_rp[i].axis, tree_rp[i].splitpos );
427 | if (tree_rp[i].trinum >= 0)
428 | {
429 | printf(" Leaf node containing:");
430 | for ( int j = 0; j < tree_rp[i].trinum ; j++)
431 | // printf("%d ", tree_rp[i].tri[j]);
432 | printf(". ");
433 | printf("\n");
434 | printf(" Rope: ");
435 | for ( int j=0; j<6; j++ )
436 | printf("%d ", tree_rp[i].rope[j]);
437 | printf("\n");
438 | }
439 |
440 | printf(" xmin %.3f, xmax %.3f, ymin %.3f, ymax %.3f, zmin %.3f, zmax %.3f\n", tree_rp[i].xmin,tree_rp[i].xmax, tree_rp[i].ymin, tree_rp[i].ymax, tree_rp[i].zmin, tree_rp[i].zmax);
441 |
442 |
443 | }
444 |
445 | return;
446 | }
447 |
448 | void CScene::ConstructRpKdTree( int idx, KdTree* node)
449 | {
450 | if (!node) return;
451 |
452 | static int count = 0;
453 |
454 | tree_rp[idx].xmin = node->xmin;
455 | tree_rp[idx].xmax = node->xmax;
456 | tree_rp[idx].ymin = node->ymin;
457 | tree_rp[idx].ymax = node->ymax;
458 | tree_rp[idx].zmin = node->zmin;
459 | tree_rp[idx].zmax = node->zmax;
460 |
461 | tree_rp[idx].axis = node->axis;
462 | tree_rp[idx].splitpos = node->splitpos;
463 |
464 | tree_rp[idx].trinum = node->trinum;
465 | if ( node->trinum>= 0)
466 | {
467 | // tree_rp[idx].tri= (int*) malloc( node->trinum* sizeof(int) );
468 |
469 | for ( int i =0; i< node->trinum; i++)
470 | tree_rp[idx].tri[i] = node->tri[i];
471 | }
472 | // else tree_rp[idx].trinum = NULL;
473 |
474 | if ( node->pleft )
475 | {
476 | count++;
477 | tree_rp[idx].left = count;
478 | ConstructRpKdTree( count, node->pleft);
479 | }
480 | else tree_rp[idx].left = -1;
481 |
482 | if ( node->pright )
483 | {
484 | count++;
485 | tree_rp[idx].right = count;
486 | ConstructRpKdTree( count, node->pright);
487 | }
488 | else tree_rp[idx].right = -1;
489 | }
490 |
491 | void CScene::LoadObjects( int argc, char** argv )
492 | {
493 | FILE* file = fopen( argv[1], "r");
494 | if (!file)
495 | {
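496 | cerr<<"Cannot open obj file"<<endl;
497 | exit(-1);
498 | }
499 |
500 | char buf[256]; // buffer size is an assumption
501 | CPoint3D tmp;
502 |
503 | vector<int> idx;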
504 | idx.resize(3);
505 |
506 | VtxBuf.clear();
507 | TriVtxBuf.clear();
508 |
509 | while ( fscanf(file, "%s", buf)!=EOF )
510 | {
511 | switch( buf[0] )
512 | {
513 | case '#':
514 | // eat up rest of the line
515 | fgets(buf,sizeof(buf), file);
516 | break;
517 | case 'v':
518 | // skip "vn"/"vt" records; only plain "v" vertices are stored
519 | if ( buf[1] != '\0' ) { fgets(buf, sizeof(buf), file); break; }
520 | fscanf(file, "%f %f %f\n", &tmp.x, &tmp.y, &tmp.z);
521 | VtxBuf.push_back(tmp);
522 | break;
523 | case 'f':
524 | fscanf(file, "%d %d %d\n", &idx[0], &idx[1], &idx[2]);
525 | idx[0] = idx[0]-1; idx[1] = idx[1]-1; idx[2] = idx[2]-1; // OBJ indices are 1-based
526 | TriVtxBuf.push_back(idx);
527 | // eat up rest of the line
528 | // fgets(buf,sizeof(buf), file);
529 | break;
530 | default:
531 | // eat up rest of the line
532 | fgets(buf, sizeof(buf), file);
533 | break;
534 | }
535 | }
536 |
537 | nVtx = VtxBuf.size();
538 | nTri = TriVtxBuf.size();
539 | printf("Obj file loaded. Vertex number:%u. Face number:%u\n", nVtx, nTri);
540 |
541 | fclose(file);
542 |
543 | printf("Calculating normal vector...\n");
544 | CalculateNormal();
545 |
546 | printf("Calculating bounding box...\n");
547 | CalcBoundingBox();
548 |
549 | vector trilist;
550 | trilist.resize( nTri );
551 | for ( int i = 0; i < nTri; i++ ) trilist[i] = i;
552 |
553 | printf("Building kdtree...\n");
554 | tree = ConstructKdTree( 0, 0, trilist, xmin, xmax, ymin, ymax, zmin, zmax );
555 |
556 | treesize = CalcTreeSize(tree);
557 | tree_rp = (KdTree_rp*)malloc( treesize * sizeof(KdTree_rp) );
558 | printf("Convert to indexed kdtree...\n");
559 | ConstructRpKdTree(0,tree);
560 |
561 |
562 | printf("Building rope...\n");
563 | for ( int i =0; i< treesize; i++)
564 | for ( int j =0; j<6; j++) tree_rp[i].rope[j] = -10;
565 | int rope[6];
566 | for ( int i =0; i < 6; i++ ) rope[i] = -1;
567 | BuildRope( 0, rope );
568 | }
569 |
570 | void CScene::WriteObjects()
571 | {
572 | FILE* fout = fopen("scene.obj", "w");
573 | for ( int i =0; i< nVtx; i++)
574 | fprintf(fout, "v %f %f %f\n", VtxBuf[i].x, VtxBuf[i].y, VtxBuf[i].z);
575 | for ( int i = 0; i
5 | #include "kdtree.h"
6 |
7 | class CPoint3D
8 | {
9 | public:
10 | float x;
11 | float y;
12 | float z;
13 |
14 | CPoint3D operator- (CPoint3D);
15 | CPoint3D operator+ (CPoint3D);
16 | float operator* (CPoint3D); // dot (inner) product of two vectors
17 | CPoint3D operator^ (CPoint3D); // cross product of two vectors
18 |
19 | void normalize();
20 | };
21 |
22 | class RGB
23 | {
24 | public:
25 | unsigned int r;
26 | unsigned int g;
27 | unsigned int b;
28 | };
29 |
30 | class CSandBox
31 | {
32 | public:
33 | CPoint3D m_diagdown; //x,y,z small
34 | CPoint3D m_diagup; // x,y,z large
35 |
36 | // The faces are labeled as:
37 | // 0: zmax, 1:xmax, 2:ymax, 3:xmin, 4:ymin
38 | // the zmin face is the window face, so it is not stored
39 | RGB facecolor[5];
40 | unsigned int isReflective[5];
41 |
42 | CSandBox()
43 | {
44 | facecolor[1].r = 200;
45 | facecolor[1].g = 0;
46 | facecolor[1].b = 0;
47 |
48 | facecolor[2].r = 0;
49 | facecolor[2].g = 200;
50 | facecolor[2].b = 0;
51 |
52 | facecolor[3].r = 0;
53 | facecolor[3].g = 0;
54 | facecolor[3].b = 200;
55 |
56 | facecolor[4].r = 250;
57 | facecolor[4].g = 250;
58 | facecolor[4].b = 250;
59 |
60 | isReflective[0] =1;
61 | isReflective[1] =0;
62 | isReflective[2] =0;
63 | isReflective[3] =0;
64 | isReflective[4] =0;
65 | }
66 | };
67 |
68 | // Passing to texture memory is done by calling PassSceneToGlobalMem() and then BindToTexture()
69 | // Passing to constant memory is done by calling the member function SetConstantMem()
70 | class CScene // can only manage one object
71 | {
72 | public:
73 | CSandBox m_sandbox; // pass to const. & tex. memory
74 |
75 | unsigned int m_winwidth ; // pass to constant memory
76 | unsigned int m_winheight ; // pass to const. mem
77 |
78 | unsigned int nVtx; // pass to const. mem
79 | std::vector<CPoint3D> VtxBuf; // pass to tex. mem
80 |
81 | unsigned int nTri; // pass to const. mem
82 | std::vector<std::vector<int> > TriVtxBuf; // pass to tex. mem
83 |
84 | RGB objectColor; // pass to const. mem
85 |
86 | std::vector<CPoint3D> NormalBuf; // pass to tex. mem
87 |
88 | unsigned int nLightSource; // pass to const. mem
89 | std::vector<CPoint3D> lightSource; // pass to tex. mem
90 |
91 | CPoint3D cameraPos; // pass to const. mem
92 |
93 | // pass to const. mem
94 | CPoint3D windowCenter;
95 | CPoint3D window_diagup;
96 | CPoint3D window_diagdown;
97 |
98 | void CalcBoundingBox();
99 |
100 | public:
101 | // The Bounding box and KdTree
102 | float xmin;
103 | float xmax;
104 | float ymin;
105 | float ymax;
106 | float zmin;
107 | float zmax;
108 |
109 | // root of the pointer-based kd-tree
110 | KdTree* tree;
111 | KdTree* ConstructKdTree( int axis, int depth, std::vector<int> trilist,
112 |
113 | float xmin, float xmax, float ymin, float ymax, float zmin, float zmax);
114 | void OptimizeRope( int idx, int & rp, int face);
115 | void BuildRope( int idx, int rope[6] );
116 | bool SplitPlaneBelowBox( int, float, int );
117 | bool SplitPlaneAboveBox( int , float ,int );
118 |
119 | std::vector<int> ChooseTri( std::vector<int>, float, float, float, float, float, float);
120 |
121 | int treesize;
122 | int CalcTreeSize( KdTree* node );
123 | // Dynamic array holding the index-based kd-tree
124 | KdTree_rp* tree_rp;
125 | void ConstructRpKdTree( int idx, KdTree* node);
126 |
127 | public:
128 | void RpKdTreePrint( int argc);
129 |
130 | public:
131 | float* dVtxBuf;
132 | int* dTriVtxBuf;
133 | float* dNormal;
134 | float* dLS;
135 | float* dsandboxColor;
136 | unsigned int* dsandboxIsReflective;
137 | KdTree_rp* dtree;
138 |
139 | private:
140 | void CalculateNormal();
141 |
142 | public:
143 | CScene();
144 |
145 | public:
146 | // callback
147 | void keyboard( unsigned char key, int x, int y)
148 | {
149 | }
150 |
151 | void LoadSandBox( int argc, char** argv );
152 | void LoadObjects( int argc, char** argv );
153 | void WriteObjects( );
154 |
155 | //The first thing always
156 | void SceneMalloc();
157 |
158 | // Make sure to call SceneMalloc before
159 | void PassSceneToGlobalMem();
160 |
161 | // Bind global memory to texture
162 | void BindToTexture();
163 |
164 | void SetConstantMem();
165 |
166 | // when entire application ends
167 | // Don't forget to call in the main
168 | void FreeTextureResources();
169 | void FreeGlobalMemory();
170 |
171 | //TODO
172 | //void TransformObject( );
173 | //
174 | };
175 |
176 | #endif
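CPoint3D::operator^ and CScene::CalculateNormal both rely on the right-handed cross product. The sketch below spells out the component formula and the unit normal of a triangle, using the counter-clockwise convention stated in the CalculateNormal comment. `Vec3`, `sub`, `cross` and `triangleNormal` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for CPoint3D.
struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b)
{
    Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}

// Right-handed cross product a x b.
Vec3 cross(Vec3 a, Vec3 b)
{
    Vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}

// Unit normal of triangle (p0, p1, p2). With vertices labeled
// counter-clockwise as seen from outside, the normal points outward.
// The triangle must not be degenerate (zero area).
Vec3 triangleNormal(Vec3 p0, Vec3 p1, Vec3 p2)
{
    Vec3 n = cross(sub(p1, p0), sub(p2, p0));
    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    assert(len > 0.0f);
    Vec3 r = { n.x / len, n.y / len, n.z / len };
    return r;
}
```

A quick sanity check for any cross product implementation is that x cross y gives z and z cross x gives y; a sign slip in the middle component fails the second check.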
177 |
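ConstructRpKdTree converts the pointer-based tree into an array whose nodes refer to their children by index, which is the form that can be copied to the GPU in one block. The sketch below shows the same pre-order numbering on a reduced node type, with the running counter replaced by the output vector's size instead of a static variable. `PtrNode`, `FlatNode` and `flatten` are hypothetical names, and only the child links and one payload field are modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, reduced node types: only the child links are modeled.
struct PtrNode {
    PtrNode* left;
    PtrNode* right;
    int payload;
};

struct FlatNode {
    int left;    // index of the left child, or -1
    int right;   // index of the right child, or -1
    int payload;
};

// Append `node` and its subtree to `out` in pre-order; return its index,
// or -1 for an empty subtree.
int flatten(const PtrNode* node, std::vector<FlatNode>& out)
{
    if (!node) return -1;
    const int idx = static_cast<int>(out.size());
    FlatNode f = { -1, -1, node->payload };
    out.push_back(f);
    // `out` may reallocate during recursion, so store the result first
    // and then write through the index.
    const int l = flatten(node->left, out);
    out[idx].left = l;
    const int r = flatten(node->right, out);
    out[idx].right = r;
    return idx;
}
```

Because the counter lives in the vector, the function can be called more than once in a program, which a function-local static counter does not allow.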
--------------------------------------------------------------------------------