├── .gitignore
├── .nojekyll
├── LICENSE
├── README.md
├── scene
    └── renderedAnt.png
└── src
    ├── kdtree.h
    ├── kernel.cu
    ├── main.cpp
    ├── nvcc_compile.sh
    ├── scene.cpp
    └── scene.hpp

/.gitignore:
--------------------------------------------------------------------------------
*.i
*.ii
*.gpu
*.ptx
*.cubin
*.fatbin

--------------------------------------------------------------------------------
/.nojekyll:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jasonge27/StacklessRayTracer/6558c40e3be5d4aee360e8be56f6b3dd50c8117f/.nojekyll

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2016 Jason Ge

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Ray Tracer on GPU
Implemented stackless KDTree on GPU using CUDA to accelerate the ray tracing rendering algorithm. Hardware-level optimization for register spills and local memory overhead.

## Parallel Ray Tracer on GPU

Ray tracing follows every beam of light in a scene. Since we know the physical laws of reflection, refraction and scattering, once enough light beams are constructed we can render the scene with realistic, real-world effects. Ray tracing is a naturally parallel algorithm because physics tells us light beams do not interfere with each other.

The main cost of ray tracing is calculating the intersection point of each light beam with the objects in the scene. We implemented the stackless KDTree algorithm from the paper [1] and observed a speedup of roughly 100x on the GPU over the CPU (see the table below).

| | Direct Intersection Calculation | Intersection Calculation by Vanilla Kdtree | Intersection Calculation by Stackless Kdtree |
| :--------------------------------------: | :-----------------------------: | :--------------------------------------: | :--------------------------------------: |
| CPU rendering time | 255s | 22.193s | 13.371s |
| GPU rendering time (before hardware optimization) | 3.282s | 0.198s | 0.098s |
| Speedup | 77 | 112 | 137 |

Experiment Details.

CPU: Intel Core i7-2630QM, GPU: NVIDIA GeForce GT 610 with 6GB RAM.

OS: Linux 12.04, Compiler: nvcc in CUDA 5.0

Scene: a 3996-face ant in a cube, 800*600 light beams, 3 times reflection for shallow calculation.

Effect:


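The stackless traversal used in this section replaces the usual traversal stack with "ropes": each leaf stores, for each of its six faces, the index of the neighbouring node, so a ray that leaves a leaf can continue from the neighbour instead of popping a stack. The following is a minimal host-side C++ sketch of that idea, not the code in `src/kernel.cu`. The names `RopeNode`, `findLeaf` and `visitedLeaves` are illustrative, the layout is only loosely modelled on `KdTree_rp` in `src/kdtree.h`, and the sketch assumes the ray starts inside the root box and leaves out the triangle tests.

```cpp
#include <vector>

// Illustrative node layout, loosely modelled on KdTree_rp in src/kdtree.h.
// Face order: 0 xmin, 1 xmax, 2 ymin, 3 ymax, 4 zmin, 5 zmax.
struct RopeNode {
    int left, right;     // child indices, -1 for a leaf
    int axis;            // split axis of an inner node (0 = x, 1 = y, 2 = z)
    float split;         // split position of an inner node
    float lo[3], hi[3];  // bounding box of the node
    int rope[6];         // neighbour across each face, -1 if outside the scene
};

// Descend from node idx to the leaf that contains point p.
int findLeaf(const std::vector<RopeNode>& tree, int idx, const float p[3]) {
    while (tree[idx].left >= 0) {
        const RopeNode& n = tree[idx];
        idx = (p[n.axis] < n.split) ? n.left : n.right;
    }
    return idx;
}

// Return the leaves a ray visits, in order. Assumes the origin lies inside
// the root box (node 0). No stack is used: leaving a leaf follows a rope.
std::vector<int> visitedLeaves(const std::vector<RopeNode>& tree,
                               const float origin[3], const float dir[3]) {
    std::vector<int> visited;
    float p[3] = {origin[0], origin[1], origin[2]};
    int idx = 0;
    while (idx >= 0) {
        idx = findLeaf(tree, idx, p);
        visited.push_back(idx);  // a real tracer would test the leaf's triangles here
        const RopeNode& leaf = tree[idx];

        // Exit face: the nearest far plane along the direction of travel.
        float tExit = 1e30f;
        int face = -1;
        for (int a = 0; a < 3; ++a) {
            if (dir[a] == 0.0f) continue;
            float plane = (dir[a] > 0.0f) ? leaf.hi[a] : leaf.lo[a];
            float t = (plane - origin[a]) / dir[a];
            if (t < tExit) {
                tExit = t;
                face = 2 * a + (dir[a] > 0.0f ? 1 : 0);
            }
        }
        if (face < 0) break;  // zero direction vector: nothing to follow

        // Move to the exit point and snap it onto the exit plane so that the
        // next descent is not affected by rounding on that axis.
        for (int a = 0; a < 3; ++a) p[a] = origin[a] + tExit * dir[a];
        int exitAxis = face / 2;
        p[exitAxis] = (face % 2) ? leaf.hi[exitAxis] : leaf.lo[exitAxis];

        idx = leaf.rope[face];  // -1 ends the walk: the ray left the scene
    }
    return visited;
}
```

For a root box split at x = 2 and again at x = 3, a ray travelling along +x from inside the first leaf visits the three leaves in order. Each hop costs one rope lookup followed by a short descent, which is what lets the GPU version avoid a per-thread stack.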

## Hardware Level Optimization for GPU

### Bottleneck analysis

#### Register Spills vs Warp Occupancy Trade-off

On the one hand, if we use more registers per thread, we reduce the local memory overhead caused by register spills. On the other hand, if we use fewer registers per thread, we can run more warps in parallel. We need to find the right balance in this trade-off.

- Register spills. Adding the option `-Xptxas -v` when compiling with `nvcc` gives us the register usage info: **register usage: 63, stored spill 324B, load spill 452B**

Explanation. GT 610 has compute capability 2.1, which allows at most 63 registers per thread. Each multiprocessor has 32K 32-bit registers (128KB) in total. If a thread needs more registers than the per-thread limit, the excess values are spilled to local memory.

Visual profiler says **local memory overhead: 62.3%**

Explanation. When register spills happen, the thread reads and writes the spilled values through the L1 cache into local memory, which is slower than accessing registers directly. Moreover, on an L1 cache miss, instructions have to be re-issued, which puts more pressure on the memory bandwidth. This is why we have such a large **local memory overhead**.

- Warp occupancy. Visual profiler says **warp occupancy 33.1%**

Explanation. We are using blocks of dimension 16*8 in the computation. A GT 610 multiprocessor can keep at most 48 warps resident at the same time. The warp occupancy says we are only running about 16 warps on average, which is a consequence of the high per-thread register usage. Each block of 128 threads (4 warps) needs about 8K registers at 63 registers per thread; with 32K registers in total, only 4 blocks, i.e. 16 warps, can be resident at the same time.
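The occupancy figure can be cross-checked with a few lines of arithmetic. The sketch below is a rough estimate under stated assumptions rather than a substitute for the profiler: it assumes the compute capability 2.x per-multiprocessor limits listed in the CUDA C Programming Guide (32768 32-bit registers, 48 resident warps, 8 resident blocks, warp size 32), ignores register allocation granularity and shared memory, and uses the hypothetical names `SmLimits` and `residentWarps`.

```cpp
#include <algorithm>

// Assumed per-multiprocessor limits (values for compute capability 2.x are
// passed in by the caller; they are not read from the device here).
struct SmLimits {
    int registers;  // 32-bit registers per multiprocessor
    int maxWarps;   // maximum resident warps
    int maxBlocks;  // maximum resident blocks
    int warpSize;   // threads per warp
};

// Hypothetical helper: theoretical number of resident warps when registers,
// the warp limit and the block limit are the only constraints considered.
int residentWarps(const SmLimits& sm, int threadsPerBlock, int regsPerThread) {
    int warpsPerBlock = (threadsPerBlock + sm.warpSize - 1) / sm.warpSize;
    int regsPerBlock = warpsPerBlock * sm.warpSize * regsPerThread;
    int blocksByRegs = sm.registers / regsPerBlock;
    int blocksByWarps = sm.maxWarps / warpsPerBlock;
    int blocks = std::min({blocksByRegs, blocksByWarps, sm.maxBlocks});
    return blocks * warpsPerBlock;
}
```

With 16*8 blocks and 63 registers per thread, `residentWarps(SmLimits{32768, 48, 8, 32}, 16 * 8, 63)` gives 16 warps, i.e. 16/48 or about 33%, in line with the profiler reading. Lowering the per-thread register count raises the estimate, at the price of more spills.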

#### Branch divergence

Visual profiler gives us **warp execution efficiency 45.4%**

Since the threads in a single warp share the same instruction stream, when the threads diverge at an `if-then A else B` statement into thread group 1 and thread group 2, group 1 executes A while group 2 waits, and then group 2 executes B while group 1 waits. The warp execution efficiency says that our running time roughly doubled because of this phenomenon.

### Optimizations

- Optimization 1. Using the compiler flags `-prec-div=false` and `-prec-sqrt=false` decreases the precision of division and sqrt computation. The compiler can then optimize out some intermediate variables to reduce register usage.

- Optimization 2. Trade-off between register spills and warp occupancy. We use the compiler flag `-maxrregcount=48` to limit the maximal number of registers each thread can use.

| | Time | Warp Occupancy | Local memory overhead | Spilled Store | Spilled Load |
| :-----------------: | :--: | :------------: | :-------------------: | :-----------: | :----------: |
| Before Optimization | 98ms | 31.1% | 62.3% | 324B | 452B |
| Optimization 1 | 60ms | 33.4% | 48.8% | 122B | 122B |
| Optimization 1+2 | 49ms | 49.6% | 53.7% | 240B | 240B |

References

[1] Popov, Gunther et al., Stackless KD-Tree Traversal for High Performance GPU Ray Tracing, 2007

--------------------------------------------------------------------------------
/scene/renderedAnt.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jasonge27/StacklessRayTracer/6558c40e3be5d4aee360e8be56f6b3dd50c8117f/scene/renderedAnt.png

--------------------------------------------------------------------------------
/src/kdtree.h:
--------------------------------------------------------------------------------
1 | #ifndef KD_TREE_H
2 |
#define KD_TREE_H 3 | 4 | #define LEAFSIZE 15 5 | //#define TREEDEPTH 15 6 | 7 | typedef struct node 8 | { 9 | struct node* pleft; 10 | struct node* pright; 11 | 12 | int axis; 13 | float splitpos; 14 | 15 | int trinum; 16 | int tri[LEAFSIZE]; 17 | 18 | // struct node* rope[6]; 19 | 20 | float xmin; 21 | float xmax; 22 | float ymin; 23 | float ymax; 24 | float zmin; 25 | float zmax; 26 | }KdTree; 27 | 28 | typedef struct node_rp 29 | { 30 | int left; 31 | int right; 32 | 33 | int axis; 34 | float splitpos; 35 | 36 | int trinum; 37 | int tri[LEAFSIZE]; 38 | 39 | int rope[6]; 40 | 41 | float xmin; 42 | float xmax; 43 | float ymin; 44 | float ymax; 45 | float zmin; 46 | float zmax; 47 | }KdTree_rp; 48 | 49 | #endif 50 | -------------------------------------------------------------------------------- /src/kernel.cu: -------------------------------------------------------------------------------- 1 | //kernelPBO.cu (Rob Farber) 2 | 3 | #include 4 | #include "scene.hpp" 5 | #include "kdtree.h" 6 | 7 | #define blocksize_x 8 8 | #define blocksize_y 6 9 | 10 | #define EPSILON (1e-6) 11 | 12 | __constant__ float3 dcamPos; 13 | __constant__ float3 dwincenterPos; 14 | 15 | __constant__ float3 dwinup; 16 | __constant__ float3 dwindown; 17 | __constant__ float3 dboxup; 18 | __constant__ float3 dboxdown; 19 | 20 | __constant__ unsigned int dwinwidth; 21 | __constant__ unsigned int dwinheight; 22 | 23 | __constant__ float hx; 24 | __constant__ float hy; 25 | 26 | __constant__ float3 objectColor; 27 | 28 | __constant__ unsigned int nTri; 29 | __constant__ unsigned int nVtx; 30 | __constant__ unsigned int nLS; 31 | 32 | /*__constant__ float* dVtxBuf; 33 | __constant__ int* dTriVtxBuf; 34 | __constant__ float* dNormal; 35 | __constant__ float* dLS; 36 | __constant__ float* dsandboxColor; 37 | __constant__ unsigned int* dsandboxIsReflective; 38 | */ 39 | 40 | texture texref_VtxBuf; 41 | texture texref_TriVtx; 42 | texture texref_Normal; 43 | texture texref_LS; 44 | texture 
texref_sandboxColor; 45 | texture texref_sandboxIsReflective; 46 | 47 | // memory address in cudamemory 48 | /* 49 | extern "C" float* dVtxBuf = NULL; 50 | extern "C" int* dTriVtxBuf = NULL; 51 | extern "C" float* dNormal = NULL; 52 | extern "C" float* dLS = NULL; 53 | extern "C" float* dsandboxColor = NULL; 54 | extern "C" unsigned int* dsandboxIsReflective = NULL; 55 | */ 56 | 57 | void checkCUDAError(const char *msg) 58 | { 59 | cudaError_t err = cudaGetLastError(); 60 | if( cudaSuccess != err) { 61 | fprintf(stderr, "Cuda error: %s: %s.\n", msg, cudaGetErrorString( err) ); 62 | exit(EXIT_FAILURE); 63 | } 64 | } 65 | 66 | __device__ inline float3 CrossProduct( float3 p1, float3 p2) 67 | { 68 | float3 tmp; 69 | tmp.x = p1.y*p2.z - p1.z*p2.y; 70 | tmp.y = p1.z*p2.x - p1.x*p2.z; 71 | // tmp.y = p1.x*p2.z - p1.z*p2.x; 72 | tmp.z = p1.x*p2.y - p1.y*p2.x; 73 | return tmp; 74 | } 75 | 76 | __device__ inline float InnerProduct( float3 p1, float3 p2) 77 | { 78 | return p1.x*p2.x + p1.y*p2.y + p1.z*p2.z; 79 | } 80 | 81 | // return the value of t, when p+t*dir intersect triangle p0-p1-p2 82 | __device__ inline float TestSingleIntersection( float3 p, float3 dir, float3 p0, float3 p1, float3 p2) 83 | { 84 | float3 E1; 85 | E1.x = p1.x - p0.x; 86 | E1.y = p1.y - p0.y; 87 | E1.z = p1.z - p0.z; 88 | 89 | float3 E2; 90 | E2.x = p2.x - p0.x; 91 | E2.y = p2.y - p0.y; 92 | E2.z = p2.z - p0.z; 93 | 94 | float3 T; 95 | T.x = p.x - p0.x; 96 | T.y = p.y - p0.y; 97 | T.z = p.z - p0.z; 98 | 99 | float3 P = CrossProduct( dir, E2); 100 | float3 Q = CrossProduct( T, E1 ); 101 | 102 | float s = InnerProduct(P, E1); 103 | // s = 0.5; //FIXME 104 | if ( (s < EPSILON ) && (s>-EPSILON) ) return -1; 105 | 106 | float t = InnerProduct(Q, E2) / s; 107 | if ( t < 1e-5) return -2; 108 | 109 | float u = InnerProduct(P, T) / s; 110 | if ( (u < 0) ) return -3; 111 | 112 | float v = InnerProduct(Q, dir) / s; 113 | if ( (v < 0 ) ) return -4; 114 | 115 | if ( u+v > 1 ) return -5; 116 | 117 | // t = 0.5; 
//FIXME 118 | return t; 119 | } 120 | 121 | __device__ inline bool FromLeftToRight( KdTree_rp* dtree, int idx, float3 dir) 122 | { 123 | switch (dtree[idx].axis) 124 | { 125 | case 0: 126 | if ( dir.x > EPSILON) return true; 127 | else return false; 128 | case 1: 129 | if ( dir.y > EPSILON) return true; 130 | else return false; 131 | case 2: 132 | if ( dir.z > EPSILON) return true; 133 | else return false; 134 | default: 135 | printf("no such case!\n"); 136 | } 137 | } 138 | 139 | __device__ inline bool TestNodeIntersection( KdTree_rp* dtree, int idx, float3 p, float3 dir) 140 | { 141 | 142 | float tmp1, tmp2; 143 | 144 | // x-y 145 | if ( (dir.x-EPSILON) ) 146 | { 147 | if ( (p.x>dtree[idx].xmax) || ( p.xdtree[idx].ymax) && (tmp2>dtree[idx].ymax)) 160 | || ( (tmp1-EPSILON) ) 165 | { 166 | if ( (p.y>dtree[idx].ymax) || (p.ydtree[idx].zmax) && (tmp2>dtree[idx].zmax)) 179 | || ( (tmp1-EPSILON) ) 184 | { 185 | if ( (p.z>dtree[idx].zmax) || (p.zdtree[idx].xmax) && (tmp2>dtree[idx].xmax)) 198 | || ( (tmp1 1e-5) || ( dir.x< -1e-5) ) 217 | { 218 | if ( prevface != 0) 219 | { 220 | tmp = ( dtree[idx].xmin - p.x) /dir.x; 221 | a = p.y + tmp* dir.y; 222 | b = p.z + tmp* dir.z; 223 | 224 | if ( (tmp > -ep2) && (tmp=dtree[idx].ymin-ep) && (a<=dtree[idx].ymax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) ) 226 | { 227 | t =tmp; 228 | face = 0; 229 | } 230 | } 231 | 232 | if ( prevface != 1) 233 | { 234 | tmp = ( dtree[idx].xmax - p.x) /dir.x; 235 | a = p.y + tmp* dir.y; 236 | b = p.z + tmp* dir.z; 237 | if ( (tmp > -ep2) && (tmp=dtree[idx].ymin-ep) && (a<=dtree[idx].ymax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) ) 239 | { 240 | t =tmp; 241 | face = 1; 242 | } 243 | } 244 | } 245 | 246 | if ( (dir.y > 1e-5) || ( dir.y < -1e-5) ) 247 | { 248 | if ( prevface != 2) 249 | { 250 | tmp = ( dtree[idx].ymin - p.y) / dir.y; 251 | a = p.x + tmp * dir.x; 252 | b = p.z + tmp * dir.z; 253 | if ( (tmp > -ep2) && ( tmp < t) ) 254 | if ( (a>=dtree[idx].xmin-ep) && 
(a<=dtree[idx].xmax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) ) 255 | { 256 | t =tmp; 257 | face = 2; 258 | } 259 | } 260 | 261 | if ( prevface != 3 ) 262 | { 263 | tmp = ( dtree[idx].ymax - p.y) / dir.y; 264 | a = p.x + tmp * dir.x; 265 | b = p.z + tmp * dir.z; 266 | if ( (tmp > -ep2) && ( tmp < t) ) 267 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) ) 268 | { 269 | t =tmp; 270 | face = 3; 271 | } 272 | } 273 | } 274 | 275 | if ( (dir.z > 1e-5) || ( dir.z < -1e-5) ) 276 | { 277 | if ( prevface != 4 ) 278 | { 279 | tmp = ( dtree[idx].zmin - p.z) / dir.z; 280 | a = p.x + tmp * dir.x; 281 | b = p.y + tmp * dir.y; 282 | if ( (tmp > -ep2) && ( tmp < t) ) 283 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].ymin-ep) && (b<=dtree[idx].ymax+ep) ) 284 | { 285 | t =tmp; 286 | face = 4; 287 | } 288 | } 289 | 290 | if ( prevface != 5) 291 | { 292 | tmp = ( dtree[idx].zmax - p.z) / dir.z; 293 | a = p.x + tmp * dir.x; 294 | b = p.y + tmp * dir.y; 295 | if ( (tmp > -ep2) && ( tmp < t) ) 296 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].ymin-ep) && (b<=dtree[idx].ymax+ep) ) 297 | { 298 | t =tmp; 299 | face = 5; 300 | } 301 | } 302 | } 303 | 304 | if ( face >=0 ) 305 | { 306 | pt.x = p.x + t * dir.x; 307 | pt.y = p.y + t*dir.y; 308 | pt.z = p.z + t*dir.z; 309 | } 310 | 311 | } 312 | 313 | __device__ inline void IntersectBox2( float3 p, float3 dir, KdTree_rp* dtree, int idx, float3& pt, int& face) 314 | { 315 | float t = 1e6; 316 | float tmp = 0; 317 | 318 | float ep = 0; 319 | float ep2 = 0; 320 | 321 | face = -1; 322 | 323 | float a,b; 324 | 325 | if ( (dir.x > 1e-5) || ( dir.x< -1e-5) ) 326 | { 327 | tmp = ( dtree[idx].xmin - p.x) /dir.x; 328 | a = p.y + tmp* dir.y; 329 | b = p.z + tmp* dir.z; 330 | 331 | if ( (tmp > ep2) && (tmp=dtree[idx].ymin-ep) && (a<=dtree[idx].ymax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) ) 333 | { 334 
| t =tmp; 335 | face = 0; 336 | } 337 | 338 | tmp = ( dtree[idx].xmax - p.x) /dir.x; 339 | a = p.y + tmp* dir.y; 340 | b = p.z + tmp* dir.z; 341 | if ( (tmp > ep2) && (tmp=dtree[idx].ymin-ep) && (a<=dtree[idx].ymax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) ) 343 | { 344 | t =tmp; 345 | face = 1; 346 | } 347 | } 348 | 349 | if ( (dir.y > 1e-5) || ( dir.y < -1e-5) ) 350 | { 351 | tmp = ( dtree[idx].ymin - p.y) / dir.y; 352 | a = p.x + tmp * dir.x; 353 | b = p.z + tmp * dir.z; 354 | if ( (tmp > ep2) && ( tmp < t) ) 355 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) ) 356 | { 357 | t =tmp; 358 | face = 2; 359 | } 360 | 361 | tmp = ( dtree[idx].ymax - p.y) / dir.y; 362 | a = p.x + tmp * dir.x; 363 | b = p.z + tmp * dir.z; 364 | if ( (tmp > ep2) && ( tmp < t) ) 365 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].zmin-ep) && (b<=dtree[idx].zmax+ep) ) 366 | { 367 | t =tmp; 368 | face = 3; 369 | } 370 | } 371 | 372 | if ( (dir.z > 1e-5) || ( dir.z < -1e-5) ) 373 | { 374 | tmp = ( dtree[idx].zmin - p.z) / dir.z; 375 | a = p.x + tmp * dir.x; 376 | b = p.y + tmp * dir.y; 377 | if ( (tmp > ep2) && ( tmp < t) ) 378 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].ymin-ep) && (b<=dtree[idx].ymax+ep) ) 379 | { 380 | t =tmp; 381 | face = 4; 382 | } 383 | 384 | tmp = ( dtree[idx].zmax - p.z) / dir.z; 385 | a = p.x + tmp * dir.x; 386 | b = p.y + tmp * dir.y; 387 | if ( (tmp > ep2) && ( tmp < t) ) 388 | if ( (a>=dtree[idx].xmin-ep) && (a<=dtree[idx].xmax+ep) && (b>=dtree[idx].ymin-ep) && (b<=dtree[idx].ymax+ep) ) 389 | { 390 | t =tmp; 391 | face = 5; 392 | } 393 | } 394 | 395 | if ( face >=0 ) 396 | { 397 | pt.x = p.x + t * dir.x; 398 | pt.y = p.y + t*dir.y; 399 | pt.z = p.z + t*dir.z; 400 | } 401 | 402 | } 403 | 404 | __device__ inline int FindLeaf( KdTree_rp* dtree, int idx, float3 pt) 405 | { 406 | while ( dtree[idx].trinum <0 ) 407 | { 408 | 
switch (dtree[idx].axis) 409 | { 410 | case 0: 411 | if ( pt.x < dtree[idx].splitpos ) idx = dtree[idx].left; 412 | else idx = dtree[idx].right; 413 | break; 414 | case 1: 415 | if ( pt.y < dtree[idx].splitpos ) idx = dtree[idx].left; 416 | else idx = dtree[idx].right; 417 | break; 418 | case 2: 419 | if ( pt.z < dtree[idx].splitpos ) idx = dtree[idx].left; 420 | else idx = dtree[idx].right; 421 | break; 422 | } 423 | } 424 | 425 | return idx; 426 | 427 | } 428 | 429 | __device__ inline void FindIntersectedTriangle( float3 p, float3 dir, int& intersectedTri, float3& intersectedPt, float* dVtxBuf, int* dTriVtxBuf, KdTree_rp* dtree) 430 | { 431 | float tmin = 1e6; 432 | float t = tmin; 433 | intersectedTri = -1; 434 | float3 p0; 435 | float3 p1; 436 | float3 p2; 437 | float3 pt; 438 | int idx0; 439 | 440 | // int stack[2*TREEDEPTH]; 441 | 442 | int idx = 0; 443 | int face = -1; 444 | if ( (p.x>dtree[0].xmax) || ( p.y > dtree[0].ymax) || (p.z > dtree[0].zmax) 445 | || ( p.x 1e-5) && ( t < tmin ) ) 489 | { 490 | intersectedTri = idx; 491 | intersectedPt.x = p.x + t*dir.x; 492 | intersectedPt.y = p.y + t*dir.y; 493 | intersectedPt.z = p.z + t*dir.z; 494 | tmin = t; 495 | } 496 | 497 | } 498 | 499 | if ( intersectedTri >= 0 ) return; 500 | 501 | IntersectBox2(p, dir, dtree, idx, pt, face ); 502 | idx = dtree[idx].rope[face]; 503 | if ( (face==1) || (face==3) || (face==5) ) face--; 504 | else face++; 505 | */ 506 | 507 | face = -2; 508 | pt = p; 509 | } 510 | 511 | if ( (face >= 0) || (face==-2) ) 512 | while ( (intersectedTri<0) && (idx>=0) ) 513 | { 514 | idx = FindLeaf( dtree, idx, pt ); 515 | 516 | // printf("%d\n",idx); 517 | if (dtree[idx].trinum>0) // leaf node 518 | { 519 | // printf("leaf node:%d\n", idx); 520 | for ( int i = 0; i < dtree[idx].trinum; i++ ) 521 | { 522 | // printf("%d ", dtree[idx].tri[i]); 523 | 524 | idx0 = dTriVtxBuf[3*dtree[idx].tri[i]]; 525 | 526 | p0.x = dVtxBuf[3*idx0]; 527 | p0.y = dVtxBuf[3*idx0+1]; 528 | p0.z = dVtxBuf[3*idx0+2]; 529 | 
530 | idx0 = dTriVtxBuf[3*dtree[idx].tri[i]+1]; 531 | p1.x = dVtxBuf[3*idx0]; 532 | p1.y = dVtxBuf[3*idx0+1]; 533 | p1.z = dVtxBuf[3*idx0+2]; 534 | 535 | idx0 = dTriVtxBuf[3*dtree[idx].tri[i]+2]; 536 | p2.x = dVtxBuf[3*idx0]; 537 | p2.y = dVtxBuf[3*idx0+1]; 538 | p2.z = dVtxBuf[3*idx0+2]; 539 | 540 | t = TestSingleIntersection( p, dir ,p0, p1, p2); 541 | 542 | if ( (t > 1e-5) && ( t < tmin ) ) 543 | { 544 | intersectedTri = dtree[idx].tri[i]; 545 | intersectedPt.x = p.x + t*dir.x; 546 | intersectedPt.y = p.y + t*dir.y; 547 | intersectedPt.z = p.z + t*dir.z; 548 | tmin = t; 549 | } 550 | 551 | } 552 | // printf("\n"); 553 | } 554 | 555 | if ( intersectedTri >= 0 ) break; 556 | 557 | // second intersected point of the current leaf node 558 | // face: the intersected face of the current leaf ndoe 559 | IntersectBox(pt, dir, dtree, idx, pt, face, face); 560 | 561 | // next node adjacent to the leaf 562 | idx = dtree[idx].rope[face]; 563 | if ( (face==1) || (face==3) || (face==5) ) face --; 564 | else face++; 565 | } 566 | } 567 | 568 | __device__ inline uchar4 computeColor( int depth, float3 p, float3 dir, float* dVtxBuf, int* dTriVtxBuf, float* dNormal, float* dLS, KdTree_rp* dtree ) 569 | { 570 | int tri = -1; 571 | float3 intersectPt; 572 | int triOnTheWay = -1; 573 | float3 intersectPtOnTheWay; 574 | 575 | float3 tempDir; 576 | 577 | 578 | FindIntersectedTriangle(p,dir, tri, intersectPt, dVtxBuf, dTriVtxBuf, dtree); 579 | 580 | uchar4 color; 581 | color.w= 0; 582 | color.x=0; color.y = 0; color.z= 0; 583 | 584 | int i =0; 585 | 586 | float l; 587 | l = sqrt( dir.x * dir.x + dir.y*dir.y + dir.z*dir.z); 588 | dir.x = dir.x /l; 589 | dir.y = dir.y / l; 590 | dir.z = dir.z /l; 591 | 592 | if ( tri < 0) // no intersection, return sandbox color 593 | { 594 | int face = -1; 595 | 596 | float3 pt; 597 | if ( fabs(dir.y)>0) 598 | { 599 | l = (-1.0 -p.y )/ dir.y; 600 | pt.x = p.x + l*dir.x; 601 | pt.y = -1.0; 602 | pt.z = p.z + l * dir.z; 603 | if ( (pt.x >=-1.0) && 
(pt.x<=1.0) && (pt.z>=-1.0) && (pt.z<=1.0) ) 604 | { 605 | face = 2; // ymin 606 | if ( depth == 0) 607 | { 608 | tempDir.x = dir.x; 609 | tempDir.y = -dir.y; 610 | tempDir.z = dir.z; 611 | color = computeColor(depth+1,pt, tempDir, dVtxBuf, dTriVtxBuf, dNormal, dLS, dtree); 612 | color.x = color.x * 0.8; 613 | color.y = color.y * 0.8; 614 | color.z = color.z * 0.8; 615 | } 616 | } 617 | } 618 | 619 | if ( fabs(dir.y)>0) 620 | { 621 | l = (1.0 -p.y )/ dir.y; 622 | pt.x = p.x + l*dir.x; 623 | pt.y = 1.0; 624 | pt.z = p.z + l * dir.z; 625 | if ( (pt.x >=-1.0) && (pt.x<=1.0) && (pt.z>=-1.0) && (pt.z<=1.0) ) 626 | face = 3; // ymax 627 | } 628 | 629 | if ( fabs(dir.x)>0) 630 | { 631 | l = (-1.0 - p.x ) /dir.x; 632 | pt.x = -1.0; 633 | pt.y = p.y + l * dir.y; 634 | pt.z = p.z + l * dir.z; 635 | if ( (pt.y>=-1.0) && (pt.y<=1.0) && (pt.z>=-1.0) && (pt.z<=1.0)) 636 | face = 0; // xmin 637 | } 638 | 639 | if ( fabs(dir.x)>0) 640 | { 641 | l = (1.0 - p.x ) /dir.x; 642 | pt.x = 1.0; 643 | pt.y = p.y + l * dir.y; 644 | pt.z = p.z + l * dir.z; 645 | if ( (pt.y>=-1.0) && (pt.y<=1.0) && (pt.z>=-1.0) && (pt.z<=1.0)) 646 | face = 1; // xmax 647 | } 648 | 649 | if ( fabs(dir.z)>0) 650 | { 651 | l = (1.0 - p.z ) /dir.z; 652 | pt.x = p.x + l*dir.x; 653 | pt.y = p.y + l*dir.y; 654 | pt.z = 1.0; 655 | if ( (pt.x>=-1.0) && (pt.x<=1.0) && (pt.y>=-1.0) && (pt.y<=1.0) ) 656 | face = 5; // zmax 657 | } 658 | 659 | switch (face) 660 | { 661 | case 2: 662 | break; 663 | case 3: 664 | color.x = 0; 665 | color.y = 0; 666 | color.z = 0; 667 | break; 668 | case 0: 669 | color.x = 10; 670 | color.y = 100; 671 | color.z = 10; 672 | break; 673 | case 5: 674 | color.x = 100; 675 | color.y = 10; 676 | color.z = 10; 677 | break; 678 | case 1: 679 | color.x = 10; 680 | color.y = 10; 681 | color.z =100; 682 | break; 683 | default: 684 | break; 685 | } 686 | 687 | } 688 | else // intersected with a triangle 689 | { 690 | // check shadow or illumination 691 | triOnTheWay = -1; 692 | for ( i = 0; i < nLS; 
i++) // for every light source 693 | { 694 | 695 | tempDir.x = dLS[3*i] - intersectPt.x; 696 | tempDir.y = dLS[3*i+1] - intersectPt.y; 697 | tempDir.z = dLS[3*i+2] - intersectPt.z; 698 | 699 | l = sqrt(tempDir.x*tempDir.x + tempDir.y * tempDir.y + tempDir.z * tempDir.z); 700 | tempDir.x = tempDir.x / l; 701 | tempDir.y = tempDir.y / l; 702 | tempDir.z = tempDir.z / l; 703 | 704 | 705 | l = tempDir.x*dNormal[3*tri] + tempDir.y*dNormal[3*tri+1] + tempDir.z*dNormal[3*tri+2]; 706 | 707 | if ( l < -1e-5) continue; 708 | 709 | FindIntersectedTriangle( intersectPt, tempDir, triOnTheWay, intersectPtOnTheWay, dVtxBuf, dTriVtxBuf, dtree); 710 | 711 | if ( triOnTheWay >= 0 ) // in a shadow of triangle "triOnTheWay" 712 | { 713 | /* 714 | color.w=0; 715 | color.x = 100; 716 | color.y = 0; 717 | color.z = 0; 718 | */ 719 | } 720 | else // directly illuminated by the current light source 721 | { 722 | color.w = 0; 723 | 724 | if ( i == 0) l = 0.3; 725 | else if ( i == 1) l = 0.1; 726 | 727 | color.x = color.x+ l* objectColor.x ; 728 | color.y = color.y+ l* objectColor.y ; 729 | color.z = color.z+ l *objectColor.z ; 730 | } 731 | } 732 | 733 | l = 0.5*fabs( dir.x*dNormal[3*tri] + dir.y *dNormal[3*tri+1] + dir.z*dNormal[3*tri+2] ); 734 | color.x = color.x+ l * objectColor.x; 735 | color.y = color.y + l* objectColor.y; 736 | color.z = color.z + l*objectColor.z; 737 | /* 738 | color.x = objectColor.x; 739 | color.y = objectColor.y; 740 | color.z = objectColor.z; 741 | */ 742 | } 743 | 744 | 745 | return color; 746 | } 747 | 748 | __global__ void TracingKernel( uchar4* pos, float* dVtxBuf, int* dTriVtxBuf, float* dNormal, float* dLS, KdTree_rp* dtree ) 749 | { 750 | int pixelx = blockIdx.x*blocksize_x + threadIdx.x; 751 | int pixely = blockIdx.y*blocksize_y + threadIdx.y; 752 | 753 | float3 dir; 754 | // dir.x = dwindown.x + hx*pixelx - dcamPos.x; 755 | // dir.y = dwindown.y + hy*pixely - dcamPos.y; 756 | dir.x = dwinup.x - hx*pixelx - dcamPos.x; 757 | dir.y = dwinup.y- hy*pixely - 
dcamPos.y; 758 | dir.z = dwindown.z - dcamPos.z; 759 | 760 | pos[ pixely*dwinwidth + pixelx ] = computeColor(0, dcamPos, dir, dVtxBuf, dTriVtxBuf, dNormal, dLS, dtree); 761 | 762 | __syncthreads(); 763 | } 764 | 765 | __global__ void TracingKernel_test( uchar4* pos, float* dVtxBuf, int* dTriVtxBuf, float* dNormal, float* dLS, KdTree_rp* dtree ) 766 | { 767 | int pixelx = blockIdx.x*blocksize_x + threadIdx.x; 768 | int pixely = blockIdx.y*blocksize_y + threadIdx.y; 769 | 770 | float3 dir; 771 | // dir.x = dwindown.x + hx*pixelx - dcamPos.x; 772 | // dir.y = dwindown.y + hy*pixely - dcamPos.y; 773 | dir.x = -0.1; 774 | dir.y = -1; 775 | dir.z = 1; 776 | 777 | pos[ pixely*dwinwidth + pixelx ] = computeColor(0, dcamPos, dir, dVtxBuf, dTriVtxBuf, dNormal, dLS, dtree); 778 | 779 | __syncthreads(); 780 | } 781 | 782 | // Be sure to launch after setting const. and tex. memory 783 | extern "C" void launch_kernel( uchar4* pos, CScene& scene ) 784 | { 785 | dim3 dimBlock(8,8); 786 | 787 | dim3 dimGrid; 788 | dimGrid.x = scene.m_winwidth / dimBlock.x; 789 | dimGrid.y = scene.m_winheight / dimBlock.y; 790 | 791 | TracingKernel<<>>(pos, scene.dVtxBuf, scene.dTriVtxBuf, scene.dNormal, scene.dLS, scene.dtree); 792 | 793 | cudaError_t error = cudaGetLastError(); 794 | if (error != cudaSuccess ) 795 | { 796 | printf("Cuda Error:%s\n", cudaGetErrorString(error)); 797 | exit(-1); 798 | } 799 | } 800 | 801 | extern "C" void launch_kernel_test( uchar4* pos, CScene& scene ) 802 | { 803 | dim3 dimBlock(blocksize_x,blocksize_y); 804 | // dim3 dimBlock(1); 805 | 806 | 807 | dim3 dimGrid; 808 | dimGrid.x = scene.m_winwidth / dimBlock.x; 809 | dimGrid.y = scene.m_winheight / dimBlock.y; 810 | 811 | uchar4* dpos; 812 | 813 | cudaMalloc( (void**) &dpos, sizeof(uchar4)* scene.m_winwidth*scene.m_winheight ); 814 | cudaError_t error = cudaGetLastError(); 815 | if (error != cudaSuccess ) 816 | { 817 | printf("Cuda Error:%s\n", cudaGetErrorString(error)); 818 | exit(-1); 819 | } 820 | 821 | 
cudaMemcpy( dpos, pos, sizeof(uchar4)*scene.m_winwidth*scene.m_winheight, cudaMemcpyHostToDevice); 822 | error = cudaGetLastError(); 823 | if (error != cudaSuccess ) 824 | { 825 | printf("Cuda Error:%s\n", cudaGetErrorString(error)); 826 | exit(-1); 827 | } 828 | 829 | printf("Tracing...\n"); 830 | float GPU_time = 0; 831 | cudaEvent_t start, stop; 832 | cudaEventCreate( &start ); 833 | cudaEventCreate( &stop ); 834 | cudaEventRecord( start, 0 ); 835 | 836 | TracingKernel<<>>(dpos, scene.dVtxBuf, scene.dTriVtxBuf, scene.dNormal, scene.dLS, scene.dtree); 837 | // TracingKernel_test<<>>(dpos, scene.dVtxBuf, scene.dTriVtxBuf, scene.dNormal, scene.dLS, scene.dtree); 838 | 839 | cudaEventRecord( stop, 0); 840 | cudaEventSynchronize( stop ); 841 | cudaEventElapsedTime( &GPU_time, start, stop); 842 | printf("GPU time:%f\n", GPU_time); 843 | 844 | printf("%d %d\n", scene.m_winwidth, scene.m_winheight); 845 | printf("DeviceToHost..\n"); 846 | cudaMemcpy( pos, dpos, sizeof(uchar4)*scene.m_winwidth*scene.m_winheight, cudaMemcpyDeviceToHost); 847 | printf("DeviceToHostDone.\n"); 848 | 849 | printf("Free...\n"); 850 | cudaFree(dpos); 851 | printf("Free done.\n"); 852 | 853 | error = cudaGetLastError(); 854 | if (error != cudaSuccess ) 855 | { 856 | printf("Cuda Error:%s\n", cudaGetErrorString(error)); 857 | exit(-1); 858 | } 859 | } 860 | 861 | extern "C" void cudaSetConstantMem( CScene& scene ) 862 | { 863 | int a = scene.TriVtxBuf[0][0]; 864 | float3 camPos; 865 | camPos.x = scene.cameraPos.x; 866 | camPos.y = scene.cameraPos.y; 867 | camPos.z = scene.cameraPos.z; 868 | cudaMemcpyToSymbol(dcamPos, &camPos, sizeof(float3)); 869 | 870 | float3 wincenterPos; 871 | wincenterPos.x = scene.windowCenter.x; 872 | wincenterPos.y = scene.windowCenter.y; 873 | wincenterPos.z = scene.windowCenter.z; 874 | cudaMemcpyToSymbol(dwincenterPos, &wincenterPos, sizeof(float3)); 875 | 876 | float3 winup; 877 | winup.x = scene.window_diagup.x; 878 | winup.y = scene.window_diagup.y; 879 | winup.z = 
scene.window_diagup.z; 880 | cudaMemcpyToSymbol(dwinup, &winup, sizeof(float3)); 881 | 882 | float3 windown; 883 | windown.x = scene.window_diagdown.x; 884 | windown.y = scene.window_diagdown.y; 885 | windown.z = scene.window_diagdown.z; 886 | cudaMemcpyToSymbol(dwindown, &windown, sizeof(float3)); 887 | 888 | float3 boxup; 889 | boxup.x = scene.m_sandbox.m_diagup.x; 890 | boxup.y = scene.m_sandbox.m_diagup.y; 891 | boxup.z = scene.m_sandbox.m_diagup.z; 892 | cudaMemcpyToSymbol(dboxup, &boxup, sizeof(float3)); 893 | 894 | float3 boxdown; 895 | boxdown.x = scene.m_sandbox.m_diagdown.x; 896 | boxdown.y = scene.m_sandbox.m_diagdown.y; 897 | boxdown.z = scene.m_sandbox.m_diagdown.z; 898 | cudaMemcpyToSymbol(dboxdown, &boxdown, sizeof(float3)); 899 | 900 | float3 objColor; 901 | objColor.x = scene.objectColor.r; 902 | objColor.y = scene.objectColor.g; 903 | objColor.z = scene.objectColor.b; 904 | cudaMemcpyToSymbol(objectColor, &objColor, sizeof(float3)); 905 | 906 | float hosthx = (winup.x-windown.x) / scene.m_winwidth; 907 | float hosthy = (winup.y - windown.y) / scene.m_winheight; 908 | cudaMemcpyToSymbol(hx, &hosthx, sizeof(hosthx)); 909 | cudaMemcpyToSymbol(hy, &hosthy, sizeof(hosthy)); 910 | 911 | cudaMemcpyToSymbol(dwinwidth, &scene.m_winwidth, sizeof(scene.m_winwidth)); 912 | cudaMemcpyToSymbol(dwinheight, &scene.m_winheight, sizeof(scene.m_winheight)); 913 | 914 | cudaMemcpyToSymbol(nTri, &scene.nTri, sizeof(unsigned int) ); 915 | cudaMemcpyToSymbol(nVtx, &scene.nVtx, sizeof(&scene.nVtx) ); 916 | cudaMemcpyToSymbol(nLS, &scene.nLightSource, sizeof(unsigned int) ); 917 | 918 | /* 919 | cudaMemcpyToSymbol( dVtxBuf, &scene.dVtxBuf, sizeof( scene.dVtxBuf ) ); 920 | cudaMemcpyToSymbol( dTriVtxBuf, &scene.dTriVtxBuf, sizeof( scene.dTriVtxBuf ) ); 921 | cudaMemcpyToSymbol( dNormal, &scene.dNormal, sizeof(scene.dNormal) ); 922 | cudaMemcpyToSymbol( dLS, &scene.dLS, sizeof(scene.dLS) ); 923 | cudaMemcpyToSymbol( dsandboxColor, &scene.dsandboxColor, 
sizeof(scene.dsandboxColor) ); 924 | cudaMemcpyToSymbol( dsandboxIsReflective, &scene.dsandboxIsReflective, sizeof(scene.dsandboxIsReflective) ); 925 | */ 926 | } 927 | 928 | extern "C" void cudaSceneMalloc( CScene& scene ) 929 | { 930 | cudaMalloc( (void**) & scene.dVtxBuf, sizeof(float)*scene.nVtx*3); 931 | cudaMalloc( (void**) & scene.dTriVtxBuf, sizeof(int)*scene.nTri*3); 932 | cudaMalloc((void**) &scene.dNormal, sizeof(float)*scene.nTri*3); 933 | cudaMalloc((void**) &scene.dLS, sizeof(float)*3*scene.nLightSource); 934 | cudaMalloc((void**) &scene.dsandboxColor, sizeof(float)*3*5); 935 | cudaMalloc((void**) &scene.dsandboxIsReflective, sizeof(unsigned int)*5); 936 | cudaMalloc((void**) &scene.dtree, sizeof(KdTree_rp)*scene.treesize); 937 | } 938 | 939 | extern "C" void cudaBindToTexture( unsigned int nVtx, unsigned int nTri, unsigned int nLS, CScene& scene ) 940 | { 941 | cudaBindTexture(0, texref_VtxBuf, scene.dVtxBuf, sizeof(float)*nVtx*3); 942 | cudaBindTexture(0, texref_TriVtx, scene.dTriVtxBuf, sizeof(int)*nTri*3); 943 | cudaBindTexture(0, texref_Normal, scene.dNormal, sizeof(float)*3*nTri); 944 | cudaBindTexture(0, texref_LS, scene.dLS, sizeof(float)*3*nLS); 945 | cudaBindTexture(0, texref_sandboxColor, scene.dsandboxColor, sizeof(float)*3*5); 946 | cudaBindTexture(0, texref_sandboxIsReflective, scene.dsandboxColor, sizeof(unsigned int)*5); 947 | } 948 | 949 | extern "C" void cudaPassSceneToGlobalMem( CScene& scene, float* pVtxBuf, int* pTriVtxBuf, float* pNormal, float* pLS, float* psandboxColor, unsigned int* psandboxIsReflective, KdTree_rp* ptree) 950 | { 951 | if (scene.dVtxBuf) cudaMemcpy( scene.dVtxBuf, pVtxBuf, sizeof(float)*scene.nVtx*3, cudaMemcpyHostToDevice); 952 | 953 | if (scene.dTriVtxBuf) cudaMemcpy( scene.dTriVtxBuf, pTriVtxBuf, sizeof(int)*scene.nTri*3, cudaMemcpyHostToDevice); 954 | 955 | if (scene.dNormal) cudaMemcpy( scene.dNormal, pNormal, sizeof(float)*scene.nTri*3, cudaMemcpyHostToDevice); 956 | 957 | if (scene.dLS) cudaMemcpy( 
scene.dLS, pLS, sizeof(float)*3* scene.nLightSource, cudaMemcpyHostToDevice); 958 | 959 | if (scene.dsandboxColor) cudaMemcpy( scene.dsandboxColor, psandboxColor, sizeof(float3)*5, cudaMemcpyHostToDevice ); 960 | 961 | if (scene.dsandboxIsReflective) cudaMemcpy( scene.dsandboxIsReflective, psandboxIsReflective, sizeof(unsigned int)*5, cudaMemcpyHostToDevice); 962 | 963 | if (scene.dtree) cudaMemcpy( scene.dtree, ptree, sizeof(KdTree_rp)*scene.treesize, cudaMemcpyHostToDevice ); 964 | } 965 | 966 | // called when entire application ends 967 | extern "C" void cudaFreeTextureResources() 968 | { 969 | cudaUnbindTexture(texref_VtxBuf); 970 | cudaUnbindTexture(texref_Normal); 971 | cudaUnbindTexture(texref_TriVtx); 972 | cudaUnbindTexture(texref_LS); 973 | } 974 | 975 | extern "C" void cudaFreeGlobalMemory( CScene& scene) 976 | { 977 | if ( scene.dVtxBuf ) cudaFree(scene.dVtxBuf); 978 | if ( scene.dTriVtxBuf ) cudaFree(scene.dTriVtxBuf); 979 | if ( scene.dNormal ) cudaFree(scene.dNormal); 980 | if ( scene.dLS ) cudaFree(scene.dLS); 981 | if ( scene.dsandboxColor ) cudaFree(scene.dsandboxColor); 982 | if ( scene.dsandboxIsReflective ) cudaFree(scene.dsandboxIsReflective); 983 | } 984 | 985 | 986 | 987 | 988 | 989 | 990 | 991 | -------------------------------------------------------------------------------- /src/main.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include "scene.hpp" 4 | 5 | #include 6 | #include 7 | 8 | using namespace cv; 9 | 10 | CScene scene; 11 | 12 | extern "C" void launch_kernel_test( char*, CScene& scene ); 13 | extern "C" void cudaFreeTextureResources(); 14 | 15 | int main( int argc,char** argv ) 16 | { 17 | scene.LoadObjects(argc, argv); 18 | scene.WriteObjects(); 19 | 20 | printf("malloc...\n"); 21 | scene.SceneMalloc(); 22 | printf("malloc done.\n"); 23 | 24 | printf("copy to global mem...\n"); 25 | scene.PassSceneToGlobalMem(); 26 | printf("copy done.\n"); 27 | 28 | 
printf("constant mem...\n"); 29 | scene.SetConstantMem(); 30 | printf("constant mem copy done.\n"); 31 | 32 | char* pos = (char*)malloc( sizeof(char)*4* scene.m_winwidth * scene.m_winheight ); 33 | for ( int i = 0; i < scene.m_winwidth; i++ ) 34 | for ( int j = 0; j < scene.m_winheight; j++ ) 35 | for ( int k =0; k < 4; k++) 36 | pos[ (i*scene.m_winheight + j)*4 + k ] = 0; 37 | 38 | launch_kernel_test( pos, scene ); 39 | 40 | //free the global memory associated with scene 41 | scene.FreeGlobalMemory( ); 42 | 43 | cudaFreeTextureResources(); 44 | 45 | Mat image( scene.m_winheight, scene.m_winwidth, CV_8UC3, Scalar(0,0,0)); 46 | 47 | for ( int i = 0; i < scene.m_winheight; i++ ) 48 | for ( int j = 0; j < scene.m_winwidth; j++ ) 49 | { 50 | image.at<Vec3b>(i,j)[0] = pos[ (i*scene.m_winwidth+ j)*4 + 2 ]; 51 | image.at<Vec3b>(i,j)[1] = pos[ (i*scene.m_winwidth+ j)*4 + 1 ]; 52 | image.at<Vec3b>(i,j)[2] = pos[ (i*scene.m_winwidth + j)*4 + 0 ]; 53 | } 54 | 55 | imwrite("im_testK.png", image); 56 | free(pos); 57 | 58 | return 0; 59 | } 60 | 61 | -------------------------------------------------------------------------------- /src/nvcc_compile.sh: -------------------------------------------------------------------------------- 1 | nvcc -arch=sm_21 -O3 -I /usr/include/opencv -I /usr/include/opencv2 main.cpp scene.cpp kernel.cu -lopencv_core -lopencv_highgui -o raytracer 2 | -------------------------------------------------------------------------------- /src/scene.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include "scene.hpp" 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | 10 | #include "kdtree.h" 11 | 12 | using namespace std; 13 | 14 | 15 | using namespace std; 16 | 17 | extern "C" void cudaSceneMalloc( CScene& ); 18 | 19 | extern "C" void cudaBindToTexture( int nVtx, float* pVtxBuf, int nTri, int* pTriVtxBuf, float* pNormal, int nLS, float* pLS, float* psandboxColor, unsigned int* 
psandboxIsReflective); 20 | 21 | extern "C" void cudaPassSceneToGlobalMem( CScene&, float* pVtxBuf, int* pTriVtxBuf, float* pNormal, float* pLS, float* psandboxColor, unsigned int* psandboxIsReflective, KdTree_rp* ptree); 22 | 23 | extern "C" void cudaSetConstantMem( CScene & ); 24 | 25 | extern "C" void cudaFreeTextureResources( ); 26 | 27 | extern "C" void cudaFreeGlobalMemory( CScene& ); 28 | 29 | CPoint3D CPoint3D::operator- (CPoint3D param) 30 | { 31 | CPoint3D tmp; 32 | tmp.x = x - param.x; 33 | tmp.y = y - param.y; 34 | tmp.z = z - param.z; 35 | return tmp; 36 | } 37 | 38 | CPoint3D CPoint3D::operator+ (CPoint3D param) 39 | { 40 | CPoint3D tmp; 41 | tmp.x = x + param.x; 42 | tmp.y = y + param.y; 43 | tmp.z = z + param.z; 44 | return tmp; 45 | } 46 | 47 | float CPoint3D::operator* (CPoint3D param) 48 | { 49 | return x*param.x + y*param.y + z*param.z; 50 | } 51 | 52 | CPoint3D CPoint3D::operator^ (CPoint3D param) 53 | { 54 | CPoint3D tmp; 55 | tmp.x = y*param.z - z*param.y; 56 | tmp.y = z*param.x - x*param.z; 57 | tmp.z = x*param.y - y*param.x; 58 | return tmp; 59 | } 60 | 61 | void CPoint3D::normalize() 62 | { 63 | float l = sqrt(x*x+y*y+z*z); 64 | x = x/l; 65 | y = y/l; 66 | z = z/l; 67 | } 68 | 69 | CScene::CScene() 70 | { 71 | dVtxBuf = NULL; 72 | dTriVtxBuf = NULL; 73 | dLS = NULL; 74 | dNormal = NULL; 75 | dsandboxColor = NULL; 76 | dsandboxIsReflective = NULL; 77 | 78 | int scale = 1; 79 | // grey color for the object 80 | objectColor.r = 100; 81 | objectColor.g = 100; 82 | objectColor.b = 100; 83 | 84 | cameraPos.x = 0; 85 | cameraPos.y = 0; 86 | cameraPos.z = -2*scale; 87 | 88 | windowCenter.x = 0; 89 | windowCenter.y = 0; 90 | windowCenter.z = -1*scale; 91 | 92 | window_diagup.x = 1*scale; 93 | window_diagup.y = 1*scale; 94 | window_diagup.z = -1*scale; 95 | 96 | window_diagdown.x = -1*scale; 97 | window_diagdown.y= -1*scale; 98 | window_diagdown.z = -1*scale; 99 | 100 | m_sandbox.m_diagdown.x = -1*scale; 101 | m_sandbox.m_diagdown.y = -1*scale; 102 | 
m_sandbox.m_diagdown.z = -1*scale; 103 | 104 | m_sandbox.m_diagup.x = 1*scale; 105 | m_sandbox.m_diagup.y = 1*scale; 106 | m_sandbox.m_diagup.z = 1*scale; 107 | 108 | m_winwidth = 800; 109 | m_winheight = 600; 110 | 111 | CPoint3D LS; 112 | LS.x = 1*scale; 113 | LS.y = -1*scale; 114 | LS.z = -1*scale; 115 | lightSource.push_back(LS); 116 | 117 | LS.x = -1*scale; 118 | LS.y = -1*scale; 119 | LS.z = -1*scale; 120 | lightSource.push_back(LS); 121 | 122 | nLightSource = lightSource.size(); 123 | } 124 | 125 | void CScene::LoadSandBox( int argc, char** argv ) 126 | { 127 | } 128 | 129 | void CScene::CalculateNormal() 130 | { 131 | NormalBuf.resize(nTri); 132 | for ( int i = 0; i < nTri; i++ ) 133 | { 134 | CPoint3D a = VtxBuf[TriVtxBuf[i][1]] - VtxBuf[TriVtxBuf[i][0]]; 135 | a.x = VtxBuf[TriVtxBuf[i][1]].x - VtxBuf[TriVtxBuf[i][0]].x; 136 | a.y = VtxBuf[TriVtxBuf[i][1]].y- VtxBuf[TriVtxBuf[i][0]].y; 137 | a.z = VtxBuf[TriVtxBuf[i][1]].z - VtxBuf[TriVtxBuf[i][0]].z; 138 | 139 | CPoint3D b = VtxBuf[TriVtxBuf[i][2]] - VtxBuf[TriVtxBuf[i][0]]; 140 | b.x = VtxBuf[TriVtxBuf[i][2]].x - VtxBuf[TriVtxBuf[i][0]].x; 141 | b.y = VtxBuf[TriVtxBuf[i][2]].y - VtxBuf[TriVtxBuf[i][0]].y; 142 | b.z = VtxBuf[TriVtxBuf[i][2]].z - VtxBuf[TriVtxBuf[i][0]].z; 143 | // if 0,1,2 are counter-clockwise labeled looking from outside 144 | // then the normal vector is outward pointing 145 | NormalBuf[i].x = a.y*b.z - a.z*b.y; 146 | NormalBuf[i].y = a.z*b.x - a.x *b.z; 147 | NormalBuf[i].z = a.x*b.y - a.y*b.x; 148 | 149 | NormalBuf[i].normalize(); 150 | } 151 | } 152 | 153 | void CScene::CalcBoundingBox() 154 | { 155 | xmin = 1e6; 156 | ymin = 1e6; 157 | zmin = 1e6; 158 | xmax = -1e6; 159 | ymax= -1e6; 160 | zmax = -1e6; 161 | for ( int i = 0; i < nVtx; i++) 162 | { 163 | if (VtxBuf[i].x < xmin) xmin = VtxBuf[i].x; 164 | if (VtxBuf[i].x > xmax) xmax = VtxBuf[i].x; 165 | if (VtxBuf[i].y < ymin) ymin = VtxBuf[i].y; 166 | if (VtxBuf[i].y > ymax) ymax = VtxBuf[i].y; 167 | if (VtxBuf[i].z < zmin) zmin = 
VtxBuf[i].z; 168 | if (VtxBuf[i].z > zmax) zmax = VtxBuf[i].z; 169 | } 170 | printf("x:%.2f~%.2f\n", xmin, xmax); 171 | printf("y:%.2f~%.2f\n", ymin, ymax); 172 | printf("Z:%.2f~%.2f\n", zmin, zmax); 173 | } 174 | 175 | int CScene::CalcTreeSize(KdTree* node) 176 | { 177 | if ( !node ) return 0; 178 | else return 1+CalcTreeSize(node->pleft)+CalcTreeSize(node->pright); 179 | } 180 | 181 | /* 182 | bool CScene::cmpx( int i, int j ) 183 | { 184 | CPoint3D p1 = VtxBuf[i]; 185 | CPoint3D p2 = Vtxbuf[j]; 186 | if (p1.x < p2.x ) return true; 187 | else return false; 188 | } 189 | 190 | bool CScene::cmpy( int i, int j ) 191 | { 192 | CPoint3D p1 = VtxBuf[i]; 193 | CPoint3D p2 = Vtxbuf[j]; 194 | if ( p1.y < p2.y ) return true; 195 | else return false; 196 | } 197 | 198 | bool CScene::cmpz( int i, int j ) 199 | { 200 | CPoint3D p1 = VtxBuf[i]; 201 | CPoint3D p2 = Vtxbuf[j]; 202 | if ( p1.z < p2.z ) return true; 203 | else return false; 204 | } 205 | 206 | vector ChooseTriSTLVersion( vector plist, float mid, int axis) 207 | { 208 | switch axis: 209 | case 0: 210 | sort(plist.begin(), plist.end(), cmpx ); 211 | vector::iterator it = upper_bound( plist.begin(), plist.end(), cmpx(axis) ); 212 | case 1: 213 | sort(plist.begin(), plist.end(), cmpy ); 214 | vector::iterator it = upper_bound( plist.begin(), plist.end(), cmpy(axis) ); 215 | case 2: 216 | sort(plist.begin(), plist.end(), cmpz ); 217 | vector::iterator it = upper_bound( plist.begin(), plist.end(), cmpz(axis) ); 218 | 219 | vector chosen; 220 | chosen.resize( it - plist.begin()+1); 221 | copy( plist.begin(), it, chosen.begin() ); 222 | 223 | return chosen; 224 | } 225 | */ 226 | 227 | bool IsInBox( CPoint3D p, float xmin, float xmax, float ymin, float ymax, float zmin, float zmax ) 228 | { 229 | if ( (p.x>=xmin) && (p.x<=xmax) && (p.y>=ymin) && (p.y<=ymax) && (p.z>=zmin) && (p.z<=zmax) ) 230 | return true; 231 | else return false; 232 | } 233 | 234 | vector<int> CScene::ChooseTri( vector<int> trilist, float xmin, float xmax, float 
ymin, float ymax, float zmin, float zmax) 235 | { 236 | vector<int> chosen; 237 | chosen.clear(); 238 | 239 | for ( int i = 0; i < trilist.size(); i++) 240 | { 241 | int idx0 = TriVtxBuf[ trilist[i] ][0]; 242 | int idx1 = TriVtxBuf[ trilist[i] ][1]; 243 | int idx2 = TriVtxBuf[ trilist[i] ][2]; 244 | 245 | CPoint3D p0 = VtxBuf[ idx0 ]; 246 | CPoint3D p1 = VtxBuf[ idx1 ]; 247 | CPoint3D p2 = VtxBuf[ idx2 ]; 248 | 249 | if ( IsInBox(p0, xmin, xmax, ymin, ymax, zmin, zmax) || 250 | IsInBox(p1, xmin, xmax, ymin, ymax, zmin, zmax) || 251 | IsInBox(p2, xmin, xmax, ymin, ymax, zmin, zmax) ) 252 | chosen.push_back( trilist[i] ); 253 | } 254 | 255 | return chosen; 256 | } 257 | 258 | KdTree* CScene::ConstructKdTree(int axis, int depth, vector<int> trilist, float xmin, float xmax, float ymin, float ymax, float zmin, float zmax) 259 | { 260 | // if (trilist.size() == 0 ) return NULL; 261 | 262 | KdTree* node= new KdTree; 263 | node->xmin = xmin; 264 | node->xmax = xmax; 265 | node->ymin = ymin; 266 | node->ymax = ymax; 267 | node->zmin = zmin; 268 | node->zmax = zmax; 269 | 270 | // for ( int i =0; i< 6; i++) node->rope[i] = NULL; 271 | 272 | if ( (trilist.size() <= LEAFSIZE) ) 273 | { 274 | node->trinum = trilist.size(); 275 | // node->tri = (int*)malloc( sizeof(int)*trilist.size() ); 276 | for ( int i = 0; i < trilist.size(); i++) 277 | node->tri[i] = trilist[i]; 278 | node->axis = axis; 279 | node->splitpos = -1; 280 | node->pleft = NULL; 281 | node->pright = NULL; 282 | } 283 | else 284 | { 285 | node->trinum = -1; 286 | // node->tri = NULL; 287 | float mid; 288 | vector<int> leftptlist; 289 | leftptlist.clear(); 290 | vector<int> rightptlist; 291 | rightptlist.clear(); 292 | switch (axis) 293 | { 294 | case 0: 295 | mid = (xmin+xmax)/2; 296 | leftptlist = ChooseTri( trilist, xmin, mid, ymin, ymax, zmin, zmax); 297 | rightptlist = ChooseTri( trilist, mid, xmax, ymin, ymax, zmin, zmax); 298 | 299 | node->pleft = ConstructKdTree( (axis+1)%3, depth+1, leftptlist, xmin,mid, ymin, ymax, zmin, 
zmax); 300 | node->pright = ConstructKdTree( (axis+1)%3, depth+1, rightptlist, mid, xmax, ymin, ymax, zmin, zmax); 301 | break; 302 | case 1: 303 | mid = (ymin+ymax)/2; 304 | leftptlist = ChooseTri( trilist, xmin, xmax, ymin, mid, zmin, zmax); 305 | rightptlist = ChooseTri( trilist, xmin, xmax, mid, ymax, zmin, zmax); 306 | 307 | node->pleft = ConstructKdTree( (axis+1)%3, depth+1, leftptlist, xmin,xmax, ymin, mid, zmin, zmax); 308 | node->pright = ConstructKdTree( (axis+1)%3, depth+1, rightptlist, xmin, xmax, mid, ymax, zmin, zmax); 309 | break; 310 | case 2: 311 | mid = (zmin+zmax)/2; 312 | leftptlist = ChooseTri( trilist, xmin, xmax, ymin, ymax, zmin, mid); 313 | rightptlist = ChooseTri( trilist, xmin, xmax, ymin, ymax, mid, zmax); 314 | 315 | node->pleft = ConstructKdTree( (axis+1)%3, depth+1, leftptlist, xmin, xmax, ymin, ymax, zmin, mid); 316 | node->pright = ConstructKdTree( (axis+1)%3, depth+1, rightptlist, xmin, xmax, ymin, ymax, mid, zmax); 317 | break; 318 | default: 319 | break; 320 | } 321 | node->axis = axis; 322 | node->splitpos = mid; 323 | } 324 | 325 | return node; 326 | } 327 | 328 | bool CScene::SplitPlaneAboveBox( int axis, float splitpos, int idx) 329 | { 330 | switch (axis) 331 | { 332 | case 0: 333 | if ( splitpos >= tree_rp[idx].xmax) return true; 334 | else return false; 335 | case 1: 336 | if ( splitpos >= tree_rp[idx].ymax) return true; 337 | else return false; 338 | case 2: 339 | if ( splitpos >= tree_rp[idx].zmax) return true; 340 | else return false; 341 | default: 342 | return true; 343 | } 344 | } 345 | 346 | bool CScene::SplitPlaneBelowBox( int axis, float splitpos, int idx) 347 | { 348 | switch (axis) 349 | { 350 | case 0: 351 | if ( splitpos <= tree_rp[idx].xmin) return true; 352 | else return false; 353 | case 1: 354 | if ( splitpos <= tree_rp[idx].ymin) return true; 355 | else return false; 356 | case 2: 357 | if ( splitpos <= tree_rp[idx].zmin) return true; 358 | else return false; 359 | default: 360 | return true; 361 | } 362 
| } 363 | 364 | void CScene::OptimizeRope( int idx, int &rp, int face) 365 | { 366 | /* 367 | while ( tree_rp[rp].trinum < 0) // not leaf 368 | { 369 | if ( face/2 == tree_rp[rp].axis ) break; 370 | 371 | if ( SplitPlaneAboveBox( tree_rp[rp].axis, tree_rp[rp].splitpos, idx) ) 372 | rp = tree_rp[rp].left; 373 | else if ( SplitPlaneBelowBox( tree_rp[rp].axis, tree_rp[rp].splitpos, idx) ) 374 | rp = tree_rp[rp].right; 375 | else break; 376 | 377 | if ( rp<0) break; 378 | 379 | } 380 | */ 381 | 382 | } 383 | 384 | void CScene::BuildRope( int idx, int rope[6] ) 385 | { 386 | for ( int i =0; i < 6; i++ ) 387 | if ( abs(rope[i]) > 4000 ) 388 | { printf("rope out of range\n"); exit(-1); return;} 389 | 390 | if ( tree_rp[idx].trinum >= 0) // is leaf 391 | { 392 | for ( int i = 0; i < 6; i++ ) 393 | tree_rp[idx].rope[i] = rope[i]; 394 | } 395 | else 396 | { 397 | for ( int i =0 ; i< 6; i++ ) 398 | { 399 | if ( rope[i]>=0 ) OptimizeRope( idx, rope[i], i ); 400 | } 401 | 402 | if ( tree_rp[idx].axis < 0) printf("Line 381\n"); 403 | 404 | int sl = 2* tree_rp[idx].axis; 405 | int sr = 2* tree_rp[idx].axis + 1; 406 | 407 | int ropeL[6]; 408 | int ropeR[6]; 409 | for ( int i = 0; i< 6; i++ ) 410 | { 411 | ropeL[i] = rope[i]; 412 | ropeR[i] = rope[i]; 413 | } 414 | ropeL[sr] = tree_rp[idx].right; 415 | ropeR[sl] = tree_rp[idx].left; 416 | 417 | if ( tree_rp[idx].right >0) BuildRope( tree_rp[idx].right, ropeR); 418 | if ( tree_rp[idx].left > 0) BuildRope( tree_rp[idx].left, ropeL ); 419 | } 420 | } 421 | 422 | void CScene::RpKdTreePrint(int argc) 423 | { 424 | for ( int i =0; i < treesize; i++) 425 | { 426 | printf("node%d: left%d, right%d, axis%d, splitpos%.2f\n", i, tree_rp[i].left, tree_rp[i].right, tree_rp[i].axis, tree_rp[i].splitpos ); 427 | if (tree_rp[i].trinum >= 0) 428 | { 429 | printf(" Leaf node containing:"); 430 | for ( int j = 0; j < tree_rp[i].trinum ; j++) 431 | // printf("%d ", tree_rp[i].tri[j]); 432 | printf(". 
"); 433 | printf("\n"); 434 | printf(" Rope: "); 435 | for ( int j=0; j<6; j++ ) 436 | printf("%d ", tree_rp[i].rope[j]); 437 | printf("\n"); 438 | } 439 | 440 | printf(" xmin %.3f, xmax %.3f, ymin %.3f, ymax % .3f, zmin %.3f, zmax %.3f\n", tree_rp[i].xmin,tree_rp[i].xmax, tree_rp[i].ymin, tree_rp[i].ymax, tree_rp[i].zmin, tree_rp[i].zmax); 441 | 442 | 443 | } 444 | 445 | return; 446 | } 447 | 448 | void CScene::ConstructRpKdTree( int idx, KdTree* node) 449 | { 450 | if (!node) return; 451 | 452 | static int count = 0; 453 | 454 | tree_rp[idx].xmin = node->xmin; 455 | tree_rp[idx].xmax = node->xmax; 456 | tree_rp[idx].ymin = node->ymin; 457 | tree_rp[idx].ymax = node->ymax; 458 | tree_rp[idx].zmin = node->zmin; 459 | tree_rp[idx].zmax = node->zmax; 460 | 461 | tree_rp[idx].axis = node->axis; 462 | tree_rp[idx].splitpos = node->splitpos; 463 | 464 | tree_rp[idx].trinum = node->trinum; 465 | if ( node->trinum>= 0) 466 | { 467 | // tree_rp[idx].tri= (int*) malloc( node->trinum* sizeof(int) ); 468 | 469 | for ( int i =0; i< node->trinum; i++) 470 | tree_rp[idx].tri[i] = node->tri[i]; 471 | } 472 | // else tree_rp[idx].trinum = NULL; 473 | 474 | if ( node->pleft ) 475 | { 476 | count++; 477 | tree_rp[idx].left = count; 478 | ConstructRpKdTree( count, node->pleft); 479 | } 480 | else tree_rp[idx].left = -1; 481 | 482 | if ( node->pright ) 483 | { 484 | count++; 485 | tree_rp[idx].right = count; 486 | ConstructRpKdTree( count, node->pright); 487 | } 488 | else tree_rp[idx].right = -1; 489 | } 490 | 491 | void CScene::LoadObjects( int argc, char** argv ) 492 | { 493 | FILE* file = fopen( argv[1], "r"); 494 | if (!file) 495 | { 496 | cerr<<"Cannot open obj file"< idx; 504 | idx.resize(3); 505 | 506 | VtxBuf.clear(); 507 | TriVtxBuf.clear(); 508 | 509 | while ( fscanf(file, "%s", buf)!=EOF ) 510 | { 511 | switch( buf[0] ) 512 | { 513 | case '#': 514 | // eat up rest of the line 515 | fgets(buf,sizeof(buf), file); 516 | break; 517 | case 'v': 518 | fscanf(file, "%f %f %f\n", 
&tmp.x, &tmp.y, &tmp.z); 519 | VtxBuf.push_back(tmp); 520 | // eat up rest of the line 521 | // fgets(buf,sizeof(buf), file); 522 | break; 523 | case 'f': 524 | fscanf(file, "%d %d %d\n", &idx[0], &idx[1], &idx[2]); 525 | idx[0] = idx[0]-1; idx[1] = idx[1]-1; idx[2] =idx[2]-1; 526 | TriVtxBuf.push_back(idx); 527 | // eat up rest of the line 528 | // fgets(buf,sizeof(buf), file); 529 | break; 530 | default: 531 | // eat up rest of the line 532 | fgets(buf, sizeof(buf), file); 533 | break; 534 | } 535 | } 536 | 537 | nVtx = VtxBuf.size(); 538 | nTri = TriVtxBuf.size(); 539 | printf("Obj file loaded. Vertex number:%d. Face number:%d\n", nVtx, nTri); 540 | 541 | fclose(file); 542 | 543 | printf("Calculating normal vector...\n"); 544 | CalculateNormal(); 545 | 546 | printf("Calculating bounding box...\n"); 547 | CalcBoundingBox(); 548 | 549 | vector<int> trilist; 550 | trilist.resize( nTri ); 551 | for ( int i = 0; i < nTri; i++ ) trilist[i] = i; 552 | 553 | printf("Building kdtree...\n"); 554 | tree = ConstructKdTree( 0, 0, trilist, xmin, xmax, ymin, ymax, zmin, zmax ); 555 | 556 | treesize = CalcTreeSize(tree); 557 | tree_rp = (KdTree_rp*)malloc( treesize * sizeof(KdTree_rp) ); 558 | printf("Convert to indexed kdtree...\n"); 559 | ConstructRpKdTree(0,tree); 560 | 561 | 562 | printf("Building rope...\n"); 563 | for ( int i =0; i< treesize; i++) 564 | for ( int j =0; j<6; j++) tree_rp[i].rope[j] = -10; 565 | int rope[6]; 566 | for ( int i =0; i < 6; i++ ) rope[i] = -1; 567 | BuildRope( 0, rope ); 568 | } 569 | 570 | void CScene::WriteObjects() 571 | { 572 | FILE* fout = fopen("scene.obj", "w"); 573 | for ( int i =0; i< nVtx; i++) 574 | fprintf(fout, "v %f %f %f\n", VtxBuf[i].x, VtxBuf[i].y, VtxBuf[i].z); 575 | for ( int i = 0; i 5 | #include "kdtree.h" 6 | 7 | class CPoint3D 8 | { 9 | public: 10 | float x; 11 | float y; 12 | float z; 13 | 14 | CPoint3D operator- (CPoint3D); 15 | CPoint3D operator+ (CPoint3D); 16 | float operator* (CPoint3D); // inner dot product between 
vectors 17 | CPoint3D operator^ (CPoint3D); // cross (outer) product between vectors 18 | 19 | void normalize(); 20 | }; 21 | 22 | class RGB 23 | { 24 | public: 25 | unsigned int r; 26 | unsigned int g; 27 | unsigned int b; 28 | }; 29 | 30 | class CSandBox 31 | { 32 | public: 33 | CPoint3D m_diagdown; //x,y,z small 34 | CPoint3D m_diagup; // x,y,z large 35 | 36 | // The faces are labeled as: 37 | // 0: zmax, 1:xmax, 2:ymax, 3:xmin, 4:ymin 38 | // xmin face is the window face 39 | RGB facecolor[5]; 40 | unsigned int isReflective[5]; 41 | 42 | CSandBox() 43 | { 44 | facecolor[1].r = 200; 45 | facecolor[1].g = 0; 46 | facecolor[1].b = 0; 47 | 48 | facecolor[2].r = 0; 49 | facecolor[2].g = 200; 50 | facecolor[2].b = 0; 51 | 52 | facecolor[3].r = 0; 53 | facecolor[3].g = 0; 54 | facecolor[3].b = 200; 55 | 56 | facecolor[4].r = 250; 57 | facecolor[4].g = 250; 58 | facecolor[4].b = 250; 59 | 60 | isReflective[0] =1; 61 | isReflective[1] =0; 62 | isReflective[2] =0; 63 | isReflective[3] =0; 64 | isReflective[4] =0; 65 | } 66 | }; 67 | 68 | // Passing to texture memory is done by calling member function PassSceneToTexture() 69 | // Passing to constant memory is done when calling member function SetConstantMem() 70 | class CScene // can only manage one object 71 | { 72 | public: 73 | CSandBox m_sandbox; // pass to const. & tex. memory 74 | 75 | unsigned int m_winwidth ; // pass to constant memory 76 | unsigned int m_winheight ; // pass to const. mem 77 | 78 | unsigned int nVtx; // pass to const. mem 79 | std::vector<CPoint3D> VtxBuf; // pass to tex. mem 80 | 81 | unsigned int nTri; // pass to const. mem 82 | std::vector<std::vector<int> > TriVtxBuf; // pass to tex. mem 83 | 84 | RGB objectColor; // pass to const. mem 85 | 86 | std::vector<CPoint3D> NormalBuf; // pass to tex. mem 87 | 88 | unsigned int nLightSource; // pass to const. mem 89 | std::vector<CPoint3D> lightSource; // pass to tex. mem 90 | 91 | CPoint3D cameraPos; // pass to const. mem 92 | 93 | // pass to const. 
mem 94 | CPoint3D windowCenter; 95 | CPoint3D window_diagup; 96 | CPoint3D window_diagdown; 97 | 98 | void CalcBoundingBox(); 99 | 100 | public: 101 | // The Bounding box and KdTree 102 | float xmin; 103 | float xmax; 104 | float ymin; 105 | float ymax; 106 | float zmin; 107 | float zmax; 108 | 109 | // list of Kdtree 110 | KdTree* tree; 111 | KdTree* ConstructKdTree( int axis, int depth, std::vector<int> trilist, 112 | 113 | float xmin, float xmax, float ymin, float ymax, float zmin, float zmax); 114 | void OptimizeRope( int idx, int & rp, int face); 115 | void BuildRope( int idx, int rope[6] ); 116 | bool SplitPlaneBelowBox( int, float, int ); 117 | bool SplitPlaneAboveBox( int , float ,int ); 118 | 119 | std::vector<int> ChooseTri( std::vector<int>, float, float, float, float, float, float); 120 | 121 | int treesize; 122 | int CalcTreeSize( KdTree* node ); 123 | // Dynamic array of kdtree 124 | KdTree_rp* tree_rp; 125 | void ConstructRpKdTree( int idx, KdTree* node); 126 | 127 | public: 128 | void RpKdTreePrint( int argc); 129 | 130 | public: 131 | float* dVtxBuf; 132 | int* dTriVtxBuf; 133 | float* dNormal; 134 | float* dLS; 135 | float* dsandboxColor; 136 | unsigned int* dsandboxIsReflective; 137 | KdTree_rp* dtree; 138 | 139 | private: 140 | void CalculateNormal(); 141 | 142 | public: 143 | CScene(); 144 | 145 | public: 146 | // callback 147 | void keyboard( unsigned char key, int x, int y) 148 | { 149 | } 150 | 151 | void LoadSandBox( int argc, char** argv ); 152 | void LoadObjects( int argc, char** argv ); 153 | void WriteObjects( ); 154 | 155 | //The first thing always 156 | void SceneMalloc(); 157 | 158 | // Make sure to call SceneMalloc before 159 | void PassSceneToGlobalMem(); 160 | 161 | // Bind global memory to texture 162 | void BindToTexture(); 163 | 164 | void SetConstantMem(); 165 | 166 | // when entire application ends 167 | // Don't forget to call in the main 168 | void FreeTextureResources(); 169 | void FreeGlobalMemory(); 170 | 171 | //TODO 172 | //void 
TransformObject( ); 173 | // 174 | }; 175 | 176 | #endif 177 | --------------------------------------------------------------------------------