{"id":2499,"date":"2013-05-15T20:26:32","date_gmt":"2013-05-15T11:26:32","guid":{"rendered":"http:\/\/peta.okechan.net\/blog\/?p=2499"},"modified":"2013-05-16T09:26:23","modified_gmt":"2013-05-16T00:26:23","slug":"cuda%e3%81%a7%e3%83%ac%e3%82%a4%e3%81%a8%e3%82%b7%e3%83%bc%e3%83%b3%e3%82%92%e6%a7%8b%e6%88%90%e3%81%99%e3%82%8b%e3%83%9d%e3%83%aa%e3%82%b4%e3%83%b3%e3%81%a8%e3%81%ae%e4%ba%a4%e5%b7%ae%e5%88%a4","status":"publish","type":"post","link":"https:\/\/peta.okechan.net\/blog\/archives\/2499","title":{"rendered":"CUDA\u3092\u4f7f\u3063\u3066\u30ec\u30a4\u3068\u30b7\u30fc\u30f3\u3092\u69cb\u6210\u3059\u308b\u30dd\u30ea\u30b4\u30f3\u3068\u306e\u4ea4\u5dee\u5224\u5b9a\u3092\u3059\u308b\u30b3\u30fc\u30c9\u3092\u66f8\u3044\u3066\u307f\u305f"},"content":{"rendered":"<p><a href=\"https:\/\/peta.okechan.net\/blog\/archives\/2485\" title=\"Shadow caster map\u306b\u3088\u308b\u9ad8\u901f\u30fb\u9ad8\u54c1\u8cea\u306a\u5f71\u8a08\u7b97\">Shadow caster map<\/a>\u306e\u3068\u304d\u306b\u3001\u3084\u3063\u3071\u30ac\u30c1\u306eGPU\u30ec\u30a4\u30c8\u30ec\u3058\u3083\u306a\u3044\u3068\u3060\u3081\u304b\u3082\u3002\u3063\u3066\u66f8\u3044\u305f\u306e\u3067\u3068\u308a\u3042\u3048\u305a\u8a66\u3057\u306bCUDA\u3067\u30ec\u30a4\u3068\u30b7\u30fc\u30f3\u306e\u30dd\u30ea\u30b4\u30f3\u3068\u306e\u4ea4\u5dee\u5224\u5b9a\u3092\u3059\u308b\u30b3\u30fc\u30c9\u3092\u66f8\u3044\u3066\u307f\u305f\u3002<br \/>\n\uff08\u30e1\u30a4\u30f3\u306e\u90e8\u5206\u3060\u3051\u3060\u3051\u3069\u30ab\u30fc\u30cd\u30eb\u306e\u30b3\u30fc\u30c9\u3092\u6700\u5f8c\u306e\u65b9\u306b\u8f09\u305b\u308b\u3002\uff09<\/p>\n<p>\u305d\u306e\u52d5\u4f5c\u78ba\u8a8d\u3068\u3057\u3066\u3001\u8996\u70b9\u304b\u3089\u30ec\u30a4\u3092\u98db\u3070\u3057\u3066\u4ea4\u5dee\u3057\u305f\u3068\u3053\u308d\u307e\u3067\u306e\u8ddd\u96e2\uff08\u3044\u308f\u3086\u308b\u6df1\u5ea6\u5024\uff09\u3092\u76f4\u63a5\u53ef\u8996\u5316\u3057\u3066\u307f\u305f\u3002<br \/>\n<a href=\"https:\/\/peta.okechan.net\/blog\/wp-content\/uploads\/2013\/05\/cudart.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/peta.okechan.net\/blog\/wp-content\/uploads\/2013\/05\/cudart-150x150.jpg\" alt=\"cudart\" width=\"150\" height=\"150\" class=\"alignnone size-thumbnail wp-image-2500\" srcset=\"https:\/\/peta.okechan.net\/blog\/wp-content\/uploads\/2013\/05\/cudart-150x150.jpg 150w, https:\/\/peta.okechan.net\/blog\/wp-content\/uploads\/2013\/05\/cudart-300x300.jpg 300w, https:\/\/peta.okechan.net\/blog\/wp-content\/uploads\/2013\/05\/cudart.jpg 512w\" sizes=\"auto, (max-width: 150px) 100vw, 150px\" \/><\/a><br \/>\n1024 x 1024\u3067\u8a08\u7b97\u3057\u305f\u3082\u306e\u3092\u7e2e\u5c0f\u3057\u3066\u3042\u308b\u3002<br \/>\n\u5de6\u53f3\u9006\u306b\u306a\u3063\u3068\u308a\u307e\u3059\u306d\u3002\u591a\u5206\u30ec\u30a4\u30c8\u30ec\u90e8\u5206\u3067\u306f\u306a\u304f\u6700\u7d42\u7684\u306a\u753b\u50cf\u51fa\u529b\u51e6\u7406\u90e8\u5206\u306e\u554f\u984c\u3002<\/p>\n<p>\u30b7\u30fc\u30f3\u306e\u30c7\u30fc\u30bf\u304b\u3089BVH\u3092\u69cb\u7bc9\uff08CPU\u5074\uff09\u2192BVH\u3092\u30c8\u30e9\u30d0\u30fc\u30b9\uff08GPU\u5074\uff09\u2192\u30ec\u30a4\u3068\u30dd\u30ea\u30b4\u30f3\u306e\u4ea4\u5dee\u5224\u5b9a\uff08GPU\u5074\uff09\u3068\u3044\u3046\u6d41\u308c\u3002<br \/>\n\u4e0a\u306e\u753b\u50cf\u306e\u5834\u5408\u306f\u3001\u30b7\u30fc\u30f3\u306e\u30dd\u30ea\u30b4\u30f3\u657032\u3001\u30ec\u30a4\u65701024 x 1024 \u2252 100\u4e07\u3067\u30011.7\u79d2\u307b\u3069\u639b\u304b\u3063\u305f\u304b\u3089\u3042\u307e\u308a\u901f\u304f\u306a\u3044\u3002<br \/>\nGeforce 9400M\u3067\u8d05\u6ca2\u8a00\u3044\u904e\u304e\u304b\u306a\uff1f<br \/>\n\u3084\u306f\u308a\u53ef\u80fd\u306a\u3089\u3070GLSL\u3068\u304b\u3067\u666e\u901a\u306e\u30b0\u30e9\u30d5\u30a3\u30c3\u30af\u30b9\u51e6\u7406\u306b\u8fd1\u3044\u3084\u308a\u304b\u305f\u3067\u51e6\u7406\u3059\u308b\u3088\u3046\u306b\u3057\u305f\u307b\u3046\u304cGPU\u306e\u6027\u80fd\u306f\u5f15\u304d\u51fa\u305b\u308b\u307f\u305f\u3044\u3002<br \/>\n\u307e\u3060\u5168\u7136\u6700\u9069\u5316\u306f\u3057\u3066\u306a\u3044\u3051\u3069\u3001\u4eca\u306e\u69cb\u9020\u306e\u307e\u307e\u3067\u5927\u304d\u304f\u9ad8\u901f\u5316\u3059\u308b\u4f59\u5730\u304c\u3042\u308b\u306e\u304b\u306f\u5206\u304b\u3089\u306a\u3044\u3002<br \/>\n\u5206\u5c90\u3057\u307e\u304f\u308a\u306a\u306e\u3067\u305d\u308c\u3092\u6e1b\u3089\u305b\u308c\u3070\u591a\u5c11\u306f\u901f\u304f\u306a\u308b\u3093\u3058\u3083\u306a\u304b\u308d\u3046\u304b\u3002<br \/>\n\u3082\u3063\u3068\u65b0\u3057\u3044GPU\u306a\u3089\u7279\u6027\u3082\u9055\u3046\u3060\u308d\u3046\u3057\u3082\u3063\u3068\u8272\u3005\u3084\u308a\u3088\u3046\u304c\u3042\u308b\u611f\u3058\u306f\u3059\u308b\u3002<\/p>\n<p>\u305f\u3060\u3001CUDA\u3063\u3066\u3046\u3061\u306e\u74b0\u5883\u3067\u306f\u306a\u3093\u304b\u4e0d\u5b89\u5b9a\u3067\u3001\u3061\u3083\u3093\u3068\u30e1\u30e2\u30ea\u3092\u89e3\u653e\u3057\u3066\u308b\u3064\u3082\u308a\u306a\u3093\u3060\u3051\u3069\u4f55\u56de\u304b\u5b9f\u884c\u3057\u3066\u308b\u3068\u30e1\u30e2\u30ea\u304c\u8db3\u308a\u306a\u304f\u306a\u3063\u305f\u308a\u3001\u305f\u307e\u306bcuInit()\u304cCUDA_ERROR_NOT_INITIALIZED\u3092\u8fd4\u3057\u3066\u304d\u305f\u308a\uff08\u304a\u307e\u3048\u304c\u305d\u308c\u8fd4\u3057\u3066\u3069\u3046\u3059\u3093\u306d\u3093w\uff09\u3001\u5fae\u5999\u611f\u304c\u3042\u308b\u3002<br \/>\nOpenGL\u3067\u3084\u3063\u3066\u308b\u3068\u304d\u306f\u305d\u3046\u3044\u3046\u4e8b\u306f\u5168\u304f\u306a\u3044\u3002<br \/>\n\u3084\u306f\u308aWindows\u3067\u958b\u767a\u3057\u305f\u307b\u3046\u304c\u5b89\u5b9a\u6027\u306f\u9ad8\u3044\u306e\u304b\u3082\u3057\u308c\u306a\u3044\u3002<\/p>\n<p>\u3061\u306a\u307f\u306bGPU\u306e\u30e1\u30e2\u30ea\u304c\u8db3\u308a\u306a\u304f\u306a\u3063\u305f\u3089\u3001Mac\u3092\u518d\u8d77\u52d5\u3059\u308c\u3070\u5f53\u7136\u89e3\u653e\u3055\u308c\u308b\u3093\u3060\u3051\u3069\u3001\u30b9\u30ea\u30fc\u30d7\u3057\u3066\u3059\u3050\u89e3\u9664\u3059\u308b\u3060\u3051\u3067\u3082\u89e3\u653e\u3055\u308c\u308b\u307f\u305f\u3044\u3002<\/p>\n<p>\u4ee5\u4e0b\u3001\u4eca\u56de\u306e\u30ab\u30fc\u30cd\u30eb\u306e\u30b3\u30fc\u30c9\u3002<br \/>\n\u6065\u3092\u5fcd\u3093\u3067\u5168\u4f53\u3092\u516c\u958b\u3057\u305f\u304b\u3063\u305f\u3093\u3060\u3051\u3069\u3001Xcode\u3092\u4f7f\u3063\u3066\u308b\u3082\u3093\u3067\u3001\u500b\u4eba\u60c5\u5831\u7684\u306b\u305d\u306e\u307e\u307e\u516c\u958b\u3057\u3066\u5927\u4e08\u592b\u304b\u3044\u307e\u3044\u3061\u78ba\u4fe1\u304c\u3082\u3066\u306a\u3044\u306e\u3067\u3001\u809d\u3068\u306a\u308b\u90e8\u5206\u3060\u3051\u3002<br \/>\nBVH\u306e\u30c7\u30fc\u30bf\u69cb\u9020\u3068\u3057\u3066\u306f\u3001\u30ce\u30fc\u30c9\u306fAABB\u305d\u306e\u3082\u306e\u3067\u3001\u5168\u4f53\u3092\u4e8c\u5206\u30d2\u30fc\u30d7\u5316\u3057\u30661\u6b21\u5143\u914d\u5217\u5316\u3057\u305f\u3082\u306e\u306a\u306e\u3067\u3001\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u8a08\u7b97\u3060\u3051\u3067\u89aa\u3084\u5b50\u3092\u8fbf\u308c\u308b\u3088\u3046\u306b\u306a\u3063\u3066\u3044\u308b\u3002<br \/>\nCPU\u5074\u306e\u30b3\u30fc\u30c9\u3067\u306ftraverse\u3068intersect\u306e\u30ad\u30c3\u30af\u30671\u30bb\u30c3\u30c8\u3067\u3001\u3053\u308c\u3092\u5168\u3066\u306e\u30ec\u30a4\u306e\u51e6\u7406\u304c\u7d42\u308f\u308b\u307e\u3067\u8907\u6570\u56de\u7e70\u308a\u8fd4\u3059\u3088\u3046\u306b\u3057\u3066\u3044\u308b\u3002<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">#define IN_CUDA_KERNEL\r\n#include &quot;CUDARay.hpp&quot;\r\n#include &quot;CUDARayHitInfo.hpp&quot;\r\n#include &quot;CUDAAabb.hpp&quot;\r\n\r\ntexture&lt;float4, cudaTextureType1D, cudaReadModeElementType&gt; bvhTexRef;\r\ntexture&lt;float4, cudaTextureType1D, cudaReadModeElementType&gt; triTexRef;\r\n\r\n\/\/ \u89aa\u30ce\u30fc\u30c9\u306eID\u3092\u5f97\u308b\r\n__device__ int parent(int nodeId) {\r\n    return (nodeId - (2 - nodeId % 2)) \/ 2;\r\n}\r\n\r\n\/\/ \u5de6\u306e\u5b50\u30ce\u30fc\u30c9\u306eID\u3092\u5f97\u308b\r\n__device__ int leftChild(int nodeId) {\r\n    return nodeId * 2 + 1;\r\n}\r\n\r\n\/\/ \u53f3\u306e\u5b50\u30ce\u30fc\u30c9\u306eID\u3092\u5f97\u308b\r\n__device__ int rightChild(int nodeId) {\r\n    return nodeId * 2 + 2;\r\n}\r\n\r\n\/\/ \u8fd1\u3044\u5074\u306e\u5b50\u3068\u9060\u3044\u5074\u306e\u5b50\u3092\u5f97\u308b\r\n__device__ void getNearAndFarChild(CUDARay &amp;ray, int nodeId, int &amp;near, int &amp;far) {\r\n    int left = leftChild(nodeId);\r\n    int right = rightChild(nodeId);\r\n    \r\n    CUDAAabb left_aabb(bvhTexRef, left);\r\n    CUDAAabb right_aabb(bvhTexRef, right);\r\n    \r\n    \/\/ AABB\u306e\u4e2d\u5fc3\u3068\u30ec\u30a4\u306e\u59cb\u70b9\u3067\u6bd4\u8f03\u3059\u308b\r\n    float ld = (left_aabb.getCenter() - ray.origin).length2();\r\n    float rd = (right_aabb.getCenter() - ray.origin).length2();\r\n    \r\n    near = (ld &lt; rd)?left:right;\r\n    far = (ld &lt; rd)?right:left;\r\n}\r\n\r\n\/\/ AABB\u3068\u30ec\u30a4\u306e\u4ea4\u5dee\u5224\u5b9a\r\n__device__ bool boxtest(CUDARay &amp;ray, float maxT, int nodeId)\r\n{\r\n    CUDAAabb aabb(bvhTexRef, nodeId);\r\n    return aabb.isIntersect(ray, maxT);\r\n}\r\n\r\n\/\/ \u8449\u30ce\u30fc\u30c9\u304b\u3069\u3046\u304b\r\n__device__ bool isLeaf(int nodeId, int aabb_count)\r\n{    \r\n    return (((aabb_count + 1) \/ 2 - 1) &lt;= nodeId &amp;&amp; nodeId &lt;= (aabb_count - 1));\r\n}\r\n\r\nextern &quot;C&quot;\r\n{\r\n    \/\/ \u30c8\u30e9\u30d0\u30fc\u30b9\u51e6\u7406\r\n    __global__ void traverse(CUDARay *rays, CUDARayHitInfo *hitInfos, int *stagedCount, int n, int aabb_count)\r\n    {\r\n        int threadIndex = blockDim.x * blockIdx.x + threadIdx.x;\r\n        if (!(threadIndex &lt; n)) return;\r\n        \r\n        CUDARay ray = rays&#x5B;threadIndex];\r\n        float maxT = hitInfos&#x5B;threadIndex].t;\r\n        \r\n        while (ray.current_node &gt; -1) {\r\n            int near, far;\r\n            getNearAndFarChild(ray, ray.current_node, near, far);\r\n            \r\n            \/\/ \u9060\u3044\u307b\u3046\u306e\u5b50\u30ce\u30fc\u30c9\u304b\u3089\u623b\u3063\u3066\u304d\u305f\u5834\u5408\u306f\u3055\u3089\u306b\u89aa\u306b\u623b\u308b\r\n            if (ray.last_node == far) {\r\n                ray.last_node = ray.current_node;\r\n                ray.current_node = parent(ray.current_node);\r\n                continue;\r\n            }\r\n            \r\n            \/\/ \u5b50\u30ce\u30fc\u30c9\u3068\u30ec\u30a4-AABB\u306e\u4ea4\u5dee\u30c6\u30b9\u30c8\u3092\u3059\u308b\u3002\r\n            \/\/ \u89aa\u304b\u3089\u964d\u308a\u3066\u304d\u305f\u5834\u5408\u306fnear, \u305d\u3046\u3058\u3083\u306a\u3044\u5834\u5408\u306ffar\u3092\u51e6\u7406\u3059\u308b\r\n            int tryChild = (ray.last_node == parent(ray.current_node))?near:far;\r\n            if (boxtest(ray, maxT, tryChild)) {\r\n                ray.last_node = ray.current_node;\r\n                ray.current_node = tryChild;\r\n                if (isLeaf(ray.current_node, aabb_count)) {\r\n                    \/\/ \u8449\u30ce\u30fc\u30c9\u306a\u3089\u7d42\u4e86\r\n                    atomicAdd(stagedCount, 1);\r\n                    break;\r\n                }\r\n            } else {\r\n                if (tryChild == near) {\r\n                    \/\/ \u4ea4\u5dee\u5224\u5b9a\u306e\u5931\u6557\u304cnear\u306a\u3089\u6b21\u306ffar\u3092\u51e6\u7406\u3059\u308b\r\n                    ray.last_node = near;\r\n                } else {\r\n                    \/\/ 2\u3064\u306e\u5b50\u3068\u306e\u4ea4\u5dee\u5224\u5b9a\u304c\u7d42\u308f\u3063\u305f\u3089\u89aa\u306b\u623b\u308b\r\n                    ray.last_node = ray.current_node;\r\n                    ray.current_node = parent(ray.current_node);\r\n                }\r\n                \/\/ray.last_node = (tryChild == near)?near:ray.current_node;\r\n                \/\/ray.current_node = (tryChild == near)?ray.current_node:parent(ray.current_node);\r\n            }\r\n        }\r\n        \r\n        \/\/__syncthreads();\r\n        rays&#x5B;threadIndex] = ray;\r\n        \/\/ \u3053\u306e\u30d5\u30a7\u30fc\u30ba\u3067\u306fhitInfo\u306b\u66f8\u304d\u8fbc\u307f\u306f\u306a\u3044\r\n    }\r\n    \r\n    \/\/ \u30ec\u30a4-\u4e09\u89d2\u5f62\u4ea4\u5dee\u5224\u5b9a\r\n    __global__ void intersect(CUDARay *rays, CUDARayHitInfo *hitInfos, int n, int leaf_base_id)\r\n    {\r\n        int threadIndex = blockDim.x * blockIdx.x + threadIdx.x;\r\n        if (!(threadIndex &lt; n)) return;\r\n        \r\n        CUDARay ray = rays&#x5B;threadIndex];\r\n        \r\n        if (ray.current_node &gt;= leaf_base_id) {\r\n            CUDARayHitInfo hitInfo = hitInfos&#x5B;threadIndex];\r\n            \/\/ ray.current_node\u304b\u3089\u4e09\u89d2\u5f62\u3092\u53d6\u5f97\r\n            CUDATriangle tri(triTexRef, ray.current_node - leaf_base_id);\r\n            if (tri.isIntersect(ray, hitInfo)) {\r\n                \/\/ hitInfo.t\u3088\u308a\u8fd1\u304f\u3067\u4ea4\u5dee\r\n                hitInfo.triangle_id = ray.current_node - leaf_base_id;\r\n                hitInfos&#x5B;threadIndex] = hitInfo;\r\n            }\r\n        }\r\n    }\r\n}<\/pre>\n","protected":false},"excerpt":{"rendered":"<p><a href=\"https:\/\/peta.okechan.net\/blog\/archives\/2485\" title=\"Shadow caster map\u306b\u3088\u308b\u9ad8\u901f\u30fb\u9ad8\u54c1\u8cea\u306a\u5f71\u8a08\u7b97\">Shadow caster map<\/a>\u306e\u3068\u304d\u306b\u3001\u3084\u3063\u3071\u30ac\u30c1\u306eGPU\u30ec\u30a4\u30c8\u30ec\u3058\u3083\u306a\u3044\u3068\u3060\u3081\u304b\u3082\u3002\u3063\u3066\u66f8\u3044\u305f\u306e\u3067\u3068\u308a\u3042\u3048\u305a\u8a66\u3057\u306bCUDA\u3067\u30ec\u30a4\u3068\u30b7\u30fc\u30f3\u306e\u30dd\u30ea\u30b4\u30f3\u3068\u306e\u4ea4\u5dee\u5224\u5b9a\u3092\u3059\u308b\u30b3\u30fc\u30c9\u3092\u66f8\u3044\u3066\u307f\u305f\u3002<br \/>\n\uff08\u30e1\u30a4\u30f3\u306e\u90e8\u5206\u3060\u3051\u3060\u3051\u3069\u30ab\u30fc\u30cd\u30eb\u306e\u30b3\u30fc\u30c9\u3092\u6700\u5f8c\u306e\u65b9\u306b\u8f09\u305b\u308b\u3002\uff09<\/p>\n<p>\u305d\u306e\u52d5\u4f5c\u78ba\u8a8d\u3068\u3057\u3066\u3001\u8996\u70b9\u304b\u3089\u30ec\u30a4\u3092\u98db\u3070\u3057\u3066\u4ea4\u5dee\u3057\u305f\u3068\u3053\u308d\u307e\u3067\u306e\u8ddd\u96e2\uff08\u3044\u308f\u3086\u308b\u6df1\u5ea6\u5024\uff09\u3092\u76f4\u63a5\u53ef\u8996\u5316\u3057\u3066\u307f\u305f\u3002<br \/>\n<a href=\"https:\/\/peta.okechan.net\/blog\/wp-content\/uploads\/2013\/05\/cudart.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/peta.okechan.net\/blog\/wp-content\/uploads\/2013\/05\/cudart-150x150.jpg\" alt=\"cudart\" width=\"150\" height=\"150\" class=\"alignnone size-thumbnail wp-image-2500\" srcset=\"https:\/\/peta.okechan.net\/blog\/wp-content\/uploads\/2013\/05\/cudart-150x150.jpg 150w, https:\/\/peta.okechan.net\/blog\/wp-content\/uploads\/2013\/05\/cudart-300x300.jpg 300w, https:\/\/peta.okechan.net\/blog\/wp-content\/uploads\/2013\/05\/cudart.jpg 512w\" sizes=\"auto, (max-width: 150px) 100vw, 150px\" \/><\/a><br \/>\n1024 x 1024\u3067\u8a08\u7b97\u3057\u305f\u3082\u306e\u3092\u7e2e\u5c0f\u3057\u3066\u3042\u308b\u3002<br \/>\n\u5de6\u53f3\u9006\u306b\u306a\u3063\u3068\u308a\u307e\u3059\u306d\u3002\u591a\u5206\u30ec\u30a4\u30c8\u30ec\u90e8\u5206\u3067\u306f\u306a\u304f\u6700\u7d42\u7684\u306a\u753b\u50cf\u51fa\u529b\u51e6\u7406\u90e8\u5206\u306e\u554f\u984c\u3002<\/p>\n<p>\u30b7\u30fc\u30f3\u306e\u30c7\u30fc\u30bf\u304b\u3089BVH\u3092\u69cb\u7bc9\uff08CPU\u5074\uff09\u2192BVH\u3092\u30c8\u30e9\u30d0\u30fc\u30b9\uff08GPU\u5074\uff09\u2192\u30ec\u30a4\u3068\u30dd\u30ea\u30b4\u30f3\u306e\u4ea4\u5dee\u5224\u5b9a\uff08GPU\u5074\uff09\u3068\u3044\u3046\u6d41\u308c\u3002<br \/>\n\u4e0a\u306e\u753b\u50cf\u306e\u5834\u5408\u306f\u3001\u30b7\u30fc\u30f3\u306e\u30dd\u30ea\u30b4\u30f3\u657032\u3001\u30ec\u30a4\u65701024 x 1024 \u2252 100\u4e07\u3067\u30011.7\u79d2\u307b\u3069\u639b\u304b\u3063\u305f\u304b\u3089\u3042\u307e\u308a\u901f\u304f\u306a\u3044\u3002<br \/>\nGeforce 9400M\u3067\u8d05\u6ca2\u8a00\u3044\u904e\u304e\u304b\u306a\uff1f<br \/>\n\u3084\u306f\u308a\u53ef\u80fd\u306a\u3089\u3070GLSL\u3068\u304b\u3067\u666e\u901a\u306e\u30b0\u30e9\u30d5\u30a3\u30c3\u30af\u30b9\u51e6\u7406\u306b\u8fd1\u3044\u3084\u308a\u304b\u305f\u3067\u51e6\u7406\u3059\u308b\u3088\u3046\u306b\u3057\u305f\u307b\u3046\u304cGPU\u306e\u6027\u80fd\u306f\u5f15\u304d\u51fa\u305b\u308b\u307f\u305f\u3044\u3002<br \/>\n\u307e\u3060\u5168\u7136\u6700\u9069\u5316\u306f\u3057\u3066\u306a\u3044\u3051\u3069\u3001\u4eca\u306e\u69cb\u9020\u306e\u307e\u307e\u3067\u5927\u304d\u304f\u9ad8\u901f\u5316\u3059\u308b\u4f59\u5730\u304c\u3042\u308b\u306e\u304b\u306f\u5206\u304b\u3089\u306a\u3044\u3002<br \/>\n\u5206\u5c90\u3057\u307e\u304f\u308a\u306a\u306e\u3067\u305d\u308c\u3092\u6e1b\u3089\u305b\u308c\u3070\u591a\u5c11\u306f\u901f\u304f\u306a\u308b\u3093\u3058\u3083\u306a\u304b\u308d\u3046\u304b\u3002<br \/>\n\u3082\u3063\u3068\u65b0\u3057\u3044GPU\u306a\u3089\u7279\u6027\u3082\u9055\u3046\u3060\u308d\u3046\u3057\u3082\u3063\u3068\u8272\u3005\u3084\u308a\u3088\u3046\u304c\u3042\u308b\u611f\u3058\u306f\u3059\u308b\u3002<\/p>\n<p>\u305f\u3060\u3001CUDA\u3063\u3066\u3046\u3061\u306e\u74b0\u5883\u3067\u306f\u306a\u3093\u304b\u4e0d\u5b89\u5b9a\u3067\u3001\u3061\u3083\u3093\u3068\u30e1\u30e2\u30ea\u3092\u89e3\u653e\u3057\u3066\u308b\u3064\u3082\u308a\u306a\u3093\u3060\u3051\u3069\u4f55\u56de\u304b\u5b9f\u884c\u3057\u3066\u308b\u3068\u30e1\u30e2\u30ea\u304c\u8db3\u308a\u306a\u304f\u306a\u3063\u305f\u308a\u3001\u305f\u307e\u306bcuInit()\u304cCUDA_ERROR_NOT_INITIALIZED\u3092\u8fd4\u3057\u3066\u304d\u305f\u308a\uff08\u304a\u307e\u3048\u304c\u305d\u308c\u8fd4\u3057\u3066\u3069\u3046\u3059\u3093\u306d\u3093w\uff09\u3001\u5fae\u5999\u611f\u304c\u3042\u308b\u3002<br \/>\nOpenGL\u3067\u3084\u3063\u3066\u308b\u3068\u304d\u306f\u305d\u3046\u3044\u3046\u4e8b\u306f\u5168\u304f\u306a\u3044\u3002<br \/>\n\u3084\u306f\u308aWindows\u3067\u958b\u767a\u3057\u305f\u307b\u3046\u304c\u5b89\u5b9a\u6027\u306f\u9ad8\u3044\u306e\u304b\u3082\u3057\u308c\u306a\u3044\u3002<\/p>\n<p>\u3061\u306a\u307f\u306bGPU\u306e\u30e1\u30e2\u30ea\u304c\u8db3\u308a\u306a\u304f\u306a\u3063\u305f\u3089\u3001Mac\u3092\u518d\u8d77\u52d5\u3059\u308c\u3070\u5f53\u7136\u89e3\u653e\u3055\u308c\u308b\u3093\u3060\u3051\u3069\u3001\u30b9\u30ea\u30fc\u30d7\u3057\u3066\u3059\u3050\u89e3\u9664\u3059\u308b\u3060\u3051\u3067\u3082\u89e3\u653e\u3055\u308c\u308b\u307f\u305f\u3044\u3002<\/p>\n<p>\u4ee5\u4e0b\u3001\u4eca\u56de\u306e\u30ab\u30fc\u30cd\u30eb\u306e\u30b3\u30fc\u30c9\u3002<br \/>\n\u6065\u3092\u5fcd\u3093\u3067\u5168\u4f53\u3092\u516c\u958b\u3057\u305f\u304b\u3063\u305f\u3093\u3060\u3051\u3069\u3001Xcode\u3092\u4f7f\u3063\u3066\u308b\u3082\u3093\u3067\u3001\u500b\u4eba\u60c5\u5831\u7684\u306b\u305d\u306e\u307e\u307e\u516c\u958b\u3057\u3066\u5927\u4e08\u592b\u304b\u3044\u307e\u3044\u3061\u78ba\u4fe1\u304c\u3082\u3066\u306a\u3044\u306e\u3067\u3001\u809d\u3068\u306a\u308b\u90e8\u5206\u3060\u3051\u3002<br \/>\nBVH\u306e\u30c7\u30fc\u30bf\u69cb\u9020\u3068\u3057\u3066\u306f\u3001\u30ce\u30fc\u30c9\u306fAABB\u305d\u306e\u3082\u306e\u3067\u3001\u5168\u4f53\u3092\u4e8c\u5206\u30d2\u30fc\u30d7\u5316\u3057\u30661\u6b21\u5143\u914d\u5217\u5316\u3057\u305f\u3082\u306e\u306a\u306e\u3067\u3001\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u8a08\u7b97\u3060\u3051\u3067\u89aa\u3084\u5b50\u3092\u8fbf\u308c\u308b\u3088\u3046\u306b\u306a\u3063\u3066\u3044\u308b\u3002<br \/>\nCPU\u5074\u306e\u30b3\u30fc\u30c9\u3067\u306ftraverse\u3068intersect\u306e\u30ad\u30c3\u30af\u30671\u30bb\u30c3\u30c8\u3067\u3001\u3053\u308c\u3092\u5168\u3066\u306e\u30ec\u30a4\u306e\u51e6\u7406\u304c\u7d42\u308f\u308b\u307e\u3067\u8907\u6570\u56de\u7e70\u308a\u8fd4\u3059\u3088\u3046\u306b\u3057\u3066\u3044\u308b\u3002<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">#define IN_CUDA_KERNEL\r\n#include &quot;CUDARay.hpp&quot;\r\n#include &quot;CUDARayHitInfo.hpp&quot;\r\n#include &quot;CUDAAabb.hpp&quot;\r\n\r\ntexture&lt;float4, cudaTextureType1D, cudaReadModeElementType&gt; bvhTexRef;\r\ntexture&lt;float4, cudaTextureType1D, cudaReadModeElementType&gt; triTexRef;\r\n\r\n\/\/ \u89aa\u30ce\u30fc\u30c9\u306eID\u3092\u5f97\u308b\r\n__device__ int parent(int nodeId) {\r\n    return (nodeId - (2 - nodeId % 2)) \/ 2;\r\n}\r\n\r\n\/\/ \u5de6\u306e\u5b50\u30ce\u30fc\u30c9\u306eID\u3092\u5f97\u308b\r\n__device__ int leftChild(int nodeId) {\r\n    return nodeId * 2 + 1;\r\n}\r\n\r\n\/\/ \u53f3\u306e\u5b50\u30ce\u30fc\u30c9\u306eID\u3092\u5f97\u308b\r\n__device__ int rightChild(int nodeId) {\r\n    return nodeId * 2 + 2;\r\n}\r\n\r\n\/\/ \u8fd1\u3044\u5074\u306e\u5b50\u3068\u9060\u3044\u5074\u306e\u5b50\u3092\u5f97\u308b\r\n__device__ void getNearAndFarChild(CUDARay &amp;ray, int nodeId, int &amp;near, int &amp;far) {\r\n    int left = leftChild(nodeId);\r\n    int right = rightChild(nodeId);\r\n    \r\n    CUDAAabb left_aabb(bvhTexRef, left);\r\n    CUDAAabb right_aabb(bvhTexRef, right);\r\n    \r\n    \/\/ AABB\u306e\u4e2d\u5fc3\u3068\u30ec\u30a4\u306e\u59cb\u70b9\u3067\u6bd4\u8f03\u3059\u308b\r\n    float ld = (left_aabb.getCenter() - ray.origin).length2();\r\n    float rd = (right_aabb.getCenter() - ray.origin).length2();\r\n    \r\n    near = (ld &lt; rd)?left:right;\r\n    far = (ld &lt; rd)?right:left;\r\n}\r\n\r\n\/\/ AABB\u3068\u30ec\u30a4\u306e\u4ea4\u5dee\u5224\u5b9a\r\n__device__ bool boxtest(CUDARay &amp;ray, float maxT, int nodeId)\r\n{\r\n    CUDAAabb aabb(bvhTexRef, nodeId);\r\n    return aabb.isIntersect(ray, maxT);\r\n}\r\n\r\n\/\/ \u8449\u30ce\u30fc\u30c9\u304b\u3069\u3046\u304b\r\n__device__ bool isLeaf(int nodeId, int aabb_count)\r\n{    \r\n    return (((aabb_count + 1) \/ 2 - 1) &lt;= nodeId &amp;&amp; nodeId &lt;= (aabb_count - 1));\r\n}\r\n\r\nextern &quot;C&quot;\r\n{\r\n    \/\/ \u30c8\u30e9\u30d0\u30fc\u30b9\u51e6\u7406\r\n    __global__ void traverse(CUDARay *rays, CUDARayHitInfo *hitInfos, int *stagedCount, int n, int aabb_count)\r\n    {\r\n        int threadIndex = blockDim.x * blockIdx.x + threadIdx.x;\r\n        if (!(threadIndex &lt; n)) return;\r\n        \r\n        CUDARay ray = rays&#x5B;threadIndex];\r\n        float maxT = hitInfos&#x5B;threadIndex].t;\r\n        \r\n        while (ray.current_node &gt; -1) {\r\n            int near, far;\r\n            getNearAndFarChild(ray, ray.current_node, near, far);\r\n            \r\n            \/\/ \u9060\u3044\u307b\u3046\u306e\u5b50\u30ce\u30fc\u30c9\u304b\u3089\u623b\u3063\u3066\u304d\u305f\u5834\u5408\u306f\u3055\u3089\u306b\u89aa\u306b\u623b\u308b\r\n            if (ray.last_node == far) {\r\n                ray.last_node = ray.current_node;\r\n                ray.current_node = parent(ray.current_node);\r\n                continue;\r\n            }\r\n            \r\n            \/\/ \u5b50\u30ce\u30fc\u30c9\u3068\u30ec\u30a4-AABB\u306e\u4ea4\u5dee\u30c6\u30b9\u30c8\u3092\u3059\u308b\u3002\r\n            \/\/ \u89aa\u304b\u3089\u964d\u308a\u3066\u304d\u305f\u5834\u5408\u306fnear, \u305d\u3046\u3058\u3083\u306a\u3044\u5834\u5408\u306ffar\u3092\u51e6\u7406\u3059\u308b\r\n            int tryChild = (ray.last_node == parent(ray.current_node))?near:far;\r\n            if (boxtest(ray, maxT, tryChild)) {\r\n                ray.last_node = ray.current_node;\r\n                ray.current_node = tryChild;\r\n                if (isLeaf(ray.current_node, aabb_count)) {\r\n                    \/\/ \u8449\u30ce\u30fc\u30c9\u306a\u3089\u7d42\u4e86\r\n                    atomicAdd(stagedCount, 1);\r\n                    break;\r\n                }\r\n            } else {\r\n                if (tryChild == near) {\r\n                    \/\/ \u4ea4\u5dee\u5224\u5b9a\u306e\u5931\u6557\u304cnear\u306a\u3089\u6b21\u306ffar\u3092\u51e6\u7406\u3059\u308b\r\n                    ray.last_node = near;\r\n                } else {\r\n                    \/\/ 2\u3064\u306e\u5b50\u3068\u306e\u4ea4\u5dee\u5224\u5b9a\u304c\u7d42\u308f\u3063\u305f\u3089\u89aa\u306b\u623b\u308b\r\n                    ray.last_node = ray.current_node;\r\n                    ray.current_node = parent(ray.current_node);\r\n                }\r\n                \/\/ray.last_node = (tryChild == near)?near:ray.current_node;\r\n                \/\/ray.current_node = (tryChild == near)?ray.current_node:parent(ray.current_node);\r\n            }\r\n        }\r\n        \r\n        \/\/__syncthreads();\r\n        rays&#x5B;threadIndex] = ray;\r\n        \/\/ \u3053\u306e\u30d5\u30a7\u30fc\u30ba\u3067\u306fhitInfo\u306b\u66f8\u304d\u8fbc\u307f\u306f\u306a\u3044\r\n    }\r\n    \r\n    \/\/ \u30ec\u30a4-\u4e09\u89d2\u5f62\u4ea4\u5dee\u5224\u5b9a\r\n    __global__ void intersect(CUDARay *rays, CUDARayHitInfo *hitInfos, int n, int leaf_base_id)\r\n    {\r\n        int threadIndex = blockDim.x * blockIdx.x + threadIdx.x;\r\n        if (!(threadIndex &lt; n)) return;\r\n        \r\n        CUDARay ray = rays&#x5B;threadIndex];\r\n        \r\n        if (ray.current_node &gt;= leaf_base_id) {\r\n            CUDARayHitInfo hitInfo = hitInfos&#x5B;threadIndex];\r\n            \/\/ ray.current_node\u304b\u3089\u4e09\u89d2\u5f62\u3092\u53d6\u5f97\r\n            CUDATriangle tri(triTexRef, ray.current_node - leaf_base_id);\r\n            if (tri.isIntersect(ray, hitInfo)) {\r\n                \/\/ hitInfo.t\u3088\u308a\u8fd1\u304f\u3067\u4ea4\u5dee\r\n                hitInfo.triangle_id = ray.current_node - leaf_base_id;\r\n                hitInfos&#x5B;threadIndex] = hitInfo;\r\n            }\r\n        }\r\n    }\r\n}<\/pre>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[32],"tags":[289],"class_list":["post-2499","post","type-post","status-publish","format-standard","hentry","category-tech","tag-cuda"],"_links":{"self":[{"href":"https:\/\/peta.okechan.net\/blog\/wp-json\/wp\/v2\/posts\/2499","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/peta.okechan.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/peta.okechan.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/peta.okechan.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/peta.okechan.net\/blog\/wp-json\/wp\/v2\/comments?post=2499"}],"version-history":[{"count":0,"href":"https:\/\/peta.okechan.net\/blog\/wp-json\/wp\/v2\/posts\/2499\/revisions"}],"wp:attachment":[{"href":"https:\/\/peta.okechan.net\/blog\/wp-json\/wp\/v2\/media?parent=2499"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/peta.okechan.net\/blog\/wp-json\/wp\/v2\/categories?post=2499"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/peta.okechan.net\/blog\/wp-json\/wp\/v2\/tags?post=2499"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}