foldingcookie2 replied to your post: This week on ‘developing for OpenCL really...
does “Intersection ret = {};” necessarily initialize ret to something sane (.intersects=false)? I’d pass the accumulator a simple float distance (potentially inf) rather than an Intersection just to decrease # of moving parts
Yeah, no, it definitely zero-initializes every member, which sets .intersects to false, since ‘false’ and ‘zero’ are the same thing in C. The idea of getting rid of the .intersects field is good though, I’ll probably do that.
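To make the zero-initialization point concrete, here is a minimal plain-C sketch rather than the actual kernel code. The struct layout and field types are my assumption (the real Intersection is not shown in this post), and empty_intersection is a hypothetical helper name. I have spelled the initializer `{0}`, which is the portable C99 form; the empty-brace `{}` form is a compiler extension before C23, so whether a given OpenCL compiler accepts it may vary.

```c
#include <stdbool.h>

/* Stand-in for the kernel's Intersection struct; field types are assumed. */
typedef struct {
    unsigned long primitive;
    float distance;
    bool intersects;
} Intersection;

/* Hypothetical helper: returns an Intersection with every member zeroed. */
Intersection empty_intersection(void) {
    /* Naming the first member's value zero-initializes all the others too. */
    Intersection ret = {0};
    return ret;
}
```

Every member comes back as zero here, so .intersects reads as false without being assigned explicitly.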
I tried changing things one at a time until it worked again. My working version looks like this now:
#define INTERSECTION_ACCUMULATOR \
    const Triangle tri = triangles[triangle_index]; \
    const float distance = triangle_intersection(tri, vertices, ray); \
    if (EPSILON < distance && (!ret.intersects || distance < ret.distance)) { \
        ret.primitive = triangle_index; \
        ret.distance = distance; \
        ret.intersects = true; \
    }

Intersection ray_triangle_intersection(Ray ray,
                                       const global Triangle * triangles,
                                       ulong numtriangles,
                                       const global float3 * vertices) {
    Intersection ret = {};
    for (ulong i = 0; i != numtriangles; ++i) {
        const ulong triangle_index = i;
        INTERSECTION_ACCUMULATOR
    }
    return ret;
}
The thing that 'fixed' it was moving the triangles[triangle_index] lookup to its own line. I can only assume the AMD OpenCL compiler has taken some artistic liberties with the sequence point rules.
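For reference, here is a rough sketch of what dropping the .intersects field might look like, written as plain C so it can be tried on the host. It is not the kernel itself. The EPSILON value, the closest_hit name, and the two-field struct are all assumptions, and the distances array stands in for calling triangle_intersection on each triangle. An infinite distance is used as the "no hit" marker, as suggested in the reply.

```c
#include <math.h>
#include <stddef.h>

#define EPSILON 0.0001f /* assumed value; the real kernel defines its own */

/* Assumed slimmed-down struct: no .intersects flag. */
typedef struct {
    size_t primitive; /* only meaningful when distance is finite */
    float distance;   /* INFINITY means "nothing was hit" */
} Intersection;

/* distances[i] stands in for triangle_intersection(triangles[i], vertices, ray). */
Intersection closest_hit(const float *distances, size_t count) {
    Intersection ret = {0, INFINITY};
    for (size_t i = 0; i != count; ++i) {
        /* Keep the lookup on its own line, as in the working version. */
        const float distance = distances[i];
        /* Any valid hit is closer than INFINITY, so no flag is needed. */
        if (EPSILON < distance && distance < ret.distance) {
            ret.primitive = i;
            ret.distance = distance;
        }
    }
    return ret;
}
```

A caller would then test for a miss with isinf(ret.distance) instead of reading a separate flag, which leaves one fewer piece of state for the compiler to get wrong.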
I'm also really excited to see how portable this is. The appeal of OpenCL over CUDA was that my code could run in more places, but given how rubbish the support for my system is, I wouldn't be surprised if the version that works for me fails completely on any other machine...









