Image Generation (Raytrace 5)
Image Generation (Raytrace 5)
Continuing on with my raytracer the greatest current issue to some of the experiments that I want to do is speed. The raytracer is single threaded and checks every ray against every object. For scenes with a lot of spheres this quickly becomes prohibitive.
I chose to implement two initial improvements first to make the scene store more generic as just something that can be traced against and then make two implementations the first just a list of elements (similar to what we have now) and a BSP tree that recursivly divides the space in half and sorts the elements into 3 those that lie on one side, those that lie in the other and those that lie in both. We trace against all the elements in both but can avoid tracing against one of the sides if the ray wont enter it.
This can be improved if we use a branch and bound strategy. If we trace one side and get a result that must be less than anything in the other side we can exit early savign time.
Finally we added multithreading which was reletively painless with rust. Our multithreading is quiet high level runnign a computation per image line. However with the number of image lines much greater than the number of CPUs on my PC this coarse division of the work is good enough and avoids too much overhead.
Initial results
I first just implemented a basic BSP tree and parallism. I Wrote a basic script that would make a test scene with some number of randomly placed spheres and render it as 512×512 resolution.
The initial results showed a good improvement and made things a lot more practical. The odd double linear behaviour we get on the base graph is odd my best guess is that it has something to do with the processors clock speed being ramped up?
Looking at the new sped up graph we get a reasonably nice pattern:
This is pretty linear with a small constant overhead for the image buffer and rendering the ground plane.
Ideally this would be sublinear in the number of entities as the BSP tree should mean we can quickly discard (ideally) half of the elements at each recursion level. However it makes the scenes render in a practical time from 200 seconds with 100 spheres to 4.
Some thoughts on this could be:
- The lack of the branch and bound so we still need to trace a ray fully through the scenekeeping the number of checks linear.
- The BSP tree having a lot of spheres that lie on axis cut boundaries and thus in two sections that result in them not being subdivided well.
- There possibly being a better space partition to use.
I decided to give adding branch and bound a try.
Distance bounding
I coded up a distance bound that we pass down the structure so we can bail early and so we dont need to investigate both sides of a tree if we find something on the first side. The result are less than stellar:
We do show a very slight speed up but not very much and for small iterations the overhead of the new code is more than the speed up.
This might be as for these type of scenes there are very few of the spheres occluding others limiting the improvements that this can bring?
Typical render with 115 spheres:
I am going to keep it in as i think it is valuable in general but I need to look at some actual profiles to see if there is something odd.
As this puts me in a place where i can get some pretty renders in a reasonable time I will probably leave it here for now and wait for the next bottleneck.