Based on Aras's toy path tracer.
The main motivation is to show how easy or hard it is to use CUDA to boost the performance of an existing application, and to learn how to identify performance bottlenecks and fix them. I will be documenting my progress on my dev blog.