20090710 - Video of Particles via L-System and Stochastic Visibility
I talked about this a while back and never did post a video, and I had to post something cool before SIGGRAPH 2009, so below are screen shots and a quick video of an old test combining stochastic visibility with a particle generation effect. The OpenGL2 (NVIDIA-specific) GLSL demo was built a long time ago for an 8600 GTS, and runs way too fast on the GTX275 (even when bottlenecked by triangle setup in point scatter). Unfortunately all I had time to do was add a very slow serial frame capture and MJPEG writer, which when turned on slows the program down too much to make a good video (it is hard to think in slow motion). The conversion from MJPEG to MPEG4 at 60fps 720p isn't ideal either (it is a huge file).
There are a lot of things I never bothered fixing in the demo, such as the boundary between the black background and the fractal (you will see a blurry 16x16 block pattern outline late in the video), or the pops when a bad floating point number gets into the particle buffer. The video only shows the result of tree expansion limited to one level per frame, which leaves fill artifacts at the edges of occluders. Hole filling was also broken. So it isn't really representative of what I'm planning for Atom, but it was a huge milestone in learning GPGPU techniques!
Other Screen Shots
What is it?
I'd qualify it as bad coder art used to test GPGPU tree data structures (everything is computed on the GPU; the CPU just sends in the view position and view direction). The full fractal structure is huge, something like 2^24 pixels wide, I think, before precision problems set in.
The scene tree is an 8-ary tree with one simple L-system rule that generates the 8 children from the parent (position/scale/quaternion). It isn't limited to just one rule, I'm just lazy; it can look up any rule from a texture. The engine maintains the scene tree on the GPU (256K coarse nodes, 2M fine nodes). The projection is a 360-degree fisheye, and the particle effect runs in projected fisheye space based on motion vectors and the parent tree position relative to the child tree position. The visibility is computed in an octahedron space (a different mapping from the view), so the edges of the fisheye projection lose quality fast. Color is instanced via a direct visualization of eye relative world position:
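As a rough illustration of why 2^24 is a natural limit, here is a small Python sketch. It assumes positions are held in 32-bit floats (the post does not say so explicitly); the helper name `to_float32` is made up for this example. A 32-bit float has a 24-bit significand, so whole numbers above 2^24 can no longer all be represented.

```python
import struct

def to_float32(value):
    """Round a Python float to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", value))[0]

limit = 2.0 ** 24  # 16777216

# Below the limit every whole number survives the round trip.
print(to_float32(limit - 1.0) == limit - 1.0)   # True
# Just above it, odd whole numbers collapse onto a neighbour.
print(to_float32(limit + 1.0) == limit)         # True
# The next representable value is a full 2 units away.
print(to_float32(limit + 2.0) - limit)          # 2.0
```

In other words, past that width neighbouring pixel-sized positions start landing on the same value, which fits the precision problems mentioned above.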
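// x, y, z: presumably the components of the eye relative world position.
vec3 rgb = vec3(0.0);
// Each term adds a tinted band pattern along one axis; pow() sharpens the bands.
rgb += vec3(0.7,0.625,0.5) * vec3(pow(abs(sin(sqrt(abs(y/4096.0))*2.0)),16.0));
rgb += vec3(0.3,0.4,0.5) * vec3(pow(abs(sin(sqrt(abs(x/4096.0))*32.0)),4.0));
rgb += vec3(0.5,0.4,0.3) * vec3(pow(abs(sin(z/4096.0*256.0)),4.0));
// Squaring the sum boosts contrast.
return rgb * rgb;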
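The one-rule 8-ary expansion described above (children derived from the parent's position/scale/quaternion) might look something like the following CPU-side Python sketch. This is not the demo's GLSL. The rule contents, the `(w, x, y, z)` quaternion layout, and the names `expand` and `RULE` are all assumptions made for illustration.

```python
import math

def quat_mul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q, i.e. q * v * conjugate(q)."""
    w, x, y, z = q
    p = quat_mul(quat_mul(q, (0.0,) + tuple(v)), (w, -x, -y, -z))
    return p[1:]

def expand(parent, rule):
    """Return the 8 children of parent = (position, scale, rotation)."""
    pos, scale, rot = parent
    children = []
    for offset, child_scale, child_rot in rule:
        # The rule's offset lives in the parent's local frame.
        moved = quat_rotate(rot, offset)
        child_pos = tuple(p + scale * m for p, m in zip(pos, moved))
        children.append((child_pos, scale * child_scale,
                         quat_mul(rot, child_rot)))
    return children

# Hypothetical rule: children sit at the corners of a cube, half the
# parent's size, each twisted 45 degrees about the local z axis.
half_angle = math.radians(45.0) / 2.0
twist = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
RULE = [((sx * 0.5, sy * 0.5, sz * 0.5), 0.5, twist)
        for sx in (-1.0, 1.0) for sy in (-1.0, 1.0) for sz in (-1.0, 1.0)]

root = ((0.0, 0.0, 0.0), 1.0, (1.0, 0.0, 0.0, 0.0))
children = expand(root, RULE)
grandchildren = expand(children[0], RULE)
```

Because each child only needs its parent's state plus a rule entry, swapping in a different rule (for example one fetched from a texture, as mentioned above) would not change the expansion code.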
I'm still fascinated with the problem of solving visibility for fully dynamic geometry, which requires very expensive ray traversals if ray cast or traced. Instead of ray casting (or tracing), I use what I've termed "stochastic visibility" and have blogged about before. The idea is to keep a consistent tree structure of the scene as required by visibility (and shading), then expand and contract the tree structure each frame only enough to service rendering at the quality level required for human frame to frame perception. Stochastic visibility collides a point per scene tree node in a view space mapping to both help compute tree updates and solve for visibility. It is stochastic because the points collided are randomly positioned inside the bounding volume of the node. Collisions help prune out nodes which yield non-visible geometry. Collisions also directly solve the tree node memory allocation problem (there is no memory allocation).
Vacation has provided some new ideas to improve upon my old non-CUDA/OpenCL "stochastic visibility", which I will be trying soon:
(1.) IMPROVED TEMPORAL CONSISTENCY VS HISTORY BUFFER. Collisions would sometimes prune out a part of the required scene tree branch, and keeping persistent nodes was way too costly to do in OpenGL3 with GPGPU methods on an 8600 level graphics card. One trivial solution to this problem would be to run a pass which checks the source data for collisions and does a reduction of the source nodes, keeping the highest priority node which had a collision. If I had a set of history buffers at successively reduced resolutions, I could likely ensure good temporal consistency (solving the random pruning problem).
(2.) ADD PARENT LINKS. My scene tree nodes didn't have links to their parents. This was a problem for animated L-systems and particle fluid dynamics, because I wanted a physical constraint which pulled nodes back to their static position in the parent (as defined by a possibly animated L-system rule). It turns out that the history buffer pass enables me to correct parent links in child nodes, even though parents move to different memory locations each frame. If this works, this GPU data structure truly becomes awesome because it solves the 1M node per frame memory allocation problem with fully dynamic trees, automatic "defragmentation" of memory, automatic regrouping for good data/branch locality and cache performance, and now nodes that maintain parent links even though all memory locations change per frame.
(3.) MULTI-LEVEL TREE UPDATE PER FRAME. With LOD transparent blend-in and a triangle based scene, I've found in previous results that I can add/prune nodes at only one LOD level per frame and still effectively service visibility if I have a conservative amount of overlap. However, when the scene tree goes down to the pixel or near-pixel level, the tree must be able to expand by more than one level per frame to fill visibility gaps in dynamic geometry. I'm planning on a new method where I first re-project the current node set, then do a hierarchical image space reduction to choose the highest priority nodes for a varying-depth, multi-level tree update per frame.
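To make the collision idea concrete, here is a minimal CPU-side sketch in Python. It is only a guess at the general shape, not the demo's code: the "nearest point wins the cell" rule, the bounding cube sampling, the grid size, and the names `octahedral_uv` and `collide` are all assumptions. The octahedral mapping itself is the standard one.

```python
import random

def octahedral_uv(d):
    """Map a nonzero direction to [0, 1]^2 with the standard octahedral mapping."""
    x, y, z = d
    s = abs(x) + abs(y) + abs(z)
    x, y, z = x / s, y / s, z / s
    if z < 0.0:  # fold the lower hemisphere onto the outer triangles
        x, y = ((1.0 - abs(y)) * (1.0 if x >= 0.0 else -1.0),
                (1.0 - abs(x)) * (1.0 if y >= 0.0 else -1.0))
    return 0.5 * x + 0.5, 0.5 * y + 0.5

def collide(nodes, eye, size, rng):
    """Scatter one random point per node into a size*size grid; nearest wins."""
    grid = [None] * (size * size)  # fixed storage, nothing is ever allocated
    for node_id, (center, radius) in enumerate(nodes):
        # Random point inside the node's bounding cube (assumed volume shape).
        p = [c + rng.uniform(-radius, radius) for c in center]
        d = [a - b for a, b in zip(p, eye)]
        dist = sum(c * c for c in d) ** 0.5
        if dist == 0.0:
            continue
        u, v = octahedral_uv(d)
        cell = min(int(v * size), size - 1) * size + min(int(u * size), size - 1)
        if grid[cell] is None or dist < grid[cell][0]:
            grid[cell] = (dist, node_id)  # the loser is pruned implicitly
    return grid

# Nodes 0 and 1 lie along the same view direction; node 2 is off to the side.
nodes = [((0.0, 0.0, 10.0), 0.0), ((0.0, 0.0, 5.0), 0.0), ((5.0, 0.0, 0.0), 0.0)]
grid = collide(nodes, (0.0, 0.0, 0.0), 16, random.Random(1))
survivors = {entry[1] for entry in grid if entry is not None}
print(sorted(survivors))  # [1, 2]
```

The grid doubles as the node storage for the next frame, which is one way to read "there is no memory allocation": a node either wins a cell or disappears.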
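One possible reading of the "reducing resolution history buffers" in (1.), sketched in Python: each level is a 2x2 reduction of the one below, keeping the highest priority node that had a collision. The `(priority, node_id)` entry layout, the power-of-two grid size, and the function names are assumptions for illustration, not the planned implementation.

```python
def reduce_level(grid, size):
    """Halve a size*size grid, keeping the highest priority entry of each 2x2 block."""
    half = size // 2
    out = [None] * (half * half)
    for y in range(half):
        for x in range(half):
            block = [grid[(2 * y + dy) * size + (2 * x + dx)]
                     for dy in (0, 1) for dx in (0, 1)]
            live = [entry for entry in block if entry is not None]
            # Entries are (priority, node_id); max() compares priority first.
            out[y * half + x] = max(live) if live else None
    return out

def build_history(grid, size):
    """Return [full, half, quarter, ...] down to a 1x1 buffer (size is a power of two)."""
    levels = [grid]
    while size > 1:
        grid = reduce_level(grid, size)
        size //= 2
        levels.append(grid)
    return levels

# 4x4 example: three nodes collided this frame, the rest of the grid is empty.
full = [None] * 16
full[0] = (0.2, 7)    # (priority, node_id)
full[5] = (0.9, 3)
full[15] = (0.4, 11)
history = build_history(full, 4)
print(history[1])  # [(0.9, 3), None, None, (0.4, 11)]
print(history[2])  # [(0.9, 3)]
```

A branch pruned by an unlucky collision at full resolution could then be looked up again from a coarser level on the next frame.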
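For (2.), one way parent links could survive nodes moving every frame is a forwarding table from old slot to new slot, which is roughly what a history buffer pass could provide. The Python sketch below is an assumption about the mechanism, with made-up field names (`alive`, `parent`, `name`) and a made-up function name `compact`.

```python
def compact(nodes):
    """Move surviving nodes to new slots and fix parent links via a forwarding table.

    nodes: list of dicts with "alive" (bool), "parent" (old slot or None), "name".
    """
    forward = {}  # old slot -> new slot
    for old, node in enumerate(nodes):
        if node["alive"]:
            forward[old] = len(forward)
    moved = [None] * len(forward)
    for old, new in forward.items():
        node = nodes[old]
        # A pruned parent has no forwarding entry, so the link becomes None.
        moved[new] = {"alive": True, "name": node["name"],
                      "parent": forward.get(node["parent"])}
    return moved

frame = [
    {"alive": True,  "parent": None, "name": "root"},
    {"alive": False, "parent": 0,    "name": "pruned"},
    {"alive": True,  "parent": 0,    "name": "child"},
    {"alive": True,  "parent": 2,    "name": "grandchild"},
    {"alive": True,  "parent": 1,    "name": "orphan"},
]
next_frame = compact(frame)
print([(n["name"], n["parent"]) for n in next_frame])
# [('root', None), ('child', 0), ('grandchild', 1), ('orphan', None)]
```

The surviving nodes end up packed together (the "defragmentation" mentioned above), and every link still points at the right parent even though every slot index changed.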