20130206 - Virtual Texture Space Shading + Multi-Bounce IBL + More

Virtual texture space shading could provide an interesting solution to a few core problems.

For anti-aliasing, shading at fixed object-space texels provides more temporally stable shading on surfaces. Decoupling shading rate from resolution and frame rate should enable low cost super-sampling (rotated grid super-sampled anti-aliasing with a good resolve filter is a requirement for high quality anti-aliasing). On top of this, the ability to have easy lit transparency enables an engine to smoothly fade out objects as they get too small, which could be leveraged to both reduce geometric aliasing and provide an easy way to handle level-of-detail for massive view distances.

Shading in virtual texture space could also enable a fast way to do approximate multi-bounce image-based-lighting. Instead of baking reflection maps and environment maps, the content pipeline for each environment map (of which there might be multiple levels) would bake out a low triangle count mesh which is single textured using full precision virtual texture coordinates and a virtual texture coordinate displacement map. This enables the engine at runtime to redraw any environment map, and even include dynamic objects, fetching from the texture space shading results. This sets up a feedback loop which enables multi-bounce image-based-lighting. The environment map mipmap chain would be re-generated at run-time. This fast approximation would have some interesting side effects, for instance bounces would be temporally delayed, and the surface shading for IBL reflection would be based on the wrong direction (texture space shading is always shaded from the perspective of the player's view). Could attempt to build a second lower resolution texture space shading result cache which has directional shading results.
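The feedback loop is easier to see in toy form. The sketch below is not an engine design, just scalar radiance on a couple of patches: each frame shades every patch from an environment value built out of the previous frame's shading cache, so every frame adds one more bounce. The names (shade_frame, emission, albedo) and the use of a single averaged value as a stand-in for the re-rendered environment map are assumptions made purely for illustration.

```python
def shade_frame(cache, emission, albedo):
    """Shade all patches once, lighting them from last frame's cache."""
    # Stand-in for redrawing the environment map from the shading cache:
    # here the "environment map" is just the average cached radiance.
    env = sum(cache) / len(cache)
    return [e + a * env for e, a in zip(emission, albedo)]

emission = [1.0, 0.0]   # patch 0 is a light, patch 1 is not
albedo = [0.5, 0.5]

cache = [0.0, 0.0]      # empty shading cache before the first frame
history = []
for frame in range(3):
    cache = shade_frame(cache, emission, albedo)
    history.append(cache)
```

The first entry in history has only direct light, the second adds one bounce, the third adds another, which is the temporal delay on bounces described above.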

Shading into a virtual texture shading cache could be done for all objects in the page by rendering a special mesh which sets output position to the position in the page, and also provides view or world position for shading. This would result in the same thread occupancy problem as screen space shading with small triangles. So maybe writing out compressed view position using this method, then using compute based shading into the virtual texture shading cache at some tile granularity, would be a better idea. At the point of tile granularity shading, could also have a byte per tile which marks if any texels in the tile are visible. During the view position rendering pass, could also use surface stores to update this byte. Note this store would get coalesced well (low cost). Tiles with byte=0 could update at a lower rate (note IBL reflection can easily use non-view visible tiles). Texture space tile based shading has the advantage that each tile has a smaller bounding box than screen space tiling, which might include foreground and background elements.
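A CPU-side sketch of the byte-per-tile idea, under stated assumptions: the tile size, the hidden_interval parameter, and the staggering of hidden tiles by tile index are all hypothetical choices, and visible_texels stands in for whatever the view position pass would mark with surface stores on the GPU.

```python
TILE = 8  # texels per tile side (assumed value)

def mark_visible_tiles(visible_texels, page_size, tile=TILE):
    """Build one byte per tile: 1 if any texel in the tile is view visible."""
    tiles_per_side = page_size // tile
    flags = bytearray(tiles_per_side * tiles_per_side)
    for u, v in visible_texels:
        flags[(v // tile) * tiles_per_side + (u // tile)] = 1
    return flags

def tiles_to_shade(flags, frame, hidden_interval=4):
    """Visible tiles shade every frame. Tiles with byte=0 shade once every
    hidden_interval frames, staggered by tile index to spread the cost."""
    return [i for i, f in enumerate(flags)
            if f or (i + frame) % hidden_interval == 0]

# 16x16 texel page split into 2x2 tiles, two visible texels.
flags = mark_visible_tiles([(0, 0), (9, 1)], page_size=16)
frame0 = tiles_to_shade(flags, frame=0)
frame1 = tiles_to_shade(flags, frame=1)
```

Hidden tiles still get refreshed eventually, which matters because the IBL redraw can fetch from tiles the view never sees.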

If the virtual texture is around three times the screen resolution, with no optimizations texture space shading should cost less than 4xSGSSAA (shading at four times screen resolution). If texture space shading is updated at 30 Hz and rendering at 60 Hz, the cost of texture space shading (3x at half the display rate, or 1.5x per displayed frame) might be more in line with 4xMSAA deferred shading at 60 Hz. Add in optimizations to reduce the update rate of shading tiles which are not view visible, or tiles which don't have strong view-dependent shading, and texture space shading might easily outperform other methods, resulting in a shading rate which is less than 1 per display pixel.
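The arithmetic behind those numbers, as a quick sketch. This assumes "three times the screen resolution" means three shaded texels per display pixel. The visible fraction and hidden tile rate in the last step are made-up illustrative values, not measurements.

```python
def relative_shading_cost(samples_per_pixel, shading_hz, display_hz):
    """Shaded samples per display pixel per displayed frame."""
    return samples_per_pixel * shading_hz / display_hz

def with_tile_rates(cost, visible_fraction, hidden_rate_scale):
    """Scale cost when non-visible tiles update at a reduced rate."""
    return cost * (visible_fraction + (1.0 - visible_fraction) * hidden_rate_scale)

sgssaa_4x = relative_shading_cost(4.0, 60, 60)         # 4.0
texture_space_60 = relative_shading_cost(3.0, 60, 60)  # 3.0
texture_space_30 = relative_shading_cost(3.0, 30, 60)  # 1.5

# Hypothetical: half the tiles are view visible, and the rest
# update at a quarter of the rate.
optimized = with_tile_rates(texture_space_30, 0.5, 0.25)  # 0.9375
```

With those assumed values the shading rate drops below 1 per display pixel, which is the regime the paragraph above is pointing at.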

This style of engine could also help solve the post-processing with MSAA problem. The problem is easy to understand. The majority of the time MSAA resolve happens before post processing, merging background and foreground samples on an edge. Post processing typically uses only resolved data, for example using max depth per pixel, and the result is that the post processing re-introduces aliasing where it is applied. If post processing only includes motion blur, the effect might be limited to just edges which include both motion blurred and non motion blurred samples. However with titles which include strong screen space ambient occlusion (more so if SSAO is normal aware), just about any surface might get darkened, resulting in no-AA on most of the edges on the screen. Another example is titles using soft particles, but then using the resolved depth to compute the soft blend weight, leaving outlines everywhere. The net result of this is that all the high quality AA solutions, like MSAA/TXAA/SGSSAA, simply cannot work unless post processing is fixed.

The fix for soft particle blending is to manually super-sample, fetching MSAA-depth for all the samples/pixel, then computing blend weight based on all samples. Clearly leverage stencil here to only do the expensive route on complex pixels. The fix for motion blur depends strongly on the engine, but assuming the engine computes motion blur first using a down-sampled frame, composite of the motion blur to the resolved full size frame can be done with a blend weight which is computed via manually super-sampling in a pixel shader. Both of these do an inline box filter which is not the best for quality, but they provide a huge benefit compared to doing nothing.

The alternative is to skip MSAA, super-sample but render on an angle, apply post processing at super-sampled resolution (reducing the taps/pixel and sampling pattern of the filter for things like motion blur), then use a really good resampling filter which down-samples and rotates. Rendering on a rotated angle massively increases frame-buffer size for 16:9 aspect, making it very unattractive for deferred rendering, but perhaps practical for a virtual texture space shader which is doing simple forward shading. Since there is no rotated viewport support on GPUs, use a pre-pass of 4 triangles to write depth=nearPlane to box out the rotated viewport.

Returning to the larger problem for post processing with MSAA, SSAO: with virtual texture space shading, perhaps a better way is to not do screen-space AO, but instead do texture-space AO, fetching from both the rendered depth buffer and some of the environment map depth surfaces.
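Going back to the soft particle fix, here is a minimal CPU sketch of why the per-sample route matters. The linear fade in soft_weight is the usual soft particle formulation rather than anything specific to one engine, and the function names, depths, and fade distance are assumptions chosen to make an edge pixel obvious.

```python
def soft_weight(scene_depth, particle_depth, fade_distance):
    """0 where the particle is at or behind the scene surface,
    ramping to 1 once it is fade_distance or more in front."""
    t = (scene_depth - particle_depth) / fade_distance
    return min(max(t, 0.0), 1.0)

def soft_weight_msaa(sample_depths, particle_depth, fade_distance):
    """Manual super-sample: evaluate the weight per MSAA depth sample,
    then box filter the weights (not the depths)."""
    weights = [soft_weight(d, particle_depth, fade_distance)
               for d in sample_depths]
    return sum(weights) / len(weights)

# Edge pixel: two background samples (depth 10), two foreground (depth 2).
samples = [10.0, 10.0, 2.0, 2.0]
particle_depth = 5.0

resolved = soft_weight(max(samples), particle_depth, 1.0)    # uses max depth
per_sample = soft_weight_msaa(samples, particle_depth, 1.0)  # uses all samples
```

Using the resolved max depth draws the particle at full weight over the foreground half of the pixel, which is the outline. The per-sample version lands at half weight, matching the coverage of the edge.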

Side Note on Point Rendering AA
This is from another side project playing around with 1080p@60Hz non-triangle based rendering.

Something which is going to be very obvious with 3D VR, with the combination of sub-pixel head movement and pixel size with huge FOV, is that getting AA and filtering done correctly will be the difference between a good and bad experience. Likewise for 2D, as shading improves, the uncanny valley gets deeper, if AA quality remains the same. The core requirement for good AA is multiple samples/pixel with a good sample distribution, and then a reconstruction filter which is aware of sample position. It is the correct pixel gradient change due to sub-pixel motion which makes the difference between your eye seeing a realistic image moving behind a screen, vs just a grid of pixels.

One can apply this theory directly to traditional point rendering. Instead of rendering just pixels to a surface, render the pixel and then the point's sub-pixel offset. Then later use a reconstruction filter based on sub-pixel position to provide a higher quality image.
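A small CPU sketch of that idea, with assumptions called out: one nearest point kept per pixel, a 3x3 neighborhood, and a tent filter with a one pixel radius are all illustrative choices, not the filter actually used in the renderer described here. Colors are single floats to keep it short.

```python
import math

def splat(points, width, height):
    """points: (x, y, depth, color) in pixel units. Keep the nearest
    point per pixel along with its sub-pixel offset."""
    buf = {}
    for x, y, depth, color in points:
        px, py = math.floor(x), math.floor(y)
        if 0 <= px < width and 0 <= py < height:
            if (px, py) not in buf or depth < buf[(px, py)][0]:
                buf[(px, py)] = (depth, color, x - px, y - py)
    return buf

def reconstruct(buf, width, height, radius=1.0):
    """Resolve each pixel from the 3x3 neighborhood, weighting each stored
    point by its true distance from the pixel center (tent filter)."""
    out = [[0.0] * width for _ in range(height)]
    for py in range(height):
        for px in range(width):
            cx, cy = px + 0.5, py + 0.5
            total = wsum = 0.0
            for ny in range(py - 1, py + 2):
                for nx in range(px - 1, px + 2):
                    s = buf.get((nx, ny))
                    if s is None:
                        continue
                    _, color, ox, oy = s
                    d = math.hypot(nx + ox - cx, ny + oy - cy)
                    w = max(0.0, 1.0 - d / radius)
                    total += w * color
                    wsum += w
            out[py][px] = total / wsum if wsum > 0.0 else 0.0
    return out

# A dark point and a bright point in adjacent pixels.
still = reconstruct(splat([(0.5, 0.5, 1.0, 0.0), (1.5, 0.5, 1.0, 1.0)], 2, 1), 2, 1)
# The bright point slides left inside its own pixel.
moved = reconstruct(splat([(0.5, 0.5, 1.0, 0.0), (1.1, 0.5, 1.0, 1.0)], 2, 1), 2, 1)
```

The bright point never leaves its pixel, yet the neighboring pixel brightens as it approaches. That gradient change from sub-pixel motion is exactly what is lost when only the pixel is stored.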

With HDR, the required number of samples/pixel increases because more accurate coverage is needed for small bright features, or dark thin features surrounded by a very bright background. With my 560ti the GPU can easily render enough points at 1080p, and I've been mostly able to solve the GPU managed scene structure problem even with 4-8M nodes, but ultimately I've not yet found a good solution to the problem of temporal aliasing with HDR points. The next thing on my list to try is to keep track of some estimation of pixel coverage based on temporal feedback (fraction of the number of frames the point is visible), then make the reconstruction filter weight contribution also by estimated coverage. As the tree expands and contracts based on visibility, new leaf nodes (or points) would start with a near zero coverage estimate to ensure no flicker on new "geometry".
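Since this is untested, treat the following as a sketch of one way it might look. An exponential moving average is swapped in for the literal "fraction of frames visible" count (it needs one float per point instead of a history), and the rate value, function names, and sample layout are all assumptions.

```python
def update_coverage(coverage, visible, rate=0.125):
    """Move the coverage estimate toward 1 when the point was visible
    this frame and toward 0 when it was not."""
    target = 1.0 if visible else 0.0
    return coverage + rate * (target - coverage)

def weighted_resolve(samples):
    """samples: (filter_weight, coverage, color) per contributing point.
    The reconstruction weight is scaled by the estimated coverage."""
    wsum = sum(w * c for w, c, _ in samples)
    if wsum == 0.0:
        return 0.0
    return sum(w * c * color for w, c, color in samples) / wsum

# A new leaf starts at zero coverage and ramps up while it stays visible.
c0 = 0.0
c1 = update_coverage(c0, True)   # 0.125
c2 = update_coverage(c1, True)   # 0.234375

# A brand new very bright HDR point next to an established dark one.
first_frame = weighted_resolve([(1.0, 1.0, 0.0), (1.0, c0, 100.0)])
```

On its first frame the bright point contributes nothing, then fades in as its coverage estimate grows, rather than popping in at full HDR intensity.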

A few images with 360 fisheye projection, DOF, a simple diffusion filter, and a film grain effect. The first image I posted before, the others I never posted. They all use the same point based rendering with good sub-pixel reconstruction.

Lost Images when Minus Went Down