20120802 - DX Camera Motion from Depth


Since I made the mistake of trying camera motion to depth shader code from a published source which was wrong, and then found lots of people on the interwebs posting stuff which is also wrong, I decided it would be a good idea to post a proper reference. Like any proper reference, I'd highly suggest you not trust me either (even though I have this validated and working).


Matrix Convention In This Post
Given a matrix,

mxx mxy mxz mxw
myx myy myz myw
mzx mzy mzz mzw
mwx mwy mwz mww


A matrix vector multiply is defined as,

x' = x*mxx + y*mxy + z*mxz + w*mxw
y' = x*myx + y*myy + z*myz + w*myw
z' = x*mzx + y*mzy + z*mzz + w*mzw
w' = x*mwx + y*mwy + z*mwz + w*mww



Infinite Projection in DX
This doc also assumes one is using an infinite projection matrix defined as follows,

? 0 0____ 0_____
0 ? 0____ 0_____
0 0 (1-k) (k-1)n
0 0 1____ 0_____


Where k insures geometry never reaches the far plane (I've been using e=2.4e-7), and n is the near plane.


The Math
Note I'm using INT24 depth buffer here instead of inverted FP32 because INT24 still has better performance on some GPUs. While this is showing the process, one wouldn't want to do this in the shader as seen here because precision would be crap when view translation gets large (more on this later). Inputs,

xy := texture position on screen: {0,0}=topLeft, {1,1}=bottomRight
d := depth fetched from INT24 depth buffer
n := constant near plane
k := constant insuring geometry does not hit far plane


First convert from texture XY to view XY. Note the DX convention is 1 at the top of the screen and -1 at the bottom (y is flipped compared to texture).

x' = -1 + 2*x
y' = 1 - 2*y


Next reverse the 1/w multiply done by fixed function graphics pipeline. Showing the case for the infinite projection. Remember this projection does {z''=(1-k)*z'''+(k-1)*n, w''=z'''} when transforming from world to clip space before the 1/w'' multiply step. Simply solve for w'' in d=((1-k)*w''+(k-1)*n)/w''.

xyzw'' = {x',y',d,1} * ((-n+k*n)/(-1+d+k))

Then grab world coordinates via a matrix multiply times the inverse of the view projection matrix,

xyzw''' = viewProjectionInverseMatrix . xyzw''

Then back-project to the prior frame texture coordinates,

xyzw'''' = priorViewProjectionMatrix . xyzw'''
xy''''' = xy'''' * (1.0/w'''')
xy'''''' = {x''''' * 0.5 + 0.5, y''''' * (-0.5) + 0.5}


Then diff to generate the motion vector,

mv = xy'''''' - xy


Public Domain
The example code below is public domain as it is just a simple equation in HLSL and C form which was generated by Mathematica. If this isn't enough, this generic code is covered in the unlicense,

This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

In jurisdictions that recognize copyright laws, the author or authors of this software dedicate any and all copyright interest in the software to the public domain. We make this dedication for the benefit of the public at large and to the detriment of our heirs and successors. We intend this dedication to be an overt act of relinquishment in perpetuity of all present and future rights to this software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Making this Fast and Precise
Next grab your favorite symbolic solver like Mathematica and simply this. This step is required for good precision. The k and n factors drop out and what is left is a tiny bit of ALU work and 18 float constants. Here is the form of the HLSL shader,

float2 CameraMotionVector(
float3 xyd, // {x,y} := {0,0}=topLeft, {1,1}=bottomRight screen, d = fetched depth
// constants generated by CameraMotionConstants() function
float4 c0, float4 c1, float4 c2, float4 c3, float4 c4) {

float2 mv;
float rcpScale = 1.0/(dot(xyd, c0.xyz) + c0.w);
mv.x = ((xyd.x * ((c1.x * xyd.y) + (c1.y * xyd.z) + c1.z)) + (c1.w * xyd.y) + (c2.x * xyd.x * xyd.x) + (c2.y * xyd.z) + c2.z) * rcpScale;
mv.y = ((xyd.y * ((c3.x * xyd.x) + (c3.y * xyd.z) + c3.z)) + (c3.w * xyd.x) + (c4.x * xyd.y * xyd.y) + (c4.y * xyd.z) + c4.z) * rcpScale;
return mv; }


And the code which can generate the constants,

static void CameraMotionConstants(
float* restrict const cb, // constant buffer output
const double* restrict const c, // current view projection matrix inverse
const float* restrict const p // prior view projection matrix
) {

const double cxx = c[ 0]; const double cxy = c[ 1]; const double cxz = c[ 2]; const double cxw = c[ 3];
const double cyx = c[ 4]; const double cyy = c[ 5]; const double cyz = c[ 6]; const double cyw = c[ 7];
const double czx = c[ 8]; const double czy = c[ 9]; const double czz = c[10]; const double czw = c[11];
const double cwx = c[12]; const double cwy = c[13]; const double cwz = c[14]; const double cww = c[15];

const double pxx = (double)(p[ 0]); const double pxy = (double)(p[ 1]); const double pxz = (double)(p[ 2]); const double pxw = (double)(p[ 3]);
const double pyx = (double)(p[ 4]); const double pyy = (double)(p[ 5]); const double pyz = (double)(p[ 6]); const double pyw = (double)(p[ 7]);
const double pwx = (double)(p[12]); const double pwy = (double)(p[13]); const double pwz = (double)(p[14]); const double pww = (double)(p[15]);

// c0
cb[0] = (float)(4.0*(cwx*pww + cxx*pwx + cyx*pwy + czx*pwz));
cb[1] = (float)((-4.0)*(cwy*pww + cxy*pwx + cyy*pwy + czy*pwz));
cb[2] = (float)(2.0*(cwz*pww + cxz*pwx + cyz*pwy + czz*pwz));
cb[3] = (float)(2.0*(cww*pww - cwx*pww + cwy*pww + (cxw - cxx + cxy)*pwx + (cyw - cyx + cyy)*pwy + (czw - czx + czy)*pwz));

// c1
cb[4] = (float)(( 4.0)*(cwy*pww + cxy*pwx + cyy*pwy + czy*pwz));
cb[5] = (float)((-2.0)*(cwz*pww + cxz*pwx + cyz*pwy + czz*pwz));
cb[6] = (float)((-2.0)*(cww*pww + cwy*pww + cxw*pwx - 2.0*cxx*pwx + cxy*pwx + cyw*pwy - 2.0*cyx*pwy + cyy*pwy + czw*pwz - 2.0*czx*pwz + czy*pwz - cwx*(2.0*pww + pxw) - cxx*pxx - cyx*pxy - czx*pxz));
cb[7] = (float)(-2.0*(cyy*pwy + czy*pwz + cwy*(pww + pxw) + cxy*(pwx + pxx) + cyy*pxy + czy*pxz));

// c2
cb[ 8] = (float)((-4.0)*(cwx*pww + cxx*pwx + cyx*pwy + czx*pwz));
cb[ 9] = (float)(cyz*pwy + czz*pwz + cwz*(pww + pxw) + cxz*(pwx + pxx) + cyz*pxy + czz*pxz);
cb[10] = (float)(cwy*pww + cwy*pxw + cww*(pww + pxw) - cwx*(pww + pxw) + (cxw - cxx + cxy)*(pwx + pxx) + (cyw - cyx + cyy)*(pwy + pxy) + (czw - czx + czy)*(pwz + pxz));
cb[11] = (float)(0);

// c3
cb[12] = (float)((-4.0)*(cwx*pww + cxx*pwx + cyx*pwy + czx*pwz));
cb[13] = (float)((-2.0)*(cwz*pww + cxz*pwx + cyz*pwy + czz*pwz));
cb[14] = (float)(2.0*((-cww)*pww + cwx*pww - 2.0*cwy*pww - cxw*pwx + cxx*pwx - 2.0*cxy*pwx - cyw*pwy + cyx*pwy - 2.0*cyy*pwy - czw*pwz + czx*pwz - 2.0*czy*pwz + cwy*pyw + cxy*pyx + cyy*pyy + czy*pyz));
cb[15] = (float)(2.0*(cyx*pwy + czx*pwz + cwx*(pww - pyw) + cxx*(pwx - pyx) - cyx*pyy - czx*pyz));

// c4
cb[16] = (float)(4.0*(cwy*pww + cxy*pwx + cyy*pwy + czy*pwz));
cb[17] = (float)(cyz*pwy + czz*pwz + cwz*(pww - pyw) + cxz*(pwx - pyx) - cyz*pyy - czz*pyz);
cb[18] = (float)(cwy*pww + cww*(pww - pyw) - cwy*pyw + cwx*((-pww) + pyw) + (cxw - cxx + cxy)*(pwx - pyx) + (cyw - cyx + cyy)*(pwy - pyy) + (czw - czx + czy)*(pwz - pyz));
cb[19] = (float)(0); }