20140715 - Infinte Projection Matrix Notes


If you are reading this and have to deal with GL/GLES vendors not supporting DX style [0 to 1] clip space, please talk to your hardware/OS vendors and ask for it!!

Clip Coordinates (CC)
Output of the vertex shader,
GL: gl_Position
DX: SV_Position



Normalized Device Coordinates (NDC)
The following transform is done after the vertex shader,

NDC = float4(CC.xyz * rcp(CC.w), 1.0);

On both GL and DX NDC.xy are [-1 to 1] ranged. On DX NDC.z is [0 to 1] ranged. On GL NDC.z is [-1 to 1] ranged and this can cause precision problems (see below in Window Coordinate transform). Anything outside the range is clipped by hardware unless in DX11 DepthClipEnable=FALSE, or in GL glEnable(GL_DEPTH_CLAMP) is used to clamp clipped geometry to the near and far plane.


Window Coordinates (WC) : DX11
The following transform is done in hardware,

float3 WC = float3(
NDC.x * (W* 0.5 ) + (X + W*0.5),
NDC.y * (H*(-0.5)) + (Y + H*0.5),
NDC.z * (F-N) + N);


With parameters specified by RSSetViewports(),

X = D3D11_VIEWPORT.TopLeftX;
Y = D3D11_VIEWPORT.TopLeftY;
W = D3D11_VIEWPORT.Width;
H = D3D11_VIEWPORT.Height;
N = D3D11_VIEWPORT.MinDepth;
F = D3D11_VIEWPORT.MaxDepth;


Fractional viewport parameters for X and Y are supported on DX11. DX10 does not support fractional viewports, and DX11 feature level 9 implicitly casts to DWORD internally insuring fractional viewport will not work. Not sure if DX11 feature level 10 supports fractional viewports or not. Both N and F are required to be in the [0 to 1] range. It is better to not use the viewport transform to modify depth and instead fold any transform into the application projection shader code. For best precision, N should be zero.


Window Coordinates (WC) : GL
The following transform is done in hardware,

float3 WC = float3(
NDC.x * (W*0.5) + (X + W*0.5),
NDC.y * (H*0.5) + (Y + H*0.5),
NDC.z * ((F-N)*0.5) + (N+F)*0.5);



Window Coordinates (WC) : OpenGL 4.2 and OpenGL ES 2.0
These versions of GL have the following form to specify input parameters,

glDepthRangef(GLclampf N, GLclampf F); // ES and some versions of GL
glDepthRange(GLclampd N, GLclampd F); // GL
glViewport(GLint X, GLint Y, GLsizei W, GLsizei H);

Note the inputs to glDepthRange*() are clamped to [0 to 1] range. This insures NDC.z is biased by a precision destroying floating point addition. The default N=0 and N=1 results in,

WC.z = NDC.z * 0.5 + 0.5;

Which when computed with standard 32-bit floating point, I believe has in theory exactly only enough precision for 24-bit integer depth buffers.


Window Coordinates (WC) : OpenGL 4.3 and OpenGL ES 3.0
These versions of OpenGL dropped the clamp type resulting in,

glDepthRangef(GLfloat N, GLfloat F);

Both specs say, "If a ?xed-point representation is used, the parameters n and f are clamped to the range [0;1] when computing zw". So in theory for floating point depth buffers the following can be specified,

glDepthRangef(-1.0f, 1.0f);

Which results in no precision destroying floating point addition in the Window Coordinate transform,

WC.z = NDC.z * 1.0 + 0.0;

However at least some vendor(s) still do the clamp anyway. To get around the clamp in GL one can use GL_NV_depth_buffer_float which provides glDepthRangedNV() which is supported by both AMD and NVIDIA.


Projection Matrix
General form,

X 0 0 0
0 Y 0 0
0 0 A 1
0 0 B 0


The A=0 case provides the highest precision. Next highest precision from A=1 or A=-1, then ideally choose A to have an exact representation in floating point. In the low precision cases, it is better for precision to break the model view projection matrix into two parts for an increase in just one scalar multiply accumulate operation overall,

// Constants
float4 ConstX = ModelViewMatrixX * X;
float4 ConstY = ModelViewMatrixY * Y;
float4 ConstZ = ModelViewMatrixZ;
float2 ConstAB = float2(A, B);

// Vertex shader work
float3 View = float3(
dot(Vertex, ConstX) + ConstX.w,
dot(Vertex, ConstY) + ConstY.w,
dot(Vertex, ConstZ) + ConstZ.w);

float3 Projected = float3(
View.x,
View.y,
View.z * ConstA + ConstB,
View.z);



Projection Matrix : Infinite Reversed (1=near, 0=far)
This is both the fastest and highest precision path. For DX, or GL using glDepthRangedNV(-1.0, 1.0),

X 0 0 0
0 Y 0 0
0 0 0 1
0 0 N 0


This can be optimized to the following,

// Constants
float4 ConstX = ModelViewMatrixX * X;
float4 ConstY = ModelViewMatrixY * Y;
float4 ConstZ = ModelViewMatrixZ;
float ConstN = N; // Ideally N=1 and no constant is needed.

// Vertex shader work
float4 Projected = float4(
dot(Vertex, ConstX) + ConstX.w,
dot(Vertex, ConstY) + ConstY.w,
ConstN,
dot(Vertex, ConstZ) + ConstZ.w);


For GL without glDepthRangedNV(-1.0, 1.0),

X 0 0__ 0
0 Y 0__ 0
0 0 -1_ 1
0 0 2*N 0


Which can be optimized to the following,

// Constants
vec4 ConstX = ModelViewMatrixX * X;
vec4 ConstY = ModelViewMatrixY * Y;
vec4 ConstZ = ModelViewMatrixZ;
float ConstN = 2.0 * N; // Ideally N=1 and no constant is needed.

// Vertex shader work
vec4 Projected;
Projected.w = dot(Vertex, ConstZ);
Projected.xyz = float3(
dot(Vertex, ConstX) + ConstX.w,
dot(Vertex, ConstY) + ConstY.w,
ConstN - Projected.w);



References
http://www.humus.name/Articles/Persson_CreatingVastGameWorlds.pdf
http://www.geometry.caltech.edu/pubs/UD12.pdf
http://outerra.blogspot.com/2012/11/maximizing-depth-buffer-range-and.html