by Spike » Wed Dec 14, 2011 4:20 am
Aye, if it's actually testing the stencil buffer then you've already lost your early-z optimisation - it's already rasterizing per pixel.
However, generating stencil shadow volumes doesn't test the stencil buffer, it only updates it, so for these passes you should be able to keep any early-z optimisations on the condition that the first and middle arguments of glStencilOp are both GL_KEEP. z-pass meets that condition, as its inc/dec goes in the 3rd argument; z-fail doesn't, as it needs the inc and dec in the middle argument for the back and front faces.
It's only the actual rendering which tests the stencil buffer, but in this case the depth test is also required to pass, so there's no reason for the stencil to be tested if the depth will fail anyway, and early-z should apply for these passes without a problem.
For skies and unlit things, even if doom3 leaves stencil testing enabled, I would expect that it resets glStencilOp to all GL_KEEPs and glStencilFunc to GL_ALWAYS. Combined, these basically leave the stencil buffer untouched regardless of glEnables, so early-z testing should still apply.
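To make the condition above concrete, here's a rough sketch in plain C of the rule I'm describing. It's only a model of the reasoning, not how any driver actually decides: StencilState, early_z_safe and the OP_/FUNC_ enums are made-up names standing in for the GL state, so it builds without GL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GL enums, so no GL headers are needed. */
typedef enum { OP_KEEP, OP_INCR_WRAP, OP_DECR_WRAP } StencilOpKind;
typedef enum { FUNC_ALWAYS, FUNC_EQUAL, FUNC_NOTEQUAL } StencilFuncKind;

typedef struct {
    bool stencil_enabled;              /* glEnable(GL_STENCIL_TEST) */
    StencilFuncKind func;              /* glStencilFunc comparison */
    StencilOpKind sfail, zfail, zpass; /* same order as glStencilOp's arguments */
} StencilState;

/* True if throwing away depth-failed fragments before the stencil stage
   cannot change what ends up in the stencil buffer. A depth-failed fragment
   can only ever trigger the sfail or zfail op, so if both are KEEP it has
   no visible effect, whatever the stencil func is. */
static bool early_z_safe(const StencilState *s)
{
    assert(s != NULL);
    if (!s->stencil_enabled)
        return true;
    return s->sfail == OP_KEEP && s->zfail == OP_KEEP;
}
```

Under this model the sky case (GL_ALWAYS with three GL_KEEPs), z-pass volume passes, and the lit rendering pass (stencil tested, all ops GL_KEEP) all come out safe, while z-fail volume passes do not.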
Of course, hardware and drivers may have limitations that could give worse characteristics, but if they don't, I disagree with mh.
Z-pass volume generation, where the stencil is only updated when the depth test also passes, should fully benefit from any early-z tests without needing to visit individual pixels, while generation of z-fail volumes, where the stencil is updated when the depth test fails, will need to iterate over the pixels themselves.
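Here's a toy single-pixel simulation of that difference, again in plain C. Everything in it (VolumeFragment, VolumeMethod, stencil_after_volume) is invented for illustration, and it assumes a GL_LESS depth test with the usual convention that z-pass counts front faces up and back faces down on depth pass, while z-fail counts back faces up and front faces down on depth fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One shadow-volume fragment landing on the pixel (hypothetical type). */
typedef struct { float depth; bool front_facing; } VolumeFragment;
typedef enum { ZPASS, ZFAIL } VolumeMethod;

/* Returns the stencil count at one pixel after drawing the volume fragments.
   If early_z_reject is true, fragments that fail the depth test are thrown
   away before any stencil update can happen, as an early-z unit would do. */
static int stencil_after_volume(VolumeMethod method, float scene_depth,
                                const VolumeFragment *frags, int count,
                                bool early_z_reject)
{
    int stencil = 0;
    assert(frags != NULL || count == 0);
    for (int i = 0; i < count; i++) {
        bool depth_pass = frags[i].depth < scene_depth; /* GL_LESS */
        if (!depth_pass && early_z_reject)
            continue; /* fragment never reaches the stencil stage */
        if (method == ZPASS && depth_pass)
            stencil += frags[i].front_facing ? 1 : -1;
        else if (method == ZFAIL && !depth_pass)
            stencil += frags[i].front_facing ? -1 : 1;
    }
    return stencil;
}
```

With a scene surface sitting between a volume's front and back face, z-pass gives the same count whether or not the depth-failed fragments are rejected early, while z-fail loses its only update if they are, which is why z-fail passes can't be culled that way.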
Of course, this is pretty much a moot point as z-pass is faster anyway due to no front/back cap rendering. You'd only want to use z-fail where z-pass is inadequate anyway - any decent engine would want to use z-pass where it can. Any arguing would be over the amount of speed you can get by doing so (and whether it's enough for it to be worth detecting if you can get away with z-pass).
Is the way I see it, anyway.