by mh » Tue Mar 29, 2011 1:52 pm
I suspect that most Q2 ports have by now incorporated the refresh DLL into the base engine code, but there may still be a market for such an idea in people who just want to run stock Q2 on lower-end or integrated hardware.
The biggest pain would likely be in removing all the qgl stuff, which existed for no reason other than to be able to separate 3DFX OpenGL from Default OpenGL (and for logging, but you'd use GLIntercept for that nowadays). That's just messy grunt work with nothing concrete at the end of it.
surf->polys->verts[0] can be used directly as a parameter to DrawPrimitiveUP which is kinda neat and would hugely simplify the surface refresh. It could be moved to Vertex Buffers as a later exercise (I'd only bother if it was established that using -UP was a performance problem with stock Q2 maps though), particularly if you switch to shaders and do all the texcoord manipulation there. It's easy to convert surface vertex layouts from a poly/trifan to a tristrip, which might give better performance on some hardware.
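To give an idea of the trifan to tristrip conversion, here's a rough sketch in C. FanToStripOrder is a made-up name (nothing like it is in the Q2 source) and it assumes the polygon is convex, which Q2 surface polys are. It only works out the order to emit the existing vertexes in; winding is preserved under the usual strip rule where every second triangle is flipped.

```c
#include <assert.h>

/* Hypothetical helper, not from the Q2 source.
   Fills 'order' with the sequence in which the n vertexes of a convex
   trifan should be emitted to form an equivalent tristrip:
   0, 1, n-1, 2, n-2, 3, ... (zig-zagging in from both ends). */
void FanToStripOrder(int n, int *order)
{
    int lo = 1;      /* next vertex walking forward from the start */
    int hi = n - 1;  /* next vertex walking backward from the end */
    int i = 0;

    assert(n >= 3);

    order[i++] = 0;
    while (lo <= hi) {
        order[i++] = lo++;
        if (lo <= hi)
            order[i++] = hi--;
    }
}
```

So a quad stored as 0, 1, 2, 3 for a fan goes out as 0, 1, 3, 2 for a strip. You'd do this once at load time and store the verts already reordered.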
Lightmap updating in Q2 is total cack; even worse than Q1. That needs a total overhaul. Instead of updating changes as they happen you need to accumulate changes and update them in bulk once only per frame. D3D is neater and cleaner than OpenGL here as you can LockRect a D3DPOOL_MANAGED texture (with D3DLOCK_NO_DIRTY_UPDATE), pass pBits + offset into R_BuildLightmap, then UnlockRect it and AddDirtyRect with the rectangle that's actually changed. So you just need to do a bunch of AddDirtyRect calls at the end of the current frame or the start of the next for any lightmap that's been modified. It does mean that updated lightmaps will lag 1 frame behind, but I'd defy anyone to notice.
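The accumulate-then-update part might look something like this. It's only a sketch of the bookkeeping in plain C; lm_dirty_t, LM_ClearDirty and LM_MarkDirty are names I've invented for illustration, and it merges everything into one bounding rectangle per lightmap texture rather than tracking each changed block separately (that's my simplification). The actual D3D calls are left as a comment.

```c
#include <assert.h>

/* Hypothetical per-lightmap-texture dirty region.
   right/bottom are exclusive, same convention as a Win32 RECT. */
typedef struct {
    int left, top, right, bottom;
    int modified;   /* non-zero if anything was marked since the last clear */
} lm_dirty_t;

void LM_ClearDirty(lm_dirty_t *d)
{
    d->left = d->top = d->right = d->bottom = 0;
    d->modified = 0;
}

/* Grow the dirty region to include the w x h block at (x, y).
   Call this each time a surface's lightmap block is rebuilt. */
void LM_MarkDirty(lm_dirty_t *d, int x, int y, int w, int h)
{
    assert(w > 0 && h > 0);

    if (!d->modified) {
        d->left = x;
        d->top = y;
        d->right = x + w;
        d->bottom = y + h;
        d->modified = 1;
        return;
    }

    if (x < d->left) d->left = x;
    if (y < d->top) d->top = y;
    if (x + w > d->right) d->right = x + w;
    if (y + h > d->bottom) d->bottom = y + h;
}

/* At the end of the frame (or start of the next), for each lightmap
   texture with d->modified set: copy the four edges into a RECT, call
   AddDirtyRect on the texture with it, then LM_ClearDirty. */
```

One merged rectangle means one AddDirtyRect per texture per frame, at the cost of sometimes uploading texels in between that didn't change.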
The MD2 renderer in Q2 is hellishly messy. I've experimented in DirectQ with just loading all vertexes into a Vertex Buffer for MDLs and using stream offset to define the two frames to interpolate between (with interpolation being done in a vertex shader). VRAM usage is typically around the 1-2 MB mark (never seen it top 5 or so), but then DirectQ does compress vertex data down to 8 bytes and remove duplicate vertexes, so MDLs in DirectQ end up maybe 10% to 25% of the size of the original data. The concept could easily transfer across, and would be cleaner in some ways as you wouldn't have to mess around with cache memory in Q2. Otherwise just have a big enough system memory array, transfer the data in and DrawIndexedPrimitiveUP it. It won't be lightning fast but it will be fast enough.
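For anyone unsure what the two-frame setup amounts to, here's a CPU-side sketch in C of the same arithmetic the vertex shader would do. It is not DirectQ code; vec3f_t and LerpFrames are hypothetical, and it assumes uncompressed float positions with all frames stored back to back in one array, so picking a frame is just an offset of frame * numverts (which is what the stream offsets select on the GPU).

```c
#include <assert.h>

typedef struct { float x, y, z; } vec3f_t;

/* Hypothetical helper: blend two frames of a model whose frames are laid
   out consecutively in 'verts' (frame f starts at verts[f * numverts]).
   blend = 0 gives frame0, blend = 1 gives frame1. */
void LerpFrames(const vec3f_t *verts, int numverts, int numframes,
                int frame0, int frame1, float blend, vec3f_t *out)
{
    const vec3f_t *a, *b;
    int i;

    assert(frame0 >= 0 && frame0 < numframes);
    assert(frame1 >= 0 && frame1 < numframes);

    a = verts + frame0 * numverts;  /* "stream 0" offset */
    b = verts + frame1 * numverts;  /* "stream 1" offset */

    for (i = 0; i < numverts; i++) {
        out[i].x = a[i].x + (b[i].x - a[i].x) * blend;
        out[i].y = a[i].y + (b[i].y - a[i].y) * blend;
        out[i].z = a[i].z + (b[i].z - a[i].z) * blend;
    }
}
```

If you go the system memory array route instead, this is roughly the loop you'd run into that array before handing it to DrawIndexedPrimitiveUP.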
2D drawing needs serious work. Just drawing all the console text as individual quads (or trifans) with one draw call per character can drop framerates down to single digits on some systems. It needs an intermediate layer to batch things up and flush batches on a state change or at the end of a frame. ID3DXSprite can do all of this automatically for you, and it's also viable for use with particles (and sprites, of course). It can be a little slow though as it AddRefs the texture, which for some reason takes far far more CPU cycles than it should (I suspect that the runtime is doing more than just incrementing a reference counter here). But so long as you're not doing too many texture changes (which you're not with the 2D stuff and particles/sprites) it's good enough.
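If you'd rather roll the intermediate layer yourself than use ID3DXSprite, the core of it is tiny. This is a bare-bones sketch in C with invented names (batch_t, Batch_AddQuad, Batch_Flush); texture handles are plain ints, the only state change it watches is the texture, and the actual draw call is stubbed out with a counter so you can see how many would be issued.

```c
#include <assert.h>

#define MAX_BATCH_QUADS 256

/* Screen rectangle plus texcoords for one 2D quad (one console character,
   one pic, one particle...). */
typedef struct {
    float x, y, w, h;
    float s0, t0, s1, t1;
} quad_t;

/* Hypothetical batcher state; zero-initialise before first use. */
typedef struct {
    int texture;      /* texture the queued quads were submitted with */
    int numquads;     /* quads currently queued */
    int numflushes;   /* draw calls that would have been issued so far */
    quad_t quads[MAX_BATCH_QUADS];
} batch_t;

/* Submit everything queued as a single draw call. */
void Batch_Flush(batch_t *b)
{
    assert(b);
    if (b->numquads == 0)
        return;

    /* a real renderer would set b->texture and draw b->numquads quads here */
    b->numflushes++;
    b->numquads = 0;
}

/* Queue one quad, flushing first if the texture changes or the batch is full. */
void Batch_AddQuad(batch_t *b, int texture, const quad_t *q)
{
    assert(b && q);
    if (texture != b->texture || b->numquads == MAX_BATCH_QUADS)
        Batch_Flush(b);

    b->texture = texture;
    b->quads[b->numquads++] = *q;
}
```

Call Batch_Flush once more at the end of the frame (and before any other state change that affects 2D drawing). A full console of text on the one conchars texture then goes out in a handful of draw calls instead of one per character.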
Hmmmm, ideas ideas.