by mh » Sun Feb 03, 2013 1:56 am
If I were going to do this, I'd make 3 separate colormaps, one for each of R, G and B, then output to 3 separate color buffers, again one per channel. Mix them together at swapbuffers time to get the final image.
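To make the mix step concrete, here's a rough sketch of what it might look like. It assumes each of the three buffers already holds final 8-bit channel intensities (i.e. each colormap outputs an intensity rather than a palette index) and that the target is a 32-bit 0x00RRGGBB surface. `mix_channels` is a made-up name for illustration, not anything from the actual engine source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack three 8-bit channel buffers into one
 * 0x00RRGGBB buffer. Assumes r, g and b each hold `count` final
 * channel intensities written by the (unchanged) 8-bit renderer. */
void mix_channels(uint32_t *dst, const uint8_t *r, const uint8_t *g,
                  const uint8_t *b, size_t count)
{
    assert(dst && r && g && b);
    for (size_t i = 0; i < count; i++) {
        dst[i] = ((uint32_t)r[i] << 16) |
                 ((uint32_t)g[i] << 8)  |
                  (uint32_t)b[i];
    }
}
```

This would run once per frame over width * height pixels, right before the image is presented.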
I have no idea what the performance would be like, but I see two main advantages. You get to keep the original ASM code (with all of its wacky Abrash optimizations) because you're still writing to 8-bit targets, and various blends/colour shifts/palette hackery/etc. can be done more-or-less for free because they can be applied during the final mix.
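As an illustration of the "for free" part, a colour shift could be folded into the final mix as one 256-entry lookup table per channel. This is only a sketch under my own assumptions: the buffers hold 8-bit channel intensities, the output is 0x00RRGGBB, and both function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: build a table that blends every channel value toward
 * `target` by alpha/255 (alpha 0 = unchanged, 255 = fully target). */
void build_shift_lut(uint8_t lut[256], uint8_t target, uint8_t alpha)
{
    for (int v = 0; v < 256; v++)
        lut[v] = (uint8_t)(v + ((target - v) * alpha) / 255);
}

/* Hypothetical: combine the three 8-bit buffers into 0x00RRGGBB,
 * passing each channel through its own table on the way. */
void mix_channels_shifted(uint32_t *dst, const uint8_t *r, const uint8_t *g,
                          const uint8_t *b, size_t count,
                          const uint8_t rlut[256], const uint8_t glut[256],
                          const uint8_t blut[256])
{
    assert(dst && r && g && b && rlut && glut && blut);
    for (size_t i = 0; i < count; i++) {
        dst[i] = ((uint32_t)rlut[r[i]] << 16) |
                 ((uint32_t)glut[g[i]] << 8)  |
                  (uint32_t)blut[b[i]];
    }
}
```

The tables only need rebuilding when the shift changes, so the per-pixel cost is one extra lookup per channel in a pass that touches every pixel anyway.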