This article assumes that you know C programming and parts of the Win32 API, and that you have some knowledge of the DirectDraw API.
Game development is all about speed. A simple way to speed up your game is to reduce the number of blits your app performs. By blits I mean calls to the DirectDraw functions Blt and BltFast. If you read the DirectDraw samples included with the SDK, you will see the following pattern:
Initialization:
Every Loop:
The main point of this article is that the expense of using a blit is largely understated. In addition to simply moving the data from system memory to video memory, a blit has to obtain a Win16 exclusive lock on whatever resources it needs. At the very least this means the source and destination buffers. This penalty occurs every time something is blitted onto the screen.
How can you reduce that overhead? Lock the back buffer only once per game loop and write everything to it yourself. This reduces the lock / unlock cycle to once per game loop instead of once per write. This method also has other good side effects that I detail below.
I call this method the Lock / Unlock method. In a sentence, you: lock your back buffer once, write to it many times, unlock it, then flip or blit it to the front buffer.
The new, improved method's pseudo code is:
Lock the back buffer
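To make the difference in lock traffic concrete, here is a rough, portable sketch that does not use DirectDraw at all. `FakeSurface`, `Sprite`, `fillSprite`, `drawPerSpriteLock`, and `drawSingleLock` are hypothetical names invented for this illustration; the fake surface is just system memory that counts how often it gets locked, standing in for the locking that a real blit or a real Lock call would perform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a back buffer. It only mimics the lock/unlock
// protocol and counts locks; it is NOT the DirectDraw API.
struct FakeSurface {
    int width;
    int height;
    int pitch;                        // bytes per row, may exceed width
    std::vector<std::uint8_t> bits;   // 8 bits per pixel
    int lockCount;
    bool locked;

    FakeSurface(int w, int h, int p)
        : width(w), height(h), pitch(p),
          bits(static_cast<std::size_t>(p) * static_cast<std::size_t>(h), 0),
          lockCount(0), locked(false)
    {
        assert(w > 0 && h > 0 && p >= w);
    }

    std::uint8_t* lock()
    {
        assert(!locked);
        locked = true;
        ++lockCount;
        return bits.data();
    }

    void unlock()
    {
        assert(locked);
        locked = false;
    }
};

// A solid-colored rectangle, standing in for real sprite artwork.
struct Sprite { int x, y, w, h; std::uint8_t color; };

// Write one sprite into an already locked 8-bit buffer (no clipping).
void fillSprite(std::uint8_t* dst, int pitch, const Sprite& s)
{
    for (int row = 0; row < s.h; ++row)
        std::memset(dst + (s.y + row) * pitch + s.x, s.color,
                    static_cast<std::size_t>(s.w));
}

// Blit-style frame: every write pays for its own lock / unlock cycle.
void drawPerSpriteLock(FakeSurface& back, const std::vector<Sprite>& sprites)
{
    for (const Sprite& s : sprites) {
        std::uint8_t* p = back.lock();
        fillSprite(p, back.pitch, s);
        back.unlock();
    }
}

// Lock / Unlock method: lock once, write many times, unlock once.
void drawSingleLock(FakeSurface& back, const std::vector<Sprite>& sprites)
{
    std::uint8_t* p = back.lock();
    for (const Sprite& s : sprites)
        fillSprite(p, back.pitch, s);
    back.unlock();
}
```

Drawing the same three sprites both ways leaves identical pixels in the buffer, but the first version locks three times per frame and the second only once. How much time that saves on real hardware depends on the driver, so treat this as a picture of the call pattern rather than a benchmark.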
// Here is some sample source from a DirectDraw wrapper that I
// wrote a while back, with the calls to Lock / Unlock.
// This method works with DirectX versions 2 and 3, and probably 5.
void renderer_c::Lock(void)
{
    DDSURFACEDESC ddsd;
    HRESULT ddrval;
    // VERY IMPORTANT - the next two steps are crucial!
    memset( &ddsd, 0, sizeof( ddsd ));
    ddsd.dwSize = sizeof( ddsd );
    // attempt to Lock the surface
    ddrval = ddsBack->Lock(NULL, &ddsd, DDLOCK_WAIT, NULL);
    // Always, always check for errors with DirectX!
    if( ddrval == DD_OK )
    {
        locked = TRUE;
        lockedSurf = ddsd.lpSurface;
        lockedWidth = ddsd.lPitch;    // pitch in bytes per row; may exceed the visible width
        lockedHeight = ddsd.dwHeight;
    }
    else
    {
        locked = FALSE;
        lockedSurf = NULL;
        lockedWidth = 0;
        lockedHeight = 0;             // reset the height too, matching UnLock()
        throw ddrawErr_c("Couldn't Lock the renderer!");
    }
}
void renderer_c::UnLock(void)
{
    if (DD_OK != ddsBack->Unlock(lockedSurf))
        throw ddrawErr_c("Couldn't UnLock the renderer!");
    locked = FALSE;
    lockedSurf = NULL;
    lockedWidth = 0;
    lockedHeight = 0;
}
| Lock / Unlock Method | |
|---|---|
| Advantages: | |
| easier setup | sprites, backgrounds, etc. are loaded via standard memory allocation (malloc, etc.). |
| portability | less dependence on DX screen drawing methods. |
| flexibility | you can use run length encoding for sprites and load artwork of any size (limited only by system RAM). |
| speed | in the worst case it's the same speed, but in most cases it's faster. |
| Disadvantage: | |
| no color keying | while this is an important function, implementing it yourself isn't difficult. |
| Blit Method | |
| Advantage: | |
| color keying | like I said above, this is easy enough to do. |
| Disadvantages: | |
| DirectX limitations | There is a limit to the size of your source surfaces (using DX 3 it was 4 MB). I have never seen documentation that noted this limit, either. |
| blit's overhead | slower than Visual C's version of memcpy (and probably everyone else's too). |
| harder setup | DirectDraw surfaces cannot be set to an explicit size (the pitch may differ from the width). |
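The table claims that doing color keying yourself isn't difficult. As a minimal sketch of what that could look like for 8-bit surfaces, the function below copies a sprite into a locked buffer and skips every pixel that matches the key. `copyWithColorKey` and its parameters are hypothetical names for this illustration, and the sketch assumes the caller has already clipped the sprite to the destination.

```cpp
#include <cassert>
#include <cstdint>

// Copy an 8-bit sprite into a locked 8-bit destination buffer, leaving the
// destination untouched wherever the source pixel equals colorKey.
// dst must already point at the sprite's top-left corner in the destination.
// Pitches are in bytes per row. No clipping is done here.
void copyWithColorKey(std::uint8_t* dst, int dstPitch,
                      const std::uint8_t* src, int srcPitch,
                      int width, int height, std::uint8_t colorKey)
{
    assert(dst != nullptr && src != nullptr);
    assert(width >= 0 && height >= 0);
    assert(dstPitch >= width && srcPitch >= width);

    for (int y = 0; y < height; ++y) {
        const std::uint8_t* s = src + y * srcPitch;   // start of source row
        std::uint8_t* d = dst + y * dstPitch;         // start of destination row
        for (int x = 0; x < width; ++x) {
            if (s[x] != colorKey)
                d[x] = s[x];                          // opaque pixel: copy it
        }
    }
}
```

The per-pixel test means this cannot use a plain memcpy per row. Run length encoding the sprite, as mentioned in the table, is one way to get back to copying whole runs of opaque pixels at once.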
The sample code from MS sucks, sucks, sucks! Globals are used everywhere and the code displays an overall disregard for readability. I am convinced that all of the sample code is written by the least experienced people on any given team.
Furthermore, Hungarian notation is lame! Firstly, why not pick meaningful names for variables instead of using the type to determine what the variable is? Secondly, you should know the types you are working with. Lastly, your compiler should warn you if you are truncating a value by converting it to a smaller type.
To be honest, most of the books out there are written in this goofy, screwed up style - don't propagate this nonsense. Use variable names that are meaningful - and avoid globals. Email me with comments, suggestions, errors, flames, whatever: tfiner@jps.net.
Once we have the surface set up and locked, we are ready to play with it. In this article we will learn about modifying the surface pointer and using memcpy to write our own drawing functions.
As explained in the previous article, the primary reason for going out of our way to copy the memory ourselves is to avoid constant locks and unlocks of the video card, which waste a huge amount of time.
Here are the steps we will go through:
Locking the surface should look something like this:
// LOCK BACKBUFFER
DDSURFACEDESC DDSD;
ZeroMemory(&DDSD, sizeof(DDSD));
DDSD.dwSize = sizeof(DDSD);
HRESULT ddrval = lpDDSBack->Lock(NULL, &DDSD, DDLOCK_WAIT, NULL);
// (check that ddrval == DD_OK before touching DDSD.lpSurface)
// LOCK MAINGFX
DDSURFACEDESC MainGFXDDSD;
ZeroMemory(&MainGFXDDSD, sizeof(MainGFXDDSD));
MainGFXDDSD.dwSize = sizeof(MainGFXDDSD);
ddrval = lpDDSMainGFX->Lock(NULL, &MainGFXDDSD, DDLOCK_WAIT, NULL);
// (check that ddrval == DD_OK before touching MainGFXDDSD.lpSurface)
Define the pointers we will be playing with, and point them to our locked surfaces:
BYTE * lpDDMemory = (BYTE *) DDSD.lpSurface;
BYTE * lpDDMemSource = (BYTE *) MainGFXDDSD.lpSurface;
OK, now we need to position the source and destination pointers at the correct location on their surfaces. For example, let's say we wanted to copy from our source at 150, 150...
//Move down to the correct line
lpDDMemSource += 150 * MainGFXDDSD.lPitch;
//Move over to correct pixel
lpDDMemSource += 150;
It is easiest to think of MainGFXDDSD.lPitch as the width of the surface, and that makes the code a little easier to write, but it is not exactly the width: the pitch is the number of bytes from the start of one row to the start of the next, and it can be larger than the width. Do not use the surface width when advancing the pointer down the y coordinate - that is incorrect - use the pitch.
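The pitch rule above can be sketched as a small memcpy-based drawing function for 8-bit surfaces. `copyRect8` and its parameter names are hypothetical, chosen for this illustration; the pointers are assumed to be the lpSurface values of two locked surfaces, and no clipping is done.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy a width x height block of 8-bit pixels from (srcX, srcY) in the source
// buffer to (dstX, dstY) in the destination buffer, one row at a time.
// Rows are stepped by the pitch (bytes per row), never by the surface width.
void copyRect8(std::uint8_t* dst, int dstPitch, int dstX, int dstY,
               const std::uint8_t* src, int srcPitch, int srcX, int srcY,
               int width, int height)
{
    assert(dst != nullptr && src != nullptr);
    assert(width >= 0 && height >= 0);
    assert(srcX >= 0 && srcY >= 0 && dstX >= 0 && dstY >= 0);
    assert(srcX + width <= srcPitch && dstX + width <= dstPitch);

    // Move down to the correct line, then over to the correct pixel.
    src += srcY * srcPitch + srcX;
    dst += dstY * dstPitch + dstX;

    for (int row = 0; row < height; ++row) {
        std::memcpy(dst + row * dstPitch,
                    src + row * srcPitch,
                    static_cast<std::size_t>(width));
    }
}
```

A call such as `copyRect8(lpDDMemory, DDSD.lPitch, 0, 0, lpDDMemSource, MainGFXDDSD.lPitch, 150, 150, 32, 32)` would copy a 32 x 32 block from the source at 150, 150, provided both pointers still point at the start of their surfaces. The block size of 32 is only an example value.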
Note: Remember, arithmetic on a BYTE pointer is in bytes! If we are running in 16 bpp, we need to multiply by 2 when moving over to the correct x coordinate, because 16 bits is two bytes. Here is the same code as above for a 16-bit surface:
//Move down to the correct block frame
lpDDMemSource += 150 * MainGFXDDSD.lPitch; //Don't multiply this value by 2!
//Move over to correct block
lpDDMemSource += 150 * 2; //(Times 2 because we are in 16-bit color)
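The byte-offset rule from the note above can be folded into one small helper that works for any whole-byte pixel size. `pixelOffset` is a hypothetical name for this sketch, and the pitch values used with it below are made-up examples; in real code the pitch comes from lPitch after the lock.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of pixel (x, y) from the start of a locked surface.
// y is scaled by the pitch (already in bytes, so never multiplied by the
// pixel size); x is scaled by the number of bytes per pixel.
std::size_t pixelOffset(int x, int y, int pitchBytes, int bitsPerPixel)
{
    assert(x >= 0 && y >= 0 && pitchBytes > 0);
    assert(bitsPerPixel == 8 || bitsPerPixel == 16 ||
           bitsPerPixel == 24 || bitsPerPixel == 32);

    const std::size_t bytesPerPixel = static_cast<std::size_t>(bitsPerPixel / 8);
    return static_cast<std::size_t>(y) * static_cast<std::size_t>(pitchBytes)
         + static_cast<std::size_t>(x) * bytesPerPixel;
}
```

For the pixel at 150, 150, an 8-bit surface with an assumed pitch of 640 bytes gives 150 * 640 + 150 = 96150, and a 16-bit surface with an assumed pitch of 1280 bytes gives 150 * 1280 + 300 = 192300. Adding the result to a BYTE pointer that points at the start of the surface lands on the pixel.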
This als