An Incomplete Guide to Programming

DirectDraw and Direct3D Immediate Mode

(Release 0.46)

by Brian Hook (bwh@wksoftware.com)

http://www.wksoftware.com

Introduction

Microsoft designed and released a game development library called DirectX. Part of this library is a 3D graphics immediate mode rendering API called Direct3D. Everyone said it was going to be The Standard for 3D Graphics. I decided to learn it. It sucked. It was difficult. It was poorly documented. It was error prone. Since I've subjected myself to this pain, I figure that others may want to learn a bit from my own experiences, so I'm writing this document.

NOTE: I'm doing this for free, so don't expect it to be fancy and neat and easy to read like some magazine article, but it's probably a lot better than the crap that Microsoft is pawning off on the unsuspecting game developers. This is a WORK IN PROGRESS, so please be forgiving of typos, grammatical mistakes, coding errors, whatever.

Oh, I tend to jump between short and long forms of the various API names, so D3D is the same as Direct3D, DD is the same as DirectDraw, and DX is the same as DirectX. This document assumes DirectX 3, but most of it probably applies to earlier releases of DirectX.

The code that is presented within this document is originally from my own source code, but has been munged during text editing for readability's sake. However, in the process of these edits some variable names may have gotten mixed up and other obvious errors may have been introduced, breaking compiles. If you see any errors of this type, I apologize, and please tell me about them and I'll fix them. If you see any logical or programming errors, or incorrect assumptions made by me, let me know also and I will attempt to rectify them.

There are a lot of books on DirectX programming out there, but one I've found to be pretty good is DirectDraw Programming by Bret Timmins (M&T Books, 1996). It's really well written and has a lot of easy to read chunks of code that help you understand DirectDraw. I recommend it to anyone trying to learn DirectX programming.

Since I started working on this Microsoft has put up a similar document that I highly recommend. It is located at http://www.microsoft.com/mediadev/graphics/imtutor.htm.

Organization of this Document

This document is organized (if you can call it that) into big sections, each of which addresses its own little issues. Each of these sections talks about one particular corner of the Hell that is Direct3D.

Conventions

This document assumes that you're using the C++ COM interface to DirectX. Syntactically there should be very little difference between the C and C++ interfaces - the C interface requires function access through the lpVtbl and it also requires that the COM object be passed to the method being called.
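To make that difference concrete, here is a minimal sketch built around a made-up interface. MockObject, MockObjectVtbl, and the functions below are hypothetical stand-ins, not DirectX types; they only mimic the general shape of a COM object so that both calling styles can be seen side by side.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object as a C program sees it:
// a struct whose first member points at a table of function pointers.
struct MockObject;

struct MockObjectVtbl
{
   unsigned long ( *AddRef )( MockObject *self );
   unsigned long ( *Release )( MockObject *self );
};

struct MockObject
{
   const MockObjectVtbl *lpVtbl;
   unsigned long         refs;

   // What the C++ syntax hides: the member call goes through the
   // table and passes the object itself as the first argument.
   unsigned long AddRef()  { return lpVtbl->AddRef( this ); }
   unsigned long Release() { return lpVtbl->Release( this ); }
};

static unsigned long MockAddRef( MockObject *self )
{
   return ++self->refs;
}

static unsigned long MockRelease( MockObject *self )
{
   assert( self->refs > 0 );
   return --self->refs;
}

static const MockObjectVtbl g_mock_vtbl = { MockAddRef, MockRelease };

// Builds an object with one outstanding reference
MockObject MakeMockObject()
{
   MockObject obj = { &g_mock_vtbl, 1 };
   return obj;
}
```

With an object built by MakeMockObject, the C-style call is obj.lpVtbl->AddRef( &obj ) and the C++-style call is obj.AddRef(). Both end up in the same function.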

Certain variables are used consistently throughout this document, and just so that you can see what they are all supposed to be, here they are:


LPDIRECTDRAW            lpDD;

LPDIRECT3DDEVICE        lpD3DDevice; 

LPDIRECTDRAWSURFACE     lpBackBuffer;

LPDIRECTDRAWSURFACE     lpFrontBuffer;

LPDIRECTDRAWSURFACE     lpZBuffer;

LPDIRECTDRAWCLIPPER     lpClipper;

LPDIRECT3D              lpD3D;

LPDIRECT3DEXECUTEBUFFER lpExBuf;

D3DEXECUTEDATA          ExData;

LPVOID                  lpBufStart, lpPointer, lpInsStart;

D3DMATRIXHANDLE         hModelWorldMatrix;

D3DMATRIXHANDLE         hWorldViewMatrix;

D3DMATRIXHANDLE         hProjectionMatrix;

Revision History

Release 1.00 (Not so soon…)

Will add discussion of the new DX5 IM interface.

Release 0.50 (Coming Soon!)

Will add discussion of texture management, material management, fog, dynamic lighting, and restoring surfaces.

Release 0.46

Yet another bug fix, compile-time once again so nothing major.

Release 0.45

Added two more minor compile-time bug fixes in the code.

Release 0.44

Added minor updates to reflect the simple Direct3D document that Microsoft recently posted on their Web site (http://www.microsoft.com/mediadev/graphics/imtutor.htm).

Release 0.43

Bug report from the field that for some people lpHELDesc or lpHWDesc might actually be NULL in the driver callback. This seems odd to me, but a NULL check has been added since it doesn't hurt.

Release 0.42

Adjusted viewport computations to correct for aspect ratio. Added stub for The Hell of Surface Management. Fixed a small bug in the front/back buffer creation code where I was forgetting to set the dwSize field.

Release 0.41

Set the dvMaxZ and dvMinZ members of the viewport structure. This actually shouldn't make a difference, but I'm putting it in here for completeness.

Release 0.40

Added discussion of state management.

Release 0.30

Added discussion of transformation matrices. Added Rule #7.

Release 0.24

Forgot to mention that calls to IDirect3DDevice::Execute() must be inside a BeginScene()/EndScene() pair.

Release 0.23

Minor bug fixes in the code. Was using "ddcs" instead of "dl" in the sample device enumeration code. Was using C-style COM interface in some sections inconsistently with other parts.

Release 0.22

I was using the term HEL improperly so I fixed it. Added table of contents. Changed OpenGL™ to OpenGL®.

Release 0.21

Minor bug fixes in some of the code and minor textual changes. In the part of the code where I tell you to store the lpGUID, I was doing if ( lpGUID ) instead of the correct if ( lpGUID == NULL ).

Release 0.20

Added documentation on execute buffers and stubs for discussions on matrices. Changed title names. Cleaned up a lot of text, added Rule #5. Massively re-architected the document to make it more coherent and expandable into an actual book form later.

Release 0.12

Fixed small typographical error in code (compile time error).

Release 0.11

Fixed dcmColorModel bug where I used equivalence instead of a bitwise AND. Mentioned that you need to create and attach your Z-buffer before creating your DIRECT3DDEVICE.

Release 0.1

Added documentation on full screen mode initialization and also added Z-buffer initialization code.

Release 0.0

First publication on the Web.

Acknowledgments

Well, people have already started helping me out by sending in their comments and e-mails. I'd like to thank the following individuals for their feedback, commentary, and bug reports:

Legal Stuff

This document, and all its associated parts, are Copyright © 1996, Brian Hook. All rights reserved. Permission to distribute this document, in part or full, via electronic means (e-mail, posted, or archived) or printed copy is granted providing that no charges are involved, reasonable attempt is made to use the most current version, and all credits and copyright notices are retained. If you make a link to this WWW page, please inform Brian Hook (bwh@wksoftware.com).

Requests for other distribution rights, including incorporation in commercial products, such as books, magazine articles, CD-ROMs, and binary applications should be made to Brian Hook, bwh@wksoftware.com.

This document is provided as is without any express or implied warranties. I've done everything I can to make sure that it is accurate, but I assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein. If you have money riding on this, don't trust just this article.

All names, trademarks, copyrights, etc. are the legal property of the parties that own them. Or something like that.

An Overview of Direct3D Immediate Mode

DirectDraw and Direct3D

Fundamental Fact Number One: Direct3D is a part of DirectDraw, Microsoft's other API for accessing video hardware. Direct3D cannot exist without DirectDraw, since it is a sub-interface to DirectDraw. So this means that you have to know the basics of DirectDraw before you can use Direct3D. Once you accept this fact of life, things make more sense. But don't expect things to get any easier.

Surfaces

DirectDraw promotes the concept of a surface. A surface is memory that represents a visual image - the IDirectDrawSurface is used to represent front and back rendering buffers, the Z-buffer, and texture maps.

Execute Buffers

All rendering data and state changes are sent to the Direct3D driver via execute buffers. Execute buffers are chunks of memory containing commands and data that are interpreted on the fly by Direct3D. You'll learn about these later.

The Hell of Initialization

The first major hurdle a programmer will encounter when writing a Direct3D application is initialization. This section describes initialization of DirectDraw and Direct3D, the differences between drivers and devices, and ways of selecting the best installed device and driver for your application.

The Hell of GUIDs

The first thing I need to get out of the way is the GUID Issue. This is something that has bitten every DirectX programmer at one point or another, and is more frustrating than trying to teach algebra to a poodle. DirectX uses GUIDs (Globally Unique Identifiers) to identify interfaces. This would be fine and dandy, if GUIDs were documented at all. But they aren't.

There are two problems that you will encounter with GUIDs. The first one is linking errors, and the second is the "pointer vs. reference" issue.

The INITGUID Linker Error

The first problem a developer will encounter when trying to build a D3D IM application from scratch will usually be some type of error like this:


foo.obj : error LNK2001: unresolved external symbol _IID_IDirect3D

Debug/foo.exe : fatal error LNK1120: 1 unresolved externals

Error executing link.exe

This is the infamous INITGUID bug.

This happens because you need this wonderful thing called a GUID when querying for an interface. For example:


lpDD->QueryInterface( IID_IDirect3D, ( LPVOID * ) &lpD3D );

IID_IDirect3D is a globally defined value; however, it must be instantiated, either directly by your application or by linking to DXGUID.LIB (in dxsdk/sdk/lib), or the previously mentioned linker error will occur.

Instantiating GUIDs Within an Application

To instantiate the required GUID within an application (note that you may run into this same problem with other DirectX components that need their own GUIDs instantiated) you must define INITGUID before you include your header files, but in exactly one of your source modules! For example:

In file FOO.C:


#define INITGUID

#include <ddraw.h>

#include <d3d.h>

In file BAR.C:


// Do not #define INITGUID in more than one file!!!!

#include <ddraw.h>

#include <d3d.h>
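Why exactly one module? The following sketch imitates the general mechanism with made-up names. MyGuid, MY_INITGUID, MY_DEFINE_GUID, and IID_MyInterface are all hypothetical stand-ins, and the real headers may differ in detail. The idea is that the same header line expands to a definition (with storage) in the module that defines the switch, and to a bare declaration everywhere else.

```cpp
#include <cassert>

// Stand-in for the Windows GUID structure (hypothetical name)
struct MyGuid
{
   unsigned long  Data1;
   unsigned short Data2;
   unsigned short Data3;
   unsigned char  Data4[ 8 ];
};

// Pretend this is the one source module that instantiates GUIDs.
// Every other module would leave MY_INITGUID undefined.
#define MY_INITGUID

#ifdef MY_INITGUID
// Instantiating module: declare the symbol and give it storage
#define MY_DEFINE_GUID( name, l, w1, w2, b1, b2, b3, b4, b5, b6, b7, b8 ) \
   extern const MyGuid name;                                              \
   const MyGuid name = { l, w1, w2, { b1, b2, b3, b4, b5, b6, b7, b8 } }
#else
// Every other module: declaration only, the linker must find the storage
#define MY_DEFINE_GUID( name, l, w1, w2, b1, b2, b3, b4, b5, b6, b7, b8 ) \
   extern const MyGuid name
#endif

// What a header would contain for each interface it exposes
MY_DEFINE_GUID( IID_MyInterface, 0x12345678, 0x1234, 0x5678,
                1, 2, 3, 4, 5, 6, 7, 8 );

// Compares two GUIDs member by member
bool SameGuid( const MyGuid &a, const MyGuid &b )
{
   if ( a.Data1 != b.Data1 || a.Data2 != b.Data2 || a.Data3 != b.Data3 )
      return false;
   for ( int i = 0; i < 8; ++i )
   {
      if ( a.Data4[ i ] != b.Data4[ i ] )
         return false;
   }
   return true;
}
```

If no module defines the switch, nobody provides the storage and you get the unresolved external. If two modules define it, both provide storage for the same symbol and the linker complains about duplicates.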

If you accidentally instantiate too many GUIDs you will get yet another linker error (actually, a whole stack of them), this one along the lines of:


foo.obj : error LNK2005: _IID_IDirect3DViewport already defined in bar.obj

foo.obj : error LNK2005: _IID_IDirect3DExecuteBuffer already defined in bar.obj

foo.obj : error LNK2005: _IID_IDirect3DMaterial already defined in bar.obj

foo.obj : error LNK2005: _IID_IDirect3DLight already defined in bar.obj

foo.obj : error LNK2005: _IID_IDirect3DTexture already defined in bar.obj

foo.obj : error LNK2005: _IID_IDirect3D already defined in bar.obj

foo.obj : error LNK2005: _IID_IDirectDrawClipper already defined in bar.obj

foo.obj : error LNK2005: _IID_IDirectDrawPalette already defined in bar.obj

foo.obj : error LNK2005: _IID_IDirectDrawSurface2 already defined in bar.obj

foo.obj : error LNK2005: _IID_IDirectDrawSurface already defined in bar.obj

foo.obj : error LNK2005: _IID_IDirectDraw2 already defined in bar.obj

foo.obj : error LNK2005: _IID_IDirectDraw already defined in bar.obj

foo.obj : error LNK2005: _CLSID_DirectDrawClipper already defined in bar.obj

foo.obj : error LNK2005: _CLSID_DirectDraw already defined in bar.obj

The real crappy part of all this is that if you are building a static library that instantiates GUIDs, then any application that instantiates GUIDs but also links to your library will get the same errors as the above. The only real satisfactory method to solve this problem, then, is….

Linking to DXGUID.LIB

Alternatively, an application can skip defining INITGUID and instead link directly to DXGUID.LIB, which has all the DirectX GUIDs instantiated within it. This is probably the simplest way around things, but I couldn't find this documented anywhere and had to read a Usenet post before I learned about it. If you go this route you won't accidentally overinstantiate your GUIDs, whether you're building a library or an application. The only downside I've encountered is that developers using Borland or Watcom compilers have been having trouble linking to DXGUID.LIB.

C and C++ GUID Interfaces Are Different

A related INITGUID bug is that when programming DirectX in C you pass a pointer to a GUID to IDirectDraw::QueryInterface, but when using C++ you pass a reference to a GUID to IDirectDraw::QueryInterface. This means that in C you must have code that looks like this:


// pass address of IID_IDirect3D to QueryInterface

lpDD->lpVtbl->QueryInterface( lpDD, &IID_IDirect3D, ( LPVOID * ) &lpD3D );

Whereas in C++ you must have code like this:


// pass reference to IID_IDirect3D

lpDD->QueryInterface( IID_IDirect3D, ( LPVOID * ) &lpD3D ) ;

The Difference Between Devices and Drivers

Now that the GUID stuff is out of the way, the next thing we need to do is establish our device/driver terminology. A device is an actual physical piece of hardware installed in a computer. There may be one or more DirectDraw devices installed in a computer. Each device will have exactly one DirectDraw driver, used for communication between the device and software, associated with it. Each DirectDraw device driver will, in turn, have several Direct3D drivers attached to it. Oh, and once you have selected a specific Direct3D driver to use, you create an IDirect3DDevice from it. Yes, it's really that bad.

One of my computers, for example, has the following DirectDraw devices installed:


Primary device

3Dfx DirectDraw device

The above system has a primary device, an Intergraph Reactor board with the Rendition Verite chipset, and a secondary device, a non-VGA Diamond Monster3D with the 3Dfx Interactive Voodoo Graphics accelerator. Each of these devices has exactly one DirectDraw device driver, but each DD device driver has multiple Direct3D drivers (very important to keep this distinction!).

The Microsoft documentation is very inconsistent about the usage of the terms "device" and "driver". I've adopted the convention mentioned in the previous paragraph, which does not necessarily match the Microsoft convention. But at least I'm consistent.

Okay, so we've established that there are one or more DD devices (and thus DD drivers) in a computer system. Each of these will have at least two D3D software drivers provided by Microsoft, and possibly at least one D3D hardware HAL (Hardware Abstraction Layer) driver, provided by the hardware manufacturer. The two software drivers provide software rendering capabilities, and the HAL provides access to hardware acceleration capabilities if any are present.

In my system, the primary device driver (Intergraph Reactor) has the following D3D drivers attached to it:


Ramp Emulation

RGB Emulation

Direct3D HAL

Once you figure out what D3D driver you want, you have to create a D3D device that you actually use for your rendering operations.

Let's summarize, shall we?

DirectDraw device: a hardware device that provides DirectDraw (possibly including Direct3D hardware acceleration) capabilities to a computer system. There is at least one DirectDraw device installed in the system (the normal display adapter).

DirectDraw device driver: a software interface to a specific DirectDraw device. There is exactly one DirectDraw device driver per DirectDraw device.

Direct3D driver: a sub-interface to a specific DirectDraw device driver. More than one D3D driver may exist within a single DD driver. Microsoft ships two default D3D drivers, its Ramp Emulation Driver and its RGB Emulation Driver. In addition, if a particular Direct3D device has hardware acceleration capabilities then another D3D driver, the Direct3D HAL, will also be available.

IDirect3DDevice: This is a software object that you create from a specific D3D driver. The IDirect3DDevice is responsible for taking execute buffers you create and passing them to the driver.
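One way to keep this terminology straight in your own code is to mirror it in your bookkeeping structures. The sketch below is only an illustration: DDDeviceRecord, D3DDriverRecord, and the helper functions are hypothetical names, and a real application would also store GUIDs and capabilities alongside the names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One Direct3D driver exposed by a DirectDraw driver (hypothetical record)
struct D3DDriverRecord
{
   std::string name;
   bool        is_hardware;   // true for the HAL, false for software emulation
};

// One DirectDraw device together with its single DirectDraw driver
struct DDDeviceRecord
{
   std::string                  name;
   bool                         is_primary;
   std::vector<D3DDriverRecord> d3d_drivers;
};

// Counts the hardware (HAL) drivers across every device in the list
std::size_t CountHardwareDrivers( const std::vector<DDDeviceRecord> &devices )
{
   std::size_t count = 0;
   for ( std::size_t i = 0; i < devices.size(); ++i )
   {
      for ( std::size_t j = 0; j < devices[ i ].d3d_drivers.size(); ++j )
      {
         if ( devices[ i ].d3d_drivers[ j ].is_hardware )
            ++count;
      }
   }
   return count;
}

// Builds the primary device from the example system described earlier
DDDeviceRecord MakeExamplePrimaryDevice()
{
   DDDeviceRecord device;
   device.name = "Primary device";
   device.is_primary = true;

   D3DDriverRecord ramp = { "Ramp Emulation", false };
   D3DDriverRecord rgb  = { "RGB Emulation", false };
   D3DDriverRecord hal  = { "Direct3D HAL", true };
   device.d3d_drivers.push_back( ramp );
   device.d3d_drivers.push_back( rgb );
   device.d3d_drivers.push_back( hal );
   return device;
}
```

The nesting is the point: one record per DD device (and its one DD driver), with the D3D drivers hanging off it.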

Initializing Direct3D in Eleven Clumsy and Obtuse Steps

Initializing Direct3D is a daunting task, but it's fairly manageable if you understand the general things you're trying to accomplish. To get to a "ready state" for rendering you must initialize DirectDraw first (since Direct3D is a part of DirectDraw), then you need to initialize Direct3D. The steps are:

Step 1: Enumerating DirectDraw Devices

The first thing an application should do is enumerate all the DirectDraw devices available in a computer system. This is done by calling DirectDrawEnumerate. DirectDrawEnumerate executes a callback that you pass to it for each device, and you can then store this stuff in a list somewhere for later access, because you or the user are going to have to select which DD device to use. The following is an example of a DirectDrawEnumerate callback function:


BOOL FAR PASCAL DDEnumCallback( GUID FAR* lpGUID, 

                                LPSTR lpDriverDesc,

                                LPSTR lpDriverName, 

                                LPVOID lpContext )

{

   LPDIRECTDRAW lpDD;

   DDCAPS       DDcaps, HELcaps;

   // once again, my own data structure

   DDDeviceList *dl = ( DDDeviceList * ) lpContext;



   /*

   ** try and create a DD device using the specified GUID

   */

   if ( DirectDrawCreate( lpGUID, &lpDD, NULL ) != DD_OK )

   {

      // failed, so ignore this device

      return DDENUMRET_OK;

   }



   /*

   ** get caps of this DD driver. If it fails, move on to the next

   ** driver

   */

   memset( &DDcaps, 0, sizeof( DDcaps ) );

   DDcaps.dwSize = sizeof( DDcaps );

   memset( &HELcaps, 0, sizeof( HELcaps ) );

   HELcaps.dwSize = sizeof( HELcaps );

   if ( lpDD->GetCaps( &DDcaps, &HELcaps ) != DD_OK )

   {

      lpDD->Release();

      lpDD = 0;

      return DDENUMRET_OK;

   }



   /*

   ** device is valid! Now store relevant information in our device list

   */

   dl->AddDevice( &DDcaps, lpGUID, lpDriverName, lpDriverDesc );

   lpDD->Release();



   if ( dl->IsFull() )

      return DDENUMRET_CANCEL;



   return DDENUMRET_OK;

}

The above code should be self-explanatory. One note of caution, however. The parameter lpGUID is NULL for the Primary Device, and the Primary Device only. All other devices should have a non-NULL lpGUID. This is one of those little gotcha things that may not be obvious at first.

Another thing to watch for - do not store the lpGUID directly! Instead, your driver information structure should have something like this:


struct DriverInfo

{

   LPGUID lpGUID;

   GUID   guid;

   // … other stuff

};

When adding a device use code like this:


struct DriverInfo *current;

if ( lpGUID == NULL )

{

   current->lpGUID = 0;

}

else

{

   current->lpGUID = &current->guid;

   memcpy( &current->guid, lpGUID, sizeof( GUID ) );

}

This will prevent you from getting screwed in case the data lpGUID points to accidentally goes away. The lpGUID isn't important, the GUID itself is what matters. For this reason we should store the GUID directly when possible.
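One thing to watch with a structure like the one above: lpGUID points into the structure itself, so if a DriverInfo ever gets copied (say, because the list holding it grows and moves its elements), the copy's lpGUID still points at the old location. A minimal sketch of one way around that, assuming you're free to change the structure, is to keep a flag instead of a pointer and hand out the pointer on demand. MyGuid, DriverRecord, and MakeDriverRecord are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the Windows GUID structure (hypothetical name)
struct MyGuid
{
   unsigned long  Data1;
   unsigned short Data2;
   unsigned short Data3;
   unsigned char  Data4[ 8 ];
};

struct DriverRecord
{
   bool   has_guid;   // false for the primary device (NULL lpGUID)
   MyGuid guid;       // our own copy of the GUID, valid if has_guid

   // Pointer to hand back to a create call later on;
   // a null pointer selects the primary device
   MyGuid *GuidPointer() { return has_guid ? &guid : nullptr; }
};

// Builds a record from the lpGUID an enumeration callback receives
DriverRecord MakeDriverRecord( const MyGuid *lpGUID )
{
   DriverRecord record = DriverRecord();   // value-initialize to all zeroes
   if ( lpGUID != nullptr )
   {
      record.has_guid = true;
      record.guid = *lpGUID;               // copy the GUID, not the pointer
   }
   return record;
}
```

Because the pointer is computed from the record's current address, the records can live in a std::vector or be copied around freely.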

Okay, we now have a list of DirectDraw devices, one of which is the primary one (lpGUID of NULL) and zero or more non-primary devices (lpGUIDs of non-NULL). Which brings us to….

Step 2: Selecting and Creating the IDirectDraw Object

Okay, you now have a list of DD devices in the system. Odds are that there is only one, and if that's the case skip on to the next step. If more than one DD device is present in the system then you should let the user select which device to use - don't try and use some complex decision tree about which device is more suitable, just let the user choose for themselves and store their preference away somewhere. This covers your ass in case you (or a poorly written driver) screw up your selection heuristics.

Once you've selected a particular device you need to create an interface to the DD driver linked to that device. You do this using DirectDrawCreate by passing it a pointer to the GUID of the device you'd like to use. From that point on you have a pointer to an IDirectDraw object:


if ( DirectDrawCreate( d_selected_dd_device.lpGUID, &lpDD, NULL ) != DD_OK )

   goto fail;

Step 3: Enumerating Display Modes (Optional)

If you plan on supporting full screen rendering modes you need to now enumerate all the available display modes. This is done by calling IDirectDraw::EnumDisplayModes. This function enumerates all the given display modes and executes the callback passed to it. Your callback should store away the display mode information in a list somewhere so that the user can select a display mode later.
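The shape of such a callback is the same as the other enumeration callbacks in this document: it gets a description of one item plus your context pointer, and returns a code saying whether to keep going. The sketch below shows only that pattern and deliberately avoids the DirectDraw headers. ModeDesc, ModeList, the ENUM_ constants, and MockEnumDisplayModes are hypothetical stand-ins (the real callback receives a surface description structure), and the 16-bit filter is just an assumption about what an application might want.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the handful of surface description fields a mode list needs
struct ModeDesc
{
   unsigned long width;
   unsigned long height;
   unsigned long bits_per_pixel;
};

typedef std::vector<ModeDesc> ModeList;

// Stand-ins for the enumeration return codes
const int ENUM_CONTINUE = 1;
const int ENUM_STOP     = 0;

// Called once per display mode; lpContext is the ModeList to fill in
int StoreModeCallback( const ModeDesc *lpDesc, void *lpContext )
{
   ModeList *modes = static_cast<ModeList *>( lpContext );

   // assumption: this application only wants 16-bit or deeper modes
   if ( lpDesc->bits_per_pixel >= 16 )
      modes->push_back( *lpDesc );

   return ENUM_CONTINUE;
}

// Mock enumerator that plays the role of the real enumeration call
void MockEnumDisplayModes( const ModeDesc *all_modes, std::size_t count,
                           void *lpContext,
                           int ( *callback )( const ModeDesc *, void * ) )
{
   for ( std::size_t i = 0; i < count; ++i )
   {
      if ( callback( &all_modes[ i ], lpContext ) == ENUM_STOP )
         break;
   }
}
```

As with the device lists, copy the values you care about into your own list rather than holding on to the pointer the callback was given.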

Step 4: Creating the IDirect3D Object

Okay, we're doing pretty well at this point, but we still don't have access to Direct3D. So now we have to query the IDirectDraw object (created earlier using DirectDrawCreate) to see if it supports Direct3D. This will also initialize a pointer to a D3D object if it succeeds. This is pretty simple:


BOOL CreateD3D( LPDIRECTDRAW lpDD )

{

   if ( lpDD->QueryInterface( IID_IDirect3D, ( LPVOID *) &lpD3D ) != DD_OK )

      return FALSE;

   else

      return TRUE;

}

Step 5: Enumerating D3D Drivers

At this point, assuming all has gone well, we have a pointer to an IDirectDraw object (LPDIRECTDRAW) and a pointer to an IDirect3D object (LPDIRECT3D). Our next step is to enumerate the available D3D drivers associated with our DirectDraw device. This is done with the rather misleadingly titled IDirect3D::EnumDevices function (it's misleading because we're enumerating drivers, not devices). Once again, we pass a callback to this function which is executed once for each installed D3D driver. Note that we also pass a pointer to "application specific data" - in this case we're probably going to pass a pointer to some storage structure where we can cram our driver information.

But before we get ahead of ourselves, we need to tread carefully - things get a little hairy at this point. Our callback needs to be a little smarter than just storing away a D3D driver description. The callback should probably only store those drivers it thinks are interesting, e.g. hardware drivers that have some set of criteria we need.

Here is what an example D3D enumeration callback might look like:


HRESULT WINAPI EnumD3DDriversCallback( LPGUID lpGuid,

                                       LPSTR lpDeviceDescription,

                                       LPSTR lpDeviceName,

                                       LPD3DDEVICEDESC lpHWDesc,

                                       LPD3DDEVICEDESC lpHELDesc, 

                                       LPVOID lpContext )

{

   // this is our own "d3d driver manager"

   D3DDriverList *r = ( D3DDriverList * ) lpContext;

   int should_keep = 0;



   /*

   ** is this a HAL? Check lpHWDesc->dwFlags. I'm not

   ** sure if this is the correct way of doing things, but

   ** it's worked so far and I haven't seen anything else

   ** that works better.

   */

   if ( lpHWDesc && lpHWDesc->dwFlags != 0 )

   {

      // this is a HAL! Check parameters and if we like

      // it set should_keep to 1. For example, we may

      // want to keep only hardware that does perspective

      // texture mapping and that has a Z-buffer of 16-bits

      // Note that we don't keep hardware accelerators if we're

      // trying to do debugging (surfaces in system memory)

      if ( ( lpHWDesc->dpcTriCaps.dwTextureCaps & D3DPTEXTURECAPS_PERSPECTIVE ) &&

           ( lpHWDesc->dwDeviceZBufferBitDepth & DDBD_16 ) &&

           !debugging )

      {

         should_keep = 1;

      }

   }

   else if ( lpHELDesc )

   {

      // this is a HEL! Check parameters and if we like

      // it set should_keep to 1. For example, we may want

      // to keep only RGB color model drivers.

      if ( lpHELDesc->dcmColorModel & D3DCOLOR_RGB )

         should_keep = 1;

   }



   /*

   ** record the D3D driver's information if we want it

   */

   if ( should_keep )

   {

      r->Add( lpGuid, lpDeviceDescription, lpDeviceName );

      if ( r->IsFull() )

         return D3DENUMRET_CANCEL;

   }

   return D3DENUMRET_OK;

}

I'm hoping that you think the above code is pretty self-explanatory. I did some fudging in there - I'm not going to show you how to implement a storage container in which you can shove D3D drivers, since that's just a general programming task anyone should be able to do. One comment I would like to make is that the lpHELDesc->dcmColorModel member seems to be a bit field, not an enum. Thus you have to use a bitwise AND instead of a direct comparison to see if RGB mode is supported.

Some comments on the above may be in order. For starters, in my experience this callback receives non-NULL pointers for both lpHELDesc and lpHWDesc (though at least one bug report says either may be NULL for some people, hence the NULL checks in the code), so you can't tell whether the specified item is a hardware or software driver just by looking at the pointers. The only way I've found to tell them apart is to check the dwFlags member of lpHWDesc. This is empirical programming at its best, but I haven't found a better way of doing it.

The comments I made earlier about storing the GUID instead of the lpGUID apply here also.
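The bit field point about dcmColorModel is easy to get wrong, so here is a tiny sketch of the difference. The constants are stand-ins with made-up names; all that matters is that each flag occupies its own bit, so a driver can report more than one at once.

```cpp
#include <cassert>

// Stand-ins for the color model flags; each one occupies its own bit
const unsigned long COLOR_MODEL_MONO = 1;
const unsigned long COLOR_MODEL_RGB  = 2;

// Wrong for a bit field: fails as soon as any other bit is also set
bool SupportsRGBByEquality( unsigned long color_model )
{
   return color_model == COLOR_MODEL_RGB;
}

// Right for a bit field: looks only at the RGB bit
bool SupportsRGB( unsigned long color_model )
{
   return ( color_model & COLOR_MODEL_RGB ) != 0;
}
```

A driver reporting both flags passes the bitwise test but fails the equality test, which is exactly the bug fixed in Release 0.11.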

Step 6: Selecting a D3D Driver

Alrightee, then. We should now have an LPDIRECTDRAW, an LPDIRECT3D, and a list of Direct3D drivers to choose from. Once again we should let the user select the right driver - the minute we start to think we know what's best, we're probably going to hose ourselves. Bad idea. Not to mention it makes our code a lot simpler. Typically I toss up a dialog box showing all the interesting D3D drivers I found, and I have the user select one. Simple as pie. Store their selected driver away somewhere - I'll refer to this selection as d_selected_driver later in this documentation. This structure is usually defined by an application, so I'll leave that up to you, but usually it has stuff like the driver's name, description, capabilities, and GUID.

Once the user selects a driver, we need to do some other mundane crap before we can actually create an honest to God D3D device we can render triangles with.

Step 7: Set the Cooperative Level

No big deal here. We're either running in a windowed mode or we're running full screen. The only gotcha here is that some DD devices don't support windowed mode (the 3Dfx Interactive Voodoo chipset comes to mind). If switching to full screen you probably want to use one of the display modes that you enumerated earlier.

Anyway, you should have a short chunk of code that does something like this:


if ( fullscreen )

{

   if ( lpDD->SetCooperativeLevel( hWnd, DDSCL_EXCLUSIVE | DDSCL_FULLSCREEN ) != DD_OK )

      return 0; // something bad happened

   /*

   ** set the display to full screen display

   */

   if ( lpDD->SetDisplayMode( width, height, bpp ) != DD_OK )

      return 0;

}

else

{

   if ( lpDD->SetCooperativeLevel( hWnd, DDSCL_NORMAL ) != DD_OK ) 

      return 0; // something bad happened

}

Step 8: Create Your Front and Back Buffers (and Clipper)

We now have to create our rendering buffers. This involves making a front buffer and a back buffer. I'll talk about texture surfaces some other time.

The first buffer you have to create is your front buffer, also known as your primary display surface. You will then need to create a back buffer. Also, if you are running in windowed mode then you need to create a clipper and attach it to the application's window.

The following example creates the buffers you need for a full-screen renderer:


LPDIRECTDRAWSURFACE lpFrontBuffer, lpBackBuffer;



BOOL CreateFullScreenSurfaces( LPDIRECTDRAW lpDD )

{

   DDSURFACEDESC ddsd;

   DDSCAPS ddscaps;



   memset( &ddsd, 0, sizeof( ddsd ) );



   ddsd.dwSize = sizeof( ddsd );

   ddsd.dwFlags = DDSD_CAPS | DDSD_BACKBUFFERCOUNT;

   ddsd.ddsCaps.dwCaps = DDSCAPS_PRIMARYSURFACE | 

                         DDSCAPS_FLIP | 

                         DDSCAPS_3DDEVICE | 

                         DDSCAPS_COMPLEX;

   ddsd.dwBackBufferCount = 1;



   if ( lpDD->CreateSurface( &ddsd, &lpFrontBuffer, NULL ) != DD_OK )

   {

      goto fail;

   }

   ddscaps.dwCaps = DDSCAPS_BACKBUFFER;

   if ( lpFrontBuffer->GetAttachedSurface( &ddscaps, &lpBackBuffer ) != DD_OK )

   {

      goto fail;

   }

   return TRUE;

fail:

   RELEASE( lpFrontBuffer );

   return FALSE;

}
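The listings in this document lean on a RELEASE macro that is never spelled out. A definition along the following lines would do the job. Treat it as a sketch under two assumptions: that the interface pointers start out NULL, and that it's fine to clear them after releasing. MockSurface is a hypothetical stand-in that exists only so the snippet stands alone.

```cpp
#include <cassert>

// Releases a COM-style object if the pointer is set, then clears the
// pointer so a second RELEASE on the same variable does nothing.
#define RELEASE( p )            \
   do                           \
   {                            \
      if ( ( p ) != nullptr )   \
      {                         \
         ( p )->Release();      \
         ( p ) = nullptr;       \
      }                         \
   } while ( 0 )

// Hypothetical stand-in for a surface; it only counts Release() calls
struct MockSurface
{
   int release_calls;

   unsigned long Release()
   {
      ++release_calls;
      return 0;
   }
};
```

The NULL check matters in the fail: paths, where some of the pointers may never have been created in the first place.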

This example creates the buffers and clipper you need for a windowed renderer:


BOOL CreateWindowedSurfaces( LPDIRECTDRAW lpDD )

{

   DDSURFACEDESC ddsd;

   DDSCAPS ddscaps;



   memset( &ddsd, 0, sizeof( ddsd ) );



   ddsd.dwSize  = sizeof( ddsd );

   ddsd.dwFlags = DDSD_CAPS;

   ddsd.ddsCaps.dwCaps = DDSCAPS_PRIMARYSURFACE;



   if ( lpDD->CreateSurface( &ddsd, &lpFrontBuffer, NULL ) != DD_OK )

   {

      goto fail;

   }



   ddsd.dwFlags = DDSD_WIDTH | DDSD_HEIGHT | DDSD_CAPS;

   ddsd.dwWidth = window_width;

   ddsd.dwHeight = window_height;



   ddsd.ddsCaps.dwCaps = DDSCAPS_OFFSCREENPLAIN | DDSCAPS_3DDEVICE;

   /*

   ** if debugging we have to create our surfaces in system memory

   ** so that our debugger isn't hosed when locking surfaces.

   */

   if ( debugging || !using_hardware )

      ddsd.ddsCaps.dwCaps |= DDSCAPS_SYSTEMMEMORY;

   else

      ddsd.ddsCaps.dwCaps |= DDSCAPS_VIDEOMEMORY;



   // create the back buffer

   if ( lpDD->CreateSurface( &ddsd, &lpBackBuffer , NULL ) != DD_OK )

   {

      goto fail;

   }



   // create a clipper and attach it to our window

   if (lpDD->CreateClipper( 0, &lpClipper, NULL ) != DD_OK )

   {

      goto fail;

   }



   if ( lpClipper->SetHWnd( 0, hWnd ) != DD_OK )

   {

      goto fail;

   }



   if ( lpFrontBuffer->SetClipper( lpClipper ) != DD_OK )

   {

      goto fail;

   }



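   // release our reference to the clipper because a reference to it was

   // automatically made when we did the SetClipper. Note that Release()

   // returns the remaining reference count, not an HRESULT, so don't

   // compare it against DD_OK.

   lpClipper->Release();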



   return TRUE;

fail:

   RELEASE( lpFrontBuffer );

   RELEASE( lpBackBuffer );

   RELEASE( lpClipper );

   return FALSE;

}

Step 9: Create the Z-Buffer

We've only created front and back buffers at this point, so now we need to create a Z-buffer, assuming that we want to do depth buffered hidden surface removal. If you don't require this, then you can skip this section and ignore references to the Z-buffer in other portions of this document.

Creating the Z-buffer is actually fairly trivial. It seems that if you want Z-buffering you must create the Z-buffer before creating your IDirect3DDevice (described next). This is very important, since getting the order wrong can lead to some really weird bugs. As far as I know the following code should work fine for both full screen and windowed rendering:


/*

** CreateZBuffer

*/

BOOL CreateZBuffer( void )

{

   DDSURFACEDESC ddsd;

   

   /*

   ** create the Z-buffer

   */

   memset( &ddsd, 0, sizeof( ddsd ) );

   ddsd.dwSize = sizeof( ddsd );

   ddsd.dwFlags = DDSD_WIDTH | DDSD_HEIGHT | DDSD_CAPS | DDSD_ZBUFFERBITDEPTH;

   ddsd.ddsCaps.dwCaps = DDSCAPS_ZBUFFER;

   ddsd.dwWidth  = screen_width;

   ddsd.dwHeight = screen_height;

   

   if ( debugging || !using_hardware )

      ddsd.ddsCaps.dwCaps |= DDSCAPS_SYSTEMMEMORY;

   else

      ddsd.ddsCaps.dwCaps |= DDSCAPS_VIDEOMEMORY;

   

   /*

   ** choose a 16-bit Z-buffer depth, at least that's what I do

   */

   if ( !( d_selected_driver.d_desc.dwDeviceZBufferBitDepth & DDBD_16 ) )

   {

      return FALSE;

   }

   ddsd.dwZBufferBitDepth = 16;

   

   if ( lpDD->CreateSurface( &ddsd, &lpZBuffer, NULL ) != DD_OK )

      return FALSE;



   /*

   ** attach the Z-buffer to the back buffer

   */

   if (lpBackBuffer->AddAttachedSurface( lpZBuffer ) != DD_OK )

   {

      RELEASE( lpZBuffer );

      return FALSE;

   }



   return TRUE;

}
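
The function above simply gives up if a 16-bit Z-buffer isn't available. If you'd rather fall back to another depth, something like the following sketch might do. Note that the flag constants are stand-ins defined here so the fragment compiles on its own (real code would test the DDBD_xxx flags from ddraw.h), and PickZBufferDepth is a hypothetical helper, not part of DirectX:

```cpp
#include <cstdint>

// Stand-in flag values so this compiles without ddraw.h; real code
// should test against DDBD_8, DDBD_16, DDBD_24, and DDBD_32 instead.
const std::uint32_t kDepth8  = 0x0800;
const std::uint32_t kDepth16 = 0x0400;
const std::uint32_t kDepth24 = 0x0200;
const std::uint32_t kDepth32 = 0x0100;

// Returns a Z-buffer bit depth supported by the driver's depth mask,
// preferring 16 bits, or 0 if nothing usable is advertised.
int PickZBufferDepth( std::uint32_t depth_mask )
{
   if ( depth_mask & kDepth16 ) return 16;
   if ( depth_mask & kDepth24 ) return 24;
   if ( depth_mask & kDepth32 ) return 32;
   if ( depth_mask & kDepth8 )  return 8;
   return 0;
}
```

The result would go into ddsd.dwZBufferBitDepth, with a return value of 0 treated the same way the original treats a missing 16-bit mode.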

Step 10: Create the IDirect3DDevice

Woo-hoo, almost there! We can create the D3D device and actually sort of try and get something rendering onto the screen. The reason we needed to create our buffers before doing this is that our D3D device, for some reason, is created via the IDirectDrawSurface interface. Don't ask, because I don't know. The following chunk of code creates an IDirect3DDevice device assuming that the user has selected a specific D3D driver and that we have stored the selected driver's GUID somewhere:


BOOL CreateD3DDevice( void ) 

{

   HRESULT hresult;

   

   hresult = lpBackBuffer->QueryInterface( d3d_driver.guid, 

                                           ( LPVOID * ) &lpD3DDevice );

   if ( hresult != DD_OK )

      return FALSE;

   return TRUE;

}



Step 11: Create an IDirect3DViewport and Attach It to the D3D Device

The last step in initialization is the creation of an IDirect3DViewport and attaching it to the IDirect3DDevice we created in the previous step. The only trick here is to make sure you select the right range of Z values when specifying dvMinZ and dvMaxZ. For some reason the D3D sample file D3DMAIN.CPP doesn't set these values, yet it seems to work anyway. Odd behavior. Anyway, here's the code:


BOOL CreateViewport( void )

{

   /*

   ** Create and add viewport

   */

   if ( lpD3D->CreateViewport( &lpViewport, NULL ) != DD_OK )

   {

      return FALSE;

   }

   if ( lpD3DDevice->AddViewport( lpViewport ) != DD_OK )

   {

      RELEASE( lpViewport );

      return FALSE;

   }



   /*

   ** setup the viewport for a reasonable viewing area

   */

   D3DVIEWPORT viewdata;

   DWORD       largest_side;



   memset( &viewdata, 0, sizeof( viewdata ) );



   /*

   ** this compensates for aspect ratio

   */

   if ( display_width > display_height )

      largest_side = display_width;

   else

      largest_side = display_height;

      

   viewdata.dwSize = sizeof( viewdata );

   viewdata.dwX = viewdata.dwY = 0;

   viewdata.dwWidth = display_width;

   viewdata.dwHeight = display_height;

   viewdata.dvScaleX = largest_side / 2.0F;

   viewdata.dvScaleY = largest_side / 2.0F;

   viewdata.dvMaxX = ( float ) ( viewdata.dwWidth / 

                                 ( 2.0F * viewdata.dvScaleX ) );

   viewdata.dvMaxY = ( float ) ( viewdata.dwHeight / 

                                 ( 2.0F * viewdata.dvScaleY ) );

   viewdata.dvMinZ = 1.0F;

   viewdata.dvMaxZ = 1000.0F; // choose something appropriate here!



   if (lpViewport->SetViewport( &viewdata ) != DD_OK )

   {

      RELEASE( lpViewport );

      return FALSE;

   }

   return TRUE;

}
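
To see what the aspect ratio compensation actually produces, here is the same arithmetic pulled out into a standalone sketch. ViewExtents and ComputeViewExtents are names made up for illustration; they aren't part of Direct3D:

```cpp
// Stand-in for the handful of D3DVIEWPORT members computed above.
struct ViewExtents
{
   float scale_x, scale_y; // correspond to dvScaleX, dvScaleY
   float max_x, max_y;     // correspond to dvMaxX, dvMaxY
};

// Mirrors the arithmetic in CreateViewport for a given display size.
ViewExtents ComputeViewExtents( unsigned width, unsigned height )
{
   unsigned    largest_side = ( width > height ) ? width : height;
   ViewExtents v;

   v.scale_x = largest_side / 2.0F;
   v.scale_y = largest_side / 2.0F;
   v.max_x   = width  / ( 2.0F * v.scale_x );
   v.max_y   = height / ( 2.0F * v.scale_y );
   return v;
}
```

For a 640x480 display this gives a scale of 320 in both directions, a max_x of 1.0, and a max_y of 0.75, so the shorter side covers a proportionally smaller range.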



Step 12: We're Done!

Okay, after all that, we now have a D3D device we can actually play with. I'm exhausted right now, but for the time being this should get you going. At this point we should have an lpViewport, lpD3DDevice, lpD3D, lpDD, lpClipper, lpZBuffer, lpBackBuffer, and an lpFrontBuffer, all there for your programming pain and pleasure. For your first rendering task I highly recommend clearing out the back buffer to some color (hint - IDirectDrawSurface::Blt with DDBLT_COLORFILL) and then displaying the back buffer. Other portions of this document address both buffer clearing and flipping.

The Hell of Buffer Management

DirectDraw is all about buffer management. There are surfaces and buffers for color data, depth data, and texture data. This section talks a bit about the color buffers and Z-buffers; texture stuff is discussed in The Hell of Texture Mapping.

Buffer Flipping and Blitting

You get your rendered scene onto the screen in one of two ways - flipping or blitting. If you are doing full-screen rendering, you will be doing a page flip. If you are doing windowed rendering, you will be blitting from the back buffer to the front buffer. This is trivial stuff, but it's nice to see described in one logical place, side by side:


if ( fullscreen )

{

   if ( lpFrontBuffer->Flip( NULL, DDFLIP_WAIT ) != DD_OK )

   {

      // something bad happened

   }

}

else

{

   RECT src_rect, dst_rect;

   POINT pt;



   /*

   ** src_rect is relative to offscreen buffer

   */

   GetClientRect( hWnd, &src_rect );



   /*

   ** dst_rect is relative to screen space so needs translation

   */

   pt.x = pt.y = 0;

   ClientToScreen( hWnd, &pt );

   dst_rect = src_rect;

   dst_rect.left += pt.x;

   dst_rect.right += pt.x;

   dst_rect.top += pt.y;

   dst_rect.bottom += pt.y;



   /*

   ** perform the blit from backbuffer to primary, using

   ** src_rect and dst_rect

   */

   if ( lpFrontBuffer->Blt( &dst_rect,

                             lpBackBuffer,

                            &src_rect,

                             DDBLT_WAIT,

                             0 ) != DD_OK )

   {

      // something bad happened

   }

}
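
The only fiddly part of the windowed path is the rectangle translation, so here it is isolated as a small sketch. Rect and Point are stand-ins for the Win32 RECT and POINT structures, and ClientRectToScreen is a hypothetical helper name:

```cpp
// Stand-ins for the Win32 RECT and POINT structures.
struct Rect  { long left, top, right, bottom; };
struct Point { long x, y; };

// Translates a client-relative rectangle into screen space, given the
// screen position of the client area's upper-left corner (what
// ClientToScreen reports for the point 0,0).
Rect ClientRectToScreen( const Rect &client, const Point &origin )
{
   Rect screen = client;

   screen.left   += origin.x;
   screen.right  += origin.x;
   screen.top    += origin.y;
   screen.bottom += origin.y;
   return screen;
}
```

A 320x240 client area whose corner sits at (100, 50) on the desktop ends up as the screen rectangle (100, 50) to (420, 290), which is what gets handed to Blt as the destination.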

Buffer Clearing

At some point while programming DirectX you are going to need to clear a surface to some color. This is straightforward enough:


void ClearSurface( LPDIRECTDRAWSURFACE lpDDS, float r, float g, float b )

{

   RECT dst;

   DDBLTFX ddbltfx;

   DWORD fillcolor;

   DDSURFACEDESC ddsd;



   /*

   ** compute the fill color

   */

   fillcolor = MakeSurfaceRGB( lpDDS, r, g, b );



   /*

   ** get the surface desc

   */

   ddsd.dwSize = sizeof(ddsd);

   lpDDS->GetSurfaceDesc(&ddsd);   



   memset(&ddbltfx, 0, sizeof(ddbltfx));

   ddbltfx.dwSize = sizeof(DDBLTFX);

   ddbltfx.dwFillColor = fillcolor;

   dst.left = dst.top = 0;

   dst.right = ddsd.dwWidth;

   dst.bottom = ddsd.dwHeight;



   if ( lpDDS->Blt( &dst, NULL, NULL, 

                           DDBLT_COLORFILL | DDBLT_WAIT, 

                           &ddbltfx ) != DD_OK )

   {

      // something bad happened

   }

}

The only mystical stuff in the prior code is the computation of the fill color. For some bizarre reason clear colors are not specified in a frame buffer independent format, so every application has to figure out how to map a particular RGB color to the frame buffer format. This can be accomplished in one of two ways: reading the color masks directly, or just writing an RGB888 color to the frame buffer and reading it back.

The former method is probably the most "correct" and definitely smacks of being a little less seat of the pants than the latter, but it also requires going through some tedious steps. Basically you get a surface description for the surface and read the pixel format descriptor's values for the red, green, and blue masks. Given these masks you can determine shift and scale values for conversion between RGB888 format and the frame buffer native format. Unfortunately this way of doing things may not work with chromakeying, where color values have to be identical in order to work correctly.
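
For the curious, the mask method might look something like the sketch below. It assumes you have already pulled the red, green, and blue masks out of the surface's pixel format (the dwRBitMask, dwGBitMask, and dwBBitMask members); MaskShift, PackChannel, and PackRGB are hypothetical helper names, and the rounding is one choice among several:

```cpp
#include <cstdint>

// Number of zero bits below the lowest set bit of a mask.
int MaskShift( std::uint32_t mask )
{
   int shift = 0;

   while ( mask != 0 && ( mask & 1 ) == 0 )
   {
      mask >>= 1;
      ++shift;
   }
   return shift;
}

// Scales a [0,1] component to the range of a channel mask and moves it
// into position.
std::uint32_t PackChannel( float value, std::uint32_t mask )
{
   if ( mask == 0 )
      return 0;

   int           shift = MaskShift( mask );
   std::uint32_t max   = mask >> shift; // e.g. 31 for a 5-bit channel

   if ( value < 0.0F ) value = 0.0F;
   if ( value > 1.0F ) value = 1.0F;
   return ( static_cast<std::uint32_t>( value * max + 0.5F ) << shift ) & mask;
}

// Builds a native pixel value from the red, green, and blue masks.
std::uint32_t PackRGB( float r, float g, float b,
                       std::uint32_t r_mask, std::uint32_t g_mask,
                       std::uint32_t b_mask )
{
   return PackChannel( r, r_mask ) | PackChannel( g, g_mask ) |
          PackChannel( b, b_mask );
}
```

With 565-style masks of 0xF800, 0x07E0, and 0x001F, pure red packs to 0xF800 and white to 0xFFFF.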

So, the second method of doing things - the read-back method - is probably preferable, even though it is extremely slow. Luckily for us, computing frame buffer native colors isn't going to be the kind of thing we do often.

My implementation of MakeSurfaceRGB is pretty much stolen directly from the FOXBEAR code in the DirectX SDK. In a nutshell you write an 888 pixel to the frame buffer using the Win32 API call SetPixel, then you read it back using DirectDraw's direct frame buffer access capabilities.


/*

** this assumes that R, G, and B are passed as floats in the range [0,1]

*/

DWORD MakeSurfaceRGB( LPDIRECTDRAWSURFACE lpDDS, float r, float g, float b )

{

   unsigned long dw = 0;

   COLORREF cref = RGB( r * 255, g * 255, b * 255 );

   COLORREF tmpCref;

   DDSURFACEDESC ddsd;

   HDC hdc = NULL;



   /*

   ** Get a DC from the surface

   */

   if ( lpDDS->GetDC( &hdc ) != DD_OK )

      // something bad happened

      return 0;



   /*

   ** save pixel in surface then store a pixel into the surface

   */

   tmpCref = GetPixel( hdc, 0, 0 );

   SetPixel( hdc, 0, 0, cref );

   lpDDS->ReleaseDC( hdc );



   memset( &ddsd, 0, sizeof( ddsd ) );

   ddsd.dwSize = sizeof( ddsd );



   /*

   ** lock the surface so that we can read back the value

   ** we just wrote with SetPixel()

   */

   if ( lpDDS->Lock( NULL, &ddsd, DDLOCK_WAIT, NULL ) != DD_OK )

   {

      // something bad happened

      // should probably restore the color we wrote out

      // earlier, but I'm too lazy to write that code

      return 0;

   }



   /*

   ** read back the color

   */

   dw = * ( DWORD * ) ddsd.lpSurface;



   /*

   ** mask off high bits if the bit count is not 32

   */

   if ( ddsd.ddpfPixelFormat.dwRGBBitCount != 32 ) 

      dw &= ( ( 1 << ddsd.ddpfPixelFormat.dwRGBBitCount ) - 1 );





   /*

   ** unlock the surface

   */

   lpDDS->Unlock( NULL );



   /*

   ** restore the pixel we overwrote

   */

   if ( lpDDS->GetDC( &hdc ) == DD_OK )

   {

      SetPixel( hdc, 0, 0, tmpCref );

      lpDDS->ReleaseDC( hdc );

   }



   return dw;

}
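
The masking step deserves a word. The read-back grabs a full DWORD, so on a 16-bit surface the high bits belong to the neighboring pixel and have to be thrown away. The sketch below isolates that step; MaskToBitCount is a hypothetical name, and the separate 32-bit case is there because shifting a 32-bit value by 32 is undefined:

```cpp
#include <cstdint>

// Keeps only the low bit_count bits of a pixel value read as a DWORD.
// The 32-bit case is handled separately because shifting a 32-bit
// value by 32 is undefined in C and C++.
std::uint32_t MaskToBitCount( std::uint32_t value, unsigned bit_count )
{
   if ( bit_count >= 32 )
      return value;
   return value & ( ( std::uint32_t( 1 ) << bit_count ) - 1 );
}
```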

The Hell of Surface Management

Direct3D and DirectDraw are organized around the concepts of surfaces, basically chunks of system or display memory that contain data. Some examples of surfaces include the front buffer, back buffer, Z-buffer, and textures.

Because Direct3D applications work within a multi-tasking environment, it is entirely possible (and likely!) that pieces of memory that an application is using are "lost", i.e. used by another application. Because of this a well behaved DirectX application must check to see what surfaces are lost at least once per frame and restore them.
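
The per-frame check could be organized roughly as in the sketch below. FakeSurface is a stand-in so the fragment compiles by itself; the real IDirectDrawSurface exposes IsLost and Restore methods, but they return HRESULTs rather than bools, so treat this purely as an outline of the loop. Also keep in mind that restoring a surface gets its memory back, not its contents, so things like textures would need to be reloaded afterwards:

```cpp
#include <vector>

// Stand-in for a DirectDraw surface; the real IDirectDrawSurface
// reports loss through IsLost() and is recovered with Restore().
struct FakeSurface
{
   bool lost;
   bool IsLost() const { return lost; }
   bool Restore()      { lost = false; return true; }
};

// Restores every lost surface in the list.  Returns the number of
// surfaces restored, or -1 if any restore failed.
int RestoreLostSurfaces( std::vector<FakeSurface *> &surfaces )
{
   int restored = 0;

   for ( FakeSurface *s : surfaces )
   {
      if ( !s->IsLost() )
         continue;
      if ( !s->Restore() )
         return -1;
      ++restored;
   }
   return restored;
}
```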

Checking for Lost Surfaces

Restoring Lost Surfaces

The Hell of Execute Buffers

Communication with Direct3D is almost always facilitated through the use of an execute buffer. An application stores commands (op-codes) and data into the execute buffer and then passes the completed execute buffer to the Direct3D driver. The driver then parses this execute buffer information into actual commands which are then executed by the driver.

Drawing a triangle is probably the most fundamental action you can undertake with any graphics library, and probably the most poorly documented part of Direct3D. The sample code sucks and isn't clear as to what it's doing, and there exists no documentation in the help files on how to build an execute buffer.

Overview of the Execute Buffer

An execute buffer is a region of memory consisting of a stream of vertices followed by commands, all DWORD aligned, that are passed to the Direct3D driver and parsed by it.

The following illustrates the general structure of an execute buffer used for triangle rendering:


  DATA                     LENGTH

+------------------------+----------+ <- lpBufStart (lowest address)

| vertex data            | variable |

+------------------------+----------+ <- lpInsStart

| OP_CODE                | fixed    |

+------------------------+----------+

|   OP_CODE_PARAMS       |(variable)| 

+------------------------+----------+

\        *                     *    \

/   (variable number of op-codes)   /

\        *                     *    \

+------------------------+----------+

| (QWORD UNALIGNER)      | (fixed)  | (conditionally inserted)

+------------------------+----------+

| OP_TRIANGLE_LIST       | fixed    |

+------------------------+----------+

| triangle data          | variable |

+------------------------+----------+

| OP_EXIT                | fixed    |

+------------------------+----------+

The reference format consists of a bunch of vertices, followed by a bunch of op-codes (including triangle information), and closed by an OP_EXIT op-code. Note that other layouts are entirely possible - the important thing is setting the D3DEXECUTEDATA members accordingly when calling IDirect3DExecuteBuffer::SetExecuteData.

Vertex Data

Vertex data is data that describes the position, normal, and/or color information of a vertex. There are three types of vertices within Direct3D - the D3DVERTEX, the D3DLVERTEX, and the D3DTLVERTEX. The D3DVERTEX stores position (model coordinates), normal, and texture coordinate information. The D3DLVERTEX stores position (model coordinates), diffuse color, specular color, and texture coordinate information. The D3DTLVERTEX stores position (screen coordinates), diffuse color, specular color, and texture coordinate information.

Direct3D provides a macro, VERTEX_DATA, to copy data into the execute buffer. Note that while the macro VERTEX_DATA uses a sizeof( D3DVERTEX ) statement, it seems that all of the Direct3D vertex structures are identical in size, so the VERTEX_DATA macro can be used with all three types of Direct3D vertices.

Processing Vertices

After the vertex data you typically put in an instruction to "process vertices", meaning "do whatever you have to do to get this data into something meaningful". The instruction is OP_PROCESS_VERTICES, followed immediately by the PROCESSVERTICES_DATA. The OP_xxx instructions simply insert D3D instructions into the execute buffer. The PROCESSVERTICES_DATA macro actually inserts the op-code parameters. The type of vertex processing you will do depends on the type of vertices you are using:

Vertex         Processing Type
D3DVERTEX      D3DPROCESSVERTICES_TRANSFORMLIGHT
D3DLVERTEX     D3DPROCESSVERTICES_TRANSFORM
D3DTLVERTEX    D3DPROCESSVERTICES_COPY

QWORD unalignment

Triangle data needs to be aligned on QWORD (8-byte) boundaries. Because of this, the instruction for triangle data needs to be unaligned! Code such as this is usually used:


if ( QWORD_ALIGNED( lpPointer ) ) {

   OP_NOP( lpPointer );

}

OP_TRIANGLE_LIST( 1, lpPointer );

!!WARNING!! You must have the braces around the OP_NOP instruction or the compiled code will be incorrect! This is because of a poor implementation of the OP_NOP macro, so do not remove those braces!
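
If the "unalign the instruction so the data is aligned" logic seems backwards, the arithmetic in the following sketch may help. It assumes an instruction header is 4 bytes (which is what sizeof( D3DINSTRUCTION ) should be, but check it yourself), and IsQwordAligned and TriangleDataOffset are hypothetical names working on byte offsets rather than real pointers:

```cpp
#include <cstddef>

const std::size_t kInstructionSize = 4; // assumed sizeof( D3DINSTRUCTION )

// True when a byte offset sits on an 8-byte (QWORD) boundary.
bool IsQwordAligned( std::size_t offset )
{
   return ( offset & 7 ) == 0;
}

// Given the DWORD-aligned offset where the next instruction would go,
// returns the offset at which the triangle data ends up once the
// optional OP_NOP and the OP_TRIANGLE_LIST instruction are written.
std::size_t TriangleDataOffset( std::size_t offset )
{
   if ( IsQwordAligned( offset ) )
      offset += kInstructionSize;   // the OP_NOP padding
   offset += kInstructionSize;      // the OP_TRIANGLE_LIST instruction
   return offset;
}
```

Whether the instruction starts at offset 16 or 20, the triangle data lands at offset 24, which is on a QWORD boundary either way.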

Triangle Data

Triangle data is inserted into the list using an OP_TRIANGLE_LIST instruction followed by N number of triangles. Triangles are specified as three indices into the vertex data at the beginning of the execute buffer. NOTE: it is important that you have your vertices ordered correctly, or your triangles may end up being inadvertently backface culled!
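
When triangles mysteriously vanish, it can help to check their winding directly. The sketch below computes twice the signed area of a triangle from 2D (for example, screen space) coordinates; SignedArea2 is a hypothetical helper. Which sign counts as front facing depends on the cull mode and on the direction of your y axis, so treat this as a debugging aid rather than a statement about Direct3D's defaults:

```cpp
// Twice the signed area of a 2D triangle.  The sign flips when the
// vertex order is reversed, which is what backface culling keys on.
float SignedArea2( float x1, float y1, float x2, float y2,
                   float x3, float y3 )
{
   return ( x2 - x1 ) * ( y3 - y1 ) - ( x3 - x1 ) * ( y2 - y1 );
}
```

If two triangles that should face the same way produce opposite signs, one of them has its indices in the wrong order.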

OP_EXIT

The OP_EXIT instruction tells Direct3D when to stop processing the data and closes out the execute buffer.

Creating an Execute Buffer

Now that we understand the core pieces of an execute buffer, we need to be able to create one. This is fairly easy -- the hardest part is computing how much memory the execute buffer will consume. In order to do this we have to sum up the sizes of all the op-codes and data to be put into the buffer, which is an error prone task.

Note that some drivers support limited size execute buffers. This limit may be on the overall size of the execute buffer or simply on the maximum number of vertices within the execute buffer. The driver description flags for the selected driver will tell you whether there is a limit on the number of bytes or vertices in an execute buffer:


// make sure driver specifies a max buffer size

if ( d_selected_driver.d_desc.dwFlags & D3DDD_MAXBUFFERSIZE )

{

   // if max buffer size == 0 then it's unlimited

   if ( d_selected_driver.d_desc.dwMaxBufferSize )

      d_max_execute_buffer_size = d_selected_driver.d_desc.dwMaxBufferSize;

   else

      d_max_execute_buffer_size = MY_DEFAULT_BUFFER_SIZE;

}

else

{

   d_max_execute_buffer_size = MY_DEFAULT_BUFFER_SIZE;

}

// make sure driver specified a max vertex count

if ( d_selected_driver.d_desc.dwFlags & D3DDD_MAXVERTEXCOUNT )

{

   // if max vertex count == 0 then it's unlimited

   if ( d_selected_driver.d_desc.dwMaxVertexCount )

      d_max_vertex_count = d_selected_driver.d_desc.dwMaxVertexCount;

   else

      d_max_vertex_count = MY_DEFAULT_VERTEX_COUNT;

}

else

{

   d_max_vertex_count = MY_DEFAULT_VERTEX_COUNT;

}

The above should be straightforward - if the driver gives us limits on vertex count or execute buffer size, we use them. If they aren't specified, then we use our defaults and can assume that there are no limits on vertex count or execute buffer size. I find it more straightforward to create an execute buffer once and use it over and over during the course of a program instead of trying to allocate one every single time I need to render something. I don't know if this is a bad thing to do or not, but it does make code less error prone.
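
Since the two blocks above are the same logic twice, they could be folded into one small helper. This is only a sketch, and EffectiveLimit is a made-up name:

```cpp
// Picks the effective limit: the driver's value when it advertises a
// nonzero one, otherwise our own default.
unsigned long EffectiveLimit( bool driver_reports_limit,
                              unsigned long driver_value,
                              unsigned long default_value )
{
   if ( driver_reports_limit && driver_value != 0 )
      return driver_value;
   return default_value;
}
```

The first argument would be the result of testing dwFlags against D3DDD_MAXBUFFERSIZE or D3DDD_MAXVERTEXCOUNT, the second the matching dwMaxBufferSize or dwMaxVertexCount member, and the third MY_DEFAULT_BUFFER_SIZE or MY_DEFAULT_VERTEX_COUNT.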

Computing the Size of an Execute Buffer

Computing the amount of space we need for an execute buffer is a complicated and error prone task. The basic equation is something like:


sizeof( D3DINSTRUCTION) * num_opcodes +

sizeof( D3DVERTEX ) * num_vertices +

sizeof( D3DTRIANGLE ) * num_triangles +

sizeof( all parameters to opcodes );

Needless to say, this is pretty much an ugly mess.

For example, assume that you're rendering 10 independent triangles. This means that you need the following op-codes: OP_EXIT, OP_PROCESS_VERTICES, and OP_TRIANGLE_LIST. You need space for 30 vertices and 10 triangles. You also need space, potentially, for the OP_NOP when forcing QWORD unalignment. Finally, you need space for the D3DPROCESSVERTICES parameter to OP_PROCESS_VERTICES.


size = sizeof( D3DINSTRUCTION ) * 4 + // OP_EXIT, OP_PROCESS_VERTICES, OP_TRIANGLE_LIST, OP_NOP

       sizeof( D3DPROCESSVERTICES ) +

       sizeof( D3DTRIANGLE ) * 10 +

       sizeof( D3DSTATE ) * 0 + 

       sizeof( D3DVERTEX ) * 30;

There's no clean way of automating this, so I'm sorry I can't give you any helper code here. I put the code for the D3DSTATE instructions in there for completeness.
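
For this one fixed layout, though, the sum can at least be wrapped up and sanity checked. The sketch below uses stand-in structures that are laid out the way I understand the Direct3D ones to be, purely so it compiles without the SDK headers; real code should use sizeof on the actual D3DINSTRUCTION, D3DPROCESSVERTICES, D3DTRIANGLE, and D3DVERTEX types. TriangleBufferSize is a hypothetical name:

```cpp
#include <cstddef>
#include <cstdint>

// Stand-ins laid out like the Direct3D structures of the same purpose;
// real code should use sizeof on the types from the SDK headers.
struct Instruction     { std::uint8_t opcode, size; std::uint16_t count; };
struct ProcessVertices { std::uint32_t flags; std::uint16_t start, dest;
                         std::uint32_t count, reserved; };
struct Triangle        { std::uint16_t v1, v2, v3, flags; };
struct Vertex          { float x, y, z, nx, ny, nz, tu, tv; };

// Bytes needed for one vertex block, OP_PROCESS_VERTICES, an optional
// OP_NOP, OP_TRIANGLE_LIST with its triangles, and OP_EXIT.
std::size_t TriangleBufferSize( std::size_t num_vertices,
                                std::size_t num_triangles )
{
   return sizeof( Instruction ) * 4 +
          sizeof( ProcessVertices ) +
          sizeof( Triangle ) * num_triangles +
          sizeof( Vertex ) * num_vertices;
}
```

With these stand-ins, 30 vertices and 10 triangles come to 16 + 16 + 80 + 960 = 1072 bytes. Any extra state or op-codes you add to the buffer have to be added to the sum by hand, which is exactly the error prone part.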

Allocating the Execute Buffer

Now we allocate the execute buffer. The following function allocates an execute buffer of the specified size. Note that it will fail if we try to allocate a buffer larger than the driver (or our software) is designed to deal with.


LPDIRECT3DEXECUTEBUFFER AllocateExecuteBuffer( int size )

{

   D3DEXECUTEBUFFERDESC    debDesc;

   LPDIRECT3DEXECUTEBUFFER exBuf;



   // create a D3DEXECUTEBUFFERDESC

   memset( &debDesc, 0, sizeof( debDesc ) );

   debDesc.dwSize       = sizeof( debDesc );

   debDesc.dwFlags      = D3DDEB_BUFSIZE;

   debDesc.dwBufferSize = size;



   if ( size > d_max_execute_buffer_size ) return 0;



   // create the buffer

   if ( lpD3DDevice->CreateExecuteBuffer( &debDesc, &exBuf, NULL ) != DD_OK )

   {

      return 0;

   }

   return exBuf;

}

Filling the Execute Buffer

Now that we know how to allocate an execute buffer we need to learn how to fill it with instructions. I apologize, once again, for this horribly ugly code - it's not me, I swear it, it's the way Direct3D is designed.

Locking the Execute Buffer

Before we can modify the execute buffer we need to lock it, which means we tell the driver we want to start filling it and it, in turn, promises not to mess with it while we do this. The IDirect3DExecuteBuffer::Lock method performs this function and also returns a pointer to the execute buffer's memory. The following code fragment locks an execute buffer:


D3DEXECUTEBUFFERDESC debDesc;



memset( &debDesc, 0, sizeof( debDesc ) );

debDesc.dwSize       = sizeof( debDesc );

if ( lpExBuf->Lock( &debDesc ) != DD_OK )

   fail();

After the call to IDirect3DExecuteBuffer::Lock the debDesc structure will have its lpData member filled in with the address of the execute buffer memory. We store this address in our three magic pointers (described next):


lpPointer = lpBufStart = lpInsStart = debDesc.lpData;

I've noticed that some applications zero out the execute buffer data with a call to memset, but I don't see this being particularly useful so I don't do this. It can't hurt, so if you want to do this go ahead and use something like:


memset( debDesc.lpData, 0, d_max_execute_buffer_size );

The Three Magic Pointers

When creating execute buffers you need to manage three pointers: the buffer start address (lpBufStart), the instruction start address (lpInsStart), and the current write position (lpPointer), which ends up as the buffer end address. Typically, when you lock down an execute buffer you store its beginning address in lpBufStart. You reference this later when doing pointer arithmetic. lpInsStart is also only set once, at the time you insert your first actual op-code into the execute buffer. Finally, lpPointer is the roaming pointer that tracks the "current location" you are writing into in the execute buffer. When you are done filling the execute buffer, lpPointer will point to the end of the execute buffer instructions and data you've written.

These three pointers are necessary later when computing vertex offsets, instruction offsets, and overall execute buffer size.
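
The pointer arithmetic itself is nothing more than byte differences. Here is a sketch, using a made-up BufferLayout structure and MeasureBuffer function, of the three numbers the pointers give you:

```cpp
#include <cstddef>

// Byte offsets derived from the three magic pointers.
struct BufferLayout
{
   std::size_t instruction_offset; // lpInsStart - lpBufStart
   std::size_t instruction_length; // lpPointer  - lpInsStart
   std::size_t total_size;         // lpPointer  - lpBufStart
};

BufferLayout MeasureBuffer( const void *buf_start, const void *ins_start,
                            const void *pointer )
{
   // cast to char pointers so the subtraction counts bytes
   const char  *start = static_cast<const char *>( buf_start );
   const char  *ins   = static_cast<const char *>( ins_start );
   const char  *end   = static_cast<const char *>( pointer );
   BufferLayout layout;

   layout.instruction_offset = static_cast<std::size_t>( ins - start );
   layout.instruction_length = static_cast<std::size_t>( end - ins );
   layout.total_size         = static_cast<std::size_t>( end - start );
   return layout;
}
```

For instance, if 96 bytes of vertex data are followed by 36 bytes of instructions, the instruction offset is 96, the instruction length is 36, and the buffer uses 132 bytes in total.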

The Direct3D Helper Macros

Filling an execute buffer consists of writing data, op-codes, and op-code parameters into the execute buffer. You can write a set of wrapper macros or functions to fill in an execute buffer, or you can use the ones supplied with the DirectX SDK sample code (/dxsdk/sdk/samples/misc/d3dmacs.h). Be very careful when using the macros in D3DMACS.H, since they are written pretty poorly.
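
If you do write your own wrappers, inline functions sidestep the worst of the macro problems, since a function call is always a single statement no matter how many lines its body has. The following is a minimal sketch of the idea, not a replacement for D3DMACS.H; Instruction is a stand-in laid out like D3DINSTRUCTION, and PutInstruction is a hypothetical name:

```cpp
#include <cstdint>

// Stand-in with the same layout as D3DINSTRUCTION.
struct Instruction
{
   std::uint8_t  opcode;
   std::uint8_t  size;
   std::uint16_t count;
};

// Writes one instruction header at ptr and advances ptr past it.
// Being a function, it is a single statement wherever it is called,
// so it behaves the same with or without braces around it.
inline void PutInstruction( void *&ptr, std::uint8_t opcode,
                            std::uint8_t size, std::uint16_t count )
{
   Instruction *ins = static_cast<Instruction *>( ptr );

   ins->opcode = opcode;
   ins->size   = size;
   ins->count  = count;
   ptr = ins + 1;
}
```

Taking the pointer by reference keeps the "write and advance" behavior of the SDK macros, so calling code would read much the same as before.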

An Example of Filling the Execute Buffer

This example assumes that you have locked your execute buffer and that you have assigned the three magic pointers to point to debDesc.lpData. This code fragment fills an execute buffer with enough data to render a list of triangles that Direct3D will transform and light:


void FillBuffer( D3DVERTEX *vertices, int num_vertices, 

                 D3DTRIANGLE tris[], int num_tris )

{

   int i;



   VERTEX_DATA( vertices, num_vertices, lpPointer );

   lpInsStart = lpPointer;

   OP_PROCESS_VERTICES( 1, lpPointer );

      PROCESSVERTICES_DATA( D3DPROCESSVERTICES_TRANSFORMLIGHT, 0,  num_vertices, lpPointer );

   // triangle data must be QWORD aligned, so we need to make sure

   // that the OP_TRIANGLE_LIST is unaligned!  Note that you MUST have

   // the braces {} around the OP_NOP since the macro in D3DMACS.H will

   // fail if you remove them.

   if ( QWORD_ALIGNED( lpPointer ) ) {

      OP_NOP( lpPointer );

   }

   OP_TRIANGLE_LIST( num_tris, lpPointer );

   for ( i = 0; i < num_tris; i++ ) {

      LPD3DTRIANGLE tri = ( LPD3DTRIANGLE ) lpPointer;



      tri->v1 = tris[i].v1;

      tri->v2 = tris[i].v2;

      tri->v3 = tris[i].v3;

      tri->wFlags = D3DTRIFLAG_EDGEENABLETRIANGLE;

      tri++;



      lpPointer = ( LPVOID ) tri;

   }

   OP_EXIT( lpPointer );

}

To change the above to use D3DTLVERTEXes or D3DLVERTEXes involves simply changing the D3DVERTEX declaration in the function signature and changing the D3DPROCESSVERTICES_TRANSFORMLIGHT to D3DPROCESSVERTICES_COPY and D3DPROCESSVERTICES_TRANSFORM, respectively.

Executing the Execute Buffer

Now that we've filled the execute buffer with meaningful data the next and final step is to execute the actual instructions.

Unlocking the Execute Buffer

The first step to executing the buffer is to unlock it so that the driver knows that it can use it.


lpExBuf->Unlock();

Setting the Execute Buffer Data

Now that the execute buffer has been unlocked we actually have to tell it all about the instructions we just created. This is done via the IDirect3DExecuteBuffer::SetExecuteData method, as follows:


D3DEXECUTEDATA d3dExData;

   memset(&d3dExData, 0, sizeof(D3DEXECUTEDATA));

   d3dExData.dwSize = sizeof(D3DEXECUTEDATA);

   d3dExData.dwVertexCount = num_vertices;

   d3dExData.dwInstructionOffset = (ULONG)((char*)lpInsStart - (char*)lpBufStart);

   d3dExData.dwInstructionLength = (ULONG)((char*)lpPointer - (char*)lpInsStart);

   lpExBuf->SetExecuteData(&d3dExData);

The above code creates a D3DEXECUTEDATA structure, zeros it out, and sets its members appropriately. The dwInstructionOffset member is the offset, in bytes, from the beginning of the execute buffer data to the first instruction. The dwInstructionLength member is the length, in bytes, of the instruction data. It then calls IDirect3DExecuteBuffer::SetExecuteData to set the data.

This leaves us in a state where the execute buffer can actually be, well, executed.

Executing the Buffer

Executing the actual execute buffer is trivial. You simply call IDirect3DDevice::Execute and pass it the execute buffer. Note that calls to IDirect3DDevice::Execute must be bracketed by calls to IDirect3DDevice::BeginScene and IDirect3DDevice::EndScene, i.e. calls to IDirect3DDevice::Execute may not occur outside of an IDirect3DDevice::BeginScene/IDirect3DDevice::EndScene pair.


if ( lpD3DDevice->Execute( lpExBuf, lpViewport, D3DEXECUTE_CLIPPED ) != DD_OK )

   fail();

It is very important to check the return value from Execute, since in all likelihood you will encounter myriad bugs when executing this method. Note that D3DEXECUTE_CLIPPED can only be specified with D3DVERTEXes and D3DLVERTEXes (you can gain a slight performance increase if you specify D3DEXECUTE_UNCLIPPED with D3DVERTEX and D3DLVERTEXes if you know that what you're rendering definitely won't be clipped). With D3DTLVERTEXes you must clip the vertices yourself and specify D3DEXECUTE_UNCLIPPED.

Destroying the Execute Buffer

Deleting an execute buffer is done simply by releasing it:


lpExBuf->Release();

lpExBuf = 0;

Alternatively, you can use the RELEASE macro provided in D3DMACS.H:


RELEASE( lpExBuf );

The Hell of Rendering Polygons

Since all rendering is performed with execute buffers, you should have an understanding of execute buffers before proceeding. Refer to The Hell of Execute Buffers if you have not already done so.

Rendering Independent Triangles

Rendering Triangle Strips

Rendering Triangle Fans

Rendering Convex Polygons

The Hell of State Management

State management consists of the actions an application undertakes to control rendering output. Examples of state management include selecting the current texture, the current shading model, fog mode, fog color, etc. Most types of state are changed infrequently, e.g. the shading model or fog color.

Changing state can be done globally, locally, or both. A global state change is generally used to affect rendering by all drawing routines. A local state change is generally intended to control rendering for a specific set of drawing routines. An example of a global state change may be something like selecting the texture filter mode - this is often something you wish to let the user control for aesthetic and performance reasons. A local state change, on the other hand, would include something such as selecting a texture - objects and meshes often have their own unique texture maps.

Global State Changes

Global state changes are usually accomplished by creating a single execute buffer designed to manage state. This execute buffer will have no vertices, only commands. For example, the following code changes the Z-buffering state by creating a single execute buffer and executing it:


D3DEXECUTEBUFFERDESC    debDesc;

D3DEXECUTEDATA          d3dExData;

LPDIRECT3DEXECUTEBUFFER lpD3DExCmdBuf = NULL;

LPVOID                  lpBuffer, lpInsStart;

size_t                  size = 0;

 

   /*

   ** create an execute buffer of the required size

   */

   size = 0;

   size += sizeof(D3DSTATE) * 25; // bigger than I need, but it doesn't matter

   memset(&debDesc, 0, sizeof(D3DEXECUTEBUFFERDESC));

   debDesc.dwSize = sizeof(D3DEXECUTEBUFFERDESC);

   debDesc.dwFlags = D3DDEB_BUFSIZE;

   debDesc.dwBufferSize = size;



   if ( lpD3DDevice->CreateExecuteBuffer( &debDesc, &lpD3DExCmdBuf, NULL ) != DD_OK )

   {

      fail();

   }

   /*

   ** lock the execute buffer

   */

   if ( lpD3DExCmdBuf->Lock( &debDesc ) != DD_OK )

   {

      fail();

   }



   /*

   ** zero out execute buffer memory

   */

   memset( debDesc.lpData, 0, size );

   

   lpInsStart = debDesc.lpData;

   lpBuffer = lpInsStart;



   /*

   ** set the render state

   */

   OP_STATE_RENDER( 3, lpBuffer);

      STATE_DATA( D3DRENDERSTATE_ZENABLE,      TRUE, lpBuffer );

      STATE_DATA( D3DRENDERSTATE_ZFUNC,        D3DCMP_LESSEQUAL, lpBuffer );

      STATE_DATA( D3DRENDERSTATE_ZWRITEENABLE, TRUE, lpBuffer );

   OP_EXIT( lpBuffer );



   /*

   ** unlock the buffer

   */

   if ( lpD3DExCmdBuf->Unlock() != DD_OK )

   {

      fail();

   }



   /*

   ** set the execute data and execute the buffer

   */

   memset( &d3dExData, 0, sizeof(D3DEXECUTEDATA) );

   d3dExData.dwSize              = sizeof(D3DEXECUTEDATA);

   d3dExData.dwInstructionOffset = (ULONG) 0;

   d3dExData.dwInstructionLength = (ULONG) ( (char*)lpBuffer - (char*)lpInsStart );



   if ( lpD3DExCmdBuf->SetExecuteData( &d3dExData ) != DD_OK )

   {

      fail();

   }



   if ( lpD3DDevice->Execute( lpD3DExCmdBuf, lpViewport, D3DEXECUTE_UNCLIPPED ) != DD_OK )

   {

      fail();

   }

   RELEASE( lpD3DExCmdBuf );

As you can tell, that's a rather huge chunk of code to execute something pretty trivial. So the key here is to try and minimize the number of times you have to create, fill in, and execute state management execute buffers. This is discussed in The Soft/Hard State Mechanism, but the preceding code at least gives you an idea of what it takes to change state.

Local State Changes

A local state change is just like a global state change, but it is usually inserted directly into a normal triangle/vertex display execute buffer as part of the command stream. You would typically do this to control the current texture while rendering polygons, for example.

The Soft/Hard State Mechanism

The mechanism I've adopted to control state management with Direct3D is to use the soft/hard paradigm. With this approach state changes are considered soft, i.e. they are not immediately acted upon. When the application is positive it needs the state to be up to date, it hardens or locks the current state.

So, the soft state is the current state that the application is assuming is in existence (but cannot necessarily count on), and the hard state is the actual state that Direct3D is under. A transition from soft to hard state is usually done with some type of explicit locking command that the application calls only when about to render.

Why all this fuss? As I've demonstrated, state changes with Direct3D are cumbersome and expensive. So to get around this we can create a single global state execute buffer that is updated and executed only when the state needs to be hardened. This minimizes expensive state execute buffer creation/fill/execution/release cycles.

So to implement the soft/hard mechanism a set of functions is created to manage soft state. For example, we could have a global structure containing the Direct3D state we think is relevant. Then we would have individual functions that control the Z-buffer function, for example:


void APISetZFunc( D3DCMPFUNC cmp )

{

   app_state.d_zfunc = cmp;

}

With the soft/hard mechanism an application can freely change the state as it sees fit, without worrying about the performance hit of changing one piece of state. The only time an application needs to worry about a potential performance hit is when the state is hardened, usually right before a rendering action.

Once the soft state management functions are created, we simply need a function that takes the global application state and inserts it into the execute buffer, effectively hardening the state. This would look almost identical to the example of global state changes, the major difference being that more state variables would be updated, and the source of their values would be from the global application state. For example:


/*

** set the render state

*/

OP_STATE_RENDER( NUM_STATE_VARIABLES, lpBuffer);

   STATE_DATA( D3DRENDERSTATE_ZENABLE,      app_state.d_zenable, lpBuffer );

   STATE_DATA( D3DRENDERSTATE_ZFUNC,        app_state.d_zfunc, lpBuffer );

   STATE_DATA( D3DRENDERSTATE_ZWRITEENABLE, app_state.d_zwriteenable, lpBuffer );

   // etc. etc.

OP_EXIT( lpBuffer );
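
One way to make sure hardening only costs something when it has to is a dirty flag. This is a sketch of that idea only: the d_dirty member, harden_count, and APIHardenState are additions for illustration, the state members are plain ints so the fragment compiles without the SDK, and the spot where a real version would fill and execute the state execute buffer is just a comment:

```cpp
// Stand-in for the pieces of Direct3D state the application tracks.
struct AppState
{
   int  d_zenable;
   int  d_zfunc;
   int  d_zwriteenable;
   bool d_dirty;       // set whenever soft state changes
};

AppState app_state    = { 1, 0, 1, true };
int      harden_count = 0; // how many times state was really sent

void APISetZFunc( int cmp )
{
   if ( app_state.d_zfunc != cmp )
   {
      app_state.d_zfunc = cmp;
      app_state.d_dirty = true;
   }
}

// Hardens the state only if something changed since the last call.
// A real version would fill and execute the state execute buffer here.
void APIHardenState( void )
{
   if ( !app_state.d_dirty )
      return;
   ++harden_count;
   app_state.d_dirty = false;
}
```

With this in place an application can call APISetZFunc as often as it likes, and calling APIHardenState before every draw costs nothing unless the soft state actually changed.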

The Hell of Transformation Matrices

Direct3D maintains three separate transformation matrices. These transformation matrices are used whenever you specify D3DPROCESSVERTICES_TRANSFORM or D3DPROCESSVERTICES_TRANSFORMLIGHT during an IDirect3DDevice::Execute call.

The three matrices are the modelworld, worldview, and projection matrices. The modelworld matrix controls the transformation from model or object coordinates into the world coordinate system. The worldview matrix controls the transformation from world coordinates into viewer or eye coordinates. Finally, the projection matrix transforms view coordinates into projected (homogeneous clip) coordinates, which are in turn run through a viewport transformation that scales them into screen coordinates.

Creating a Matrix

Matrices are created using the IDirect3DDevice::CreateMatrix method. When a matrix is created the application is returned a matrix handle, represented by the D3DMATRIXHANDLE type.


D3DMATRIXHANDLE hMatrix;

if ( lpD3DDevice->CreateMatrix( &hMatrix ) != DD_OK )
   fail();

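This handle is simply an opaque reference to a matrix that Direct3D stores internally. You pass it back to Direct3D whenever you want to set, use, or delete that matrix.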

Setting a Matrix

Once you've allocated a matrix you need to set its contents. This is done by filling in a D3DMATRIX structure then calling IDirect3DDevice::SetMatrix. The D3DMATRIX structure is a little less than intuitive - each matrix element is explicitly named, as opposed to using a single or double dimension array of floats.


   D3DMATRIX m;

   // the following code sets "m" to an identity matrix
   m._11 = 1.0F;
   m._12 = 0.0F;
   m._13 = 0.0F;
   m._14 = 0.0F;
   m._21 = 0.0F;
   m._22 = 1.0F;
   m._23 = 0.0F;
   m._24 = 0.0F;
   m._31 = 0.0F;
   m._32 = 0.0F;
   m._33 = 1.0F;
   m._34 = 0.0F;
   m._41 = 0.0F;
   m._42 = 0.0F;
   m._43 = 0.0F;
   m._44 = 1.0F;

   if ( lpD3DDevice->SetMatrix( hMatrix, &m ) != DD_OK )
      fail();
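Filling in sixteen named fields by hand gets old fast, so a couple of small helpers are worth having. The following is only a sketch: Matrix4 is a stand-in with the same field names as D3DMATRIX so the code builds without the DirectX headers, and MatrixIdentity and MatrixTranslation are hypothetical helper names, not part of Direct3D. Because Direct3D uses row vectors, the translation goes in the fourth row (_41, _42, _43).

```cpp
#include <cstring>

// Stand-in with the same field names and ordering as D3DMATRIX.
struct Matrix4 {
    float _11, _12, _13, _14;
    float _21, _22, _23, _24;
    float _31, _32, _33, _34;
    float _41, _42, _43, _44;
};

// Sets *m to the identity matrix.
void MatrixIdentity( Matrix4 *m )
{
    // All-zero bytes are 0.0F on IEEE floating point platforms.
    std::memset( m, 0, sizeof( *m ) );
    m->_11 = m->_22 = m->_33 = m->_44 = 1.0F;
}

// Sets *m to a translation by (tx, ty, tz) for the row vector convention.
void MatrixTranslation( Matrix4 *m, float tx, float ty, float tz )
{
    MatrixIdentity( m );
    m->_41 = tx;
    m->_42 = ty;
    m->_43 = tz;
}
```

In real code you would use D3DMATRIX directly and pass the result to IDirect3DDevice::SetMatrix as shown above.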

Setting Global Matrices

The three global matrices (modelworld, worldview, and projection) are set by creating a state execute buffer with the appropriate op-codes. The bare minimum entries you need are the following:


OP_STATE_TRANSFORM( 3, lpBuffer );
   STATE_DATA( D3DTRANSFORMSTATE_WORLD,      hModelWorldMatrix, lpBuffer );
   STATE_DATA( D3DTRANSFORMSTATE_VIEW,       hWorldViewMatrix,  lpBuffer );
   STATE_DATA( D3DTRANSFORMSTATE_PROJECTION, hProjectionMatrix, lpBuffer );

Note that unless you need multiple matrices simultaneously to represent your transformation, you can just set the three transformations once and leave them alone. Since the transform state refers to each matrix by its handle, the transformations will automatically be updated whenever you call IDirect3DDevice::SetMatrix on that handle.

For more information on state management, refer to The Hell of State Management.

Deleting a Matrix

When a matrix is no longer in use, you should delete it with IDirect3DDevice::DeleteMatrix.


if ( hMatrix )
{
   lpD3DDevice->DeleteMatrix( hMatrix );
   hMatrix = NULL;
}

Inserting Matrices Directly into Execute Buffers

It is also possible to insert transformation matrices directly into an execute buffer. However, I have not personally needed this ability and thus can't discuss it authoritatively.

Direct3D Matrices Explained

Direct3D's matrix and vector transformation system works with 4x4 matrices and 1x4 (implicit w of 1.0) vectors. Direct3D's coordinate system is left-handed, and by default Direct3D assumes that you are using row vectors. Also, the D3DMATRIX structure is row major.

This is contrary to OpenGL's system, where the OpenGL matrix is column major (matrices are represented as single dimensional arrays), the OpenGL coordinate system is right-handed, and the OpenGL transformation system assumes column vectors.
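The practical consequences of these conventions can be sketched in a few lines. Everything below is illustrative: Matrix4 (stored as m[row][col], so m[0][0] corresponds to _11 and m[3][0] to _41), Vec4, Transform, Multiply, and FromOpenGLArray are hypothetical stand-ins, not Direct3D or OpenGL calls. With row vectors a point is transformed as v * M, so concatenated transforms apply left to right. One handy side effect of the two conventions is that an OpenGL-style column-major array for column vectors has the same memory order as a row-major matrix for row vectors, so the sixteen floats can be copied straight across. Note that this only covers storage layout; it does nothing about the left-handed versus right-handed difference.

```cpp
// Stand-in for D3DMATRIX, stored row major as m[row][col].
struct Matrix4 { float m[4][4]; };

struct Vec4 { float x, y, z, w; };

// Row vector times matrix: v' = v * a
Vec4 Transform( const Vec4 &v, const Matrix4 &a )
{
    Vec4 r;
    r.x = v.x * a.m[0][0] + v.y * a.m[1][0] + v.z * a.m[2][0] + v.w * a.m[3][0];
    r.y = v.x * a.m[0][1] + v.y * a.m[1][1] + v.z * a.m[2][1] + v.w * a.m[3][1];
    r.z = v.x * a.m[0][2] + v.y * a.m[1][2] + v.z * a.m[2][2] + v.w * a.m[3][2];
    r.w = v.x * a.m[0][3] + v.y * a.m[1][3] + v.z * a.m[2][3] + v.w * a.m[3][3];
    return r;
}

// Returns a * b. With row vectors, v * (a * b) applies a first, then b.
Matrix4 Multiply( const Matrix4 &a, const Matrix4 &b )
{
    Matrix4 r;
    for ( int i = 0; i < 4; i++ )
        for ( int j = 0; j < 4; j++ ) {
            r.m[i][j] = 0.0F;
            for ( int k = 0; k < 4; k++ )
                r.m[i][j] += a.m[i][k] * b.m[k][j];
        }
    return r;
}

// Copies an OpenGL-style float[16] (column major, column vectors) into the
// row-major, row-vector layout. The memory order is identical, so this is a
// plain element-by-element copy. Handedness is NOT converted.
Matrix4 FromOpenGLArray( const float gl[16] )
{
    Matrix4 r;
    for ( int row = 0; row < 4; row++ )
        for ( int col = 0; col < 4; col++ )
            r.m[row][col] = gl[row * 4 + col];
    return r;
}
```

For example, an OpenGL translation keeps its offsets in elements 12, 13, and 14 of the array, which land in the fourth row of the row-major matrix, exactly where a row vector transform expects them.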

I still haven't quite figured out how Direct3D's projection matrix system works, so I just use the template found in the DirectX documentation.
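For reference, here is a sketch of the commonly used left-handed perspective projection for row vectors, written from memory in the general shape of the documentation template, so check it against your SDK before relying on it. Matrix4 is a stand-in for D3DMATRIX, and ProjectionMatrix and ProjectedDepth are hypothetical names used only for this illustration. The matrix scales x and y by the cotangent of half the field of view, copies the view-space z into w (the _34 = 1 entry), and maps z so that after the divide by w the near plane lands at 0 and the far plane at 1.

```cpp
#include <cmath>

// Stand-in with the same field names and ordering as D3DMATRIX.
struct Matrix4 {
    float _11, _12, _13, _14;
    float _21, _22, _23, _24;
    float _31, _32, _33, _34;
    float _41, _42, _43, _44;
};

// Builds a left-handed perspective projection for row vectors.
// Field of view angles are in radians.
Matrix4 ProjectionMatrix( float near_plane, float far_plane,
                          float fov_horiz, float fov_vert )
{
    float w = std::cos( fov_horiz * 0.5F ) / std::sin( fov_horiz * 0.5F );
    float h = std::cos( fov_vert * 0.5F ) / std::sin( fov_vert * 0.5F );
    float Q = far_plane / ( far_plane - near_plane );

    Matrix4 m = {};          // start with all zeros
    m._11 = w;
    m._22 = h;
    m._33 = Q;
    m._34 = 1.0F;            // w' = z, which drives the perspective divide
    m._43 = -Q * near_plane;
    return m;
}

// Depth after projection and divide for a point at view-space depth z:
// ( z * _33 + _43 ) / ( z * _34 + _44 )
float ProjectedDepth( const Matrix4 &m, float z )
{
    return ( z * m._33 + m._43 ) / ( z * m._34 + m._44 );
}
```

Working the depth through by hand, z' = Q * (1 - near/z), which is 0 at the near plane and 1 at the far plane.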

The Hell of Texture Mapping

The Hell of Debugging

Debugging Direct3D applications is about as much fun as chewing on shards of broken glass. So in an effort to make the glass a bit more palatable, this section shows some basic rules that help when trying to debug Direct3D applications.

Rule #1: Check Your Return Values

This is the most important rule of DirectX programming. Check return values from every single DirectX call you make. Even the most innocuous calls can fail! Some drivers, for example, simply don't work in some video modes (1152x864 is a common enough one where things fail for no apparent reason). Checking for return codes and looking up those codes can save you a lot of time spent debugging.
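Checking every call is a lot less painful with a small wrapper. The sketch below is one hypothetical way to do it; HResultLike is a stand-in for HRESULT so the code builds without the Windows headers, and CheckDX and DX_CHECK are made-up names. It treats any negative value as a failure, which is the same test the FAILED macro performs. That is slightly looser than comparing against DD_OK, so use a direct comparison if you care about non-zero success codes.

```cpp
#include <cstdio>

// Stand-in for HRESULT (a 32-bit signed value; negative means failure).
typedef int HResultLike;

// Logs the failing call with its location. Returns true on success.
bool CheckDX( HResultLike hr, const char *call, const char *file, int line )
{
    if ( hr >= 0 )
        return true;

    std::fprintf( stderr, "%s(%d): %s failed, hr = 0x%08X\n",
                  file, line, call, static_cast<unsigned int>( hr ) );
    return false;
}

// Wraps a call so the source text, file, and line are captured automatically.
#define DX_CHECK( call ) CheckDX( ( call ), #call, __FILE__, __LINE__ )
```

Usage would look something like if ( !DX_CHECK( lpD3DDevice->CreateMatrix( &hMatrix ) ) ) fail();, with the logged hex value being what you look up in the headers.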

Rule #2: Use the Debug Libraries

Related to Rule 1, you should use the debug versions of the DirectX libraries for one major reason: they print debugging spew to your system debugger. This means that if you are using Microsoft Visual C++'s built-in debugger, DirectX will print status and debug info to the Output window. This is handier than you can imagine. Instead of you having to look up error codes by hand, they will just be printed, right there for you to read and understand. Obviously, when doing performance tuning or measurement, you'll want to use the release versions of the libraries. I have a batch file that toggles between debug and release versions of the libraries.

Rule #3: Test on Every Piece of Hardware You Can Find

Unlike Silicon Graphics' OpenGL®, just because your Direct3D program works on one particular device doesn't mean it will work on any other device. So you have to test your software on every imaginable type of Direct3D hardware available.

Rule #4: Use OutputDebugString to Help with Debugging

OutputDebugString is the coolest Windows API function in the world. Basically this function spits a string out to your system debugger. If you are using MSVC++ then it will print a string to the output window in your IDE (View|Output). The reason that this is handy is that you can now dump your "checkpoints" somewhere you can inspect after your program barfs (assuming it doesn't lock up your system). The old style method of printf( "Got here!\n" ) works great using OutputDebugString. Other methods that work as well include dumping data to a monochrome monitor ((char *)0xb0000 actually maps to monochrome graphics space with both Watcom and Microsoft under Win95!) or to a console that you create with AllocConsole and write to with WriteFile, or, my new favorite, calling MessageBox when a message is vital for debugging.
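Since OutputDebugString only takes a plain string, a printf-style wrapper makes the "Got here!" approach much more useful. This is a minimal sketch, with DebugPrintf being a hypothetical name. It assumes a compiler that provides vsnprintf (older compilers may only offer a variant such as _vsnprintf), and it falls back to stderr when not building for Windows so the same code can be tried anywhere.

```cpp
#include <cstdarg>
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Formats a message and sends it to the debugger output (or stderr when not
// on Windows). Returns the length of the formatted message as reported by
// vsnprintf; output longer than the buffer is truncated.
int DebugPrintf( const char *fmt, ... )
{
    char buf[512];

    va_list args;
    va_start( args, fmt );
    int n = std::vsnprintf( buf, sizeof( buf ), fmt, args );
    va_end( args );

#ifdef _WIN32
    OutputDebugStringA( buf );
#else
    std::fputs( buf, stderr );
#endif
    return n;
}
```

A checkpoint then becomes something like DebugPrintf( "Got here: pass %d\n", pass );, which shows up in the output window even after the program has died.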

Rule #5: Support Emulation and Windowed Rendering

Debugging Direct3D is a nightmare. So coming up with ways of dealing with debugging Direct3D is the way to go. The first step is to allow your program to optionally render in a window - this makes stepping through your code with a debugger a lot easier. The second step is to allow your program to enable "emulation only", meaning that you turn off hardware acceleration even when available and that you use system memory surfaces whenever possible. This prevents the Win16 lock (see Rule 6) from hanging your system.

Rule #6: Remember that Locks Can Hose Your Debugger

As stated in Rule 5, there is a significant chance that when you perform a buffer or surface lock of some type you will inadvertently cause a "Win16 lock". A Win16 lock will often hang your debugger, and can lead you to think that there is a bug in your code when, in fact, your code is just fine. To prevent locks from hanging your system, try to use emulation and/or system memory surfaces when debugging (cf. Rule 5). And never set a debugger breakpoint between a Lock/Unlock pair!

Rule #7: Verify Your App Against Lots of Hardware

Since the Direct3D pixel rendering specification is, well, nonexistent, it's important that you test your application with every D3D accelerator you can get your hands on. And test it often. Stick as many Direct3D boards in your system as you can (I use a Rendition Verite and 3Dfx Voodoo graphics simultaneously). If you can, support the D3D software emulation driver, but in most cases this isn't feasible since it's lacking a lot of fundamental functionality.

Rule #8: Use Remote Debugging if Possible

Remote debugging is facilitated by connecting one computer to another and running the program on one and the debugger on the other. This has a lot of advantages, such as the ability to debug through locks/unlocks and the ability to remain stable after a crash or trashed memory. The downside is that you need two systems ($$$) and performance is pretty damn slow.

Conclusion

Well, this is what little I have to contribute to the world of DirectX programming. If you have any information you'd like to share, corrections, experiences, whatever, please write me at bwh@wksoftware.com and share your grief.

