First Frame: Clear and Present

Write the first real render loop: transition the back buffer with a resource barrier, clear it to a colour, submit to the GPU, present, and wait with a fence. The window turns solid blue.


Every object is in place. We have a window, a DXGI factory, a GPU adapter, a D3D12 device, a command queue, a swap chain, render target views, allocators, and a command list. This chapter connects them into a running render loop.

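By the end of this chapter you will have a window that the GPU clears to a solid colour every frame, driven by a loop that records, submits, presents, and waits.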

Nothing is drawn. The purpose of this chapter is the loop itself — recording, submitting, presenting, and waiting. Everything that comes later (geometry, shaders, textures) plugs into this foundation.

Resource States and Barriers

Every GPU resource in DX12 exists in a state. The state tells the driver how the resource is currently being used and what memory layout it is in. Using a resource in a state it was not transitioned to is undefined behaviour — on some hardware it produces garbage pixels, on others a GPU crash.

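The two states we care about today are D3D12_RESOURCE_STATE_PRESENT, in which the swap chain can display the back buffer, and D3D12_RESOURCE_STATE_RENDER_TARGET, in which the GPU can write to it as a render target.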

Before clearing the back buffer you must transition it from PRESENT to RENDER_TARGET. Before calling Present you must transition it back.

A barrier is recorded as a command in the command list:

D3D12_RESOURCE_BARRIER barrier = {};
barrier.Type                   = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
barrier.Flags                  = D3D12_RESOURCE_BARRIER_FLAG_NONE;
barrier.Transition.pResource   = backBuffer;
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_PRESENT;
barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_RENDER_TARGET;
barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;

commandList->ResourceBarrier(1, &barrier);

The reverse barrier uses the same structure with StateBefore and StateAfter swapped:

barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_PRESENT;

commandList->ResourceBarrier(1, &barrier);

Forgetting one of the barriers is the most common beginner error. With the debug layer enabled, it is reported immediately with a message about an invalid resource state.
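To see why both barriers matter, it can help to model the bookkeeping that state validation performs. The sketch below is a CPU-only illustration and not D3D12 code: `State`, `TrackedResource`, `Transition`, `CanClear`, and `CanPresent` are hypothetical names, and the real runtime tracks far more than a single enum. It only shows that a transition is accepted when its declared "before" state matches the state the resource is actually in.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for D3D12_RESOURCE_STATE_PRESENT / _RENDER_TARGET.
enum class State { Present, RenderTarget };

// Hypothetical model of one back buffer and the state it is currently in.
struct TrackedResource
{
    State current = State::Present; // swap chain buffers start out presentable
};

// Mimics the check behind a transition barrier: the declared "before" state
// must match the state the resource is really in.
inline void Transition(TrackedResource& res, State before, State after)
{
    if (res.current != before)
        throw std::logic_error("StateBefore does not match the current state");
    res.current = after;
}

// Clearing is only legal while the resource is a render target.
inline bool CanClear(const TrackedResource& res)
{
    return res.current == State::RenderTarget;
}

// Presenting is only legal once the resource is back in the present state.
inline bool CanPresent(const TrackedResource& res)
{
    return res.current == State::Present;
}
```

In this model, skipping the second barrier leaves the buffer in the render target state, so `CanPresent` stays false. Repeating the first barrier throws, because the declared "before" state no longer matches.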

The Frame Sequence

One frame follows this sequence:

1. Ask the swap chain which back buffer to render to this frame.
2. Reset the allocator and command list for that frame index.
3. Barrier: PRESENT → RENDER_TARGET
4. Set the render target.
5. Clear the render target.
6. Barrier: RENDER_TARGET → PRESENT
7. Close the command list.
8. Submit: ExecuteCommandLists.
9. Present.
10. Signal the fence and wait for the GPU.

Steps 3–6 are where rendering will eventually go. Right now steps 4 and 5 are the entire visual output.

Fences

A fence is a synchronisation primitive. You ask the GPU to signal it at a particular value after it finishes a batch of work, and the CPU waits for that value before touching anything the GPU was using.

Create the fence before the loop:

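Microsoft::WRL::ComPtr<ID3D12Fence> fence;
UINT64                              fenceValue = 0;
// Auto-reset event, initially unsignalled. CreateEvent returns nullptr on failure.
HANDLE                              fenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);

// The fence starts at 0, so the first value signalled will be 1.
HRESULT hr = d3d.Device()->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));
if (fenceEvent == nullptr || FAILED(hr))
    return -1; // assumes this runs inside wWinMain, like the other startup checks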

After each submit and present, increment the fence value, signal the queue, and wait:

++fenceValue;
queue.Get()->Signal(fence.Get(), fenceValue);

if (fence->GetCompletedValue() < fenceValue)
{
    fence->SetEventOnCompletion(fenceValue, fenceEvent);
    WaitForSingleObject(fenceEvent, INFINITE);
}

Signal is a GPU-side command. The queue executes it after all previously submitted work finishes, then writes fenceValue into the fence. GetCompletedValue reads the fence from the CPU. If it is already at fenceValue or higher, the GPU is done and we skip the wait entirely.

SetEventOnCompletion registers an OS event that is signalled when the fence reaches the target value. WaitForSingleObject blocks the CPU thread until that event is signalled.
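The ordering guarantee behind Signal can be illustrated with a small CPU-only model. Nothing here calls D3D12: `ModelFence`, `ModelQueue`, `Execute`, and `Step` are hypothetical names, and a real queue runs asynchronously on the GPU rather than being stepped by hand. The point is only that the fence value changes when the queue reaches the signal, not when the CPU enqueues it.

```cpp
#include <cstdint>
#include <deque>

// Hypothetical model of a fence: a single value that only moves forward.
struct ModelFence
{
    std::uint64_t completed = 0;
    std::uint64_t GetCompletedValue() const { return completed; }
};

// Hypothetical model of a command queue that processes entries in FIFO order.
class ModelQueue
{
public:
    // Stands in for ExecuteCommandLists: enqueue one batch of work.
    void Execute() { m_items.push_back({ nullptr, 0 }); }

    // Stands in for ID3D12CommandQueue::Signal: enqueue a fence write.
    void Signal(ModelFence* fence, std::uint64_t value)
    {
        m_items.push_back({ fence, value });
    }

    // Process the oldest entry, as the GPU would. Returns false when idle.
    bool Step()
    {
        if (m_items.empty())
            return false;
        Item item = m_items.front();
        m_items.pop_front();
        if (item.fence != nullptr)
            item.fence->completed = item.value; // the signal has been reached
        return true;
    }

private:
    struct Item
    {
        ModelFence*   fence; // nullptr means "ordinary work"
        std::uint64_t value;
    };
    std::deque<Item> m_items;
};
```

After `Execute()` followed by `Signal(&fence, 1)`, the model fence still reads 0. It only reads 1 once `Step()` has consumed both the work entry and the signal entry, which is the behaviour the CPU-side wait relies on.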

This approach waits for the GPU after every single frame. It is simple and provably safe — the GPU always finishes before the CPU starts the next frame. It is not efficient; it leaves both the GPU and CPU idle for a portion of every frame while the other side drains. That is a fine starting point. The standard improvement (tracking per-frame fence values and only waiting when you are about to overwrite an in-flight frame) is a natural next step once this is working.
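The per-frame improvement mentioned above comes down to a small piece of bookkeeping, sketched here without any D3D12 calls. `FrameFenceTracker`, `kFrameCount`, and the method names are hypothetical, and in a real loop the completed value would come from `fence->GetCompletedValue()`. This is one common way to structure it under those assumptions, not the only one.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed number of frames in flight; typically the swap chain buffer count.
constexpr std::size_t kFrameCount = 2;

// Hypothetical helper: remembers which fence value marks the end of each frame.
class FrameFenceTracker
{
public:
    // Call right after submitting a frame. Returns the value to pass to Signal.
    std::uint64_t MarkSubmitted(std::size_t frameIndex)
    {
        m_frameValues[frameIndex] = ++m_nextValue;
        return m_frameValues[frameIndex];
    }

    // Call before reusing a frame's allocator. `completedValue` is what the
    // fence currently reports. Returns true if the CPU has to block first.
    bool MustWait(std::size_t frameIndex, std::uint64_t completedValue) const
    {
        return completedValue < m_frameValues[frameIndex];
    }

    // The fence value to wait for when MustWait returns true.
    std::uint64_t ValueFor(std::size_t frameIndex) const
    {
        return m_frameValues[frameIndex];
    }

private:
    std::array<std::uint64_t, kFrameCount> m_frameValues{}; // 0 = never submitted
    std::uint64_t                          m_nextValue = 0;
};
```

With two frames in flight, the CPU blocks on frame 0 only if the GPU has not yet passed the value recorded the last time frame 0 was submitted. That lets it record frame 1 while frame 0 is still executing.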

Putting the Loop Together

Here is the complete main.cpp with everything wired:

#include "Window.h"
#include "DXGIContext.h"
#include "D3DDevice.h"
#include "CommandQueue.h"
#include "SwapChain.h"
#include "RTVHeap.h"
#include "FrameResources.h"

#include <d3d12.h>
#include <wrl/client.h>

static void RenderFrame(
    const SwapChain&    swapChain,
    const RTVHeap&      rtvHeap,
    FrameResources&     frames,
    ID3D12CommandQueue* queue,
    IDXGISwapChain3*    dxgiSwapChain,
    ID3D12Fence*        fence,
    HANDLE              fenceEvent,
    UINT64&             fenceValue)
{
    UINT frameIndex = swapChain.CurrentBackBufferIndex();

    frames.ResetFor(frameIndex);
    auto* cmd = frames.CommandList();

    // Transition back buffer: PRESENT → RENDER_TARGET
    ID3D12Resource* backBuffer = swapChain.BackBuffer(frameIndex);

    D3D12_RESOURCE_BARRIER toRT = {};
    toRT.Type                   = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    toRT.Flags                  = D3D12_RESOURCE_BARRIER_FLAG_NONE;
    toRT.Transition.pResource   = backBuffer;
    toRT.Transition.StateBefore = D3D12_RESOURCE_STATE_PRESENT;
    toRT.Transition.StateAfter  = D3D12_RESOURCE_STATE_RENDER_TARGET;
    toRT.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;

    cmd->ResourceBarrier(1, &toRT);

    // Clear
    D3D12_CPU_DESCRIPTOR_HANDLE rtv = rtvHeap.HandleFor(frameIndex);
    const float clearColor[] = { 0.1f, 0.1f, 0.4f, 1.0f };

    cmd->OMSetRenderTargets(1, &rtv, FALSE, nullptr);
    cmd->ClearRenderTargetView(rtv, clearColor, 0, nullptr);

    // Transition back buffer: RENDER_TARGET → PRESENT
    D3D12_RESOURCE_BARRIER toPresent = toRT;
    toPresent.Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
    toPresent.Transition.StateAfter  = D3D12_RESOURCE_STATE_PRESENT;

    cmd->ResourceBarrier(1, &toPresent);

    // Submit
    cmd->Close();
    ID3D12CommandList* lists[] = { cmd };
    queue->ExecuteCommandLists(1, lists);

    // Present
    UINT presentFlags = swapChain.TearingSupported() ? DXGI_PRESENT_ALLOW_TEARING : 0;
    UINT syncInterval = swapChain.TearingSupported() ? 0 : 1;
    dxgiSwapChain->Present(syncInterval, presentFlags);

    // Wait for GPU
    ++fenceValue;
    queue->Signal(fence, fenceValue);
    if (fence->GetCompletedValue() < fenceValue)
    {
        fence->SetEventOnCompletion(fenceValue, fenceEvent);
        WaitForSingleObject(fenceEvent, INFINITE);
    }
}

int WINAPI wWinMain(
    HINSTANCE hInstance,
    HINSTANCE,
    PWSTR,
    int nCmdShow)
{
    Window window;
    if (!window.Create(hInstance, 1280, 720, nCmdShow))
    {
        MessageBoxW(nullptr, L"Window creation failed.", L"Startup Error", MB_ICONERROR);
        return -1;
    }

    DXGIContext dxgi;
    if (!dxgi.Init(true))
    {
        MessageBoxW(nullptr, L"No suitable GPU found.", L"Startup Error", MB_ICONERROR);
        return -1;
    }

    D3DDevice d3d;
    if (!d3d.Init(dxgi.Adapter(), true))
    {
        MessageBoxW(nullptr, L"D3D12 device creation failed.", L"Startup Error", MB_ICONERROR);
        return -1;
    }

    CommandQueue queue;
    if (!queue.Init(d3d.Device()))
    {
        MessageBoxW(nullptr, L"Command queue creation failed.", L"Startup Error", MB_ICONERROR);
        return -1;
    }

    SwapChain swapChain;
    if (!swapChain.Init(dxgi.Factory(), queue.Get(), window.Handle(),
                        window.Width(), window.Height()))
    {
        MessageBoxW(nullptr, L"Swap chain creation failed.", L"Startup Error", MB_ICONERROR);
        return -1;
    }

    RTVHeap rtvHeap;
    if (!rtvHeap.Init(d3d.Device(), swapChain))
    {
        MessageBoxW(nullptr, L"RTV heap creation failed.", L"Startup Error", MB_ICONERROR);
        return -1;
    }

    FrameResources frames;
    if (!frames.Init(d3d.Device()))
    {
        MessageBoxW(nullptr, L"Frame resources creation failed.", L"Startup Error", MB_ICONERROR);
        return -1;
    }

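    Microsoft::WRL::ComPtr<ID3D12Fence> fence;
    UINT64 fenceValue = 0;
    HANDLE fenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);
    HRESULT fenceHr = d3d.Device()->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));
    if (fenceEvent == nullptr || FAILED(fenceHr))
    {
        MessageBoxW(nullptr, L"Fence creation failed.", L"Startup Error", MB_ICONERROR);
        return -1;
    }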

    MSG msg = {};

    while (msg.message != WM_QUIT)
    {
        if (PeekMessageW(&msg, nullptr, 0, 0, PM_REMOVE))
        {
            TranslateMessage(&msg);
            DispatchMessageW(&msg);
        }
        else
        {
            RenderFrame(
                swapChain,
                rtvHeap,
                frames,
                queue.Get(),
                swapChain.Get(),
                fence.Get(),
                fenceEvent,
                fenceValue);
        }
    }

    // Drain the GPU before destroying objects
    ++fenceValue;
    queue.Get()->Signal(fence.Get(), fenceValue);
    fence->SetEventOnCompletion(fenceValue, fenceEvent);
    WaitForSingleObject(fenceEvent, INFINITE);

    CloseHandle(fenceEvent);

    return static_cast<int>(msg.wParam);
}

When you build and run this, the window should fill with a solid dark blue. The title bar and borders belong to Windows; the solid colour is the GPU clearing the back buffer every frame.

The Drain at Shutdown

Notice the wait after the message loop exits:

++fenceValue;
queue.Get()->Signal(fence.Get(), fenceValue);
fence->SetEventOnCompletion(fenceValue, fenceEvent);
WaitForSingleObject(fenceEvent, INFINITE);

When wWinMain returns, C++ destroys the local objects — the swap chain, the heap, the device. If the GPU is still executing commands that reference those objects when the destructor runs, you get a crash or silent corruption. Waiting here ensures the GPU has drained completely before anything is destroyed.

Common Mistakes

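If the window appears black instead of blue, check that the RTV handle passed to ClearRenderTargetView belongs to the current back buffer and that the command list is closed and executed before Present. ClearRenderTargetView takes the descriptor handle directly, so the clear itself does not depend on OMSetRenderTargets, but every draw call added later will.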

If the debug layer reports “resource is in the wrong state”, you have a missing barrier. Check that you have both the PRESENT → RENDER_TARGET barrier before the clear and the RENDER_TARGET → PRESENT barrier before Present.

If the application crashes or fails on allocator->Reset(), or the debug layer reports that the allocator is still in use, the fence wait is not working. Add an OutputDebugStringW after the fence wait to confirm it is reached, and check that Signal is called on the queue before SetEventOnCompletion.

If Present returns DXGI_ERROR_DEVICE_REMOVED, the GPU device has been lost, usually because of an invalid command in an earlier frame or a driver fault. Call GetDeviceRemovedReason on the device and check the debug output for the underlying cause.

If the frame rate is lower than expected and sits at the display refresh rate, you are likely presenting with syncInterval = 1 (V-Sync). That is the intended behaviour on the path where tearing is not supported. Adjust syncInterval and presentFlags if you want something different.

Quick Checkpoint

You have a working render loop. These ideas should feel solid:

What’s Next

The render loop works. The next step is geometry: a vertex buffer, a vertex shader, a pixel shader, a root signature, and a pipeline state object. At the end of that sequence the clear colour gives way to your first triangle.