These are my personal study notes on the awesome project Handmade Hero.
If you think writing a professional-quality game from scratch on your own (no engine, no libraries) is interesting and challenging, I highly recommend this project.
In my opinion, it's the best I can find.
- We can jump
- We can shoot
- We can go upstairs and downstairs
- We have a big world and many rooms
- We have basic procedurally generated ground
Windows 10 with Visual Studio 2019 community version and Sublime Text 3.
Build system for Sublime Text 3:
{
"build_systems":
[
{
"name": "HandmadeHero",
"shell_cmd": "build",
"file_regex":"^(.+?)\\((\\d+\\))(): (error)(.+)$"
}
]
}

NOTE: This repo does not contain the copyrighted Handmade Hero assets. To build this repo, please consider preordering Handmade Hero.
- Create a `w` drive using subst: `subst w: /c/whatever_directory_you_choose`
- Clone this repo into the root of `w`
- Install Visual Studio 2019 community version
- cd into `w` and init cl: `.\handmade-hero\misc\shell.bat`
- Build and enjoy! `build`
My preferred code style for C is different from Casey's.
- snake_case for types, e.g. `game_world`
- camelCase for variables, e.g. `globalRunning`
- PascalCase for functions and macro functions, e.g. `GameUpdateVideo`
- UPPER_SNAKE_CASE for macro constants, e.g. `TILES_PER_CHUNK`
- Prefix an underscore to indicate that this function should only be called with a corresponding macro, e.g. `_PushSize`
- `NOTE`: Something we need to pay attention to
- `PLAN`: Something we plan to do later
- `RESOURCE`: External valuable resource
- `DIFF`: Something I have done differently from Casey
- `FUN`: Something interesting to know, like Windows can't correctly handle file formats they invented
- `CASEY`: Casey's opinion about programming
- Every memory allocation should go through a macro, it will make the debugging much easier.
- Premultiplied Alpha: check day 83 for more details.
- Gamma Correction: check day 94 for more details.
- Transform Normal: check day 102 for more details.

- `dir /s [keyword]`: search files
- `findstr -s -n -i -l [keyword]`: find strings

- `WS_EX_TOPMOST`: make window in front of others
- `WS_EX_LAYERED` and `SetLayeredWindowAttributes`: change window alpha
Spy++: inspect windows and messages
- Fix full screen problem caused by system-level display scale
- Fix long running freeze bug: let the game run for a while and it will freeze
- Slow speed when moving across rooms
- Day 1: Setting Up the Windows Build
- Day 2: Opening a Win32 Window
- Day 3: Allocating a Back Buffer
- Day 4: Animating the Back Buffer
- Day 5: Windows Graphics Review
- Day 6: Gamepad and Keyboard Input
- Day 7: Initializing DirectSound
- Day 8: Writing a Square Wave to DirectSound
- Day 9: Variable-Pitch Sine Wave Output
- Day 10: QueryPerformanceCounter and RDTSC
- Day 11: The Basics of Platform API Design
- Day 12: Platform-Independent Sound Output
- Day 13: Platform-Independent User Input
- Day 14: Platform-Independent Game Memory
- Day 15: Platform-Independent Debug File
- Day 16: Visual Studio Compiler Switches
- Day 17: Unified Keyboard and Gamepad Input
- Day 18: Enforcing a Video Frame Rate
- Day 19: Improving Audio Synchronization
- Day 20: Debugging the Audio Sync
- Day 21: Loading Game Code Dynamically
- Day 22: Instantaneous Live Code Editing
- Day 23: Looped Live Code Editing
- Day 24: Win32 Platform Layer Cleanup
- Day 25: Finishing the Win32 Prototyping Layer
- Day 26: Introduction to Game Architecture
- Day 27: Exploration-based Architecture
- Day 28: Drawing a Tilemap
- Day 29: Basic Tilemap Collision Checking
- Day 30: Moving Between Tilemaps
- Day 31: Tilemap Coordinate Systems
- Day 32: Unified Position Representation
- Day 33: Virtualized Tilemaps
- Day 34: Tilemap Memory
- Day 35: Basic Sparse Tilemap Storage
- Day 36: Loading BMPs
- Day 37: Basic Bitmap Rendering
- Day 38: Basic Linear Bitmap Blending
- Day 39: Basic Bitmap Rendering Cleanup
- Day 40: Cursor Hiding and Fullscreen
- Day 41: Overview of the Types of Math Used in Games
- Day 42: Basic 2D Vectors
- Day 43: The Equations of Motion
- Day 44: Reflecting Vectors
- Day 45: Geometric vs. Temporal Movement Search
- Day 46: Basic Multiplayer Support
- Day 47: Vector Lengths
- Day 48: Line Segment Intersection Collision
- Day 49: Debugging Canonical Coordinates
- Day 50: Basic Minkowski-based Collision Detection
- Day 51: Separating Entities by Update Frequency
- Day 52: Entity Movement in Camera Space
- Day 53: Environment Elements as Entities
- Day 54: Removing the Dormant Entity Concept
- Day 55: Hash-based World Storage
- Day 56: Switching from Tiles to Entities
- Day 57: Spatially Partitioning Entities
- Day 58: Using the Spatial Partition
- Day 59: Adding a Basic Familiar Entity
- Day 60: Adding Hitpoints
- Day 61: Adding a Simple Attack
- Day 62: Basic Moving Projectiles
- Day 63 & 64 & 65 & 66: Major Refactoring with Simulation Region
- Day 67: Making Updates Conditional
- Day 68: Exact Enforcement of Maximum Movement Distances
- Day 69: Pairwise Collision Rules
- Day 70: Exploration To-do List
- Day 71: Converting to Full 3D Positioning
- Day 72: Proper 3D Inclusion Test
- Day 73: Temporarily Overlapping Entities
- Day 74: Moving Entities Up and Down Stairwells
- Day 75: Conditional Movements Based on Step Heights
- Day 76: Entity Heights and Collision Detection
- Day 77: Entity Ground Points
- Day 78: Multiple Collision Volumes Per Entity
- Day 79: Defining the Ground
- Day 80: Handling Traversables in the Collision Loop
- Day 81: Creating Ground with Overlapping Bitmaps
- Day 82: Caching Composited Bitmaps
- Day 83: Premultiplied Alpha
- Day 84: Scrolling Ground Buffer
- Day 85: Transient Ground Buffers
- Day 86: Aligning Ground Buffers to World Chunks
- Day 87: Seamless Ground Textures
- Day 88: Push Buffer Rendering
- Day 89: Renderer Push Buffer Entry Types
- Day 90: Bases Part 1
- Day 91: Bases Part 2
- Day 92: Filling Rotated and Scaled Rectangles
- Day 93: Textured Quadrilaterals
- Day 94: Converting sRGB to Light-linear Space
- Day 95: Gamma-correct Premultiplied Alpha
- Day 96: Introduction to Lighting
- Day 97: Adding Normal Maps to the Pipeline
- Day 98: Normal Map Code Cleanup
- Day 99: Test Environment Maps
- Day 100: Reflection Vectors
- Day 101: The Inverse and the Transpose
- Day 102: Transforming Normals Properly
- Day 103: Card-like Normal Map Reflections
- Day 104: Switching to Y-is-up Render Targets
- Day 105: Cleaning Up the Renderer API
- Day 106: World Scaling
- Day 107: Fading Z Layers
- Day 108: Perspective Projection
- Day 109: Resolution-Independent Rendering
- Day 110: Unprojecting Screen Boundaries
- Day 111: Resolution-independent Ground Chunks
- Day 112: A Mental Model of CPU Performance
- Day 113: Simple Performance Counters
- Day 114: Preparing a Function for Optimization
- Day 115: SIMD Basics
- Day 116: Converting Math Operations to SIMD
- Day 117: Packing Pixels for the Framebuffer
- Day 118: Wide Unpacking
- Day 119: Counting Intrinsics
- Day 120: Measuring Port Usage with IACA
- Day 121: Rendering in Tiles
- Day 122: Introduction to Multithreading
- Day 123: Interlocked Operations
- Day 124: Memory Barriers and Semaphores
- Day 125 & 126: Work Queue
- Day 127: Aligning Rendering Memory
- Install Visual Studio 2019
- Call `vsdevcmd` to init command line tools
- Use `cl` to build our program
- Use `devenv` to start Visual Studio to debug, e.g. `devenv w:\build\win32_handmade.exe`
- `WinMain`: Entry of Windows program
- `MessageBox`: Show a message box
- `WNDCLASS`, `RegisterClass`
- `GetModuleHandle`
- `OutputDebugString`
- `DefWindowProc`
- `CreateWindow`, `CreateWindowEx`
- `GetMessage`, `TranslateMessage`, `DispatchMessage`
- `BeginPaint`, `EndPaint`, `PatBlt`
- `PostQuitMessage`
- #define `global_variable` and `internal` to `static`
- Resize the buffer when we receive `WM_SIZE`
- `GetClientRect`, `CreateDIBSection`, `StretchDIBits`, `DeleteObject`, `CreateCompatibleDC`, `ReleaseDC`
- Use `VirtualAlloc` to alloc bitmap memory instead of `CreateDIBSection`
- `VirtualFree`, `VirtualProtect`
- Set `biHeight` to a negative value so that the image origin is top-left
- Render a simple gradient. Each pixel has a value of the form `0xXXRRGGBB`
- Use `PeekMessage` instead of `GetMessage`, because it doesn't block
- `GetDC`, `ReleaseDC`
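The gradient step can be sketched roughly as follows. This is only an illustration: the function name `RenderGradient` and its parameters are assumptions, and the buffer is taken to be tightly packed with 4 bytes per pixel in `0xXXRRGGBB` layout.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not from the notes): fills a 32-bit-per-pixel buffer
// with a gradient. Each pixel is written as 0xXXRRGGBB, where blue follows x
// and green follows y; the offsets let the pattern scroll from frame to frame.
static void RenderGradient(uint32_t *pixels, int width, int height,
                           int xOffset, int yOffset)
{
    for (int y = 0; y < height; ++y)
    {
        uint32_t *row = pixels + (size_t)y * (size_t)width;
        for (int x = 0; x < width; ++x)
        {
            uint8_t blue = (uint8_t)(x + xOffset);  // wraps every 256 pixels
            uint8_t green = (uint8_t)(y + yOffset);
            row[x] = ((uint32_t)green << 8) | blue; // red and padding stay 0
        }
    }
}
```

Calling it once per frame with increasing offsets gives the animated pattern used to verify that the back buffer is being displayed.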
- `CS_HREDRAW` and `CS_VREDRAW` are used to tell Windows to redraw the whole window
- Use `win32_offscreen_buffer` to bundle all global variables
- Create the back buffer just once, move it out of `WM_SIZE`
- `XInput`, `XInputGetState`, `XInputSetState`, `XUSER_MAX_COUNT`
- Loading windows functions ourselves
- Use XInput 1.3
- `LoadLibrary`, `GetProcAddress`
- `WM_SYSKEYUP`, `WM_SYSKEYDOWN`, `WM_KEYUP`, `WM_KEYDOWN`
- Get IsDown and WasDown status from LParam
- Return `ERROR_DEVICE_NOT_CONNECTED` in xinput stub functions
- Implement `Alt+F4` to close the window
- Use bool32 if we only care if the value is 0 or not 0
- `dsound.h`, IDirectSound8 Interface
- `DirectSoundCreate`, `SetCooperativeLevel`, `CreateSoundBuffer`, `SetFormat`
- Remember to clear `DSBUFFERDESC` to zero
- Add `MEM_RESERVE` to `VirtualAlloc`
- IDirectSoundBuffer8 Interface
- `Lock`, `Unlock`, `GetCurrentPosition`, `Play`
- `sinf`
- `win32_sound_output`, `Win32FillSoundBuffer`
- `tSine`, `LatencySampleCount`
- We need to handle xinput deadzone in the future
- Use `DefWindowProcA` instead of `DefWindowProc`
- `QueryPerformanceCounter`, `LARGE_INTEGER`, `QueryPerformanceFrequency`
- `wsprintf`, `__rdtsc`
- Intrinsic: looks like a function call, but it's used to tell the compiler we want a specific assembly instruction here
- Win32 platform todo list:
- Saved game location
- Getting a handle to our executable
- Asset loading
- Threading
- Raw input (support for multiple keyboards)
- Sleep/timeBeginPeriod
- ClipCursor() (for multimonitor)
- Fullscreen
- WM_SETCURSOR (control cursor visibility)
- QueryCancelAutoplay
- WM_ACTIVATEAPP (for when we are not the active application)
- Blit speed improvements
- Hardware acceleration
- GetKeyboardLayout (for French keyboards)
- For each platform, we will have a big [platform]_handmade.cpp file. Inside this file, we #include other files.
- Treat our game as a service, rather than the operating system.
- `_alloca`: Allocate some memory on the stack. It is freed when the function exits, not when execution leaves the enclosing scope
- Move sound rendering logic to handmade.cpp
- Define `game_input`, `game_controller_input`, `game_button_state`
- Store OldInput and NewInput and do ping-pong at the end of every frame
- Define `ArrayCount` macro
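A minimal sketch of the `ArrayCount` macro and the input ping-pong follows. The struct fields and the `SwapInputs` helper are simplified assumptions for illustration, not the real layout of `game_input`.

```c
#include <stdbool.h>

// Number of elements in a fixed-size array (does not work on pointers).
#define ArrayCount(array) (sizeof(array) / sizeof((array)[0]))

// Simplified, assumed shapes of the input structs.
typedef struct
{
    int halfTransitionCount;
    bool endedDown;
} game_button_state;

typedef struct
{
    game_button_state buttons[4];
} game_input;

// Ping-pong: this frame's new input becomes next frame's old input, and the
// old storage is reused for the next frame's new input.
static void SwapInputs(game_input **newInput, game_input **oldInput)
{
    game_input *temp = *newInput;
    *newInput = *oldInput;
    *oldInput = temp;
}
```

Keeping two `game_input` instances and swapping pointers avoids copying the whole struct every frame while still letting the game compare the current state against the previous one.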
- Use a `game_memory` struct to handle all memory related stuff
- We have permanent storage and transient storage in our memory
- Define `Kilobytes`, `Megabytes` and `GigaBytes` macros
- We require the memory allocated to be cleared to zero
- Define `Assert` macro
- Use `cl -Dname=val` to define `HANDMADE_INTERNAL` and `HANDMADE_SLOW` compiler flags
- Specify base address when we do `VirtualAlloc` for debugging purpose in internal build
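The macros and the memory struct from this day might look roughly like the sketch below. The field names in `game_memory` and the exact `Assert` body are assumptions; `HANDMADE_SLOW` is expected to come from the build line (for example `cl -DHANDMADE_SLOW=1`).

```c
#include <stdint.h>

// Size helpers; the LL suffix keeps the arithmetic in 64 bits.
#define Kilobytes(value) ((value) * 1024LL)
#define Megabytes(value) (Kilobytes(value) * 1024LL)
#define GigaBytes(value) (Megabytes(value) * 1024LL)

// In slow builds a failed assertion writes to address 0 so that the debugger
// stops right at the failing line; in other builds it compiles to nothing.
#if HANDMADE_SLOW
#define Assert(expression) if (!(expression)) { *(volatile int *)0 = 0; }
#else
#define Assert(expression)
#endif

// Assumed shape: two blocks handed from the platform layer to the game.
// Both are required to be zero-initialized by the platform.
typedef struct
{
    uint64_t permanentStorageSize;
    void *permanentStorage;
    uint64_t transientStorageSize;
    void *transientStorage;
} game_memory;
```

Without the 64-bit suffix, an expression such as `GigaBytes(4)` would overflow a 32-bit `int`.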
- Define `DebugPlatformReadFile`, `DebugPlatformWriteFile` and `DebugPlatformFreeFileMemory` only when we are using internal build
- Define `SafeTruncateUInt64` inline functions
- `CreateFile`, `GetFileSizeEx`, `ReadFile`
- `__FILE__` is a compile-time macro that expands to the current file name
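`GetFileSizeEx` reports a 64-bit size while `ReadFile` takes a 32-bit byte count, which is presumably why a checked truncation helper is needed. A minimal sketch, using the standard `assert` as a stand-in for the notes' own `Assert` macro:

```c
#include <assert.h>
#include <stdint.h>

// Narrows a 64-bit value to 32 bits, asserting that no information is lost.
static inline uint32_t SafeTruncateUInt64(uint64_t value)
{
    assert(value <= 0xFFFFFFFFu); // stand-in for the project's Assert macro
    return (uint32_t)value;
}
```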
- VS compiler switches:
  - `-WX`, `-W4`: enable warning level 4 and treat warnings as errors
  - `-wd`: turn off some warnings
  - `-MT`: static link C runtime library
  - `-Oi`: generates intrinsic functions
  - `-Od`: disable optimization
  - `-GR-`: disable run-time type information, we don't need this
  - `-Gm-`: disable minimal rebuild
  - `-EHa-`: disable exception-handling
  - `-nologo`: don't print compiler info
  - `-FC`: full path of source code file in diagnostics
- Init `vsdevcmd` using `-arch=x86` flags to build a 32-bit version of our program
- Use `/link` to pass linker options to make a valid Windows XP executable
  - `-subsystem:windows,5.1`
- Add one controller, so we have 5 controllers now
- Extract `CommonCompilerFlags` and `CommonLinkerFlags` in build.bat
- Copy old keyboard button state to new keyboard button state
- Add MoveUp, MoveDown, MoveLeft, MoveRight buttons
- Handle XInput dead zone
- Check whether union in game_controller_input is aligned
- We need to find a way to reliably retrieve monitor refresh rate?
- We define `GameUpdateHz` based on `MonitorRefreshHz`
- Use `Sleep` to wait for the remaining time
- Use `timeBeginPeriod` to modify scheduler granularity
- Record last play cursor and write cursor
- Define `Win32DebugSyncPlay` to draw it
- Use a while loop to test direct sound audio update frequency
The audio sync logic is indeed very hard and complicated.
I didn't take many notes because I was really confused and I didn't understand much.
- Compute audio latency seconds using write cursor - play cursor
- Define `GameGetSoundSamples`
- Compile win32_handmade and handmade separately
- Define win32_game_code and
- Put platform debug functions into game memory
- Enable `/LD` switch to build dll
- Use /EXPORT linker flags to export dll functions
- We don't need to define `DllMain` entry point in our dll
- extern "C" to prevent name mangling
- Turn off incremental link
- Use `CopyFile` to copy the dll

NOTE: `CopyFile` may fail the first time, so we retry it in a while loop. This is debug code, so we don't care about the performance.
- Use /PDB:name linker options to specify pdb file name
- Add timestamp to output pdb file name
- Delete PDB files and pipe del output to NUL
- Use `FindFirstFile` to get file write time
- Use `CompareFileTime` to compare file time
- Use `GetModuleFileName` to get exe path and use it to build full dll path
- We can use MAX_PATH macro to define length of path buffer
- Define `win32_state` to store InputRecordIndex and InputPlayingIndex, we only support one slot now
- Press L to toggle input recording
- Store input and memory into files
- Use a simple jump to test our looped editing
- We can use `WS_EX_TOPMOST` and `WS_EX_LAYERED` to make our window the topmost one with some opacity
- We can do it in `WM_ACTIVATEAPP` message so when the game loses focus it will be transparent
- Fix the audio bug (I have already fixed that in previous day)
- Change blit mode to 1-to-1 pixels
- Use `%random%` for pdb files
- Change compiler flag `MT` to `MTd`
- Store EXE directory in Win32State and put record input file to build dir
- Use `GetFileAttributesEx` instead of `FindFirstFile` to get last write time of a file
- Use `GetDeviceCaps` to get monitor refresh rate
- Pass `thread_context` from platform to game and from game to platform
- Add mouse info to game_input, using `GetCursorPos`, `ScreenToClient`
- Record mouse buttons using `GetKeyState`
- Define win32_replay_buffer and store game state memory in memory using `CopyMemory` (storing to disk is actually very fast on my computer, but I am gonna do it anyway)
- I am not gonna do memory mapping, because I think it's unnecessary
Today there isn't any code to write. I am just listening to Casey talking about what a good game architecture looks like.
In Casey's view, a game architect is like an urban planner. Their job is to organize things roughly rather than to plan everything carefully. I couldn't agree more.
- Add `SecondsToAdvanceOverUpdate` to game_input
- Remove debug code
- Turn off warning C4505, it's annoying. We are gonna have unreferenced local functions.
- Target resolution: 960 x 540 x 30hz
- Define `DrawRectangle`
- Use floating point to store colors, because it makes things a lot easier when we have to do math on colors
- Draw a simple tilemap
- Draw a simple player. Keep in mind that the player's movement should take the time delta into account; otherwise it will move faster if we run at a higher FPS.
- Use `PatBlt` to clear the screen when displaying our buffer
- We should only clear the four gutters, otherwise there will be some flashing
- Implement a simple collision check
- Separate the header file into handmade.h and handmade_platform.h
- Define four tilemaps, and notice that in C a two-dimensional array is indexed Y first and X last
- Define `canonical_position` and `raw_position`
- Implement moving between tilemaps
- NOTE: Basically any CPU we are gonna target at has SSE2
- Define `handmade_intrinsic.h`
- Define `TileSizeInMeters` and `TileSizeInPixels`
- Optimization switches: `/O2 /Oi /fp:fast`
- PLAN: Pack tilemap index and tile index into a single 32-bit integer
- PLAN: Convert TileRelX and TileRelY to resolution independent world units
- RESOURCE: Intel Intrinsics Guide, https://software.intel.com/sites/landingpage/IntrinsicsGuide/
- Remove `raw_position`
- Add `canonical_position PlayerPos` to game state
- Define `RecononicalizePosition`
- Use meters instead of pixels as units
- Rename `canonical_position` to `world_position`
- Make Y axis go upward
- RESOURCE: a great book about topology: Galois' Dream: Group Theory and Differential Equations
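The idea behind recanonicalizing a position can be sketched on a single axis: whenever the tile-relative offset leaves the tile, move the overflow into the tile index. The `axis_position` struct and `RecanonicalizeCoord` helper below are hypothetical, and the offset is assumed to be measured in meters from the tile's lower edge.

```c
#include <stdint.h>

// Hypothetical one-axis position: a tile index plus an offset inside the tile.
typedef struct
{
    int32_t tile; // absolute tile index along this axis
    float offset; // meters from the tile's lower edge, kept in [0, tileSide)
} axis_position;

// Moves whole tiles out of the offset and into the tile index.
static axis_position RecanonicalizeCoord(axis_position p, float tileSideInMeters)
{
    float ratio = p.offset / tileSideInMeters;
    int32_t tilesMoved = (int32_t)ratio; // truncates toward zero
    if (ratio < (float)tilesMoved)
    {
        --tilesMoved; // turn the truncation into a floor for negative offsets
    }
    p.tile += tilesMoved;
    p.offset -= (float)tilesMoved * tileSideInMeters;
    return p;
}
```

Applying the same helper to X and Y after every move keeps the offset small, so floating point precision stays uniform across a large world.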
- Remove TileMapX and TileMapY
- Define `tile_map_position`
- 24-bit for tilemap and 8-bit for tiles
- Implement a simple scroll so the guy can move
- Implement smooth scrolling
- Implement a way to speed the guy up
- Make `TileRelX` and `TileRelY` relative to center of tile
- Create `handmade_tile.h` and `handmade_tile.cpp`
- Rename `tile_map` to `tile_chunk` and extract everything from `world` to `tile_map`, now tilemap means the whole map
- Define `memory_arena` and `PushSize`, `PushArray` and create tile chunks programmatically
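A bump allocator of this kind can be sketched in a few lines. The field names, `InitializeArena`, and the `PushStruct` macro are assumptions for illustration; the sketch also ignores alignment, which a real arena would have to consider.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// A linear ("bump") allocator over one pre-allocated block.
typedef struct
{
    uint8_t *base;
    size_t size;
    size_t used;
} memory_arena;

static void InitializeArena(memory_arena *arena, void *base, size_t size)
{
    arena->base = (uint8_t *)base;
    arena->size = size;
    arena->used = 0;
}

// Allocations go through macros so the call site names the type, and so that
// debugging hooks could later be added in a single place.
#define PushStruct(arena, type) ((type *)_PushSize(arena, sizeof(type)))
#define PushArray(arena, count, type) ((type *)_PushSize(arena, (count) * sizeof(type)))

// Only meant to be called through the macros above.
static void *_PushSize(memory_arena *arena, size_t size)
{
    assert(arena->used + size <= arena->size); // out of arena space
    void *result = arena->base + arena->used;
    arena->used += size;
    return result;
}
```

Nothing is ever freed individually; resetting `used` to zero releases everything at once.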
- Make tile size small so we can see more chunks
- Remove `TileSizeInPixels` and `MetersToPixels` from `tile_map`
- Use random.org to generate some random numbers and use them to generate screens randomly
- Generate doors based on our choice
- Allocate space for tiles only when we access
- Add Z index to tilemap
- FUN: Windows can't render BMPs correctly. This is very amusing, because they are the guys who invented BMP.
- Make player go up and down. I already implemented this in the previous day, but I need to reimplement it in a new way: when the player moves onto the stair, it goes automatically, with no need to push any button.
- Rename `TileRelX` and `TileRelY` to `OffsetX` and `OffsetY`
- Define `bitmap_header` and parse bitmap. We have to use `#pragma pack(push, 1)` and `#pragma pack(pop)` to make VS pack our struct correctly
- Design a very specific BMP to help debug our rendering. This is a very clever method.
- I find that my structured_art.bmp has a different byte order from Casey's. It turns out that BMP has something called RedMask, GreenMask, BlueMask and AlphaMask.
- BMP byte order: should be determined by masks
- Render background bmp
- Define `loaded_bitmap` to pack all things up
- Define `DrawBitmap`
- Define `FindLeastSignificantSetBit` and `bit_scan_result` in intrinsics
- Define `COMPILER_MSVC` and `COMPILER_LLVM` macro variables
- Use `_BitScanForward` MSVC compiler intrinsic when we are using windows
- Implement a simple linear alpha blending
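Linear alpha blending interpolates each color channel between destination and source using the source alpha. The sketch below works on single `0xAARRGGBB` pixels; the `BlendPixel` name is hypothetical and the result's alpha byte is simply left as zero.

```c
#include <stdint.h>

// Blends one 0xAARRGGBB source pixel over a destination pixel:
// result = (1 - a) * dest + a * source, per color channel.
static uint32_t BlendPixel(uint32_t dest, uint32_t source)
{
    float a = (float)((source >> 24) & 0xFF) / 255.0f;

    float sr = (float)((source >> 16) & 0xFF);
    float sg = (float)((source >> 8) & 0xFF);
    float sb = (float)(source & 0xFF);

    float dr = (float)((dest >> 16) & 0xFF);
    float dg = (float)((dest >> 8) & 0xFF);
    float db = (float)(dest & 0xFF);

    float r = (1.0f - a) * dr + a * sr;
    float g = (1.0f - a) * dg + a * sg;
    float b = (1.0f - a) * db + a * sb;

    // Round to nearest and repack; the alpha byte of the result stays zero.
    return ((uint32_t)(r + 0.5f) << 16) |
           ((uint32_t)(g + 0.5f) << 8) |
           (uint32_t)(b + 0.5f);
}
```

A bitmap drawing routine would apply this to every overlapping pixel of the source bitmap and the back buffer.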
- Assert compression mode when loading BMP
- Load hero bitmaps for four directions
- Change hero direction when moves
- Align hero bitmaps with real position
- Replace camera scrolling with fixed camera
- Move camera when player moves
- Fix clipping problem in our bitmap drawing
- Check frame rate
- Fix msvc pdb problem when hot reloading by creating a lock file
- Write a `static_check` bat file to make sure we never type `static`
- Set a default cursor style using `LoadCursor`
- Hide cursor by responding `WM_SETCURSOR` message with `SetCursor(0)` in production build
- RESOURCE: How do I switch a window between normal and fullscreen? https://devblogs.microsoft.com/oldnewthing/20100412-00/?p=14353
- Implement full screen toggling
- Do fullscreen rendering in fullscreen mode
Math we are gonna need:
- Arithmetic
- Algebra
- Euclidean Geometry
- Trigonometry
- Arithmetic
- Calculus
- Linear Algebra
- Partial Differential Equation
- Ordinary Differential Equation
- Complex Numbers
- Non-Euclidean Geometry
- Topology
- Minkowski Algebra
- Control Theory
- Interval Arithmetic
- Graph Theory
- Operations Research
- Probability and Statistics
- Cryptography / Pseudo-random Number Generator
- Fix diagonal movement problem
- Define `v2` and implement add operator, minus operator and unary minus operator in `handmade_math.h`
- Use v2 instead of x and y
- Add `dPlayerP` to game state. This is the velocity of the guy.
- Add a back force based on player's speed
- Implement inner product for vectors
- Reflect the velocity when the player hits a wall (or make the velocity align with the wall). This can be implemented with a clever bit of vector math: `v' = v - 2 * Inner(v, r) * r`, where `r` is the unit vector of the reflecting direction (the wall normal). For the bottom wall, `r` is `(0, 1)`.
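The reflection formula translates almost directly into code. In this sketch the `v2` layout and the `Reflect` name are assumptions, and `r` is expected to be a unit vector.

```c
// Minimal 2D vector, assumed layout.
typedef struct
{
    float x;
    float y;
} v2;

// Inner (dot) product of two vectors.
static float Inner(v2 a, v2 b)
{
    return a.x * b.x + a.y * b.y;
}

// Reflects v about a wall whose unit normal is r: v' = v - 2 * Inner(v, r) * r.
static v2 Reflect(v2 v, v2 r)
{
    float scale = 2.0f * Inner(v, r);
    v2 result = { v.x - scale * r.x, v.y - scale * r.y };
    return result;
}
```

Replacing the factor 2 with 1 removes only the component that points into the wall, which makes the velocity slide along the wall instead of bouncing off it.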
- CASEY: Search in p (position) is way better than searching in t (time)
- Part of new collision detection algorithm
- There is no code today. I will write the new collision detection algorithm when it's complete.
- We have a severe bug! Player has been moved multiple times!
- Define `entity` struct. Add `Entities`, `EntityCount`, `PlayerIndexForController` and `CameraFollowingEntityIndex` to game state
- Support as many players as our controllers in game state
- Implement `RotateLeft` and `RotateRight` intrinsics using `_rotl` and `_rotr`
- Define `Length`, `SquareRoot` to fix diagonal movement problem
- We will use search in t instead of search in p, because to implement the latter we have to build the whole search space. It's complex and doesn't pay off.
- Part of new collision detection algorithm
- Implement the new collision detection algorithm
- Add a `tEpsilon` to tolerate floating point problems
- Add an `Offset` method to manipulate tile map position and auto recanonicalize
- Maybe we shouldn't make the world toroidal, since it adds much complexity
- Introduction of Minkowski sum and GJK algorithm
- Implement area collision detection
- Take player area into account when calculating MinTileX, MaxTileX, MinTileY and MaxTileY
- Modify speed when player hits the wall
- Use a loop to move player
- Divide entities into high, low and dormant categories
- Define `entity_residence` enum
- Casey did part of the new implementation
- No code today. I will wait till the new implementation is finished
- Make player move again
- Map float position to tile map position after moving player
- Make camera move again
- Define `SetCamera` and move entities into/out of high set
- Define `entity_type` and add wall entities
- Remove `tRemaining` in collision detection
- Remove dormant entity and entity residence concept
- Define `MakeEntityHighFrequency` and `MakeEntityLowFrequency`
- Make code work again
- DIFF: I don't like all the index thing. I will use pointers instead.
- Use int32 as chunk index, so 0 will be the center.
- Add `TileChunkHash` to tile map
- `GetTileChunk` should take a memory arena
- DIFF: I will store pointers instead of index in TileChunkHash array
- Rename `CameraBound` to `HighFrequencyBound`
- Rename `handmade_tile.h/cpp` to `handmade_world.h/cpp`
- Rename tile chunk to world chunk. We don't have tiles anymore.
- There are no tiles any more, just chunks.
- Define `entity_block` and `ChangeEntityLocation`
- Implement `WorldPositionFromTilePosition`
- Reimplement `SetCamera` using spatial partition
- Call `ChangeEntityLocation` when adding low entities
- Load tree bitmap and render it as wall
- Add monster and familiar entity type
- Define `entity_render_piece` and `entity_render_piece_group`
- Implement `UpdateFamiliar`
- Define `hit_point` struct
- Draw hit points
- Define `v3` and `v4` vectors
- CASEY: Always write the usage code first. It gives you the necessary context for writing the real stuff.
- Add `EntityType_Sword` entity type
- Define `DrawHitPoints()` and `InitHitPoints()` and add hitpoints for our monster
- Load rock03.bmp as sword and render it when some key is pressed
- Define `NullPosition()` and `IsPositionValid()`. Use some specific value to represent a null position.
- Define `move_spec` and pass it to `MoveEntity`
- Add `distanceRemaining` to sword
- Define `UpdateSword` and make sword disappear when distance remaining reaches zero

This is a big change, but it's definitely worth it.
- Remove `low_entity` and `high_entity`. They were never a good idea.
- Define `sim_entity` and `stored_entity`. `stored_entity` is for storage and `sim_entity` is for simulation.
- Every frame, pull relevant entities to our simulation region, simulate it and render it.
- Lots of modifications adjusted for this new model
- Add `updatable` to sim entity and set it correspondingly
- Add `updatableBounds` to sim region. Previous bounds becomes total bounds.
- `LoadEntityReference` should get position from reference entity
- `UpdateSword` doesn't have to check NonSpatial flag
- Move update logic back to our main function
- CASEY: Avoid callbacks, plain switch statements are just better in every aspect.
- Consider `distanceLimit` in moveEntity function
- CASEY: Fight the double dispatch problem with a property system.
- Define a simple `HandleCollision` function to make sword hurt monster when they collide
- Remove `EntityFlag_Collides`
- Define `ShouldCollide` to check whether two entities should collide
- Define `pairwise_collision_rule`
- Add `collisionRuleHash` and `firstFreeCollisionRule` to game state
- Define `AddCollisionRule` and `ClearCollisionRulesFor`
- One way to fix `ClearCollisionRulesFor` function: every time we just insert two entries so that we can query with each one.
To-do list:
- Multiple sim regions per frame
  - Per-entity clocking
  - Sim-region merging? For multiple players?
- Z!
  - Clean up things by using v3
  - Figure out how you go "up" and "down", and how is this rendered?
- Collision detection?
  - Entry/exit?
  - What's the plan for robustness? / shape definition?
- Debug code
  - Logging
  - Diagramming
  - Switches / slides / etc.
- Audio
  - Sound effect triggers
  - Ambient sounds
  - Music
- Asset streaming
- Metagame / save game?
  - Do we allow saved games? Probably yes, just only for "pausing".
  - Continuous save for crash recovery?
- Rudimentary world generation
  - Placement of background things
  - Connectivity?
  - Non-overlapping
  - Map display
  - Magnets - how they work???
- AI
  - Rudimentary monster behavior example
  - Path finding
  - AI "storage"
- Animation system
  - Skeletal animation
  - Particle system
- Rendering
- GAME
  - World generation
  - Entity system
- Remove `world_diff`
- Remove `chunkSizeInMeters` and add `chunkDimInMeters`
- Define `Hadamard` for v2 and v3
- Make `p` and `dP` v3 in sim entity
- Define `rectangle3`
- Implement the simple jump (Casey implemented this a long time ago)
- `AABB`: Axis aligned bounding boxes
- Add `maxEntityRadius`, `maxEntityVelocity` to sim region
- Change `width` and `height` in sim entity to `dim`
- Define `EntityOverlapsRectangle` and use this method to test whether entity is inside a rectangle
- Add `EntityType_Stairwell` and use rock_02 bmp as our stairwell asset
- Implement `AddStair` and draw our stair
- Define `overlappingCount` and `overlappingEntites` and `RectanglesIntersect`
- Pass `wasOverlapping` to `HandleCollision`
- Move `AddCollisionRule` to handle collision
- Remove overlapping stuff and define `CanOverlap` and `HandleOverlap`
- Rename `ShouldCollide` to `CanCollide`
- Call `HandleOverlap` at the end of `MoveEntity`
- Draw our stairwell as a rectangle
- Define `GetBarycentric`
- Define `SafeRatioN`, `SafeRatio0` and `SafeRatio1`
- Define `Lerp`
- Add `EntityFlag_Moveable`
- Rename `AddFlag` -> `AddFlags`, `ClearFlag` -> `ClearFlags`
- Define `Clamp` and `Clamp01`
- Fix `BeginSim` to loop over chunkZ
- Modify stairwell z so that the minimum z of its volume is 0
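The small math helpers named above could be written roughly like this. The argument orders are assumptions; only the function names come from the notes.

```c
// Linear interpolation from a (t = 0) to b (t = 1).
static float Lerp(float a, float t, float b)
{
    return (1.0f - t) * a + t * b;
}

// Restricts value to the range [min, max].
static float Clamp(float min, float value, float max)
{
    float result = value;
    if (result < min) { result = min; }
    if (result > max) { result = max; }
    return result;
}

static float Clamp01(float value)
{
    return Clamp(0.0f, value, 1.0f);
}

// Division that returns the fallback n instead of dividing by zero.
static float SafeRatioN(float numerator, float divisor, float n)
{
    return (divisor != 0.0f) ? (numerator / divisor) : n;
}

static float SafeRatio0(float numerator, float divisor)
{
    return SafeRatioN(numerator, divisor, 0.0f);
}

static float SafeRatio1(float numerator, float divisor)
{
    return SafeRatioN(numerator, divisor, 1.0f);
}
```

A barycentric coordinate inside a stairwell, for example, can be computed with `SafeRatio0` and clamped with `Clamp01` before it is fed to `Lerp` to get the ground height.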
- Add `EntityFlag_ZSupported`
- Prevent player from "jumping" when he goes up/down stairs
- Define `SpeculativeCollide` to prevent hero from stepping out the stair and jumping into the stair
- Add `zFudge` when rendering
- Take into account `z` in `MoveEntity`. Remember to set height for walls.
- Change `TileDepthInMeters`, currently it's just the same as `TileSizeInMeters`.
- Modify `RectanglesIntersect`
- Define `AddGroundedEntity`
- TODO: need to fix the rendering!
- Fix ground handling, need to take the z dimension into account
- Fix the drawing code
- Define `GetEntityGroundPoint` and fix `SpeculativeCollide`
- Add `walkableHeight` to entity which is used only for stairwell, and modify `SpeculativeCollide`
- Define `GetStairwellGround` and fix `HandleOverlap`. It should use the same method to calculate the stairwell ground as `SpeculativeCollide`.
- The position point doesn't necessarily have to be the collision point
- Define `sim_entity_collision_volume` and `sim_entity_collision_volumn_group`
- Remove `dim` in entity and add `collision`
- Define `walkableDim` for stairwell
- Initialize collision groups when initializing memory
- Always initialize collision to null collision
- Set z drag to 0
- Casey talks about difference between "filled and carve" (Quake way) vs "empty and fill" (Unreal way) model
- CASEY: Robustness > efficiency!
- Introduce the concept of "room"
- Define `AddStandardRoom`
- Define `PushRectOutline` to draw the room
- Define `test_wall` and make wall testing data driven
- Inline `TestWall` function
- Test overlap using all volumes and extract code into `EntitiesOverlap`
- Add `epsilon` to `EntitiesOverlap`
- Add test for `tMax`, mostly the same as `tMin`
- Test our new code that prevents hero from ever getting outside
- Load grass, ground and tuft bitmaps
- Define `DrawTest` and randomly draw some grasses, grounds and tufts
- Casey talks about megatexture
- Make random numbers more systematic
  - define `random_series`
  - define `Seed`, `RandomChoice`, `RandomUnilateral`, `RandomBilateral`, `RandomBetween`
  - replace old random code with above new functions
- Make `loaded_bitmap` have the same structure as `game_offscreen_buffer`, so all drawing functions that previously took game_offscreen_buffer now take loaded_bitmap
- Define `MakeEmptyBitmap`, remember to clear the data to zero!
- Draw ground bitmap once and cache it in game state
- Casey explains what premultiplied alpha is
- Change `LoadBMP` and `DrawBitmap` function to use premultiplied alpha
- Handle `cAlpha` in `DrawBitmap`
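With premultiplied alpha, the color channels are multiplied by alpha once at load time, so the blend no longer multiplies the source by alpha per pixel. The sketch below shows both halves on packed `0xAARRGGBB` pixels; the function names are hypothetical and the result keeps the destination's alpha byte for simplicity.

```c
#include <stdint.h>

// Rounds a non-negative float to the nearest integer and clamps it to 0..255.
static uint32_t RoundToByte(float value)
{
    uint32_t result = (uint32_t)(value + 0.5f);
    return (result > 255u) ? 255u : result;
}

// Load-time step: multiply each color channel by the pixel's own alpha.
static uint32_t PremultiplyAlpha(uint32_t texel)
{
    uint32_t alpha = (texel >> 24) & 0xFF;
    float a = (float)alpha / 255.0f;
    uint32_t r = RoundToByte((float)((texel >> 16) & 0xFF) * a);
    uint32_t g = RoundToByte((float)((texel >> 8) & 0xFF) * a);
    uint32_t b = RoundToByte((float)(texel & 0xFF) * a);
    return (alpha << 24) | (r << 16) | (g << 8) | b;
}

// Draw-time step: result = (1 - a) * dest + source, with source premultiplied.
static uint32_t BlendPremultiplied(uint32_t dest, uint32_t source)
{
    float invA = 1.0f - (float)((source >> 24) & 0xFF) / 255.0f;
    uint32_t r = RoundToByte(invA * (float)((dest >> 16) & 0xFF) + (float)((source >> 16) & 0xFF));
    uint32_t g = RoundToByte(invA * (float)((dest >> 8) & 0xFF) + (float)((source >> 8) & 0xFF));
    uint32_t b = RoundToByte(invA * (float)(dest & 0xFF) + (float)(source & 0xFF));
    return (dest & 0xFF000000u) | (r << 16) | (g << 8) | b;
}
```

The main benefit is that compositing several translucent bitmaps into an intermediate buffer, such as a cached ground bitmap, stays correct.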
- Add `groundBufferP` and draw ground based on this position
- Rename `DrawTest` to `DrawGroundChunk` and fill the whole buffer
- Introduce `transient_state` to help manage transient memory
- Define `transientArena` and store multiple ground buffers in transient memory
- Use `groundBitmapTemplate` to store repeated info (width, height) about the ground buffer
- DIFF: Casey uses `beginTemporaryMemory` and `endTemporaryMemory` calls to restore memory space used only in one frame. I think the API is not easy to use, so I implement `save` and `restore` just like in the `CanvasRenderingContext2D`.
- Draw ground buffers
- Make world position `_offset` relative to center point, and rename it to `offset`. There is no need to prefix it with the underscore.
- Draw chunks to see how big they are
  - Define `CenteredChunkPoint`
  - Define `DrawRectangleOutline`
- Change `metersToPixels` to a fixed number
- Cleanup: remove tileSideInMeters, we no longer have any tile thing.
- Fill ground buffer for each chunk.
- Modify `FillGroundBuffer` to generate seamless grounds by iterating nine chunks each time
- Select the furthest buffer and fill it if we have run out of buffers
- Decrease `groundBufferCount` to test our eviction code
- Regenerate ground when game reloading
  - Add a field `executableReloaded` in game_input to tell us whether game has reloaded
- Why are the trees wiggling around?
- Our blitting is not pixel perfect now, entities' float coordinates may round to different integers and cause their distance to change a little bit.
- We will solve this problem when we have a real renderer!
- Clean up rendering stuff
  - Create `handmade_render_group.h` and `handmade_render_group.cpp` file
  - Put `render_piece` and `render_piece_group` to our newly created file
  - Increase piece count in piece group and use transient arena to alloc our piece group
  - Define `render_basis`
  - Delayed rendering: render pieces after we have pushed them all
- Use delayed rendering for ground buffers
- Rename `render_piece_group` to `render_group`
- Implement push buffer
  - Add `pushBufferBase`, `pushBufferSize` and `maxPushBufferSize` to render_group
  - Define `AllocateRenderGroup` and `PushRenderElement`
- Why do we use a push buffer to do the rendering?
  - Sorting!
  - Process the source buffer into something most suitable for the target
  - Architect our soft renderer the way an actual GPU works
- Move all drawing functions to handmade_render_group.cpp
- Rename `render_piece` to `render_entry`
- Define `RenderGroupToOutput`
- Use "compact discriminated union"
  - `render_entry_clear`
  - `render_entry_rectangle`
  - `render_entry_bitmap`
- Define `render_entity_basis` to abstract common positioning logic
- RESOURCE: The ryg blog
- Implement `Clear`
- Implement `PushRectOutline`
- Use `PushBitmap` in `FillGroundBuffer`
- Casey explains what is a basis and how it works
- Implement a demo `render_entry_coordinate_system` to explore the basis transformation idea
- Collision detection has a lot to do with pixel filling
- Casey demonstrates how to fill a rectangle
- Define
DrawRectangleSlowly - Start from an aligned rectangle
- Move to a rotated rectangle
- Calculate the min/max bound rather than always check the whole buffer
- Define
Perpfunction
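
The rotated-rectangle fill boils down to an edge test per pixel. The sketch below is my own reading of that idea, assuming the axes are ordered counter-clockwise (`yAxis` is to the left of `xAxis`); `InsideRectangle` and the small vector helpers are hypothetical names for this example.

```c
typedef struct { float x, y; } v2;

static v2 V2(float x, float y) { v2 r = { x, y }; return r; }
static v2 Add(v2 a, v2 b) { return V2(a.x + b.x, a.y + b.y); }
static v2 Sub(v2 a, v2 b) { return V2(a.x - b.x, a.y - b.y); }
static v2 Neg(v2 a) { return V2(-a.x, -a.y); }
static float Inner(v2 a, v2 b) { return a.x * b.x + a.y * b.y; }

// Perp rotates a vector 90 degrees counter-clockwise.
static v2 Perp(v2 a) { return V2(-a.y, a.x); }

// Returns 1 when p lies strictly inside the rectangle spanned by
// origin, origin + xAxis, origin + xAxis + yAxis and origin + yAxis.
// Each edge is tested against its outward-pointing normal.
static int InsideRectangle(v2 origin, v2 xAxis, v2 yAxis, v2 p)
{
    float edge0 = Inner(Sub(p, origin), Neg(Perp(xAxis)));
    float edge1 = Inner(Sub(p, Add(origin, xAxis)), Neg(Perp(yAxis)));
    float edge2 = Inner(Sub(p, Add(Add(origin, xAxis), yAxis)), Perp(xAxis));
    float edge3 = Inner(Sub(p, Add(origin, yAxis)), Perp(yAxis));
    return (edge0 < 0) && (edge1 < 0) && (edge2 < 0) && (edge3 < 0);
}
```

A slow fill would loop over the min/max bound of the four corners and write a pixel wherever this test passes.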
- Implement a textured quadrilateral
  - For each pixel, calculate the `u` and `v` uniform coordinates
  - Use `u` and `v` to get color from texture
  - Populate pixel with that color
  - Implement alpha blending, just copy the old code
- Subpixel rendering
  - Casey demonstrates wiggling
  - And then solves it with bilinear texture filtering
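
Bilinear filtering blends the four texels around the sample point instead of snapping to one of them. Here is a minimal sketch on a single-channel float texture; the real renderer works on packed RGBA pixels, and `SampleBilinear` is a hypothetical name. It assumes the texture is at least 2x2 and that the sample position is non-negative.

```c
// Linear blend between a and b: t = 0 gives a, t = 1 gives b.
static float Lerp(float a, float t, float b)
{
    return (1.0f - t) * a + t * b;
}

// Samples a row-major, single-channel texture at texel-space position (x, y),
// where 0 <= x <= width - 1 and 0 <= y <= height - 1.
static float SampleBilinear(const float *texels, int width, int height, float x, float y)
{
    int x0 = (int)x;
    int y0 = (int)y;
    // Keep the 2x2 footprint inside the texture.
    if (x0 > width - 2) x0 = width - 2;
    if (y0 > height - 2) y0 = height - 2;

    float fx = x - (float)x0;  // fractional position inside the footprint
    float fy = y - (float)y0;

    float a = texels[y0 * width + x0];
    float b = texels[y0 * width + x0 + 1];
    float c = texels[(y0 + 1) * width + x0];
    float d = texels[(y0 + 1) * width + x0 + 1];

    return Lerp(Lerp(a, fx, b), fy, Lerp(c, fx, d));
}
```

As the sample point moves by a fraction of a texel the result changes smoothly, which is what removes the wiggling.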
- Casey explains what Gamma Space is
  - It's non-linear and it breaks our math
  - RESOURCE
- Use `pow(2.2)` and `sqrt(2.2)` (raising to the power 2.2 and to its inverse, 1/2.2) to convert between sRGB and linear space
  - `pow(2.2)` is just a good approximation and is not suitable for all monitors
  - We use `pow(2)` to approximate `pow(2.2)` because it's much cheaper
- Implement `SRGB255ToLinear1` and `Linear1ToSRGB255`
- Implement a simple color tint
- When we load a BMP
  - Convert it to linear space
  - Multiply alpha with the color
  - Convert it back to sRGB space
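
The conversion pair and the BMP load steps above can be sketched like this, using the cheap square / square-root approximation. `PremultiplyTexel` and `SquareRoot` are hypothetical helpers: `SquareRoot` is a small Newton iteration used only to keep the example dependency-free, and `sqrtf` from `<math.h>` would do the same job.

```c
typedef struct { float r, g, b, a; } v4;

// Stand-in for sqrtf so the sketch needs no math library.
static float SquareRoot(float v)
{
    if (v <= 0.0f) return 0.0f;
    float x = (v > 1.0f) ? v : 1.0f;
    for (int i = 0; i < 32; ++i) x = 0.5f * (x + v / x);
    return x;
}

// sRGB in 0-255 to "linear" in 0-1, approximating x^2.2 with x^2.
// Alpha is already linear, so it is only rescaled.
static v4 SRGB255ToLinear1(v4 c)
{
    float inv255 = 1.0f / 255.0f;
    float r = inv255 * c.r, g = inv255 * c.g, b = inv255 * c.b;
    v4 result = { r * r, g * g, b * b, inv255 * c.a };
    return result;
}

// Inverse of the function above: square root, then scale back to 0-255.
static v4 Linear1ToSRGB255(v4 c)
{
    v4 result = {
        255.0f * SquareRoot(c.r),
        255.0f * SquareRoot(c.g),
        255.0f * SquareRoot(c.b),
        255.0f * c.a
    };
    return result;
}

// The three load-time steps: to linear, multiply color by alpha, back to sRGB.
static v4 PremultiplyTexel(v4 texel255)
{
    v4 linear = SRGB255ToLinear1(texel255);
    linear.r *= linear.a;
    linear.g *= linear.a;
    linear.b *= linear.a;
    return Linear1ToSRGB255(linear);
}
```

Doing the multiply in linear space matters: a half-transparent white texel ends up at roughly 180 in sRGB, not 127.5.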
- Remove `render_entry_header` in render entry types
  - Adding this header to every render entry type is very error-prone
  - Let the `PushRenderElement` function do this job
- `-Zo` is used for enhanced debugging of optimized code
I am reading this book Computer Graphics from Scratch these days, it's a good source to learn about lighting.
- Only render render_entry_coordinate_system and turn the optimization flag off
- Doug Church: Lighting is the sound of graphics.
- Casey explains things about lighting and there are so many terms I don't understand...
- Getting lighting fully right is extremely hard
- Lighting problems in 2D
  - We don't know what the surfaces are
    - normal maps
  - We don't know what the light field is
    - point lights
    - light rendering
- RESOURCE: A good book about lighting, Physically Based Rendering: From Theory to Implementation
- Introduce normal map and environment map
  - Define `environment_map`
  - Add top, middle and bottom environment maps and a normal map to `DrawRectangleSlowly`
- Define `SampleEnvironmentMap`
- Define `MakeSphereNormalMap` to generate a fake normal map and test our code
- There are two types of bitmaps: front-facing bitmaps and up-facing bitmaps
- Clean up previous code
- Initialize top, middle and bottom env maps
- Note: Our roughness is always zero now
- Fill LOD with color and draw the LODs
- Fill LOD with checker board
- Define testDiffuse and testNormal to test our lighting program
- Casey demonstrates how to change saturation
  - avg = (r + g + b) / 3
  - delta = (r - avg, g - avg, b - avg)
  - color = avg + saturationLevel * delta
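
The three saturation steps translate directly into code. A small sketch, with `AdjustSaturation` as a hypothetical name:

```c
typedef struct { float r, g, b; } v3;

// saturationLevel = 0 gives grey, 1 leaves the color unchanged,
// values above 1 push each channel further away from the average.
static v3 AdjustSaturation(v3 color, float saturationLevel)
{
    float avg = (color.r + color.g + color.b) / 3.0f;
    v3 result;
    result.r = avg + saturationLevel * (color.r - avg);
    result.g = avg + saturationLevel * (color.g - avg);
    result.b = avg + saturationLevel * (color.b - avg);
    return result;
}
```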
- Calc the reflection vector: -e + 2 * Inner(e, N) * N
- Modify `SampleEnvironmentMap`
  - Take the reflection vector as input
  - Define distanceFromMapInZ, let's say it's 1.0f
  - Define uvsPerMeter, let's say it's 0.01f
  - Calculate the point in the environment map and get color from it
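
The reflection formula on its own can be sketched as below. It assumes `e` points from the surface toward the eye and `N` is a unit normal; `Reflect` is a hypothetical name, and the environment map lookup itself is not shown.

```c
typedef struct { float x, y, z; } v3;

static float Inner(v3 a, v3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Mirrors e about the normal N: -e + 2 * Inner(e, N) * N.
static v3 Reflect(v3 e, v3 N)
{
    float k = 2.0f * Inner(e, N);
    v3 result = { -e.x + k * N.x, -e.y + k * N.y, -e.z + k * N.z };
    return result;
}
```

Looking straight down the normal bounces straight back, while a tilted eye vector keeps its component along the normal and flips the component across it.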
This is a math day. Casey explains matrices and other stuff of linear algebra.
Speaking about linear algebra, I highly recommend the book Linear Algebra by Jim Hefferon. It's freely available and totally accessible.
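Inverse of a rotation matrix:

X and Y are perpendicular unit vectors.

$$
R = \begin{bmatrix} X_x & Y_x \\ X_y & Y_y \end{bmatrix}
$$

$$
R^{-1} = R' = \begin{bmatrix} X_x & X_y \\ Y_x & Y_y \end{bmatrix}
$$

Based on this fact:

$$
\begin{aligned}
X_x X_x + X_y X_y &= \mathrm{Inner}(X, X) = 1 \\
X_x Y_x + X_y Y_y &= \mathrm{Inner}(X, Y) = 0 \\
Y_x X_x + Y_y X_y &= \mathrm{Inner}(X, Y) = 0 \\
Y_x Y_x + Y_y Y_y &= \mathrm{Inner}(Y, Y) = 1
\end{aligned}
$$

So

$$
R' R = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
$$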
Because normals are perpendicular to vectors, they are affected in a perpendicular way by any transforms we do.
- Rotate the normal
- Document `SampleEnvironmentMap` function
- Paint the LOD to debug SampleEnvironmentMap
In a 2D perspective, things are intentionally wrong, because the art wants them to be different.
- Fix `DrawRectangleSlowly`
  - Calculate the correct screenSpaceUV
- Add `z` in environment_map
- Set `z` for environment maps
- There are two types of cards:
  - lying-down card
  - standing-up card
- Define `MakeSphereDiffuseMap`
- TODO: The mechanism of lighting is very confusing to me now; I need to review it later.
- Switch to Y-up render targets. I don't need to do anything because I did this long ago.
- Pull out render api
  - Remove `PushPiece`
  - Make alignment baked in the bitmap
  - Remove `entityZC`
  - Unify v2 offset and offsetZ into v3 offset
  - `PushBitmap` should accept a v4 color
- Use `DrawRectangleSlowly` to render bitmap so we can scale
- Store `zOffset` in game state and control it using action up/down buttons
- Scale the position and size based on Z
- Remove y offset caused by z
- Z slices are what control the scaling of things, whereas Z offsets inside a slice are what control Y offsetting
- Remove `zOffset` in game_state
- Do not preserve the offset z of cameraP
- Add `globalAlpha` to render group
- Fade entities based on their z value
  - Define `fadeTopStartZ`, `fadeTopEndZ`, `fadeBottomStartZ` and `fadeBottomEndZ`
  - Define `Clamp01MapToRange`
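
A plausible shape for `Clamp01MapToRange` and how it could drive the fade is sketched below. The argument order `(min, t, max)` and the `TopFadeAlpha` helper are assumptions for this example, not confirmed details of the original code.

```c
static float Clamp01(float value)
{
    if (value < 0.0f) return 0.0f;
    if (value > 1.0f) return 1.0f;
    return value;
}

// Maps t from [min, max] onto [0, 1] and clamps anything outside the range.
static float Clamp01MapToRange(float min, float t, float max)
{
    float result = 0.0f;
    float range = max - min;
    if (range != 0.0f) {
        result = Clamp01((t - min) / range);
    }
    return result;
}

// Hypothetical use: fully opaque below fadeTopStartZ,
// fully transparent above fadeTopEndZ.
static float TopFadeAlpha(float z, float fadeTopStartZ, float fadeTopEndZ)
{
    return 1.0f - Clamp01MapToRange(fadeTopStartZ, z, fadeTopEndZ);
}
```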
- Modify `cameraBounds`
- Change `GetEntityRenderBasisResult` function to implement proper perspective projection
  - the core formula: p' = (d * p) / (Cz - Pz)
  - NOTE: the `cameraP` in game_state is where we are looking at, not the actual camera position
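
The core formula can be tried out in isolation. In this sketch I assume `d` is the focal length, `Cz` is the camera's height above the plane we are looking at and `Pz` is the point's height; `Project` is a hypothetical name.

```c
typedef struct { float x, y; } v2;

// Applies p' = (d * p) / (Cz - Pz).
// Returns 0 when the point is at or behind the camera, 1 otherwise.
static int Project(v2 p, float pz, float focalLength, float cameraZ, v2 *projected)
{
    float distanceToP = cameraZ - pz;
    if (distanceToP <= 0.0f) return 0;
    projected->x = focalLength * p.x / distanceToP;
    projected->y = focalLength * p.y / distanceToP;
    return 1;
}
```

Points closer to the camera (larger `Pz`) get a smaller divisor and therefore land further from the center, which is the scaling effect we want from Z.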
- Change `align` to `alignPercentage`
- Add `widthOverHeight` to loaded_bitmap
- Remove `metersToPixels` in game_state
- `PushBitmap` should take a `height` param
  - Add `size` to render_entry_bitmap
  - Add `screenDim` param to `GetEntityRenderBasisResult`
- Define `GetCameraRectangleAtDistance`
  - Define `Unproject`
- Add `MetersToPixels` to render group; it converts meters on the monitor into pixels on the monitor
- Use `PushRectOutline` to verify that GetCameraRectangleAtDistance returns the correct value
- Define `render_group_camera`
  - Add `gameCamera` and `renderCamera` to render group
  - Now we can see the big picture of our game world
- Reenable ground buffer code
  - Make ground the same size as the chunk
  - Define another LoadBMP call to set a default center align
  - Use meters in `FillGroundBuffer`
This is a blackboard day. No code involved.
- SIMD is everywhere
- Modern CPUs are heavily out of order
- Casey explains the difference between latency and throughput
  - In most cases, we only care about throughput, not latency.
- Basic process of making things run quickly
  - Gather statistics
    - where it is slow
    - what its characteristics are
  - Make an estimation
  - Analyze "efficiency" and "performance"
    - "efficiency" is about how much work we have to do
    - "performance" is about how to make the CPU do the work
  - Start coding
- Define `BEGIN_TIMED_BLOCK` and `END_TIMED_BLOCK` macros to track performance
- Define `debug_cycle_counter` struct to store counters
- Define `HandleDebugCounters` to display counters
- Copy `DrawRectangleSlowly` to `DrawRectangleHopefullyQuickly`
- Flatten `DrawRectangleHopefullyQuickly`
- Think about the question: What is our "wide" strategy?
- SOA (Struct of Arrays) vs AOS (Array of Structs)
  - C makes AOS really easy
  - but SIMD needs SOA
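
The difference between the two layouts is easy to see with four pixels. This is only an illustration in plain C with hypothetical type names; the loop in `ScaleRed` has the shape that a single 4-wide SSE multiply such as `_mm_mul_ps` would replace.

```c
// AOS: one struct per pixel, the layout C makes natural.
typedef struct { float r, g, b, a; } pixel_aos;

// SOA: one array per channel, so four reds sit next to each other in memory.
typedef struct {
    float r[4];
    float g[4];
    float b[4];
    float a[4];
} pixel4_soa;

// Transposes four AOS pixels into the SOA layout.
static pixel4_soa ToSOA(const pixel_aos in[4])
{
    pixel4_soa out;
    for (int i = 0; i < 4; ++i) {
        out.r[i] = in[i].r;
        out.g[i] = in[i].g;
        out.b[i] = in[i].b;
        out.a[i] = in[i].a;
    }
    return out;
}

// One operation applied to the same channel of four pixels at once.
static void ScaleRed(pixel4_soa *pixels, float scale)
{
    for (int i = 0; i < 4; ++i) {
        pixels->r[i] *= scale;
    }
}
```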
- We are targeting SSE and SSE2
- Convert `FillRectangleHopefullyQuickly` to operate on 4 pixels at a time
- RESOURCE: Numerical Methods That Work
- RESOURCE: What Every Computer Scientist Should Know About Floating-Point Arithmetic
- Define `END_TIMED_BLOCK_COUNTED`
- SIMDify `DrawRectangleHopefullyQuickly`
  - `_mm_mul_ps`
  - `_mm_add_ps`
  - `_mm_sub_ps`
  - `_mm_sqrt_ps`
  - `_mm_max_ps`
  - `_mm_min_ps`
  - before SIMDifying: about 135 cycles; after SIMDifying: about 100 cycles
- Casey accidentally showed a performance boost by just removing two inline functions
- Write pixels using SIMD
  - Use structure art technique to verify that our unpack code is correct
  - Be careful about alignment
  - Used SIMD intrinsics
    - `_mm_unpacklo_epi32`
    - `_mm_castps_si128`
    - `_mm_cvttps_epi32`
    - `_mm_or_si128`
    - `_mm_slli_epi32`
    - `_mm_storeu_si128`
- Convert load code to SIMD except for texture fetching
  - Used SIMD intrinsics
    - `_mm_cvtps_epi32` will round to nearest by default
    - `_mm_cvtepi32_ps`
    - `_mm_andnot_si128`
    - `_mm_loadu_si128`
    - `_mm_srli_epi32`
    - `_mm_cmpge_ps`
    - `_mm_cmple_ps`
    - `_mm_movemask_epi8`
- Change the way of calculating px
- Count intrinsics by substituting intrinsics with macros (I skipped this)
  - `_mm_sqrt_ps` does not hurt us too badly, about 5 cycles
- Casey demonstrates how to use IACA to profile our program
- We don't have to convert color to 0-1 first. We can do operations in 0-255 space, which saves a bunch of mul ops.
- Make `texture->memory` and `texture->pitch` local variables (This doesn't work for me)
- Use `_mm_rsqrt_ps` instead of `_mm_sqrt_ps` (This doesn't work for me either)
This was a really long day, 4 hours in total.
- The IACA analyser does not take loops into account, so we need to manually unroll the loop
- `_mm_setr_ps`: use memory order rather than register order
- `_mm_mul_epi32` will produce two 64-bit values, so we need to use `_mm_mullo_epi32`
- Define `rectangle2i` and `Union`, `Intersect` functions
  - As with `rectangle2`, `max` is not included, so we need to change `DrawRectangleQuickly`
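
An integer rectangle with an exclusive `max` and the two set operations can be sketched as follows. The field names and the `HasArea` helper are assumptions for this example.

```c
#include <stdint.h>

// min is inclusive, max is exclusive: the pixels covered are
// minX <= x < maxX and minY <= y < maxY.
typedef struct { int32_t minX, minY, maxX, maxY; } rectangle2i;

static int32_t MinI(int32_t a, int32_t b) { return (a < b) ? a : b; }
static int32_t MaxI(int32_t a, int32_t b) { return (a > b) ? a : b; }

// Largest rectangle contained in both a and b (may have no area).
static rectangle2i Intersect(rectangle2i a, rectangle2i b)
{
    rectangle2i result = {
        MaxI(a.minX, b.minX), MaxI(a.minY, b.minY),
        MinI(a.maxX, b.maxX), MinI(a.maxY, b.maxY)
    };
    return result;
}

// Smallest rectangle containing both a and b.
static rectangle2i Union(rectangle2i a, rectangle2i b)
{
    rectangle2i result = {
        MinI(a.minX, b.minX), MinI(a.minY, b.minY),
        MaxI(a.maxX, b.maxX), MaxI(a.maxY, b.maxY)
    };
    return result;
}

static int HasArea(rectangle2i r)
{
    return (r.minX < r.maxX) && (r.minY < r.maxY);
}
```

With an exclusive `max`, two tiles that share an edge intersect in a rectangle with no area, so neighbouring tiles never draw the same pixel.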
- Make `DrawRectangleQuickly` draw on even lines or odd lines
- `DrawRectangleQuickly` takes a clip rect
  - Use `clipMask` to ensure drawing inside clip rect
- Define `InvertedInfinityRectangle`
- Define `TiledRenderGroupToOutput`
- NOTE: `_mm_mullo_epi32` belongs to SSE4.1
This is a blackboard day, no code involved.
- Casey talks about processes, threads, hyperthreading and all that kind of stuff
- Casey demonstrates how to use the `CreateThread` Windows API
  - `CloseHandle` will not terminate the thread; it just releases the handle back to the OS.
- x64 provides special instructions to help us write multithreaded programs
  - interlocked compare and exchange
- Casey demonstrates the unsafe multithreaded code
- We need a way to tell the compiler and the processor not to reorder things
  - `_WriteBarrier` for the compiler
  - `_mm_sfence()` for the processor
- Use `volatile` to tell the compiler some variable may be changed without its local knowledge
- Use `InterlockedIncrement` to safely modify our variable
- Use semaphores to implement a basic multithreaded work queue
  - `CreateSemaphoreEx`
  - `WaitForSingleObjectEx`
  - `ReleaseSemaphore`
- Build a single producer multiple consumer queue system
  - data structure:
    - `platform_work_queue`
    - `platform_work_queue_entry`
    - `platform_work_queue_callback`
  - `PlatformAddEntry`
  - `PlatformCompleteAllWork`
  - `Win32ProcessNextEntry`
  - `InterlockedCompareExchange`
  - Make it circular so we don't worry about wrapping
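
The circular queue can be sketched in portable C. Note the swap: this example uses C11 `<stdatomic.h>` in place of the Win32 `Interlocked*` functions, leaves out the semaphore that puts idle workers to sleep, and uses shortened names (`AddEntry`, `ProcessNextEntry`, `CompleteAllWork`, `AddToTotal`) that are hypothetical. It shows the shape of the idea and is not a hardened lock-free queue.

```c
#include <stdatomic.h>

#define QUEUE_CAPACITY 8

typedef void platform_work_queue_callback(void *data);

typedef struct {
    platform_work_queue_callback *callback;
    void *data;
} platform_work_queue_entry;

typedef struct {
    atomic_uint completionGoal;    // entries added so far
    atomic_uint completionCount;   // entries finished so far
    atomic_uint nextEntryToWrite;  // only the producer advances this
    atomic_uint nextEntryToRead;   // consumers race to advance this
    platform_work_queue_entry entries[QUEUE_CAPACITY];
} platform_work_queue;

// Single producer. Returns 0 when the ring is full.
static int AddEntry(platform_work_queue *queue, platform_work_queue_callback *callback, void *data)
{
    unsigned write = atomic_load(&queue->nextEntryToWrite);
    unsigned newWrite = (write + 1) % QUEUE_CAPACITY;
    if (newWrite == atomic_load(&queue->nextEntryToRead)) return 0;
    queue->entries[write].callback = callback;
    queue->entries[write].data = data;
    atomic_fetch_add(&queue->completionGoal, 1);
    // The atomic store publishes the entry to the consumers.
    atomic_store(&queue->nextEntryToWrite, newWrite);
    return 1;
}

// Any thread. Returns 0 when the queue looked empty, 1 otherwise.
static int ProcessNextEntry(platform_work_queue *queue)
{
    unsigned read = atomic_load(&queue->nextEntryToRead);
    if (read == atomic_load(&queue->nextEntryToWrite)) return 0;
    unsigned newRead = (read + 1) % QUEUE_CAPACITY;
    platform_work_queue_entry entry = queue->entries[read];
    // Compare-and-exchange: only one consumer wins the right to run this entry.
    if (atomic_compare_exchange_strong(&queue->nextEntryToRead, &read, newRead)) {
        entry.callback(entry.data);
        atomic_fetch_add(&queue->completionCount, 1);
    }
    return 1;
}

// The producer helps with the work until everything has completed.
static void CompleteAllWork(platform_work_queue *queue)
{
    while (atomic_load(&queue->completionGoal) != atomic_load(&queue->completionCount)) {
        ProcessNextEntry(queue);
    }
    atomic_store(&queue->completionGoal, 0);
    atomic_store(&queue->completionCount, 0);
}

// Demo callback for trying the queue out.
static void AddToTotal(void *data)
{
    *(int *)data += 1;
}
```

Worker threads would simply call `ProcessNextEntry` in a loop and wait on a semaphore whenever it reports an empty queue.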
- We can use `GetCurrentThreadId` to get the current thread id. This is for testing.
- Render with multithreading
- In x64, there is no need to use `_mm_sfence` because writes are always ordered.
  - `_mm_sfence` is only necessary when you are writing to something like write-combining memory, which may reorder things for you
- Assert that outputTarget's memory is aligned to 16 bytes in `TiledRenderGroupToOutput`
- Modify `DrawRectangleQuickly` to support memory aligning
  - Introduce `startClipMask` and `endClipMask` and set them correctly
  - Use `_mm_load_si128` and `_mm_store_si128` instead of the unaligned versions
