Game Engines as Simulation Infrastructure: Foundations, Evolution, and Design Space

Game engines are software frameworks that encode specific assumptions about what simulated worlds are and how they behave—separating reusable infrastructure from game-specific content while embedding philosophical commitments about space, time, entities, and causality. Understanding game engines requires grasping not just their technical architecture but why the “engine” metaphor emerged, what problems each generation solved, and how architectural choices constrain and enable what designers can express. This foundation is essential before connecting game engines to other domains of agent-based simulation.

What distinguishes an engine from a framework

The term “game engine” emerged in the mid-1990s during id Software’s Doom and Quake era, when developers began licensing “core portions of the software” separately from game-specific content. Earlier 1980s systems—Sierra’s Adventure Game Interpreter, LucasArts’ SCUMM, Incentive Software’s Freescape—are now recognized as proto-engines, but the conceptual separation crystallized only when reuse became economically significant.

A useful hierarchy clarifies the boundaries: libraries are focused collections of reusable code for single domains (FMOD for audio, Box2D for physics); frameworks assemble multiple libraries with conventions for game creation (SFML, MonoGame); engines add two distinguishing features—a scene graph (the data structure holding the world, with functions for managing, querying, and persisting it) and a world editor (the visual authoring environment). This distinction reflects a fundamental philosophy: frameworks let you build anything, while engines assume a particular model of what a “game world” is. Engines expect developers to adapt to their assumptions rather than the reverse.

The “engine” metaphor borrows from industrial machinery to suggest separation of power from purpose—just as a car engine provides motive force independent of vehicle purpose, a game engine provides simulation capability independent of specific content. But the metaphor misleads in one crucial respect: car engines are relatively interchangeable, while game engines encode deep ontological commitments about what constitutes a world. A more accurate metaphor might be the theater stage: it provides space, lighting rigs, trapdoors, and backstage machinery that enable certain performances while making others impractical. The stage isn’t neutral infrastructure—it embodies assumptions about what theater is.

From hardcoded games to reusable technology

Before “game engines” existed as a concept, each game was built as a single monolithic program, tightly coupled to specific hardware. On the Atari 2600, developers wrote hand-optimized display routines called “kernels” that interfaced directly with the hardware. Memory constraints, often just a few kilobytes, left little room for modular design. Code was written in 6502 assembly, and the rapid advance of arcade hardware meant most code was thrown out after each project.

The transition to reusable engines was driven by technical complexity and cost. id Software’s progression from Hovertank 3D (1991) through Catacomb 3-D (1991) and Wolfenstein 3D (1992) to Doom (1993) illustrates the shift. Wolfenstein 3D was developed in four months, and its engine was reused for Spear of Destiny, which shipped about four months later. This demonstrated that 3D graphics required specialized expertise that could be amortized across projects. Doom added BSP trees for efficient rendering, sector-based lighting, texture mapping, and peer-to-peer networking; id also designed it explicitly for modding and licensed the engine to Raven Software for Heretic and Hexen.

The licensable engine era (1990s-2000s) was defined by id Tech and Unreal Engine. Tim Sweeney’s Unreal Engine, begun in 1995, was licensed two years before Unreal shipped, and those license fees kept Epic Games afloat during development. Unreal cost approximately $3 million to produce, most of which went to engine development rather than content. The licensing model ($250,000-$350,000 upfront plus 5-7% royalties) solved multiple problems: it spread development costs across licensees, provided proven technology with known capabilities, and created a portable skill set as programmers could learn one engine and move between studios. Valve’s licensing of the Quake engine for Half-Life created GoldSrc, whose DNA persists in modern Source engines; Infinity Ward’s license of id Tech 3 became the foundation for Call of Duty.

Middleware fragmentation (2000s) emerged as games grew too complex for any single studio to build everything. Havok Physics, founded in 1998, provided real-time collision detection and ragdoll physics across platforms—it powered Half-Life 2, Halo 2, and eventually Breath of the Wild. Audio middleware like FMOD and Wwise enabled adaptive music and 3D spatialization that built-in audio systems couldn’t achieve. RenderWare simplified asset integration until EA acquired it in 2004 and stopped third-party licensing, forcing Rockstar to develop RAGE and leaving smaller studios stranded. The lesson: middleware dependency creates vendor lock-in risk.

The democratization wave (2005-present) lowered economic barriers. Unity, founded in 2004 by David Helgason, Nicholas Francis, and Joachim Ante, launched in June 2005 at Apple’s WWDC. Initially Mac-only at $1,500 for a professional license (compared to hundreds of thousands for Unreal), Unity added iPhone support in 2008, the same year the App Store launched. By 2020, approximately half of all mobile games and 60% of AR/VR content were built with Unity. Godot, developed internally since 2001 and released open-source in 2014 under MIT license, provided a fully free alternative. Unreal responded by moving to free-to-use with 5% revenue share above thresholds.

Each transition solved specific problems: the move to reusable engines addressed how studios could create 3D games without spending years on technology; middleware addressed how studios could achieve AAA-quality physics and audio without building everything; democratization addressed how small teams and individuals could create professional-quality games. The trajectory has consistently moved toward lowering barriers—from a few dozen studios capable of making 3D games in 1993 to millions of developers using Unity and Unreal today.

Core components and how they interact

A game engine comprises multiple interconnected subsystems orchestrated by the game loop—the heartbeat of any interactive engine that processes input, updates game state (physics, AI, logic), renders the scene, and processes audio in sequence each frame. The rendering pipeline transforms mathematical representations into visual output through graphics APIs like DirectX, Vulkan, or OpenGL. Physics simulation handles collision detection, rigid body dynamics, and constraint solving. Input handling abstracts device-specific input into game-meaningful events. Audio systems manage spatial sound, dynamic mixing, and streaming. Scripting systems provide flexibility between the engine core and game-specific behavior. Asset management handles loading, caching, and streaming of resources. Scene management organizes the spatial representation of game world objects.
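The per-frame ordering described above can be sketched as a loop that calls each phase in turn. This is a minimal illustration under stated assumptions, not any particular engine's loop: the `Engine` class, its phase methods, and `run_frames` are hypothetical names introduced here, and each phase only records that it ran.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    """Hypothetical engine that records which phase ran on which frame."""
    log: list = field(default_factory=list)
    frame: int = 0

    def process_input(self):
        self.log.append((self.frame, "input"))

    def update(self):
        # Game state update: physics, AI, and gameplay logic would run here.
        self.log.append((self.frame, "update"))

    def render(self):
        self.log.append((self.frame, "render"))

    def process_audio(self):
        self.log.append((self.frame, "audio"))

    def run_frames(self, count):
        """Run a fixed number of frames; a real loop runs until the game exits."""
        for _ in range(count):
            self.process_input()
            self.update()
            self.render()
            self.process_audio()
            self.frame += 1


engine = Engine()
engine.run_frames(2)
```

After two frames, `engine.log` holds eight entries, four per frame, in the order input, update, render, audio.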

The scene graph is a hierarchical data structure—typically a tree—that organizes objects in a scene with transformations cascading from parent to child. If a robot arm rotates, all children (hand, attached tools) rotate accordingly. This enables skeletal animation, vehicle parts, and complex assemblies. However, traditional scene graphs have fundamental limitations: tree traversal causes cache misses as nodes are scattered in memory, deep recursive traversals are hard to parallelize, and complexity can spiral as scene graphs try to handle too many responsibilities.
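The cascading of transformations from parent to child can be sketched with a small tree of 2D nodes. This is an illustrative sketch only: the `Node` class and `world_positions` function are hypothetical, and real engines typically use matrices or quaternions and cache world transforms rather than recomputing them on every walk.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical scene-graph node with a 2D local transform."""
    name: str
    x: float = 0.0          # local offset from the parent
    y: float = 0.0
    rotation: float = 0.0   # local rotation in radians
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child


def world_positions(node, px=0.0, py=0.0, prot=0.0, out=None):
    """Walk the tree, composing each local transform with the parent's."""
    if out is None:
        out = {}
    # Rotate the local offset by the parent's world rotation, then translate.
    wx = px + node.x * math.cos(prot) - node.y * math.sin(prot)
    wy = py + node.x * math.sin(prot) + node.y * math.cos(prot)
    out[node.name] = (wx, wy)
    for child in node.children:
        world_positions(child, wx, wy, prot + node.rotation, out)
    return out


arm = Node("arm")
hand = arm.add(Node("hand", x=2.0))
tool = hand.add(Node("tool", x=1.0))

before = world_positions(arm)
arm.rotation = math.pi / 2   # rotate only the parent
after = world_positions(arm)
```

Only `arm.rotation` changes, yet the hand and tool both swing from the x axis to the y axis, because their world positions are derived from the parent chain. The recursive walk over separately allocated nodes is also the access pattern behind the cache-miss and parallelization limits noted above.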

Entity-Component-System (ECS) emerged as an alternative architecture separating identity (entities), data (components), and behavior (systems). Entities are lightweight identifiers—often just 32-bit integers. Components are pure data structures containing no behavior: Position {float x, y, z}, Velocity {float vx, vy, vz}. Systems are logic that operates on entities with specific component combinations. This solves the diamond problem of OOP inheritance (is a FlyingEnemy a type of Enemy or FlyingObject? ECS simply attaches both components), enables cache-efficient iteration through contiguous component storage, and makes parallelization straightforward since systems process independent data arrays.
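The separation of identity, data, and behavior can be sketched in a few lines. This is a minimal sketch, not a production ECS: the `World` class, its `query` method, and `movement_system` are hypothetical names, and the dictionary storage used here shows the separation of concerns rather than the contiguous layout that gives real ECS implementations their cache efficiency.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Position:
    x: float
    y: float
    z: float


@dataclass
class Velocity:
    vx: float
    vy: float
    vz: float


class World:
    """Hypothetical minimal ECS world: one store per component type."""

    def __init__(self):
        self._next_id = count()
        self.components = {}   # component type -> {entity id -> component}

    def create_entity(self, *components):
        entity = next(self._next_id)   # an entity is just an integer
        for component in components:
            self.components.setdefault(type(component), {})[entity] = component
        return entity

    def query(self, *types):
        """Yield (entity, components...) for entities that have every type."""
        stores = [self.components.get(t, {}) for t in types]
        if not stores:
            return
        for entity in sorted(set(stores[0]).intersection(*stores[1:])):
            yield (entity, *(store[entity] for store in stores))


def movement_system(world, dt):
    """System: behavior lives here, not on the components."""
    for _, pos, vel in world.query(Position, Velocity):
        pos.x += vel.vx * dt
        pos.y += vel.vy * dt
        pos.z += vel.vz * dt


world = World()
mover = world.create_entity(Position(0.0, 0.0, 0.0), Velocity(1.0, 2.0, 0.0))
scenery = world.create_entity(Position(5.0, 5.0, 5.0))   # no Velocity
movement_system(world, 0.5)
```

The system moves only the entity that has both components; the scenery entity is skipped without any class hierarchy deciding what it "is".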

Data-oriented design (DOD) optimizes for CPU cache utilization by reorganizing data layout. Modern CPUs access L1 cache in ~0.5 nanoseconds but main memory in ~50+ nanoseconds, so a cache miss costs 100+ clock cycles. The traditional object-oriented “array of structures” layout interleaves every field of each object, and when objects are allocated individually it scatters them across the heap, so a loop that needs one field pulls unrelated data through the cache. The data-oriented “structure of arrays” layout stores values of the same type contiguously, enabling better cache utilization and SIMD vectorization. Unity’s DOTS transition demonstrated the impact: processes that took an hour were reportedly reduced to 100 milliseconds after implementing data-oriented approaches.
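The two layouts can be contrasted in a short sketch. Python hides real memory layout, so this shows the shape of the data rather than a benchmark: a list of `Particle` objects stands in for array-of-structures, and the standard library's `array` module, which stores C doubles in one contiguous buffer, stands in for structure-of-arrays. The class and function names are hypothetical.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Particle:
    """Array-of-structures element: each instance is its own heap object."""
    x: float
    vx: float


class ParticlesSoA:
    """Structure of arrays: one contiguous buffer per field."""

    def __init__(self, xs, vxs):
        self.x = array("d", xs)
        self.vx = array("d", vxs)


def step_aos(particles, dt):
    for p in particles:            # follows a reference to each object
        p.x += p.vx * dt


def step_soa(particles, dt):
    x, vx = particles.x, particles.vx
    for i in range(len(x)):        # walks two contiguous buffers in order
        x[i] += vx[i] * dt


aos = [Particle(float(i), 1.0) for i in range(4)]
soa = ParticlesSoA([p.x for p in aos], [p.vx for p in aos])
step_aos(aos, 0.5)
step_soa(soa, 0.5)
```

Both versions compute the same positions. In a compiled language, assuming a 64-byte cache line, one line fetched from the `x` buffer would hold eight consecutive 8-byte values, which is where the layout's advantage comes from.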

How Unity, Unreal, and Godot made different choices

Unity’s traditional architecture centers on GameObjects as containers holding MonoBehaviour components. This enables intuitive drag-and-drop composition, rapid prototyping, and inspector-driven development accessible to non-programmers. But it constrains performance: memory layout is scattered with each component a separate heap allocation, cache misses are common when iterating over many objects, and each Update() call is dispatched individually from native engine code into managed code, which adds per-object overhead. Unity pursued DOTS (Data-Oriented Technology Stack) to address these limitations, grouping entities into archetypes stored in 16KB memory chunks for cache-friendly iteration. C# provides memory safety and rich tooling but incurs garbage collection pauses; the Burst compiler closes the performance gap by compiling C# to optimized native code via LLVM.
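The archetype idea can be sketched as grouping entities by their exact set of component types, then estimating how many entities of one archetype fit in a 16KB chunk. This is a back-of-the-envelope sketch and not Unity's actual layout: the component sizes are assumed, and chunk headers, alignment, and entity ids are ignored.

```python
CHUNK_BYTES = 16 * 1024   # chunk size quoted in the text

# Assumed sizes in bytes: Position and Velocity as three 32-bit floats each,
# Health as a single 32-bit value. These figures are illustrative only.
COMPONENT_SIZES = {"Position": 12, "Velocity": 12, "Health": 4}


def archetype_of(component_names):
    """An archetype is the set of component types, independent of order."""
    return frozenset(component_names)


def chunk_capacity(archetype):
    """Entities per chunk, ignoring headers, alignment, and entity ids."""
    return CHUNK_BYTES // sum(COMPONENT_SIZES[name] for name in archetype)


def group_by_archetype(entities):
    """Map each archetype to the entity ids sharing exactly that set."""
    groups = {}
    for entity_id, names in entities.items():
        groups.setdefault(archetype_of(names), []).append(entity_id)
    return groups


entities = {
    0: ["Position", "Velocity"],
    1: ["Velocity", "Position"],            # same archetype as entity 0
    2: ["Position", "Velocity", "Health"],  # different archetype
}
groups = group_by_archetype(entities)
moving = archetype_of(["Position", "Velocity"])
capacity = chunk_capacity(moving)   # 16384 // 24
```

Under these assumptions a chunk holds 682 Position-plus-Velocity entities, so a system iterating that archetype touches memory in a few large contiguous runs instead of one allocation per object.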

Unreal’s architecture uses Actors as full-fledged objects with built-in replication, lifecycle management, and spawning logic. The Gameplay Framework provides pre-built classes: GameModeBase for rules, PlayerController for input, Pawn/Character for controlled entities. Blueprints visual scripting enables designers to implement gameplay without coding through a node-based system that compiles to bytecode—but with ~10x performance overhead versus native C++ and no multithreading support. Unreal’s custom reflection system (since C++ lacks native introspection) parses UCLASS/UPROPERTY/UFUNCTION macros through the Unreal Header Tool, enabling automatic serialization, garbage collection, Blueprint integration, and network replication. The optimal workflow builds core systems in C++, exposing functionality to Blueprints for designer customization.

Godot’s architecture differs fundamentally: everything is a Node organized into a Scene Tree, with scenes as reusable compositions that can be instanced within other scenes. Rather than components attached to entities, Godot uses inheritance hierarchies of node types (Node → CanvasItem → Node2D → Sprite2D). GDScript was created after testing Lua, Python, and Squirrel because “the language and interpreter for GDScript ended up being smaller than the binding code itself for Lua and Squirrel.” Godot is open source under the MIT license, so there is no vendor lock-in and design decisions are documented publicly; its editor is itself built with the engine, using the same UI toolkit.

Studios build custom engines when specialized requirements justify the investment. id Software targets consistent 60fps+ at high fidelity with genre-specific optimizations for FPS games. DICE built Frostbite for large-scale multiplayer with destructible environments—but when BioWare was forced to use it for Dragon Age: Inquisition, “introducing foundational RPG components into the engine turned out to be a more extended endeavor than anticipated… at launch we still didn’t actually have all our tools working.” Custom engines maximize performance for target use cases but narrow applicable scope.

The tradeoff space engines have explored

The fundamental tradeoffs form an interconnected design space where no single architecture is universally superior:

Performance versus flexibility creates a spectrum from custom C++ engines (highest performance, most specialized) through Unreal C++ and Unity DOTS (very high performance, moderate flexibility) to Unity traditional and Godot GDScript (moderate performance, high flexibility). Abstraction adds overhead but enables rapid development; data-oriented design maximizes performance but reduces OOP flexibility.

Ease of use versus power manifests most clearly in scripting approaches. Visual scripting (Blueprints, Godot’s former VisualScript) offers accessibility to non-programmers and immediate visual feedback but hits performance ceilings and scales poorly to complex systems. Text-based scripting (GDScript, C#) provides more expressiveness for complex logic and better tooling but has a steeper learning curve. Native code (C++, Rust) delivers maximum performance but demands explicit control over memory and resource lifetimes and imposes longer iteration cycles.

General purpose versus specialized engines make different tradeoffs. Unity, Unreal, and Godot support wide genre ranges through large asset stores and communities but compromise for versatility—“UE5’s software Lumen basically exists because the engine has to be a jack of all trades.” Specialized engines like id Tech and Frostbite achieve higher performance ceilings for target use cases but have narrower applicable scope.

Open versus closed source carries profound implications. Open source (Godot) enables full engine modification, debugging with full source access, and code auditability—but relies on community-driven support and feature prioritization. Closed source (Unity, Unreal) provides professional support and business-prioritized features but creates vendor dependency and can change terms unexpectedly, as Unity’s 2023 pricing controversy demonstrated.

Memory management approaches range from garbage collection (Unity traditional, and Unreal for its reflected UObject types; automatic reclamation but unpredictable pauses) through reference counting (Godot, less pause-prone) to manual management with custom allocators (Unreal’s other C++ code, DOTS, custom engines; maximum control but maximum responsibility). Game engines commonly implement specialized allocators: stack allocators for per-frame temporary data with O(1) allocation and reset; pool allocators for same-type objects like bullets and particles; double-buffered allocators for previous-frame data availability.
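The stack and pool allocators mentioned above can be sketched over preallocated storage. Python manages its own memory, so this only models the bookkeeping: `StackAllocator` and `Pool` are hypothetical classes, and a native implementation would hand out raw pointers and handle alignment.

```python
class StackAllocator:
    """Hypothetical bump allocator over one preallocated buffer."""

    def __init__(self, size):
        self.buffer = bytearray(size)
        self.top = 0

    def alloc(self, nbytes):
        """O(1): hand out the next nbytes as a view and bump the offset."""
        if self.top + nbytes > len(self.buffer):
            raise MemoryError("frame allocator exhausted")
        start = self.top
        self.top += nbytes
        return memoryview(self.buffer)[start:self.top]

    def reset(self):
        """O(1): release everything at once, typically at end of frame."""
        self.top = 0


class Pool:
    """Hypothetical fixed-capacity pool that recycles same-type objects."""

    def __init__(self, factory, capacity):
        self.free = [factory() for _ in range(capacity)]

    def acquire(self):
        if not self.free:
            raise MemoryError("pool exhausted")
        return self.free.pop()

    def release(self, obj):
        self.free.append(obj)


frame = StackAllocator(64)
scratch_a = frame.alloc(16)
scratch_b = frame.alloc(32)
used = frame.top            # bytes handed out this frame
frame.reset()               # next frame reuses the same buffer

bullets = Pool(dict, capacity=2)
first = bullets.acquire()
bullets.release(first)
again = bullets.acquire()   # the released object is handed out again
```

Neither structure allocates after construction, which is the point: per-frame cost is a pointer bump or a list pop, and there is nothing for a garbage collector to trace.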

Determinism versus convenience matters critically for networking, replays, and debugging. ECS architectures enable determinism through predictable execution order and separation of simulation from rendering. Requirements include fixed-point math (floating point varies across platforms), deterministic random seeds, ordered data structures, and consistent execution order independent of frame rate. Lockstep networking sends only inputs and simulates identically everywhere; rollback networking predicts inputs and corrects when actual data arrives.
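The determinism requirements listed above can be sketched with integer fixed-point arithmetic and a seeded generator, so that two peers fed the same inputs reach the same state. This is a simplified sketch: `Simulation`, `to_fixed`, and `fixed_mul` are hypothetical, the 16.16 format is an arbitrary choice, and a real engine would specify its own PRNG algorithm instead of relying on a library's.

```python
import random

SCALE = 1 << 16   # 16.16 fixed point: integer-only arithmetic


def to_fixed(value):
    return int(round(value * SCALE))


def fixed_mul(a, b):
    return (a * b) >> 16


class Simulation:
    """Hypothetical lockstep peer: state changes only through step(move)."""

    def __init__(self, seed):
        self.rng = random.Random(seed)   # shared seed, private generator
        self.x = 0                       # position in fixed point
        self.tick = 0

    def step(self, move):
        speed = to_fixed(1.5)
        jitter = self.rng.randrange(-SCALE // 8, SCALE // 8)
        self.x += fixed_mul(to_fixed(move), speed) + jitter
        self.tick += 1

    def checksum(self):
        """Value peers can exchange to detect divergence."""
        return (self.tick, self.x)


inputs = [1, 0, -1, 1, 1]   # the only data sent over the network
peer_a, peer_b = Simulation(seed=42), Simulation(seed=42)
for move in inputs:
    peer_a.step(move)
    peer_b.step(move)
in_sync = peer_a.checksum() == peer_b.checksum()
```

Because each peer owns its generator and consumes it in the same order, the checksums match after every tick; comparing them periodically is a common way to detect a desync in lockstep designs.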

How architectural choices constrain and enable expression

Scripting boundaries embody the core tradeoff between iteration speed and performance. Fast iteration through scripting requires interpretation or JIT overhead; maximum performance requires compilation and longer cycles. Unity’s domain reloads serialize and deserialize entire state, slowing iteration in large projects (“with 2M+ lines of code, even months of work our biggest assembly still takes ~45s to compile”). Unreal offers Live Coding for C++; Unreal Engine 4 could also nativize Blueprints to C++ for shipping builds, a feature removed in Unreal Engine 5. Godot’s GDScript hot-reloads during play, and the @tool annotation lets scripts run inside the editor.

Cache coherence optimization requires fundamental reorganization. Data layouts must shift from array-of-structures to structure-of-arrays. Systems process data in batches rather than individual objects. OOP encapsulation is sacrificed—“the Particle class no longer controls its own active state.” Code becomes less intuitive but dramatically faster; a cache miss costs approximately 600 CPU cycles on some architectures, while contiguous data access enables predictable prefetching.

ECS changes what’s easy versus hard to express. Adding capabilities becomes trivial—attach a component without creating new class hierarchies. Dynamic behavior modification happens at runtime. Parallelization is straightforward since systems process independent arrays. But ECS suits entity-heavy simulations better than unique singleton objects or complex UI systems. The learning curve requires mental model shifts from object-oriented thinking.

The fixed timestep is philosophically significant. Variable timestep is simpler but creates non-determinism—the same inputs can produce different outcomes depending on frame rate. Fixed timestep with accumulator (the industry standard) enables determinism, physics stability under frame drops, and replay systems. This implies a commitment: the simulation exists independently of observation—the world ticks forward whether or not frames are rendered.
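The accumulator pattern can be sketched as follows. This is a minimal sketch under stated assumptions: the 60 Hz tick, the `advance` and `simulate` helpers, and the 0.25-second clamp on long frames are illustrative choices, not values taken from any particular engine.

```python
FIXED_DT = 1.0 / 60.0   # simulation step in seconds (an assumed 60 Hz tick)


def advance(accumulator, frame_time, step, max_frame_time=0.25):
    """Consume frame_time in fixed-size steps.

    Returns (new_accumulator, steps_run, alpha), where alpha is the
    fraction of a step left over, usable for render interpolation.
    """
    accumulator += min(frame_time, max_frame_time)  # clamp very long frames
    steps = 0
    while accumulator >= FIXED_DT:
        step(FIXED_DT)          # the simulation only ever sees FIXED_DT
        accumulator -= FIXED_DT
        steps += 1
    return accumulator, steps, accumulator / FIXED_DT


def simulate(frame_times):
    """Count simulation ticks for a sequence of rendered frame durations."""
    state = {"ticks": 0}

    def step(dt):
        state["ticks"] += 1

    accumulator = 0.0
    for frame_time in frame_times:
        accumulator, _, _ = advance(accumulator, frame_time, step)
    return state["ticks"]


ticks_at_60fps = simulate([FIXED_DT] * 60)        # one step per frame
ticks_at_30fps = simulate([2 * FIXED_DT] * 30)    # two steps per frame
ticks_at_120fps = simulate([FIXED_DT / 2] * 120)  # one step every other frame
```

One second of wall-clock time produces the same number of simulation ticks at all three frame rates, which is the sense in which the world advances independently of how often it is drawn.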

Engines as possibility spaces encoding world models

Research on game engine affordances reveals that engines become synonymous with certain genres, and creating games outside those genres is “at the least cumbersome if not outright impossible.” This isn’t a bug—it’s a design feature. Every engine encodes assumptions about space (2D or 3D? grid-based or continuous? bounded or infinite?), time (discrete turns or continuous? fixed or variable timestep?), entities (what constitutes a “thing”? how are things identified, composed, related?), and causality (what can affect what? how do events propagate?).

Typical engines embed specific metaphysical commitments:

On identity: Engines assume haecceity—each entity has a unique identifier distinguishing it from all others, even if property-identical
On persistence: Objects continue existing between observations; the world doesn’t collapse when not rendered
On causality: Effects follow causes in time-ordered sequence; systems process in deterministic order
On substance: ECS architectures assume no essences—entities are pure bundles of components; traditional OOP engines imply essential natures through classes

The productive paradox of game design is that constraints generate creativity. Breath of the Wild’s stamina meter constrains unlimited climbing, forcing strategic resource management that becomes core gameplay. Minecraft’s block-based construction “ironically ignited an unprecedented explosion of player creativity” precisely through geometric limitations. VVVVVV removed jumping entirely and restricted players to gravity-flipping, creating a “fiendishly difficult and utterly unique experience.” As one analysis notes: “Constraints are the crucible within which player ingenuity is forged.”

Games that push against engine assumptions illuminate what those assumptions are. Braid required custom time mechanics maintaining multiple parallel world states. Antichamber reimagined fundamental assumptions about spatial continuity through non-Euclidean geometry. Portal used physics engines designed for realistic simulation to create deliberately impossible spaces.

Conceptual vocabulary for simulation systems

Several concepts translate beyond video games to any simulation system—agent-based models, digital twins, robotics simulators, virtual environments:

The engine/content separation assumes that “what runs the world” is separable from “what the world contains”—a position not all simulation philosophies share. Scene graphs encode assumptions about hierarchical spatial relationships. Fixed timestep simulation enables determinism crucial for reproducibility but requires careful design. Composition over inheritance (the ECS ontology) favors defining things by their properties rather than their categories, aligning with process philosophy over substance metaphysics.

State persistence distinguishes engines from generators: a game engine maintains continuous world state across time, tracking player progress, inventory, and world changes. This memory function is as fundamental as the rendering function. Emerging AI “world models” generate new scenes independently without underlying memory structures—highlighting what traditional engines provide by contrast.

The conceptual vocabulary—abstraction layers, game loops, archetypes, affordances, constraints—applies wherever simulation systems must make analogous decisions about how to represent entities, time, space, and causation. Game engines have explored this design space for decades, accumulating hard-won knowledge about what tradeoffs matter and why.
