Replay System
The ConflictInterface replay system provides efficient bidirectional recording and playback of game state changes. It enables rewinding and fast-forwarding through game history with minimal storage overhead by storing only the differences (patches) between game states, using a segmented, versioned timeline format.
Table of Contents
- Architecture Overview
- Core Data Structures
- Key Algorithms
- Storage Format
- Hook System
- Performance Considerations
Architecture Overview
The replay system is organized around a timeline of replay segments stored in a compressed file, plus a high-level interface used for playback:
┌─────────────────────────────────────────────────────────────────────────────┐
│ ReplayTimeline (file) │
│ (Zstd-compressed container for multiple ReplaySegment intervals) │
├─────────────────────────────────────────────────────────────────────────────┤
│ ReplaySegment │
│ (Owns ReplayStorage, patch graph, and per-segment metadata) │
├─────────────────────────────────────────────────────────────────────────────┤
│ ReplayStorage │
│ (Chunked LZ4 layout: metadata, states, PathTree, PatchGraph, patches) │
├─────────────────────────────────────────────────────────────────────────────┤
│ PathTree / PathTreeNode │ PatchGraph / PatchGraphNode │
│ (Deduplicated JSON paths) │ (Temporal navigation between states) │
└──────────────────────────────────┴──────────────────────────────────────────┘
High-level APIs:
- ReplayBuilder → produces ReplayTimeline files from JSON responses.
- ReplayInterface → reads ReplayTimeline files and exposes a GameInterface-compatible API with time travel and hooks.
Component Responsibilities
| Component | Purpose |
|---|---|
| ReplayTimeline | Timeline-level container for segments, file header, and timeline metadata (zstd-compressed RPLYZSTD file) |
| ReplaySegment | Segment-level owner of ReplayStorage, patch graph, and initial/last game state within a time interval |
| ReplayStorage | Serialization, compression (LZ4), and disk I/O for replay files |
| PathTree | Manages JSON paths as a tree structure with O(1) path lookups |
| PatchGraph | Graph-based navigation between timestamps using Dijkstra's algorithm |
| BidirectionalReplayPatch | Contains forward and backward patches for time travel |
| ReplayBuilder | High-level builder that converts a stream of API JSON responses into a replay file (ReplayTimeline) |
| ReplayInterface | High-level read-only interface for replay playback and time travel |
| ReplayHookSystem | Dispatches change events while patches are applied during playback |
Timeline & Segments
The ReplayTimeline stores one or more non-overlapping ReplaySegment intervals. Each segment:
- Owns its own
ReplayStorageand patch graph. - Has a fixed datatype
versionand amap_idstored in metadata. - Covers a continuous time range ([start, end]), where the last segment may be open-ended during recording.
When recording via ReplayBuilder, new segments are created as needed (for example when a full game state arrives, or when a datatype version changes). During playback, ReplayInterface locates the correct segment for a requested timestamp and applies patches only within that segment.
Supported datatype versions: To see which game/client datatype versions this library supports, use conflict_interface.versions.get_supported_datatype_versions(). The latest supported version is conflict_interface.versions.LATEST_VERSION.
Core Data Structures
PathTree
The PathTree stores all unique paths in the game state as a tree structure. Each path (e.g., ['units', 0, 'health']) is represented as a series of connected nodes.
class PathTree:
root: PathTreeNode # Root node of the tree
idx_counter: int # Next available index
idx_to_node: dict[int, PathTreeNode] # Fast index-to-node lookup
# Precomputed arrays for LCA queries (see Algorithms section)
euler: array # Euler tour traversal
tin: array # Entry times
tout: array # Exit times
depth: array # Node depths
st: list[array] # Sparse table for RMQ
Key Features:
- Paths are deduplicated - common prefixes share nodes
- O(1) path lookup using index-to-node mapping
- Supports efficient Lowest Common Ancestor (LCA) queries
PathTreeNode
Represents a single element in a path (e.g., an attribute name or list index).
class PathTreeNode:
path_element: str | int # The path segment (attribute name or index)
index: int # Unique identifier in the tree
is_leaf: bool # True if no children
reference: object # Runtime reference to the actual object
children: dict[str, PathTreeNode] # Child nodes
Note: The reference field points to the parent object, not the value at this path. For a path like ['units', 0, 'health'], the node for 'health' has reference pointing to the unit object at index 0, not the health value itself.
PatchGraph
The PatchGraph manages the temporal relationships between patches. Each patch connects two timestamps, and the graph enables efficient navigation from any timestamp to any other.
class PatchGraph:
time_stamps_cache: list[int] # Sorted list of all timestamps
patches: dict[tuple[int, int], PatchGraphNode] # (from, to) -> patch
adj: dict[int, list[int]] # Adjacency list for pathfinding
PatchGraphNode
Contains the actual patch operations between two timestamps.
class PatchGraphNode:
from_timestamp: int # Source timestamp (seconds since epoch)
to_timestamp: int # Target timestamp
op_types: list[int] # Operation types (ADD=1, REPLACE=2, REMOVE=3)
paths: list[int] # Path indices into PathTree
values: list[Any] # Values for each operation
cost: int # Cost for pathfinding (currently fixed at 1)
BidirectionalReplayPatch
Holds both forward and backward patches to enable bidirectional navigation.
class BidirectionalReplayPatch:
forward_patch: ReplayPatch # Apply to move forward in time
backward_patch: ReplayPatch # Apply to move backward in time
Operation Types
The system supports three operation types:
| Type | Key | Constant | Description |
|---|---|---|---|
| Add | "a" | ADD_OPERATION = 1 | Add a new value to a list or dict |
| Replace | "p" | REPLACE_OPERATION = 2 | Replace an existing value |
| Remove | "r" | REMOVE_OPERATION = 3 | Remove a value from a list, dict, or set to None |
TimelineMetadata
At the file level, TimelineMetadata stores metadata common to the entire replay file:
class TimelineMetadata:
game_ended: bool
start_of_game: int # Unix seconds, 0 if unknown
end_of_game: int # Unix seconds, 0 if unknown
game_id: int
player_id: int
scenario_id: int
day_of_game: int
speed: int # integer speed multiplier (e.g. 1, 2, 4)
segment_count: int # number of segments in this file
ReplayTimeline keeps this structure in sync with the current set of segments when writing to disk.
SegmentMetadata
Each ReplaySegment has a SegmentMetadata (accessed via ReplayStorage.metadata) that holds segment-local information such as:
start_time/last_time(Unix seconds for the segment interval).version(datatype version used in this segment).map_id(string key referencing external static map data).current_patches/max_patches(how many patches exist and capacity).- Flags like
is_fragmentedused during append mode.
This metadata is stored inside the segment payload and is loaded lazily as needed.
Key Algorithms
1. Euler Tour for Tree Preprocessing
PathTree.precompute() builds an Euler tour over the deduplicated PathTree so later LCA queries can be answered via RMQ.
During preprocessing, the tree is traversed once and the following arrays are produced:
tin[node_idx]: entry time (first time the node is visited)tout[node_idx]: exit timeeuler: the flattened traversal sequence of node indices (including re-visits when the traversal backtracks)first[node_idx]: index of the first occurrence ofnode_idxinsideeulerdepth[node_idx]: depth (distance fromroot)
These arrays (especially euler, first, depth, plus the Range Minimum Query(RMQ) sparse table built from them) are later passed into steiner_tree_cpp.build_steiner_tree(...), where Least Commen Ancestor(LCA) is computed to construct the expanded Steiner subtree.
Time Complexity: O(n) preprocessing, where n is the number of nodes.
2. LCA using Range Minimum Query (RMQ)
Finding the LCA of two nodes is reduced to a Range Minimum Query (RMQ) on the Euler tour.
PathTree.precompute_rmq() builds a sparse table (st) where each entry stores the Euler-tour position with the minimum depth in an interval. steiner_tree_cpp then uses that RMQ to compute LCA(u, v).
def lca(u_idx: int, v_idx: int) -> int:
# Find positions in Euler tour
left = self.first[u_idx]
right = self.first[v_idx]
if left > right:
left, right = right, left
# Sparse table RMQ for minimum depth
length = right - left + 1
k = self.log[length]
a = self.st[k][left]
b = self.st[k][right - (1 << k) + 1]
# Return node with minimum depth
return self.euler[a] if self.depth[self.euler[a]] < self.depth[self.euler[b]] else self.euler[b]
Time Complexity: O(1) per query after O(n log n) preprocessing.
3. Steiner Tree Construction
When applying a patch, we may need to resolve PathTreeNode.reference values for paths that were never accessed earlier. To do that efficiently, the replay builds an expanded Steiner subtree connecting all required path nodes (and the root), but expanded so every edge corresponds to a real edge in the underlying PathTree.
In practice:
ReplaySegment.apply_patch()collectsunknown_pathsfor patch operations whosePathTreeNode.referenceis missing.PathTree.build_steiner_tree()delegates to the C++ routinesteiner_tree_cpp.build_steiner_tree(...), using the PathTree’sparent,tin,tout, and RMQ/LCA helpers.
Algorithm Steps:
- Ensure
root_idxis included; deduplicate input nodes. - Sort nodes by
tin. - For each consecutive pair in this sorted order, compute
LCA(u, v)via RMQ over the Euler tour and add those LCAs to the working set. - Sort+deduplicate the full set again (original nodes + LCAs).
- Build a compressed virtual tree using a stack:
- pop until the current stack top is an ancestor of
v(checked viatin/tout) - record the compressed parent of
v
- pop until the current stack top is an ancestor of
- Expand each compressed edge into real PathTree edges by walking
parent[]pointers from child up to compressed parent, emitting directed edgesp -> child. - Return an adjacency list
adj[node_idx] = [child_idx, ...](children are ordered bytinfor deterministic output).
def build_steiner_tree(nodes: list[int]) -> dict[int, list[int]]:
nodes = deduplicate(nodes + [root])
nodes.sort(key=tin)
# Insert LCAs between consecutive nodes (sorted by tin)
full = nodes.copy()
for i in range(len(nodes) - 1):
full.append(LCA(nodes[i], nodes[i+1])) # via RMQ over euler/st/log
# Build compressed virtual tree (stack) and then expand edges to real edges
compressed_parent = build_virtual_tree_stack(full, tin, tout)
adj = expand_edges_to_real_path(compressed_parent, parent)
return adj
Time Complexity: O(k log k) where k is the number of nodes to connect.
4. BFS Reference Resolution
Given the expanded Steiner subtree adjacency list, reference resolution is done with a BFS over the subtree:
- Direct children of
PathTree.rootare seeded withreference = game_state. - Each subsequent node’s
referenceis derived from its direct parent viaget_reference_from_direct_parent(node)(i.e., look up the attribute/index/key described by the parent path element).
def bfs_set_references(sub_tree: dict[int, list[int]], game_state: GameState):
q = deque([])
# Seed: first-level nodes point at the game_state root.
for child in root.children.values():
child.set_reference(game_state)
q.append(child.index)
while q:
u = q.popleft()
for v in sub_tree.get(u, []):
node = idx_to_node[v]
node.set_reference(get_reference_from_direct_parent(node))
q.append(v)
5. Dijkstra's Algorithm for Patch Path Finding
When jumping to a target timestamp, the ReplayInterface asks the segment’s PatchGraph for the shortest patch sequence.
PatchGraph.find_patch_path():
- snaps both endpoints to the closest cached patch timestamps less than or equal to the requested times (
find_prev_timestamp) - runs SciPy’s
scipy.sparse.csgraph.dijkstraover the CSR graph with directed edges weighted byPatchGraphNode.cost - reconstructs the patch sequence by following SciPy’s predecessor array (
return_predecessors=True)
def find_patch_path(from_time: datetime, to_time: datetime) -> list[PatchGraphNode]:
from_exact = find_prev_timestamp(int(from_time.timestamp()))
to_exact = find_prev_timestamp(int(to_time.timestamp()))
if from_exact is None or to_exact is None:
raise ValueError("No exact patch timestamps found for the given time range.")
if from_exact == to_exact:
return []
src = time_to_dense_idx[from_exact]
dst = time_to_dense_idx[to_exact]
_, pred = scipy_dijkstra(graph_csr, directed=True, indices=src, return_predecessors=True)
if pred[dst] == -9999:
raise ValueError("No path")
path = []
cur = dst
while cur != src:
prev = pred[cur]
path.append(patches[(dense_idx_to_time[prev], dense_idx_to_time[cur])])
cur = prev
return path[::-1]
Time Complexity: O((E + V) log V) where E is the number of patches and V is the number of unique timestamps.
6. Diff Generation
Diff generation happens during recording in ReplayBuilder, while it mutates the “current” game state toward the “new” game state while simultaneously recording the operations into a BidirectionalReplayPatch.
Concretely:
- For each JSON response,
ReplayBuilder._create_patch_from_json(...)decides whether it is an incremental update (full == False) or a full replacement (full == True). - Incremental updates generate patches by calling
current_state.update(new_state, path=[], rp=bipatch). - The
update(...)call mutatescurrent_stateby iterating over its nested state attributes and invoking theirupdate(...)methods; when those nested updates detect changes andrpis provided, they emit the correspondingrp.add(...),rp.replace(...), andrp.remove(...)operations at the computed JSON paths. - For full replacements, the patch returned is an empty
BidirectionalReplayPatch(). In this case the new state becomes the segment base (the replay starts applying patches from the new baseline in the next segment).
During playback, ReplaySegment.apply_patch(...) applies the recorded BidirectionalReplayPatch operations, using the PathTree (reference resolution) and PatchGraph (timestamp navigation) described above.
Storage Format
Replay segment payloads managed by ReplayStorage use a chunked binary format with LZ4 compression:
┌─────────────────────────────────────────────────────┐
│ Chunk 1: Metadata (fixed-size header) │
│ [4 bytes: length][raw bytes] │
├─────────────────────────────────────────────────────┤
│ Chunk 2: Initial Game State (pickled / binary) │
│ [4 bytes: length][compressed data] │
├─────────────────────────────────────────────────────┤
│ Chunk 3: PathTree │
│ [4 bytes: length][compressed data] │
├─────────────────────────────────────────────────────┤
│ Chunk 4: Patch Graph (compressed) │
│ [4 bytes: length][compressed data] │
├─────────────────────────────────────────────────────┤
│ Chunk 5: Patch index (uncompressed) │
│ [4 bytes: length][raw bytes] │
├─────────────────────────────────────────────────────┤
│ Chunk 6: Data pool (patch data, uncompressed) │
│ [4 bytes: length][raw bytes] │
├─────────────────────────────────────────────────────┤
│ Chunk 7: Last Game State (pickled / binary) │
│ [4 bytes: length][compressed data] │
└─────────────────────────────────────────────────────┘
Static map data is no longer embedded inside replay segments. Instead, each segment's metadata stores a map_id: str that can be used to look up StaticMapData externally.
ReplayTimeline file header
Replay files written by ReplayTimeline (version 2) are zstd-compressed and have the following header before the per-segment records:
┌───────────────────────────────────────────────────────────────┐
│ Magic 8 bytes ASCII "RPLYZSTD" │
├───────────────────────────────────────────────────────────────┤
│ Version 4 bytes little-endian uint32 (currently 2)│
├───────────────────────────────────────────────────────────────┤
│ TimelineMetadata N bytes fixed-size struct, see below │
├───────────────────────────────────────────────────────────────┤
│ Segments ... repeated segment records │
└───────────────────────────────────────────────────────────────┘
The TimelineMetadata struct is a fixed-size little-endian layout:
bool game_ended
int32 start_of_game # Unix seconds, 0 if unknown
int32 end_of_game # Unix seconds, 0 if unknown
int32 game_id
int32 player_id
int32 scenario_id
int32 day_of_game
int32 speed # integer speed multiplier (e.g. 1, 2, 4)
int32 segment_count # number of segments in this file
After the header, each segment is stored as:
int64 start_ns # nanoseconds since epoch
int64 end_ns # nanoseconds since epoch (can be equal to start_ns)
int32 version # datatype version for this segment
uint64 size # number of bytes in the following payload
bytes payload # raw segment data managed by ReplayStorage
Note that there is no separate segment-count field in the header anymore; the total number of segments is stored inside TimelineMetadata.segment_count.
Patch Serialization
Individual patches use a columnar storage format for efficiency:
data = {
"p": path_list, # Deduplicated path table
"o": ops_col.tobytes(), # Operation types (1 byte each)
"i": path_indices.tobytes(), # Path references (2 or 4 bytes each)
"v": values_col, # Operation values
"t": index_type # "H" for 16-bit or "I" for 32-bit indices
}
Hook System
The replay system includes a hook/event system that can emit structured change events as patches are applied during playback.
ReplayHookSystemis created perReplaySegmentand is owned byReplayInterface.ReplayHookdescribes a subscription: tag, path index, change types (add/replace/remove), and optional attributes.ReplayHookEventis an event payload with a tag, object reference, and changed attributes.ReplayHookTagis an enum that includes tags likeProvinceChanged,PlayerChanged,TeamChanged,ArmyChanged,GameInfoChanged, andSegmentSwitch.
ReplayInterface exposes helpers for working with hooks:
hook_system = replay_interface.get_hook_system()
events_by_tag = replay_interface.poll_events()
replay_interface.unregister_all_hooks()
replay_interface.register_province_trigger(attributes=["owner_id", "resource_production"])
replay_interface.unregister_province_trigger()
replay_interface.register_player_trigger()
replay_interface.unregister_player_trigger()
replay_interface.register_team_trigger()
replay_interface.unregister_team_trigger()
replay_interface.register_army_trigger()
replay_interface.unregister_army_trigger()
replay_interface.register_game_info_trigger()
replay_interface.unregister_game_info_trigger()
A typical pattern is:
- Register triggers and/or hooks.
- Call
jump_toorjump_to_next_patchonReplayInterface. - Call
poll_events()to retrieve and clear the accumulated events.
Performance Considerations
- Path Deduplication: Common path prefixes are shared in
PathTree, reducing memory usage. - Lazy Reference Resolution: Object references are resolved only when needed using the Steiner tree algorithm and BFS.
- LZ4 Compression: Segment payloads are LZ4-compressed for fast compression/decompression with good compression ratios.
- Columnar Storage: Patches use a columnar format for better compression and faster access.
- Bidirectional Patches: Forward and backward patches allow efficient backward navigation without recomputing diffs.
- Segmented Timeline: Multiple segments in a
ReplayTimelineallow large games and version transitions to be handled without loading everything at once.
Long Patches
For large jumps, applying the patch path step-by-step can become expensive because you may need to apply many intermediate patches. As an optimization, ReplayInterface.jump_to() can create an on-demand "long patch" that collapses the net effect of a whole interval into a single patch`.
When long patches are created
A long patch is created automatically inside ReplayInterface.jump_to(...) when the
PatchGraph.cost(patches) is greater than LONG_PATCH_THRESHOLD and the patch path
contains at least 1 patch.
What a long patch contains
Long patches are created via conflict_interface.replay.long_patch.create_long_patch(...) and represent the consolidated effect of changes from from_time to to_time:
- find the shortest patch path between the timestamps (
PatchGraph.find_patch_path) - build the required sub-structure over that path (adjacency + op-tree)
- collapse operations to preserve the same end-state semantics
- create a single
PatchGraphNode(from_timestamp, to_timestamp, *operations)
The new edge is added to the loaded segment's in-memory PatchGraph, so subsequent jumps can reuse it while the replay is open.