# Ledger Architecture
Deep dive into InferaDB's storage layer — a per-vault blockchain with Raft consensus.
## Overview

The Ledger is InferaDB's storage layer: durable, replicated, cryptographically verifiable storage for all authorization data. Every write commits to a per-vault blockchain through Raft consensus, and Merkle proofs let clients verify data integrity independently of any Ledger node.
## Consensus

The Ledger uses OpenRaft for consensus. Clusters run an odd number of nodes (typically 3 or 5): commits require a majority quorum, and adding an even node increases quorum size without improving fault tolerance. The leader handles all writes; reads can be served by any node, depending on the consistency level requested.
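The quorum arithmetic behind the odd-size recommendation can be sketched as follows (the function names are illustrative, not part of InferaDB's API):

```python
def quorum_size(cluster_size: int) -> int:
    """Minimum number of nodes that must acknowledge an entry to commit it."""
    return cluster_size // 2 + 1

def fault_tolerance(cluster_size: int) -> int:
    """Number of simultaneous node failures the cluster can survive."""
    return (cluster_size - 1) // 2
```

Note that a 4-node cluster tolerates the same single failure as a 3-node cluster while needing a larger quorum, which is why even sizes are avoided.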
## Storage Engine

The Ledger uses a custom B+ tree storage engine built for authorization workloads:
| Property | Detail |
|---|---|
| Tables | 21 internal tables |
| Write model | Single-writer (serialized through Raft) |
| Page checksums | XXHash per page for corruption detection |
| Compression | zstd for snapshots |
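Per-page corruption detection can be sketched like this; the sketch uses `zlib.crc32` as a stand-in for XXHash (which is not in the Python standard library), and the helper names are illustrative:

```python
import zlib

def seal_page(payload: bytes) -> bytes:
    """Append a 4-byte checksum to a page payload (crc32 stands in for XXHash)."""
    checksum = zlib.crc32(payload)
    return payload + checksum.to_bytes(4, "big")

def verify_page(page: bytes) -> bool:
    """On read, recompute the checksum and compare it to the stored one."""
    payload, stored = page[:-4], int.from_bytes(page[-4:], "big")
    return zlib.crc32(payload) == stored
```

A checksum mismatch on read is what triggers the auto-recovery path described under Vault Health Monitoring.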
### Key Format

All keys in the storage engine follow a composite format:

`vault_id (8 bytes) + bucket_id (1 byte) + local_key (variable)`

This ensures vault-level data locality and efficient per-vault range scans.
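A minimal sketch of the composite key encoding, assuming big-endian integers so that lexicographic byte order matches numeric vault order (the helper names are illustrative):

```python
import struct

def encode_key(vault_id: int, bucket_id: int, local_key: bytes) -> bytes:
    """Compose a key: 8-byte vault_id, 1-byte bucket_id, then the local key.

    Big-endian packing keeps all of a vault's keys contiguous in sorted
    order, which is what makes per-vault range scans cheap."""
    return struct.pack(">QB", vault_id, bucket_id) + local_key

def decode_key(key: bytes) -> tuple:
    vault_id, bucket_id = struct.unpack(">QB", key[:9])
    return vault_id, bucket_id, key[9:]
```

Because vault_id leads the key, a range scan over one vault is a single contiguous sweep of the B+ tree.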
## State Root and Merkle Proofs

### State Root Computation

The state root is computed over 256 buckets and updated incrementally: only the buckets touched by a write are recomputed, so root-update cost stays roughly constant as the dataset grows.
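A toy sketch of an incrementally maintained 256-bucket state root; the bucket assignment, hash function (SHA-256), and class shape are illustrative assumptions, not InferaDB's actual scheme:

```python
import hashlib

NUM_BUCKETS = 256

class StateRoot:
    """Incremental state root: a write rehashes one bucket, then the root."""

    def __init__(self):
        self.buckets = {i: {} for i in range(NUM_BUCKETS)}
        self.bucket_hashes = [self._hash_bucket(i) for i in range(NUM_BUCKETS)]

    @staticmethod
    def _bucket_of(key: bytes) -> int:
        # First byte of the key's hash picks one of the 256 buckets.
        return hashlib.sha256(key).digest()[0]

    def _hash_bucket(self, i: int) -> bytes:
        h = hashlib.sha256()
        for k in sorted(self.buckets[i]):
            h.update(k + self.buckets[i][k])
        return h.digest()

    def put(self, key: bytes, value: bytes) -> bytes:
        i = self._bucket_of(key)
        self.buckets[i][key] = value
        self.bucket_hashes[i] = self._hash_bucket(i)  # only this bucket rehashed
        return self.root()

    def root(self) -> bytes:
        # Root covers all 256 bucket digests: a fixed amount of hashing.
        return hashlib.sha256(b"".join(self.bucket_hashes)).digest()
```

The root recomputation always hashes exactly 256 digests, which is where the roughly constant per-write overhead comes from.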
### Merkle Proofs
| Proof Type | Verifiable? | Description |
|---|---|---|
| Point read | Yes | Prove that a specific key has a specific value |
| Transaction inclusion | Yes | Prove that a transaction was included in a specific block |
| List operation | No | List results are not individually provable |
Clients can verify point reads and transaction inclusion without trusting any Ledger node.
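Client-side verification of a point-read proof can be sketched as a standard Merkle path walk; the proof encoding here (sibling hash plus side) is an assumed format, not InferaDB's wire format:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_inclusion(leaf: bytes, proof: list, root: bytes) -> bool:
    """Walk a Merkle path from a leaf up to the expected root.

    `proof` is a list of (sibling_hash, side) pairs, where side says
    whether the sibling sits to the "left" or "right" of the current node."""
    node = h(leaf)
    for sibling, side in proof:
        node = h(sibling + node) if side == "left" else h(node + sibling)
    return node == root
```

The client only needs the leaf, the proof, and a trusted root; no Ledger node has to be trusted for the check itself.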
## Write Path

1. Client submits a write to the leader node.
2. The leader replicates the entry to followers via Raft `AppendEntries`.
3. A quorum of nodes acknowledges the entry.
4. The committed entry is applied to the local B+ tree.
5. A block with the updated `state_root` is produced.
6. A revision token is returned to the client.
Each committed write produces a block with the updated state_root, linked to the vault’s chain.
### Write Latency
| Percentile | Latency |
|---|---|
| p50 | ~3–4 ms |
| p99 | ~10–15 ms |
### Adaptive Batching
Writes are batched under load:
| Parameter | Default | Description |
|---|---|---|
| Max batch size | 100 | Maximum writes per batch |
| Batch timeout | 5 ms | Maximum wait time before flushing a batch |
| Eager commit | On | Commit immediately when a single write arrives |
With eager commit (default), single writes commit immediately. Under high throughput, writes batch automatically.
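The batching policy above can be sketched as follows; this single-threaded model elides the real system's concurrency (in practice, eager commit applies when no Raft proposal is already in flight):

```python
import time

class Batcher:
    """Adaptive write batching: eager commit, max batch size, and a timeout."""

    def __init__(self, max_batch=100, timeout_ms=5, eager=True):
        self.max_batch = max_batch
        self.timeout = timeout_ms / 1000
        self.eager = eager
        self.pending = []
        self.deadline = None

    def submit(self, write):
        """Returns a flushed batch, or None if the write is still buffered."""
        self.pending.append(write)
        if self.deadline is None:
            self.deadline = time.monotonic() + self.timeout
        if self.eager and len(self.pending) == 1:
            return self.flush()  # lone write under low load: commit immediately
        if len(self.pending) >= self.max_batch or time.monotonic() >= self.deadline:
            return self.flush()  # batch is full, or the 5 ms window expired
        return None

    def flush(self):
        batch, self.pending, self.deadline = self.pending, [], None
        return batch
```

Under sustained load the eager path is bypassed (a proposal is already in flight), so writes naturally coalesce into batches.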
## Idempotency
Two-tier idempotency for safe retries:
- In-memory cache — Moka LRU for fast deduplication of immediate retries
- Replicated entries — Idempotency keys persisted through Raft, surviving leader failover
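The two tiers can be sketched with an `OrderedDict` standing in for the Moka LRU and a plain dict standing in for entries replicated through Raft (all names here are illustrative):

```python
from collections import OrderedDict

class IdempotencyIndex:
    """Two-tier idempotency: fast in-memory LRU, backed by replicated state."""

    def __init__(self, cache_size=1024):
        self.cache = OrderedDict()  # tier 1: in-memory LRU (Moka stand-in)
        self.replicated = {}        # tier 2: survives leader failover
        self.cache_size = cache_size

    def apply(self, idem_key, write_fn):
        if idem_key in self.cache:          # immediate retry: hit the LRU
            self.cache.move_to_end(idem_key)
            return self.cache[idem_key]
        if idem_key in self.replicated:     # retry after failover: Raft state
            return self.replicated[idem_key]
        result = write_fn()                 # first time: perform the write
        self.replicated[idem_key] = result
        self.cache[idem_key] = result
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least-recently-used entry
        return result
```

A retried write with the same idempotency key returns the original result instead of executing twice, even if the retry lands on a new leader.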
## Snapshots
| Property | Detail |
|---|---|
| Format | Binary, zstd-compressed |
| Trigger interval | Every 5 minutes or 10,000 blocks |
| Purpose | Faster node recovery and log compaction |
New nodes receive a snapshot instead of replaying the entire Raft log.
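The snapshot trigger reduces to a simple either-threshold check over the values in the table above (the function and parameter names are illustrative):

```python
def should_snapshot(blocks_since_last: int, seconds_since_last: float,
                    max_blocks: int = 10_000, max_seconds: float = 300.0) -> bool:
    """Snapshot when either 5 minutes have passed or 10,000 blocks accrued."""
    return blocks_since_last >= max_blocks or seconds_since_last >= max_seconds
```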
## Vault Health Monitoring
Continuous per-vault chain health monitoring:
- Detects gaps in the block sequence
- Validates page checksums on read
- Triggers auto-recovery from peer replicas when corruption is detected
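Gap detection over a vault's block sequence can be sketched as:

```python
def find_gaps(block_heights: list) -> list:
    """Return (missing_from, missing_to) ranges in a vault's block sequence."""
    gaps = []
    heights = sorted(block_heights)
    for prev, cur in zip(heights, heights[1:]):
        if cur > prev + 1:
            gaps.append((prev + 1, cur - 1))
    return gaps
```

Any non-empty result marks a range of blocks to re-fetch from peer replicas.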
## Multi-Region Deployment

Each region runs an independent Raft group. Data residency is enforced at the vault level:
For example, `inferadb-ledger --region us-east-1` pins a Ledger node to a specific region.
Vaults are pinned to a region at creation; their data is stored only on nodes within that region, supporting GDPR and other data-residency requirements.
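Vault-level residency can be sketched as a routing table fixed at vault creation (the class and field names are assumptions for illustration):

```python
class RegionRouter:
    """Pin each vault to one region; route its data only to nodes there."""

    def __init__(self):
        self.vault_regions = {}    # vault_id -> region, fixed at creation
        self.nodes_by_region = {}  # region -> list of node addresses

    def create_vault(self, vault_id, region):
        if vault_id in self.vault_regions:
            raise ValueError("vault already exists")
        self.vault_regions[vault_id] = region  # pin is permanent

    def nodes_for(self, vault_id):
        """Only nodes in the vault's pinned region may hold its data."""
        region = self.vault_regions[vault_id]
        return self.nodes_by_region.get(region, [])
```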