Data Durability Model

Responsibility View

This document defines how data is stored, replicated, and protected. It answers the question: How is data kept safe and available?

Data Pipeline

The following diagram shows the lifecycle of data from the application to offsite storage.

flowchart TB
  App[Stateful App] --> Request["Storage Interface"]
  Request --> Storage[Storage Fabric]
  Storage --> Replicas["Replicated Copies (N>=2)"]
  Replicas --> Backup[Backup / Snapshot System]
  Backup --> Offsite[(Optional: Offsite Copy)]
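
The stages in the diagram can be read as a per-volume policy. The following is a minimal sketch of that reading, not part of any real tool: `DurabilityPolicy` and `validate` are hypothetical names, and the only rules checked are those visible in the diagram (at least two replicas, a backup stage, and an optional offsite copy that depends on the backup).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DurabilityPolicy:
    """Hypothetical description of one volume's path through the pipeline."""
    replicas: int                  # copies kept by the storage fabric
    backup_enabled: bool           # data reaches the backup / snapshot system
    offsite_enabled: bool = False  # the offsite copy is optional in the diagram


def validate(policy: DurabilityPolicy) -> List[str]:
    """Return a list of problems; an empty list means the policy fits the diagram."""
    problems = []
    if policy.replicas < 2:
        problems.append("replicas must be >= 2")
    if not policy.backup_enabled:
        problems.append("backup stage is missing")
    if policy.offsite_enabled and not policy.backup_enabled:
        problems.append("offsite copy requires a backup to copy from")
    return problems
```

For example, `validate(DurabilityPolicy(replicas=3, backup_enabled=True))` returns an empty list, while a policy with a single replica reports the replica rule.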

Layers of Protection

Storage Fabric

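The active storage layer (e.g., Ceph, ZFS, or RAID). It provides immediate availability and protects against single-drive failures through real-time replication or redundancy. Protection against single-node failures requires a fabric that spreads copies across nodes, such as Ceph.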
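
To make the single-node failure claim concrete, the sketch below checks whether a given replica placement keeps at least one copy of every object when any one node fails. It is a toy model under stated assumptions: the `placement` mapping, the node names, and both function names are hypothetical, and real fabrics track placement internally.

```python
from typing import Dict, Set


def surviving_copies(placement: Dict[str, Set[str]], failed_nodes: Set[str]) -> Dict[str, int]:
    """Count how many copies of each object remain after the given nodes fail."""
    return {obj: len(nodes - failed_nodes) for obj, nodes in placement.items()}


def tolerates_any_single_node_failure(placement: Dict[str, Set[str]]) -> bool:
    """True if every object keeps at least one copy, whichever single node fails."""
    if not placement:
        return True  # nothing stored, nothing to lose
    all_nodes = set().union(*placement.values())
    return all(
        min(surviving_copies(placement, {node}).values()) >= 1
        for node in all_nodes
    )


# Two copies per object, each on a different node (the N>=2 case).
replicated = {"obj-a": {"node1", "node2"}, "obj-b": {"node2", "node3"}}
# A single copy: losing node1 loses the object.
unreplicated = {"obj-a": {"node1"}}
```

With these inputs, `tolerates_any_single_node_failure(replicated)` is true and the same check on `unreplicated` is false.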

Snapshots

Point-in-time, read-only views of the storage. These provide “undo” capability for accidental deletions or software corruption without requiring a full restore.
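
The "undo" behavior can be illustrated with a toy in-memory volume. This is only a sketch: the `Volume` class and its methods are hypothetical, and it copies state eagerly, whereas real snapshot implementations typically share unchanged blocks with the live data.

```python
from types import MappingProxyType


class Volume:
    """Toy in-memory volume mapping file paths to contents."""

    def __init__(self):
        self.files = {}
        self.snapshots = {}

    def snapshot(self, name):
        # Freeze a copy of the current state and expose it read-only.
        self.snapshots[name] = MappingProxyType(dict(self.files))

    def restore_file(self, snapshot_name, path):
        # "Undo" for one file: copy it back from the snapshot,
        # leaving the rest of the live volume untouched.
        self.files[path] = self.snapshots[snapshot_name][path]


vol = Volume()
vol.files["report.txt"] = "v1"
vol.snapshot("before-cleanup")
del vol.files["report.txt"]                       # accidental deletion
vol.restore_file("before-cleanup", "report.txt")  # recovered without a full restore
```

Because the snapshot is wrapped in `MappingProxyType`, attempts to assign into it raise `TypeError`, which mirrors the read-only property described above.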

Backup System

A separate, immutable copy of the data, stored on physical media different from the primary storage. It protects against catastrophic failure of the primary storage fabric.
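
The two properties named here, separate media and immutability, can be expressed as a simple verification step. The sketch below is illustrative only: `BackupRecord`, `take_backup`, `verify_backup`, and the media identifiers are hypothetical, and a frozen dataclass merely stands in for storage-level immutability.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class BackupRecord:
    media_id: str   # hypothetical identifier of the device or pool holding the backup
    payload: bytes  # the backed-up data
    sha256: str     # checksum recorded when the backup was taken


def take_backup(payload: bytes, media_id: str) -> BackupRecord:
    """Store the payload together with its checksum at backup time."""
    return BackupRecord(media_id, payload, hashlib.sha256(payload).hexdigest())


def verify_backup(record: BackupRecord, primary_media_id: str) -> bool:
    """A backup counts only if it sits on different media and is still intact."""
    on_separate_media = record.media_id != primary_media_id
    intact = hashlib.sha256(record.payload).hexdigest() == record.sha256
    return on_separate_media and intact


record = take_backup(b"customer-db-dump", media_id="tape-01")
```

Here `verify_backup(record, "ceph-pool-a")` passes, while the same record checked against `"tape-01"` as the primary fails, because a copy on the primary's own media would be lost along with it.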