litestream

Expert knowledge for contributing to Litestream, a standalone disaster recovery tool for SQLite. Provides architectural understanding, code patterns, critical rules, and debugging procedures for WAL monitoring, LTX replication format, storage backend implementation, multi-level compaction, and SQLite page management. Use when working with Litestream source code, writing storage backends, debugging replication issues, implementing compaction logic, or handling SQLite WAL operations.

Install

mkdir -p .claude/skills/litestream && curl -L -o skill.zip "https://mcp.directory/api/skills/download/9023" && unzip -o skill.zip -d .claude/skills/litestream && rm skill.zip

Installs to .claude/skills/litestream

About this skill

Litestream Agent Skill

Litestream is a standalone disaster recovery tool for SQLite. It runs as a background process, monitors the SQLite WAL (Write-Ahead Log), converts changes to immutable LTX files, and replicates them to cloud storage. It uses modernc.org/sqlite (pure Go, no CGO required).

Quick Start

# Build
go build -o bin/litestream ./cmd/litestream

# Test (always use race detector)
go test -race -v ./...

# Code quality
pre-commit run --all-files

Critical Rules

These invariants must never be violated:

1. Lock Page at 1GB

SQLite reserves the page that begins at byte offset 0x40000000 (1 GB) for locking. Always skip it during replication and compaction. Its page number depends on the page size:

| Page Size | Lock Page Number |
|-----------|------------------|
| 4 KB      | 262145           |
| 8 KB      | 131073           |
| 16 KB     | 65537            |
| 32 KB     | 32769            |

lockPgno := ltx.LockPgno(pageSize)
if pgno == lockPgno {
    continue
}
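
The page numbers in the table follow from dividing the lock offset by the page size and adding one, because SQLite numbers pages from 1. The sketch below reproduces that arithmetic. `lockPgno` is a hypothetical local helper written for illustration, not the `ltx.LockPgno` implementation:

```go
package main

import "fmt"

// lockOffset is the byte offset of SQLite's lock page (0x40000000).
const lockOffset = 0x40000000

// lockPgno returns the 1-based number of the page that starts at lockOffset.
// Hypothetical helper for illustration; pageSize must be non-zero.
func lockPgno(pageSize uint32) uint32 {
	return lockOffset/pageSize + 1
}

func main() {
	for _, size := range []uint32{4096, 8192, 16384, 32768} {
		fmt.Printf("page size %5d -> lock page %d\n", size, lockPgno(size))
	}
}
```

A database only contains this page once it grows past 1 GB, which is why the guard is easy to miss when testing with small files.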

2. LTX Files Are Immutable

Once an LTX file is written, it must never be modified. New changes create new files. This guarantees point-in-time recovery integrity.

3. Single Replica per Database

Each database replicates to exactly one destination. The Replica component manages replication mechanics; database state belongs in the DB layer.

4. Read Local Before Remote During Compaction

Cloud storage can be eventually consistent, so a file that was just uploaded may not be readable yet. Always try the local copy first:

f, err := os.Open(db.LTXPath(info.Level, info.MinTXID, info.MaxTXID))
if err == nil {
    return f, nil // Use local copy
}
return replica.Client.OpenLTXFile(...) // Fall back to remote

5. Preserve Timestamps During Compaction

Set the compacted file's CreatedAt to the earliest source file timestamp to maintain temporal granularity for point-in-time restoration.

info.CreatedAt = oldestSourceFile.CreatedAt
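
One way to find the earliest source timestamp is a single pass over the source files. This is a minimal sketch: `fileInfo`, `at`, and `earliestCreatedAt` are hypothetical names for illustration, not Litestream types or functions.

```go
package main

import (
	"fmt"
	"time"
)

// fileInfo is a hypothetical stand-in for the metadata of one source file.
type fileInfo struct {
	CreatedAt time.Time
}

// at builds a fileInfo from Unix seconds to keep the example short.
func at(sec int64) fileInfo {
	return fileInfo{CreatedAt: time.Unix(sec, 0).UTC()}
}

// earliestCreatedAt returns the oldest CreatedAt among the source files,
// or the zero time when the slice is empty.
func earliestCreatedAt(files []fileInfo) time.Time {
	var earliest time.Time
	for i, f := range files {
		if i == 0 || f.CreatedAt.Before(earliest) {
			earliest = f.CreatedAt
		}
	}
	return earliest
}

func main() {
	sources := []fileInfo{at(30), at(10), at(20)}
	compacted := fileInfo{CreatedAt: earliestCreatedAt(sources)}
	fmt.Println(compacted.CreatedAt.Unix()) // 10
}
```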

6. Use Lock() Not RLock() for Writes

// CORRECT
r.mu.Lock()
defer r.mu.Unlock()
r.pos = pos

// WRONG - race condition
r.mu.RLock()
defer r.mu.RUnlock()
r.pos = pos
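
For context, here is a small self-contained sketch of the same rule. `posTracker` is a hypothetical type, not Litestream's Replica. The writer takes the exclusive lock and the reader takes the shared one:

```go
package main

import (
	"fmt"
	"sync"
)

// posTracker is a hypothetical type that guards a position with an RWMutex.
type posTracker struct {
	mu  sync.RWMutex
	pos uint64
}

// SetPos mutates shared state, so it takes the exclusive lock.
func (p *posTracker) SetPos(pos uint64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.pos = pos
}

// Pos only reads, so the shared read lock is enough.
func (p *posTracker) Pos() uint64 {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return p.pos
}

func main() {
	var tracker posTracker
	var wg sync.WaitGroup
	for i := uint64(1); i <= 100; i++ {
		wg.Add(1)
		go func(v uint64) {
			defer wg.Done()
			tracker.SetPos(v)
		}(i)
	}
	wg.Wait()
	fmt.Println(tracker.Pos() >= 1) // true
}
```

Running a program like this with `go run -race` is a quick way to confirm that no write happens under a read lock.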

7. Atomic File Operations

Always write to a temp file then rename. Never write directly to the final path.

tmpFile, err := os.CreateTemp(dir, ".tmp-*")
// ... write data, sync ...
os.Rename(tmpFile.Name(), finalPath)
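
A fuller version of the temp-then-rename pattern, with error handling and cleanup, might look like the sketch below. `writeFileAtomic` and `demo` are hypothetical helpers for illustration; the sketch assumes the temp file is created in the destination directory so the rename stays on one filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temp file next to finalPath, syncs it,
// and renames it into place. Hypothetical helper for illustration.
func writeFileAtomic(finalPath string, data []byte) (err error) {
	tmp, err := os.CreateTemp(filepath.Dir(finalPath), ".tmp-*")
	if err != nil {
		return fmt.Errorf("create temp file: %w", err)
	}
	// Clean up the temp file if any later step fails.
	defer func() {
		if err != nil {
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()

	if _, err = tmp.Write(data); err != nil {
		return fmt.Errorf("write temp file: %w", err)
	}
	if err = tmp.Sync(); err != nil {
		return fmt.Errorf("sync temp file: %w", err)
	}
	if err = tmp.Close(); err != nil {
		return fmt.Errorf("close temp file: %w", err)
	}
	if err = os.Rename(tmp.Name(), finalPath); err != nil {
		return fmt.Errorf("rename temp file: %w", err)
	}
	return nil
}

// demo writes a file atomically into a fresh temp directory and reads it back.
func demo(content string) (string, error) {
	dir, err := os.MkdirTemp("", "atomic-demo-*")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "example.ltx")
	if err := writeFileAtomic(path, []byte(content)); err != nil {
		return "", err
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	got, err := demo("hello")
	fmt.Println(got, err) // hello <nil>
}
```

The sketch does not sync the parent directory after the rename; code that needs the rename itself to survive a crash may want that extra step.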

Architecture

System Layers

| Layer   | File(s)              | Responsibility                          |
|---------|----------------------|-----------------------------------------|
| App     | cmd/litestream/      | CLI commands, YAML/env config           |
| Store   | store.go             | Multi-DB coordination, compaction       |
| DB      | db.go                | Single DB management, WAL monitoring    |
| Replica | replica.go           | Replication to one destination          |
| Storage | */replica_client.go  | Backend implementations (S3, GCS, etc.) |

Database state logic belongs in the DB layer, not the Replica layer.

ReplicaClient Interface

All storage backends implement this interface from replica_client.go:

type ReplicaClient interface {
    Type() string
    Init(ctx context.Context) error
    LTXFiles(ctx context.Context, level int, seek ltx.TXID, useMetadata bool) (ltx.FileIterator, error)
    OpenLTXFile(ctx context.Context, level int, minTXID, maxTXID ltx.TXID, offset, size int64) (io.ReadCloser, error)
    WriteLTXFile(ctx context.Context, level int, minTXID, maxTXID ltx.TXID, r io.Reader) (*ltx.FileInfo, error)
    DeleteLTXFiles(ctx context.Context, a []*ltx.FileInfo) error
    DeleteAll(ctx context.Context) error
}

Key contract details:

  • OpenLTXFile must return os.ErrNotExist when the file is missing
  • WriteLTXFile must set CreatedAt from backend metadata or upload time
  • LTXFiles with useMetadata=true fetches accurate timestamps (for PIT restore)
  • LTXFiles with useMetadata=false uses fast timestamps (normal operations)
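
The missing-file and offset/size parts of the contract can be illustrated with a toy in-memory backend. This is a sketch under assumptions: `memClient`, `open`, `isMissing`, and `readRange` are hypothetical names, and treating a size of 0 as "read to the end" is an assumption of this example, not something the interface above states.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"os"
)

// memClient is a hypothetical in-memory backend used only for illustration.
type memClient struct {
	files map[string][]byte
}

// open returns a reader over the requested byte range of the named file.
// Assumption: size <= 0 means "read to the end of the file".
func (c *memClient) open(name string, offset, size int64) (io.ReadCloser, error) {
	data, ok := c.files[name]
	if !ok {
		// Wrap os.ErrNotExist so callers can detect it with errors.Is.
		return nil, fmt.Errorf("open %s: %w", name, os.ErrNotExist)
	}
	if offset < 0 || offset > int64(len(data)) {
		return nil, fmt.Errorf("open %s: offset %d out of range", name, offset)
	}
	data = data[offset:]
	if size > 0 && size < int64(len(data)) {
		data = data[:size]
	}
	return io.NopCloser(bytes.NewReader(data)), nil
}

// isMissing reports whether err means the file does not exist.
func isMissing(err error) bool {
	return errors.Is(err, os.ErrNotExist)
}

// readRange opens a byte range and returns it as a string.
func readRange(c *memClient, name string, offset, size int64) (string, error) {
	rc, err := c.open(name, offset, size)
	if err != nil {
		return "", err
	}
	defer rc.Close()
	b, err := io.ReadAll(rc)
	return string(b), err
}

func main() {
	c := &memClient{files: map[string][]byte{"a.ltx": []byte("hello world")}}
	s, _ := readRange(c, "a.ltx", 6, 5)
	fmt.Println(s) // world
	_, err := readRange(c, "missing.ltx", 0, 0)
	fmt.Println(isMissing(err)) // true
}
```

Wrapping with `%w` keeps the backend-specific context in the message while still letting callers match on `os.ErrNotExist`.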

Lock Ordering

Always acquire locks in this order to prevent deadlocks:

  1. Store.mu
  2. DB.mu
  3. DB.chkMu
  4. Replica.mu
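
The ordering above can be sketched with stand-in types. The `store`, `db`, and `replica` structs below are hypothetical and carry only the mutexes named in the list; they are not the real Litestream types, and real code rarely needs all four locks at once.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical stand-ins that carry only the mutexes named in the list.
type store struct{ mu sync.Mutex }
type db struct{ mu, chkMu sync.Mutex }
type replica struct{ mu sync.Mutex }

// withAllLocks acquires every lock in the documented order, runs fn,
// and releases them in reverse order through the deferred calls.
func withAllLocks(s *store, d *db, r *replica, fn func()) {
	s.mu.Lock()
	defer s.mu.Unlock()
	d.mu.Lock()
	defer d.mu.Unlock()
	d.chkMu.Lock()
	defer d.chkMu.Unlock()
	r.mu.Lock()
	defer r.mu.Unlock()
	fn()
}

func main() {
	s, d, r := &store{}, &db{}, &replica{}
	counter := 0
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			withAllLocks(s, d, r, func() { counter++ })
		}()
	}
	wg.Wait()
	fmt.Println(counter) // 50
}
```

Because every goroutine takes the locks in the same order, no two goroutines can each hold a lock the other is waiting for.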

Core Components

DB (db.go): Manages the SQLite connection, WAL monitoring, checkpointing, and a long-running read transaction for consistency. Key fields: path, db, rtx (read transaction), pageSize, notify channel.

Replica (replica.go): Tracks replication position (ltx.Pos with TXID, PageNo, Checksum). One replica per database.

Store (store.go): Coordinates multiple databases and schedules compaction across levels.

LTX File Format

LTX (Log Transaction) files are immutable, checksummed archives of database changes. Structure:

+------------------+
|     Header       |  100 bytes (magic "LTX1", page size, TXID range, timestamp)
+------------------+
|   Page Frames    |  4-byte pgno + pageSize bytes data, per page
+------------------+
|   Page Index     |  Binary search index for page lookup
+------------------+
|     Trailer      |  16 bytes (post-apply checksum, file checksum)
+------------------+

Naming Convention

Format:  MMMMMMMMMMMMMMMM-NNNNNNNNNNNNNNNN.ltx
Example: 0000000000000001-0000000000000064.ltx  (TXID 1-100)
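
The example name is consistent with each TXID being written as 16 zero-padded hexadecimal digits (100 decimal is 64 hex). A minimal sketch of formatting and parsing such names follows. `formatLTXName` and `parseLTXName` are hypothetical helpers, and lowercase hex is an assumption of this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatLTXName renders a TXID range as MIN-MAX.ltx using 16 hex digits each.
// Hypothetical helper; lowercase hex is assumed.
func formatLTXName(minTXID, maxTXID uint64) string {
	return fmt.Sprintf("%016x-%016x.ltx", minTXID, maxTXID)
}

// parseLTXName is the inverse of formatLTXName.
func parseLTXName(name string) (minTXID, maxTXID uint64, err error) {
	if !strings.HasSuffix(name, ".ltx") {
		return 0, 0, fmt.Errorf("missing .ltx suffix: %q", name)
	}
	lo, hi, ok := strings.Cut(strings.TrimSuffix(name, ".ltx"), "-")
	if !ok || len(lo) != 16 || len(hi) != 16 {
		return 0, 0, fmt.Errorf("malformed LTX file name: %q", name)
	}
	if minTXID, err = strconv.ParseUint(lo, 16, 64); err != nil {
		return 0, 0, fmt.Errorf("parse min TXID: %w", err)
	}
	if maxTXID, err = strconv.ParseUint(hi, 16, 64); err != nil {
		return 0, 0, fmt.Errorf("parse max TXID: %w", err)
	}
	return minTXID, maxTXID, nil
}

func main() {
	name := formatLTXName(1, 100)
	fmt.Println(name) // 0000000000000001-0000000000000064.ltx
	lo, hi, err := parseLTXName(name)
	fmt.Println(lo, hi, err) // 1 100 <nil>
}
```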

Compaction Levels

Level 0: /ltx/0000/  Raw LTX files (no compaction)
Level 1: /ltx/0001/  Compacted periodically
Level 2: /ltx/0002/  Compacted less frequently

Default compaction levels: L0 (raw), L1 (30s), L2 (5min), L3 (1h), plus daily snapshots. Compaction merges files by deduplicating pages (latest version wins) and always skips the lock page.
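
The merge rule (latest version of each page wins, lock page skipped) can be sketched over simplified in-memory pages. `page` and `compact` are hypothetical names for illustration; the real compactor works on LTX files, not slices.

```go
package main

import (
	"fmt"
	"sort"
)

// page is a hypothetical, simplified page: a number plus its contents.
type page struct {
	Pgno uint32
	Data string
}

// compact merges page sets given oldest first. Later files overwrite
// earlier versions of the same page, and the lock page is never emitted.
func compact(lockPgno uint32, files ...[]page) []page {
	latest := make(map[uint32]page)
	for _, f := range files {
		for _, p := range f {
			if p.Pgno == lockPgno {
				continue // always skip the lock page
			}
			latest[p.Pgno] = p
		}
	}
	out := make([]page, 0, len(latest))
	for _, p := range latest {
		out = append(out, p)
	}
	// Emit pages in ascending page-number order.
	sort.Slice(out, func(i, j int) bool { return out[i].Pgno < out[j].Pgno })
	return out
}

func main() {
	older := []page{{1, "a"}, {2, "b"}}
	newer := []page{{2, "c"}, {3, "lock"}}
	fmt.Println(compact(3, older, newer)) // [{1 a} {2 c}]
}
```

The lock page number is 3 here only to keep the example small; real values come from the page-size table under Critical Rules.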

Code Patterns

DO

  • Return errors immediately; let callers decide handling
  • Use fmt.Errorf("context: %w", err) for error wrapping
  • Handle database state in the DB layer, not Replica
  • Use db.verify() to trigger snapshots (don't reimplement)
  • Test with race detector: go test -race
  • Use lazy iterators for LTXFiles (paginate, don't load all at once)

DON'T

  • Write data at the 1 GB lock page boundary
  • Modify LTX files after creation
  • Put database state logic in the Replica layer
  • Use RLock() when writing shared state
  • Write directly to final file paths (use temp + rename)
  • Ignore context cancellation in long operations
  • Return generic errors instead of os.ErrNotExist for missing files

Specialized Knowledge Areas

Load reference files on demand based on the task:

| Task                              | Reference File                     |
|-----------------------------------|------------------------------------|
| Understanding system design       | references/ARCHITECTURE.md         |
| Writing or reviewing code         | references/PATTERNS.md             |
| Working with LTX files            | references/LTX_FORMAT.md           |
| WAL monitoring or page operations | references/SQLITE_INTERNALS.md     |
| Implementing storage backends     | references/REPLICA_CLIENT_GUIDE.md |
| Writing or debugging tests        | references/TESTING_GUIDE.md        |

Common Debugging Procedures

Replication Not Working

  1. Verify WAL mode: PRAGMA journal_mode must return wal
  2. Check monitor interval and that the monitor goroutine is running
  3. Confirm db.notify channel is being signaled on WAL changes
  4. Check replica position: replica.Pos() should advance with writes
  5. Look for os.ErrNotExist from OpenLTXFile (file not replicated yet)

Large Database Issues (>1 GB)

  1. Verify lock page is being skipped: check ltx.LockPgno(pageSize)
  2. Test with multiple page sizes (4K, 8K, 16K, 32K)
  3. Run with databases both smaller and larger than 1 GB
  4. Ensure page iteration loops include the continue guard for lock page

Compaction Problems

  1. Confirm local L0 files exist before compaction reads them
  2. Check that CreatedAt timestamps are preserved (earliest source)
  3. Verify compaction level intervals in Store.levels
  4. Look for eventual consistency issues if reading from remote storage

Storage Backend Issues

  1. Return os.ErrNotExist for missing files (not generic errors)
  2. Support partial reads via offset/size in OpenLTXFile
  3. Handle context cancellation in all methods
  4. Test concurrent operations with -race flag
  5. For eventually consistent backends, add retry logic with backoff
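
Items 3 and 5 can be combined in a small retry helper that backs off between attempts and stops when the context is cancelled. This is a sketch, not Litestream's retry logic: `retry` and `runFlaky` are hypothetical names, and the doubling delay is one choice among many.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retry calls fn up to attempts times (attempts must be at least 1),
// doubling the delay after each failure and stopping early on cancellation.
func retry(ctx context.Context, attempts int, base time.Duration, fn func() error) error {
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delay):
		}
		delay *= 2
	}
	return err
}

// runFlaky simulates an operation that fails the first `failures` times
// and reports how many calls were made.
func runFlaky(failures, attempts int) (calls int, err error) {
	err = retry(context.Background(), attempts, time.Millisecond, func() error {
		calls++
		if calls <= failures {
			return errors.New("transient failure")
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := runFlaky(2, 5)
	fmt.Println(calls, err) // 3 <nil>
}
```

A production helper would likely cap the delay, add jitter, and avoid retrying errors that are not transient.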

Corrupted or Missing LTX Files

  1. Check logs for LTXError messages - they include context (Op, Path, Level, TXID) and recovery hints
  2. Common error messages: "nonsequential page numbers", "non-contiguous transaction files", "ltx validation failed"
  3. Manual fix: litestream reset <db-path> clears local LTX state and forces a fresh snapshot on the next sync (the database file is not modified)
  4. Automatic fix: set auto-recover: true on the replica config to auto-reset on LTX errors (disabled by default)
  5. Reference: cmd/litestream/reset.go, replica.go (auto-recover logic), db.go (ResetLocalState)

Contribution Guidelines

What's Accepted

  • Bug fixes and patches (welcome)
  • Documentation improvements
  • Small code improvements and performance optimizations
  • Security vulnerability reports (report privately)

Discuss First

  • Feature requests: open an issue before implementing
  • Large changes: discuss approach in an issue first

Pre-Submit Checklist

  • Read relevant docs from the reference table above
  • Follow patterns in references/PATTERNS.md
  • Run go test -race -v ./...
  • Run pre-commit run --all-files
  • For page iteration: test with >1 GB databases
  • Show investigation evidence in PR (see CONTRIBUTING.md)

Testing

# Full test suite with race detection
go test -race -v ./...

# Specific areas
go test -race -v -run TestReplica_Sync ./...
go test -race -v -run TestDB_Sync ./...
go test -race -v -run TestStore_CompactDB ./...

# Coverage
go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out

Key testing areas:

  • Lock page handling with >1 GB databases and multiple page sizes
