find-hypertable-candidates

4
0
Source

Use this skill to analyze an existing PostgreSQL database and identify which tables should be converted to Timescale/TimescaleDB hypertables. **Trigger when user asks to:** - Analyze database tables for hypertable conversion potential - Identify time-series or event tables in an existing schema - Evaluate if a table would benefit from Timescale/TimescaleDB - Audit PostgreSQL tables for migration to Timescale/TimescaleDB/TigerData - Score or rank tables for hypertable candidacy **Keywords:** hypertable candidate, table analysis, migration assessment, Timescale, TimescaleDB, time-series detection, insert-heavy tables, event logs, audit tables Provides SQL queries to analyze table statistics, index patterns, and query patterns. Includes scoring criteria (8+ points = good candidate) and pattern recognition for IoT, events, transactions, and sequential data.

Install

mkdir -p .claude/skills/find-hypertable-candidates && curl -L -o skill.zip "https://mcp.directory/api/skills/download/2364" && unzip -o skill.zip -d .claude/skills/find-hypertable-candidates && rm skill.zip

Installs to .claude/skills/find-hypertable-candidates

About this skill

PostgreSQL Hypertable Candidate Analysis

Identify tables that would benefit from TimescaleDB hypertable conversion. After identification, use the companion "migrate-postgres-tables-to-hypertables" skill for configuration and migration.

TimescaleDB Benefits

Performance gains: 90%+ compression, fast time-based queries, improved insert performance, efficient aggregations, continuous aggregates for materialization (dashboards, reports, analytics), automatic data management (retention, compression).

Best for insert-heavy patterns:

  • Time-series data (sensors, metrics, monitoring)
  • Event logs (user events, audit trails, application logs)
  • Transaction records (orders, payments, financial)
  • Sequential data (auto-incrementing IDs with timestamps)
  • Append-only datasets (immutable records, historical)

Requirements: Large volumes (1M+ rows), time-based queries, infrequent updates

Step 1: Database Schema Analysis

Option A: From Database Connection

Table statistics and size

-- Get all tables with row counts and insert/update patterns
WITH table_stats AS (
    SELECT
        schemaname, tablename,
        n_tup_ins as total_inserts,
        n_tup_upd as total_updates,
        n_tup_del as total_deletes,
        n_live_tup as live_rows,
        n_dead_tup as dead_rows
    FROM pg_stat_user_tables
),
table_sizes AS (
    SELECT
        schemaname, tablename,
        pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as total_size,
        pg_total_relation_size(schemaname||'.'||tablename) as total_size_bytes
    FROM pg_tables
    WHERE schemaname NOT IN ('information_schema', 'pg_catalog')
)
SELECT
    ts.schemaname, ts.tablename, ts.live_rows,
    tsize.total_size, tsize.total_size_bytes,
    ts.total_inserts, ts.total_updates, ts.total_deletes,
    ROUND(CASE WHEN ts.live_rows > 0
          THEN (ts.total_inserts::float / ts.live_rows) * 100
          ELSE 0 END, 2) as insert_ratio_pct
FROM table_stats ts
JOIN table_sizes tsize ON ts.schemaname = tsize.schemaname AND ts.tablename = tsize.tablename
ORDER BY tsize.total_size_bytes DESC;

Look for:

  • mostly insert-heavy patterns (less updates/deletes)
  • big tables (1M+ rows or 100MB+)

Index patterns

-- Identify common query dimensions
SELECT schemaname, tablename, indexname, indexdef
FROM pg_indexes
WHERE schemaname NOT IN ('information_schema', 'pg_catalog')
ORDER BY tablename, indexname;

Look for:

  • Multiple indexes with timestamp/created_at columns → time-based queries
  • Composite (entity_id, timestamp) indexes → good candidates
  • Time-only indexes → time range filtering common

Query patterns (if pg_stat_statements available)

-- Check availability
SELECT EXISTS (SELECT 1 FROM pg_extension WHERE extname = 'pg_stat_statements');

-- Analyze expensive queries for candidate tables
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
WHERE query ILIKE '%your_table_name%'
ORDER BY total_exec_time DESC LIMIT 20;

✅ Good patterns: Time-based WHERE, entity filtering combined with time-based qualifiers, GROUP BY time_bucket, range queries over time ❌ Poor patterns: Non-time lookups with no time-based qualifiers in same query (WHERE email = ...)

Constraints

-- Check migration compatibility
SELECT conname, contype, pg_get_constraintdef(oid) as definition
FROM pg_constraint
WHERE conrelid = 'your_table_name'::regclass;

Compatibility:

  • Primary keys (p): Must include partition column or ask user if can be modified
  • Foreign keys (f): Plain→Hypertable and Hypertable→Plain OK, Hypertable→Hypertable NOT supported
  • Unique constraints (u): Must include partition column or ask user if can be modified
  • Check constraints (c): Usually OK

Option B: From Code Analysis

✅ GOOD Patterns

# Append-only logging
INSERT INTO events (user_id, event_time, data) VALUES (...);
# Time-series collection
INSERT INTO metrics (device_id, timestamp, value) VALUES (...);
# Time-based queries
SELECT * FROM metrics WHERE timestamp >= NOW() - INTERVAL '24 hours';
# Time aggregations
SELECT DATE_TRUNC('day', timestamp), COUNT(*) GROUP BY 1;

❌ POOR Patterns

# Frequent updates to historical records
UPDATE users SET email = ..., updated_at = NOW() WHERE id = ...;
# Non-time lookups
SELECT * FROM users WHERE email = ...;
# Small reference tables
SELECT * FROM countries ORDER BY name;

Schema Indicators

✅ GOOD:

  • Has timestamp/timestamptz column
  • Multiple indexes with timestamp-based columns
  • Composite (entity_id, timestamp) indexes

❌ POOR:

  • Mostly indexes with non-time-based columns (on columns like email, name, status, etc.)
  • Columns that you expect to be updated over time (updated_at, updated_by, status, etc.)
  • Unique constraints on non-time fields
  • Frequent updated_at modifications
  • Small static tables

Special Case: ID-Based Tables

Sequential ID tables can be candidates if:

  • Insert-mostly pattern / updates are either infrequent or only on recent records.
  • If updates do happen, they occur on recent records (such as an order status being updated orderered->processing->delivered. Note once an order is delivered, it is unlikely to be updated again.)
  • IDs correlate with time (as is the case for serial/auto-incrementing IDs/GENERATED ALWAYS AS IDENTITY)
  • ID is the primary query dimension
  • Recent data accessed more often (frequently the case in ecommerce, finance, etc.)
  • Time-based reporting common (e.g. monthly, daily summaries/analytics)
CREATE TABLE orders (
    id BIGSERIAL PRIMARY KEY,           -- Can partition by ID
    user_id BIGINT,
    created_at TIMESTAMPTZ DEFAULT NOW() -- For sparse indexes
);

Note: For ID-based tables where there is also a time column (created_at, ordered_at, etc.), you can partition by ID and use sparse indexes on the time column. See the migrate-postgres-tables-to-hypertables skill for details.

Step 2: Candidacy Scoring (8+ points = good candidate)

Time-Series Characteristics (5+ points needed)

  • Has timestamp/timestamptz column: 3 points
  • Data inserted chronologically: 2 points
  • Queries filter by time: 2 points
  • Time aggregations common: 2 points

Scale & Performance (3+ points recommended)

  • Large table (1M+ rows or 100MB+): 2 points
  • High insert volume: 1 point
  • Infrequent updates to historical: 1 point
  • Range queries common: 1 point
  • Aggregation queries: 2 points

Data Patterns (bonus)

  • Contains entity ID for segmentation (device_id, user_id, product_id, symbol, etc.): 1 point
  • Numeric measurements: 1 point
  • Log/event structure: 1 point

Common Patterns

✅ GOOD Candidates

✅ Event/Log Tables (user_events, audit_logs)

CREATE TABLE user_events (
    id BIGSERIAL PRIMARY KEY,
    user_id BIGINT,
    event_type TEXT,
    event_time TIMESTAMPTZ DEFAULT NOW(),
    metadata JSONB
);
-- Partition by id, segment by user_id, enable minmax sparse_index on event_time

✅ Sensor/IoT Data (sensor_readings, telemetry)

CREATE TABLE sensor_readings (
    device_id TEXT,
    timestamp TIMESTAMPTZ,
    temperature DOUBLE PRECISION,
    humidity DOUBLE PRECISION
);
-- Partition by timestamp, segment by device_id, minmax sparse indexes on temperature and humidity

✅ Financial/Trading (stock_prices, transactions)

CREATE TABLE stock_prices (
    symbol VARCHAR(10),
    price_time TIMESTAMPTZ,
    open_price DECIMAL,
    close_price DECIMAL,
    volume BIGINT
);
-- Partition by price_time, segment by symbol, minmax sparse indexes on open_price and close_price and volume

✅ System Metrics (monitoring_data)

CREATE TABLE system_metrics (
    hostname TEXT,
    metric_time TIMESTAMPTZ,
    cpu_usage DOUBLE PRECISION,
    memory_usage BIGINT
);
-- Partition by metric_time, segment by hostname, minmax sparse indexes on cpu_usage and memory_usage

❌ POOR Candidates

❌ Reference Tables (countries, categories)

CREATE TABLE countries (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    code CHAR(2)
);
-- Static data, no time component

❌ User Profiles (users, accounts)

CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    email VARCHAR(255),
    created_at TIMESTAMPTZ,
    updated_at TIMESTAMPTZ
);
-- Accessed by ID, frequently updated, has timestamp but it's not the primary query dimension (the primary query dimension is id or email)

❌ Settings/Config (user_settings)

CREATE TABLE user_settings (
    user_id BIGINT PRIMARY KEY,
    theme VARCHAR(20),       -- Changes: light -> dark -> auto
    language VARCHAR(10),    -- Changes: en -> es -> fr
    notifications JSONB,     -- Frequent preference updates
    updated_at TIMESTAMPTZ
);
-- Accessed by user_id, frequently updated, has timestamp but it's not the primary query dimension (the primary query dimension is user_id)

Analysis Output Requirements

For each candidate table provide:

  • Score: Based on criteria (8+ = strong candidate)
  • Pattern: Insert vs update ratio
  • Access: Time-based vs entity lookups
  • Size: Current size and growth rate
  • Queries: Time-range, aggregations, point lookups

Focus on insert-heavy patterns with time-based or sequential access. Tables scoring 8+ points are strong candidates for conversion.

More by timescale

View all →

postgres-hybrid-text-search

timescale

Use this skill to implement hybrid search combining BM25 keyword search with semantic vector search using Reciprocal Rank Fusion (RRF). **Trigger when user asks to:** - Combine keyword and semantic search - Implement hybrid search or multi-modal retrieval - Use BM25/pg_textsearch with pgvector together - Implement RRF (Reciprocal Rank Fusion) for search - Build search that handles both exact terms and meaning **Keywords:** hybrid search, BM25, pg_textsearch, RRF, reciprocal rank fusion, keyword search, full-text search, reranking, cross-encoder Covers: pg_textsearch BM25 index setup, parallel query patterns, client-side RRF fusion (Python/TypeScript), weighting strategies, and optional ML reranking.

50

migrate-postgres-tables-to-hypertables

timescale

Use this skill to migrate identified PostgreSQL tables to Timescale/TimescaleDB hypertables with optimal configuration and validation. **Trigger when user asks to:** - Migrate or convert PostgreSQL tables to hypertables - Execute hypertable migration with minimal downtime - Plan blue-green migration for large tables - Validate hypertable migration success - Configure compression after migration **Prerequisites:** Tables already identified as candidates (use find-hypertable-candidates first if needed) **Keywords:** migrate to hypertable, convert table, Timescale, TimescaleDB, blue-green migration, in-place conversion, create_hypertable, migration validation, compression setup Step-by-step migration planning including: partition column selection, chunk interval calculation, PK/constraint handling, migration execution (in-place vs blue-green), and performance validation queries.

20

pgvector-semantic-search

timescale

Use this skill for setting up vector similarity search with pgvector for AI/ML embeddings, RAG applications, or semantic search. **Trigger when user asks to:** - Store or search vector embeddings in PostgreSQL - Set up semantic search, similarity search, or nearest neighbor search - Create HNSW or IVFFlat indexes for vectors - Implement RAG (Retrieval Augmented Generation) with PostgreSQL - Optimize pgvector performance, recall, or memory usage - Use binary quantization for large vector datasets **Keywords:** pgvector, embeddings, semantic search, vector similarity, HNSW, IVFFlat, halfvec, cosine distance, nearest neighbor, RAG, LLM, AI search Covers: halfvec storage, HNSW index configuration (m, ef_construction, ef_search), quantization strategies, filtered search, bulk loading, and performance tuning.

230

design-postgres-tables

timescale

Use this skill for general PostgreSQL table design. **Trigger when user asks to:** - Design PostgreSQL tables, schemas, or data models when creating new tables and when modifying existing ones. - Choose data types, constraints, or indexes for PostgreSQL - Create user tables, order tables, reference tables, or JSONB schemas - Understand PostgreSQL best practices for normalization, constraints, or indexing - Design update-heavy, upsert-heavy, or OLTP-style tables **Keywords:** PostgreSQL schema, table design, data types, PRIMARY KEY, FOREIGN KEY, indexes, B-tree, GIN, JSONB, constraints, normalization, identity columns, partitioning, row-level security Comprehensive reference covering data types, indexing strategies, constraints, JSONB patterns, partitioning, and PostgreSQL-specific best practices.

200

You might also like

flutter-development

aj-geddes

Build beautiful cross-platform mobile apps with Flutter and Dart. Covers widgets, state management with Provider/BLoC, navigation, API integration, and material design.

261780

drawio-diagrams-enhanced

jgtolentino

Create professional draw.io (diagrams.net) diagrams in XML format (.drawio files) with integrated PMP/PMBOK methodologies, extensive visual asset libraries, and industry-standard professional templates. Use this skill when users ask to create flowcharts, swimlane diagrams, cross-functional flowcharts, org charts, network diagrams, UML diagrams, BPMN, project management diagrams (WBS, Gantt, PERT, RACI), risk matrices, stakeholder maps, or any other visual diagram in draw.io format. This skill includes access to custom shape libraries for icons, clipart, and professional symbols.

200412

godot

bfollington

This skill should be used when working on Godot Engine projects. It provides specialized knowledge of Godot's file formats (.gd, .tscn, .tres), architecture patterns (component-based, signal-driven, resource-based), common pitfalls, validation tools, code templates, and CLI workflows. The `godot` command is available for running the game, validating scripts, importing resources, and exporting builds. Use this skill for tasks involving Godot game development, debugging scene/resource files, implementing game systems, or creating new Godot components.

176270

nano-banana-pro

garg-aayush

Generate and edit images using Google's Nano Banana Pro (Gemini 3 Pro Image) API. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports both text-to-image generation and image-to-image editing with configurable resolution (1K default, 2K, or 4K for high resolution). DO NOT read the image file first - use this skill directly with the --input-image parameter.

204230

ui-ux-pro-max

nextlevelbuilder

"UI/UX design intelligence. 50 styles, 21 palettes, 50 font pairings, 20 charts, 8 stacks (React, Next.js, Vue, Svelte, SwiftUI, React Native, Flutter, Tailwind). Actions: plan, build, create, design, implement, review, fix, improve, optimize, enhance, refactor, check UI/UX code. Projects: website, landing page, dashboard, admin panel, e-commerce, SaaS, portfolio, blog, mobile app, .html, .tsx, .vue, .svelte. Elements: button, modal, navbar, sidebar, card, table, form, chart. Styles: glassmorphism, claymorphism, minimalism, brutalism, neumorphism, bento grid, dark mode, responsive, skeuomorphism, flat design. Topics: color palette, accessibility, animation, layout, typography, font pairing, spacing, hover, shadow, gradient."

161194

rust-coding-skill

UtakataKyosui

Guides Claude in writing idiomatic, efficient, well-structured Rust code using proper data modeling, traits, impl organization, macros, and build-speed best practices.

159171

Stay ahead of the MCP ecosystem

Get weekly updates on new skills and servers.