Apache Iceberg

ryft-io

Access Apache Iceberg tables stored in AWS through natural language queries in Claude or other MCP clients. Explore catalogs, schemas, and partition information without writing code.

Provides direct access to Apache Iceberg tables stored in AWS, enabling exploration of catalogs, schemas, properties, and partition information without complex queries or code.

42339 views · 6 · Local (stdio)

What it does

  • Browse Apache Iceberg catalogs and schemas
  • Inspect table properties and metadata
  • View partition information
  • Query table structures using natural language
  • Access AWS Glue managed catalogs

Best for

  • Data engineers exploring lakehouse architectures
  • Analysts investigating Iceberg table structures
  • Teams working with AWS Glue catalogs

Natural language interface | AWS Glue integration | No complex SQL required

About Apache Iceberg

Apache Iceberg is a community-built MCP server published by ryft-io that provides AI assistants with tools and capabilities via the Model Context Protocol. It gives access to Apache Iceberg tables in AWS, so you can explore catalogs, schemas, properties, and partitions without writing queries or code. It is categorized under databases and analytics data.

How to install

You can install Apache Iceberg in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

License

Apache Iceberg is released under the Apache-2.0 license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

IcebergMCP 🚀

AI-native Lakehouse Integration

IcebergMCP is a Model Context Protocol (MCP) server that lets you interact with your Apache Iceberg™ Lakehouse using natural language in Claude, Cursor, or any other MCP client.

Installation

Prerequisites

  • Apache Iceberg™ catalog managed in AWS Glue
  • AWS profile configured on the machine, with access to the catalog
  • uv package manager: install it with brew install uv, or see the official uv installation guide

Claude

  1. Inside Claude, go to Settings > Developer > Edit Config > claude_desktop_config.json

  2. Add the following. If uv can't be found, replace "uv" with the full absolute path to the uv executable (standard JSON does not allow comments, so keep the file comment-free):

{
  "mcpServers": {
    "iceberg-mcp": {
      "command": "uv",
      "args": [
        "run",
        "--with",
        "iceberg-mcp",
        "iceberg-mcp"
      ],
      "env": {
        "ICEBERG_MCP_PROFILE": "<aws-profile-name>"
      }
    }
  }
}

Cursor

  1. Inside Cursor, go to Settings -> Cursor Settings -> MCP -> Add new global MCP server

  2. Add the following. If uv can't be found, replace "uv" with the full absolute path to the uv executable (standard JSON does not allow comments, so keep the file comment-free):

{
  "mcpServers": {
    "iceberg-mcp": {
      "command": "uv",
      "args": [
        "run",
        "--with",
        "iceberg-mcp",
        "iceberg-mcp"
      ],
      "env": {
        "ICEBERG_MCP_PROFILE": "<aws-profile-name>"
      }
    }
  }
}
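
Both client configurations share the same server entry, and the instructions suggest falling back to an absolute path when uv is not found. The following is a minimal Python sketch of generating that entry as strict JSON. The helper name build_config is hypothetical and not part of IcebergMCP, and the profile name is a placeholder to replace with your own.

```python
import json
import shutil


def build_config(profile, uv_path=None):
    """Return the mcpServers entry from the steps above as a dict.

    build_config is a hypothetical helper, not part of IcebergMCP.
    """
    # Prefer an explicit path, then whatever `uv` resolves to on PATH,
    # and fall back to the bare command name if neither is available.
    command = uv_path or shutil.which("uv") or "uv"
    return {
        "mcpServers": {
            "iceberg-mcp": {
                "command": command,
                "args": ["run", "--with", "iceberg-mcp", "iceberg-mcp"],
                "env": {"ICEBERG_MCP_PROFILE": profile},
            }
        }
    }


if __name__ == "__main__":
    # Print comment-free JSON that can be pasted into the client config file.
    print(json.dumps(build_config("<aws-profile-name>"), indent=2))
```

Resolving the path with shutil.which at generation time can help when the MCP client is launched with a narrower PATH than your shell.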

Configuration

Environment variables can be used to configure the AWS connection:

  • ICEBERG_MCP_PROFILE - The AWS profile name to use. This profile is used to connect to both the catalog and the object storage. If not specified, the default profile is used.
  • ICEBERG_MCP_REGION - The AWS region to use. This determines the location of the catalog and the object storage. Defaults to us-east-1.
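
The defaults described above could be resolved roughly as in the sketch below. This illustrates the documented behavior only and is not the server's actual code: resolve_aws_settings is a hypothetical name, and representing an unset profile as None is an assumption.

```python
import os


def resolve_aws_settings(env=None):
    """Read the two documented variables, applying the documented defaults.

    Hypothetical helper for illustration; not taken from IcebergMCP.
    """
    env = os.environ if env is None else env
    return {
        # None stands for "use the default AWS profile" (an assumption).
        "profile": env.get("ICEBERG_MCP_PROFILE"),
        # The README states us-east-1 is the default region.
        "region": env.get("ICEBERG_MCP_REGION", "us-east-1"),
    }
```

For example, calling resolve_aws_settings({"ICEBERG_MCP_PROFILE": "analytics"}) keeps the given profile and falls back to us-east-1 for the region.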

Available Tools

The server provides the following tools for interacting with your Apache Iceberg™ tables:

  • get_namespaces: Gets all namespaces in the Apache Iceberg™ catalog
  • get_iceberg_tables: Gets all tables for a given namespace
  • get_table_schema: Returns the schema for a given table
  • get_table_properties: Returns table properties for a given table, like total size and record count
  • get_table_partitions: Gets all partitions for a given table
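
MCP clients invoke tools like these by sending JSON-RPC 2.0 requests with the method tools/call. Your client normally builds these messages for you, so the sketch below only shows roughly what such a request looks like. The argument name "namespace" is an assumption, since the tool parameters are not listed here, and tool_call_request is a hypothetical helper.

```python
import json
from itertools import count

# Simple monotonically increasing request ids for this sketch.
_request_ids = count(1)


def tool_call_request(tool, arguments=None):
    """Build a JSON-RPC 2.0 request dict for an MCP tools/call invocation."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }


if __name__ == "__main__":
    # "namespace" is an assumed parameter name, used only for illustration.
    request = tool_call_request("get_iceberg_tables", {"namespace": "bronze"})
    print(json.dumps(request, indent=2))
```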

Examples

Once installed and configured, you can start interacting with your Apache Iceberg™ tables through your MCP client. Here are some simple examples of how to interact with your lakehouse:

  1. "List all namespaces in my catalog"
  2. "List all tables for the namespace called bronze"
  3. "What are all the string columns in the table raw_events?"
  4. "What is the size of the raw_events table?"
  5. "Generate an SQL query that calculates the sum and the p95 of all number columns in raw_metrics for all VIP users from users_info"
  6. "Why did the queries on raw_events recently become much slower?"

Limitations & Security Considerations

  • All tools are currently read-only and cannot modify or delete data from your lakehouse
  • Currently supported catalogs:
    • AWS Glue
    • Apache Iceberg™ REST Catalog (coming soon!)

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
