imaging-data-commons
Query and download public cancer imaging data from NCI Imaging Data Commons using idc-index. Use for accessing large-scale radiology (CT, MR, PET) and pathology datasets for AI training or research. No authentication required. Query by metadata, visualize in browser, check licenses.
Install
mkdir -p .claude/skills/imaging-data-commons && curl -L -o skill.zip "https://mcp.directory/api/skills/download/7232" && unzip -o skill.zip -d .claude/skills/imaging-data-commons && rm skill.zipInstalls to .claude/skills/imaging-data-commons
About this skill
Imaging Data Commons
Overview
Use the idc-index Python package to query and download public cancer imaging data from the National Cancer Institute Imaging Data Commons (IDC). No authentication required for data access.
Current IDC Data Version: v23 (always verify with IDCClient().get_idc_version())
Primary tool: idc-index (GitHub)
CRITICAL - Check package version and upgrade if needed (run this FIRST):
import idc_index
REQUIRED_VERSION = "0.11.10" # Must match metadata.idc-index in this file
installed = idc_index.__version__
if installed < REQUIRED_VERSION:
print(f"Upgrading idc-index from {installed} to {REQUIRED_VERSION}...")
import subprocess
subprocess.run(["pip3", "install", "--upgrade", "--break-system-packages", "idc-index"], check=True)
print("Upgrade complete. Restart Python to use new version.")
else:
print(f"idc-index {installed} meets requirement ({REQUIRED_VERSION})")
Verify IDC data version and check current data scale:
from idc_index import IDCClient
client = IDCClient()
# Verify IDC data version (should be "v23")
print(f"IDC data version: {client.get_idc_version()}")
# Get collection count and total series
stats = client.sql_query("""
SELECT
COUNT(DISTINCT collection_id) as collections,
COUNT(DISTINCT analysis_result_id) as analysis_results,
COUNT(DISTINCT PatientID) as patients,
COUNT(DISTINCT StudyInstanceUID) as studies,
COUNT(DISTINCT SeriesInstanceUID) as series,
SUM(instanceCount) as instances,
SUM(series_size_MB)/1000000 as size_TB
FROM index
""")
print(stats)
Core workflow:
- Query metadata →
client.sql_query() - Download DICOM files →
client.download_from_selection() - Visualize in browser →
client.get_viewer_URL(seriesInstanceUID=...)
When to Use This Skill
- Finding publicly available radiology (CT, MR, PET) or pathology (slide microscopy) images
- Selecting image subsets by cancer type, modality, anatomical site, or other metadata
- Downloading DICOM data from IDC
- Checking data licenses before use in research or commercial applications
- Visualizing medical images in a browser without local DICOM viewer software
Quick Navigation
Core Sections (inline):
- IDC Data Model - Collection and analysis result hierarchy
- Index Tables - Available tables and joining patterns
- Installation - Package setup and version verification
- Core Capabilities - Essential API patterns (query, download, visualize, license, citations, batch)
- Best Practices - Usage guidelines
- Troubleshooting - Common issues and solutions
Reference Guides (load on demand):
| Guide | When to Load |
|---|---|
index_tables_guide.md | Complex JOINs, schema discovery, DataFrame access |
use_cases.md | End-to-end workflow examples (training datasets, batch downloads) |
sql_patterns.md | Quick SQL patterns for filter discovery, annotations, size estimation |
clinical_data_guide.md | Clinical/tabular data, imaging+clinical joins, value mapping |
cloud_storage_guide.md | Direct S3/GCS access, versioning, UUID mapping |
dicomweb_guide.md | DICOMweb endpoints, PACS integration |
digital_pathology_guide.md | Slide microscopy (SM), annotations (ANN), pathology workflows |
bigquery_guide.md | Full DICOM metadata, private elements (requires GCP) |
cli_guide.md | Command-line tools (idc download, manifest files) |
IDC Data Model
IDC adds two grouping levels above the standard DICOM hierarchy (Patient → Study → Series → Instance):
- collection_id: Groups patients by disease, modality, or research focus (e.g.,
tcga_luad,nlst). A patient belongs to exactly one collection. - analysis_result_id: Identifies derived objects (segmentations, annotations, radiomics features) across one or more original collections.
Use collection_id to find original imaging data, may include annotations deposited along with the images; use analysis_result_id to find AI-generated or expert annotations.
Key identifiers for queries:
| Identifier | Scope | Use for |
|---|---|---|
collection_id | Dataset grouping | Filtering by project/study |
PatientID | Patient | Grouping images by patient |
StudyInstanceUID | DICOM study | Grouping of related series, visualization |
SeriesInstanceUID | DICOM series | Grouping of related series, visualization |
Index Tables
The idc-index package provides multiple metadata index tables, accessible via SQL or as pandas DataFrames.
Complete index table documentation: Use https://idc-index.readthedocs.io/en/latest/indices_reference.html for quick check of available tables and columns without executing any code.
Important: Use client.indices_overview to get current table descriptions and column schemas. This is the authoritative source for available columns and their types — always query it when writing SQL or exploring data structure.
Available Tables
| Table | Row Granularity | Loaded | Description |
|---|---|---|---|
index | 1 row = 1 DICOM series | Auto | Primary metadata for all current IDC data |
prior_versions_index | 1 row = 1 DICOM series | Auto | Series from previous IDC releases; for downloading deprecated data |
collections_index | 1 row = 1 collection | fetch_index() | Collection-level metadata and descriptions |
analysis_results_index | 1 row = 1 analysis result collection | fetch_index() | Metadata about derived datasets (annotations, segmentations) |
clinical_index | 1 row = 1 clinical data column | fetch_index() | Dictionary mapping clinical table columns to collections |
sm_index | 1 row = 1 slide microscopy series | fetch_index() | Slide Microscopy (pathology) series metadata |
sm_instance_index | 1 row = 1 slide microscopy instance | fetch_index() | Instance-level (SOPInstanceUID) metadata for slide microscopy |
seg_index | 1 row = 1 DICOM Segmentation series | fetch_index() | Segmentation metadata: algorithm, segment count, reference to source image series |
ann_index | 1 row = 1 DICOM ANN series | fetch_index() | Microscopy Bulk Simple Annotations series metadata; references annotated image series |
ann_group_index | 1 row = 1 annotation group | fetch_index() | Detailed annotation group metadata: graphic type, annotation count, property codes, algorithm |
contrast_index | 1 row = 1 series with contrast info | fetch_index() | Contrast agent metadata: agent name, ingredient, administration route (CT, MR, PT, XA, RF) |
Auto = loaded automatically when IDCClient() is instantiated
fetch_index() = requires client.fetch_index("table_name") to load
Joining Tables
Key columns are not explicitly labeled, the following is a subset that can be used in joins.
| Join Column | Tables | Use Case |
|---|---|---|
collection_id | index, prior_versions_index, collections_index, clinical_index | Link series to collection metadata or clinical data |
SeriesInstanceUID | index, prior_versions_index, sm_index, sm_instance_index | Link series across tables; connect to slide microscopy details |
StudyInstanceUID | index, prior_versions_index | Link studies across current and historical data |
PatientID | index, prior_versions_index | Link patients across current and historical data |
analysis_result_id | index, analysis_results_index | Link series to analysis result metadata (annotations, segmentations) |
source_DOI | index, analysis_results_index | Link by publication DOI |
crdc_series_uuid | index, prior_versions_index | Link by CRDC unique identifier |
Modality | index, prior_versions_index | Filter by imaging modality |
SeriesInstanceUID | index, seg_index, ann_index, ann_group_index, contrast_index | Link segmentation/annotation/contrast series to its index metadata |
segmented_SeriesInstanceUID | seg_index → index | Link segmentation to its source image series (join seg_index.segmented_SeriesInstanceUID = index.SeriesInstanceUID) |
referenced_SeriesInstanceUID | ann_index → index | Link annotation to its source image series (join ann_index.referenced_SeriesInstanceUID = index.SeriesInstanceUID) |
Note: Subjects, Updated, and Description appear in multiple tables but have different meanings (counts vs identifiers, different update contexts).
For detailed join examples, schema discovery patterns, key columns reference, and DataFrame access, see references/index_tables_guide.md.
Clinical Data Access
# Fetch clinical index (also downloads clinical data tables)
client.fetch_index("clinical_index")
# Query clinical index to find available tables and their columns
tables = client.sql_query("SELECT DISTINCT table_name, column_label FROM clinical_index")
# Load a specific clinical table as DataFrame
clinical_df = client.get_clinical_table("table_name")
See references/clinical_data_guide.md for detailed workflows including value mapping patterns and joining clinical data with imaging.
Data Access Options
| Method | Auth Required | Best For |
|---|---|---|
idc-index | No | Key queries and downloads (recommended) |
| IDC Portal | No | Interactive exploration, manual selection, browser-based download |
| BigQuery | Yes (GCP account) | Complex queries, full DICOM metadata |
| DICOMweb proxy | No | Tool integration via DICOMweb API |
| Cloud storage (S3/GCS) | No | Direct file access, bulk downloads, custom pipelines |
Cloud storage organization
IDC maintains all DICOM files in public cloud storage buckets mirrored between AWS S3 and Google Cloud Storage. Files are organized by CRDC UUIDs (not DICOM UIDs) to support versioning.
| Bucket (AWS / GCS) | License | Content |
|---|---|---|
idc-open-data / `i |
Content truncated.
More by K-Dense-AI
View all skills by K-Dense-AI →You might also like
flutter-development
aj-geddes
Build beautiful cross-platform mobile apps with Flutter and Dart. Covers widgets, state management with Provider/BLoC, navigation, API integration, and material design.
drawio-diagrams-enhanced
jgtolentino
Create professional draw.io (diagrams.net) diagrams in XML format (.drawio files) with integrated PMP/PMBOK methodologies, extensive visual asset libraries, and industry-standard professional templates. Use this skill when users ask to create flowcharts, swimlane diagrams, cross-functional flowcharts, org charts, network diagrams, UML diagrams, BPMN, project management diagrams (WBS, Gantt, PERT, RACI), risk matrices, stakeholder maps, or any other visual diagram in draw.io format. This skill includes access to custom shape libraries for icons, clipart, and professional symbols.
ui-ux-pro-max
nextlevelbuilder
"UI/UX design intelligence. 50 styles, 21 palettes, 50 font pairings, 20 charts, 8 stacks (React, Next.js, Vue, Svelte, SwiftUI, React Native, Flutter, Tailwind). Actions: plan, build, create, design, implement, review, fix, improve, optimize, enhance, refactor, check UI/UX code. Projects: website, landing page, dashboard, admin panel, e-commerce, SaaS, portfolio, blog, mobile app, .html, .tsx, .vue, .svelte. Elements: button, modal, navbar, sidebar, card, table, form, chart. Styles: glassmorphism, claymorphism, minimalism, brutalism, neumorphism, bento grid, dark mode, responsive, skeuomorphism, flat design. Topics: color palette, accessibility, animation, layout, typography, font pairing, spacing, hover, shadow, gradient."
godot
bfollington
This skill should be used when working on Godot Engine projects. It provides specialized knowledge of Godot's file formats (.gd, .tscn, .tres), architecture patterns (component-based, signal-driven, resource-based), common pitfalls, validation tools, code templates, and CLI workflows. The `godot` command is available for running the game, validating scripts, importing resources, and exporting builds. Use this skill for tasks involving Godot game development, debugging scene/resource files, implementing game systems, or creating new Godot components.
nano-banana-pro
garg-aayush
Generate and edit images using Google's Nano Banana Pro (Gemini 3 Pro Image) API. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports both text-to-image generation and image-to-image editing with configurable resolution (1K default, 2K, or 4K for high resolution). DO NOT read the image file first - use this skill directly with the --input-image parameter.
fastapi-templates
wshobson
Create production-ready FastAPI projects with async patterns, dependency injection, and comprehensive error handling. Use when building new FastAPI applications or setting up backend API projects.
Related MCP Servers
Browse all serversAccess and analyze open government data, including budgets and crime stats, from Socrata gov datasets—no API key needed.
Create images instantly with Nano Banana, a free online Gemini AI image generator. Share with public URLs—no downloads n
Build persistent semantic networks for enterprise & engineering data management. Enable data persistence and memory acro
MCP Toolbox for Databases by Google. An open-source server that lets AI agents query Cloud SQL, Spanner, AlloyDB, and ot
Explore official Google BigQuery MCP servers. Find resources and examples to build context-aware apps in Google's ecosys
Connect Supabase projects to AI with Supabase MCP Server. Standardize LLM communication for secure, efficient developmen
Stay ahead of the MCP ecosystem
Get weekly updates on new skills and servers.