bulk-rna-seq-batch-correction-with-combat

Use omicverse's pyComBat wrapper to remove batch effects from merged bulk RNA-seq or microarray cohorts, export corrected matrices, and benchmark pre/post correction visualisations.

Install

mkdir -p .claude/skills/bulk-rna-seq-batch-correction-with-combat && curl -L -o skill.zip "https://mcp.directory/api/skills/download/2787" && unzip -o skill.zip -d .claude/skills/bulk-rna-seq-batch-correction-with-combat && rm skill.zip

Installs to .claude/skills/bulk-rna-seq-batch-correction-with-combat

About this skill

Bulk RNA-seq batch correction with ComBat

Overview

Apply this skill when a user has multiple bulk expression matrices measured across different batches and needs to harmonise them before downstream analysis. It follows t_bulk_combat.ipynb, which demonstrates the pyComBat workflow on ovarian cancer microarray cohorts.

Instructions

  1. Import core libraries
    • Load omicverse as ov, anndata, pandas as pd, and matplotlib.pyplot as plt.
    • Call ov.ov_plot_set() (aliased ov.plot_set() in some releases) to align figures with omicverse styling.
  2. Load each batch separately
    • Read the prepared pickled matrices (or user-provided expression tables) with pd.read_pickle(...)/pd.read_csv(...).
    • If the tables are stored gene × sample, transpose them to sample × gene before wrapping them in anndata.AnnData objects, so that adata.obs stores sample metadata and adata.var stores genes.
    • Assign a batch column for every cohort (adata.obs['batch'] = '1', '2', ...). Encourage descriptive labels when available.
  3. Concatenate on shared genes
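    • Use anndata.concat([adata1, adata2, adata3], merge='same'). The default join='inner' retains the intersection of genes across batches, and merge='same' keeps only the annotations that are identical in every batch.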
    • Confirm the combined adata reports balanced sample counts per batch; if not, prompt users to re-check inputs.
  4. Run ComBat batch correction
    • Execute ov.bulk.batch_correction(adata, batch_key='batch').
    • Explain that corrected values are stored in adata.layers['batch_correction'] while the original counts remain in adata.X.
  5. Export corrected and raw matrices
    • Obtain DataFrames via adata.to_df().T (raw) and adata.to_df(layer='batch_correction').T (corrected).
    • Encourage saving both tables (.to_csv(...)) plus the harmonised AnnData (adata.write_h5ad('adata_batch.h5ad', compression='gzip')).
  6. Benchmark the correction
    • For per-sample variance checks, draw before/after boxplots and recolour boxes using ov.pl.red_color, blue_color, green_color palettes to match batches.
    • Copy raw counts to a named layer with adata.layers['raw'] = adata.X.copy() before PCA.
    • Run ov.pp.pca(adata, layer='raw', n_pcs=50) and ov.pp.pca(adata, layer='batch_correction', n_pcs=50).
    • Visualise embeddings with ov.pl.embedding(..., basis='raw|original|X_pca', color='batch', frameon='small') and repeat for the corrected layer to verify mixing.
  7. Defensive validation
    # Before ComBat: verify batch column exists and has >1 batch
    assert 'batch' in adata.obs.columns, "adata.obs must contain a 'batch' column"
    n_batches = adata.obs['batch'].nunique()
    assert n_batches > 1, f"Only {n_batches} batch — need >1 for batch correction"
    # Verify gene overlap after concatenation
    if adata.n_vars < 100:
        print(f"WARNING: Only {adata.n_vars} shared genes after concat — check gene ID harmonization")
    
  8. Troubleshooting tips
    • Mismatched gene identifiers cause dropped features; remind users to harmonise feature names (e.g., gene symbols) before concatenation.
    • pyComBat expects log-scale intensities or similarly distributed counts; recommend log-transforming strongly skewed matrices.
    • If the batch_correction layer is missing, ensure the batch_key matches the column name in adata.obs.
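
The first two troubleshooting tips can be checked before any AnnData object is built. The following is a minimal sketch, not part of omicverse: the helper name preflight and the log_max threshold of 50 are hypothetical, and the threshold is only a rough rule of thumb for spotting matrices that have probably not been log-transformed. It assumes each batch is a gene × sample pandas DataFrame.

```python
import numpy as np
import pandas as pd


def preflight(frames, log_max=50.0):
    """Summarise gene x sample tables before concatenation.

    frames  : dict mapping a batch label to a gene x sample DataFrame
    log_max : hypothetical heuristic; a maximum above this value suggests
              the matrix is not on a log scale
    """
    cleaned = {}
    for label, df in frames.items():
        df = df.copy()
        # Strip stray whitespace so identical gene IDs actually match
        df.index = df.index.astype(str).str.strip()
        # Keep the first row when a gene ID appears more than once
        cleaned[label] = df.loc[~df.index.duplicated(keep="first")]

    # Intersect gene IDs across all batches
    shared = None
    for df in cleaned.values():
        shared = df.index if shared is None else shared.intersection(df.index)

    return {
        "shared_genes": sorted(shared) if shared is not None else [],
        "samples_per_batch": {k: df.shape[1] for k, df in cleaned.items()},
        "needs_log": [
            k for k, df in cleaned.items()
            if float(np.nanmax(df.to_numpy(dtype=float))) > log_max
        ],
    }
```

A short shared_genes list points to mismatched identifiers, and any label in needs_log is a candidate for log-transformation before running ComBat.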

Examples

  • "Combine three GEO ovarian cohorts, run ComBat, and export both the raw and corrected CSV matrices."
  • "Plot PCA embeddings before and after batch correction to confirm that batches 1–3 overlap."
  • "Save the harmonised AnnData file so I can reload it later for downstream DEG analysis."
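
A request like the first example could be served by a sketch along these lines. It strings together only the calls named in the instructions (anndata.AnnData, anndata.concat, ov.bulk.batch_correction, adata.to_df, adata.write_h5ad). The function name run_combat, the out_prefix parameter, and the output file names are hypothetical, and the input is assumed to be a dict of gene × sample DataFrames keyed by batch label.

```python
def run_combat(frames, out_prefix="combat"):
    """Merge gene x sample tables, run ComBat, and export the results.

    frames     : dict mapping a batch label to a gene x sample DataFrame
    out_prefix : hypothetical prefix for the exported files
    """
    if len(frames) < 2:
        raise ValueError("need at least two batches for batch correction")

    # Imported here so the input check above runs even where
    # anndata and omicverse are not installed
    import anndata
    import omicverse as ov

    adatas = []
    for label, df in frames.items():
        adata = anndata.AnnData(df.T)       # transpose to sample x gene
        adata.obs["batch"] = str(label)     # one batch label per cohort
        adatas.append(adata)

    # Default inner join keeps only genes shared by every batch
    adata = anndata.concat(adatas, merge="same")

    # Corrected values are written to adata.layers['batch_correction']
    ov.bulk.batch_correction(adata, batch_key="batch")

    # Export gene x sample tables plus the harmonised AnnData
    adata.to_df().T.to_csv(f"{out_prefix}_raw.csv")
    adata.to_df(layer="batch_correction").T.to_csv(f"{out_prefix}_corrected.csv")
    adata.write_h5ad(f"{out_prefix}.h5ad", compression="gzip")
    return adata
```

The returned AnnData still holds the uncorrected values in adata.X, so the PCA comparison from step 6 can be run on it directly.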
